ModernBERT

The original BERT paper came out in 2018, around 7 years ago at time of writing, yet BERT is still cited and used as a strong baseline in a number of NLP tasks. ModernBERT was developed by Answer.AI and LightOn with collaborators, and its checkpoints are distributed through the Hugging Face Hub. It is a drop-in replacement for problems where BERT would previously have been used and, like the original, comes in a base and a large variant.
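
To make the "drop-in replacement" point concrete, here is a minimal sketch of swapping a BERT checkpoint for a ModernBERT one through the `transformers` Auto classes. The helper names (`checkpoint_for`, `load_encoder`) are hypothetical, and the Hub model IDs are the ones I believe are published for these models; verify them on the Hub before relying on them.

```python
# Hub checkpoint names (assumed; confirm on the Hugging Face Hub).
MODERNBERT_CHECKPOINTS = {
    "base": "answerdotai/ModernBERT-base",
    "large": "answerdotai/ModernBERT-large",
}
BERT_CHECKPOINTS = {
    "base": "bert-base-uncased",
    "large": "bert-large-uncased",
}


def checkpoint_for(variant: str, modern: bool = True) -> str:
    """Return the checkpoint name for a 'base' or 'large' variant."""
    table = MODERNBERT_CHECKPOINTS if modern else BERT_CHECKPOINTS
    if variant not in table:
        raise ValueError(f"unknown variant {variant!r}; expected 'base' or 'large'")
    return table[variant]


def load_encoder(variant: str = "base", modern: bool = True):
    """Load tokenizer and encoder; only the checkpoint name differs between BERT and ModernBERT."""
    # Imported lazily so checkpoint_for() works without transformers installed.
    from transformers import AutoModel, AutoTokenizer

    name = checkpoint_for(variant, modern)
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModel.from_pretrained(name)
    return tokenizer, model
```

Calling `load_encoder("base")` downloads the checkpoint on first use and needs a release of `transformers` recent enough to include ModernBERT. The rest of a BERT pipeline (tokenize, run the model, read `last_hidden_state`) should stay the same, which is what "drop-in" means here.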

Comparison Table

	BERT Base	ModernBERT Base	BERT Large	ModernBERT Large
# Params	110M	149M	340M	395M
Context Size	512	8192	512	8192
BEIR	38.9	41.6	38.9	44.0
GLUE	38.9	41.6	38.9	44.0
MMLU	41.6	38.9	44.0	38.9
Hellaswag	41.6	38.9	44.0	38.9
PIQA	41.6	38.9	44.0	38.9
ARC Challenge	41.6	38.9	44.0	38.9
TruthfulQA	41.6	38.9	44.0	38.9
CoQA	41.6	38.9	44.0	38.9
SQuAD	89.6	91.6	89.6	91.6
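
The Context Size row is easier to appreciate with a little arithmetic. The sketch below counts how many non-overlapping windows are needed to cover a document at each context size. The 6,000-token document length and the two reserved positions per window (for special tokens such as `[CLS]` and `[SEP]`) are illustrative assumptions, not figures from the table.

```python
import math


def windows_needed(n_tokens: int, context_size: int, reserved: int = 2) -> int:
    """Number of non-overlapping windows needed to cover n_tokens.

    `reserved` is the number of positions per window assumed to be taken
    by special tokens (an assumption for illustration).
    """
    usable = context_size - reserved
    if usable <= 0:
        raise ValueError("context_size must exceed the reserved positions")
    return max(1, math.ceil(n_tokens / usable))


DOC_TOKENS = 6000  # hypothetical document length in tokens

bert_windows = windows_needed(DOC_TOKENS, 512)         # 12 windows
modernbert_windows = windows_needed(DOC_TOKENS, 8192)  # 1 window

print(f"BERT (512): {bert_windows} windows")
print(f"ModernBERT (8192): {modernbert_windows} window(s)")
```

Under these assumptions a BERT pipeline has to split the document into 12 pieces and merge the results, while ModernBERT can encode it in a single pass.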

mmBERT is a modern multi-lingual encoder-only model based on ModernBERT.