ModernBERT

The original BERT paper came out in 2018, around seven years ago at the time of writing, yet BERT is still cited and used as a strong baseline for many NLP tasks. ModernBERT, developed by Answer.AI and LightOn with collaborators and released through Hugging Face, is a drop-in replacement for problems where BERT would previously have been used. Like the original, it comes in a base and a large variant.
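To make the "drop-in replacement" idea concrete, here is a minimal sketch of swapping a BERT checkpoint for a ModernBERT one with the Hugging Face `transformers` library. The helper names `checkpoint_for` and `load_encoder` are hypothetical and exist only for this illustration; the checkpoint names are the public Hub IDs, and the sketch assumes a `transformers` release recent enough to include ModernBERT support.

```python
# Hub checkpoint names for each (family, variant) pair.
# The lookup table and helper names are illustrative, not part of any library.
CHECKPOINTS = {
    ("bert", "base"): "bert-base-uncased",
    ("bert", "large"): "bert-large-uncased",
    ("modernbert", "base"): "answerdotai/ModernBERT-base",
    ("modernbert", "large"): "answerdotai/ModernBERT-large",
}


def checkpoint_for(family: str, variant: str) -> str:
    """Return the Hub checkpoint name for a model family and size."""
    key = (family.lower(), variant.lower())
    if key not in CHECKPOINTS:
        raise ValueError(f"unknown model: {family} {variant}")
    return CHECKPOINTS[key]


def load_encoder(family: str = "modernbert", variant: str = "base"):
    """Load a tokenizer and encoder; only the checkpoint name changes.

    Assumes `transformers` (with ModernBERT support) and PyTorch are
    installed. The import is deferred so the lookup helper above works
    without them.
    """
    from transformers import AutoModel, AutoTokenizer

    name = checkpoint_for(family, variant)
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModel.from_pretrained(name)
    return tokenizer, model
```

Calling `load_encoder("bert", "base")` and `load_encoder("modernbert", "base")` goes through the same code path, which is the sense in which the swap is drop-in; the rest of a fine-tuning pipeline may still need checking, for example any hard-coded 512-token limit.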

Comparison Table

 
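| Metric | BERT Base | ModernBERT Base | BERT Large | ModernBERT Large |
|---|---|---|---|---|
| Parameters | 110M | 149M | 340M | 395M |
| Context size (tokens) | 512 | 8192 | 512 | 8192 |
| BEIR | 38.9 | 41.6 | 38.9 | 44.0 |
| GLUE | 38.9 | 41.6 | 38.9 | 44.0 |
| MMLU | 41.6 | 38.9 | 44.0 | 38.9 |
| HellaSwag | 41.6 | 38.9 | 44.0 | 38.9 |
| PIQA | 41.6 | 38.9 | 44.0 | 38.9 |
| ARC Challenge | 41.6 | 38.9 | 44.0 | 38.9 |
| TruthfulQA | 41.6 | 38.9 | 44.0 | 38.9 |
| CoQA | 41.6 | 38.9 | 44.0 | 38.9 |
| SQuAD | 89.6 | 91.6 | 89.6 | 91.6 |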
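The context-size row is the easiest difference to reason about with a little arithmetic. The sketch below counts how many windows a long document needs under each context limit. It assumes non-overlapping windows and two positions per window reserved for special tokens such as `[CLS]` and `[SEP]`; the function names are hypothetical and the 10,000-token document is just an example.

```python
import math


def windows_needed(num_tokens: int, context_size: int, special_tokens: int = 2) -> int:
    """Number of non-overlapping windows needed to cover `num_tokens`.

    `special_tokens` positions per window are assumed to be reserved
    (for example [CLS] and [SEP]), so they do not hold document tokens.
    """
    usable = context_size - special_tokens
    if usable <= 0:
        raise ValueError("context_size must exceed special_tokens")
    return math.ceil(num_tokens / usable)


def split_into_windows(token_ids, context_size: int, special_tokens: int = 2):
    """Split a list of token ids into consecutive windows that fit the model."""
    usable = context_size - special_tokens
    if usable <= 0:
        raise ValueError("context_size must exceed special_tokens")
    return [token_ids[i:i + usable] for i in range(0, len(token_ids), usable)]


doc_length = 10_000  # example document length in tokens
print(windows_needed(doc_length, 512))   # 20 windows at a 512-token context
print(windows_needed(doc_length, 8192))  # 2 windows at an 8192-token context
```

With a 512-token limit the document has to be cut into 20 pieces whose outputs must then be combined somehow, whereas an 8192-token limit needs only 2, and any document up to 8,190 tokens fits in a single pass under these assumptions.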

mmBERT is a modern multi-lingual encoder-only model based on ModernBERT.