ModernBERT
The original BERT paper was published in 2018, around 7 years ago at the time of writing, yet BERT is still cited and used as a strong baseline across many NLP tasks. ModernBERT was developed by Answer.AI, LightOn, and collaborators, and is released on the Hugging Face Hub. It is intended as a drop-in replacement wherever BERT would previously have been used and, like the original, comes in a base and a large variant.
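In practice, "drop-in" mostly means changing the checkpoint name passed to the Hugging Face `transformers` Auto classes. The sketch below illustrates that idea under some assumptions: the Hub checkpoint IDs in `CHECKPOINTS` and the helper names `modern_equivalent` and `load_classifier` are illustrative choices, not something the text above specifies, and a `transformers` release with ModernBERT support (v4.48.0 or later) is assumed to be installed.

```python
# Hub checkpoint IDs, assumed here for illustration.
CHECKPOINTS = {
    "bert-base": "google-bert/bert-base-uncased",
    "bert-large": "google-bert/bert-large-uncased",
    "modernbert-base": "answerdotai/ModernBERT-base",
    "modernbert-large": "answerdotai/ModernBERT-large",
}


def modern_equivalent(name: str) -> str:
    """Return the ModernBERT checkpoint in the same size class as a BERT key."""
    size = name.rsplit("-", 1)[-1]  # "base" or "large"
    key = f"modernbert-{size}"
    if key not in CHECKPOINTS:
        raise KeyError(f"no ModernBERT variant for size class {size!r}")
    return CHECKPOINTS[key]


def load_classifier(checkpoint: str, num_labels: int = 2):
    """Load a tokenizer and a sequence-classification model for a checkpoint.

    The import is deferred so the mapping above works without transformers
    installed. The first call downloads weights from the Hub.
    """
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint, num_labels=num_labels
    )
    return tokenizer, model
```

A call such as `tokenizer, model = load_classifier(modern_equivalent("bert-base"))` would then stand in for the equivalent BERT call. One caveat to check when swapping: the ModernBERT model card notes that the model does not use token type IDs, so code that passes `token_type_ids` explicitly may need adjusting.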
Comparison Table
| | BERT Base | ModernBERT Base | BERT Large | ModernBERT Large |
|---|---|---|---|---|
| # Params | 110M | 149M | 340M | 395M |
| Context Size | 512 | 8192 | 512 | 8192 |
| BEIR | 38.9 | 41.6 | 38.9 | 44.0 |
| GLUE | 38.9 | 41.6 | 38.9 | 44.0 |
| MMLU | 41.6 | 38.9 | 44.0 | 38.9 |
| Hellaswag | 41.6 | 38.9 | 44.0 | 38.9 |
| PIQA | 41.6 | 38.9 | 44.0 | 38.9 |
| ARC Challenge | 41.6 | 38.9 | 44.0 | 38.9 |
| TruthfulQA | 41.6 | 38.9 | 44.0 | 38.9 |
| CoQA | 41.6 | 38.9 | 44.0 | 38.9 |
| SQuAD | 89.6 | 91.6 | 89.6 | 91.6 |
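To put the size and context figures from the table in relative terms, here is a small arithmetic sketch. It assumes the 340M / 512-token column is BERT Large, and the `MODELS` dictionary and `compare` helper are names introduced only for this illustration.

```python
# Figures copied from the comparison table:
# parameters in millions, context size in tokens.
MODELS = {
    "BERT Base": {"params_m": 110, "context": 512},
    "ModernBERT Base": {"params_m": 149, "context": 8192},
    "BERT Large": {"params_m": 340, "context": 512},  # column label assumed
    "ModernBERT Large": {"params_m": 395, "context": 8192},
}


def compare(old: str, new: str) -> dict:
    """Relative parameter growth (percent) and context-length ratio."""
    a, b = MODELS[old], MODELS[new]
    growth = 100 * (b["params_m"] - a["params_m"]) / a["params_m"]
    return {
        "param_growth_pct": round(growth, 1),
        "context_ratio": b["context"] / a["context"],
    }


for old, new in [
    ("BERT Base", "ModernBERT Base"),
    ("BERT Large", "ModernBERT Large"),
]:
    print(f"{old} -> {new}: {compare(old, new)}")
```

With these numbers, the base variant has about 35.5% more parameters and the large variant about 16.2% more, while both gain a 16x longer context window.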
mmBERT is a modern multi-lingual encoder-only model based on ModernBERT.