
ModernBERT

The original BERT paper was published in 2018, around seven years ago at the time of writing, yet BERT is still cited and used as a strong baseline for many NLP tasks. ModernBERT, developed by Answer.AI and LightOn with collaborators and distributed through the Hugging Face Hub, is a drop-in replacement for problems where BERT would previously have been used. Like the original, it comes in a base and a large variant.
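As a rough illustration of what "drop-in replacement" can mean in practice, the sketch below swaps a BERT checkpoint name for a ModernBERT one before loading through the `transformers` Auto classes. The Hub IDs (`bert-base-uncased`, `bert-large-uncased`, `answerdotai/ModernBERT-base`, `answerdotai/ModernBERT-large`) and the helper names `modern_equivalent` and `load_encoder` are assumptions made for this example, not something stated in the text above. It also assumes a `transformers` release recent enough to include ModernBERT support.

```python
# Hypothetical mapping from BERT checkpoints to ModernBERT counterparts.
# The Hub IDs are assumptions for illustration, not taken from the text above.
MODERNBERT_FOR = {
    "bert-base-uncased": "answerdotai/ModernBERT-base",
    "bert-large-uncased": "answerdotai/ModernBERT-large",
}


def modern_equivalent(checkpoint: str) -> str:
    """Return the ModernBERT counterpart of a BERT checkpoint, or the input unchanged."""
    return MODERNBERT_FOR.get(checkpoint, checkpoint)


def load_encoder(checkpoint: str):
    """Load the tokenizer and encoder for a checkpoint (downloads weights on first use)."""
    # Imported lazily so the mapping above works without transformers installed.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModel.from_pretrained(checkpoint)
    return tokenizer, model
```

Calling `load_encoder(modern_equivalent("bert-base-uncased"))` would then return a ModernBERT tokenizer and model in place of the BERT ones. Downstream code that only relies on the generic Auto interface should need few changes, although any fine-tuned task head has to be retrained on the new encoder.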

Comparison Table

 
| | BERT Base | ModernBERT Base | BERT Large | ModernBERT Large |
|---|---|---|---|---|
| # Params | 110M | 149M | 340M | 395M |
| Context Size (tokens) | 512 | 8192 | 512 | 8192 |
| BEIR | 38.9 | 41.6 | 38.9 | 44.0 |
| MLDR OOD | 23.9 | 27.4 | 23.3 | 34.3 |
| MLDR ID | 32.2 | 44.0 | 31.7 | 48.6 |
| BEIR (ColBERT) | 49.0 | 51.3 | 49.5 | 52.4 |
| MLDR OOD (ColBERT) | 28.1 | 80.2 | 28.5 | 80.4 |
| GLUE | 84.7 | 88.5 | 85.2 | 90.4 |
| CSN | 41.2 | 56.4 | 41.6 | 59.5 |
| SQA | 59.5 | 73.6 | 60.8 | 83.9 |
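To make the gaps in the comparison table easier to read, the short sketch below computes the absolute score difference between ModernBERT Base and BERT Base on each benchmark. The numbers are copied from the table; the names `BASE_SCORES` and `score_gains` are illustrative only.

```python
# Scores copied from the comparison table: (BERT Base, ModernBERT Base).
BASE_SCORES = {
    "BEIR": (38.9, 41.6),
    "MLDR OOD": (23.9, 27.4),
    "MLDR ID": (32.2, 44.0),
    "BEIR (ColBERT)": (49.0, 51.3),
    "MLDR OOD (ColBERT)": (28.1, 80.2),
    "GLUE": (84.7, 88.5),
    "CSN": (41.2, 56.4),
    "SQA": (59.5, 73.6),
}


def score_gains(scores):
    """Return ModernBERT minus BERT per benchmark, rounded to one decimal."""
    return {name: round(modern - bert, 1) for name, (bert, modern) in scores.items()}


gains = score_gains(BASE_SCORES)

# Print benchmarks from largest to smallest gain.
for name, gain in sorted(gains.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<20} +{gain:.1f}")
```

The largest difference by far is on the long-document ColBERT retrieval row (MLDR OOD), which is plausibly related to the jump in context size from 512 to 8192 tokens.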

mmBERT is a modern multilingual encoder-only model based on ModernBERT.