# ModernBERT

The [original BERT paper](https://arxiv.org/abs/1810.04805) came out in 2018, around 7 years ago at the time of writing, yet BERT is still cited and used as a strong baseline in a number of NLP tasks. [ModernBERT](https://huggingface.co/blog/modernbert) was developed by Answer.AI and LightOn with collaborators and announced on the Hugging Face blog. It is intended as a drop-in replacement for problems where BERT might previously have been used and, like the original, comes in a base and a large variant.
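As a rough illustration of what "drop-in" means in practice, the sketch below loads either BERT or ModernBERT through the same `transformers` Auto classes, so that only the checkpoint name changes. It assumes a recent `transformers` release with ModernBERT support and PyTorch installed, and that the checkpoints are published under the Hub IDs listed in `MODEL_IDS`; the helper names `load_encoder` and `predict_mask` are illustrative, not part of any library.

```python
# Hub checkpoint names (assumed; check the model cards before relying on them).
MODEL_IDS = {
    "bert-base": "google-bert/bert-base-uncased",
    "modernbert-base": "answerdotai/ModernBERT-base",
    "modernbert-large": "answerdotai/ModernBERT-large",
}


def load_encoder(name="modernbert-base"):
    """Load tokenizer and masked-LM model for one of the MODEL_IDS keys."""
    model_id = MODEL_IDS[name]
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import AutoModelForMaskedLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForMaskedLM.from_pretrained(model_id)
    return tokenizer, model


def predict_mask(text, name="modernbert-base"):
    """Return the most likely token(s) for each [MASK] position in `text`."""
    import torch

    tokenizer, model = load_encoder(name)
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    # Column indices of the mask token(s) in the single input sequence.
    mask_positions = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
    predicted_ids = logits[0, mask_positions].argmax(dim=-1)
    return tokenizer.decode(predicted_ids)


if __name__ == "__main__":
    # Downloads the checkpoint on first run; the same call works for "bert-base".
    print(predict_mask("The capital of France is [MASK]."))
```

Switching between the two model families is then a matter of passing a different key, which is the sense in which existing BERT fine-tuning code can usually be reused.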

ModernBERT also outperforms [DeBERTa-v3-base](https://huggingface.co/microsoft/deberta-v3-base), which has been a favourite of NLP practitioners for a few years thanks to its few-shot and zero-shot capabilities.

## Comparison Table

<table border="1" id="bkmrk-bert-base-modernbert" style="border-collapse: collapse; width: 100%;">
<thead>
<tr><th scope="col"></th><th scope="col">BERT Base</th><th scope="col">ModernBERT Base</th><th scope="col">BERT Large</th><th scope="col">ModernBERT Large</th></tr>
</thead>
<tbody>
<tr><th scope="row"># Params</th><td>110M</td><td>149M</td><td>340M</td><td>395M</td></tr>
<tr><th scope="row">Context Size</th><td>512</td><td>8192</td><td>512</td><td>8192</td></tr>
<tr><th scope="row">BEIR</th><td>38.9</td><td>41.6</td><td>38.9</td><td>44.0</td></tr>
<tr><th scope="row"><a href="https://infiniflow.org/blog/multi-way-retrieval-evaluations-on-infinity-database">MLDR</a><sub>OOD</sub></th><td>23.9</td><td>27.4</td><td>23.3</td><td>34.3</td></tr>
<tr><th scope="row">MLDR<sub>ID</sub></th><td>32.2</td><td>44.0</td><td>31.7</td><td>48.6</td></tr>
<tr><th scope="row">BEIR (ColBERT)</th><td>49.0</td><td>51.3</td><td>49.5</td><td>52.4</td></tr>
<tr><th scope="row">MLDR<sub>OOD</sub> (ColBERT)</th><td>28.1</td><td>80.2</td><td>28.5</td><td>80.4</td></tr>
<tr><th scope="row">GLUE</th><td>84.7</td><td>88.5</td><td>85.2</td><td>90.4</td></tr>
<tr><th scope="row">CSN</th><td>41.2</td><td>56.4</td><td>41.6</td><td>59.5</td></tr>
<tr><th scope="row">SQA</th><td>59.5</td><td>73.6</td><td>60.8</td><td>83.9</td></tr>
</tbody>
</table>

OOD = out of domain; ID = in domain; CSN = CodeSearchNet; SQA = StackQA.
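To read the table at a glance, the gap between each ModernBERT variant and its BERT counterpart can be computed directly from the scores. The following is a minimal sketch: `SCORES` and `gains` are illustrative names, and the numbers are simply copied from the table above.

```python
# Scores per benchmark: (BERT Base, ModernBERT Base, BERT Large, ModernBERT Large)
SCORES = {
    "BEIR": (38.9, 41.6, 38.9, 44.0),
    "MLDR_OOD": (23.9, 27.4, 23.3, 34.3),
    "MLDR_ID": (32.2, 44.0, 31.7, 48.6),
    "BEIR (ColBERT)": (49.0, 51.3, 49.5, 52.4),
    "MLDR_OOD (ColBERT)": (28.1, 80.2, 28.5, 80.4),
    "GLUE": (84.7, 88.5, 85.2, 90.4),
    "CSN": (41.2, 56.4, 41.6, 59.5),
    "SQA": (59.5, 73.6, 60.8, 83.9),
}


def gains(scores):
    """Absolute point gain of ModernBERT over BERT, per benchmark and size."""
    result = {}
    for benchmark, (bert_b, modern_b, bert_l, modern_l) in scores.items():
        result[benchmark] = {
            "base": round(modern_b - bert_b, 1),
            "large": round(modern_l - bert_l, 1),
        }
    return result


if __name__ == "__main__":
    for benchmark, gain in gains(SCORES).items():
        print(f"{benchmark:<20} base +{gain['base']:<5} large +{gain['large']}")
```

The largest gaps are on the long-document MLDR rows, which is consistent with the jump in context size from 512 to 8192 tokens.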

[mmBERT](https://arxiv.org/abs/2509.06888) is a modern multilingual encoder-only model based on ModernBERT.