MoritzLaurer posted an update 15 days ago
Quite excited by the ModernBERT release! Small at 0.15B/0.4B parameters, 2T tokens of modern pre-training data and a tokenizer that cover code, an 8k context window, and great efficiency for embeddings & classification!

This will probably be the basis for many future SOTA encoders! And I can finally stop using DeBERTaV3 from 2021 :D

Congrats @answerdotai, @LightOnIO, and collaborators like @tomaarsen!

Paper and models here 👇 https://huggingface.co/collections/answerdotai/modernbert-67627ad707a4acbf33c41deb
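
For anyone who wants a quick smoke test, here is a minimal sketch using the transformers fill-mask pipeline, assuming the answerdotai/ModernBERT-base checkpoint from the collection above and a recent transformers release that includes ModernBERT support:

```python
from transformers import pipeline

# ModernBERT is a masked language model, so fill-mask is the quickest sanity
# check; the checkpoint name is taken from the collection linked above.
pipe = pipeline("fill-mask", model="answerdotai/ModernBERT-base")

for candidate in pipe("The capital of France is [MASK]."):
    print(candidate["token_str"], round(candidate["score"], 3))
```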

Any plans to train a version of your zero-shot models on ModernBERT? I'm finding that ModernBERT gives a big boost in speed but a slight drop in performance vs. DeBERTa when I fine-tune it. I'm not sure whether the drop is because your zero-shot models were such a strong foundation for transfer learning, or because of the strength of the DeBERTa architecture on NLI.
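
For context, a sketch of the zero-shot setup being discussed, using one of MoritzLaurer's existing DeBERTa-based NLI checkpoints (the model ID below is an example, not confirmed by the thread; a ModernBERT-based checkpoint, once trained, would be a drop-in swap):

```python
from transformers import pipeline

# Example zero-shot classifier built on an NLI model; the model ID is an
# assumption here, and a ModernBERT-based version could replace it unchanged.
classifier = pipeline(
    "zero-shot-classification",
    model="MoritzLaurer/deberta-v3-large-zeroshot-v2.0",
)

result = classifier(
    "The new encoder cuts inference latency roughly in half.",
    candidate_labels=["model efficiency", "politics", "sports"],
)
print(result["labels"][0], round(result["scores"][0], 3))
```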