---
license: mit
datasets:
- dleemiller/wiki-sim
- sentence-transformers/stsb
language:
- en
metrics:
- spearmanr
- pearsonr
base_model:
- answerdotai/ModernBERT-base
pipeline_tag: text-classification
library_name: sentence-transformers
tags:
- cross-encoder
- modernbert
- sts
- stsb
- stsbenchmark-sts
model-index:
- name: CrossEncoder based on answerdotai/ModernBERT-base
  results:
  - task:
      type: semantic-similarity
      name: Semantic Similarity
    dataset:
      name: sts test
      type: sts-test
    metrics:
    - type: pearson_cosine
      value: 0.9162245947821821
      name: Pearson Cosine
    - type: spearman_cosine
      value: 0.9121555789491528
      name: Spearman Cosine
  - task:
      type: semantic-similarity
      name: Semantic Similarity
    dataset:
      name: sts dev
      type: sts-dev
    metrics:
    - type: pearson_cosine
      value: 0.9260833551026787
      name: Pearson Cosine
    - type: spearman_cosine
      value: 0.9236030687487745
      name: Spearman Cosine
---

# ModernBERT Cross-Encoder: Semantic Similarity (STS)

Cross-encoders are high-performing encoder models that compare two texts and output a similarity score between 0 and 1. I've found the `cross-encoder/stsb-roberta-large` model to be very useful for building evaluators for LLM outputs. Cross-encoders are simple to use, fast, and very accurate.

Like many people, I was excited about the architecture and training improvements in ModernBERT (`answerdotai/ModernBERT-base`), so I've applied it to the STS-B cross-encoder, which is a very handy model. Additionally, I've added pretraining on a much larger semi-synthetic dataset, `dleemiller/wiki-sim`, that targets this kind of objective.

The inference efficiency, expanded context, and simplicity make this a really nice platform for an evaluator model.

---

## Features

- **High performing:** Achieves **Pearson: 0.9162** and **Spearman: 0.9122** on the STS-Benchmark test set.
- **Efficient architecture:** Based on the ModernBERT-base design (149M parameters), offering faster inference than larger cross-encoders such as `stsb-roberta-large`.
- **Extended context length:** Processes sequences up to 8192 tokens, great for LLM output evals.
- **Diversified training:** Pretrained on `dleemiller/wiki-sim` and fine-tuned on `sentence-transformers/stsb`.

---

## Performance

| Model                     | STS-B Test Pearson | STS-B Test Spearman | Context Length | Parameters | Speed      |
|---------------------------|--------------------|---------------------|----------------|------------|------------|
| `ModernCE-large-sts`      | **0.9256**         | **0.9215**          | **8192**       | 395M       | **Medium** |
| `ModernCE-base-sts`       | **0.9162**         | **0.9122**          | **8192**       | 149M       | **Fast**   |
| `stsb-roberta-large`      | 0.9147             | -                   | 512            | 355M       | Slow       |
| `stsb-distilroberta-base` | 0.8792             | -                   | 512            | 82M        | Fast       |

---

## Usage

To use ModernCE for semantic similarity tasks, you can load the model with the Hugging Face `sentence-transformers` library:

```python
from sentence_transformers import CrossEncoder

# Load ModernCE model
model = CrossEncoder("dleemiller/ModernCE-base-sts")

# Predict similarity scores for sentence pairs
sentence_pairs = [
    ("It's a wonderful day outside.", "It's so sunny today!"),
    ("It's a wonderful day outside.", "He drove to work earlier."),
]
scores = model.predict(sentence_pairs)

print(scores)  # Outputs: array([0.9184, 0.0123], dtype=float32)
```

### Output

The model returns similarity scores in the range `[0, 1]`, where higher scores indicate stronger semantic similarity.

---

## Training Details

### Pretraining

The model was pretrained on the `pair-score-sampled` subset of the [`dleemiller/wiki-sim`](https://huggingface.co/datasets/dleemiller/wiki-sim) dataset. This dataset provides diverse sentence pairs with semantic similarity scores, helping the model build a robust understanding of relationships between sentences.

- **Classifier Dropout:** a somewhat large classifier dropout of 0.3, to reduce overreliance on teacher scores.
- **Objective:** STS-B scores from `cross-encoder/stsb-roberta-large`.

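
The teacher-score objective above can be illustrated with a minimal sketch of how sentence pairs might be labeled by a teacher model before pretraining. This is not the pipeline used to build `dleemiller/wiki-sim`; the names `label_with_teacher` and `toy_teacher` are hypothetical, and the toy teacher is only a stand-in so the example runs without downloading a model.

```python
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]
# Any callable that maps a batch of text pairs to one score per pair.
ScoreFn = Callable[[Sequence[Pair]], Sequence[float]]


def label_with_teacher(
    pairs: Sequence[Pair],
    teacher_score_fn: ScoreFn,
    batch_size: int = 32,
) -> List[Tuple[str, str, float]]:
    """Attach a teacher similarity score to every (text_a, text_b) pair."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    labeled: List[Tuple[str, str, float]] = []
    for start in range(0, len(pairs), batch_size):
        batch = list(pairs[start:start + batch_size])
        scores = teacher_score_fn(batch)
        if len(scores) != len(batch):
            raise ValueError("teacher returned a different number of scores than pairs")
        for (text_a, text_b), score in zip(batch, scores):
            # Clamp into [0, 1] so labels match the student's output range.
            labeled.append((text_a, text_b, min(1.0, max(0.0, float(score)))))
    return labeled


def toy_teacher(batch: Sequence[Pair]) -> List[float]:
    """Hypothetical stand-in teacher: Jaccard overlap of lowercase tokens."""
    scores = []
    for text_a, text_b in batch:
        tokens_a = set(text_a.lower().split())
        tokens_b = set(text_b.lower().split())
        union = tokens_a | tokens_b
        scores.append(len(tokens_a & tokens_b) / len(union) if union else 0.0)
    return scores


# With sentence-transformers installed, a real teacher could be passed instead
# (assumption: this mirrors, but is not, the author's setup):
#   from sentence_transformers import CrossEncoder
#   teacher = CrossEncoder("cross-encoder/stsb-roberta-large")
#   labeled = label_with_teacher(pairs, teacher.predict)
example_pairs = [
    ("It's a wonderful day outside.", "It's so sunny today!"),
    ("It's a wonderful day outside.", "He drove to work earlier."),
]
print(label_with_teacher(example_pairs, toy_teacher))
```

Passing the teacher as a plain callable keeps the labeling loop independent of any particular model, so the same function works with a real cross-encoder or a cheap stub for testing.
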
### Fine-Tuning

Fine-tuning was performed on the [`sentence-transformers/stsb`](https://huggingface.co/datasets/sentence-transformers/stsb) dataset.

### Validation Results

After fine-tuning, the model achieved the following on the STS-B test set:

- **Pearson Correlation:** 0.9162
- **Spearman Correlation:** 0.9122

On the STS-B dev set, it reached Pearson 0.9261 and Spearman 0.9236.

---

## Model Card

- **Architecture:** ModernBERT-base
- **Tokenizer:** ModernBERT tokenizer, inherited from `answerdotai/ModernBERT-base`.
- **Pretraining Data:** `dleemiller/wiki-sim` (`pair-score-sampled` subset)
- **Fine-Tuning Data:** `sentence-transformers/stsb`

---

## Thank You

Thanks to the Answer.AI team for providing the ModernBERT models, and to the Sentence Transformers team for their leadership in transformer encoder models.

---

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{moderncestsb2025,
  author = {Miller, D. Lee},
  title = {ModernCE STS: An STS cross encoder model},
  year = {2025},
  publisher = {Hugging Face Hub},
  url = {https://huggingface.co/dleemiller/ModernCE-base-sts},
}
```

---

## License

This model is licensed under the [MIT License](LICENSE).
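
## Appendix: Pearson and Spearman Correlation

The Pearson and Spearman figures reported in this card compare predicted scores against gold similarity labels. The sketch below shows how the two correlations are defined, using only the standard library. It is not the evaluation code used for this model, the function names are hypothetical, and the numbers in the example are made up for illustration.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation: covariance divided by the product of spreads."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up gold labels and predictions, for illustration only.
gold = [0.0, 0.25, 0.5, 0.75, 1.0]
predicted = [0.1, 0.2, 0.5, 0.4, 0.9]
print(f"pearson={pearson(gold, predicted):.4f} spearman={spearman(gold, predicted):.4f}")
```

Pearson rewards predictions that are linearly related to the gold scores, while Spearman only checks that pairs are ordered the same way, which is why the two numbers usually differ slightly.
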