---
language:
- de
tags:
- cross-encoder
widget:
- text: "Was sind Lamas. Das Lama (Lama glama) ist eine Art der Kamele. Es ist in den südamerikanischen Anden verbreitet und eine vom Guanako abstammende Haustierform."
example_title: "Example Query / Paragraph"
license: apache-2.0
metrics:
- accuracy
- f1
- precision
- recall
---
# cross-encoder-mmarco-german-distilbert-base
## Model description
This model is a [cross-encoder](https://www.sbert.net/examples/training/cross-encoder/README.html) fine-tuned on the [MMARCO dataset](https://huggingface.co/datasets/unicamp-dl/mmarco), a machine-translated version of the MS MARCO dataset.
We use [distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased) as the base model for fine-tuning.
Model inputs are pairs of the form `<query, positive_paragraph>` labeled 1 or `<query, negative_paragraph>` labeled 0.
The model was trained for one epoch; a rough training sketch is shown below.
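For reference, a fine-tuning run of this kind could look like the following sketch, using the classic `CrossEncoder.fit` API from Sentence-Transformers. The example pairs, batch size, and warmup steps are illustrative assumptions and do not come from the actual training setup:
```python
from torch.utils.data import DataLoader
from sentence_transformers import CrossEncoder, InputExample

# Illustrative training pairs: label 1 for a relevant paragraph, 0 for an irrelevant one.
train_samples = [
    InputExample(texts=["Was sind Lamas?",
                        "Das Lama (Lama glama) ist eine Art der Kamele."], label=1),
    InputExample(texts=["Was sind Lamas?",
                        "Die Hauptstadt von Deutschland ist Berlin."], label=0),
]

# num_labels=1 yields a single relevance score per <query, paragraph> pair.
model = CrossEncoder("distilbert-base-multilingual-cased", num_labels=1)

model.fit(
    train_dataloader=DataLoader(train_samples, shuffle=True, batch_size=16),  # batch size is an assumption
    epochs=1,          # the card states one training epoch
    warmup_steps=100,  # assumed value, not from the card
)
model.save("cross-encoder-mmarco-german-distilbert-base")
```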
## Model usage
The cross-encoder model can be used like this:
```python
from sentence_transformers import CrossEncoder

# Replace 'model_name' with the Hugging Face ID of this model.
model = CrossEncoder('model_name')

# Each input is a (query, paragraph) pair; the output is one relevance score per pair.
scores = model.predict([('Query 1', 'Paragraph 1'), ('Query 2', 'Paragraph 2')])
```
The model predicts a relevance score for each of the pairs `('Query 1', 'Paragraph 1')` and `('Query 2', 'Paragraph 2')`.
For more details on using cross-encoder models, see the [Sentence-Transformers documentation](https://www.sbert.net/).
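A typical application is re-ranking candidate paragraphs for a single query. The following sketch (with made-up paragraphs and the same `'model_name'` placeholder) scores each candidate and sorts by predicted relevance:
```python
from sentence_transformers import CrossEncoder

model = CrossEncoder('model_name')  # placeholder as above

query = "Was sind Lamas?"
candidates = [
    "Die Hauptstadt von Deutschland ist Berlin.",
    "Das Lama (Lama glama) ist eine Art der Kamele.",
]

# Score every (query, paragraph) pair, then list candidates by descending score.
scores = model.predict([(query, paragraph) for paragraph in candidates])
for score, paragraph in sorted(zip(scores, candidates), reverse=True):
    print(f"{score:.4f}  {paragraph}")
```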
## Model performance
The model was evaluated on 2,000 held-out paragraphs from the dataset. All metrics are reported in percent.
| Accuracy (%) | F1-Score (%) | Precision (%) | Recall (%) |
| --- | --- | --- | --- |
| 89.70 | 86.82 | 86.82 | 93.50 |
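The card does not spell out the exact evaluation procedure. A minimal sketch of how such metrics could be computed, assuming binary relevance labels and a 0.5 threshold on the scores (both assumptions), might look like this:
```python
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sentence_transformers import CrossEncoder

model = CrossEncoder('model_name')  # placeholder as above

# eval_pairs: (query, paragraph) tuples; eval_labels: 1 = relevant, 0 = not relevant.
eval_pairs = [("Was sind Lamas?", "Das Lama (Lama glama) ist eine Art der Kamele.")]
eval_labels = [1]

scores = model.predict(eval_pairs)
predictions = (scores > 0.5).astype(int)  # assumed decision threshold

print("Accuracy :", accuracy_score(eval_labels, predictions))
print("F1-Score :", f1_score(eval_labels, predictions))
print("Precision:", precision_score(eval_labels, predictions))
print("Recall   :", recall_score(eval_labels, predictions))
```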