---
language:
- de
tags:
- cross-encoder
widget:
- text: "Was sind Lamas. Das Lama (Lama glama) ist eine Art der Kamele. Es ist in den südamerikanischen Anden verbreitet und eine vom Guanako abstammende Haustierform."
example_title: "Example Query / Paragraph"
license: apache-2.0
metrics:
- accuracy
- f1
- precision
- recall
---
# cross-encoder-mmarco-german-distilbert-base
## Model description
This model is a [cross-encoder](https://www.sbert.net/examples/training/cross-encoder/README.html) fine-tuned on the [MMARCO dataset](https://huggingface.co/datasets/unicamp-dl/mmarco), a machine-translated version of the MS MARCO dataset.
We use [distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased) as the base model for fine-tuning.
Model inputs are pairs of the form `<query, positive_paragraph>` labeled 1 or `<query, negative_paragraph>` labeled 0.
The model was trained for one epoch; a rough training sketch is shown below.
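For reference, a fine-tuning run of this kind could look like the following sketch, using the classic `CrossEncoder.fit` API from Sentence-Transformers. The example pairs, batch size, and warmup steps are illustrative assumptions and do not come from the actual training setup:
```python
from torch.utils.data import DataLoader
from sentence_transformers import CrossEncoder, InputExample

# Illustrative training pairs: label 1 for a relevant paragraph, 0 for an irrelevant one.
train_samples = [
    InputExample(texts=["Was sind Lamas?",
                        "Das Lama (Lama glama) ist eine Art der Kamele."], label=1),
    InputExample(texts=["Was sind Lamas?",
                        "Die Hauptstadt von Deutschland ist Berlin."], label=0),
]

# num_labels=1 yields a single relevance score per <query, paragraph> pair.
model = CrossEncoder("distilbert-base-multilingual-cased", num_labels=1)

model.fit(
    train_dataloader=DataLoader(train_samples, shuffle=True, batch_size=16),  # batch size is an assumption
    epochs=1,          # the card states one training epoch
    warmup_steps=100,  # assumed value, not from the card
)
model.save("cross-encoder-mmarco-german-distilbert-base")
```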
## Model usage
The cross-encoder model can be used like this:
```python
from sentence_transformers import CrossEncoder

# Replace 'model_name' with the Hugging Face ID of this model.
model = CrossEncoder('model_name')

# Each input is a (query, paragraph) pair; the output is one relevance score per pair.
scores = model.predict([('Query 1', 'Paragraph 1'), ('Query 2', 'Paragraph 2')])
```
The model predicts a relevance score for each of the pairs `('Query 1', 'Paragraph 1')` and `('Query 2', 'Paragraph 2')`.
For more details on using cross-encoder models, see the [Sentence-Transformers documentation](https://www.sbert.net/).
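A typical application is re-ranking candidate paragraphs for a single query. The following sketch (with made-up paragraphs and the same `'model_name'` placeholder) scores each candidate and sorts by predicted relevance:
```python
from sentence_transformers import CrossEncoder

model = CrossEncoder('model_name')  # placeholder as above

query = "Was sind Lamas?"
candidates = [
    "Die Hauptstadt von Deutschland ist Berlin.",
    "Das Lama (Lama glama) ist eine Art der Kamele.",
]

# Score every (query, paragraph) pair, then list candidates by descending score.
scores = model.predict([(query, paragraph) for paragraph in candidates])
for score, paragraph in sorted(zip(scores, candidates), reverse=True):
    print(f"{score:.4f}  {paragraph}")
```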
## Model performance
The model was evaluated on 2,000 held-out paragraphs from the dataset. All metrics are reported in percent.
| Accuracy (%) | F1-Score (%) | Precision (%) | Recall (%) |
| --- | --- | --- | --- |
| 89.70 | 86.82 | 86.82 | 93.50 |
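The card does not spell out the exact evaluation procedure. A minimal sketch of how such metrics could be computed, assuming binary relevance labels and a 0.5 threshold on the scores (both assumptions), might look like this:
```python
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sentence_transformers import CrossEncoder

model = CrossEncoder('model_name')  # placeholder as above

# eval_pairs: (query, paragraph) tuples; eval_labels: 1 = relevant, 0 = not relevant.
eval_pairs = [("Was sind Lamas?", "Das Lama (Lama glama) ist eine Art der Kamele.")]
eval_labels = [1]

scores = model.predict(eval_pairs)
predictions = (scores > 0.5).astype(int)  # assumed decision threshold

print("Accuracy :", accuracy_score(eval_labels, predictions))
print("F1-Score :", f1_score(eval_labels, predictions))
print("Precision:", precision_score(eval_labels, predictions))
print("Recall   :", recall_score(eval_labels, predictions))
```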