ml6team/cross-encoder-mmarco-german-distilbert-base


mrchtr committed on Apr 27, 2022

Commit 26e737a · 1 Parent(s): 04382e8

Update model card

Files changed (1)

README.md +42 -0

README.md ADDED

	@@ -0,0 +1,42 @@

+---
+language:
+- de
+tags:
+- cross-encoder
+widget:
+- text: "Was sind Lamas. Das Lama (Lama glama) ist eine Art der Kamele. Es ist in den südamerikanischen Anden verbreitet und eine vom Guanako abstammende Haustierform."
+  example_title: "Example Query / Paragraph"
+license: apache-2.0
+metrics:
+- Rouge-Score
+---
+# cross-encoder-mmarco-german-distilbert-base
+## Model description
+This model is a [cross-encoder](https://www.sbert.net/examples/training/cross-encoder/README.html) fine-tuned on the [MMARCO dataset](https://huggingface.co/datasets/unicamp-dl/mmarco), a machine-translated version of the MS MARCO dataset.
+The base model for fine-tuning is [distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased).
+Each training sample is a pair, either
+`<query, positive_paragraph>` labeled 1 or `<query, negative_paragraph>` labeled 0.
+The model was trained for 1 epoch.
+## Model usage
+The cross-encoder model can be used like this:
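+```python
+from sentence_transformers import CrossEncoder
+model = CrossEncoder('model_name')  # replace 'model_name' with this model's Hub ID or a local path
+scores = model.predict([('Query 1', 'Paragraph 1'), ('Query 2', 'Paragraph 2')])  # one score per (query, paragraph) pair
+```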
+The model will predict scores for the pairs `('Query 1', 'Paragraph 1')` and `('Query 2', 'Paragraph 2')`.
+For more details on using cross-encoder models, see the [Sentence-Transformers](https://www.sbert.net/) documentation.
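A common use of a cross-encoder like this one is re-ranking candidate paragraphs for a single query. The following is a minimal sketch and not part of the committed model card: `rerank` is a hypothetical helper, and `predict` stands for any callable that takes a list of `(query, paragraph)` pairs and returns one score per pair, as `CrossEncoder.predict` does.

```python
from typing import Callable, List, Sequence, Tuple


def rerank(
    query: str,
    paragraphs: Sequence[str],
    predict: Callable[[List[Tuple[str, str]]], Sequence[float]],
) -> List[Tuple[str, float]]:
    """Score every (query, paragraph) pair and sort by descending score.

    `rerank` is a hypothetical helper; `predict` is assumed to behave like
    `CrossEncoder.predict`, returning one score per input pair.
    """
    # Pair the single query with each candidate paragraph.
    pairs = [(query, paragraph) for paragraph in paragraphs]
    scores = predict(pairs)
    # Highest score first, i.e. most relevant paragraph first.
    ranked = sorted(zip(paragraphs, scores), key=lambda item: item[1], reverse=True)
    return [(paragraph, float(score)) for paragraph, score in ranked]
```

With a model loaded as in the README snippet, `rerank('Was sind Lamas', candidates, model.predict)` would return the candidate paragraphs ordered from most to least relevant, where `candidates` is any list of German paragraphs.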
+## Model performance
+The model was evaluated on 2000 evaluation paragraphs from the dataset.
+| Accuracy | F1-Score | Precision | Recall |
+| --- | --- | --- | --- |
+| 89.70 | 86.82 | 86.82 | 93.50 |
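The four columns are standard binary-classification metrics. As a rough sketch of how such values can be derived from gold labels and model scores: the helper name `binary_metrics` and the 0.5 decision threshold are assumptions, since the model card does not state how scores were turned into class predictions.

```python
from typing import Dict, Sequence


def binary_metrics(
    labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 for binary labels.

    The 0.5 threshold is an assumption, not taken from the model card.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")

    # Turn continuous scores into 0/1 predictions.
    predictions = [1 if score >= threshold else 0 for score in scores]

    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "f1": f1, "precision": precision, "recall": recall}
```

The function returns fractions in the range 0 to 1; multiply by 100 to compare with percentage-style figures.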