---
language:
- de
tags:
- cross-encoder
widget:
- text: "Was sind Lamas? Das Lama (Lama glama) ist eine Art der Kamele. Es ist in den südamerikanischen Anden verbreitet und eine vom Guanako abstammende Haustierform."
  example_title: "Example Query / Paragraph"
license: apache-2.0
metrics:
- accuracy
- f1
- precision
- recall
---

# cross-encoder-mmarco-german-distilbert-base

## Model description

This model is a [cross-encoder](https://www.sbert.net/examples/training/cross-encoder/README.html) fine-tuned on the [MMARCO dataset](https://huggingface.co/datasets/unicamp-dl/mmarco), the machine-translated version of the MS MARCO dataset.

As the base model for fine-tuning, we use [distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased).

Model input samples are pairs of the form `<query, positive_paragraph>` with label 1 or `<query, negative_paragraph>` with label 0.

The model was trained for 1 epoch.
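
Below is a minimal sketch of what such a fine-tuning setup could look like with the Sentence-Transformers `CrossEncoder` API. Only the base model, the label scheme, and the single training epoch come from the description above; the sample pairs, batch size, and warmup steps are illustrative assumptions.

```python
from torch.utils.data import DataLoader
from sentence_transformers import CrossEncoder, InputExample

# Illustrative (query, paragraph) pairs; the real training data comes from
# the German portion of MMARCO.
train_samples = [
    InputExample(texts=["Was sind Lamas?",
                        "Das Lama (Lama glama) ist eine Art der Kamele."], label=1),
    InputExample(texts=["Was sind Lamas?",
                        "Berlin ist die Hauptstadt Deutschlands."], label=0),
]
train_dataloader = DataLoader(train_samples, shuffle=True, batch_size=16)  # assumed batch size

# distilbert-base-multilingual-cased as the base model, one relevance score per pair.
model = CrossEncoder("distilbert-base-multilingual-cased", num_labels=1)
model.fit(train_dataloader=train_dataloader, epochs=1, warmup_steps=100)  # assumed warmup steps
model.save("cross-encoder-mmarco-german-distilbert-base")
```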

## Model usage

The cross-encoder model can be used like this:

```python
from sentence_transformers import CrossEncoder

# Replace 'model_name' with this model's ID on the Hugging Face Hub.
model = CrossEncoder('model_name')

# Each input is a (query, paragraph) pair; predict returns one relevance score per pair.
scores = model.predict([('Query 1', 'Paragraph 1'), ('Query 2', 'Paragraph 2')])
```

The model predicts a relevance score for each of the pairs `('Query 1', 'Paragraph 1')` and `('Query 2', 'Paragraph 2')`.
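
In a retrieval pipeline, these scores are typically used to re-rank candidate paragraphs for a query. The following is a minimal sketch of such a re-ranking step; the query, the candidate paragraphs, and the `model_name` placeholder are illustrative.

```python
import numpy as np
from sentence_transformers import CrossEncoder

model = CrossEncoder('model_name')  # placeholder for this model's Hub ID

query = "Was sind Lamas?"
candidate_paragraphs = [
    "Das Lama (Lama glama) ist eine Art der Kamele.",
    "Berlin ist die Hauptstadt Deutschlands.",
]

# Score every (query, paragraph) pair and print paragraphs by descending relevance.
scores = model.predict([(query, paragraph) for paragraph in candidate_paragraphs])
for idx in np.argsort(scores)[::-1]:
    print(f"{scores[idx]:.4f}\t{candidate_paragraphs[idx]}")
```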

For more details on the usage of cross-encoder models, have a look at the [Sentence-Transformers documentation](https://www.sbert.net/).

## Model performance

The model was evaluated on 2000 evaluation paragraphs from the dataset.

| Accuracy (%) | F1-Score (%) | Precision (%) | Recall (%) |
| --- | --- | --- | --- |
| 89.70 | 86.82 | 86.82 | 93.50 |
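
The exact evaluation script is not published; the following is a plausible sketch of how such classification metrics could be computed from the cross-encoder's scores. The 0.5 decision threshold and the tiny evaluation set are assumptions for illustration only.

```python
import numpy as np
from sentence_transformers import CrossEncoder
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

model = CrossEncoder('model_name')  # placeholder for this model's Hub ID

# Illustrative evaluation set: (query, paragraph) pairs with gold relevance labels.
eval_pairs = [
    ("Was sind Lamas?", "Das Lama (Lama glama) ist eine Art der Kamele."),
    ("Was sind Lamas?", "Berlin ist die Hauptstadt Deutschlands."),
]
labels = np.array([1, 0])

scores = model.predict(eval_pairs)
predictions = (scores > 0.5).astype(int)  # assumed decision threshold

print("Accuracy :", accuracy_score(labels, predictions))
print("F1-Score :", f1_score(labels, predictions))
print("Precision:", precision_score(labels, predictions))
print("Recall   :", recall_score(labels, predictions))
```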