|
--- |
|
license: apache-2.0 |
|
base_model: Geotrend/distilbert-base-nl-cased |
|
model-index: |
|
- name: distilbert-base-nl-cased-finetuned-squad |
|
results: [] |
|
language: |
|
- nl |
|
pipeline_tag: question-answering |
|
--- |
|
|
|
<!-- This model card has been generated automatically according to the information the Trainer had access to. You |
|
should probably proofread and complete it, then remove this comment. --> |
|
|
|
# distilbert-base-nl-cased-finetuned-squad |
|
|
|
This model is a fine-tuned version of [Geotrend/distilbert-base-nl-cased](https://huggingface.co/Geotrend/distilbert-base-nl-cased) on the [Dutch SQuAD v2.0](https://gitlab.com/niels.rouws/dutch-squad-v2.0) dataset [1], tailored for question answering in Dutch.
|
It achieves the following results on the evaluation set: |
|
- Loss: 1.2834 |
|
|
|
## Model description |
|
|
|
The base model, distilbert-base-nl-cased, is a smaller version of distilbert-base-multilingual-cased that handles a custom subset of languages (only Dutch in this case) while preserving the original model's accuracy. It follows the approach described in "Load What You Need: Smaller Versions of Multilingual BERT" by Abdaoui, Pradel, and Sigel (2020) [2].
|
|
|
## Intended uses & limitations |
|
|
|
This fine-tuned model is optimized for extractive question answering in Dutch. Because the base model keeps only a Dutch vocabulary, it should not be expected to perform well on other languages. Note also that the training data is machine-translated, so translation artifacts may affect answer-span quality. Users are encouraged to consider this training focus when applying the model to other languages or tasks.
|
|
|
## Training and evaluation data |
|
|
|
The model was trained on the Dutch Squad V2.0 Dataset, a machine-translated version of the original SQuAD v2.0 dataset. The statistics for both datasets are as follows: |
|
|
|
### Statistics |
|
|                         | SQuAD v2.0 | Dutch SQuAD v2.0 |
|-------------------------|-----------:|-----------------:|
| **Train**               |            |                  |
| Total examples          | 130,319    | 95,054           |
| Positive examples       | 86,821     | 53,376           |
| Negative examples       | 43,498     | 41,768           |
| **Development**         |            |                  |
| Total examples          | 11,873     | 9,294            |
| Positive examples       | 5,928      | 3,588            |
| Negative examples       | 5,945      | 5,706            |
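The positive (answerable) and negative (unanswerable) counts above can be derived from any SQuAD v2.0-format JSON file via the `is_impossible` flag. A minimal sketch (the `tiny_squad` dict below is a made-up two-question example, not the real data):

```python
def count_examples(squad_data):
    """Count answerable (positive) and unanswerable (negative) questions
    in a SQuAD v2.0-format dict."""
    positive = negative = 0
    for article in squad_data["data"]:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                if qa.get("is_impossible", False):
                    negative += 1
                else:
                    positive += 1
    return positive, negative


# Made-up two-question example in SQuAD v2.0 format.
tiny_squad = {
    "data": [{
        "paragraphs": [{
            "context": "Amsterdam is de hoofdstad van Nederland.",
            "qas": [
                {"question": "Wat is de hoofdstad van Nederland?",
                 "is_impossible": False,
                 "answers": [{"text": "Amsterdam", "answer_start": 0}]},
                {"question": "Wat is de hoofdstad van Frankrijk?",
                 "is_impossible": True,
                 "answers": []},
            ],
        }]
    }]
}

print(count_examples(tiny_squad))  # (1, 1)
```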
|
|
|
|
|
## Training procedure |
|
|
|
### Training hyperparameters |
|
|
|
The following hyperparameters were used during training: |
|
- learning_rate: 2e-05 |
|
- train_batch_size: 16 |
|
- eval_batch_size: 16 |
|
- seed: 42 |
|
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08 |
|
- lr_scheduler_type: linear |
|
- num_epochs: 3 |
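With `lr_scheduler_type: linear`, the learning rate decays from 2e-05 to zero over the 18,687 training steps. A plain-Python sketch of that schedule (assuming zero warmup steps, which the log above does not state explicitly):

```python
def linear_lr(step, total_steps=18687, base_lr=2e-5, warmup_steps=0):
    """Linear schedule as used by the Hugging Face Trainer:
    ramp up over `warmup_steps`, then decay linearly to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print(linear_lr(0))      # 2e-05 at the first step
print(linear_lr(18687))  # 0.0 at the final step
```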
|
|
|
### Training results |
|
| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 1.4685        | 1.0   | 6229  | 1.2709          |
| 1.1882        | 2.0   | 12458 | 1.1931          |
| 0.9488        | 3.0   | 18687 | 1.2834          |
|
|
|
### Raw Results |
|
```python
{
    'exact': 59.479233914353344,
    'f1': 62.56163022484813,
    'total': 9294,
    'HasAns_exact': 38.405797101449274,
    'HasAns_f1': 46.390131357228995,
    'HasAns_total': 3588,
    'NoAns_exact': 72.73045916579039,
    'NoAns_f1': 72.73045916579039,
    'NoAns_total': 5706,
    'best_exact': 61.58812136862492,
    'best_exact_thresh': 0.0,
    'best_f1': 63.337535221120724,
    'best_f1_thresh': 0.0
}
```
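As a sanity check, the overall `exact` score is the example-weighted average of the answerable (`HasAns`) and unanswerable (`NoAns`) subsets, which matches the raw results above:

```python
# Figures taken from the raw results above.
has_ans_exact, has_ans_total = 38.405797101449274, 3588
no_ans_exact, no_ans_total = 72.73045916579039, 5706

total = has_ans_total + no_ans_total  # 9294
overall_exact = (has_ans_exact * has_ans_total + no_ans_exact * no_ans_total) / total
print(round(overall_exact, 4))  # 59.4792
```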
|
|
|
## Model Usage |
|
|
|
To use this model, you can follow the example below: |
|
|
|
```python
from transformers import pipeline

qa_pipeline = pipeline(
    "question-answering",
    model="tclungu/distilbert-base-nl-cased-finetuned-squad",
    tokenizer="tclungu/distilbert-base-nl-cased-finetuned-squad",
)

qa_pipeline({
    "context": "Amsterdam is de hoofdstad en de dichtstbevolkte stad van Nederland.",
    "question": "Wat is de hoofdstad van Nederland?",
})
```
|
|
|
### Output |
|
```python |
|
{'score': 0.9984413385391235, 'start': 0, 'end': 9, 'answer': 'Amsterdam'} |
|
``` |
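The `best_exact`/`best_f1` fields in the raw results come from sweeping a no-answer threshold: each prediction gets a score difference (span score minus null score), and predictions scoring below the threshold are replaced by the empty string. A minimal sketch of that sweep for the exact metric, over made-up predictions (the data here is illustrative, not from the real evaluation):

```python
def best_exact_with_threshold(preds):
    """preds: list of (score_diff, correct_if_answered, correct_if_blank).
    Try each observed score difference as a threshold and return the best
    exact score with its threshold. Predictions with score_diff >= threshold
    keep their answer; the rest are replaced by the empty string."""
    candidates = sorted({p[0] for p in preds})
    candidates.append(max(candidates) + 1.0)  # also try blanking everything
    best_score, best_thresh = -1.0, None
    for t in candidates:
        correct = sum(
            (ans if diff >= t else blank) for diff, ans, blank in preds
        )
        score = 100.0 * correct / len(preds)
        if score > best_score:
            best_score, best_thresh = score, t
    return best_score, best_thresh


# Made-up predictions: (score_diff, correct_if_answered, correct_if_blank).
preds = [
    (2.0, True, False),   # confident span, and it is right
    (-1.5, False, True),  # unanswerable question; blank is the right call
    (0.5, False, True),   # wrong span; blanking it would be right
]

print(best_exact_with_threshold(preds))  # (100.0, 2.0)
```

A `best_exact_thresh` of 0.0, as in the raw results, means the default cutoff was already optimal on this evaluation set.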
|
|
|
## Framework versions |
|
|
|
- Transformers 4.33.3 |
|
- Pytorch 2.0.1 |
|
- Datasets 2.14.5 |
|
- Tokenizers 0.13.3 |
|
|
|
### References |
|
[1] Rouws, N. J., Vakulenko, S., & Katrenko, S. (2022). Dutch SQuAD and Ensemble Learning for Question Answering from Labour Agreements. In Artificial Intelligence and Machine Learning: 33rd Benelux Conference on Artificial Intelligence, BNAIC/Benelearn 2021, Esch-sur-Alzette, Luxembourg, November 10–12, 2021, Revised Selected Papers 33 (pp. 155–169). Springer International Publishing.

[2] Abdaoui, A., Pradel, C., & Sigel, G. (2020). Load What You Need: Smaller Versions of Multilingual BERT. arXiv preprint arXiv:2010.05609.