---
license: apache-2.0
base_model: Geotrend/distilbert-base-nl-cased
model-index:
- name: distilbert-base-nl-cased-finetuned-squad
results: []
language:
- nl
pipeline_tag: question-answering
---
# distilbert-base-nl-cased-finetuned-squad
This model is a version of [Geotrend/distilbert-base-nl-cased](https://huggingface.co/Geotrend/distilbert-base-nl-cased) fine-tuned for extractive question answering on the [Dutch SQuAD v2.0](https://gitlab.com/niels.rouws/dutch-squad-v2.0) dataset [1].
It achieves the following results on the evaluation set:
- Loss: 1.2834
## Model description
The base model, distilbert-base-nl-cased, is a smaller version of distilbert-base-multilingual-cased that keeps only the parts of the vocabulary needed for a chosen subset of languages (here, only Dutch) while preserving the original model's accuracy. It follows the approach described in the paper "Load What You Need: Smaller Versions of Multilingual BERT" by Abdaoui, Pradel, and Sigel (2020) [2].
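To get a feel for the reduction, the Dutch-only tokenizer can be compared with the multilingual one directly (a minimal sketch using the published Hub identifiers; the exact sizes are not reproduced here):
```python
from transformers import AutoTokenizer

# Compare vocabulary sizes of the Dutch-only and multilingual tokenizers.
nl_tokenizer = AutoTokenizer.from_pretrained("Geotrend/distilbert-base-nl-cased")
multi_tokenizer = AutoTokenizer.from_pretrained("distilbert-base-multilingual-cased")

print("Dutch-only vocabulary size:  ", nl_tokenizer.vocab_size)
print("Multilingual vocabulary size:", multi_tokenizer.vocab_size)
```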
## Intended uses & limitations
This fine-tuned model is optimized for extractive question answering on Dutch-language text. Its primary strength lies in extracting answers from Dutch contexts; because the base model's vocabulary is tailored to Dutch, users should consider this training focus before applying the model to other languages or tasks.
## Training and evaluation data
The model was trained on the Dutch Squad V2.0 Dataset, a machine-translated version of the original SQuAD v2.0 dataset. The statistics for both datasets are as follows:
### Statistics
|                   | SQuAD v2.0 | Dutch SQuAD v2.0 |
|-------------------|-----------:|-----------------:|
| **Train**         |            |                  |
| Total examples    | 130,319    | 95,054           |
| Positive examples | 86,821     | 53,376           |
| Negative examples | 43,498     | 41,768           |
| **Development**   |            |                  |
| Total examples    | 11,873     | 9,294            |
| Positive examples | 5,928      | 3,588            |
| Negative examples | 5,945      | 5,706            |
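In the SQuAD v2.0 format, negative (unanswerable) examples carry an `is_impossible` flag. Assuming the Dutch files follow the standard SQuAD v2.0 JSON layout, counts like those above can be reproduced with a short script (the file name below is a hypothetical local path):
```python
import json

# Hypothetical local path; the data is distributed via the GitLab repository above.
with open("nl_squad_train.json", encoding="utf-8") as f:
    data = json.load(f)

positive = negative = 0
for article in data["data"]:
    for paragraph in article["paragraphs"]:
        for qa in paragraph["qas"]:
            # SQuAD v2.0 marks unanswerable questions with is_impossible=True.
            if qa.get("is_impossible", False):
                negative += 1
            else:
                positive += 1

print(f"Total: {positive + negative}, positive: {positive}, negative: {negative}")
```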
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
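For reference, these settings roughly correspond to the following `TrainingArguments` sketch; the `output_dir` and evaluation strategy are assumptions, not taken from the original training script:
```python
from transformers import TrainingArguments

# Sketch reconstructing the listed hyperparameters (Transformers 4.33).
training_args = TrainingArguments(
    output_dir="distilbert-base-nl-cased-finetuned-squad",  # assumption
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=3,
    evaluation_strategy="epoch",  # assumption, matches the per-epoch losses below
)
```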
### Training results
| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 1.4685        | 1.0   | 6229  | 1.2709          |
| 1.1882        | 2.0   | 12458 | 1.1931          |
| 0.9488        | 3.0   | 18687 | 1.2834          |
### Raw results
```python
{
'exact': 59.479233914353344,
'f1': 62.56163022484813,
'total': 9294,
'HasAns_exact': 38.405797101449274,
'HasAns_f1': 46.390131357228995,
'HasAns_total': 3588,
'NoAns_exact': 72.73045916579039,
'NoAns_f1': 72.73045916579039,
'NoAns_total': 5706,
'best_exact': 61.58812136862492,
'best_exact_thresh': 0.0,
'best_f1': 63.337535221120724,
'best_f1_thresh': 0.0
}
```
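This dictionary follows the official SQuAD v2.0 evaluation format. A comparable result can be computed with the `evaluate` library's `squad_v2` metric; the sketch below uses toy inputs rather than the actual development set:
```python
import evaluate

squad_v2_metric = evaluate.load("squad_v2")

# Toy inputs for illustration; real evaluation uses the full development set.
predictions = [
    {"id": "q1", "prediction_text": "Amsterdam", "no_answer_probability": 0.0}
]
references = [
    {"id": "q1", "answers": {"text": ["Amsterdam"], "answer_start": [0]}}
]

results = squad_v2_metric.compute(predictions=predictions, references=references)
print(results)  # keys include 'exact', 'f1', 'HasAns_exact', 'NoAns_exact', ...
```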
## Model usage
To use this model, you can follow the example below:
```python
from transformers import pipeline

# Load the fine-tuned model and tokenizer into a question-answering pipeline.
qa_pipeline = pipeline(
    "question-answering",
    model="tclungu/distilbert-base-nl-cased-finetuned-squad",
    tokenizer="tclungu/distilbert-base-nl-cased-finetuned-squad",
)

# Context: "Amsterdam is the capital and most populous city of the Netherlands."
# Question: "What is the capital of the Netherlands?"
result = qa_pipeline({
    "context": "Amsterdam is de hoofdstad en de dichtstbevolkte stad van Nederland.",
    "question": "Wat is de hoofdstad van Nederland?",
})
print(result)
```
### Output
```python
{'score': 0.9984413385391235, 'start': 0, 'end': 9, 'answer': 'Amsterdam'}
```
## Framework versions
- Transformers 4.33.3
- Pytorch 2.0.1
- Datasets 2.14.5
- Tokenizers 0.13.3
## References
[1] Rouws, N. J., Vakulenko, S., & Katrenko, S. (2022). Dutch SQuAD and ensemble learning for question answering from labour agreements. In *Artificial Intelligence and Machine Learning: 33rd Benelux Conference on Artificial Intelligence, BNAIC/Benelearn 2021, Esch-sur-Alzette, Luxembourg, November 10–12, 2021, Revised Selected Papers* (pp. 155–169). Springer International Publishing.

[2] Abdaoui, A., Pradel, C., & Sigel, G. (2020). Load what you need: Smaller versions of multilingual BERT. *arXiv preprint arXiv:2010.05609*.