|
--- |
|
library_name: transformers |
|
base_model: bert-base-chinese |
|
tags: |
|
- generated_from_trainer |
|
datasets: |
|
- cmrc2018 |
|
model-index: |
|
- name: chinese_qa |
|
results: [] |
|
--- |
|
|
|
# bert-base-chinese-finetuned-cmrc2018 |
|
|
|
This model is a fine-tuned version of [bert-base-chinese](https://huggingface.co/bert-base-chinese) on the CMRC2018 (Chinese Machine Reading Comprehension) dataset. |
|
|
|
## Model Description |
|
|
|
This is a BERT-based extractive question answering model for Chinese text. Given a question and a context passage, the model locates and extracts the answer span from the context.
|
|
|
Key Features: |
|
- Base Model: bert-base-chinese |
|
- Task: Extractive Question Answering |
|
- Language: Chinese |
|
- Training Dataset: CMRC2018 |
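
CMRC2018 follows the SQuAD-style span-extraction format: each example pairs a context passage with a question and one or more gold answer spans. A minimal sketch for inspecting the data with the `datasets` library (assuming the `cmrc2018` dataset on the Hugging Face Hub):

```python
from datasets import load_dataset

# Load CMRC2018 from the Hugging Face Hub (SQuAD-style span-extraction format)
dataset = load_dataset("cmrc2018")

# Each example pairs a context passage with a question and gold answer spans
example = dataset["train"][0]
print(example["question"])
print(example["context"][:80])
print(example["answers"])  # {"text": [...], "answer_start": [...]}
```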
|
|
|
## Performance Metrics |
|
|
|
Evaluation results on the test set: |
|
- Exact Match: 59.708 |
|
- F1 Score: 60.0723 |
|
- Number of evaluation samples: 6,254 |
|
- Evaluation speed: 283.054 samples/second |
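
The exact evaluation script used to produce these numbers is not included in this card. As a rough sketch, SQuAD-style EM/F1 can be computed with the `evaluate` library; note that the official CMRC2018 evaluation handles Chinese tokenization differently, so results may not match exactly (the ID and spans below are illustrative):

```python
import evaluate

# SQuAD-style EM/F1; the official CMRC2018 script tokenizes Chinese differently
squad_metric = evaluate.load("squad")

# Illustrative prediction/reference pair (ID and answer offsets are made up)
predictions = [{"id": "QUERY_0", "prediction_text": "2万公里"}]
references = [{
    "id": "QUERY_0",
    "answers": {"text": ["2万公里"], "answer_start": [20]},
}]

print(squad_metric.compute(predictions=predictions, references=references))
# {'exact_match': 100.0, 'f1': 100.0}
```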
|
|
|
## Intended Uses & Limitations |
|
|
|
### Intended Uses |
|
- Chinese reading comprehension tasks |
|
- Answer extraction from given documents |
|
- Context-based question answering systems |
|
|
|
### Limitations |
|
- Only supports extractive QA (cannot generate new answers) |
|
- Answers must be present in the context |
|
- Does not support multi-hop reasoning |
|
- Cannot flag unanswerable questions (it always predicts some span from the context)
|
|
|
## Training Details |
|
|
|
### Training Hyperparameters |
|
- Learning rate: 3e-05 |
|
- Train batch size: 12 |
|
- Eval batch size: 8 |
|
- Seed: 42 |
|
- Optimizer: AdamW (betas=(0.9,0.999), epsilon=1e-08) |
|
- LR scheduler: linear |
|
- Number of epochs: 5.0 |
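
The exact training script is not documented here; the hyperparameters above are consistent with the standard `transformers` question-answering example (e.g. `run_qa.py`). A rough `TrainingArguments` sketch, assuming that setup:

```python
from transformers import TrainingArguments

# Rough reconstruction of the configuration listed above; data preprocessing
# (tokenization with stride, start/end position labeling) would follow the
# standard transformers question-answering example and is not shown here.
training_args = TrainingArguments(
    output_dir="bert-base-chinese-finetuned-cmrc2018",
    learning_rate=3e-5,
    per_device_train_batch_size=12,
    per_device_eval_batch_size=8,
    num_train_epochs=5.0,
    lr_scheduler_type="linear",
    seed=42,
    # AdamW with betas=(0.9, 0.999) and epsilon=1e-8 is the Trainer default optimizer
)
```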
|
|
|
### Training Results |
|
- Training time: 892.86 seconds |
|
- Training samples: 18,960 |
|
- Training speed: 106.175 samples/second |
|
- Training loss: 0.5625 |
|
|
|
### Framework Versions |
|
- Transformers: 4.47.0.dev0 |
|
- Pytorch: 2.5.1+cu124 |
|
- Datasets: 3.1.0 |
|
- Tokenizers: 0.20.3
|
|
|
## Usage |
|
|
|
```python |
|
import torch
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

# Load model and tokenizer
model_name = "real-jiakai/bert-base-chinese-finetuned-cmrc2018"
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model.eval()

# Prepare inputs
question = "长城有多长?"
context = "长城是中国古代的伟大建筑工程,全长超过2万公里,横跨中国北部多个省份。"

# Tokenize the question-context pair
inputs = tokenizer(
    question,
    context,
    return_tensors="pt",
    max_length=384,
    truncation=True,
)

# Run inference and pick the most likely start/end positions
with torch.no_grad():
    outputs = model(**inputs)
answer_start = torch.argmax(outputs.start_logits)
answer_end = torch.argmax(outputs.end_logits) + 1

# Decode the span; Chinese WordPiece tokens come back space-separated, so strip spaces
answer = tokenizer.decode(
    inputs["input_ids"][0][answer_start:answer_end],
    skip_special_tokens=True,
).replace(" ", "")
print("Answer:", answer)
|
``` |
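
Alternatively, the `question-answering` pipeline handles tokenization, span decoding, and mapping the answer back to the original context string:

```python
from transformers import pipeline

# The QA pipeline returns the answer text along with its character offsets and score
qa = pipeline(
    "question-answering",
    model="real-jiakai/bert-base-chinese-finetuned-cmrc2018",
)

result = qa(
    question="长城有多长?",
    context="长城是中国古代的伟大建筑工程,全长超过2万公里,横跨中国北部多个省份。",
)
print(result)  # {'score': ..., 'start': ..., 'end': ..., 'answer': ...}
```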
|
|
|
## Citation |
|
|
|
If you use this model, please cite the CMRC2018 dataset: |
|
|
|
```bibtex |
|
@inproceedings{cui-emnlp2019-cmrc2018, |
|
title = "A Span-Extraction Dataset for {C}hinese Machine Reading Comprehension", |
|
author = "Cui, Yiming and |
|
Liu, Ting and |
|
Che, Wanxiang and |
|
Xiao, Li and |
|
Chen, Zhipeng and |
|
Ma, Wentao and |
|
Wang, Shijin and |
|
Hu, Guoping", |
|
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)", |
|
month = nov, |
|
year = "2019", |
|
address = "Hong Kong, China", |
|
publisher = "Association for Computational Linguistics", |
|
url = "https://www.aclweb.org/anthology/D19-1600", |
|
doi = "10.18653/v1/D19-1600", |
|
pages = "5886--5891", |
|
} |
|
``` |