---
library_name: transformers
base_model: bert-base-chinese
tags:
- generated_from_trainer
datasets:
- cmrc2018
model-index:
- name: chinese_qa
results: []
---
# bert-base-chinese-finetuned-cmrc2018
This model is a fine-tuned version of [bert-base-chinese](https://huggingface.co/bert-base-chinese) on the CMRC2018 (Chinese Machine Reading Comprehension) dataset.
## Model Description
This is a BERT-based extractive question answering model for Chinese text. The model is designed to locate and extract answer spans from given contexts in response to questions.
Key Features:
- Base Model: bert-base-chinese
- Task: Extractive Question Answering
- Language: Chinese
- Training Dataset: CMRC2018
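
CMRC2018 uses SQuAD-style span annotations: each record pairs a context paragraph with a question and the character-level offset of the gold answer. A minimal sketch of inspecting the corpus with the `datasets` library (assuming the `cmrc2018` Hub ID resolves to this dataset):

```python
from datasets import load_dataset

# Load the CMRC2018 reading-comprehension corpus from the Hugging Face Hub.
dataset = load_dataset("cmrc2018")

# Each record pairs a context paragraph with a question and a span answer.
sample = dataset["train"][0]
print(sample["context"][:100])            # source paragraph (truncated)
print(sample["question"])                 # question about the paragraph
print(sample["answers"]["text"])          # gold answer text(s)
print(sample["answers"]["answer_start"])  # character offset(s) into the context
```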
## Performance Metrics
Evaluation results on the test set:
- Exact Match: 59.708
- F1 Score: 60.0723
- Number of evaluation samples: 6,254
- Evaluation speed: 283.054 samples/second
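
Exact Match and F1 are the standard SQuAD-style span metrics. As a hedged illustration (not necessarily how the reported numbers were produced), scores of this kind can be computed with the `squad` metric from the `evaluate` library; the prediction and reference entries below are made-up examples:

```python
import evaluate

# SQuAD-style metric: exact match and token-level F1 over answer spans.
squad_metric = evaluate.load("squad")

predictions = [
    {"id": "q1", "prediction_text": "超过2万公里"},
]
references = [
    {"id": "q1", "answers": {"text": ["超过2万公里"], "answer_start": [15]}},
]

results = squad_metric.compute(predictions=predictions, references=references)
print(results)  # {'exact_match': 100.0, 'f1': 100.0}
```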
## Intended Uses & Limitations
### Intended Uses
- Chinese reading comprehension tasks
- Answer extraction from given documents
- Context-based question answering systems
### Limitations
- Only supports extractive QA (cannot generate new answers)
- Answers must be present in the context
- Does not support multi-hop reasoning
- Cannot handle unanswerable questions
## Training Details
### Training Hyperparameters
- Learning rate: 3e-05
- Train batch size: 12
- Eval batch size: 8
- Seed: 42
- Optimizer: AdamW (betas=(0.9,0.999), epsilon=1e-08)
- LR scheduler: linear
- Number of epochs: 5.0
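
A minimal `TrainingArguments` sketch reflecting the hyperparameters above (the output directory is an illustrative placeholder; AdamW with a linear schedule is the `Trainer` default):

```python
from transformers import TrainingArguments

# Hyperparameters from the list above; output_dir is a placeholder.
training_args = TrainingArguments(
    output_dir="chinese_qa",          # assumed output path
    learning_rate=3e-5,
    per_device_train_batch_size=12,
    per_device_eval_batch_size=8,
    num_train_epochs=5.0,
    lr_scheduler_type="linear",
    seed=42,
    # AdamW with betas=(0.9, 0.999) and epsilon=1e-8 is the Trainer default optimizer.
)
```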
### Training Results
- Training time: 892.86 seconds
- Training samples: 18,960
- Training speed: 106.175 samples/second
- Training loss: 0.5625
### Framework Versions
- Transformers: 4.47.0.dev0
- Pytorch: 2.5.1+cu124
- Datasets: 3.1.0
- Tokenizers: 0.20.3
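
As a quick sanity check, the installed library versions can be printed at runtime (matching versions exactly is not required, but large gaps may change behavior):

```python
import transformers, torch, datasets, tokenizers

# Print the versions of the libraries this model card was produced with.
for lib in (transformers, torch, datasets, tokenizers):
    print(lib.__name__, lib.__version__)
```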
## Usage
```python
import torch
from transformers import AutoModelForQuestionAnswering, AutoTokenizer
# Load model and tokenizer
model = AutoModelForQuestionAnswering.from_pretrained("real-jiakai/bert-base-chinese-finetuned-cmrc2018")
tokenizer = AutoTokenizer.from_pretrained("real-jiakai/bert-base-chinese-finetuned-cmrc2018")
# Prepare inputs
question = "长城有多长?"
context = "长城是中国古代的伟大建筑工程,全长超过2万公里,横跨中国北部多个省份。"
# Tokenize inputs
inputs = tokenizer(
question,
context,
return_tensors="pt",
max_length=384,
truncation=True
)
# Run the model in inference mode (no gradients needed)
with torch.no_grad():
    outputs = model(**inputs)
answer_start = torch.argmax(outputs.start_logits)
answer_end = torch.argmax(outputs.end_logits) + 1
answer = tokenizer.decode(inputs["input_ids"][0][answer_start:answer_end])
print("Answer:", answer)
```
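
The same model can also be used through the higher-level `pipeline` API, which handles tokenization, long-context striding, and span decoding internally (a brief alternative to the manual example above):

```python
from transformers import pipeline

# Question-answering pipeline backed by the fine-tuned checkpoint.
qa = pipeline(
    "question-answering",
    model="real-jiakai/bert-base-chinese-finetuned-cmrc2018",
)

result = qa(
    question="长城有多长?",
    context="长城是中国古代的伟大建筑工程,全长超过2万公里,横跨中国北部多个省份。",
)
print(result)  # {'score': ..., 'start': ..., 'end': ..., 'answer': '...'}
```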
## Citation
If you use this model, please cite the CMRC2018 dataset:
```bibtex
@inproceedings{cui-emnlp2019-cmrc2018,
title = "A Span-Extraction Dataset for {C}hinese Machine Reading Comprehension",
author = "Cui, Yiming and
Liu, Ting and
Che, Wanxiang and
Xiao, Li and
Chen, Zhipeng and
Ma, Wentao and
Wang, Shijin and
Hu, Guoping",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
month = nov,
year = "2019",
address = "Hong Kong, China",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/D19-1600",
doi = "10.18653/v1/D19-1600",
pages = "5886--5891",
}
``` |