|
--- |
|
library_name: transformers |
|
base_model: bert-base-chinese |
|
tags: |
|
- generated_from_trainer |
|
datasets: |
|
- cmrc2018 |
|
model-index: |
|
- name: chinese_qa |
|
results: [] |
|
--- |
|
|
|
# bert-base-chinese-finetuned-cmrc2018 |
|
|
|
This model is a fine-tuned version of [bert-base-chinese](https://huggingface.co/bert-base-chinese) on the CMRC2018 (Chinese Machine Reading Comprehension) dataset. |
|
|
|
## Model Description |
|
|
|
This is a BERT-based extractive question answering model for Chinese text. Given a question and a context passage, the model locates and extracts the answer span from the context.
|
|
|
Key Features: |
|
- Base Model: bert-base-chinese |
|
- Task: Extractive Question Answering |
|
- Language: Chinese |
|
- Training Dataset: CMRC2018 |
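
CMRC2018 follows the SQuAD-style span-extraction format: each example pairs a context passage with a question and one or more gold answer spans. A minimal sketch for inspecting the data with the `datasets` library (assuming the `cmrc2018` dataset on the Hugging Face Hub):

```python
from datasets import load_dataset

# Load CMRC2018 from the Hugging Face Hub (SQuAD-style span-extraction format)
dataset = load_dataset("cmrc2018")

# Each example pairs a context passage with a question and gold answer spans
example = dataset["train"][0]
print(example["question"])
print(example["context"][:80])
print(example["answers"])  # {"text": [...], "answer_start": [...]}
```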
|
|
|
## Performance Metrics |
|
|
|
Evaluation results on the test set: |
|
- Exact Match: 59.708 |
|
- F1 Score: 60.0723 |
|
- Number of evaluation samples: 6,254 |
|
- Evaluation speed: 283.054 samples/second |
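
The exact evaluation script used to produce these numbers is not included in this card. As a rough sketch, SQuAD-style EM/F1 can be computed with the `evaluate` library; note that the official CMRC2018 evaluation handles Chinese tokenization differently, so results may not match exactly (the ID and spans below are illustrative):

```python
import evaluate

# SQuAD-style EM/F1; the official CMRC2018 script tokenizes Chinese differently
squad_metric = evaluate.load("squad")

# Illustrative prediction/reference pair (ID and answer offsets are made up)
predictions = [{"id": "QUERY_0", "prediction_text": "2万公里"}]
references = [{
    "id": "QUERY_0",
    "answers": {"text": ["2万公里"], "answer_start": [20]},
}]

print(squad_metric.compute(predictions=predictions, references=references))
# {'exact_match': 100.0, 'f1': 100.0}
```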
|
|
|
## Intended Uses & Limitations |
|
|
|
### Intended Uses |
|
- Chinese reading comprehension tasks |
|
- Answer extraction from given documents |
|
- Context-based question answering systems |
|
|
|
### Limitations |
|
- Only supports extractive QA (cannot generate new answers) |
|
- Answers must be present in the context |
|
- Does not support multi-hop reasoning |
|
- Cannot flag unanswerable questions (it always predicts some span from the context)
|
|
|
## Training Details |
|
|
|
### Training Hyperparameters |
|
- Learning rate: 3e-05 |
|
- Train batch size: 12 |
|
- Eval batch size: 8 |
|
- Seed: 42 |
|
- Optimizer: AdamW (betas=(0.9,0.999), epsilon=1e-08) |
|
- LR scheduler: linear |
|
- Number of epochs: 5.0 |
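
The exact training script is not documented here; the hyperparameters above are consistent with the standard `transformers` question-answering example (e.g. `run_qa.py`). A rough `TrainingArguments` sketch, assuming that setup:

```python
from transformers import TrainingArguments

# Rough reconstruction of the configuration listed above; data preprocessing
# (tokenization with stride, start/end position labeling) would follow the
# standard transformers question-answering example and is not shown here.
training_args = TrainingArguments(
    output_dir="bert-base-chinese-finetuned-cmrc2018",
    learning_rate=3e-5,
    per_device_train_batch_size=12,
    per_device_eval_batch_size=8,
    num_train_epochs=5.0,
    lr_scheduler_type="linear",
    seed=42,
    # AdamW with betas=(0.9, 0.999) and epsilon=1e-8 is the Trainer default optimizer
)
```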
|
|
|
### Training Results |
|
- Training time: 892.86 seconds |
|
- Training samples: 18,960 |
|
- Training speed: 106.175 samples/second |
|
- Training loss: 0.5625 |
|
|
|
### Framework Versions |
|
- Transformers: 4.47.0.dev0 |
|
- Pytorch: 2.5.1+cu124 |
|
- Datasets: 3.1.0 |
|
- Tokenizers: 0.20.3
|
|
|
## Usage |
|
|
|
```python |
|
import torch
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

# Load model and tokenizer
model_name = "real-jiakai/bert-base-chinese-finetuned-cmrc2018"
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model.eval()

# Prepare inputs
question = "长城有多长?"
context = "长城是中国古代的伟大建筑工程,全长超过2万公里,横跨中国北部多个省份。"

# Tokenize the question-context pair
inputs = tokenizer(
    question,
    context,
    return_tensors="pt",
    max_length=384,
    truncation=True,
)

# Run inference and pick the most likely start/end positions
with torch.no_grad():
    outputs = model(**inputs)
answer_start = torch.argmax(outputs.start_logits)
answer_end = torch.argmax(outputs.end_logits) + 1

# Decode the span; Chinese WordPiece tokens come back space-separated, so strip spaces
answer = tokenizer.decode(
    inputs["input_ids"][0][answer_start:answer_end],
    skip_special_tokens=True,
).replace(" ", "")
print("Answer:", answer)
|
``` |
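
Alternatively, the `question-answering` pipeline handles tokenization, span decoding, and mapping the answer back to the original context string:

```python
from transformers import pipeline

# The QA pipeline returns the answer text along with its character offsets and score
qa = pipeline(
    "question-answering",
    model="real-jiakai/bert-base-chinese-finetuned-cmrc2018",
)

result = qa(
    question="长城有多长?",
    context="长城是中国古代的伟大建筑工程,全长超过2万公里,横跨中国北部多个省份。",
)
print(result)  # {'score': ..., 'start': ..., 'end': ..., 'answer': ...}
```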
|
|
|
## Citation |
|
|
|
If you use this model, please cite the CMRC2018 dataset: |
|
|
|
```bibtex |
|
@inproceedings{cui-emnlp2019-cmrc2018, |
|
title = "A Span-Extraction Dataset for {C}hinese Machine Reading Comprehension", |
|
author = "Cui, Yiming and |
|
Liu, Ting and |
|
Che, Wanxiang and |
|
Xiao, Li and |
|
Chen, Zhipeng and |
|
Ma, Wentao and |
|
Wang, Shijin and |
|
Hu, Guoping", |
|
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)", |
|
month = nov, |
|
year = "2019", |
|
address = "Hong Kong, China", |
|
publisher = "Association for Computational Linguistics", |
|
url = "https://www.aclweb.org/anthology/D19-1600", |
|
doi = "10.18653/v1/D19-1600", |
|
pages = "5886--5891", |
|
} |
|
``` |