|
--- |
|
language: |
|
- zh |
|
metrics: |
|
- accuracy |
|
- recall |
|
- precision |
|
library_name: transformers |
|
pipeline_tag: text-classification |
|
--- |
|
# Flames-scorer |
|
|
|
This is the specified scorer for the Flames benchmark, a highly adversarial Chinese benchmark for evaluating the value alignment of LLMs.
|
For more details, please refer to our [paper](https://arxiv.org/abs/2311.06899) and [GitHub repo](https://github.com/AIFlames/Flames/tree/main).
|
|
|
## Model Details |
|
* Developed by: Shanghai AI Lab and Fudan NLP Group. |
|
* Model type: We employ InternLM-chat-7b as the backbone and build a separate classifier for each dimension on top of it, then apply multi-task training to train the scorer.
|
* Language(s): Chinese |
|
* Paper: [FLAMES: Benchmarking Value Alignment of LLMs in Chinese](https://arxiv.org/abs/2311.06899) |
|
* Contact: For questions and comments about the model, please email [email protected]. |
|
|
|
## Usage |
|
|
|
The environment can be set up as: |
|
```shell |
|
$ pip install -r requirements.txt |
|
``` |
|
And you can use `infer.py` to evaluate your model: |
|
```shell |
|
python infer.py --data_path YOUR_DATA_FILE.jsonl |
|
``` |
|
|
|
The flames-scorer can be loaded by: |
|
```python |
|
from tokenization_internlm import InternLMTokenizer |
|
from modeling_internlm import InternLMForSequenceClassification |
|
|
|
tokenizer = InternLMTokenizer.from_pretrained("CaasiHUANG/flames-scorer", trust_remote_code=True) |
|
model = InternLMForSequenceClassification.from_pretrained("CaasiHUANG/flames-scorer", trust_remote_code=True)

```
|
|
|
|
|
|
|
Please note that: |
|
1. Ensure each entry in `YOUR_DATA_FILE.jsonl` includes the fields: "dimension", "prompt", and "response". |
|
2. The predicted score will be stored in the "predicted" field, and the output will be saved in the same directory as `YOUR_DATA_FILE.jsonl`. |
|
3. The accuracy of the Flames-scorer on out-of-distribution prompts (i.e., prompts not included in Flames-prompts) has not been evaluated; consequently, its predictions on such data may not be reliable.
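The input format described in note 1 can be sketched as follows. This is an illustrative example only: the field names come from the notes above, while the dimension value, prompt, and response texts are made up (consult the Flames repo for the actual dimension names used by the scorer).

```python
import json

# Each line of the JSONL file is one record with the three required fields.
# The values below are placeholders, not real Flames data.
entries = [
    {
        "dimension": "Fairness",  # illustrative dimension name
        "prompt": "An example user prompt.",
        "response": "An example model response to be scored.",
    },
]

with open("YOUR_DATA_FILE.jsonl", "w", encoding="utf-8") as f:
    for entry in entries:
        # ensure_ascii=False keeps Chinese text readable in the output file
        f.write(json.dumps(entry, ensure_ascii=False) + "\n")
```

After running `infer.py` on this file, each record would additionally carry a "predicted" field with the score, per note 2.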