Em2-bloomz-7b / README.md

Update README.md

9a86a65 verified 7 months ago

11 kB

	---
	language:
	- en
	- ar
	license: gpl
	tags:
	- autograding
	- essay quetion
	- sentence similarity
	metrics:
	- accuracy
	library_name: peft
	datasets:
	- mohamedemam/Essay-quetions-auto-grading
	---
	# Model Card for Model ID

	Fine tuned version of Bloomz on Essay-quetions-auto-grading dataset
	this model support autograding for Arabic and English only


	### Model Description

	<!-- Provide a longer summary of what this model is. -->

	We are thrilled to introduce our graduation project, the EM2 model, designed for automated essay grading in both Arabic and English. 📝✨

	To develop this model, we first created a custom dataset for training. We adapted the QuAC and OpenOrca datasets to make them suitable for our automated essay grading application.

	Our model utilizes the following impressive models:

	Mistral: 96%
	LLaMA: 93%
	FLAN-T5: 93%
	BLOOMZ (Arabic): 86%
	MT0 (Arabic): 84%

	You can try our models for auto-grading on Hugging Face! 🌐

	We then deployed these models for practical use. We are proud of our team's hard work and the potential impact of the EM2 model in the field of education. 🌟

	#MachineLearning #AI #Education #EssayGrading #GraduationProject

	- Developed by: mohamed emam
	- Model type: decoder only
	- Language(s) (NLP): English
	- License: gpl
	- Finetuned from model : bloomz

	### Explain how it work
	- model take three inputs first context or perfect answer + quetion on context + student answer
	then model output the result
	<!-- Provide the basic links for the model. -->

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/6456f2eca9b8e1fd4cbe5ebe/1vEkwn5Mj_0BJ08kU6J57.png)
	- Hugging Face https://huggingface.co/mohamedemam/Em2-bloomz-7b
	- Repository: https://github.com/mohamed-em2m/Automatic-Grading-AI
	### Direct Use

	<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

	auto grading for essay quetions

	### Downstream Use [optional]

	<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

	Text generation

	[More Information Needed]

	### Training Data
	- mohamedemam/Essay-quetions-auto-grading-arabic


	### Training Procedure

	using Trl
	### Pipline
	```python
	from transformers import Pipeline
	import torch.nn.functional as F


	class MyPipeline:

	def __init__(self,model,tokenizer):
	self.model=model
	self.tokenizer=tokenizer

	def chat_Format(self,context, quetion, answer):
	return "Instruction:/n check answer is true or false of next quetion using context below:\n" + "#context: " + context + f".\n#quetion: " + quetion + f".\n#student answer: " + answer + ".\n#response:"


	def __call__(self, context, quetion, answer,generate=1,max_new_tokens=4, num_beams=2, do_sample=False,num_return_sequences=1):
	inp=self.chat_Format(context, quetion, answer)
	w = self.tokenizer(inp, add_special_tokens=True,
	pad_to_max_length=True,
	return_attention_mask=True,
	return_tensors='pt')
	response=""
	if(generate):
	outputs = self.tokenizer.batch_decode(self.model.generate(input_ids=w['input_ids'].cuda(), attention_mask=w['attention_mask'].cuda(), max_new_tokens=max_new_tokens, num_beams=num_beams, do_sample=do_sample, num_return_sequences=num_return_sequences), skip_special_tokens=True)
	response = outputs

	s =self.model(input_ids=w['input_ids'].cuda(), attention_mask=w['attention_mask'].cuda())['logits'][0][-1]
	s = F.softmax(s, dim=-1)
	yes_token_id = self.tokenizer.convert_tokens_to_ids(self.tokenizer.tokenize("True")[0])
	no_token_id = self.tokenizer.convert_tokens_to_ids(self.tokenizer.tokenize("False")[0])

	for i in ["Yes", "yes", "True", "true","صحيح"]:
	for word in self.tokenizer.tokenize(i):
	s[yes_token_id] += s[self.tokenizer.convert_tokens_to_ids(word)]
	for i in ["No", "no", "False", "false","خطأ"]:
	for word in self.tokenizer.tokenize(i):

	s[no_token_id] += s[self.tokenizer.convert_tokens_to_ids(word)]
	true = (s[yes_token_id] / (s[no_token_id] + s[yes_token_id])).item()
	return {"response": response, "true": true}
	context="""Large language models, such as GPT-4, are trained on vast amounts of text data to understand and generate human-like text. The deployment of these models involves several steps:

	Model Selection: Choosing a pre-trained model that fits the application's needs.
	Infrastructure Setup: Setting up the necessary hardware and software infrastructure to run the model efficiently, including cloud services, GPUs, and necessary libraries.
	Integration: Integrating the model into an application, which can involve setting up APIs or embedding the model directly into the software.
	Optimization: Fine-tuning the model for specific tasks or domains and optimizing it for performance and cost-efficiency.
	Monitoring and Maintenance: Ensuring the model performs well over time, monitoring for biases, and updating the model as needed."""
	quetion="What are the key considerations when choosing a cloud service provider for deploying a large language model like GPT-4?"
	answer="""When choosing a cloud service provider for deploying a large language model like GPT-4, the key considerations include:
	Compute Power: Ensure the provider offers high-performance GPUs or TPUs capable of handling the computational requirements of the model.
	Scalability: The ability to scale resources up or down based on the application's demand to handle varying workloads efficiently.
	Cost: Analyze the pricing models to understand the costs associated with compute time, storage, data transfer, and any other services.
	Integration and Support: Availability of tools and libraries that support easy integration of the model into your applications, along with robust technical support and documentation.
	Security and Compliance: Ensure the provider adheres to industry standards for security and compliance, protecting sensitive data and maintaining privacy.
	Latency and Availability: Consider the geographical distribution of data centers to ensure low latency and high availability for your end-users.

	By evaluating these factors, you can select a cloud service provider that aligns with your deployment needs, ensuring efficient and cost-effective operation of your large language model."""
	from peft import PeftModel, PeftConfig
	from transformers import AutoModelForCausalLM,AutoTokenizer

	config = PeftConfig.from_pretrained("mohamedemam/Em2-bloomz-7b")
	base_model = AutoModelForCausalLM.from_pretrained("mohamedemam/Em2-bloomz-7b")
	model = PeftModel.from_pretrained(base_model, "mohamedemam/Em2-bloomz-7b")
	tokenizer = AutoTokenizer.from_pretrained("mohamedemam/Em2-bloomz-7b", trust_remote_code=True)
	pipe=MyPipeline(model,tokenizer)
	print(pipe(context,quetion,answer,generate=True,max_new_tokens=4, num_beams=2, do_sample=False,num_return_sequences=1))
	```
	- output:{'response': ["Instruction:/n check answer is true or false of next quetion using context below:\n#context: Large language models, such as GPT-4, are trained on vast amounts of text data to understand and generate human-like text. The deployment of these models involves several steps:\n\n Model Selection: Choosing a pre-trained model that fits the application's needs.\n Infrastructure Setup: Setting up the necessary hardware and software infrastructure to run the model efficiently, including cloud services, GPUs, and necessary libraries.\n Integration: Integrating the model into an application, which can involve setting up APIs or embedding the model directly into the software.\n Optimization: Fine-tuning the model for specific tasks or domains and optimizing it for performance and cost-efficiency.\n Monitoring and Maintenance: Ensuring the model performs well over time, monitoring for biases, and updating the model as needed..\n#quetion: What are the key considerations when choosing a cloud service provider for deploying a large language model like GPT-4?.\n#student answer: When choosing a cloud service provider for deploying a large language model like GPT-4, the key considerations include:\n Compute Power: Ensure the provider offers high-performance GPUs or TPUs capable of handling the computational requirements of the model.\n Scalability: The ability to scale resources up or down based on the application's demand to handle varying workloads efficiently.\n Cost: Analyze the pricing models to understand the costs associated with compute time, storage, data transfer, and any other services.\n Integration and Support: Availability of tools and libraries that support easy integration of the model into your applications, along with robust technical support and documentation.\n Security and Compliance: Ensure the provider adheres to industry standards for security and compliance, protecting sensitive data and maintaining privacy.\n Latency and Availability: Consider the geographical distribution of data centers to ensure low latency and high availability for your end-users.\n\nBy evaluating these factors, you can select a cloud service provider that aligns with your deployment needs, ensuring efficient and cost-effective operation of your large language model..\n#response: true the answer is"], 'true': 0.943033754825592}

	### Chat Format Function
	This function formats the input context, question, and answer into a specific structure for the model to process.

	```python
	def chat_Format(self, context, question, answer):
	return "Instruction:/n check answer is true or false of next question using context below:\n" + "#context: " + context + f".\n#question: " + question + f".\n#student answer: " + answer + ".\n#response:"
	```


	## Configuration

	### Dropout Probability for LoRA Layers
	- lora_dropout: 0.05

	### Quantization Settings
	- use_4bit: True
	- bnb_4bit_compute_dtype: "float16"
	- bnb_4bit_quant_type: "nf4"
	- use_nested_quant: False

	### Output Directory
	- output_dir: "./results"

	### Training Parameters
	- num_train_epochs: 1
	- fp16: False
	- bf16: False
	- per_device_train_batch_size: 1
	- per_device_eval_batch_size: 4
	- gradient_accumulation_steps: 8
	- gradient_checkpointing: True
	- max_grad_norm: 0.3
	- learning_rate: 5e-5
	- weight_decay: 0.001
	- optim: "paged_adamw_8bit"
	- lr_scheduler_type: "constant"
	- max_steps: -1
	- warmup_ratio: 0.03
	- group_by_length: True

	### Logging and Saving
	- save_steps: 100
	- logging_steps: 25
	- max_seq_length: False