---
language:
- en
- ar
license: gpl
tags:
- autograding
- essay question
- sentence similarity
metrics:
- accuracy
library_name: peft
datasets:
- mohamedemam/Essay-quetions-auto-grading
---
# Model Card for Em2-bloomz-7b
A fine-tuned version of BLOOMZ on the Essay-quetions-auto-grading dataset.
This model supports auto-grading for Arabic and English only.
### Model Description
<!-- Provide a longer summary of what this model is. -->
We are thrilled to introduce our graduation project, the EM2 model, designed for automated essay grading in both Arabic and English. 📝✨
To develop this model, we first created a custom dataset for training. We adapted the QuAC and OpenOrca datasets to make them suitable for our automated essay grading application.
Our project fine-tuned and evaluated several base models, with the following accuracy scores:
- Mistral: 96%
- LLaMA: 93%
- FLAN-T5: 93%
- BLOOMZ (Arabic): 86%
- MT0 (Arabic): 84%
You can try our models for auto-grading on Hugging Face! 🌐
We then deployed these models for practical use. We are proud of our team's hard work and the potential impact of the EM2 model in the field of education. 🌟
#MachineLearning #AI #Education #EssayGrading #GraduationProject
- **Developed by:** Mohamed Emam
- **Model type:** decoder-only
- **Language(s) (NLP):** English and Arabic
- **License:** GPL
- **Finetuned from model:** BLOOMZ
### How it works
- The model takes three inputs: the context (or a reference "perfect" answer), a question about that context, and the student's answer.
- It then outputs whether the student's answer is correct.
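For illustration, here is a made-up example of the three inputs and the kind of result returned by the pipeline defined later in this card (the values below are not from the training data):
```python
# Illustrative inputs only (not from the dataset).
context = "Water boils at 100 degrees Celsius at sea level."    # reference / perfect answer
quetion = "At what temperature does water boil at sea level?"   # question about the context
answer = "It boils at 100 C."                                    # student answer to grade

# The MyPipeline class shown in the Pipeline section below returns something like:
# {"response": ["... #response: true ..."], "true": 0.97}
# where "true" is the model's estimated probability that the student answer is correct.
```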
<!-- Provide the basic links for the model. -->
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6456f2eca9b8e1fd4cbe5ebe/1vEkwn5Mj_0BJ08kU6J57.png)
- **Hugging Face:** https://huggingface.co/mohamedemam/Em2-bloomz-7b
- **Repository:** https://github.com/mohamed-em2m/Automatic-Grading-AI
### Direct Use
<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
Auto-grading for essay questions.
### Downstream Use [optional]
<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
Text generation
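Because it is a causal language model, the adapter can also be used for plain text generation. A minimal sketch (the loading pattern mirrors the Pipeline section below; the prompt is just an example):
```python
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model named in the adapter config, then attach the LoRA adapter.
config = PeftConfig.from_pretrained("mohamedemam/Em2-bloomz-7b")
base_model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path)
model = PeftModel.from_pretrained(base_model, "mohamedemam/Em2-bloomz-7b").eval()
tokenizer = AutoTokenizer.from_pretrained("mohamedemam/Em2-bloomz-7b", trust_remote_code=True)

inputs = tokenizer("Explain photosynthesis in one sentence.", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```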
### Training Data
- **mohamedemam/Essay-quetions-auto-grading-arabic**
### Training Procedure
Fine-tuned with the TRL library, using LoRA adapters and 4-bit quantization. The hyperparameters are listed in the Configuration section below, followed by a hedged training sketch.
### Pipeline
```python
import torch
import torch.nn.functional as F
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer


class MyPipeline:
    def __init__(self, model, tokenizer):
        self.model = model
        self.tokenizer = tokenizer

    def chat_Format(self, context, quetion, answer):
        # Prompt template used at training time; the "Instruction:/n" prefix and the
        # "quetion" spelling are kept on purpose to match what the model was trained on.
        return ("Instruction:/n check answer is true or false of next quetion using context below:\n"
                + "#context: " + context + ".\n#quetion: " + quetion
                + ".\n#student answer: " + answer + ".\n#response:")

    def __call__(self, context, quetion, answer, generate=1, max_new_tokens=4,
                 num_beams=2, do_sample=False, num_return_sequences=1):
        inp = self.chat_Format(context, quetion, answer)
        w = self.tokenizer(inp, add_special_tokens=True, return_attention_mask=True,
                           return_tensors='pt').to(self.model.device)
        response = ""
        if generate:
            # Optionally decode a short textual verdict ("true" / "false").
            outputs = self.tokenizer.batch_decode(
                self.model.generate(input_ids=w['input_ids'], attention_mask=w['attention_mask'],
                                    max_new_tokens=max_new_tokens, num_beams=num_beams,
                                    do_sample=do_sample,
                                    num_return_sequences=num_return_sequences),
                skip_special_tokens=True)
            response = outputs
        # Score the answer: compare the next-token probability mass of "true"-like
        # tokens against "false"-like tokens (English and Arabic variants).
        s = self.model(input_ids=w['input_ids'], attention_mask=w['attention_mask'])['logits'][0][-1]
        s = F.softmax(s, dim=-1)
        yes_token_id = self.tokenizer.convert_tokens_to_ids(self.tokenizer.tokenize("True")[0])
        no_token_id = self.tokenizer.convert_tokens_to_ids(self.tokenizer.tokenize("False")[0])
        for i in ["Yes", "yes", "True", "true", "صحيح"]:
            for word in self.tokenizer.tokenize(i):
                s[yes_token_id] += s[self.tokenizer.convert_tokens_to_ids(word)]
        for i in ["No", "no", "False", "false", "خطأ"]:
            for word in self.tokenizer.tokenize(i):
                s[no_token_id] += s[self.tokenizer.convert_tokens_to_ids(word)]
        true = (s[yes_token_id] / (s[no_token_id] + s[yes_token_id])).item()
        return {"response": response, "true": true}
context="""Large language models, such as GPT-4, are trained on vast amounts of text data to understand and generate human-like text. The deployment of these models involves several steps:
Model Selection: Choosing a pre-trained model that fits the application's needs.
Infrastructure Setup: Setting up the necessary hardware and software infrastructure to run the model efficiently, including cloud services, GPUs, and necessary libraries.
Integration: Integrating the model into an application, which can involve setting up APIs or embedding the model directly into the software.
Optimization: Fine-tuning the model for specific tasks or domains and optimizing it for performance and cost-efficiency.
Monitoring and Maintenance: Ensuring the model performs well over time, monitoring for biases, and updating the model as needed."""
quetion="What are the key considerations when choosing a cloud service provider for deploying a large language model like GPT-4?"
answer="""When choosing a cloud service provider for deploying a large language model like GPT-4, the key considerations include:
Compute Power: Ensure the provider offers high-performance GPUs or TPUs capable of handling the computational requirements of the model.
Scalability: The ability to scale resources up or down based on the application's demand to handle varying workloads efficiently.
Cost: Analyze the pricing models to understand the costs associated with compute time, storage, data transfer, and any other services.
Integration and Support: Availability of tools and libraries that support easy integration of the model into your applications, along with robust technical support and documentation.
Security and Compliance: Ensure the provider adheres to industry standards for security and compliance, protecting sensitive data and maintaining privacy.
Latency and Availability: Consider the geographical distribution of data centers to ensure low latency and high availability for your end-users.
By evaluating these factors, you can select a cloud service provider that aligns with your deployment needs, ensuring efficient and cost-effective operation of your large language model."""
# Load the base model, attach the LoRA adapter, and run the pipeline.
config = PeftConfig.from_pretrained("mohamedemam/Em2-bloomz-7b")
base_model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path)
model = PeftModel.from_pretrained(base_model, "mohamedemam/Em2-bloomz-7b")
tokenizer = AutoTokenizer.from_pretrained("mohamedemam/Em2-bloomz-7b", trust_remote_code=True)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device).eval()

pipe = MyPipeline(model, tokenizer)
print(pipe(context, quetion, answer, generate=True, max_new_tokens=4, num_beams=2,
           do_sample=False, num_return_sequences=1))
```
- **output:**{'response': ["Instruction:/n check answer is true or false of next quetion using context below:\n#context: Large language models, such as GPT-4, are trained on vast amounts of text data to understand and generate human-like text. The deployment of these models involves several steps:\n\n Model Selection: Choosing a pre-trained model that fits the application's needs.\n Infrastructure Setup: Setting up the necessary hardware and software infrastructure to run the model efficiently, including cloud services, GPUs, and necessary libraries.\n Integration: Integrating the model into an application, which can involve setting up APIs or embedding the model directly into the software.\n Optimization: Fine-tuning the model for specific tasks or domains and optimizing it for performance and cost-efficiency.\n Monitoring and Maintenance: Ensuring the model performs well over time, monitoring for biases, and updating the model as needed..\n#quetion: What are the key considerations when choosing a cloud service provider for deploying a large language model like GPT-4?.\n#student answer: When choosing a cloud service provider for deploying a large language model like GPT-4, the key considerations include:\n Compute Power: Ensure the provider offers high-performance GPUs or TPUs capable of handling the computational requirements of the model.\n Scalability: The ability to scale resources up or down based on the application's demand to handle varying workloads efficiently.\n Cost: Analyze the pricing models to understand the costs associated with compute time, storage, data transfer, and any other services.\n Integration and Support: Availability of tools and libraries that support easy integration of the model into your applications, along with robust technical support and documentation.\n Security and Compliance: Ensure the provider adheres to industry standards for security and compliance, protecting sensitive data and maintaining privacy.\n Latency and Availability: Consider the geographical distribution of data centers to ensure low latency and high availability for your end-users.\n\nBy evaluating these factors, you can select a cloud service provider that aligns with your deployment needs, ensuring efficient and cost-effective operation of your large language model..\n#response: true the answer is"], 'true': 0.943033754825592}
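The `true` field is the model's probability that the student answer is correct; mapping it to a grade is left to the application. A minimal sketch (the 0.5 threshold and the `grade` helper are illustrative, not part of the model):
```python
def grade(result, threshold=0.5):
    """Turn the pipeline output into a binary verdict and a percentage score."""
    score = result["true"]  # probability that the answer is judged correct
    return {"correct": score >= threshold, "score_percent": round(score * 100, 1)}

print(grade({"response": [], "true": 0.943}))  # {'correct': True, 'score_percent': 94.3}
```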
### Chat Format Function
This function formats the context, question, and student answer into the prompt the model expects. The template (including the literal `Instruction:/n` prefix and the `quetion` spelling) is kept exactly as used by the pipeline above, since that is the format reflected in the example output.
```python
def chat_Format(self, context, quetion, answer):
    # Keep the exact template used by the pipeline above, including "Instruction:/n" and "quetion".
    return ("Instruction:/n check answer is true or false of next quetion using context below:\n"
            + "#context: " + context + ".\n#quetion: " + quetion
            + ".\n#student answer: " + answer + ".\n#response:")
```
## Configuration
### Dropout Probability for LoRA Layers
- **lora_dropout:** 0.05
### Quantization Settings
- **use_4bit:** True
- **bnb_4bit_compute_dtype:** "float16"
- **bnb_4bit_quant_type:** "nf4"
- **use_nested_quant:** False
### Output Directory
- **output_dir:** "./results"
### Training Parameters
- **num_train_epochs:** 1
- **fp16:** False
- **bf16:** False
- **per_device_train_batch_size:** 1
- **per_device_eval_batch_size:** 4
- **gradient_accumulation_steps:** 8
- **gradient_checkpointing:** True
- **max_grad_norm:** 0.3
- **learning_rate:** 5e-5
- **weight_decay:** 0.001
- **optim:** "paged_adamw_8bit"
- **lr_scheduler_type:** "constant"
- **max_steps:** -1
- **warmup_ratio:** 0.03
- **group_by_length:** True
### Logging and Saving
- **save_steps:** 100
- **logging_steps:** 25
- **max_seq_length:** False
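For reference, a hedged sketch of how the hyperparameters above could be assembled with TRL, PEFT, and bitsandbytes (TRL ≤0.8-style `SFTTrainer` arguments). Only the values documented above come from this card; the base model ID, `r`, `lora_alpha`, the dataset text field, and `max_seq_length` are assumptions:
```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          BitsAndBytesConfig, TrainingArguments)
from trl import SFTTrainer

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # use_4bit
    bnb_4bit_compute_dtype=torch.float16,  # bnb_4bit_compute_dtype
    bnb_4bit_quant_type="nf4",             # bnb_4bit_quant_type
    bnb_4bit_use_double_quant=False,       # use_nested_quant
)
peft_config = LoraConfig(
    lora_dropout=0.05,                     # lora_dropout (from this card)
    r=16, lora_alpha=32,                   # assumed, not documented in this card
    task_type="CAUSAL_LM",
)
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=1,
    fp16=False, bf16=False,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=8,
    gradient_checkpointing=True,
    max_grad_norm=0.3,
    learning_rate=5e-5,
    weight_decay=0.001,
    optim="paged_adamw_8bit",
    lr_scheduler_type="constant",
    max_steps=-1,
    warmup_ratio=0.03,
    group_by_length=True,
    save_steps=100,
    logging_steps=25,
)

base_id = "bigscience/bloomz-7b1"  # assumed base model
model = AutoModelForCausalLM.from_pretrained(base_id, quantization_config=bnb_config, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(base_id)
dataset = load_dataset("mohamedemam/Essay-quetions-auto-grading-arabic", split="train")

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    peft_config=peft_config,
    tokenizer=tokenizer,
    dataset_text_field="text",  # assumed field name
    max_seq_length=1024,        # assumed; the card lists max_seq_length: False
)
trainer.train()
```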