ymoslem committed · Commit e244648 · verified · 1 Parent(s): df1c4a1

Update README.md

  1. README.md +147 -10
README.md CHANGED
@@ -37,7 +37,29 @@ datasets:
  - ymoslem/wmt-da-human-evaluation
  model-index:
  - name: Quality Estimation for Machine Translation
-   results: []
  ---
 
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -51,15 +73,7 @@ It achieves the following results on the evaluation set:
 
  ## Model description
 
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
 
  ## Training procedure
 
@@ -106,3 +120,126 @@ The following hyperparameters were used during training:
  - Pytorch 2.4.1+cu124
  - Datasets 3.2.0
  - Tokenizers 0.21.0

  - ymoslem/wmt-da-human-evaluation
  model-index:
  - name: Quality Estimation for Machine Translation
+   results:
+   - task:
+       type: regression
+     dataset:
+       name: ymoslem/wmt-da-human-evaluation
+       type: QE
+     metrics:
+     - name: Pearson Correlation
+       type: Pearson
+       value: 0.422
+     - name: Mean Absolute Error
+       type: MAE
+       value: 0.196
+     - name: Root Mean Squared Error
+       type: RMSE
+       value: 0.245
+     - name: R-Squared
+       type: R2
+       value: 0.245
+ metrics:
+ - perplexity
+ - mae
+ - r_squared
  ---
 
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 
 
  ## Model description
 
+ This model is for reference-free quality estimation (QE) of machine translation (MT) systems.
 
  ## Training procedure
 
 
  - Pytorch 2.4.1+cu124
  - Datasets 3.2.0
  - Tokenizers 0.21.0
+
+ ## Inference
+
+ 1. Install the required libraries.
+
+ ```bash
+ pip3 install --upgrade datasets accelerate transformers
+ pip3 install --upgrade flash_attn triton
+ ```
+
+ 2. Load the test dataset.
+
+ ```python
+ from datasets import load_dataset
+
+ test_dataset = load_dataset("ymoslem/wmt-da-human-evaluation",
+                             split="test",
+                             trust_remote_code=True
+                             )
+ print(test_dataset)
+ ```
+
+ 3. Load the model and tokenizer.
+
+ ```python
+ from transformers import AutoModelForSequenceClassification, AutoTokenizer
+ import torch
+
+ # Load the fine-tuned model and tokenizer
+ model_name = "ymoslem/ModernBERT-large-qe-v1"
+ model = AutoModelForSequenceClassification.from_pretrained(
+     model_name,
+     device_map="auto",
+     torch_dtype=torch.bfloat16,
+     attn_implementation="flash_attention_2",
+ )
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+
+ # Move model to GPU if available
+ device = "cuda" if torch.cuda.is_available() else "cpu"
+ model.to(device)
+ model.eval()
+ ```
+
+ 4. Prepare the dataset. Each source segment `src` and target segment `tgt` (the `mt` column) are separated by the `sep_token`, which is `'[SEP]'` for ModernBERT.
+
+ ```python
+ sep_token = tokenizer.sep_token
+ input_test_texts = [f"{src} {sep_token} {tgt}" for src, tgt in zip(test_dataset["src"], test_dataset["mt"])]
+ ```
+
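As a quick illustration of the input format described in step 4, the sketch below joins two invented sentence pairs. The separator is hard-coded as a placeholder so the snippet runs without downloading the tokenizer, and the `rows` dictionary is a hypothetical stand-in for the `src` and `mt` columns; in practice the separator should be read from `tokenizer.sep_token`.

```python
# Placeholder separator (assumption); in practice use tokenizer.sep_token
sep_token = "[SEP]"

# Invented rows that mimic the "src" and "mt" columns of the test set
rows = {
    "src": ["Guten Morgen.", "Wie geht es dir?"],
    "mt": ["Good morning.", "How are you?"],
}

# Same string template as in step 4: source, separator, machine translation
input_texts = [
    f"{src} {sep_token} {tgt}"
    for src, tgt in zip(rows["src"], rows["mt"])
]

for text in input_texts:
    print(text)
```

Each resulting string is a single sequence, so the tokenizer treats the pair as one input rather than as two separate segments.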
+ 5. Generate predictions.
+
+ If you print `model.config.problem_type`, the output is `regression`.
+ Still, you can use the "text-classification" pipeline as follows (cf. [pipeline documentation](https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.TextClassificationPipeline)):
+
+ ```python
+ from transformers import pipeline
+
+ classifier = pipeline("text-classification",
+                       model=model_name,
+                       tokenizer=tokenizer,
+                       device=0,
+                       )
+
+ predictions = classifier(input_test_texts,
+                          batch_size=128,
+                          truncation=True,
+                          padding="max_length",
+                          max_length=tokenizer.model_max_length,
+                          )
+ predictions = [prediction["score"] for prediction in predictions]
+ ```
+
+ Alternatively, you can use a more elaborate version of the code, which is slightly faster and provides more control.
+
+ ```python
+ from torch.utils.data import DataLoader
+ import torch
+ from tqdm.auto import tqdm
+
+ # Tokenization function
+ def process_batch(batch, tokenizer, device):
+     sep_token = tokenizer.sep_token
+     input_texts = [f"{src} {sep_token} {tgt}" for src, tgt in zip(batch["src"], batch["mt"])]
+     tokens = tokenizer(input_texts,
+                        truncation=True,
+                        padding="max_length",
+                        max_length=tokenizer.model_max_length,
+                        return_tensors="pt",
+                        ).to(device)
+     return tokens
+
+
+ # Create a DataLoader for batching
+ test_dataloader = DataLoader(test_dataset,
+                              batch_size=128,  # Adjust batch size as needed
+                              shuffle=False)
+
+
+ # List to store all predictions
+ predictions = []
+
+ with torch.no_grad():
+     for batch in tqdm(test_dataloader, desc="Inference Progress", unit="batch"):
+
+         tokens = process_batch(batch, tokenizer, device)
+
+         # Forward pass: Generate model's logits
+         outputs = model(**tokens)
+
+         # Get logits (predictions)
+         logits = outputs.logits
+
+         # Extract the regression values; squeeze only the last dimension
+         # so that a final batch of size 1 still yields a list
+         batch_predictions = logits.squeeze(-1)
+
+         # Extend the list with the predictions
+         predictions.extend(batch_predictions.tolist())
+ ```
+
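The model-index metadata reports Pearson correlation, MAE, RMSE, and R-squared, but the inference steps stop at `predictions`. The following is a minimal sketch of how such metrics could be computed, not the author's evaluation script. The helper name `regression_metrics` and the toy values are hypothetical; with real data, the references would presumably come from the dataset's human-score column (assumed here to be `test_dataset["score"]`).

```python
import math


def regression_metrics(y_true, y_pred):
    """Return Pearson r, MAE, RMSE, and R-squared for two equal-length sequences."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n

    # Sums of squares and cross-products around the means
    ss_t = sum((t - mean_t) ** 2 for t in y_true)
    ss_p = sum((p - mean_p) ** 2 for p in y_pred)
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))

    # Residual sum of squares between references and predictions
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))

    return {
        "pearson": cov / math.sqrt(ss_t * ss_p),
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
        "rmse": math.sqrt(ss_res / n),
        "r2": 1.0 - ss_res / ss_t,
    }


# Toy values for illustration only; they are not taken from the dataset
references = [0.10, 0.40, 0.35, 0.80]
toy_predictions = [0.20, 0.30, 0.45, 0.70]

metrics = regression_metrics(references, toy_predictions)
print({name: round(value, 3) for name, value in metrics.items()})
```

The helper divides by the variance of each input, so it assumes neither sequence is constant. Libraries such as SciPy and scikit-learn provide equivalent functions if they are already installed.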