thethinkmachine committed
Commit 99258c6 · verified · 1 Parent(s): d82d4d4

Update README.md

Files changed (1): README.md (+18 −8)
README.md CHANGED
@@ -49,10 +49,7 @@ $$\text{S}_{predicted} \times (\text{max} - \text{min}) + \text{min}$$
 Use the code below to get started with the model.
 
 ```python
-import torch
-from transformers import AutoTokenizer, AutoModelForSequenceClassification
-
-model_name = "thethinkmachine/Maxwell-Task-Complexity-Exp-v1"
+model_name = "thethinkmachine/Maxwell-Task-Complexity-Scorer-v0.2"
 tokenizer = AutoTokenizer.from_pretrained(model_name)
 model = AutoModelForSequenceClassification.from_pretrained(model_name)
 
@@ -66,16 +63,29 @@ def get_deita_complexity_score(question: str) -> int:
     final_score = torch.round(final_score)
     return final_score.item()
 
-query = "What is the capital of France?"
-final_score = get_deita_complexity_score(query)
-print(final_score)
+def get_scaled_complexity_score(question: str) -> float:
+    inputs = tokenizer(question, return_tensors="pt")
+    with torch.no_grad():
+        outputs = model(**inputs)
+    normalized_pred = outputs.logits.squeeze()
+    final_score = normalized_pred * (max_score - min_score) + min_score
+    final_score = torch.clamp(final_score, min=min_score, max=max_score)
+    final_score = final_score.item()
+    return round(final_score, 2)
+
+query = "Is learning equivalent to decreasing local entropy?"
+max_score = 100
+min_score = 0
+
+print("DEITA Evol-Complexity Score:", get_deita_complexity_score(query))  # 2
+print("Scaled Complexity Score:", get_scaled_complexity_score(query))  # 28.39...
 ```
 
 ## Training Details
 
 ### Training Data
 
-We use the [BhabhaAI/DEITA-Complexity](https://huggingface.co/datasets/BhabhaAI/DEITA-Complexity) dataset for training the model. It contains 66.5K diverse English instructions along with complexity scores computed using the DEITA-Evol-Complexity scoring scheme, which uses an LLM judge to rank a sextuple of 1 seed + 5 progressively complexified (*evolved*) instructions by their contextual complexity & difficulty. The scheme assigns scores in the range [1, 6], with 1 being the least complex and 6 the most complex.
+We use the [BhabhaAI/DEITA-Complexity](https://huggingface.co/datasets/BhabhaAI/DEITA-Complexity) dataset for training the model. It contains 66.5K diverse English instructions along with complexity scores computed using the DEITA-Evol-Complexity scoring scheme, which uses an LLM judge to rank a sextuple of 1 seed + 5 progressively complexified (*evolved*) instructions by their complexity & difficulty. The scheme assigns scores in the range [1, 6], with 1 being the least complex and 6 the most complex.
 
 However, the training dataset was observed to contain instruction-score pairs spanning the range [0, 9]. We suspect this range includes scoring errors, as the anomalous scores (0, 7, 8, 9) account for less than 1% of all instructions.
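The min-max rescaling that the diff's `get_scaled_complexity_score` relies on, and the filtering of anomalous scores described above, can be sketched in isolation. This is an illustrative sketch, not code from the model card: the helper names (`normalize`, `denormalize`), the example instruction-score pairs, and the choice to normalize the valid [1, 6] label range to [0, 1] before rescaling are all assumptions for demonstration.

```python
# Hypothetical sketch (not from the model card): drop the anomalous DEITA
# scores (0, 7, 8, 9) and min-max normalize the valid [1, 6] range to [0, 1],
# mirroring the rescaling formula S_predicted * (max - min) + min.

MIN_SCORE, MAX_SCORE = 1, 6  # valid DEITA Evol-Complexity range

def normalize(score: int) -> float:
    """Map a DEITA Evol-Complexity score in [1, 6] to [0, 1]."""
    return (score - MIN_SCORE) / (MAX_SCORE - MIN_SCORE)

def denormalize(pred: float) -> float:
    """Invert the mapping: S_predicted * (max - min) + min."""
    return pred * (MAX_SCORE - MIN_SCORE) + MIN_SCORE

# Example instruction-score pairs; scores outside [1, 6] are treated as errors.
pairs = [("q1", 2), ("q2", 6), ("q3", 9), ("q4", 1), ("q5", 0)]
clean = [(q, s) for q, s in pairs if MIN_SCORE <= s <= MAX_SCORE]

labels = [normalize(s) for _, s in clean]
print(clean)   # anomalous q3 and q5 are dropped
print(labels)  # [0.2, 1.0, 0.0]
```

With a different target range (e.g. `max = 100`, `min = 0` as in the diff's example), the same `denormalize` formula yields the scaled complexity score.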