no code yet
README.md
CHANGED
@@ -27,43 +27,7 @@ An uncensored reasoning EXAONE 3.5 model trained on reasoning data. Now with a f
 It has been trained using improved training code, and gives an improved performance.
 Here is what inference code you should use:
 ```py
-
-
-MAX_REASONING_TOKENS = 1024
-MAX_RESPONSE_TOKENS = 512
-
-model_name = "lunahr/thea-pro-2b-100r"
-
-model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto", trust_remote_code=True)
-tokenizer = AutoTokenizer.from_pretrained(model_name)
-
-prompt = "Which is greater 9.9 or 9.11 ??"
-messages = [
-    {"role": "user", "content": prompt}
-]
-
-# Generate reasoning
-input_ids = tokenizer.apply_chat_template(messages, tokenize=False, add_reasoning_prompt=True, return_tensors="pt")
-output = model.generate(
-    input_ids.to("cuda"),
-    eos_token_id=tokenizer.eos_token_id,
-    max_new_tokens=MAX_REASONING_TOKENS,
-    do_sample=False,
-)
-
-print("REASONING: " + tokenizer.decode(output[0]))
-
-# Generate answer
-messages.append({"role": "reasoning", "content": reasoning_output})
-input_ids = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True, return_tensors="pt")
-output = model.generate(
-    input_ids.to("cuda"),
-    eos_token_id=tokenizer.eos_token_id,
-    max_new_tokens=MAX_RESPONSE_TOKENS,
-    do_sample=False,
-)
-
-print("REASONING: " + tokenizer.decode(output[0]))
+# DEBUGGING IN PROGRESS, check later
 ```
 
 - **Trained by:** [Piotr Zalewski](https://huggingface.co/lunahr)
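
Note on the removed snippet: as written it would not have run. With `tokenize=False`, `apply_chat_template` returns a string, which has no `.to("cuda")`; `reasoning_output` is never assigned; and the second `print` is labelled `REASONING` although it prints the answer. The two-pass flow the snippet describes (generate reasoning, append it as a `reasoning` turn, then generate the answer) might look roughly like the sketch below. This is not the author's fix, only a sketch under assumptions: the helper names `generate_stage` and `reason_then_answer` are hypothetical, and the `add_reasoning_prompt` flag and the `reasoning` role are taken from the removed code and assumed to be understood by this model's chat template.

```python
# Sketch only: `model` and `tokenizer` are expected to be a transformers causal LM
# and its tokenizer, loaded the same way as in the removed README snippet.

def generate_stage(model, tokenizer, messages, max_new_tokens, **template_kwargs):
    """Run one greedy generation pass and return only the newly generated text."""
    # tokenize=True + return_dict=True yields tensors (input_ids, attention_mask)
    # instead of the plain string that tokenize=False would return.
    inputs = tokenizer.apply_chat_template(
        messages,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
        **template_kwargs,  # e.g. add_reasoning_prompt=True (assumed template flag)
    ).to(model.device)
    prompt_len = inputs["input_ids"].shape[-1]
    output = model.generate(
        **inputs,
        eos_token_id=tokenizer.eos_token_id,
        max_new_tokens=max_new_tokens,
        do_sample=False,
    )
    # Drop the prompt tokens so only the continuation is decoded.
    return tokenizer.decode(output[0][prompt_len:], skip_special_tokens=True)


def reason_then_answer(model, tokenizer, prompt,
                       max_reasoning_tokens=1024, max_response_tokens=512):
    """Two passes: first the reasoning, then the answer conditioned on it."""
    messages = [{"role": "user", "content": prompt}]
    reasoning = generate_stage(
        model, tokenizer, messages, max_reasoning_tokens,
        add_reasoning_prompt=True,
    )
    # The "reasoning" role comes from the removed snippet; it is assumed to be
    # a role this model's chat template accepts.
    messages.append({"role": "reasoning", "content": reasoning})
    answer = generate_stage(
        model, tokenizer, messages, max_response_tokens,
        add_generation_prompt=True,
    )
    return reasoning, answer
```

With the model and tokenizer loaded as in the removed lines, a call such as `reason_then_answer(model, tokenizer, "Which is greater 9.9 or 9.11 ??")` would return the reasoning and answer as two strings. Using `model.device` instead of a hard-coded `"cuda"` keeps the sketch usable with `device_map="auto"`.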