Llama-3.1-8B-Chat

meta-llama/Meta-Llama-3.1-8B fine-tuned for chat completions.

Obligatory notice: this model was Built with Llama.

Quick start

Simply load the model and generate responses:

from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
)


model = AutoModelForCausalLM.from_pretrained("mathewhe/Llama-3.1-8B-Chat")
tokenizer = AutoTokenizer.from_pretrained("mathewhe/Llama-3.1-8B-Chat")

messages = [
    {"role": "user", "content": "What is an LLM?"},
]

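# Return PyTorch tensors in a dict so they can be unpacked into generate()
inputs = tokenizer.apply_chat_template(
    messages,
    return_tensors="pt",
    return_dict=True,
)

# Cap the response length; adjust max_new_tokens as needed
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0]))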

Alternatively, copy the included chat_class.py module into your local directory and just import the Chat class:

from chat_class import Chat
chat = Chat(
    "mathewhe/Llama-3.1-8B-Chat",
    device="cuda",
)

# for one-off instructions
instruction = "Write an ingredient list for banana pudding."
print(chat.instruct(instruction))

# for multi-turn chat
response1 = chat.message("Hi, please explain what DNA is.")
response2 = chat.message("Tell me more about how its discovery affected society.")

# to reset the chat
chat.reset()
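
The difference between one-off instructions and multi-turn chat comes down to whether earlier turns are sent along with the new message. The sketch below illustrates that idea with a hypothetical `ChatSession` class and a toy `echo_model` function; both names are assumptions made for illustration, and this is not the code in chat_class.py.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


class ChatSession:
    """Hypothetical stand-in for a chat wrapper; not the shipped Chat class."""

    def __init__(self, generate_fn: Callable[[List[Message]], str]) -> None:
        # generate_fn maps a list of messages to the assistant's reply text.
        self._generate = generate_fn
        self.history: List[Message] = []

    def instruct(self, instruction: str) -> str:
        # One-off request: the stored history is neither read nor modified.
        return self._generate([{"role": "user", "content": instruction}])

    def message(self, text: str) -> str:
        # Multi-turn request: send the whole conversation, then record the reply.
        self.history.append({"role": "user", "content": text})
        reply = self._generate(self.history)
        self.history.append({"role": "assistant", "content": reply})
        return reply

    def reset(self) -> None:
        # Forget all earlier turns.
        self.history.clear()


def echo_model(messages: List[Message]) -> str:
    # Toy generator used only for demonstration; a real one would call the model.
    return f"reply to {len(messages)} message(s)"


session = ChatSession(echo_model)
first = session.message("Hi, please explain what DNA is.")
second = session.message("Tell me more about how its discovery affected society.")
```

In real use, `generate_fn` would wrap the tokenizer and model calls from the first example; keeping it as a parameter makes the history handling easy to check in isolation.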

Performance

We verified that this model was successfully aligned for both multi-turn dialogue and one-off instruction following.

| Model | AlpacaEval | AlpacaEval-LC |
| --- | --- | --- |
| meta-llama/Meta-Llama-3.1-8B-Instruct | 21.84 | 20.85 |
| mathewhe/Llama-3.1-8B-Chat | 12.16 | 20.53 |
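
One way to read these scores is to compare the gap between the two models on each metric. The short calculation below uses only the numbers listed above. Treating them as win rates in percent, and AlpacaEval-LC as the length-controlled variant, is an assumption about the benchmark rather than something stated here.

```python
# Scores copied from the table above (assumed to be win rates in percent).
scores = {
    "meta-llama/Meta-Llama-3.1-8B-Instruct": {"AlpacaEval": 21.84, "AlpacaEval-LC": 20.85},
    "mathewhe/Llama-3.1-8B-Chat": {"AlpacaEval": 12.16, "AlpacaEval-LC": 20.53},
}


def gap(metric: str) -> float:
    """Instruct score minus Chat score on the given metric, in points."""
    instruct = scores["meta-llama/Meta-Llama-3.1-8B-Instruct"][metric]
    chat = scores["mathewhe/Llama-3.1-8B-Chat"][metric]
    return round(instruct - chat, 2)


raw_gap = gap("AlpacaEval")
lc_gap = gap("AlpacaEval-LC")
print(f"AlpacaEval gap: {raw_gap} points, AlpacaEval-LC gap: {lc_gap} points")
```

Under that reading, the two models are within about a third of a point on the length-controlled metric while differing by nearly ten points on the raw one.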

Chat template

This model uses the following chat template and does not support a separate system prompt:

<|begin_of_text|>[INST]<user-message>[/INST][ASST]<llm-response>[/ASST]<|end_of_text|>

The included tokenizer will correctly format messages, so you should not have to manually format the input text.

Instead, use the tokenizer's apply_chat_template() method on a list of messages. Each message should be a dict with two keys:

  • "role": Either "user" or "assistant".
  • "content": The message to include.
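
Because the template has no system slot, it can help to check a message list before handing it to the tokenizer. The helper below is a minimal sketch of such a check; `validate_messages` and `ALLOWED_ROLES` are hypothetical names, not part of the tokenizer or of chat_class.py.

```python
from typing import Dict, List

# Roles taken from the list above; "system" is deliberately absent.
ALLOWED_ROLES = {"user", "assistant"}


def validate_messages(messages: List[Dict[str, str]]) -> None:
    """Raise ValueError if any message does not match the shape described above."""
    for i, msg in enumerate(messages):
        if set(msg) != {"role", "content"}:
            raise ValueError(f"message {i}: expected exactly the keys 'role' and 'content'")
        if msg["role"] not in ALLOWED_ROLES:
            raise ValueError(f"message {i}: unsupported role {msg['role']!r}")
        if not isinstance(msg["content"], str):
            raise ValueError(f"message {i}: 'content' must be a string")


# A well-formed conversation passes silently.
validate_messages([
    {"role": "user", "content": "What is an LLM?"},
    {"role": "assistant", "content": "A large language model."},
])
```

A message with the role "system" would raise a ValueError here, which surfaces the limitation early instead of leaving it to the template.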

For example:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mathewhe/Llama-3.1-8B-Chat")

messages = [
    {"role": "user", "content": "Solve for x: 3x=4"},
    {"role": "assistant", "content": "3x=4\n(3x)/3=(4)/3\nx=4/3"},
    {"role": "user", "content": "Please explain your work."},
]
print(tokenizer.apply_chat_template(messages, tokenize=False))

This outputs:

<|begin_of_text|>[INST]Solve for x: 3x=4[/INST][ASST]3x=4
(3x)/3=(4)/3
x=4/3[/ASST]<|end_of_text|><|begin_of_text|>[INST]Please explain your work.[/INST]
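
To make the layout concrete, the same string can be rebuilt by hand. The function below is a sketch inferred only from the example output above; `format_chat` is a hypothetical name, and the template shipped with the tokenizer remains the authoritative source and may differ in edge cases.

```python
from typing import Dict, List

BOS = "<|begin_of_text|>"
EOS = "<|end_of_text|>"


def format_chat(messages: List[Dict[str, str]]) -> str:
    """Rebuild the chat string by hand (illustration only; prefer the tokenizer)."""
    parts = []
    for msg in messages:
        if msg["role"] == "user":
            # Each user turn opens a new sequence.
            parts.append(f"{BOS}[INST]{msg['content']}[/INST]")
        elif msg["role"] == "assistant":
            # Each assistant turn closes the sequence opened by the user turn.
            parts.append(f"[ASST]{msg['content']}[/ASST]{EOS}")
        else:
            raise ValueError(f"unsupported role: {msg['role']!r}")
    return "".join(parts)


messages = [
    {"role": "user", "content": "Solve for x: 3x=4"},
    {"role": "assistant", "content": "3x=4\n(3x)/3=(4)/3\nx=4/3"},
    {"role": "user", "content": "Please explain your work."},
]
formatted = format_chat(messages)
print(formatted)
```

Note that the final user turn is left open after `[/INST]`, which is where the model's response would begin.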

See the example code in the included chat_class.py module for more details.

Data

This model was trained on the following three datasets:
