Can't get any reasonable output
Hi!
Thanks for your work on improving the German capabilities of open LLMs :)
I tried to use your model in a toy example, but I only seem to get repetitions of the input prompt.
I tried several temperatures and prompts. Any hints on what I'm doing wrong?
This is my full code:
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
model = AutoModelForCausalLM.from_pretrained(
    pretrained_model_name_or_path="LeoLM/leo-hessianai-13b",
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained("LeoLM/leo-hessianai-13b")
# taken from https://huggingface.co/spaces/huggingface-projects/llama-2-7b-chat/blob/main/model.py#L20
def get_prompt(message: str, chat_history: list[tuple[str, str]],
               system_prompt: str) -> str:
    texts = [f'<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n']
    # The first user input is _not_ stripped
    do_strip = False
    for user_input, response in chat_history:
        user_input = user_input.strip() if do_strip else user_input
        do_strip = True
        texts.append(f'{user_input} [/INST] {response.strip()} </s><s>[INST] ')
    message = message.strip() if do_strip else message
    texts.append(f'{message} [/INST]')
    return ''.join(texts)
prompt = get_prompt(message="Hi, kannst du mit mir reden?", chat_history=[], system_prompt="Du bist ein netter, hilfsbereiter Sprachassistent.")
inputs = tokenizer([prompt], return_tensors='pt', add_special_tokens=False)
# Generate
generate_ids = model.generate(
    inputs.input_ids.to("cuda"),
    max_length=300,
    do_sample=True,
    top_p=0.95,
    top_k=50,
    temperature=0.8,
    num_beams=1
)
print(tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0])
The output was:
'[INST] <<SYS>>\nDu bist ein netter, hilfsbereiter Sprachassistent.\n<</SYS>>\n\nHi, kannst du mit mir reden? [/INST]\n\n[INST] <<SYS>>\nDu bist ein netter, hilfsbereiter Sprachassistent.\n\nHi, kannst du mit mir reden? [/INST]\n\n[INST] <<SYS>>\nDu bist ein netter, hilfsbereiter Sprachassistent.\n\nHi, kannst du mit mir reden? [/INST]\n\n[INST] <<SYS>>\nDu bist ein netter, hilfsbereiter Sprachassistent.\n\nHi, kannst du mit mir reden? [/INST]\n\n[INST] <<SYS>>\nDu bist ein netter, hilfsbereiter Sprachassistent.\n\nHi, kannst du mit mir reden? [/INST]\n\n[INST] <<SYS>>\nDu bist ein netter, hilfsbereiter Sprachassistent.\n\nHi, kannst du mit mir reden? [/INST]\n\n[INST] <<SYS>>\nDu bist ein netter, hilfsbereiter Sprachassistent.\n\nHi'
LeoLM/leo-hessianai-13b and LeoLM/leo-hessianai-7b are our base models and are not intended for direct use in a chat format. For chat models, check out LeoLM/leo-hessianai-13b-chat and LeoLM/leo-hessianai-7b-chat. Let me know if these work better for you :)
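For anyone landing here with the same problem: the chat variants also expect their own prompt format, not the Llama-2 `[INST]` format used in the question. The following is only a rough sketch, assuming the chat models use a ChatML-style template with `<|im_start|>` and `<|im_end|>` markers; please verify the exact template on the model card before relying on it. The helper name `build_chatml_prompt` is made up for illustration.

```python
def build_chatml_prompt(system_prompt: str,
                        chat_history: list[tuple[str, str]],
                        message: str) -> str:
    """Assemble a ChatML-style prompt (assumed format; verify on the model card)."""
    parts = [f"<|im_start|>system\n{system_prompt}<|im_end|>\n"]
    # Replay earlier turns as alternating user/assistant blocks.
    for user_input, response in chat_history:
        parts.append(f"<|im_start|>user\n{user_input.strip()}<|im_end|>\n")
        parts.append(f"<|im_start|>assistant\n{response.strip()}<|im_end|>\n")
    parts.append(f"<|im_start|>user\n{message.strip()}<|im_end|>\n")
    # Leave the assistant turn open so the model generates the reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = build_chatml_prompt(
    system_prompt="Du bist ein netter, hilfsbereiter Sprachassistent.",
    chat_history=[],
    message="Hi, kannst du mit mir reden?",
)
print(prompt)
```

The resulting string can be passed to the tokenizer in place of the output of `get_prompt` from the original post, with the model name swapped for one of the chat variants.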
Thanks! Works way better now.
Besides using the wrong model, I also used the wrong prompt template...
If the chat models are the ones to use for chat, what would be the general use case for the base models LeoLM/leo-hessianai-13b and LeoLM/leo-hessianai-7b?
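Not an answer from the maintainers, but in general base models are used for plain text continuation, for few-shot prompting, or as a starting point for fine-tuning. A minimal sketch of the few-shot idea follows, assuming a translation-style task; the helper `build_few_shot_prompt` and the example pairs are made up for illustration and do not come from the model card.

```python
def build_few_shot_prompt(examples: list[tuple[str, str]], query: str) -> str:
    """Format worked examples so that the natural continuation is the answer."""
    blocks = []
    for source, target in examples:
        blocks.append(f"Englisch: {source}\nDeutsch: {target}\n")
    # End on an open "Deutsch:" line; a base model is expected to complete it.
    blocks.append(f"Englisch: {query}\nDeutsch:")
    return "\n".join(blocks)


few_shot_prompt = build_few_shot_prompt(
    examples=[
        ("Good morning.", "Guten Morgen."),
        ("Thank you very much.", "Vielen Dank."),
    ],
    query="How are you?",
)
print(few_shot_prompt)
```

Unlike a chat template, this prompt contains no special role markers: the base model just continues the text, so the pattern set by the examples does the steering.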