stockmark/stockmark-100b-instruct-v0.1
Stockmark-100b-instruct-v0.1 is an instruction-tuned version of stockmark-100b, a 100-billion-parameter LLM developed by Stockmark Inc.
How to use
import torch
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM
# Prompt template: "### 指示:" marks the instruction and "### 応答:" marks the response.
prompt_template = """### 指示:
{instruction}
### 応答:
"""
tokenizer = AutoTokenizer.from_pretrained("stockmark/stockmark-100b-instruct-v0.1")
model = AutoPeftModelForCausalLM.from_pretrained("stockmark/stockmark-100b-instruct-v0.1", device_map="auto", torch_dtype=torch.bfloat16)
instruction = "生成AIとは?"  # "What is generative AI?"
prompt = prompt_template.format(instruction=instruction)
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
with torch.inference_mode():
    tokens = model.generate(
        input_ids,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.7,
        top_p=0.95,
        repetition_penalty=1.08,
    )
output = tokenizer.decode(tokens[0], skip_special_tokens=True)
print(output)
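For decoder-only models, `generate` returns the prompt tokens followed by the newly generated ones, so the decoded `output` above will normally contain the instruction template as well as the answer. A minimal sketch of separating the answer, assuming the decoded string still contains the `### 応答:` marker from `prompt_template`; the helper name `extract_response` and the placeholder string are hypothetical and not part of the model's repository:

```python
RESPONSE_MARKER = "### 応答:"  # same marker as in prompt_template ("Response")


def extract_response(decoded: str, marker: str = RESPONSE_MARKER) -> str:
    """Return the text after the first response marker.

    Falls back to the whole (stripped) string if the marker is absent.
    """
    _, found, tail = decoded.partition(marker)
    return tail.strip() if found else decoded.strip()


# Placeholder text standing in for a decoded generation (not real model output).
example = "### 指示:\n生成AIとは?\n\n### 応答:\n(model response text)"
print(extract_response(example))
```

An alternative that avoids string matching is to slice off the prompt before decoding, for example `tokenizer.decode(tokens[0][input_ids.shape[1]:], skip_special_tokens=True)`.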
Dataset (fine-tuning)
Performance
Stockmark Business Questions
Dataset: https://huggingface.co/datasets/stockmark/business-questions
model | accuracy |
---|---|
stockmark-100b-instruct | 0.90 |
stockmark-13b-instruct | 0.80 |
GPT-3.5-turbo^1 | 0.42 |
Japanese Vicuna QA Benchmark
We excluded the categories that require calculation and coding and used the remaining 60 questions for evaluation.
GitHub: https://github.com/ku-nlp/ja-vicuna-qa-benchmark
model | average score |
---|---|
stockmark-100b-instruct | 5.97 |
tokyotech-llm/Swallow-70b-instruct-hf | 5.59 |
GPT-3.5 (text-davinci-003) | 5.08 |
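The subset evaluation described above amounts to a category filter followed by a mean. The sketch below assumes a record layout with `category` and `score` keys and uses illustrative category names in `EXCLUDED`; the source only states that calculation and coding categories were excluded, so the actual labels may differ. The scores are toy values, not benchmark results.

```python
from statistics import mean

# Hypothetical category labels; the names in the benchmark files may differ.
EXCLUDED = {"math", "coding"}


def average_score(records, excluded=EXCLUDED):
    """Mean judge score over records whose category is not excluded."""
    kept = [r["score"] for r in records if r["category"] not in excluded]
    if not kept:
        raise ValueError("no records left after filtering")
    return mean(kept)


# Toy data for illustration only.
toy_records = [
    {"category": "generic", "score": 6},
    {"category": "writing", "score": 8},
    {"category": "math", "score": 2},    # dropped by the filter
    {"category": "coding", "score": 3},  # dropped by the filter
]
print(average_score(toy_records))
```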
Inference speed
model | time [s] for generating 100 characters in Japanese |
---|---|
stockmark-100b-instruct | 1.86 |
gpt-3.5-turbo | 2.15 |
gpt-4-turbo | 5.48 |
tokyotech-llm/Swallow-70b-instruct-hf | 2.22 |
For local LLMs, we measured the inference time using AWS Inferentia2.
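One way to obtain a "seconds per 100 characters" figure is to time a single generation call and normalize by the number of characters produced. This is only a sketch under that assumption: the source does not state how the measurement was normalized, and `generate_fn` and `fake_generate` are hypothetical stand-ins for a real model call.

```python
import time


def seconds_per_100_chars(generate_fn, prompt):
    """Time one call to generate_fn(prompt) and scale to 100 generated characters."""
    start = time.perf_counter()
    text = generate_fn(prompt)
    elapsed = time.perf_counter() - start
    if not text:
        raise ValueError("generate_fn returned no text")
    return elapsed * 100 / len(text)


def fake_generate(prompt):
    # Placeholder for a real model call; sleeps briefly and returns 50 characters.
    time.sleep(0.01)
    return "あ" * 50


print(seconds_per_100_chars(fake_generate, "生成AIとは?"))
```

In practice, averaging over several prompts and discarding a warm-up run would give a steadier number than a single call.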