Introduction

We introduce Llama-3-Motif, a new language model family from Moreh that specializes in Korean and English.
Llama-3-Motif-102B-Instruct is a chat model fine-tuned from the base model Llama-3-Motif-102B.

Training Platform

  • The Llama-3-Motif-102B model family was trained on the MoAI platform; refer to the link for more information.

Quick Usage

You can chat directly with our model Llama-3-Motif through our Model hub.

Details

More details will be provided in the upcoming technical report.
The effective context length is 32k tokens (avg. 81), as measured on the RULER benchmark.
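
As a rough illustration of what the 32k figure means in practice, the sketch below checks whether a prompt plus the requested generation length stays inside that window. The helper name `fits_effective_context` is hypothetical, and reading "32k" as 32 × 1024 tokens is an assumption; neither comes from the model's own API.

```python
# Assumption: "32k" is read as 32 * 1024 tokens; the model card does not spell this out.
EFFECTIVE_CONTEXT_TOKENS = 32 * 1024


def fits_effective_context(prompt_tokens: int, max_new_tokens: int,
                           limit: int = EFFECTIVE_CONTEXT_TOKENS) -> bool:
    """Return True if the prompt plus the generated tokens stay within the limit."""
    if prompt_tokens < 0 or max_new_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + max_new_tokens <= limit


# Hypothetical usage: two prompt sizes, each leaving room for 512 new tokens.
print(fits_effective_context(30_000, 512))
print(fits_effective_context(32_700, 512))
```

In practice `prompt_tokens` would come from the tokenizer, for example `len(tokenizer(prompt)["input_ids"])`.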

Release Date

2024.12.02

Benchmark Results

| Provider | Model | kmmlu_direct score |
|---|---|---|
| Moreh | Llama-3-Motif-102B | 64.74 + |
| Moreh | Llama-3-Motif-102B-Instruct | 64.81 + |
| Meta | Llama3-70B-instruct | 54.5* |
| Meta | Llama3.1-70B-instruct | 52.1* |
| Meta | Llama3.1-405B-instruct | 65.8* |
| Alibaba | Qwen2-72B-instruct | 64.1* |
| OpenAI | GPT-4-0125-preview | 59.95* |
| OpenAI | GPT-4o-2024-05-13 | 64.11** |
| Google | gemini pro | 50.18* |
| LG | exaone 3.0 | 44.5* + |
| Naver | HyperCLOVA X | 53.4* + |
| Upstage | SOLAR-10.7B | 41.65* + |

* : Community report
** : Measured by Moreh
+ : Claimed to have better capability in Korean

How to use

Use with vLLM

  • Refer to this link to install vLLM.
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

# Set tensor_parallel_size to the number of GPUs you have available
model = LLM("moreh/Llama-3-Motif-102B-Instruct", tensor_parallel_size=4)
tokenizer = AutoTokenizer.from_pretrained("moreh/Llama-3-Motif-102B-Instruct")
messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    # Korean prompt: "Explain the concept of the Big Bang theory to a kindergartner"
    {"role": "user", "content": "유치원생에게 빅뱅 이론의 개념을 설명해보세요"},
]

messages_batch = [tokenizer.apply_chat_template(conversation=messages, add_generation_prompt=True, tokenize=False)]

# vLLM does not apply the Hugging Face generation_config here, so the sampling parameters are set explicitly
sampling_params = SamplingParams(max_tokens=512, temperature=0, repetition_penalty=1.0, stop_token_ids=[tokenizer.eos_token_id])
responses = model.generate(messages_batch, sampling_params=sampling_params)

print(responses[0].outputs[0].text)
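
Because `model.generate` takes a list of prompts, several conversations can be sent in one call. The sketch below only builds the per-conversation message lists; `build_conversations` is a hypothetical helper (not part of vLLM or transformers), and the default system prompt is simply the one used in the example above.

```python
from typing import Dict, List

Message = Dict[str, str]


def build_conversations(user_prompts: List[str],
                        system_prompt: str = "You are a helpful assistant") -> List[List[Message]]:
    """Wrap each user prompt in its own system + user message list."""
    return [
        [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": prompt},
        ]
        for prompt in user_prompts
    ]


# Hypothetical usage with two placeholder prompts.
conversations = build_conversations(["Hello", "What is KMMLU?"])
print(len(conversations), conversations[1][1]["content"])
```

Each conversation can then be rendered with `tokenizer.apply_chat_template(conversation=c, add_generation_prompt=True, tokenize=False)`, and the resulting list of strings passed to `model.generate` in place of `messages_batch`.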

Use with transformers

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "moreh/Llama-3-Motif-102B-Instruct"

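# Generation settings are read from generation_config.json in the model repository.
# Load in BF16 and let device_map="auto" (requires the accelerate package) spread
# the 102B parameters across the available GPUs; a single GPU is unlikely to hold them.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")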
tokenizer = AutoTokenizer.from_pretrained(model_id)
messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    # Korean prompt: "Explain the concept of the Big Bang theory to a kindergartner"
    {"role": "user", "content": "유치원생에게 빅뱅 이론의 개념을 설명해보세요"},
]

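prompt = tokenizer.apply_chat_template(conversation=messages, add_generation_prompt=True, tokenize=False)
input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"].to(model.device)

outputs = model.generate(input_ids)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))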