Qwen2.5 32B for Japanese to English Light Novel translation
This model was fine-tuned on light novels and web novels for Japanese-to-English translation.
It can translate entire chapters (up to 32K tokens total for input and output).
Usage
Load the GGUF file in llama.cpp or any compatible runtime.
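For example, with the llama-cpp-python bindings (one of several llama.cpp front ends; the file name below is a placeholder for whichever quantization you downloaded), loading might look like this:

from llama_cpp import Llama

llm = Llama(
    model_path="lightnovel-translate-Qwen2.5-32B-Q4_K_M.gguf",  # placeholder file name
    n_ctx=32768,      # the full 32K window so a chapter and its translation both fit
    n_gpu_layers=-1,  # offload every layer to the GPU if one is available
)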
Prompt format
<|im_start|>system
Translate this text from Japanese to English.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
Example:
<|im_start|>system
Translate this text from Japanese to English.<|im_end|>
<|im_start|>user
<GLOSSARY>
マイン : Myne
</GLOSSARY>
マイン、ルッツが迎えに来たよ<|im_end|>
<|im_start|>assistant
Myne, Lutz is here to take you home.
The glossary is optional. Remove it if not needed.
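If your runtime does not apply the chat template for you (for example when using raw completion mode), the ChatML prompt above can be assembled by hand. A minimal sketch; build_prompt is only an illustrative helper:

def build_prompt(user_text: str) -> str:
    # Wrap the (optionally glossary-prefixed) chapter text in the ChatML format shown above.
    return (
        "<|im_start|>system\n"
        "Translate this text from Japanese to English.<|im_end|>\n"
        "<|im_start|>user\n"
        f"{user_text}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )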
Text preprocessing
The Japanese text must be preprocessed with the following clean_string function, which fixes the text with ftfy and replaces some Unicode characters with ASCII equivalents. Failing to do this may cause issues.
import ftfy
FTFY_ADDITIONAL_MAP = {
    # dashes
    "―": "--",
    "–": "-",
    "⸻": "----",
    # quotation marks
    "«": "\"",
    "»": "\"",
    "〝": "\"",
    "〟": "\"",
    # decorative asterisks and stars
    "✧": "*",
    "✽": "*",
    "⬀": "*",
    "⁂": "*",
    "✴": "*",
    "✵": "*",
    "✩": "*",
    # CJK brackets
    "【": "[",
    "】": "]",
    "〖": "[",
    "〗": "]",
    "〘": "[",
    "〙": "]",
    "〈": "<",
    "〉": ">",
    "《": "<<",
    "》": ">>",
}
def clean_string(text: str, strip: bool = True) -> str:
    # Fix mojibake and normalize to NFC with ftfy.
    config = ftfy.TextFixerConfig(normalization="NFC")
    s = ftfy.fix_text(text, config=config)
    # Trim whitespace on each line (trailing whitespace only if strip=False).
    s = "\n".join((x.strip() if strip else x.rstrip()) for x in s.splitlines())
    # Apply the additional ASCII replacements.
    for b, g in FTFY_ADDITIONAL_MAP.items():
        s = s.replace(b, g)
    return s
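A quick check of what clean_string does to a made-up line (the sentence itself is only illustrative):

sample = "  【魔術具】を《ルッツ》に渡した  "
print(clean_string(sample))
# -> [魔術具]を<<ルッツ>>に渡した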
Glossary
You can provide up to 30 custom translations for nouns and character names at runtime.
Prefix your chapter with the glossary terms, one per line, in the form Japanese term : English term, inside <GLOSSARY></GLOSSARY> tags.
For example, if you want マイン translated as Myne, you can construct the input prompt with:
glossary = [
    {"ja": "マイン", "en": "Myne"},
]

chapter_text = "マイン、ルッツが迎えに来たよ"
def make_glossary_str(glossary: list[dict[str, str]] | None) -> str:
    # No glossary block when no terms are supplied.
    if glossary is None or len(glossary) == 0:
        return ""
    # Drop duplicate term pairs before formatting.
    unique_glossary = {(term["ja"], term["en"]) for term in glossary}
    terms = "\n".join([f"{ja} : {en}" for ja, en in unique_glossary])
    return f"<GLOSSARY>\n{terms}\n</GLOSSARY>\n"
user_prompt = f"{make_glossary_str(glossary)}{clean_string(chapter_text)}"
This produces the following user prompt:

<GLOSSARY>
マイン : Myne
</GLOSSARY>
マイン、ルッツが迎えに来たよ
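To run the translation with the llm handle created under Usage, a sketch assuming llama-cpp-python applies the ChatML template embedded in the GGUF (the sampling settings are illustrative, not a recommendation from the model author):

output = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "Translate this text from Japanese to English."},
        {"role": "user", "content": user_prompt},
    ],
    max_tokens=8192,   # leave room for the English output inside the 32K window
    temperature=0.7,   # illustrative sampling settings
)
print(output["choices"][0]["message"]["content"])
# e.g. "Myne, Lutz is here to take you home."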
Model tree for thefrigidliquidation/lightnovel-translate-Qwen2.5-32B-GGUF
- Base model: Qwen/Qwen2.5-32B
- Finetuned: Qwen/Qwen2.5-32B-Instruct
- Quantized: unsloth/Qwen2.5-32B-Instruct-bnb-4bit