---
license: apache-2.0
language:
- en
- ja
tags:
- finetuned
library_name: transformers
pipeline_tag: text-generation
---
<img src="./veteus_logo.svg" width="100%" height="20%" alt="">
# Our Models
- [Vecteus](https://huggingface.co/Local-Novel-LLM-project/Vecteus-v1)
- [Ninja-v1](https://huggingface.co/Local-Novel-LLM-project/Ninja-v1)
- [Ninja-v1-NSFW](https://huggingface.co/Local-Novel-LLM-project/Ninja-v1-NSFW)
- [Ninja-v1-128k](https://huggingface.co/Local-Novel-LLM-project/Ninja-v1-128k)
- [Ninja-v1-NSFW-128k](https://huggingface.co/Local-Novel-LLM-project/Ninja-v1-NSFW-128k)
## Model Card for VecTeus-v1.0
VecTeus-v1.0 is a Large Language Model (LLM) based on Mistral-7B-v0.1 and fine-tuned on a novel dataset.
VecTeus has the following changes compared to Mistral-7B-v0.1:
- 128k context window (8k context in v0.1)
- High-quality generation in both Japanese and English
- Can generate NSFW content
- Retains earlier context even after long-context generation
This model was created with the help of GPUs from the first LocalAI hackathon.
We would like to take this opportunity to express our thanks.
## List of Creation Methods
- Chatvector for multiple models
- Simple linear merging of result models
- Domain and sentence enhancement with LoRA
- Context expansion
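
As a rough illustration of the first two items, the sketch below shows chat-vector arithmetic and a simple linear merge on toy weights. It is a minimal sketch, not the recipe actually used for this model: the function names, the plain-list weight format, and the ratios are assumptions made for illustration. A real merge would apply the same arithmetic to the tensors in each model's state dict.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]  # toy stand-in for a model state dict


def add_chat_vector(target: Weights, base: Weights, tuned: Weights, ratio: float = 1.0) -> Weights:
    """Add the (tuned - base) difference to the target, parameter by parameter."""
    return {
        name: [t + ratio * (ft - b) for t, b, ft in zip(target[name], base[name], tuned[name])]
        for name in target
    }


def linear_merge(models: List[Weights], coeffs: List[float]) -> Weights:
    """Weighted sum of several models that share parameter names and shapes."""
    first = models[0]
    return {
        name: [sum(c * m[name][i] for m, c in zip(models, coeffs)) for i in range(len(first[name]))]
        for name in first
    }


# Toy example with a single two-element parameter (all values are made up).
base = {"w": [1.0, 2.0]}
tuned = {"w": [1.5, 1.0]}    # hypothetical model fine-tuned from `base`
target = {"w": [3.0, 3.0]}   # hypothetical second model derived from the same base

with_vector = add_chat_vector(target, base, tuned)
merged = linear_merge([target, with_vector], [0.5, 0.5])
print(with_vector, merged)
```

Both operations assume the models share an architecture and parameter names, which is why they are usually applied to models derived from the same base.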
## Instruction format
This model is freed from prompt templates. Congratulations!
## Example prompts to improve (Japanese)
- BAD: あなたは○○として振る舞います (You will behave as ○○)
- GOOD: あなたは○○です (You are ○○)
- BAD: あなたは○○ができます (You can do ○○)
- GOOD: あなたは○○をします (You do ○○)
## Performing inference
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Local-Novel-LLM-project/Vecteus-v1"
new_tokens = 1024

# Load the model in fp16; flash_attention_2 needs the flash-attn package and a supported GPU.
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True, torch_dtype=torch.float16, attn_implementation="flash_attention_2", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

# System prompt: "You are a professional novelist. Please write a novel."
system_prompt = "あなたはプロの小説家です。\n小説を書いてください\n-------- "
prompt = input("Enter a prompt: ")
system_prompt += prompt + "\n-------- "

# Move the inputs to the device the model was loaded on.
model_inputs = tokenizer([system_prompt], return_tensors="pt").to(model.device)
generated_ids = model.generate(**model_inputs, max_new_tokens=new_tokens, do_sample=True)
print(tokenizer.batch_decode(generated_ids)[0])
```
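
The prompt assembly in the example above can be factored into a small helper. This is only a sketch: `build_prompt` and `SEPARATOR` are hypothetical names introduced here, and the separator string is simply the one that appears in the example, not a required format.

```python
# Separator copied from the inference example; the model does not require a template.
SEPARATOR = "\n-------- "


def build_prompt(instruction: str, user_text: str) -> str:
    """Join an instruction and the user's text, ending each part with the separator."""
    return instruction + SEPARATOR + user_text + SEPARATOR


# Instruction written in the "GOOD" style shown earlier ("You are a professional novelist...").
full_prompt = build_prompt("あなたはプロの小説家です。\n小説を書いてください", "Enter your story idea here")
print(full_prompt)
```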
## Merge recipe
- VT0.1 = Ninja-v1 + original LoRA
- VT0.2 = Ninja-v1-128k + original LoRA
- VT0.2on0.1 = VT0.1 + VT0.2
- VT1 = all VT series + LoRA + Ninja-v1-128k and Ninja-v1
## Other points to keep in mind
- The training data may be biased. Be careful with the generated sentences.
- Memory usage may be high for long-context inference.
- If possible, we recommend running inference with llama.cpp rather than Transformers.
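
To give a sense of why long contexts are memory-hungry, the sketch below estimates the key/value cache size for a single sequence. It assumes the model keeps the Mistral-7B-v0.1 attention layout (32 layers, 8 key/value heads, head dimension 128) and fp16 values, and it ignores any sliding-window or quantized cache; `kv_cache_bytes` is a hypothetical helper, not part of any library.

```python
def kv_cache_bytes(seq_len: int, n_layers: int = 32, n_kv_heads: int = 8,
                   head_dim: int = 128, bytes_per_value: int = 2) -> int:
    """Bytes needed to cache keys and values for `seq_len` tokens of one sequence."""
    # Factor 2 covers one key vector and one value vector per layer and KV head.
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return seq_len * per_token


for tokens in (8 * 1024, 128 * 1024):
    print(f"{tokens:>7} tokens: {kv_cache_bytes(tokens) / 2**30:.1f} GiB")
```

Under these assumptions the cache takes about 1 GiB at 8k tokens and about 16 GiB at 128k tokens, in addition to the model weights.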