ehristoforu committed on
Commit 5e175dc · verified · 1 Parent(s): e0939f8

Update README.md

Files changed (1)
  1. README.md +69 -1
README.md CHANGED
@@ -55,7 +55,7 @@ datasets:
 pipeline_tag: text-generation
 ---
 
- # ehristoforu/Gistral-16B-Q4_K_M-GGUF
+ # Gistral-16B-Q4_K_M-GGUF
 This model was converted to GGUF format from [`ehristoforu/Gistral-16B`](https://huggingface.co/ehristoforu/Gistral-16B) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
 Refer to the [original model card](https://huggingface.co/ehristoforu/Gistral-16B) for more details on the model.
 ## Use with llama.cpp
@@ -84,3 +84,71 @@ Note: You can also use this checkpoint directly through the [usage steps](https:
 ```
 git clone https://github.com/ggerganov/llama.cpp && cd llama.cpp && make && ./main -m gistral-16b.Q4_K_M.gguf -n 128
 ```
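+ 
+ If you prefer calling the model from Python, the same GGUF file also works with the `llama-cpp-python` bindings. The snippet below is a minimal sketch, assuming the package is installed and the quantized file has already been downloaded locally (the path is a placeholder):
+ 
+ ```py
+ from llama_cpp import Llama
+ 
+ # Load the local quantized GGUF file; n_ctx sets the context window size.
+ llm = Llama(model_path="./gistral-16b.Q4_K_M.gguf", n_ctx=2048)
+ 
+ # Generate a short completion, mirroring the -n 128 limit from the CLI example above.
+ output = llm("Q: Name the planets in the solar system. A: ", max_tokens=128)
+ print(output["choices"][0]["text"])
+ ```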
+
+ # Gistral 16B (Mistral from 7B to 16B)
+
+ ![logo](assets/logo.png)
+
+ We merged several strong Mistral-based models into a single, larger model that combines their strengths.
+
+
+ ## Model Details
+
+ ### Model Description
+
+ - **Developed by:** [@ehristoforu](https://huggingface.co/ehristoforu)
+ - **Model type:** Text Generation (conversational)
+ - **Language(s) (NLP):** English, French, Russian, German, Japanese, Chinese, Korean, Italian, Ukrainian, Code
+ - **Finetuned from model:** [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)
+
+
+ ## How to Get Started with the Model
+
+ Use the code below to get started with the model.
+
+ ```py
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_id = "ehristoforu/Gistral-16B"
+
+ # Load the tokenizer and model; device_map="auto" places the weights on the available GPU(s).
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
+
+ messages = [
+     {"role": "user", "content": "What is your favourite condiment?"},
+     {"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!"},
+     {"role": "user", "content": "Do you have mayonnaise recipes?"}
+ ]
+
+ # Apply the model's chat template, move the inputs next to the model, and generate a short reply.
+ inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
+ outputs = model.generate(inputs, max_new_tokens=20)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```
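+ 
+ At 16B parameters, the full-precision weights may not fit on a single consumer GPU. As a sketch only (not tested with this model, and it assumes the `bitsandbytes` package is installed), the checkpoint can be loaded in 4-bit to reduce memory use:
+ 
+ ```py
+ from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
+ 
+ model_id = "ehristoforu/Gistral-16B"
+ 
+ # Quantize the weights to 4-bit on the fly to cut GPU memory roughly by a factor of four.
+ bnb_config = BitsAndBytesConfig(load_in_4bit=True)
+ 
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id,
+     device_map="auto",
+     quantization_config=bnb_config,
+ )
+ ```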
+
+
+ ## About merge
+
+ Base model: mistralai/Mistral-7B-Instruct-v0.2
+
+ Merge models:
+ - Gaivoronsky/Mistral-7B-Saiga
+ - snorkelai/Snorkel-Mistral-PairRM-DPO
+ - OpenBuddy/openbuddy-mistral2-7b-v20.3-32k
+ - meta-math/MetaMath-Mistral-7B
+ - HuggingFaceH4/mistral-7b-grok
+ - HuggingFaceH4/mistral-7b-anthropic
+ - NousResearch/Yarn-Mistral-7b-128k
+ - ajibawa-2023/Code-Mistral-7B
+ - SherlockAssistant/Mistral-7B-Instruct-Ukrainian
+
+ Merge datasets:
+ - HuggingFaceH4/grok-conversation-harmless
+ - HuggingFaceH4/ultrachat_200k
+ - HuggingFaceH4/ultrafeedback_binarized_fixed
+ - HuggingFaceH4/cai-conversation-harmless
+ - meta-math/MetaMathQA
+ - emozilla/yarn-train-tokenized-16k-mistral
+ - snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset
+ - microsoft/orca-math-word-problems-200k
+ - m-a-p/Code-Feedback
+ - teknium/openhermes
+ - lksy/ru_instruct_gpt4
+ - IlyaGusev/ru_turbo_saiga
+ - IlyaGusev/ru_sharegpt_cleaned
+ - IlyaGusev/oasst1_ru_main_branch