yi-01-ai committed on
Commit
6f6af54
·
1 Parent(s): c7b9ea8

Auto Sync from git://github.com/01-ai/Yi.git/commit/60e542b32465923024aadf1b7e3f8658f07c1e39

Files changed (1)
  1. README.md +75 -58
README.md CHANGED
@@ -82,9 +82,11 @@ pipeline_tag: text-generation
82
  - [Quick start](#quick-start)
83
  - [Choose your path](#choose-your-parth)
84
  - [pip](#pip)
85
  - [llama.cpp](#quick-start---llamacpp)
86
  - [Web demo](#web-demo)
87
- - [Fine tune](#fine-tune)
88
  - [Quantization](#quantization)
89
  - [Deployment](#deployment)
90
  - [Learning hub](#learning-hub)
@@ -101,7 +103,7 @@ pipeline_tag: text-generation
101
  - [📊 Chat model performance](#-chat-model-performance)
102
  - [🟢 Who can use Yi?](#-who-can-use-yi)
103
  - [🟢 Misc.](#-misc)
104
- - [Ackknowledgements](#acknowledgments)
105
  - [📡 Disclaimer](#-disclaimer)
106
  - [🪪 License](#-license)
107
 
@@ -122,7 +124,9 @@ pipeline_tag: text-generation
122
  - For Chinese language capability, the Yi series models landed in 2nd place (following GPT-4), surpassing other LLMs (such as Baidu ERNIE, Qwen, and Baichuan) on the [SuperCLUE](https://www.superclueai.com/) in Oct 2023.
123
 
124
  - 🙏 (Credits to LLaMA) Thanks to the Transformer and LLaMA open-source communities, as they reducing the efforts required to build from scratch and enabling the utilization of the same tools within the AI ecosystem.
 
125
  <details style="display: inline;"><summary> If you're interested in Yi's adoption of LLaMA architecture and license usage policy, see <span style="color: green;">Yi's relation with LLaMA.</span> ⬇️</summary> <ul> <br>
 
126
  > 💡 TL;DR
127
  >
128
  > The Yi series models adopt the same model architecture as LLaMA but are **NOT** derivatives of LLaMA.
@@ -141,8 +145,59 @@ pipeline_tag: text-generation
141
  </ul>
142
  </details>
143
 
144
 
145
 
146
 
147
  <div align="right"> [ <a href="#building-the-next-generation-of-open-source-and-bilingual-llms">Back to top ⬆️ </a> ] </div>
148
 
@@ -207,67 +262,17 @@ Yi-6B-200K | • [🤗 Hugging Face](https://huggingface.co/01-ai/Yi-6B-200K)
207
 
208
  <div align="right"> [ <a href="#building-the-next-generation-of-open-source-and-bilingual-llms">Back to top ⬆️ </a> ] </div>
209
 
210
- ## 🎉 News
211
-
212
- <details>
213
- <summary>🎯 <b>2023/11/23</b>: The chat models are open to public.</summary>
214
-
215
- This release contains two chat models based on previously released base models, two 8-bit models quantized by GPTQ, and two 4-bit models quantized by AWQ.
216
-
217
- - `Yi-34B-Chat`
218
- - `Yi-34B-Chat-4bits`
219
- - `Yi-34B-Chat-8bits`
220
- - `Yi-6B-Chat`
221
- - `Yi-6B-Chat-4bits`
222
- - `Yi-6B-Chat-8bits`
223
-
224
- You can try some of them interactively at:
225
-
226
- - [Hugging Face](https://huggingface.co/spaces/01-ai/Yi-34B-Chat)
227
- - [Replicate](https://replicate.com/01-ai)
228
- </details>
229
-
230
- <details>
231
- <summary>🔔 <b>2023/11/23</b>: The Yi Series Models Community License Agreement is updated to v2.1.</summary>
232
- </details>
233
-
234
- <details>
235
- <summary>🔥 <b>2023/11/08</b>: Invited test of Yi-34B chat model.</summary>
236
-
237
- Application form:
238
-
239
- - [English](https://cn.mikecrm.com/l91ODJf)
240
- - [Chinese](https://cn.mikecrm.com/gnEZjiQ)
241
-
242
- </details>
243
-
244
- <details>
245
- <summary>🎯 <b>2023/11/05</b>: The base model of <code>Yi-6B-200K</code> and <code>Yi-34B-200K</code>.</summary>
246
-
247
- This release contains two base models with the same parameter sizes as the previous
248
- release, except that the context window is extended to 200K.
249
-
250
- </details>
251
-
252
- <details>
253
- <summary>🎯 <b>2023/11/02</b>: The base model of <code>Yi-6B</code> and <code>Yi-34B</code>.</summary>
254
-
255
- The first public release contains two bilingual (English/Chinese) base models
256
- with the parameter sizes of 6B and 34B. Both of them are trained with 4K
257
- sequence length and can be extended to 32K during inference time.
258
-
259
- </details>
260
-
261
- <div align="right"> [ <a href="#building-the-next-generation-of-open-source-and-bilingual-llms">Back to top ⬆️ </a> ] </div>
262
 
263
  # 🟢 How to use Yi?
264
 
265
  - [Quick start](#quick-start)
266
- - [Choose your path](#choose-your-parth)
267
  - [pip](#pip)
268
  - [llama.cpp](#quick-start---llamacpp)
269
  - [Web demo](#web-demo)
270
- - [Fine tune](#fine-tune)
271
  - [Quantization](#quantization)
272
  - [Deployment](#deployment)
273
  - [Learning hub](#learning-hub)
@@ -289,7 +294,7 @@ If you prefer to deploy Yi models locally,
289
  - 🙋‍♀️ and you have **sufficient** resources (for example, NVIDIA A800 80GB), you can choose one of the following methods:
290
  - [pip](#pip)
291
  - [Docker](#quick-start---docker)
292
- - [conda-lock](https://github.com/01-ai/Yi/blob/main/docs/README_legacy.md#12-local-development-environment)
293
 
294
  - 🙋‍♀️ and you have **limited** resources (for example, a MacBook Pro), you can use [llama.cpp](#quick-start---llamacpp)
295
 
@@ -449,7 +454,19 @@ ghcr.io/01-ai/yi:latest
449
  <p><strong>Note</strong> that the only difference is to set <code>--model &lt;your-model-mount-path&gt;'</code> instead of <code>model &lt;your-model-path&gt;</code>.</p>
450
  </details>
451
 
452
 
453
 
454
  ### Quick start - llama.cpp
455
  <details>
@@ -605,7 +622,7 @@ You can access the web UI by entering the address provided in the console into y
605
 
606
  ![Quick start - web demo](https://github.com/01-ai/Yi/blob/main/assets/img/yi_34b_chat_web_demo.gif?raw=true)
607
 
608
- ### Finetuning
609
 
610
  ```bash
611
  bash finetune/scripts/run_sft_Yi_6b.sh
 
82
  - [Quick start](#quick-start)
83
  - [Choose your path](#choose-your-parth)
84
  - [pip](#pip)
85
+ - [docker](#quick-start---docker)
86
  - [llama.cpp](#quick-start---llamacpp)
87
+ - [conda-lock](#quick-start---conda-lock)
88
  - [Web demo](#web-demo)
89
+ - [Fine-tuning](#fine-tuning)
90
  - [Quantization](#quantization)
91
  - [Deployment](#deployment)
92
  - [Learning hub](#learning-hub)
 
103
  - [📊 Chat model performance](#-chat-model-performance)
104
  - [🟢 Who can use Yi?](#-who-can-use-yi)
105
  - [🟢 Misc.](#-misc)
106
+ - [Acknowledgements](#acknowledgments)
107
  - [📡 Disclaimer](#-disclaimer)
108
  - [🪪 License](#-license)
109
 
 
124
  - For Chinese language capability, the Yi series models landed in 2nd place (following GPT-4), surpassing other LLMs (such as Baidu ERNIE, Qwen, and Baichuan) on the [SuperCLUE](https://www.superclueai.com/) in Oct 2023.
125
 
126
  - 🙏 (Credits to LLaMA) Thanks to the Transformer and LLaMA open-source communities, as they reducing the efforts required to build from scratch and enabling the utilization of the same tools within the AI ecosystem.
127
+
128
  <details style="display: inline;"><summary> If you're interested in Yi's adoption of LLaMA architecture and license usage policy, see <span style="color: green;">Yi's relation with LLaMA.</span> ⬇️</summary> <ul> <br>
129
+
130
  > 💡 TL;DR
131
  >
132
  > The Yi series models adopt the same model architecture as LLaMA but are **NOT** derivatives of LLaMA.
 
145
  </ul>
146
  </details>
147
 
148
+ <div align="right"> [ <a href="#building-the-next-generation-of-open-source-and-bilingual-llms">Back to top ⬆️ </a> ] </div>
149
+
150
+ ## 🎉 News
151
+
152
+ <details open>
153
+ <summary>🎯 <b>2024/01/23</b>: The Yi-VL models, <code><a href="https://huggingface.co/01-ai/Yi-VL-34B">Yi-VL-34B</a></code> and <code><a href="https://huggingface.co/01-ai/Yi-VL-6B">Yi-VL-6B</a></code>, are open-sourced and available to the public.</summary>
154
+ <br>
155
+ <code><a href="https://huggingface.co/01-ai/Yi-VL-34B">Yi-VL-34B</a></code> has ranked <strong>first</strong> among all existing open-source models in the latest benchmarks, including <a href="https://arxiv.org/abs/2311.16502">MMMU</a> and <a href="https://arxiv.org/abs/2401.11944">CMMMU</a> (based on data available up to January 2024).</li>
156
+ </details>
157
+
158
+
159
+ <details>
160
+ <summary>🎯 <b>2023/11/23</b>: <a href="#chat-models">Chat models</a> are open-sourced and available to the public.</summary>
161
+ <br>This release contains two chat models based on previously released base models, two 8-bit models quantized by GPTQ, and two 4-bit models quantized by AWQ.
162
+
163
+ - `Yi-34B-Chat`
164
+ - `Yi-34B-Chat-4bits`
165
+ - `Yi-34B-Chat-8bits`
166
+ - `Yi-6B-Chat`
167
+ - `Yi-6B-Chat-4bits`
168
+ - `Yi-6B-Chat-8bits`
169
+
170
+ You can try some of them interactively at:
171
+
172
+ - [Hugging Face](https://huggingface.co/spaces/01-ai/Yi-34B-Chat)
173
+ - [Replicate](https://replicate.com/01-ai)
174
+ </details>
175
+
176
+ <details>
177
+ <summary>🔔 <b>2023/11/23</b>: The Yi Series Models Community License Agreement is updated to <a href="https://github.com/01-ai/Yi/blob/main/MODEL_LICENSE_AGREEMENT.txt">v2.1</a>.</summary>
178
+ </details>
179
 
180
+ <details>
181
+ <summary>🔥 <b>2023/11/08</b>: Invited test of Yi-34B chat model.</summary>
182
+ <br>Application form:
183
 
184
+ - [English](https://cn.mikecrm.com/l91ODJf)
185
+ - [Chinese](https://cn.mikecrm.com/gnEZjiQ)
186
+ </details>
187
+
188
+ <details>
189
+ <summary>🎯 <b>2023/11/05</b>: <a href="#base-models">The base models, </a><code>Yi-6B-200K</code> and <code>Yi-34B-200K</code>, are open-sourced and available to the public.</summary>
190
+ <br>This release contains two base models with the same parameter sizes as the previous
191
+ release, except that the context window is extended to 200K.
192
+ </details>
193
+
194
+ <details>
195
+ <summary>🎯 <b>2023/11/02</b>: <a href="#base-models">The base models, </a><code>Yi-6B</code> and <code>Yi-34B</code>, are open-sourced and available to the public.</summary>
196
+ <br>The first public release contains two bilingual (English/Chinese) base models
197
+ with the parameter sizes of 6B and 34B. Both of them are trained with 4K
198
+ sequence length and can be extended to 32K during inference time.
199
+
200
+ </details>
201
 
202
  <div align="right"> [ <a href="#building-the-next-generation-of-open-source-and-bilingual-llms">Back to top ⬆️ </a> ] </div>
203
 
 
262
 
263
  <div align="right"> [ <a href="#building-the-next-generation-of-open-source-and-bilingual-llms">Back to top ⬆️ </a> ] </div>
264
 
265
 
266
  # 🟢 How to use Yi?
267
 
268
  - [Quick start](#quick-start)
269
+ - [Choose your path](#choose-your-path)
270
  - [pip](#pip)
271
+ - [docker](#quick-start---docker)
272
+ - [conda-lock](#quick-start---conda-lock)
273
  - [llama.cpp](#quick-start---llamacpp)
274
  - [Web demo](#web-demo)
275
+ - [Fine tune](#finetuning)
276
  - [Quantization](#quantization)
277
  - [Deployment](#deployment)
278
  - [Learning hub](#learning-hub)
 
294
  - 🙋‍♀️ and you have **sufficient** resources (for example, NVIDIA A800 80GB), you can choose one of the following methods:
295
  - [pip](#pip)
296
  - [Docker](#quick-start---docker)
297
+ - [conda-lock](#quick-start---conda-lock)
298
 
299
  - 🙋‍♀️ and you have **limited** resources (for example, a MacBook Pro), you can use [llama.cpp](#quick-start---llamacpp)
300
 
 
454
  <p><strong>Note</strong> that the only difference is to set <code>--model &lt;your-model-mount-path&gt;'</code> instead of <code>model &lt;your-model-path&gt;</code>.</p>
455
  </details>
456
 
457
+ ### Quick start - conda-lock
458
 
459
+ <details>
460
+ <summary>You can use <code><a href="https://github.com/conda/conda-lock">conda-lock</a></code> to generate fully reproducible lock files for conda environments. ⬇️</summary>
461
+ <br>
462
+ You can refer to <a href="https://github.com/01-ai/Yi/blob/ebba23451d780f35e74a780987ad377553134f68/conda-lock.yml">conda-lock.yml</a> for the exact versions of the dependencies. Additionally, you can utilize <code><a href="https://mamba.readthedocs.io/en/latest/user_guide/micromamba.html">micromamba</a></code> for installing these dependencies.
463
+ <br>
464
+ To install the dependencies, follow these steps:
465
+
466
+ 1. Install micromamba by following the instructions available <a href="https://mamba.readthedocs.io/en/latest/installation/micromamba-installation.html">here</a>.
467
+
468
+ 2. Execute <code>micromamba install -y -n yi -f conda-lock.yml</code> to create a conda environment named <code>yi</code> and install the necessary dependencies.
469
+ </details>
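
The two conda-lock steps added above can be sketched as a small wrapper script. This is only an illustration, not part of the commit: it assumes the script is run from the root of a Yi checkout where `conda-lock.yml` lives, the `DRY_RUN` variable and the `install_cmd` helper are hypothetical names introduced here, and the `micromamba` command line is copied verbatim from the README text.

```shell
#!/usr/bin/env bash
# Sketch of the conda-lock quick start; assumes the current directory
# is the root of a Yi checkout that contains conda-lock.yml.
set -euo pipefail

LOCK_FILE="conda-lock.yml"   # lock file named in the README
ENV_NAME="yi"                # environment name used in the README

# Print the exact command quoted in the README (step 2).
install_cmd() {
  printf 'micromamba install -y -n %s -f %s\n' "$ENV_NAME" "$LOCK_FILE"
}

# DRY_RUN is a hypothetical switch for this sketch; it defaults to 1,
# so the command is only printed unless DRY_RUN=0 is set explicitly.
if [ "${DRY_RUN:-1}" = "1" ]; then
  install_cmd
elif ! command -v micromamba >/dev/null 2>&1; then
  echo "micromamba not found; install it first (step 1)" >&2
  exit 1
elif [ ! -f "$LOCK_FILE" ]; then
  echo "$LOCK_FILE not found; run this from the repository root" >&2
  exit 1
else
  micromamba install -y -n "$ENV_NAME" -f "$LOCK_FILE"
fi
```

Running the script as-is only prints the command; setting `DRY_RUN=0` runs it after checking that `micromamba` and the lock file are present.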
470
 
471
  ### Quick start - llama.cpp
472
  <details>
 
622
 
623
  ![Quick start - web demo](https://github.com/01-ai/Yi/blob/main/assets/img/yi_34b_chat_web_demo.gif?raw=true)
624
 
625
+ ### Fine-tuning
626
 
627
  ```bash
628
  bash finetune/scripts/run_sft_Yi_6b.sh