Philip Blair committed

Commit 6f545a3 · 1 parent: 21e411e

Modify tokenizer config to have non-null pad token

Files changed (2):
  1. README.md +4 -1
  2. tokenizer_config.json +1 -1
README.md CHANGED
@@ -10,6 +10,9 @@ inference:
 
  # Model Card for Mistral-7B-Instruct-v0.1
 
+ **NOTE**: This is a fork of [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1) intended to have a non-null pad token. This has been done in order to
+ facilitate usage of this model with off-the-shelf PEFT tuners, such as what is offered by Google Cloud Vertex AI.
+
  The Mistral-7B-Instruct-v0.1 Large Language Model (LLM) is a instruct fine-tuned version of the [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) generative text model using a variety of publicly available conversation datasets.
 
  For full details of this model please read our [paper](https://arxiv.org/abs/2310.06825) and [release blog post](https://mistral.ai/news/announcing-mistral-7b/).
@@ -84,4 +87,4 @@ make the model finely respect guardrails, allowing for deployment in environment
 
  ## The Mistral AI Team
 
- Albert Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed.
+ Albert Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed.
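As a minimal sketch of what the non-null pad token enables, the snippet below loads the tokenizer from this fork and pads a batch without any runtime patching. The repo id is a placeholder (the actual Hub id of this fork should be substituted), and the printed pad token id assumes the standard Llama/Mistral SentencePiece vocabulary, where `<unk>` is typically id 0.

```python
from transformers import AutoTokenizer

# Placeholder repo id; substitute the actual Hub id of this fork.
tokenizer = AutoTokenizer.from_pretrained("<this-fork-repo-id>")

# Because tokenizer_config.json now sets pad_token to "<unk>", batched
# encoding with padding works without assigning a pad token at runtime.
print(tokenizer.pad_token)     # "<unk>"
print(tokenizer.pad_token_id)  # typically 0 for Llama/Mistral-style tokenizers

batch = tokenizer(
    ["Hello!", "A longer prompt that forces the shorter one to be padded."],
    padding=True,
    return_tensors="pt",
)
print(batch["input_ids"].shape)
```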
tokenizer_config.json CHANGED
@@ -34,7 +34,7 @@
  "eos_token": "</s>",
  "legacy": true,
  "model_max_length": 1000000000000000019884624838656,
- "pad_token": null,
+ "pad_token": "<unk>",
  "sp_model_kwargs": {},
  "spaces_between_special_tokens": false,
  "tokenizer_class": "LlamaTokenizer",