llmware/bling-sheared-llama-2.7b-0.1


doberst committed on Nov 12, 2023

Commit 9845df9 · 1 Parent(s): 4c36692

Update README.md

Files changed (1)

README.md +4 -4


@@ -77,13 +77,13 @@ Any model can provide inaccurate or incomplete information, and should be used i
 The fastest way to get started with BLING is through direct import in transformers:
-   from transformers import AutoTokenizer, AutoModelForCausalLM
-   tokenizer = AutoTokenizer.from_pretrained("llmware/bling-sheared-llama-2.7b-0.1")
-   model = AutoModelForCausalLM.from_pretrained("llmware/bling-sheared-llama-2.7b-0.1")
+    from transformers import AutoTokenizer, AutoModelForCausalLM
+    tokenizer = AutoTokenizer.from_pretrained("llmware/bling-sheared-llama-2.7b-0.1")
+    model = AutoModelForCausalLM.from_pretrained("llmware/bling-sheared-llama-2.7b-0.1")
 The BLING model was fine-tuned with a simple "\<human> and \<bot> wrapper", so to get the best results, wrap inference entries as:
-   full_prompt = "\<human>\: " + my_prompt + "\n" + "\<bot>\:"
+    full_prompt = "<human>: " + my_prompt + "\n" + "<bot>:"
 The BLING model was fine-tuned with closed-context samples, which assume generally that the prompt consists of two sub-parts:
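
Note on the change: in Python, `\<` and `\:` are not recognized escape sequences, so the backslashes in the old `full_prompt` line would be kept literally in the string and sent to the model. The new line produces the plain `<human>: ... <bot>:` wrapper. Below is a minimal sketch of how the updated snippet might be used end to end. The helper names `wrap_prompt` and `generate`, the `max_new_tokens` default, and the decoding step are illustrative assumptions, not part of the README; it also assumes `transformers` and PyTorch are installed.

```python
def wrap_prompt(my_prompt: str) -> str:
    """Wrap a raw prompt in the <human>/<bot> format from the updated README line."""
    return "<human>: " + my_prompt + "\n" + "<bot>:"


def generate(
    my_prompt: str,
    model_name: str = "llmware/bling-sheared-llama-2.7b-0.1",
    max_new_tokens: int = 100,  # assumed default, not taken from the README
) -> str:
    """Hypothetical helper: load the model, wrap the prompt, return the completion."""
    # Imported lazily so wrap_prompt can be used without transformers installed.
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    inputs = tokenizer(wrap_prompt(my_prompt), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `wrap_prompt("What is the invoice total?")` returns the wrapped string without downloading anything; `generate(...)` downloads the model weights on first use.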