jondurbin committed
Commit 87a646c · 1 Parent(s): bfc0ea7

Update README.md

Files changed (1): README.md +32 -0
README.md CHANGED
@@ -44,6 +44,38 @@ An experimental fine-tune of [yi-34b-200k](https://huggingface.co/01-ai/Yi-34B-200K)
 
  This version underwent a subset of DPO, but is fairly censored. For a less censored version, try [bagel-dpo-34b-v0.2](https://hf.co/jondurbin/bagel-dpo-34b-v0.2)
 
+ ## How to easily download and use this model
+
+ [Massed Compute](https://massedcompute.com/?utm_source=huggingface&utm_creative_format=model_card&utm_content=creator_jon) has created a Virtual Machine (VM) pre-loaded with TGI and Text Generation WebUI.
+
+ 1) For this model, rent the [Jon Durbin 2xA6000](https://shop.massedcompute.com/products/jon-durbin-2x-a6000?utm_source=huggingface&utm_creative_format=model_card&utm_content=creator_jon) Virtual Machine
+ 2) After you start your rental, you will receive an email with instructions on how to log in to the VM
+ 3) Once inside the VM, open the terminal and run `conda activate text-generation-inference`
+ 4) Then `cd Desktop/text-generation-inference/`
+ 5) Run `volume=$PWD/data`
+ 6) Run `model=jondurbin/nontoxic-bagel-34b-v0.2`
+ 7) Run `sudo docker run --gpus '"device=0,1"' --shm-size 1g -p 8080:80 -v $volume:/data ghcr.io/huggingface/text-generation-inference:1.3 --model-id $model`
+ 8) The model will take some time to load...
+ 9) Once loaded, the model will be available on port 8080
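Steps 8–9 note that the model takes a while to load before the server answers on port 8080. As a convenience, here is a small, hypothetical Python helper (not part of the VM image) that polls TGI's `/health` endpoint until the server is ready; the base URL and port are assumptions taken from the `docker run` command above:

```python
import time
import urllib.request
import urllib.error

def wait_for_tgi(base_url="http://0.0.0.0:8080", timeout=600, interval=5):
    """Poll TGI's /health endpoint until it answers, or give up at timeout."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            with urllib.request.urlopen(f"{base_url}/health", timeout=5) as resp:
                if resp.status == 200:
                    return True  # server is up and the model is loaded
        except (urllib.error.URLError, OSError):
            pass  # container still starting or weights still loading
        time.sleep(interval)
    return False
```

Call `wait_for_tgi()` after starting the container; it returns `True` once the server responds and `False` if the timeout elapses first.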
+
+ Sample command within the VM:
+ ```
+ curl 0.0.0.0:8080/generate \
+ -X POST \
+ -d '{"inputs":"<|system|>You are a friendly chatbot.\n<|user|>What type of model are you?\n<|assistant|>","parameters":{"do_sample": true, "max_new_tokens": 100, "repetition_penalty": 1.15, "temperature": 0.7, "top_k": 20, "top_p": 0.9, "best_of": 1}}' \
+ -H 'Content-Type: application/json'
+ ```
+
+ You can also access the model from outside the VM:
+ ```
+ curl IP_ADDRESS_PROVIDED_BY_MASSED_COMPUTE_VM:8080/generate \
+ -X POST \
+ -d '{"inputs":"<|system|>You are a friendly chatbot.\n<|user|>What type of model are you?\n<|assistant|>","parameters":{"do_sample": true, "max_new_tokens": 100, "repetition_penalty": 1.15, "temperature": 0.7, "top_k": 20, "top_p": 0.9, "best_of": 1}}' \
+ -H 'Content-Type: application/json'
+ ```
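For reference, the same request body can also be assembled in Python. This is a minimal sketch mirroring the curl examples above — the prompt template and sampling parameters are copied from them, and actually sending the request still requires the running VM:

```python
import json

def build_generate_request(system, user):
    """Build the JSON body the curl examples POST to TGI's /generate endpoint."""
    prompt = f"<|system|>{system}\n<|user|>{user}\n<|assistant|>"
    return {
        "inputs": prompt,
        "parameters": {
            "do_sample": True,
            "max_new_tokens": 100,
            "repetition_penalty": 1.15,
            "temperature": 0.7,
            "top_k": 20,
            "top_p": 0.9,
            "best_of": 1,
        },
    }

# Serialize for use as the POST body against http://<VM address>:8080/generate
payload = json.dumps(build_generate_request(
    "You are a friendly chatbot.", "What type of model are you?"))
```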
+
+ For assistance with the VM, join the [Massed Compute Discord Server](https://discord.gg/Mj4YMQY3DA)
+
  ## SFT data sources
 
  *Yes, you will see benchmark names in the list, but this only uses the train splits, and a decontamination by cosine similarity is performed at the end as a sanity check*