Update README.md
README.md
CHANGED
```diff
@@ -765,14 +765,14 @@ print(tokenizer.apply_chat_template(chat, tokenize=False))
 2) After you created your account update your billing and navigate to the deploy page.
 3) Select the following
    - GPU Type: A6000
-   - GPU Quantity:
+   - GPU Quantity: 2
    - Category: Creator
    - Image: Jon Durbin
    - Coupon Code: JonDurbin
 4) Deploy the VM!
 5) Navigate to 'Running Instances' to retrieve instructions to login to the VM
 6) Once inside the VM, open the terminal and run `volume=$PWD/data`
-7) Run `model=jondurbin/bagel-dpo-
+7) Run `model=jondurbin/bagel-dpo-34b-v0.5`
 8) `sudo docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data ghcr.io/huggingface/text-generation-inference:1.3 --model-id $model`
 9) The model will take some time to load...
 10) Once loaded the model will be available on port 8080
@@ -797,23 +797,7 @@ For assistance with the VM join the [Massed Compute Discord Server](https://disc
 
 ### Latitude.sh
 
-[Latitude](https://www.latitude.sh/r/4BBD657C) has h100 instances available (as of today, 2024-02-08) for $3/hr!
-
-I've added a blueprint for running text-generation-webui within their container system:
-https://www.latitude.sh/dashboard/create/containerWithBlueprint?id=7d1ab441-0bda-41b9-86f3-3bc1c5e08430
-
-Be sure to set the following environment variables:
-
-| key | value |
-| --- | --- |
-| PUBLIC_KEY | `{paste your ssh public key}` |
-| UI_ARGS | `--trust-remote-code` |
-
-Access the webui via `http://{container IP address}:7860`, navigate to model, download jondurbin/bagel-dpo-20b-v04, and ensure the following values are set:
-
-- `use_flash_attention_2` should be checked
-- set Model loader to Transformers
-- `trust-remote-code` should be checked
+[Latitude](https://www.latitude.sh/r/4BBD657C) has h100 instances available (as of today, 2024-02-08) for $3/hr! A single h100 works great for this model, though you probably want to decrease the context length from 200k to 8k or 16k.
 
 ## Support me
 
```
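Once the container from step 8 is up (step 10: model available on port 8080), text-generation-inference serves an HTTP API. A minimal sketch of building a request body for TGI's `/generate` endpoint, assuming the standard TGI request schema; `build_generate_payload` is a hypothetical helper, not something from the README:

```python
import json

def build_generate_payload(prompt: str, max_new_tokens: int = 128) -> str:
    """Build the JSON body for a POST to http://<vm-ip>:8080/generate.

    `inputs` and `parameters.max_new_tokens` follow the
    text-generation-inference request schema; the prompt is only an example.
    """
    return json.dumps({
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    })

body = build_generate_payload("What is a bagel?", max_new_tokens=64)
print(body)
# Send it with e.g.:
#   curl http://<vm-ip>:8080/generate -X POST \
#     -H 'Content-Type: application/json' -d "$body"
# TGI responds with a JSON object containing a "generated_text" field.
```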
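The Latitude note about dropping the context length from 200k to 8k or 16k applies when serving with TGI as well, since sequence length drives KV-cache memory. A sketch of step 8's command with the context window capped, using TGI's launcher flags (`--max-input-length`, `--max-total-tokens`; flag names as of TGI 1.3 — treat the exact values as an assumption, not the README's own command):

```shell
# Same invocation as step 8, with the context window capped to ~16k tokens.
volume=$PWD/data
model=jondurbin/bagel-dpo-34b-v0.5
sudo docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data \
  ghcr.io/huggingface/text-generation-inference:1.3 \
  --model-id $model \
  --max-input-length 8192 \
  --max-total-tokens 16384
```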