Update README.md
README.md
CHANGED
```diff
@@ -765,14 +765,14 @@ print(tokenizer.apply_chat_template(chat, tokenize=False))
 2) After you created your account update your billing and navigate to the deploy page.
 3) Select the following
    - GPU Type: A6000
-   - GPU Quantity:
+   - GPU Quantity: 2
    - Category: Creator
    - Image: Jon Durbin
    - Coupon Code: JonDurbin
 4) Deploy the VM!
 5) Navigate to 'Running Instances' to retrieve instructions to login to the VM
 6) Once inside the VM, open the terminal and run `volume=$PWD/data`
-7) Run `model=jondurbin/bagel-dpo-
+7) Run `model=jondurbin/bagel-dpo-34b-v0.5`
 8) `sudo docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data ghcr.io/huggingface/text-generation-inference:1.3 --model-id $model`
 9) The model will take some time to load...
 10) Once loaded the model will be available on port 8080
@@ -797,23 +797,7 @@ For assistance with the VM join the [Massed Compute Discord Server](https://disc
 
 ### Latitude.sh
 
-[Latitude](https://www.latitude.sh/r/4BBD657C) has h100 instances available (as of today, 2024-02-08) for $3/hr!
-
-I've added a blueprint for running text-generation-webui within their container system:
-https://www.latitude.sh/dashboard/create/containerWithBlueprint?id=7d1ab441-0bda-41b9-86f3-3bc1c5e08430
-
-Be sure to set the following environment variables:
-
-| key | value |
-| --- | --- |
-| PUBLIC_KEY | `{paste your ssh public key}` |
-| UI_ARGS | `--trust-remote-code` |
-
-Access the webui via `http://{container IP address}:7860`, navigate to model, download jondurbin/bagel-dpo-20b-v04, and ensure the following values are set:
-
-- `use_flash_attention_2` should be checked
-- set Model loader to Transformers
-- `trust-remote-code` should be checked
+[Latitude](https://www.latitude.sh/r/4BBD657C) has h100 instances available (as of today, 2024-02-08) for $3/hr! A single h100 works great for this model, though you probably want to decrease the context length from 200k to 8k or 16k.
 
 ## Support me
 
```
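Once the container from step 8 is up (step 10: model available on port 8080), text-generation-inference serves an HTTP API. A minimal sketch of building a request body for TGI's `/generate` endpoint, assuming the standard TGI request schema; `build_generate_payload` is a hypothetical helper, not something from the README:

```python
import json

def build_generate_payload(prompt: str, max_new_tokens: int = 128) -> str:
    """Build the JSON body for a POST to http://<vm-ip>:8080/generate.

    `inputs` and `parameters.max_new_tokens` follow the
    text-generation-inference request schema; the prompt is only an example.
    """
    return json.dumps({
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    })

body = build_generate_payload("What is a bagel?", max_new_tokens=64)
print(body)
# Send it with e.g.:
#   curl http://<vm-ip>:8080/generate -X POST \
#     -H 'Content-Type: application/json' -d "$body"
# TGI responds with a JSON object containing a "generated_text" field.
```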
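The Latitude note about dropping the context length from 200k to 8k or 16k applies when serving with TGI as well, since sequence length drives KV-cache memory. A sketch of step 8's command with the context window capped, using TGI's launcher flags (`--max-input-length`, `--max-total-tokens`; flag names as of TGI 1.3 — treat the exact values as an assumption, not the README's own command):

```shell
# Same invocation as step 8, with the context window capped to ~16k tokens.
volume=$PWD/data
model=jondurbin/bagel-dpo-34b-v0.5
sudo docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data \
  ghcr.io/huggingface/text-generation-inference:1.3 \
  --model-id $model \
  --max-input-length 8192 \
  --max-total-tokens 16384
```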