docs: update model card with presets
README.md
CHANGED
@@ -29,6 +29,7 @@ Iridium is a 72B parameter language model created through a merge of Qwen2-72B-I
 - Models: Qwen2-72B-Instruct (base), calme2.1-72b, magnum-72b-v1
 - Merged layers: 80
 - Total tensors: 1,043
+- Context length: 128k
 
 ### Tensor Distribution
 - Attention layers: 560 files

@@ -55,6 +56,19 @@ tokenizer = AutoTokenizer.from_pretrained("leafspark/Iridium-72B-v0.1")
 
 Find them here: [leafspark/Iridium-72B-v0.1-GGUF](https://huggingface.co/leafspark/Iridium-72B-v0.1-GGUF)
 
+### Optimal Sampling Parameters
+
+I found these to work well:
+```json
+{
+  "temperature": 1,
+  "min_p": 0.08,
+  "top_p": 1,
+  "top_k": 40,
+  "repetition_penalty": 1
+}
+```
+
 ### Hardware Requirements
--
-- ~140GB VRAM
+- At least 135GB of free space
+- ~140GB VRAM/RAM