---
license: other
license_name: tongyi-qianwen
license_link: https://huggingface.co/Qwen/Qwen2-72B-Instruct/blob/main/LICENSE
pipeline_tag: text-generation
language:
- en
- zh
library_name: transformers
tags:
- mergekit
- qwen2
---

# Iridium-72B-v0.1

## Model Description

Iridium is a 72B-parameter language model created by merging Qwen2-72B-Instruct, calme2.1-72b, and magnum-72b-v1 using `model_stock`.

## Features

- 72 billion parameters
- Comes in 1,043 individual safetensors files
- Combines Magnum prose with Calme smarts

## Technical Specifications

### Architecture

- `Qwen2ForCausalLM`
- Models: Qwen2-72B-Instruct (base), calme2.1-72b, magnum-72b-v1
- Merged layers: 80
- Total tensors: 1,043
- Context length: 128k

### Tensor Distribution

- Attention layers: 560 files
- MLP layers: 240 files
- Layer norms: 160 files
- Miscellaneous (embeddings, output): 83 files

### Merging

Custom script utilizing the safetensors library.

## Usage

### Loading the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# device_map="auto" spreads the weights across the available GPUs/CPU.
model = AutoModelForCausalLM.from_pretrained(
    "leafspark/Iridium-72B-v0.1",
    device_map="auto",
    torch_dtype=torch.float16,
)
tokenizer = AutoTokenizer.from_pretrained("leafspark/Iridium-72B-v0.1")
```

### GGUFs

Find them here: [leafspark/Iridium-72B-v0.1-GGUF](https://huggingface.co/leafspark/Iridium-72B-v0.1-GGUF)

### Optimal Sampling Parameters

I found these to work well:

```json
{
  "temperature": 1,
  "min_p": 0.08,
  "top_p": 1,
  "top_k": 40,
  "repetition_penalty": 1
}
```

### Hardware Requirements

- At least 135GB of free disk space
- ~140GB VRAM/RAM
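
As a rough sanity check on the figures in the Tensor Distribution and Hardware Requirements sections, the sketch below adds up the per-category file counts and estimates the size of the weights in float16. The parameter count of 72e9 is a nominal assumption (this card does not state the exact count), as is 2 bytes per parameter; the estimate ignores activations and the KV cache, so treat it as a lower bound, not a measured requirement.

```python
# File counts copied from the Tensor Distribution list.
TENSOR_FILES = {
    "attention": 560,
    "mlp": 240,
    "layer_norm": 160,
    "misc": 83,
}
NUM_LAYERS = 80  # "Merged layers" from the Architecture list

# Assumptions for illustration only: nominal parameter count, float16 storage.
NOMINAL_PARAMS = 72_000_000_000
BYTES_PER_PARAM_FP16 = 2


def total_tensor_files(counts):
    """Sum the file counts over all categories."""
    return sum(counts.values())


def files_per_layer(counts, num_layers):
    """Files per layer, for categories whose count divides evenly by the layer count."""
    return {
        name: n // num_layers
        for name, n in counts.items()
        if n % num_layers == 0
    }


def weight_bytes(params, bytes_per_param):
    """Raw size of the weights alone, in bytes."""
    return params * bytes_per_param


if __name__ == "__main__":
    size = weight_bytes(NOMINAL_PARAMS, BYTES_PER_PARAM_FP16)
    print("total files:", total_tensor_files(TENSOR_FILES))
    print("files per layer:", files_per_layer(TENSOR_FILES, NUM_LAYERS))
    print(f"fp16 weights: {size / 1e9:.1f} GB ({size / 2**30:.1f} GiB)")
```

Under these assumptions the category counts add up to the stated 1,043 files, and the float16 weights come to about 144 GB (about 134 GiB), which is in the same range as the disk and memory figures listed above.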