Iridium-72B-v0.1

Model Description

Iridium is a 72B-parameter language model created by merging Qwen2-72B-Instruct, calme2.1-72b, and magnum-72b-v1 with the model_stock method.

Features

  • 72 billion parameters
  • Combines Magnum's prose with Calme's smarts

Technical Specifications

Architecture

  • Qwen2ForCausalLM
  • Models: Qwen2-72B-Instruct (base), calme2.1-72b, magnum-72b-v1
  • Merged layers: 80
  • Total tensors: 963
  • Context length: 128k
  • Tensor type: BF16

Tensor Distribution

  • Attention layers: 560 tensors
  • MLP layers: 240 tensors
  • Layer norms: 160 tensors
  • Miscellaneous (embeddings, output): 3 tensors

Merging

Merged with a custom script built on the safetensors library.
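The core of such a merge can be sketched as a uniform tensor average. This is a simplification: model_stock actually weights the source models non-uniformly based on their geometry relative to the base model, and the function name below is illustrative, not from the actual script.

```python
import numpy as np

def average_state_dicts(states):
    """Uniformly average matching tensors across several checkpoints.

    `states` is a list of dicts mapping tensor names to arrays, as
    returned by safetensors' load_file(). Real model_stock merging
    computes per-layer interpolation weights instead of a plain mean.
    """
    merged = {}
    for name in states[0]:
        merged[name] = np.mean([s[name] for s in states], axis=0)
    return merged
```

In practice each of the 963 tensors would be streamed from the three source checkpoints with safetensors and written back out shard by shard to keep memory usage bounded.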

Usage

Loading the Model

# Load in half precision, sharded across available devices
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "leafspark/Iridium-72B-v0.1",
    device_map="auto",
    torch_dtype=torch.float16,
)
tokenizer = AutoTokenizer.from_pretrained("leafspark/Iridium-72B-v0.1")

GGUFs

Find them here: leafspark/Iridium-72B-v0.1-GGUF

Optimal Sampling Parameters

I found these to work well:

{
  "temperature": 1,
  "min_p": 0.08,
  "top_p": 1,
  "top_k": 40,
  "repetition_penalty": 1
}
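The distinctive setting here is min_p, which keeps only tokens whose probability is at least min_p times the top token's probability. A minimal sketch of that filter (the function name is illustrative; inference backends implement this internally):

```python
import numpy as np

def min_p_filter(probs, min_p=0.08):
    """Zero out tokens below min_p * max(probs), then renormalize.

    With a confident distribution the threshold is high and few tokens
    survive; with a flat distribution it is low and many survive.
    """
    threshold = min_p * probs.max()
    filtered = np.where(probs >= threshold, probs, 0.0)
    return filtered / filtered.sum()
```

At temperature 1 and top_p 1, min_p does most of the pruning, with top_k 40 as a hard cap.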

Hardware Requirements

  • At least 135 GB of free disk space
  • ~140 GB of combined VRAM/RAM
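The disk figure matches a back-of-the-envelope estimate, assuming 2 bytes per parameter for BF16/FP16 weights (runtime memory additionally needs room for the KV cache and activations):

```python
def weight_gib(params, bytes_per_param=2):
    """Approximate weight footprint in GiB (2 bytes/param for BF16/FP16)."""
    return params * bytes_per_param / 2**30

# 72.7B params at 2 bytes each is about 135 GiB of weights alone
print(round(weight_gib(72.7e9)))
```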