ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M

ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M is a custom merged language model based on Qwen2.5-7B with enhanced reasoning, roleplaying, and long-context capabilities. This model supports up to 1 million token context lengths, making it ideal for ultra-long text processing, deep reasoning tasks, and immersive roleplay interactions.

Quantized versions are available in GGUF format, provided by mradermacher:

  • GGUF
  • imatrix GGUF
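
For local inference with one of these quants, one option is llama-cpp-python. This is a minimal sketch; the repository and file names below are assumptions based on mradermacher's usual naming scheme, so check the actual quant listings before use.

from huggingface_hub import hf_hub_download
from llama_cpp import Llama  # pip install llama-cpp-python huggingface_hub

# Repo and file names are assumptions; verify against the GGUF repo before use.
gguf_path = hf_hub_download(
    repo_id="mradermacher/Qwen2.5-7B-CelestialHarmony-1M-GGUF",
    filename="Qwen2.5-7B-CelestialHarmony-1M.Q4_K_M.gguf",
)

# Keep the context window modest for local hardware; raise n_ctx as resources allow.
llm = Llama(model_path=gguf_path, n_ctx=32768)
out = llm("Tell me a short story about an ancient celestial warrior.", max_tokens=256)
print(out["choices"][0]["text"])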

🔧 Model Details

  • Base Model: Qwen/Qwen2.5-7B-Instruct-1M
  • Models Used in Merge:
    • Qwen/Qwen2.5-7B-Instruct-1M
    • bunnycore/Qwen2.5-7B-RRP-1M
    • Triangle104/Q2.5-Instruct-1M_Harmony
    • Sakalti/SJT-7B-1M
    • huihui-ai/Qwen2.5-7B-Instruct-1M-abliterated
  • Merge Method: MODEL_STOCK (Optimized layer-wise weight averaging)

📖 Overview

Qwen2.5-7B-CelestialHarmony-1M enhances the Qwen2.5-7B series with a fine-tuned balance of roleplaying dynamics, structured reasoning, and long-context memory. The model is particularly well-suited for:

  • Roleplaying 🧝‍♂️: Immersive character-based storytelling with deep contextual awareness.
  • Reasoning & Thought Processing 🧠: Capable of structured logical thinking, especially when prompted with <think> tags.
  • Ultra-Long Context Handling 📜: Efficient processing of sequences up to 1,010,000 tokens using optimized sparse attention.

⚙️ Technical Specifications

| Specification | Value |
|---|---|
| Model Type | Causal Language Model |
| Parameters | 7.61B |
| Non-Embedding Parameters | 6.53B |
| Layers | 28 |
| Attention Heads (GQA) | 28 (Q), 4 (KV) |
| Max Context Length | 1,010,000 tokens |
| Max Generation Length | 8,192 tokens |
| Merge Method | Model Stock |
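
As a quick sanity check, these limits can be read back from the checkpoint's configuration. The field names below are the standard Qwen2 config fields exposed by transformers.

from transformers import AutoConfig

# Read the published limits straight from the model's config.json.
config = AutoConfig.from_pretrained("ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M")
print(config.max_position_embeddings)                          # expected: 1010000
print(config.num_hidden_layers)                                 # expected: 28
print(config.num_attention_heads, config.num_key_value_heads)   # expected: 28, 4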

🔬 Merging Details

This model was merged using the Model Stock method, which optimally averages weights from multiple fine-tuned models to create a more efficient, balanced, and performant model.

Merge YAML Configuration

base_model: Qwen/Qwen2.5-7B-Instruct-1M
dtype: bfloat16
merge_method: model_stock
models:
  - model: Qwen/Qwen2.5-7B-Instruct-1M
  - model: Triangle104/Q2.5-Instruct-1M_Harmony
  - model: Sakalti/SJT-7B-1M
  - model: bunnycore/Qwen2.5-7B-RRP-1M
  - model: huihui-ai/Qwen2.5-7B-Instruct-1M-abliterated
tokenizer_source: Qwen/Qwen2.5-7B-Instruct-1M
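
A configuration like this can be applied with mergekit's standard mergekit-yaml entry point. The config filename below is hypothetical, and the merge needs enough disk and RAM to hold all source checkpoints.

pip install mergekit
mergekit-yaml celestial-harmony.yaml ./Qwen2.5-7B-CelestialHarmony-1M --cuda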

🚀 Quickstart

Install Required Packages

Ensure you have the latest transformers library installed:

pip install transformers torch accelerate

Load and Use the Model

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M"

# Load the model and tokenizer; device_map="auto" places weights on available devices.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Build a chat prompt using the model's chat template.
prompt = "Tell me a short story about an ancient celestial warrior."
messages = [
    {"role": "system", "content": "You are a wise celestial storyteller."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate and decode the response (the decoded text includes the prompt as well).
generated_ids = model.generate(**model_inputs, max_new_tokens=512)
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]

print(response)

⚡ Optimized Deployment with vLLM

For long-context inference, use vLLM:

git clone -b dev/dual-chunk-attn git@github.com:QwenLM/vllm.git
cd vllm
pip install -e . -v

Run the model:

vllm serve ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M \
  --tensor-parallel-size 4 \
  --max-model-len 1010000 \
  --enable-chunked-prefill --max-num-batched-tokens 131072 \
  --enforce-eager \
  --max-num-seqs 1
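
Once the server is up, it exposes an OpenAI-compatible API (vLLM's default is http://localhost:8000/v1), so any OpenAI client can query it. A minimal sketch, assuming the default host and port:

from openai import OpenAI

# vLLM serves an OpenAI-compatible endpoint; the API key is unused for a local server.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M",
    messages=[
        {"role": "system", "content": "You are a wise celestial storyteller."},
        {"role": "user", "content": "Tell me a short story about an ancient celestial warrior."},
    ],
    max_tokens=512,
)
print(response.choices[0].message.content)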

🎯 Model Capabilities

✅ Roleplay & Storytelling – Designed for engaging, character-driven interactions.
✅ Long-Context Awareness – Handles texts up to 1M tokens.
✅ Logical Thinking & Reasoning – Supports the <think> tag to encourage structured reasoning (see the sketch below).
✅ Optimized Merge Strategy – Uses Model Stock weight averaging to balance the strengths of the source models.
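
The <think> behavior described above works through prompting rather than a separate inference mode. Below is a minimal sketch of one way to use it, reusing the model and tokenizer from the Quickstart; the exact system prompt wording is an illustrative assumption.

# Nudge the model to reason inside <think>...</think> before answering.
messages = [
    {"role": "system", "content": "Reason step by step inside <think>...</think> tags, then give the final answer."},
    {"role": "user", "content": "If a caravan travels 42 km per day, how far does it travel in 9 days?"},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))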


📜 Acknowledgments

This model is built on top of Qwen2.5-7B-Instruct-1M, with contributions from bunnycore, Triangle104, Sakalti, and huihui-ai, leveraging the Model Stock merging methodology. For further details, see the source model repositories listed above.


Open LLM Leaderboard Evaluation Results

Detailed results can be found on the Open LLM Leaderboard.

| Metric | Value |
|---|---|
| Avg. | 31.75 |
| IFEval (0-Shot) | 59.44 |
| BBH (3-Shot) | 34.51 |
| MATH Lvl 5 (4-Shot) | 33.01 |
| GPQA (0-shot) | 9.17 |
| MuSR (0-shot) | 16.74 |
| MMLU-PRO (5-shot) | 37.63 |