---
license: apache-2.0
base_model:
- Qwen/Qwen2.5-Coder-7B-Instruct
---
# ThreatFlux-Qwen2.5-7B-Instruct

## Model Information

- **Author**: [Wyatt Roersma](https://www.linkedin.com/in/wyattroersma/)
- **Organization**: ThreatFlux
- **Model Type**: Fine-tuned Language Model
- **Base Model**: [Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct)
- **License**: Apache 2.0

This model is a specialized fine-tuned version of Qwen2.5-Coder-7B-Instruct optimized for YARA rule generation and analysis. It inherits the powerful code generation and reasoning capabilities of the base model while adding specialized knowledge for security applications.

## Deployment Methods

### Ollama
```bash
ollama run hf.co/vtriple/Qwen-2.5-7B-Threatflux
```
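
If you prefer to call the model programmatically, Ollama also exposes a local REST API (on port 11434 by default). A minimal sketch, assuming the server is running and the model has already been pulled with the command above:

```python
import json
import urllib.request

# Assumes a local Ollama server on the default port and that the model
# has already been pulled (e.g. via the `ollama run` command above).
payload = {
    "model": "hf.co/vtriple/Qwen-2.5-7B-Threatflux",
    "messages": [
        {"role": "user", "content": "Write a YARA rule that detects PE executables."}
    ],
    "stream": False,
}
req = urllib.request.Request(
    "http://localhost:11434/api/chat",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["message"]["content"])
```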

### llama-cpp-python
```python
from llama_cpp import Llama

# Downloads the GGUF file from the Hugging Face Hub on first use and caches it.
llm = Llama.from_pretrained(
    repo_id="vtriple/Qwen-2.5-7B-Threatflux",
    filename="threatflux.gguf",
)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a YARA rule for..."}]
)
print(response["choices"][0]["message"]["content"])
```
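
Note that `Llama.from_pretrained` fetches the GGUF file from the Hugging Face Hub, so it requires the `huggingface-hub` package in addition to `llama-cpp-python` (`pip install llama-cpp-python huggingface-hub`).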

### llama.cpp

#### Install via Homebrew
```bash
brew install llama.cpp
```

#### Run the Model
```bash
llama-cli \
  --hf-repo "vtriple/Qwen-2.5-7B-Threatflux" \
  --hf-file threatflux.gguf \
  -p "You are a helpful assistant" \
  --conversation
```

For more details on llama.cpp implementation, refer to the [llama.cpp documentation](https://github.com/ggerganov/llama.cpp).

## Model Details

### Base Model Architecture
- Model Type: Causal Language Model
- Parameters: 7.61B (Base) / 6.53B (Non-Embedding)
- Architecture: Transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
- Layers: 28
- Attention Heads: 28 for Q and 4 for KV (GQA)
- Context Length: 131,072 tokens
- Training Data: Built on Qwen2.5's 5.5 trillion token dataset

### Fine-tuning Specifications
- Training Dataset: ~1,600 specialized samples curated by ThreatFlux
- Training Type: Instruction tuning
- Domain Focus: YARA rules, malware analysis, threat detection

## Example Output

Here's an example of the model's YARA rule generation capabilities:

```yara
private rule Track_EXE_Files {
    meta:
        description = "Detects all EXE (Executable) files"
        author = "ThreatFlux"
        version = "1.0"
    condition:
        uint16(0) == 0x5A4D
}
```

The model also explains the rules it generates:
- The rule is marked `private`, so its matches are not reported on their own; it serves as a building block that other rules in the same compilation can reference (see the sketch below)
- The condition verifies the MZ magic of the DOS/PE header: `uint16(0) == 0x5A4D` reads the first two bytes of the file as a little-endian 16-bit integer, matching the ASCII bytes "MZ"
- Explanations cover the condition logic and the significance of each hex value
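
Because the rule is private, it only becomes useful when another rule builds on it. A hypothetical companion rule (not model output) illustrating the pattern:

```yara
// Must be compiled in the same source as Track_EXE_Files, since a rule
// can only reference rules defined earlier in the same compilation.
rule Small_EXE_File {
    meta:
        description = "Flags executables under 100KB by reusing the private rule"
        author = "ThreatFlux"
    condition:
        Track_EXE_Files and filesize < 100KB
}
```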

## Intended Use

This model is designed to assist security professionals in:
- Generating and optimizing YARA rules
- Analyzing malware patterns
- Supporting threat hunting workflows
- Enhancing detection capabilities

## Performance Metrics and Testing

### Testing Environment
- **GPU**: NVIDIA H100 NVL (48.3 TFLOPS)
- **GPU Memory**: 93.6 GB
- **Memory Bandwidth**: 2271.1 GB/s
- **PCIe**: 5.0 x16 (54.4 GB/s)
- **CPU**: AMD EPYC 9124 16-Core Processor
- **System Memory**: 193 GB
- **Storage**: SAMSUNG MZQLB7T6HALA-00AAZ
- **CUDA Version**: 12.4
- **Network Speed**: 8334.9/7516.3 Mbps (Up/Down)

### Testing Results
- Total Training Time: ~45 hours
- Average Cost per Hour: $2.6667 (GPU)
- Testing Duration: Multiple sessions totaling approximately 23.953 hours
- Testing Environment: Ubuntu (latest release) with SSH access

## Performance and Limitations

### Strengths
- Specialized knowledge in YARA rule syntax and best practices
- Inherits Qwen2.5-Coder's strong code reasoning abilities
- Long context understanding for complex analysis
- Maintains mathematical and general coding competencies

### Limitations
- Should be used as an assistant, not a replacement for security expertise
- Generated rules require human validation
- Performance varies based on deployment environment
- Inherits base model's limitations

## Technical Specifications

### Deployment Requirements
- Compatible with Hugging Face Transformers (version ≥ 4.37.0); a minimal loading sketch follows this list
- Supports both CPU and GPU deployment
- Can utilize YaRN for long context processing
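
A minimal Transformers loading sketch, assuming standard (safetensors) weights are published in the repository; if only the GGUF file is available, use one of the deployment paths above instead:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "vtriple/Qwen-2.5-7B-Threatflux"  # assumes full-precision weights
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # select fp16/bf16 automatically on GPU
    device_map="auto",    # place layers on available devices
)

messages = [{"role": "user", "content": "Write a YARA rule that flags PE files."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```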

### Configuration
For extended context support beyond 32,768 tokens, add the following to `config.json`:
```json
{
  "rope_scaling": {
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
    "type": "yarn"
  }
}
```
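
As a sanity check, the YaRN factor recovers the advertised window: 4.0 × 32,768 = 131,072 tokens, matching the context length listed under the base model architecture.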

## Training Details

This model was fine-tuned on the Qwen2.5-Coder-7B-Instruct base, which includes:
- Comprehensive code generation capabilities
- Strong mathematical reasoning
- Extended context understanding
- Security-focused enhancements

The fine-tuning process focused on:
- YARA rule syntax and structure
- Pattern matching optimization
- Security use cases
- Real-world application scenarios

## License

This model inherits the Apache 2.0 license from its base model. See [LICENSE](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct/blob/main/LICENSE) for details.

## Community Support

We're working to make this model freely accessible to everyone through a dedicated public inference server. If you'd like to support this initiative, you can contribute to our server fund:

[Support the ThreatFlux Public Server](https://www.paypal.com/pool/9bfdmRII5Q?sr=wccr)

Your support helps maintain:

- Free public API access
- Consistent model availability
- Community training improvements
- Dedicated infrastructure

## Citation

If you use this model in your work, please cite both this model and the original Qwen2.5-Coder work:

```bibtex
@article{hui2024qwen2,
      title={Qwen2.5-Coder Technical Report},
      author={Hui, Binyuan and Yang, Jian and Cui, Zeyu and Yang, Jiaxi and Liu, Dayiheng and Zhang, Lei and Liu, Tianyu and Zhang, Jiajun and Yu, Bowen and Dang, Kai and others},
      journal={arXiv preprint arXiv:2409.12186},
      year={2024}
}

@article{qwen2,
      title={Qwen2 Technical Report}, 
      author={An Yang and Baosong Yang and Binyuan Hui and Bo Zheng and Bowen Yu and Chang Zhou and Chengpeng Li and Chengyuan Li and Dayiheng Liu and Fei Huang and Guanting Dong and Haoran Wei and Huan Lin and Jialong Tang and Jialin Wang and Jian Yang and Jianhong Tu and Jianwei Zhang and Jianxin Ma and Jin Xu and Jingren Zhou and Jinze Bai and Jinzheng He and Junyang Lin and Kai Dang and Keming Lu and Keqin Chen and Kexin Yang and Mei Li and Mingfeng Xue and Na Ni and Pei Zhang and Peng Wang and Ru Peng and Rui Men and Ruize Gao and Runji Lin and Shijie Wang and Shuai Bai and Sinan Tan and Tianhang Zhu and Tianhao Li and Tianyu Liu and Wenbin Ge and Xiaodong Deng and Xiaohuan Zhou and Xingzhang Ren and Xinyu Zhang and Xipin Wei and Xuancheng Ren and Yang Fan and Yang Yao and Yichang Zhang and Yu Wan and Yunfei Chu and Yuqiong Liu and Zeyu Cui and Zhenru Zhang and Zhihao Fan},
      journal={arXiv preprint arXiv:2407.10671},
      year={2024}
}
```