---
license: apache-2.0
base_model:
- Qwen/Qwen2.5-Coder-7B-Instruct
---
# ThreatFlux-Qwen2.5-7B-Instruct

## Model Information

- **Author**: [Wyatt Roersma](https://www.linkedin.com/in/wyattroersma/)
- **Organization**: ThreatFlux
- **Model Type**: Fine-tuned Language Model
- **Base Model**: [Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct)
- **License**: Apache 2.0

This model is a fine-tuned version of Qwen2.5-Coder-7B-Instruct specialized for YARA rule generation and analysis. It inherits the code generation and reasoning capabilities of the base model while adding domain knowledge for security applications.

## Deployment Methods

### Ollama
```bash
ollama run hf.co/vtriple/Qwen-2.5-7B-Threatflux
```
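
Once the model has been pulled with the command above, the local Ollama server can also be queried programmatically. Below is a minimal sketch against Ollama's REST API; the model name must match what `ollama list` reports on your machine, and the prompt is only an illustration.

```python
# Minimal sketch: query the local Ollama server (default port 11434) over its
# REST API. Adjust the model name to match the entry shown by `ollama list`.
import json
import urllib.request

payload = {
    "model": "hf.co/vtriple/Qwen-2.5-7B-Threatflux",  # assumed local model name
    "prompt": "Write a YARA rule that flags PE files containing a UPX0 section name.",
    "stream": False,
}
request = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(request) as response:
    print(json.loads(response.read())["response"])
```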

### llama-cpp-python
```python
from llama_cpp import Llama

# Download the GGUF file from the Hub and load it.
llm = Llama.from_pretrained(
    repo_id="vtriple/Qwen-2.5-7B-Threatflux",
    filename="threatflux.gguf",
)

# create_chat_completion returns an OpenAI-style completion dict; the generated
# text is in choices[0]["message"]["content"].
response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a YARA rule for..."}]
)
print(response["choices"][0]["message"]["content"])
```

### llama.cpp

#### Install via Homebrew
```bash
brew install llama.cpp
```

#### Run the Model
```bash
llama-cli \
  --hf-repo "vtriple/Qwen-2.5-7B-Threatflux" \
  --hf-file threatflux.gguf \
  -p "You are a helpful assistant" \
  --conversation
```

For more details on the llama.cpp implementation, refer to the [llama.cpp documentation](https://github.com/ggerganov/llama.cpp).

## Model Details

### Base Model Architecture
- Model Type: Causal Language Model
- Parameters: 7.61B total (6.53B non-embedding)
- Architecture: Transformer with RoPE, SwiGLU, RMSNorm, and attention QKV bias
- Layers: 28
- Attention Heads: 28 for Q and 4 for KV (GQA)
- Context Length: 131,072 tokens
- Training Data: Built on Qwen2.5's 5.5-trillion-token dataset

### Fine-tuning Specifications
- Training Dataset: ~1,600 specialized samples curated by ThreatFlux
- Training Type: Instruction tuning
- Domain Focus: YARA rules, malware analysis, threat detection

## Intended Use

This model is designed to assist security professionals in:
- Generating and optimizing YARA rules
- Analyzing malware patterns
- Supporting threat hunting workflows
- Enhancing detection capabilities

## Performance Metrics and Testing

### Testing Environment
- **GPU**: NVIDIA H100 NVL (48.3 TFLOPS)
- **GPU Memory**: 93.6 GB
- **Memory Bandwidth**: 2271.1 GB/s
- **PCIe**: 5.0 x16 (54.4 GB/s)
- **CPU**: AMD EPYC 9124 16-Core Processor
- **System Memory**: 193 GB
- **Storage**: Samsung MZQLB7T6HALA-00AAZ
- **CUDA Version**: 12.4
- **Network Speed**: 8334.9 / 7516.3 Mbps (up/down)

### Testing Results
- Total Training Time: ~45 hours
- Average GPU Cost per Hour: $2.6667
- Testing Duration: Multiple sessions totaling approximately 23.953 hours
- Testing Environment: Ubuntu (latest) with SSH access

## Performance and Limitations

### Strengths
- Specialized knowledge of YARA rule syntax and best practices
- Inherits Qwen2.5-Coder's strong code reasoning abilities
- Long-context understanding for complex analysis
- Maintains mathematical and general coding competencies

### Limitations
- Should be used as an assistant, not a replacement for security expertise
- Generated rules require human validation (a minimal validation sketch follows this list)
- Performance varies based on deployment environment
- Inherits the base model's limitations
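
Because generated rules still require human review, a lightweight first gate is to compile each rule before anyone looks at it. The sketch below uses the `yara-python` package; the rule text is a hypothetical example of model output, not a vetted detection.

```python
# Minimal sketch: syntax-check a model-generated rule with yara-python before
# human review. The rule below is a hypothetical illustration only.
import yara

generated_rule = r"""
rule Example_UPX_Section_Name
{
    meta:
        description = "Illustrative rule: PE file containing the string UPX0"
    strings:
        $upx = "UPX0"
    condition:
        uint16(0) == 0x5A4D and $upx
}
"""

try:
    rules = yara.compile(source=generated_rule)  # raises yara.SyntaxError on invalid rules
except yara.SyntaxError as error:
    print(f"Rule rejected: {error}")
else:
    print("Rule compiled; scan a sample with rules.match(filepath=...)")
```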

## Technical Specifications

### Deployment Requirements
- Compatible with Hugging Face Transformers (version ≥ 4.37.0; a loading sketch follows this list)
- Supports both CPU and GPU deployment
- Can utilize YaRN for long-context processing
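
For reference, here is a minimal loading sketch with Hugging Face Transformers. It assumes the repository exposes standard Transformers-format weights; if only the GGUF file is published, use one of the llama.cpp-based options above.

```python
# Minimal sketch: load and prompt the model with Hugging Face Transformers
# (>= 4.37.0). Assumes Transformers-format weights are available in the repo.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "vtriple/Qwen-2.5-7B-Threatflux"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [{"role": "user", "content": "Write a YARA rule for detecting UPX-packed PE files."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```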

### Configuration
For extended context length support (>32,768 tokens), add to config.json:
```json
{
  "rope_scaling": {
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
    "type": "yarn"
  }
}
```
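
When loading through Transformers, the same scaling can also be applied at load time instead of editing config.json on disk. A minimal sketch, mirroring the values above and again assuming Transformers-format weights are available:

```python
# Sketch: enable the same YaRN rope scaling at load time rather than editing
# config.json. The values mirror the JSON snippet above.
from transformers import AutoConfig, AutoModelForCausalLM

model_id = "vtriple/Qwen-2.5-7B-Threatflux"
config = AutoConfig.from_pretrained(model_id)
config.rope_scaling = {
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
    "type": "yarn",
}
model = AutoModelForCausalLM.from_pretrained(model_id, config=config, device_map="auto")
```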

## Training Details

This model was fine-tuned on the Qwen2.5-Coder-7B-Instruct base, which includes:
- Comprehensive code generation capabilities
- Strong mathematical reasoning
- Extended context understanding
- Security-focused enhancements

The fine-tuning process focused on:
- YARA rule syntax and structure
- Pattern matching optimization
- Security use cases
- Real-world application scenarios

## License

This model inherits the Apache 2.0 license from its base model. See [LICENSE](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct/blob/main/LICENSE) for details.

## Citation

If you use this model in your work, please cite both this model and the original Qwen2.5-Coder work:

```bibtex
@article{hui2024qwen2,
  title={Qwen2.5-Coder Technical Report},
  author={Hui, Binyuan and Yang, Jian and Cui, Zeyu and Yang, Jiaxi and Liu, Dayiheng and Zhang, Lei and Liu, Tianyu and Zhang, Jiajun and Yu, Bowen and Dang, Kai and others},
  journal={arXiv preprint arXiv:2409.12186},
  year={2024}
}

@article{qwen2,
  title={Qwen2 Technical Report},
  author={An Yang and Baosong Yang and Binyuan Hui and Bo Zheng and Bowen Yu and Chang Zhou and Chengpeng Li and Chengyuan Li and Dayiheng Liu and Fei Huang and Guanting Dong and Haoran Wei and Huan Lin and Jialong Tang and Jialin Wang and Jian Yang and Jianhong Tu and Jianwei Zhang and Jianxin Ma and Jin Xu and Jingren Zhou and Jinze Bai and Jinzheng He and Junyang Lin and Kai Dang and Keming Lu and Keqin Chen and Kexin Yang and Mei Li and Mingfeng Xue and Na Ni and Pei Zhang and Peng Wang and Ru Peng and Rui Men and Ruize Gao and Runji Lin and Shijie Wang and Shuai Bai and Sinan Tan and Tianhang Zhu and Tianhao Li and Tianyu Liu and Wenbin Ge and Xiaodong Deng and Xiaohuan Zhou and Xingzhang Ren and Xinyu Zhang and Xipin Wei and Xuancheng Ren and Yang Fan and Yang Yao and Yichang Zhang and Yu Wan and Yunfei Chu and Yuqiong Liu and Zeyu Cui and Zhenru Zhang and Zhihao Fan},
  journal={arXiv preprint arXiv:2407.10671},
  year={2024}
}
```