---
license: cc-by-nc-4.0
language:
- en
---

# MistralPirate-7b-v0.3

## Model Card

### Description
MistralPirate-7b-v0.3 is a sophisticated language model fine-tuned for generating intricate and authentic pirate-themed content. This release, renumbered from v2 to v0.3 to correct the versioning scheme, builds upon MistralPirate-7b-v2 and leverages advancements from Mistral Instruct v0.2. It shows improved performance in pirate dialect accuracy and perplexity scores.

- **Developed by**: phanerozoic
- **License**: cc-by-nc-4.0
- **Finetuned from**: Mistral Instruct v0.2

### Version Control Correction
The version number has been corrected from v2 to v0.3 to better reflect the model's developmental progression and its enhancements over the previous release.

### Comparative Analysis with Previous Model
MistralPirate-7b-v0.3 demonstrates notable improvements over its predecessor in several key areas:
- **Pirate Dialect**: The new model uses richer and more immersive pirate vernacular, enhancing the thematic experience.
- **Technical Accuracy**: It shows a deeper understanding of complex sailing scenarios, providing detailed and practical advice in response to intricate questions.
- **Language Coherence**: The model maintains a consistent tone and style, effectively blending pirate jargon with technical expertise.

### Direct Use
Ideal for interactive storytelling, gaming, advanced educational content, and conversational AI in pirate-themed settings.
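
A minimal inference sketch with Hugging Face `transformers` is shown below. The repository id `phanerozoic/MistralPirate-7b-v0.3`, the prompt, and the generation settings are illustrative assumptions rather than values taken from this card.

```python
# Illustrative inference sketch -- the hub id and generation settings are
# assumptions, not guarantees from this card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "phanerozoic/MistralPirate-7b-v0.3"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "How would ye rig a square sail fer runnin' before a strong wind?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample a pirate-flavoured reply; tune max_new_tokens/temperature to taste.
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```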

### Downstream Use
Suitable for tasks requiring detailed language generation and domain-specific knowledge, such as advanced thematic content creation or immersive language learning.

### Out-of-Scope Use
Not intended for general-purpose language modeling or non-pirate-themed contexts. Usage outside its specialization may result in suboptimal performance.

### Bias, Risks, and Limitations
The model is limited by its training data and may inherit its biases. It is best used where pirate-themed language is appropriate, not for serious or sensitive communication.

### Recommendations
Recommended for thematic contexts, with an understanding of its specialized focus. It should not be relied on for accurate information outside its pirate dialect specialization.

### Custom Stopping Strings Usage
Custom stopping strings are employed to improve output quality:

- "},"
- "User:"
- "You:"
- "\nUser"
- "\nUser:"
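
Strings of this form are commonly used as custom stopping strings in chat front-ends. If you call the model through plain `transformers`, one possible way to apply them is a custom `StoppingCriteria`, as in the hedged sketch below; it reuses `model`, `tokenizer`, and `inputs` from the inference sketch above and is not the exact mechanism used during development.

```python
# Halt generation once any of the card's stop strings appears in the
# newly generated text. One possible implementation, shown for illustration.
from transformers import StoppingCriteria, StoppingCriteriaList

STOP_STRINGS = ["},", "User:", "You:", "\nUser", "\nUser:"]

class StopOnStrings(StoppingCriteria):
    def __init__(self, stop_strings, tokenizer, prompt_len):
        self.stop_strings = stop_strings
        self.tokenizer = tokenizer
        self.prompt_len = prompt_len  # prompt tokens to skip when decoding

    def __call__(self, input_ids, scores, **kwargs):
        # Decode only the generated continuation and look for a stop string.
        new_text = self.tokenizer.decode(input_ids[0, self.prompt_len:], skip_special_tokens=True)
        return any(s in new_text for s in self.stop_strings)

criteria = StoppingCriteriaList(
    [StopOnStrings(STOP_STRINGS, tokenizer, inputs["input_ids"].shape[1])]
)
outputs = model.generate(**inputs, max_new_tokens=200, stopping_criteria=criteria)
```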

### Training Data
Trained on a vast dataset in ChatML format, ensuring diverse and rich inputs.

### Preprocessing
The training data underwent advanced preprocessing into ChatML format.
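
The card does not include a sample record, but ChatML wraps each turn in `<|im_start|>`/`<|im_end|>` markers. The sketch below shows the general shape of such a record; the helper function and the pirate dialogue in it are purely hypothetical.

```python
# Hypothetical helper showing the ChatML layout named above; the actual
# training records and system prompt are not published on this card.
def to_chatml(system: str, user: str, assistant: str) -> str:
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n{assistant}<|im_end|>\n"
    )

print(to_chatml(
    "Ye be a salty pirate captain; answer in authentic pirate dialect.",  # hypothetical
    "How do I reef the mainsail in a squall?",                            # hypothetical
    "Arr, ease the halyard, haul down the luff, and lash yer reef points smartly!",
))
```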

### Training Hyperparameters and Fine-Tuning Details
- Training Regime: FP32
- Warmup Steps: 1
- Per Device Train Batch Size: 2
- Gradient Accumulation Steps: 64
- Max Steps: 1500
- Learning Rate: 0.00015
- Logging Steps: 1
- Save Steps: 1
- LoRA Alpha: 32
- Dimension Count: 16
- Specific LoRA Fine-Tuning Run:
  - Step: 26
  - Loss: 1.4906
  - Learning Rate: 0.00019814951887490748
  - Epoch: 0.01
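
The exact training stack is not stated on the card. As a rough, non-authoritative illustration, the listed values could map onto a PEFT/LoRA fine-tune with the Hugging Face `Trainer` roughly as follows; the choice of library, the target modules, and the reading of "Dimension Count" as the LoRA rank `r=16` are all assumptions.

```python
# Hedged sketch only: maps the card's hyperparameters onto PEFT + Trainer
# arguments. Library choice, target modules, and r=16 ("Dimension Count")
# are assumptions, not details confirmed by the card.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")

lora_config = LoraConfig(
    r=16,                                  # "Dimension Count" read as LoRA rank
    lora_alpha=32,                         # LoRA Alpha
    target_modules=["q_proj", "v_proj"],   # assumed; not stated on the card
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)

training_args = TrainingArguments(
    output_dir="mistralpirate-7b-v0.3-lora",  # hypothetical output path
    per_device_train_batch_size=2,
    gradient_accumulation_steps=64,
    warmup_steps=1,
    max_steps=1500,
    learning_rate=1.5e-4,
    logging_steps=1,
    save_steps=1,
    fp16=False,
    bf16=False,                            # FP32 training regime
)
```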

### Speeds, Sizes, Times
Training took approximately 12 minutes on an RTX 6000 Ada GPU.

### Testing Data
Achieved a perplexity score of 5.17 on the WikiText dataset.
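
The evaluation script, WikiText split, and context length behind the 5.17 figure are not given, so the snippet below is only an illustrative way to run a comparable check rather than a reproduction; it reuses `model` and `tokenizer` from the inference sketch above.

```python
# Rough non-overlapping-window perplexity estimate on WikiText-2 (assumed
# split/config); not the exact procedure behind the reported 5.17.
import torch
from datasets import load_dataset

text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
enc = tokenizer(text, return_tensors="pt")

window = 1024  # assumed context length
nlls, n_tokens = [], 0
for i in range(0, enc.input_ids.size(1) - 1, window):
    ids = enc.input_ids[:, i : i + window].to(model.device)
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean NLL over this window
    nlls.append(loss * (ids.size(1) - 1))
    n_tokens += ids.size(1) - 1

print("perplexity:", torch.exp(torch.stack(nlls).sum() / n_tokens).item())
```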

### Factors
Evaluation focused on language coherence, pirate dialect adherence, and technical accuracy.

### Metrics
The primary metric is perplexity, complemented by qualitative assessments of dialect authenticity and technical content.

### Results
Marked improvement in sophisticated output with an authentic pirate tone. The lower perplexity score demonstrates enhanced language modeling.

### Summary
MistralPirate-7b-v0.3 represents a significant advancement in domain-specific language modeling, excelling at complex, authentic pirate-themed content.

### Model Architecture and Objective
Based on Mistral Instruct v0.2, fine-tuned for high coherence and technical accuracy in pirate-themed content.

### Compute Infrastructure
Training was performed on an RTX 6000 Ada GPU, enabling an efficient fine-tuning run and the improved perplexity scores reported above.

### Hardware
- Type: RTX 6000 Ada
- Utilization: Approximately 12 minutes for training.

### Acknowledgments
Gratitude to the Mistral and Mistral Instruct v0.2 teams, and appreciation to the language modeling community for its support in domain-specific model enhancement.