|
--- |
|
license: cc-by-nc-4.0 |
|
language: |
|
- en |
|
--- |
|
|
|
# MistralPirate-7b-v0.3 |
|
|
|
## Model Card |
|
|
|
### Description |
|
MistralPirate-7b-v0.3 is a sophisticated language model fine-tuned for generating intricate and authentic pirate-themed content. This version, renumbered from v2 to v0.3 to correct our version control, builds upon MistralPirate-7b-v2 and leverages advancements from Mistral Instruct v0.2, showing improved pirate dialect accuracy and perplexity scores.
|
|
|
- **Developed by**: phanerozoic |
|
- **License**: cc-by-nc-4.0 |
|
- **Finetuned from**: Mistral Instruct v0.2 |
|
|
|
### Version Control Correction |
|
The version numbering has been corrected from v2 to v0.3 to better reflect the model's developmental progression and its enhancements over the previous release.
|
|
|
### Comparative Analysis with Previous Model |
|
MistralPirate-7b-v0.3 demonstrates notable improvements over its predecessor in several key areas: |
|
- **Pirate Dialect**: The new model uses richer and more immersive pirate vernacular, enhancing the thematic experience. |
|
- **Technical Accuracy**: It shows a deeper understanding of complex sailing scenarios, providing detailed and practical advice in response to intricate questions. |
|
- **Language Coherence**: The model maintains a consistent tone and style, effectively blending pirate jargon with technical expertise. |
|
|
|
### Direct Use |
|
Ideal for interactive storytelling, gaming, advanced educational content, and conversational AI in pirate-themed settings. |
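
A minimal loading and inference sketch is shown below. The repository id and the ChatML-style prompt are assumptions for illustration; adjust both to match the actual repository and prompt format.

```python
# Minimal inference sketch. Assumptions: the Hugging Face repo id
# "phanerozoic/MistralPirate-7b-v0.3" and a ChatML-style prompt; adjust
# both to match the actual repository and training format.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "phanerozoic/MistralPirate-7b-v0.3"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to fit on a single GPU
    device_map="auto",
)

prompt = (
    "<|im_start|>user\n"
    "How do I trim the sails when beating into a strong headwind?<|im_end|>\n"
    "<|im_start|>assistant\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```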
|
|
|
### Downstream Use |
|
Suitable for tasks requiring detailed language generation and domain-specific knowledge, like advanced thematic content creation or immersive language learning. |
|
|
|
### Out-of-Scope Use |
|
Not intended for general-purpose language modeling or non-pirate-themed contexts. Usage outside its specialization may result in suboptimal performance. |
|
|
|
### Bias, Risks, and Limitations |
|
The model is limited by its training data and may inherit biases from it. It is best used where pirate-themed language is appropriate, not for serious or sensitive communication.
|
|
|
### Recommendations |
|
Recommended for thematic contexts, with an understanding of its specialized focus. It should not be relied on for accurate information outside its pirate dialect specialization.
|
|
|
### Custom Stopping Strings Usage |
|
Custom stopping strings are employed to keep outputs well-formed; a usage sketch follows the list:
|
|
|
- "}," |
|
- "User:" |
|
- "You:" |
|
- "\nUser" |
|
- "\nUser:" |
|
|
|
### Training Data |
|
Trained on a large dataset in ChatML format, providing diverse and rich inputs.
|
|
|
### Preprocessing |
|
Training data was preprocessed into the ChatML conversation format; an illustrative example of the layout follows.
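
For illustration, a single training turn laid out in ChatML looks roughly like the output of the hypothetical helper below; the example text is invented and only the format matters.

```python
# Hypothetical helper showing the ChatML layout of a training sample;
# the actual preprocessing pipeline is not published here.
def to_chatml(user_message: str, assistant_message: str) -> str:
    return (
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        f"<|im_start|>assistant\n{assistant_message}<|im_end|>\n"
    )

print(to_chatml(
    "What be the best knot for securin' the mainsail?",
    "Arr, a proper reef knot holds her fast in a blow, matey.",
))
```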
|
|
|
### Training Hyperparameters and Fine-Tuning Details |
|
- Training Regime: FP32 |
|
- Warmup Steps: 1 |
|
- Per Device Train Batch Size: 2 |
|
- Gradient Accumulation Steps: 64 |
|
- Max Steps: 1500 |
|
- Learning Rate: 0.00015 |
|
- Logging Steps: 1 |
|
- Save Steps: 1 |
|
- LoRA Alpha: 32

- LoRA Dimension Count (rank): 16

- Sample LoRA Fine-Tuning Step (a rough configuration sketch follows this list):

  - Step: 26

  - Loss: 1.4906

  - Learning Rate: 0.00019814951887490748

  - Epoch: 0.01
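
For reference, the listed values map onto a `peft` + `transformers` configuration roughly as sketched below; `target_modules` and any setting not listed above are assumptions, not the exact training script.

```python
# Rough reconstruction of the fine-tuning configuration. Values mirror
# the hyperparameter list above; target_modules and the output path are
# assumptions.
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,                                  # "Dimension Count" above (LoRA rank)
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],   # assumed; not stated in the card
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="mistralpirate-v0.3",       # hypothetical path
    per_device_train_batch_size=2,
    gradient_accumulation_steps=64,
    warmup_steps=1,
    max_steps=1500,
    learning_rate=1.5e-4,
    logging_steps=1,
    save_steps=1,
)
```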
|
|
|
### Speeds, Sizes, Times |
|
Training took approximately 12 minutes on an RTX 6000 Ada GPU.
|
|
|
### Testing Data |
|
Achieved a perplexity score of 5.17 on the WikiText dataset; an illustrative evaluation sketch follows.
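
The sketch below shows one common way to compute a comparable perplexity figure, assuming the WikiText-2 raw test split and a sliding-window evaluation; the exact setup behind the 5.17 score is not specified here, so results may differ.

```python
# Illustrative perplexity evaluation. The dataset split, context length,
# and stride are assumptions; tokenizer and model come from the loading
# sketch earlier in this card.
import torch
from datasets import load_dataset

text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
encodings = tokenizer(text, return_tensors="pt")

max_length, stride = 2048, 512
seq_len = encodings.input_ids.size(1)

nlls = []
prev_end_loc = 0
for begin_loc in range(0, seq_len, stride):
    end_loc = min(begin_loc + max_length, seq_len)
    trg_len = end_loc - prev_end_loc          # tokens scored in this window
    input_ids = encodings.input_ids[:, begin_loc:end_loc].to(model.device)
    target_ids = input_ids.clone()
    target_ids[:, :-trg_len] = -100           # mask context-only tokens
    with torch.no_grad():
        nlls.append(model(input_ids, labels=target_ids).loss)
    prev_end_loc = end_loc
    if end_loc == seq_len:
        break

print("perplexity:", torch.exp(torch.stack(nlls).mean()).item())
```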
|
|
|
### Factors |
|
Evaluation focused on language coherence, pirate dialect adherence, and technical accuracy.
|
|
|
### Metrics |
|
Primary metric: perplexity, supplemented by qualitative assessments of dialect authenticity and technical content.
|
|
|
### Results |
|
Marked improvement in the sophistication of output while preserving an authentic pirate tone. The lower perplexity score demonstrates enhanced language modeling.
|
|
|
### Summary |
|
Represents a significant advancement in domain-specific language modeling, excelling in complex, authentic pirate-themed content. |
|
|
|
### Model Architecture and Objective |
|
Based on Mistral Instruct v0.2, fine-tuned for high coherence and technical accuracy in pirate-themed content. |
|
|
|
### Compute Infrastructure |
|
Trained on a single RTX 6000 Ada GPU, enabling fast and efficient fine-tuning.
|
|
|
### Hardware |
|
- Type: RTX 6000 Ada |
|
- Utilization: Approx. 12 minutes for training. |
|
|
|
### Acknowledgments |
|
Gratitude to Mistral and Mistral Instruct v0.2 teams. Appreciation to the language modeling community for support in domain-specific model enhancement. |