prithivMLmods commited on
Commit
ac4f479
·
verified ·
1 Parent(s): d5b72b0

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +73 -55
README.md CHANGED
@@ -17,75 +17,93 @@ tags:
17
  ---
18
  ### **Llama-Song-Stream-3B-Instruct Model Card**
19
 
20
- The **Llama-Song-Stream-3B-Instruct** is a fine-tuned language model built upon **meta-llama/Llama-3.2-3B-Instruct**. It is specifically trained on song lyrics generation tasks, utilizing chain-of-thought reasoning over lyrical datasets.
21
-
22
- | **File Name** | **Size** | **Description** | **Upload Status** |
23
- |----------------------------------------|--------------------|--------------------------------------------------|--------------------|
24
- | `.gitattributes` | 1.57 kB | LFS tracking configuration. | Uploaded |
25
- | `README.md` | 282 Bytes | Updated documentation with project details. | Uploaded |
26
- | `config.json` | 1.03 kB | Configuration settings for model initialization. | Uploaded |
27
- | `generation_config.json` | 248 Bytes | Model generation settings. | Uploaded |
28
- | `pytorch_model-00001-of-00002.bin` | 4.97 GB | Primary model weights (part 1 of 2). | Uploaded (LFS) |
29
- | `pytorch_model-00002-of-00002.bin` | 1.46 GB | Primary model weights (part 2 of 2). | Uploaded (LFS) |
30
- | `pytorch_model.bin.index.json` | 21.2 kB | Index file for model weight mapping. | Uploaded |
31
- | `special_tokens_map.json` | 477 Bytes | Special tokens used by the tokenizer. | Uploaded |
32
- | `tokenizer.json` | 17.2 MB | Tokenizer file (large LFS model tokenizer data). | Uploaded (LFS) |
33
- | `tokenizer_config.json` | 57.4 kB | Tokenizer configuration settings. | Uploaded |
34
 
35
- ---
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
36
 
37
- ## **Model Details**
 
38
 
39
- ### **Key Metrics:**
40
- - **Base Model:** `meta-llama/Llama-3.2-3B-Instruct`
41
- - **Model Parameters:** 3B (billion parameters).
42
- - **Fine-tuned dataset focus:** Song generation and lyric-based chain-of-thought reasoning.
43
 
44
  ---
 
45
 
46
- ### **Model Components**
47
- 1. **Model Weights:**
48
- - Split into two LFS shards:
49
- - `pytorch_model-00001-of-00002.bin` - **4.97 GB**
50
- - `pytorch_model-00002-of-00002.bin` - **1.46 GB**
51
-
52
- 2. **Tokenizer Data:**
53
- - Tokenizer includes LFS model configuration:
54
- - `tokenizer.json` - **17.2 MB**
55
- - `special_tokens_map.json` - **477 Bytes**
56
- - `tokenizer_config.json` - **57.4 KB**
57
-
58
- 3. **Configuration Files:**
59
- - `config.json` - Model settings (**1.03 KB**).
60
- - `generation_config.json` - Inference task parameters (**248 Bytes**).
61
 
62
  ---
 
63
 
64
- ### **Training Dataset**
65
- - **Dataset Name:** [prithivMLmods/Song-Catalogue-Long-Thought](https://huggingface.co/datasets/prithivMLmods/Song-Catalogue-Long-Thought)
66
- - **Total Examples:** 57,700+
67
- - **Training Focus:** Chain-of-thought reasoning related to lyrical themes and patterns.
 
 
 
 
 
 
 
68
 
69
  ---
70
 
71
- ### **Intended Use Cases**
72
- 1. **Song Lyrics Generation:**
73
- Generate realistic, context-aware song lyrics from user prompts.
74
-
75
- 2. **Creative Writing Tools:**
76
- Aiding songwriters and lyricists by generating thematic drafts.
77
-
78
- 3. **Text Manipulation via Prompts:**
79
- Experiment with different styles, song structures, and lyrical themes.
 
 
80
 
81
  ---
82
 
83
- ### **Current Status:**
84
- - **Inference API Status:**
85
- The model lacks sufficient downloads or visibility for deployment to Hugging Face's Inference API.
86
- - **Action Plan:** Increase visibility through applications and outreach.
87
-
88
- - **Model Deployment Options:**
89
- Use dedicated Inference Endpoints for direct access and deployment.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
90
 
91
  ---
 
17
  ---
18
  ### **Llama-Song-Stream-3B-Instruct Model Card**
19
 
20
+ The **Llama-Song-Stream-3B-Instruct** is a fine-tuned language model specializing in generating music-related text, such as song lyrics, compositions, and musical thoughts. Built upon the **meta-llama/Llama-3.2-3B-Instruct** base, it has been trained with a custom dataset focused on song lyrics and music compositions to produce context-aware, creative, and stylized music output.
 
 
 
 
 
 
 
 
 
 
 
 
 
21
 
22
+ | **File Name** | **Size** | **Description** |
23
+ |---------------------------------|------------|-------------------------------------------------|
24
+ | `.gitattributes` | 1.57 kB | LFS tracking file to manage large model files. |
25
+ | `README.md` | 282 Bytes | Documentation with model details and usage. |
26
+ | `config.json` | 1.03 kB | Model configuration settings. |
27
+ | `generation_config.json` | 248 Bytes | Generation parameters like max sequence length. |
28
+ | `pytorch_model-00001-of-00002.bin` | 4.97 GB | Primary weights (part 1 of 2). |
29
+ | `pytorch_model-00002-of-00002.bin` | 1.46 GB | Primary weights (part 2 of 2). |
30
+ | `pytorch_model.bin.index.json` | 21.2 kB | Index file mapping the checkpoint layers. |
31
+ | `special_tokens_map.json` | 477 Bytes | Defines special tokens for tokenization. |
32
+ | `tokenizer.json` | 17.2 MB | Tokenizer data for text generation. |
33
+ | `tokenizer_config.json` | 57.4 kB | Configuration settings for tokenization. |
34
+
35
+ ### **Key Features**
36
+
37
+ 1. **Song Generation:**
38
+ - Generates full song lyrics based on user input, maintaining rhyme, meter, and thematic consistency.
39
+
40
+ 2. **Music Context Understanding:**
41
+ - Trained on lyrics and song patterns to mimic and generate song-like content.
42
 
43
+ 3. **Fine-tuned Creativity:**
44
+ - Fine-tuned using *Song-Catalogue-Long-Thought* for coherent lyric generation over extended prompts.
45
 
46
+ 4. **Interactive Text Generation:**
47
+ - Designed for use cases like generating lyrical ideas, creating drafts for songwriters, or exploring themes musically.
 
 
48
 
49
  ---
50
+ ### **Training Details**
51
 
52
+ - **Base Model:** [meta-llama/Llama-3.2-3B-Instruct](#)
53
+ - **Finetuning Dataset:** [prithivMLmods/Song-Catalogue-Long-Thought](#)
54
+ - This dataset comprises 57.7k examples of lyrical patterns, song fragments, and themes.
 
 
 
 
 
 
 
 
 
 
 
 
55
 
56
  ---
57
+ ### **Applications**
58
 
59
+ 1. **Songwriting AI Tools:**
60
+ - Generate lyrics for genres like pop, rock, rap, classical, and others.
61
+
62
+ 2. **Creative Writing Assistance:**
63
+ - Assist songwriters by suggesting lyric variations and song drafts.
64
+
65
+ 3. **Storytelling via Music:**
66
+ - Create song narratives using custom themes and moods.
67
+
68
+ 4. **Entertainment AI Integration:**
69
+ - Build virtual musicians or interactive lyric-based content generators.
70
 
71
  ---
72
 
73
+ ### **Example Usage**
74
+
75
+ #### **Setup**
76
+ First, load the Llama-Song-Stream model:
77
+ ```python
78
+ from transformers import AutoModelForCausalLM, AutoTokenizer
79
+
80
+ model_name = "prithivMLmods/Llama-Song-Stream-3B-Instruct"
81
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
82
+ model = AutoModelForCausalLM.from_pretrained(model_name)
83
+ ```
84
 
85
  ---
86
 
87
+ #### **Generate Lyrics Example**
88
+ ```python
89
+ prompt = "Write a song about freedom and the open sky"
90
+ inputs = tokenizer(prompt, return_tensors="pt")
91
+ outputs = model.generate(**inputs, max_length=100, temperature=0.7, num_return_sequences=1)
92
+
93
+ generated_lyrics = tokenizer.decode(outputs[0], skip_special_tokens=True)
94
+ print(generated_lyrics)
95
+ ```
96
+
97
+ ---
98
+
99
+ ### **Deployment Notes**
100
+
101
+ 1. **Serverless vs. Dedicated Endpoints:**
102
+ The model currently does not have enough usage for a serverless endpoint. Options include:
103
+ - **Dedicated inference endpoints** for faster responses.
104
+ - **Custom integrations via Hugging Face inference tools.**
105
+
106
+ 2. **Resource Requirements:**
107
+ Ensure sufficient GPU memory and compute for large PyTorch model weights.
108
 
109
  ---