jan-hq committed on
Commit f9e60c5 · verified · 1 Parent(s): 022f195

Update README.md

Files changed (1)
  1. README.md +22 -7
README.md CHANGED
@@ -18,7 +18,7 @@ tags:
 
  Speechless is a compact, open-source text-to-semantics (1B parameters) model, designed to generate direct semantic representations of audio as discrete tokens, bypassing the need for a text-to-speech (TTS) model. Unlike traditional pipelines that rely on generating and processing audio (TTS → ASR), Speechless eliminates this complexity by directly converting text into semantic speech tokens, simplifying training, saving resources, and enabling scalability, especially for low-resource languages.
 
- Trained on over XXX hours of English and XXX hours of Vietnamese data, Speechless is a core component of the Ichigo v0.5 family.
+ Trained on over ~400 hours of English and ~1000 hours of Vietnamese data, Speechless is a core component of the Ichigo v0.5 family.
 
  For more details, check out our official [blog post]().
 
@@ -49,7 +49,21 @@ For more details, check out our official [blog post]().
  You can use given example code to load the model.
 
  ```{python}
+ import torch
+ from transformers import pipeline
 
+ model_id = "homebrewltd/Speechless-llama3.2-v0.1"
+
+ pipe = pipeline(
+     "text-generation",
+     model=model_id,
+     torch_dtype=torch.bfloat16,
+     device_map="auto"
+ )
+
+ pipe("<|reserved_special_token_69|>I’m Speechless – A Model Developed by Homebrew Research")
+
+ >>> [{'generated_text': '<|reserved_special_token_69|>I’m Speechless – A Model Developed by Homebrew Research.assistant\n\n<|sound_1968|><|sound_0464|><|sound_0642|><|duration_02|><|sound_0634|><|sound_0105|><|duration_02|><|sound_1745|><|duration_02|><|sound_1345|><|sound_0210|><|sound_1312|><|sound_1312|>'}]
  ```
 
 
@@ -57,14 +71,15 @@ You can use given example code to load the model.
 
  | **Parameter** | **Value** |
  |----------------------------|-------------------------|
- | **Epochs** | |
- | **Global Batch Size** | |
- | **Learning Rate** | |
+ | **Epochs** | 2 |
+ | **Global Batch Size** | 144 |
+ | **Learning Rate** | 3e-4 |
  | **Learning Scheduler** | Cosine |
  | **Optimizer** | AdamW |
- | **Warmup Ratio** | |
- | **Weight Decay** | |
- | **Max Sequence Length** | |
+ | **Warmup Ratio** | 0.05 |
+ | **Weight Decay** | 0.01 |
+ | **Max Sequence Length** | 512 |
+ | **Clip Grad Norm** | 1.0 |
 
  ## Evaluation
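
The sample output added to the usage snippet in this commit is a single string that mixes the echoed prompt with `<|sound_NNNN|>` and `<|duration_NN|>` tokens. A minimal sketch of pulling those tokens back out with a regular expression might look like the following. The helper name `extract_semantic_tokens` is hypothetical, and the token pattern is inferred only from the one sample output shown, so real outputs may contain token types this does not cover.

```python
import re

# Pattern inferred from the sample output: <|sound_1968|>, <|duration_02|>.
# Assumption: every semantic token is "sound" or "duration" followed by digits.
TOKEN_RE = re.compile(r"<\|(sound|duration)_(\d+)\|>")


def extract_semantic_tokens(generated_text):
    """Return (kind, value) pairs in order of appearance (hypothetical helper).

    Values are converted to int, so zero-padding such as "0464" is dropped.
    """
    return [(kind, int(value)) for kind, value in TOKEN_RE.findall(generated_text)]


# Shortened stand-in for the sample output in the usage snippet.
sample = (
    "<|reserved_special_token_69|>I'm Speechless.assistant\n\n"
    "<|sound_1968|><|sound_0464|><|duration_02|><|sound_0634|>"
)
tokens = extract_semantic_tokens(sample)
print(tokens)
```

The prompt prefix `<|reserved_special_token_69|>` is skipped because it does not match the `sound`/`duration` pattern. What a duration token means for the sound token next to it is not stated in the README, so the sketch keeps the pairs in their original order rather than interpreting them.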
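
The hyperparameter table filled in by this commit names a cosine scheduler with a warmup ratio of 0.05 and a learning rate of 3e-4, but it does not spell out the schedule shape. The sketch below assumes a common form: linear warmup from zero to the peak rate, then cosine decay to zero. The function name `lr_at_step` and the total step count of 1000 are illustrative assumptions, not values from the actual training run.

```python
import math

PEAK_LR = 3e-4       # "Learning Rate" from the table
WARMUP_RATIO = 0.05  # "Warmup Ratio" from the table


def lr_at_step(step, total_steps, peak_lr=PEAK_LR, warmup_ratio=WARMUP_RATIO):
    """Learning rate at `step`, assuming linear warmup then cosine decay to 0."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / warmup_steps
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# total_steps=1000 is a made-up value used only for illustration.
for s in (0, 25, 50, 525, 1000):
    print(s, lr_at_step(s, total_steps=1000))
```

With these assumed numbers the rate climbs for the first 50 steps, peaks at 3e-4, passes through half the peak midway through the decay phase, and reaches zero at the final step. Some training setups decay to a nonzero floor instead. The README does not say which variant was used.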