hexgrad committed on
Commit b29511e · verified
1 Parent(s): a67f113

Upload README.md

  1. README.md +7 -7
README.md CHANGED
@@ -12,12 +12,10 @@ pipeline_tag: text-to-speech

  **Kokoro** is a frontier TTS model for its size of **82 million parameters** (text in/audio out).

- On 25 Dec 2024, Kokoro v0.19 weights were permissively released in full fp32 precision along with 2 voicepacks (Bella and Sarah), all under an Apache 2.0 license.

- As of 28 Dec 2024, **8 unique Voicepacks have been released**: 2F 2M each for American and British English.
-
- At the time of release, Kokoro v0.19 was the #1🥇 ranked model in [TTS Spaces Arena](https://huggingface.co/spaces/Pendrokar/TTS-Spaces-Arena). Kokoro had achieved higher Elo in this single-voice Arena setting over other models, using fewer parameters and less data:
- 1. **Kokoro v0.19: 82M params, Apache, trained on <100 hours of audio, for <20 epochs**
  2. XTTS v2: 467M, CPML, >10k hours
  3. Edge TTS: Microsoft, proprietary
  4. MetaVoice: 1.2B, Apache, 100k hours
@@ -44,14 +42,15 @@ import torch
  device = 'cuda' if torch.cuda.is_available() else 'cpu'
  MODEL = build_model('kokoro-v0_19.pth', device)
  VOICE_NAME = [
- 'af', # Default voice is a 50-50 mix of af_bella & af_sarah
  'af_bella', 'af_sarah', 'am_adam', 'am_michael',
  'bf_emma', 'bf_isabella', 'bm_george', 'bm_lewis',
  ][0]
  VOICEPACK = torch.load(f'voices/{VOICE_NAME}.pt', weights_only=True).to(device)
  print(f'Loaded voice: {VOICE_NAME}')

- # 3️⃣ Call generate, which returns a 24khz audio waveform and a string of output phonemes
  from kokoro import generate
  text = "How could I know? It's an unanswerable question. Like asking an unborn child if they'll lead a good life. They haven't even been born."
  audio, out_ps = generate(MODEL, text, VOICEPACK, lang=VOICE_NAME[0])
@@ -87,6 +86,7 @@ No affiliation can be assumed between parties on different lines.
  - 25 Dec 2024: Model v0.19, `af_bella`, `af_sarah`
  - 26 Dec 2024: `am_adam`, `am_michael`
  - 28 Dec 2024: `bf_emma`, `bf_isabella`, `bm_george`, `bm_lewis`

  ### Licenses
  - Apache 2.0 weights in this repository
 

  **Kokoro** is a frontier TTS model for its size of **82 million parameters** (text in/audio out).

+ On 25 Dec 2024, Kokoro v0.19 weights were permissively released in full fp32 precision under an Apache 2.0 license. As of 30 Dec 2024, 9 unique Voicepacks have been released.

+ In the weeks leading up to its release, Kokoro v0.19 was the #1🥇 ranked model in [TTS Spaces Arena](https://huggingface.co/hexgrad/Kokoro-82M#evaluation). Kokoro had achieved higher Elo in this single-voice Arena setting over other models, using fewer parameters and less data:
+ 1. **Kokoro v0.19: 82M params, Apache, trained on <100 hours of audio**
  2. XTTS v2: 467M, CPML, >10k hours
  3. Edge TTS: Microsoft, proprietary
  4. MetaVoice: 1.2B, Apache, 100k hours
 
  device = 'cuda' if torch.cuda.is_available() else 'cpu'
  MODEL = build_model('kokoro-v0_19.pth', device)
  VOICE_NAME = [
+ 'af', # Default voice is a 50-50 mix of Bella & Sarah
  'af_bella', 'af_sarah', 'am_adam', 'am_michael',
  'bf_emma', 'bf_isabella', 'bm_george', 'bm_lewis',
+ 'af_nicole', # ASMR voice
  ][0]
  VOICEPACK = torch.load(f'voices/{VOICE_NAME}.pt', weights_only=True).to(device)
  print(f'Loaded voice: {VOICE_NAME}')

+ # 3️⃣ Call generate, which returns 24khz audio and the phonemes used
  from kokoro import generate
  text = "How could I know? It's an unanswerable question. Like asking an unborn child if they'll lead a good life. They haven't even been born."
  audio, out_ps = generate(MODEL, text, VOICEPACK, lang=VOICE_NAME[0])
 
  - 25 Dec 2024: Model v0.19, `af_bella`, `af_sarah`
  - 26 Dec 2024: `am_adam`, `am_michael`
  - 28 Dec 2024: `bf_emma`, `bf_isabella`, `bm_george`, `bm_lewis`
+ - 30 Dec 2024: `af_nicole`

  ### Licenses
  - Apache 2.0 weights in this repository
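
Reviewer note: the usage snippet in this diff passes `lang=VOICE_NAME[0]`, so the first character of the voice name doubles as the language code. The sketch below shows one way that naming convention could be decoded. It is not part of the `kokoro` package: `describe_voice`, `LANGUAGES`, and `GENDERS` are hypothetical names, and the mapping (`a`/`b` for American/British English, `f`/`m` for female/male) is an assumption inferred from the voice names and from the removed README line "2F 2M each for American and British English".

```python
# Hypothetical helper, NOT part of the kokoro package.
# Assumption: character 0 of a voice name is the language code passed as
# `lang`, and character 1 marks the speaker as female ('f') or male ('m').
LANGUAGES = {'a': 'American English', 'b': 'British English'}
GENDERS = {'f': 'female', 'm': 'male'}

# Voice names as listed in the README after this commit.
VOICE_NAMES = [
    'af',  # default voice, described as a 50-50 mix of Bella & Sarah
    'af_bella', 'af_sarah', 'am_adam', 'am_michael',
    'bf_emma', 'bf_isabella', 'bm_george', 'bm_lewis',
    'af_nicole',
]


def describe_voice(voice_name):
    """Split a voice name such as 'bm_george' into its assumed parts."""
    if (len(voice_name) < 2
            or voice_name[0] not in LANGUAGES
            or voice_name[1] not in GENDERS):
        raise ValueError(f'Unrecognized voice name: {voice_name!r}')
    # Everything after the first underscore is treated as the speaker name;
    # the bare default voice 'af' has no speaker part.
    _, _, speaker = voice_name.partition('_')
    return {
        'lang': voice_name[0],  # same value the snippet passes as `lang`
        'language': LANGUAGES[voice_name[0]],
        'gender': GENDERS[voice_name[1]],
        'speaker': speaker or None,
    }


if __name__ == '__main__':
    for name in VOICE_NAMES:
        print(name, describe_voice(name))
```

Validating the prefix up front means a typo in a voice name would surface as a `ValueError` before the corresponding `voices/{VOICE_NAME}.pt` file is loaded.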