hubertsiuzdak
committed on
Update README.md
README.md
CHANGED
@@ -1,10 +1,12 @@
 ---
 license: mit
+tags:
+- audio
 ---
 
-#
+# SNAC 🍿
 
-Multi-**S**cale **N**eural **A**udio **C**odec (SNAC) compresses
+Multi-**S**cale **N**eural **A**udio **C**odec (SNAC) compresses audio into discrete codes at a low bitrate.
 
 See GitHub repository: https://github.com/hubertsiuzdak/snac/
 
@@ -13,9 +15,8 @@ See GitHub repository: https://github.com/hubertsiuzdak/snac/
 SNAC encodes audio into hierarchical tokens similarly to SoundStream, EnCodec, and DAC. However, SNAC introduces a simple change where coarse tokens are sampled less frequently,
 covering a broader time span.
 
-This
-
-consistent structure of an audio track for ~3 minutes.
+This model compresses 44 kHz audio into discrete codes at a 2.6 kbps bitrate. It uses 4 RVQ levels with token rates of 14, 29, 57, and
+115 Hz.
 
 ## Usage
 
@@ -24,11 +25,6 @@ Install it using:
 ```bash
 pip install snac
 ```
-
-A pretrained model that compresses audio into discrete codes at a 2.2 kbps bitrate is available
-at [Hugging Face](https://huggingface.co/hubertsiuzdak/snac). It uses 4 RVQ levels with token rates of 12.5, 25, 50, and
-100 Hz.
-
 To encode (and reconstruct) audio with SNAC in Python, use the following code:
 
 ```python
@@ -47,9 +43,9 @@ resolution.
 
 ```
 >>> [code.shape[1] for code in codes]
-[
+[16, 32, 64, 128]
 ```
 
 ## Acknowledgements
 
-Module definitions are adapted from the [Descript Audio Codec](https://github.com/descriptinc/descript-audio-codec).
+Module definitions are adapted from the [Descript Audio Codec](https://github.com/descriptinc/descript-audio-codec).
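The bitrate figures touched by this change can be sanity-checked from the token rates quoted in the README text. The sketch below is not part of the commit: it assumes each RVQ level draws from a 4096-entry codebook (12 bits per token), a value the README does not state, and the names `bitrate_kbps`, `OLD_RATES_HZ`, and `NEW_RATES_HZ` are hypothetical helpers introduced only for illustration.

```python
import math

# Token rates (Hz) of the four RVQ levels, copied from the README text in the diff.
OLD_RATES_HZ = [12.5, 25, 50, 100]  # removed paragraph, quoted as 2.2 kbps
NEW_RATES_HZ = [14, 29, 57, 115]    # added paragraph, quoted as 2.6 kbps

# ASSUMPTION: 4096 entries per codebook, i.e. 12 bits per token.
# This number is not given anywhere in the README shown above.
CODEBOOK_SIZE = 4096


def bitrate_kbps(token_rates_hz, codebook_size=CODEBOOK_SIZE):
    """Sum the per-level token rates and multiply by the bits carried per token."""
    bits_per_token = math.log2(codebook_size)
    return sum(token_rates_hz) * bits_per_token / 1000


old_kbps = bitrate_kbps(OLD_RATES_HZ)
new_kbps = bitrate_kbps(NEW_RATES_HZ)
print(f"old description: {old_kbps:.2f} kbps")
print(f"new description: {new_kbps:.2f} kbps")

# The example output in the diff, [16, 32, 64, 128], doubles from one level to
# the next, matching the "coarse tokens are sampled less frequently" description.
code_lengths = [16, 32, 64, 128]
doubles_each_level = all(b == 2 * a for a, b in zip(code_lengths, code_lengths[1:]))
print("each level has twice as many codes as the previous one:", doubles_each_level)
```

Under the 12-bit assumption the two sums land close to the quoted 2.2 and 2.6 kbps, which suggests the new numbers describe a different model configuration and are not a typo fix.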