hance-ai/audiomae

dslee2601 committed on Aug 12, 2024

Commit

3aa23f4

1 Parent(s): 87da80c

Update README.md

Files changed (1)

README.md

@@ -23,6 +23,8 @@ z = model('path/audio_fname.wav')  # (768, 8, 64) = (latent_dim_size, latent_fre
 Depending on the task, a different pooling strategy should be used.
 For instance, global average pooling can be used for a classification task, while [2] uses adaptive pooling.
+⚠️ AudioMAE accepts audio with a maximum length of 10 s (as described in [1]). Any audio longer than 10 s is clipped to 10 s, and the excess beyond 10 s is discarded.
 # Sanity Check Result
 In the following, the spectrogram of an input audio clip and the corresponding $z$ are visualized.
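
The two notes in this hunk (pooling $z$ for a downstream task, and the 10 s input limit) can be illustrated with a minimal sketch. It is not the repository's implementation: `truncate_waveform` and `global_average_pool` are hypothetical helper names, the 16 kHz sample rate is an assumption, and the random array only stands in for a real model output of shape `(768, 8, 64)`. The sketch also assumes that the last two axes of $z$ are the frequency and time axes.

```python
import numpy as np

MAX_SECONDS = 10.0  # input limit stated in the README note


def truncate_waveform(waveform, sample_rate, max_seconds=MAX_SECONDS):
    """Keep at most `max_seconds` of audio along the last axis; drop the rest."""
    max_samples = int(round(max_seconds * sample_rate))
    return waveform[..., :max_samples]


def global_average_pool(z):
    """Average over the last two axes: (latent_dim, freq, time) -> (latent_dim,)."""
    return z.mean(axis=(-2, -1))


# Stand-in for a model output; a real z would come from model('path/audio_fname.wav').
z = np.random.default_rng(0).standard_normal((768, 8, 64))
embedding = global_average_pool(z)  # one 768-dimensional vector per clip

# 12 s of silent mono audio at an assumed 16 kHz sample rate.
waveform = np.zeros(16000 * 12, dtype=np.float32)
clipped = truncate_waveform(waveform, sample_rate=16000)  # first 10 s only
```

A pooled vector like `embedding` could be fed to a classifier head. For inputs longer than 10 s, splitting the audio into segments of at most 10 s before calling the model is one way to avoid losing the tail, although the README does not prescribe this.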