5roop/Wav2Vec2BertProsodicUnitsFrameClassifier

@@ -67,6 +67,12 @@ In this fashion we obtain the following metrics:
 ![A gif illustrating correspondence between true and predicted prosodic
 units](output.gif)
+As the GIF above shows, the true (blue) and predicted (orange) prosodic units generally correspond well, but in some cases the grouping is incorrect: the model annotates a single prosodic unit where a human annotator would annotate two or more.
+### Known limitations
+* Edge cases: if the input audio starts or ends in the middle of a prosodic unit, the truncated unit at that boundary is likely to go undetected.
+* Unknown behaviour on non-speech audio: at the time of writing, the model has not been tested on music, noise, pure sine tones, or similar non-speech input.
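
The grouping error described above (one predicted unit covering several true units) can be checked for programmatically. The following is a minimal sketch, not part of the model's code: it assumes that both the reference annotation and the prediction are available as lists of `(start, end)` intervals in seconds. The names `overlap`, `find_merged_units`, and the `min_overlap_s` threshold are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Assumed (hypothetical) representation: one (start_s, end_s) tuple per prosodic unit.
Interval = Tuple[float, float]


def overlap(a: Interval, b: Interval) -> float:
    """Return the overlap of two intervals in seconds (0.0 if they are disjoint)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def find_merged_units(
    true_units: List[Interval],
    predicted_units: List[Interval],
    min_overlap_s: float = 0.1,  # illustrative threshold, not a tuned value
) -> List[Tuple[Interval, List[Interval]]]:
    """List predicted units that overlap two or more true units."""
    merged = []
    for pred in predicted_units:
        # Collect every true unit that shares at least min_overlap_s with this prediction.
        covered = [t for t in true_units if overlap(pred, t) >= min_overlap_s]
        if len(covered) >= 2:
            merged.append((pred, covered))
    return merged


# Toy data: the first predicted unit spans the first two true units.
true_units = [(0.0, 1.2), (1.3, 2.5), (3.0, 4.0)]
predicted_units = [(0.05, 2.45), (3.05, 3.95)]
merged = find_merged_units(true_units, predicted_units)
```

In this toy example only the first predicted unit is flagged, because it overlaps both `(0.0, 1.2)` and `(1.3, 2.5)`. The threshold keeps a predicted boundary that is off by a few frames from being counted as a merge.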
 ## Uses
 ### Simple use (short files)