Spaces: Suprath/liptotext

Suprath committed on Apr 26, 2024

Commit 8964b86 (verified) · 1 parent: 55c3ec1

Update app.py

Files changed (1)

app.py CHANGED

@@ -192,20 +192,14 @@ text_output = gr.Textbox()
 with demo:
     gr.Markdown('''
             <div>
-            <h1 style='text-align: center'>Speech Recognition from Visual Lip Movement by Audio-Visual Hidden Unit BERT Model (AV-HuBERT)</h1>
-            This space uses AV-HuBERT models from <a href='https://github.com/facebookresearch' target='_blank'><b>Meta Research</b></a> to recoginze the speech from Lip Movement
-            <figure>
-              <img src="https://huggingface.co/vumichien/AV-HuBERT/resolve/main/lipreading.gif" alt="Audio-Visual Speech Recognition">
-              <figcaption> Speech Recognition from visual lip movement
-              </figcaption>
-            </figure>
+            <h1 style='text-align: center'>Lip Reading Using Machine learning (Audio-Visual Hidden Unit BERT Model (AV-HuBERT))</h1>
             </div>
         ''')
     with gr.Row():
             gr.Markdown('''
             ### Reading Lip movement with youtube link using Avhubert
             ##### Step 1a. Download video from youtube (Note: the length of video should be less than 10 seconds if not it will be cut and the face should be stable for better result)
-            ##### Step 1b. You also can upload video directly
+            ##### Step 1b. Drag and drop videos to upload directly
             ##### Step 2. Generating landmarks surrounding mouth area
             ##### Step 3. Reading lip movement.
             ''')