license: apache-2.0
datasets:
- mozilla-foundation/common_voice_11_0
language:
- ur
base_model:
- openai/whisper-medium
pipeline_tag: automatic-speech-recognition
library_name: transformers
Whisper Medium Urdu Model
This model is a fine-tuned version of OpenAI's Whisper medium model (openai/whisper-medium) for automatic speech recognition (ASR) in Urdu. It was fine-tuned on Urdu audio, with mozilla-foundation/common_voice_11_0 listed as a dataset in the model metadata, and converts spoken Urdu into text.
Model Description
Whisper is a general-purpose ASR system trained on a large multilingual dataset and can transcribe speech in many languages, including Urdu. This checkpoint has been fine-tuned on Urdu audio to improve accuracy on Urdu speech inputs.
Key Features:
- Language: Urdu
- Model Type: Whisper medium model
- Task: Automatic Speech Recognition (ASR)
- Training Data: Urdu speech data; the model metadata lists mozilla-foundation/common_voice_11_0.
Intended Use
This model is intended for automatic transcription of Urdu speech to text. It can be used for applications such as:
- Speech-to-text transcription in Urdu
- Transcription for Urdu audio or video content
- Accessibility features for Urdu-speaking users
How to Use
You can easily use the model with the Hugging Face transformers library:
from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq
# Load the model and processor
processor = AutoProcessor.from_pretrained("Abdul145/whisper-medium-urdu-custom")
model = AutoModelForSpeechSeq2Seq.from_pretrained("Abdul145/whisper-medium-urdu-custom")
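The loading code above stops before any audio is transcribed. The following is a minimal sketch of one way to go from a file to text, not the author's own pipeline. It assumes a recent transformers release whose Whisper `generate` accepts `language` and `task`, that the checkpoint ships a generation config with language tokens, and that `torch` and `librosa` are installed. The names `generation_options`, `transcribe`, and `TARGET_SAMPLE_RATE` are hypothetical helpers introduced for illustration.

```python
from typing import Any, Dict

# Repository name taken from the loading code above.
MODEL_ID = "Abdul145/whisper-medium-urdu-custom"
# Whisper feature extractors expect 16 kHz mono audio.
TARGET_SAMPLE_RATE = 16_000


def generation_options(language: str = "urdu", task: str = "transcribe") -> Dict[str, Any]:
    """Build keyword arguments that pin Whisper decoding to one language and task."""
    return {"language": language, "task": task}


def transcribe(audio_path: str) -> str:
    """Transcribe a single audio file (hypothetical helper, not part of the model repo)."""
    # Imported lazily so the lightweight helper above works without the heavy dependencies.
    import librosa
    import torch
    from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor

    processor = AutoProcessor.from_pretrained(MODEL_ID)
    model = AutoModelForSpeechSeq2Seq.from_pretrained(MODEL_ID)
    model.eval()

    # Load the file as mono and resample it to 16 kHz.
    waveform, _ = librosa.load(audio_path, sr=TARGET_SAMPLE_RATE, mono=True)

    # Convert the waveform to log-mel input features.
    inputs = processor(waveform, sampling_rate=TARGET_SAMPLE_RATE, return_tensors="pt")

    # Assumption: this transformers version accepts language/task in generate().
    with torch.no_grad():
        predicted_ids = model.generate(inputs.input_features, **generation_options())

    # Decode token ids to text, dropping special tokens such as language tags.
    return processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
```

Calling `transcribe("sample_urdu.wav")`, where the file name is a placeholder, would return the transcription as a string. Processed this way, Whisper handles roughly 30 seconds of audio per call, so longer recordings need to be split into segments first.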