---
language:
- ar
license: apache-2.0
base_model: openai/whisper-small
tags:
- fine-tuned
- Quran
- automatic-speech-recognition
- arabic
- whisper
datasets:
- fawzanaramam/the-amma-juz
model-index:
- name: Whisper small Finetuned on Amma Juz of Quran
  results:
  - task:
      type: automatic-speech-recognition
      name: Speech Recognition
    dataset:
      name: The Amma Juz Dataset
      type: fawzanaramam/the-amma-juz
    metrics:
      - type: eval_loss
        value: 0.0058
      - type: eval_wer
        value: 1.1494
---

# Whisper Small Finetuned on Amma Juz of Quran

This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small), specialized in transcribing Arabic audio with a focus on Quranic recitation from the *Amma Juz* dataset. The fine-tuning makes the model well suited to recognizing Arabic speech, especially in religious and Quranic contexts.
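
A minimal transcription sketch using the Hugging Face Transformers `pipeline` API is shown below. The repository id `fawzanaramam/whisper-small-amma-juz` and the file name `audio.wav` are placeholders, not identifiers confirmed by this card; substitute the actual model id and your own recording.

```python
# Minimal inference sketch with the Transformers ASR pipeline.
# NOTE: the model id is a placeholder; replace it with this
# repository's actual id on the Hugging Face Hub.
from transformers import pipeline

transcriber = pipeline(
    "automatic-speech-recognition",
    model="fawzanaramam/whisper-small-amma-juz",  # placeholder repo id
    chunk_length_s=30,  # split long recitations into 30-second chunks
)

result = transcriber("audio.wav")  # path to your own recording
print(result["text"])
```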

## Model Description

Whisper Small is a transformer-based model for automatic speech recognition (ASR) developed by OpenAI. Fine-tuning it on the *Amma Juz* dataset yields strong results when transcribing Quranic recitation, with a low word error rate on the evaluation set. The fine-tuned model retains the original capabilities of the Whisper architecture while being optimized for Arabic Quranic text.

## Performance Metrics

On the evaluation set, the model achieved:
- **Evaluation Loss**: 0.0058  
- **Word Error Rate (WER)**: 1.1494%  
- **Evaluation Runtime**: 44.2766 seconds  
- **Evaluation Samples per Second**: 2.259  
- **Evaluation Steps per Second**: 0.294  

These metrics demonstrate the model's efficiency and accuracy when processing Quranic recitations.
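
For context, WER is the word-level edit distance between the reference and predicted transcripts, divided by the number of reference words. The sketch below shows how such a score can be computed with the Hugging Face `evaluate` library; the example sentences are illustrative, not taken from the evaluation set.

```python
# Illustrative WER computation with the `evaluate` library.
# The sentences below are made-up examples, not evaluation-set data.
import evaluate

wer_metric = evaluate.load("wer")

references = ["قل هو الله أحد", "الله الصمد"]
predictions = ["قل هو الله أحد", "الله الصمد"]

# WER = (substitutions + insertions + deletions) / reference words.
# Identical strings, as here, yield 0.0; multiply by 100 for a percentage.
wer = wer_metric.compute(references=references, predictions=predictions)
print(f"WER: {wer * 100:.4f}%")
```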

## Intended Uses & Limitations

### Intended Uses
- **Speech-to-text transcription** of Arabic Quranic recitation, specifically from the *Amma Juz*.  
- Research and educational purposes in the domain of Quranic studies.  
- Applications in tools for learning Quranic recitation.  

### Limitations
- The model is fine-tuned on Quranic recitation and may not perform as well on non-Quranic Arabic speech or general Arabic conversations.  
- Noise in audio inputs, variations in recitation style, or heavy accents might affect accuracy.  
- It is recommended to use clean, high-quality audio for optimal performance; a resampling sketch follows this list.  
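
Whisper models expect 16 kHz mono input, so resampling recordings beforehand can help. The sketch below uses `torchaudio`; the file names are hypothetical examples.

```python
# Preprocessing sketch: convert an arbitrary recording to 16 kHz mono,
# the input format Whisper models expect. File names are hypothetical.
import torchaudio

waveform, sample_rate = torchaudio.load("recitation.mp3")

# Downmix multi-channel audio to mono by averaging channels.
if waveform.shape[0] > 1:
    waveform = waveform.mean(dim=0, keepdim=True)

# Resample to Whisper's expected 16 kHz rate.
if sample_rate != 16_000:
    waveform = torchaudio.functional.resample(waveform, sample_rate, 16_000)

torchaudio.save("recitation_16k.wav", waveform, 16_000)
```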

## Training and Evaluation Data

The model was trained using the *Amma Juz* dataset, which comprises Quranic audio data and corresponding transcripts. This dataset was curated to ensure high-quality representation of Quranic recitations.
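
The dataset is available on the Hugging Face Hub and can be loaded as sketched below; the split name `train` and the `audio` column name are assumptions about the dataset layout, not confirmed by this card.

```python
# Loading the Amma Juz dataset from the Hugging Face Hub.
# The split and column names are assumptions about the dataset layout.
from datasets import Audio, load_dataset

dataset = load_dataset("fawzanaramam/the-amma-juz", split="train")

# Whisper feature extractors expect 16 kHz audio, so cast the audio
# column accordingly before feature extraction.
dataset = dataset.cast_column("audio", Audio(sampling_rate=16_000))
print(dataset)
```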

## Training Procedure

### Training Hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- **Learning Rate**: 1e-05  
- **Training Batch Size**: 16  
- **Evaluation Batch Size**: 8  
- **Seed**: 42  
- **Optimizer**: Adam (betas=(0.9, 0.999), epsilon=1e-08)  
- **Learning Rate Scheduler**: Linear  
- **Warmup Steps**: 10  
- **Number of Epochs**: 3.0  
- **Mixed Precision Training**: Native AMP  
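
These settings correspond roughly to the `Seq2SeqTrainingArguments` sketch below; the output directory is a placeholder, and any argument not listed above keeps its library default.

```python
# Sketch of Seq2SeqTrainingArguments mirroring the hyperparameters above.
# The output directory is a placeholder; unlisted arguments keep their
# library defaults (Adam betas and epsilon already default to the
# values listed above).
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-small-amma-juz",  # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=10,
    num_train_epochs=3.0,
    fp16=True,  # native AMP mixed-precision training
)
```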

### Framework Versions
- **Transformers**: 4.41.1  
- **PyTorch**: 2.2.1+cu121  
- **Datasets**: 2.19.1  
- **Tokenizers**: 0.19.1