whisper-small-es-ja

Model Overview

This model was developed as part of a workshop on speech-to-text pipelines organized by Yasmin Moslem. The workshop's primary goal was to enable accurate transcription and translation of spoken source languages into written target languages, while exploring both end-to-end and cascaded approaches along the way.

This model is an end-to-end solution for Spanish-to-Japanese speech-to-text (STT): a fine-tuned version of OpenAI's Whisper-small, trained on the Marianoleiras/voxpopuli_es-ja dataset. It takes Spanish audio as input and produces Japanese text as output.
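Since this is a standard Whisper fine-tune, it can be loaded with the Hugging Face transformers ASR pipeline. The snippet below is a minimal sketch, not taken from the card: the audio path is a placeholder, and explicitly forcing Japanese transcription via generate_kwargs is an assumption based on common Whisper usage.

```python
MODEL_ID = "Marianoleiras/whisper-small-es-ja"

def build_asr():
    # Imported lazily so the sketch can be read without transformers installed.
    from transformers import pipeline

    # Downloads the fine-tuned checkpoint from the Hugging Face Hub.
    return pipeline("automatic-speech-recognition", model=MODEL_ID)

if __name__ == "__main__":
    asr = build_asr()
    # Forcing the target language/task is a common safeguard for Whisper
    # fine-tunes (assumption; the card does not document decoding settings).
    result = asr(
        "audio.wav",  # placeholder path to a Spanish speech recording
        generate_kwargs={"language": "japanese", "task": "transcribe"},
    )
    print(result["text"])
```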

The model achieves the following results on this dataset:

Evaluation Set:

  • Loss: 1.1724
  • BLEU: 22.2850

Test Set:

  • BLEU: 21.4557

(Baseline BLEU on the test set, before fine-tuning: 0.4793)
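For context on what these BLEU figures mean, the sketch below is a toy, smoothed BLEU for a single hypothesis/reference pair. It is illustrative only: the reported scores were presumably computed with a standard implementation such as sacreBLEU, which also handles Japanese tokenization, so the exact numbers would differ.

```python
import math
from collections import Counter

def simple_bleu(hyp_tokens, ref_tokens, max_n=4):
    """Toy smoothed BLEU (0-100) for one hypothesis against one reference."""
    precisions = []
    for n in range(1, max_n + 1):
        hyp_ngrams = Counter(
            tuple(hyp_tokens[i:i + n]) for i in range(len(hyp_tokens) - n + 1)
        )
        ref_ngrams = Counter(
            tuple(ref_tokens[i:i + n]) for i in range(len(ref_tokens) - n + 1)
        )
        # Clipped n-gram matches, with add-one smoothing for illustration.
        overlap = sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
        total = max(sum(hyp_ngrams.values()), 1)
        precisions.append((overlap + 1) / (total + 1))
    # Brevity penalty: punish hypotheses shorter than the reference.
    if len(hyp_tokens) > len(ref_tokens):
        bp = 1.0
    else:
        bp = math.exp(1 - len(ref_tokens) / max(len(hyp_tokens), 1))
    return 100 * bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

# A perfect match scores 100; a one-word hypothesis scores close to 0.
tokens = "el gato está en casa".split()
print(simple_bleu(tokens, tokens))       # → 100.0
print(simple_bleu(["hola"], tokens))     # a small positive score
```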

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • training_steps: 3500
  • mixed_precision_training: Native AMP
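The linear scheduler decays the learning rate from its 1e-05 peak down to zero over the 3,500 training steps. A minimal sketch of that schedule, assuming zero warmup steps (the card does not list a warmup value), in the spirit of transformers' get_linear_schedule_with_warmup:

```python
PEAK_LR = 1e-05      # learning_rate from the hyperparameter list
TOTAL_STEPS = 3500   # training_steps from the hyperparameter list
WARMUP_STEPS = 0     # assumption: no warmup is listed on the card

def linear_lr(step: int) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / max(WARMUP_STEPS, 1)
    remaining = TOTAL_STEPS - step
    return PEAK_LR * max(remaining, 0) / max(TOTAL_STEPS - WARMUP_STEPS, 1)

# Halfway through training (step 1750) the learning rate is 5e-06.
print(linear_lr(0), linear_lr(1750), linear_lr(3500))
```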

Training results

Training Loss   Epoch    Step   BLEU      Validation Loss
1.5787          0.3962    250   11.6756   1.5196
1.3535          0.7924    500   16.0514   1.3470
1.0658          1.1886    750   17.7743   1.2533
1.0303          1.5848   1000   19.1894   1.2046
0.9893          1.9810   1250   20.1198   1.1591
0.7569          2.3772   1500   21.0054   1.1546
0.7571          2.7734   1750   21.6425   1.1378
0.5557          3.1696   2000   21.7563   1.1500
0.5612          3.5658   2250   21.1391   1.1395
0.5581          3.9620   2500   22.0412   1.1343
0.4144          4.3582   2750   22.2850   1.1724
0.4114          4.7544   3000   22.1925   1.1681
0.3005          5.1506   3250   21.4948   1.1947
0.2945          5.5468   3500   22.1454   1.1921

Framework versions

  • Transformers 4.47.1
  • Pytorch 2.4.0+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0

Model Card Contact

Mariano González ([email protected])

Model size: 242M parameters (F32, Safetensors)
