ivrit-ai/faster-whisper-v2-d4

Automatic Speech Recognition

benderrodriguez

Update README.md

cac56e9 verified 5 months ago

	---
	license: apache-2.0
	datasets:
	- ivrit-ai/crowd-transcribe-v4
	language:
	- he
	- en
	base_model: openai/whisper-large-v2
	pipeline_tag: automatic-speech-recognition
	---

	This is ivrit.ai's faster-whisper model, based on the ivrit-ai/whisper-v2-d4 Whisper model.

	Training data includes 250 hours of volunteer-transcribed speech from the ivrit-ai/crowd-transcribe-v4 dataset, as well as 100 hours of professionally transcribed speech from other sources.

	Release date: September 8th, 2024.

	# Prerequisites

	pip3 install faster_whisper

	# Usage

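	```python
	import faster_whisper

	# Download (on first use) and load the converted model from the Hugging Face Hub.
	model = faster_whisper.WhisperModel('ivrit-ai/faster-whisper-v2-d4')

	# transcribe() returns a lazy generator of segments plus an info object.
	segs, _ = model.transcribe('media-file', language='he')

	# Iterating over the generator is what actually runs the transcription.
	texts = [s.text for s in segs]

	transcribed_text = ' '.join(texts)
	print(f'Transcribed text: {transcribed_text}')
	```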
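
The usage snippet joins all segment texts into one string. Each segment returned by `transcribe` also carries `start` and `end` offsets in seconds, so a timestamped transcript can be built from the same call. The following is a minimal sketch of that idea, not part of the original model card: the helper names `format_timestamp`, `segments_to_lines`, and `transcribe_with_timestamps` are hypothetical and chosen for illustration only.

```python
from typing import Iterable, List


def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as HH:MM:SS.mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def segments_to_lines(segments: Iterable) -> List[str]:
    """Turn segments (objects with .start, .end, .text) into timestamped lines."""
    return [
        f"[{format_timestamp(s.start)} -> {format_timestamp(s.end)}] {s.text.strip()}"
        for s in segments
    ]


def transcribe_with_timestamps(path: str) -> List[str]:
    """Hypothetical wrapper: transcribe a Hebrew media file into timestamped lines."""
    # Imported lazily so the helpers above work without faster_whisper installed.
    import faster_whisper

    model = faster_whisper.WhisperModel('ivrit-ai/faster-whisper-v2-d4')
    segs, _ = model.transcribe(path, language='he')
    return segments_to_lines(segs)
```

Calling `transcribe_with_timestamps('media-file')` should return one line per segment, each starting with a `[start -> end]` range followed by the segment text. Segment text is stripped here because it may carry leading whitespace.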