
whisper-tiny-it-multi

A fine-tune of openai/whisper-tiny (39M parameters) for Italian automatic speech recognition, trained on a combination of Common Voice, MLS, and VoxPopuli.

Author: Ettore Di Giacinto

Brought to you by the LocalAI team. This model can be used directly with LocalAI.

Results

Evaluated on the combined test set (Common Voice + MLS + VoxPopuli, 17,598 samples):

  Step     WER
  1,000    39.8%
  3,000    33.5%
  5,000    31.4%
  10,000   29.4%
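The WER figures above are the standard word error rate: the word-level edit distance between reference and hypothesis, divided by the number of reference words. A minimal, self-contained sketch of the metric (a real evaluation would typically use a library such as jiwer, and the exact text normalization applied here is not specified):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    r = reference.split()
    h = hypothesis.split()
    # Dynamic-programming edit distance over word sequences
    prev = list(range(len(h) + 1))
    for i, rw in enumerate(r, 1):
        cur = [i] + [0] * len(h)
        for j, hw in enumerate(h, 1):
            cost = 0 if rw == hw else 1
            cur[j] = min(prev[j] + 1,       # deletion
                         cur[j - 1] + 1,    # insertion
                         prev[j - 1] + cost)  # substitution / match
        prev = cur
    return prev[-1] / len(r)

# One substitution ("gato") and one deletion ("sul") over 5 reference words
print(wer("il gatto dorme sul divano", "il gato dorme divano"))  # 0.4
```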

Training Details

  • Base model: openai/whisper-tiny (39M parameters)
  • Datasets: Common Voice 25.0 Italian (173k) + MLS Italian (60k) + VoxPopuli Italian (23k) = 255k train samples
  • Steps: 10,000 (batch size 32)
  • Learning rate: 1e-5 with 500 warmup steps
  • Precision: bf16 on NVIDIA GB10
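As a sanity check on the figures above, the training budget works out to roughly 1.25 passes over the combined training set (a back-of-the-envelope calculation, not a number from the training run itself):

```python
steps = 10_000
batch_size = 32
train_samples = 255_000  # Common Voice 173k + MLS 60k + VoxPopuli 23k (approx.)

samples_seen = steps * batch_size        # 320,000 samples processed
epochs = samples_seen / train_samples    # ~1.25 passes over the data
print(samples_seen, round(epochs, 2))
```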

Usage

Transformers

from transformers import pipeline

# Load the fine-tuned model as an ASR pipeline
pipe = pipeline("automatic-speech-recognition", model="LocalAI-io/whisper-tiny-it-multi")

# Force Italian transcription (disables language auto-detection and translation)
result = pipe("audio.mp3", generate_kwargs={"language": "it", "task": "transcribe"})
print(result["text"])

CTranslate2 / faster-whisper

For optimized CPU inference, use the int8 CTranslate2 conversion: LocalAI-io/whisper-tiny-it-multi-ct2-int8
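A minimal usage sketch with the faster-whisper library, assuming the ct2-int8 repository above can be loaded by model ID (the audio file name and device settings are illustrative):

```python
from faster_whisper import WhisperModel

# Load the int8 CTranslate2 conversion for CPU inference
model = WhisperModel("LocalAI-io/whisper-tiny-it-multi-ct2-int8",
                     device="cpu", compute_type="int8")

segments, info = model.transcribe("audio.mp3", language="it", task="transcribe")
for segment in segments:
    print(segment.text)
```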
