whisper-tiny-it-multi
A fine-tune of openai/whisper-tiny (39M parameters) for Italian automatic speech recognition (ASR), trained on Common Voice, MLS, and VoxPopuli.
Author: Ettore Di Giacinto
Brought to you by the LocalAI team. This model can be used directly with LocalAI.
Results
Word error rate (WER, lower is better) on the combined test set (Common Voice + MLS + VoxPopuli; 17,598 samples), by training step:
| Step | WER |
|---|---|
| 1000 | 39.8% |
| 3000 | 33.5% |
| 5000 | 31.4% |
| 10000 | 29.4% |
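
For readers unfamiliar with the metric, WER is the word-level edit distance between the reference and the model output, divided by the number of reference words. The sketch below is a minimal illustration, not the evaluation script used for the table: the function name `word_error_rate` is hypothetical, and the lowercase-and-split normalization is an assumption, since this card does not state which text normalization or scoring tool was used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: text is normalized only by lowercasing and splitting on
    whitespace; the card does not say how the reported WER was normalized.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One substitution (gatto -> cane) and one deletion (sul) over 5 reference words.
example = word_error_rate("il gatto dorme sul divano", "il cane dorme divano")
print(f"{example:.1%}")
```

Read this way, the 29.4% at step 10000 means roughly 29 word-level errors per 100 reference words on the combined test set.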
Training Details
- Base model: openai/whisper-tiny (39M parameters)
- Datasets: Common Voice 25.0 Italian (173k) + MLS Italian (60k) + VoxPopuli Italian (23k) ≈ 255k training samples in total
- Steps: 10,000 (batch size 32)
- Learning rate: 1e-5 with 500 warmup steps
- Precision: bf16 on NVIDIA GB10
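
To put the step count in perspective, the figures above can be turned into an approximate number of passes over the training data. This is a back-of-the-envelope sketch under two assumptions that the card does not confirm: that 32 is the effective global batch size, and that the warmup is linear (the schedule after warmup is not specified, so it is not modeled here). The name `warmup_lr` is illustrative.

```python
TRAIN_SAMPLES = 255_000  # approximate total from the dataset list
BATCH_SIZE = 32          # assumed to be the effective global batch size
MAX_STEPS = 10_000
PEAK_LR = 1e-5
WARMUP_STEPS = 500

# Each optimizer step consumes one batch.
samples_seen = MAX_STEPS * BATCH_SIZE    # 320,000 samples
epochs = samples_seen / TRAIN_SAMPLES    # about 1.25 passes over the data


def warmup_lr(step: int) -> float:
    """Learning rate during warmup, assuming a linear ramp from 0 to PEAK_LR.

    After warmup this simply returns PEAK_LR; any later decay is not
    described in the card and is deliberately left out.
    """
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    return PEAK_LR


print(f"samples seen: {samples_seen:,}")
print(f"approximate epochs: {epochs:.2f}")
print(f"lr at step 250: {warmup_lr(250):.1e}")
```

Under these assumptions the model sees each training sample a little more than once, and the warmup covers the first 5% of training.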
Usage
Transformers
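```python
from transformers import pipeline

# Load the fine-tuned checkpoint from the Hugging Face Hub.
pipe = pipeline("automatic-speech-recognition", model="LocalAI-io/whisper-tiny-it-multi")
# Force Italian transcription instead of relying on language auto-detection.
result = pipe("audio.mp3", generate_kwargs={"language": "it", "task": "transcribe"})
print(result["text"])
```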
CTranslate2 / faster-whisper
For optimized CPU inference, use the CTranslate2 INT8 conversion: LocalAI-io/whisper-tiny-it-multi-ct2-int8
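
A minimal sketch of loading the INT8 conversion with faster-whisper follows. It assumes the `faster-whisper` package is installed and that the repository is a standard CTranslate2 conversion that `WhisperModel` can fetch from the Hub by ID. The helper names `join_segments` and `transcribe_italian` are illustrative and not part of any library.

```python
from typing import Iterable


def join_segments(segments: Iterable) -> str:
    """Concatenate the text of transcription segments into a single string."""
    return " ".join(segment.text.strip() for segment in segments)


def transcribe_italian(
    audio_path: str,
    model_id: str = "LocalAI-io/whisper-tiny-it-multi-ct2-int8",
) -> str:
    """Transcribe an audio file on CPU using INT8 weights."""
    # Imported lazily so the helper above stays usable without faster-whisper.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_id, device="cpu", compute_type="int8")
    # transcribe() returns a lazy generator of segments plus metadata;
    # decoding only runs while the segments are iterated.
    segments, _info = model.transcribe(audio_path, language="it", task="transcribe")
    return join_segments(segments)
```

Calling `transcribe_italian("audio.mp3")` would then return the full transcript as one string. Setting the language explicitly skips auto-detection, which is usually what you want for an Italian-only fine-tune.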
Links
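- CV-only version: LocalAI-io/whisper-tiny-it (WER 27.1% on the Common Voice test set, which is a different test set from the combined one used above, so the figures are not directly comparable)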
- CTranslate2 INT8: LocalAI-io/whisper-tiny-it-multi-ct2-int8
- Code: github.com/localai-org/whisper-it
- LocalAI: github.com/mudler/LocalAI