whisper-base-it
Fine-tuned openai/whisper-base (74M params) for Italian automatic speech recognition (ASR).
Author: Ettore Di Giacinto
Brought to you by the LocalAI team. This model can be used directly with LocalAI.
Results
Word error rate (WER, lower is better) on the Common Voice 25.0 Italian test set (15,184 samples), measured at successive training steps:
| Step | WER |
|---|---|
| 1000 | 26.5% |
| 2000 | 24.0% |
| 3000 | 22.4% |
| 5000 | 20.6% |
| 7000 | 19.9% |
| 10000 | 19.2% |
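For readers unfamiliar with the metric, WER is the word-level edit distance (substitutions, deletions, insertions) between the reference transcript and the model output, divided by the number of reference words. The sketch below illustrates that definition only; the card does not say which tool or text normalization produced the numbers above, so the function name and the whitespace tokenization are assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: splits on whitespace and applies no text
    normalization (casing, punctuation), which real evaluations often do.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One substitution (gatto -> gato) and one deletion (sul) over 5 reference words.
print(word_error_rate("il gatto dorme sul divano", "il gato dorme divano"))
```

A WER of 19.2% therefore means roughly one word-level error for every five reference words.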
Training Details
- Base model: openai/whisper-base (74M parameters)
- Dataset: Common Voice 25.0 Italian (173k train, 15k dev, 15k test)
- Steps: 10,000 (batch size 32, ~1.8 epochs)
- Learning rate: 1e-5 with 500 warmup steps
- Precision: bf16 on NVIDIA GB10
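The epoch figure follows from the other numbers in the list. A quick check, assuming 32 is the effective batch size per optimizer step (no extra gradient accumulation) and using the rounded "173k" training-set size:

```python
# Values taken from the training details above.
steps = 10_000
batch_size = 32          # assumed to be the effective batch size per step
train_samples = 173_000  # rounded "173k" figure, so the result is approximate
warmup_steps = 500

samples_seen = steps * batch_size
epochs = samples_seen / train_samples
warmup_fraction = warmup_steps / steps

print(f"{samples_seen:,} samples seen, about {epochs:.2f} epochs")
print(f"warmup covers {warmup_fraction:.0%} of training")
```

This gives about 1.85 passes over the training split, consistent with the "~1.8 epochs" stated above.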
Usage
Transformers
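from transformers import pipeline
# Load the fine-tuned checkpoint from the Hugging Face Hub.
pipe = pipeline("automatic-speech-recognition", model="LocalAI-io/whisper-base-it")
# Force Italian transcription instead of relying on language auto-detection.
result = pipe("audio.mp3", generate_kwargs={"language": "it", "task": "transcribe"})
print(result["text"])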
CTranslate2 / faster-whisper
For optimized CPU inference, use the INT8-quantized CTranslate2 version: LocalAI-io/whisper-base-it-ct2-int8 (79 MB).
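A minimal sketch of loading the quantized model with the faster-whisper package, assuming `faster-whisper` is installed and that the repository can be fetched by its Hub ID. The helper name `transcribe_italian` and the file `audio.mp3` are placeholders for illustration, not part of this model's tooling.

```python
def transcribe_italian(audio_path: str,
                       model_id: str = "LocalAI-io/whisper-base-it-ct2-int8") -> str:
    """Transcribe an audio file on CPU with the INT8 CTranslate2 model."""
    # Imported here so the module can be loaded without faster-whisper installed.
    from faster_whisper import WhisperModel

    # compute_type="int8" matches the quantization of the published weights.
    model = WhisperModel(model_id, device="cpu", compute_type="int8")

    # transcribe() returns a lazy generator of segments plus run metadata.
    segments, _info = model.transcribe(audio_path, language="it", task="transcribe")
    return " ".join(segment.text.strip() for segment in segments)


if __name__ == "__main__":
    print(transcribe_italian("audio.mp3"))
```

Because the segments are generated lazily, decoding only happens while the generator is consumed inside the `join`.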
LocalAI
This model is compatible with LocalAI for local, self-hosted AI inference.
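LocalAI exposes an OpenAI-compatible transcription endpoint, so a running instance can be queried over HTTP. The sketch below assumes a server listening on `http://localhost:8080`, that the model has been installed there under the name `whisper-base-it` (a hypothetical name that depends on your LocalAI configuration), and that the `requests` package is available.

```python
def transcribe_with_localai(audio_path: str,
                            model_name: str = "whisper-base-it",
                            base_url: str = "http://localhost:8080") -> str:
    """POST an audio file to a LocalAI server and return the transcript text."""
    # Imported here so the module can be loaded without requests installed.
    import requests

    with open(audio_path, "rb") as audio_file:
        response = requests.post(
            f"{base_url}/v1/audio/transcriptions",
            files={"file": audio_file},   # sent as multipart/form-data
            data={"model": model_name},   # hypothetical model name, see above
            timeout=120,
        )
    response.raise_for_status()
    return response.json()["text"]


if __name__ == "__main__":
    print(transcribe_with_localai("audio.mp3"))
```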
Links
- CTranslate2 INT8: LocalAI-io/whisper-base-it-ct2-int8
- Code: github.com/localai-org/whisper-it
- LocalAI: github.com/mudler/LocalAI