whisper-base-it

Fine-tuned openai/whisper-base (74M params) for Italian automatic speech recognition (ASR).

Author: Ettore Di Giacinto

Brought to you by the LocalAI team. This model can be used directly with LocalAI.

Results

Word error rate (WER, lower is better) on the Common Voice 25.0 Italian test set (15,184 samples), measured at successive training steps:

Step     WER
1000     26.5%
2000     24.0%
3000     22.4%
5000     20.6%
7000     19.9%
10000    19.2%
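
For readers unfamiliar with the metric, the sketch below shows how WER is commonly defined: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. It is only an illustration. The function name `word_error_rate` is hypothetical, the sample sentences are made up, and this card does not state which evaluation script or text normalization produced the numbers above.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the first i-1 reference
    # words and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substitution ("gatto" -> "gato") and one deletion ("sul").
wer = word_error_rate("il gatto dorme sul divano", "il gato dorme divano")
print(f"WER: {wer:.1%}")
```

Corpus-level WER is usually computed by summing the errors and the reference words over all utterances before dividing, rather than averaging per-utterance scores.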

Training Details

  • Base model: openai/whisper-base (74M parameters)
  • Dataset: Common Voice 25.0 Italian (173k train, 15k dev, 15k test)
  • Steps: 10,000 (batch size 32, ~1.8 epochs)
  • Learning rate: 1e-5 with 500 warmup steps
  • Precision: bf16 on NVIDIA GB10
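
The epoch figure follows from the other numbers in the list. The arithmetic below is a quick check, assuming the listed batch size of 32 is the effective batch size per optimizer step (no further gradient accumulation or multi-device scaling) and using the rounded 173k training-set size.

```python
steps = 10_000
batch_size = 32            # assumed to be the effective batch size per step
train_examples = 173_000   # rounded figure from the list above

examples_seen = steps * batch_size
epochs = examples_seen / train_examples

print(f"{examples_seen:,} examples seen, about {epochs:.2f} epochs")
```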

Usage

Transformers

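from transformers import pipeline

# Load the fine-tuned checkpoint from the Hugging Face Hub.
pipe = pipeline("automatic-speech-recognition", model="LocalAI-io/whisper-base-it")
# Force Italian transcription instead of relying on language auto-detection.
result = pipe("audio.mp3", generate_kwargs={"language": "it", "task": "transcribe"})
print(result["text"])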

CTranslate2 / faster-whisper

For optimized CPU inference, use the INT8-quantized version: LocalAI-io/whisper-base-it-ct2-int8 (79 MB).
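
A minimal sketch of loading the quantized model with the faster-whisper package might look like the following. It assumes `faster-whisper` is installed and that the repository can be fetched by its Hub ID; the helper names `join_segments` and `transcribe_italian` are hypothetical and not part of any library.

```python
def join_segments(segments) -> str:
    """Concatenate the text of the transcribed segments into one string."""
    return " ".join(segment.text.strip() for segment in segments)


def transcribe_italian(audio_path: str,
                       model_id: str = "LocalAI-io/whisper-base-it-ct2-int8") -> str:
    # Imported lazily so this module can be loaded without faster-whisper installed.
    from faster_whisper import WhisperModel

    # compute_type="int8" matches the INT8 quantization of the converted weights.
    model = WhisperModel(model_id, device="cpu", compute_type="int8")

    # transcribe() returns a lazy generator of segments plus metadata;
    # decoding actually happens while the generator is consumed.
    segments, _info = model.transcribe(audio_path, language="it", task="transcribe")
    return join_segments(segments)


if __name__ == "__main__":
    print(transcribe_italian("audio.mp3"))
```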

LocalAI

This model is compatible with LocalAI for local, self-hosted AI inference.
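
As a rough sketch, a running LocalAI instance can be queried through its OpenAI-compatible transcription endpoint. Everything specific here is an assumption to adapt to your setup: the base URL `http://localhost:8080`, the model name `whisper-base-it` (it must match the name configured in your LocalAI instance), and the helper names, which are hypothetical. The example also assumes the `requests` package is installed.

```python
def build_transcription_request(base_url: str, model_name: str, language: str = "it"):
    """Return the endpoint URL and form fields for a transcription request."""
    url = base_url.rstrip("/") + "/v1/audio/transcriptions"
    form_fields = {"model": model_name, "language": language}
    return url, form_fields


def transcribe_with_localai(audio_path: str,
                            base_url: str = "http://localhost:8080",  # assumed address
                            model_name: str = "whisper-base-it") -> str:  # assumed name
    # Imported lazily so the helper above works without requests installed.
    import requests

    url, form_fields = build_transcription_request(base_url, model_name)
    with open(audio_path, "rb") as audio_file:
        # The audio is sent as a multipart upload in the "file" field.
        response = requests.post(url, data=form_fields,
                                 files={"file": audio_file}, timeout=300)
    response.raise_for_status()
    return response.json()["text"]


if __name__ == "__main__":
    print(transcribe_with_localai("audio.mp3"))
```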
