---
license: apache-2.0
tags:
  - speech-enhancement
  - denoising
  - coreml
  - apple-silicon
  - deepfilternet
library_name: speech-swift
---

# DeepFilterNet3 — Core ML (FP16)

Real-time speech enhancement model for Apple Silicon. Removes background noise from speech audio.

- **2.1M params**, FP16, ~4.2 MB
- Runs on **Neural Engine** via Core ML
- 48 kHz native sample rate, 10 ms frames
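
The size and frame figures above follow from simple arithmetic. A minimal sketch, assuming 2 bytes per FP16 parameter and decimal megabytes; the constant names are illustrative and not part of speech-swift:

```swift
// Illustrative constants taken from the bullet list above.
let parameterCount = 2_100_000   // 2.1M params
let bytesPerParameter = 2        // FP16 stores each weight in 2 bytes (assumption: weights only)
let sampleRate = 48_000          // Hz
let frameMilliseconds = 10       // ms

// Approximate weight size in decimal megabytes.
let modelSizeMB = Double(parameterCount * bytesPerParameter) / 1_000_000

// Number of audio samples covered by one 10 ms frame at 48 kHz.
let samplesPerFrame = sampleRate * frameMilliseconds / 1000

print("model size ≈ \(modelSizeMB) MB, \(samplesPerFrame) samples per frame")
```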

## Latency (M2 Max)

| Audio duration | Processing time | RTF |
|----------------|-----------------|-----|
| 5s | 0.65s | 0.13 |
| 10s | 1.2s | 0.12 |
| 20s | 4.8s | 0.24 |
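
RTF (real-time factor) is processing time divided by audio duration, so values below 1 mean faster than real time. A minimal sketch that recomputes the column from the table rows; `realTimeFactor` is a hypothetical helper, not part of speech-swift:

```swift
// Hypothetical helper: real-time factor = processing time / audio duration.
func realTimeFactor(processingSeconds: Double, audioSeconds: Double) -> Double {
    precondition(audioSeconds > 0, "audio duration must be positive")
    return processingSeconds / audioSeconds
}

// (audio duration, processing time) pairs copied from the table above.
let measurements: [(audio: Double, processing: Double)] = [
    (5, 0.65),
    (10, 1.2),
    (20, 4.8),
]

let factors = measurements.map {
    realTimeFactor(processingSeconds: $0.processing, audioSeconds: $0.audio)
}

for (row, rtf) in zip(measurements, factors) {
    print("\(row.audio) s of audio: RTF \(rtf)")
}
```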

## Usage

```swift
import SpeechEnhancement

let enhancer = try await SpeechEnhancer.fromPretrained()
let clean = try enhancer.enhance(audio: noisyAudio, sampleRate: 48000)
```

```bash
swift run audio denoise noisy.wav --output clean.wav
```

## Files

- `DeepFilterNet3.mlpackage` — Core ML FP16 model (Neural Engine)
- `auxiliary.npz` — ERB filterbank, Vorbis window, normalization states
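
For orientation, a Vorbis window is commonly defined as w[n] = sin(π/2 · sin²(π(n + 0.5)/N)). The sketch below generates one in plain Swift; the `vorbisWindow` function and the length of 960 samples are assumptions for illustration, and the window actually stored in `auxiliary.npz` may differ in length or precision:

```swift
import Foundation

// Hypothetical helper: Vorbis power-complementary window of even length N.
// w[n] = sin(pi/2 * sin^2(pi * (n + 0.5) / N))
func vorbisWindow(length: Int) -> [Double] {
    precondition(length > 0 && length % 2 == 0, "length must be positive and even")
    return (0..<length).map { n in
        let inner = sin(Double.pi * (Double(n) + 0.5) / Double(length))
        return sin(Double.pi / 2 * inner * inner)
    }
}

// Assumed length: 960 samples, i.e. two 10 ms frames at 48 kHz.
let window = vorbisWindow(length: 960)
let hop = window.count / 2

// Largest deviation of w[n]^2 + w[n + hop]^2 from 1 across the first half.
let maxDeviation = (0..<hop)
    .map { abs(window[$0] * window[$0] + window[$0 + hop] * window[$0 + hop] - 1) }
    .max() ?? 0

print("max deviation from power complementarity: \(maxDeviation)")
```

With a hop of half the window length, the squared windows of overlapping frames sum to 1, which is the property that makes this window suitable for overlap-add reconstruction.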

## Reference

- [DeepFilterNet3](https://arxiv.org/abs/2305.08227)
- Part of [speech-swift](https://github.com/soniqo/speech-swift)

---

## Links

- **Blog**: [blog.ivan.digital](https://blog.ivan.digital)
- **Library Docs**: [soniqo.audio](https://soniqo.audio)