Sherpa Voice

Developer Tools

Free

by Arthur Breton

v1.0 60 MB Universal 4+

Description

Sherpa Voice is an open-source demo app that showcases what on-device speech and audio AI can do — directly on your phone, with no cloud and no internet required.

Whether you are a developer evaluating libraries for your next app, a researcher exploring mobile inference, or just curious about offline AI, Sherpa Voice lets you try 10 speech and audio features hands-on.

WHAT YOU CAN TRY

- Speech Recognition: Transcribe speech to text from files or live microphone input. Multiple languages supported.
- Text-to-Speech: Generate natural-sounding speech from text with various voices and speakers.
- Voice Activity Detection: Detect speech vs silence in real-time audio.
- Keyword Spotting: Listen for custom keywords without full transcription.
- Speaker Identification: Recognize speakers by their voice.
- Speaker Diarization: Figure out who spoke when in a conversation.
- Audio Tagging: Classify sounds — music, speech, environmental noise.
- Language Identification: Detect which language is being spoken.
- Speech Enhancement: Remove background noise from recordings.
- Punctuation: Add punctuation to raw transcripts.

Each feature comes with a downloadable model catalog so you can compare different models for speed, size, and accuracy.

FOR DEVELOPERS

Sherpa Voice is built with React Native and Expo using two open-source libraries:

- sherpa-onnx: C++ inference engine for speech models, with mobile-optimized ONNX Runtime
- expo-audio-stream: Real-time audio recording, analysis, and visualization for React Native

Everything you see in this app can be integrated into your own project. The full source code is available on GitHub — use it as a reference, fork it, or contribute.

HOW IT WORKS

1. Pick a feature (Speech Recognition, TTS, etc.)
2. Download a model from the built-in catalog (one-time, typically 15-100 MB)
3. Use the feature — all processing runs locally on your device
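
For developers, the three steps above can be sketched in TypeScript, the usual language for React Native and Expo projects. This is only an illustration under stated assumptions: the ModelStore class, the catalog entries, and every id and size are hypothetical and do not come from Sherpa Voice, sherpa-onnx, or expo-audio-stream. The download is simulated by recording the model id.

```typescript
// Hypothetical sketch: none of these names are part of Sherpa Voice,
// sherpa-onnx, or expo-audio-stream.
type Feature = "asr" | "tts" | "vad";

interface ModelEntry {
  id: string;
  feature: Feature;
  sizeMB: number;
}

// Illustrative catalog; ids and sizes are made up for this sketch.
const catalog: ModelEntry[] = [
  { id: "asr-small", feature: "asr", sizeMB: 40 },
  { id: "asr-large", feature: "asr", sizeMB: 100 },
  { id: "tts-basic", feature: "tts", sizeMB: 60 },
  { id: "vad-tiny", feature: "vad", sizeMB: 15 },
];

class ModelStore {
  private downloaded = new Set<string>();
  downloadCount = 0;

  // Step 1: list the models available for the chosen feature.
  modelsFor(feature: Feature): ModelEntry[] {
    return catalog.filter((m) => m.feature === feature);
  }

  // Step 2: download once; later calls reuse the local copy.
  // Returns true only when a (simulated) download actually happened.
  ensureDownloaded(id: string): boolean {
    if (!catalog.some((m) => m.id === id)) {
      throw new Error(`Unknown model: ${id}`);
    }
    if (this.downloaded.has(id)) {
      return false; // already on device, nothing fetched
    }
    this.downloaded.add(id); // stand-in for a real network download
    this.downloadCount += 1;
    return true;
  }

  // Step 3: a feature can run locally only once its model is stored.
  canRunOffline(id: string): boolean {
    return this.downloaded.has(id);
  }
}

const store = new ModelStore();
const choices = store.modelsFor("asr");
const first = store.ensureDownloaded(choices[0].id);
const second = store.ensureDownloaded(choices[0].id);
```

In this sketch the second call returns false because the model is already stored, which mirrors the one-time download in step 2.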

PRIVACY

No accounts. No tracking. No data collection. Audio never leaves your device.