ESP-Skainet
ESP-Skainet is Espressif's intelligent voice assistant, which currently supports the Wake Word Engine and Speech Commands Recognition.
ESP32-S3 is the recommended chip for speech command recognition because it supports AI instructions and high-speed octal SPI PSRAM. The latest models will be deployed on ESP32-S3 first.
ESP-Skainet makes it convenient to develop wake word detection and speech command recognition applications on Espressif's ESP32 series chips.
ESP-Skainet has the following features:
The input audio stream can come from a microphone or from wav/pcm files stored in flash or SD card.
The WakeNet engine is designed to provide high performance and low memory usage for wake word detection, enabling devices to always listen for wake words such as "Alexa", “天猫精灵” (Tian Mao Jing Ling), and “小爱同学” (Xiao Ai Tong Xue).
Espressif provides wake words such as "Hi, Lexin" and "Hi, ESP" for free, and also supports custom wake words. For details, see the Espressif Speech Wake Words Customization Process.
The MultiNet model provides flexible offline speech command recognition. You can easily add your own commands without retraining the model.
Currently, MultiNet supports up to 200 Chinese or English speech commands, such as “打开空调” (Turn on the air conditioner) and “打开卧室灯” (Turn on the bedroom light).
The Audio Front-End (AFE) integrates AEC (Acoustic Echo Cancellation), VAD (Voice Activity Detection), BSS (Blind Source Separation), and NS (Noise Suppression).
Our two-mic AFE has been qualified as a “Software Audio Front-End Solution” for Amazon Alexa Built-in devices.
To run ESP-Skainet, you need an ESP32 or ESP32-S3 development board with an integrated audio input module.
Example Name | Latest Models | Supported Board |
---|---|---|
cn_speech_commands_recognition | MultiNet7 | ESP32-Korvo, ESP32-S3-Korvo-1, ESP-BOX, ESP-S3-Korvo-2, ESP32-S3-EYE, ESP32-P4-Function-EV |
en_speech_commands_recognition | MultiNet7 | ESP32-S3-Korvo-1, ESP-BOX, ESP-S3-Korvo-2, ESP32-S3-EYE, ESP32-P4-Function-EV |
wake_word_detection | WakeNet9 | ESP32-Korvo, ESP32-S3-Korvo-1, ESP-BOX, ESP-S3-Korvo-2, ESP32-S3-EYE, ESP32-P4-Function-EV |
chinese_tts | esp-tts-v1.7 | ESP32-Korvo, ESP32-S3-Korvo-1, ESP-BOX, ESP-S3-Korvo-2, ESP32-P4-Function-EV |
usb_mic_recorder | | ESP-BOX, ESP-S3-Korvo-2 |
For application configuration, please refer to the README.md in each example folder.
Clone this project:
git clone https://github.com/espressif/esp-skainet.git
ESP-IDF v4.4 and ESP-IDF v5.0 are supported. If you have already set up ESP-IDF and do not want to change your existing environment, you can set the IDF_PATH environment variable to the new ESP-IDF path.
For setup details, see the Getting Started Guide for ESP-IDF v4.4.
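As a minimal sketch of pointing IDF_PATH at a separate ESP-IDF checkout for the current shell session only, something like the following may work. The path `$HOME/esp/esp-idf-v5.0` is a hypothetical location, not one prescribed by this project; `export.sh` is the environment script that ships in the ESP-IDF root.

```shell
# Hypothetical location of a second ESP-IDF checkout; adjust to your machine.
IDF_PATH="${HOME}/esp/esp-idf-v5.0"
export IDF_PATH

# export.sh ships with ESP-IDF and puts the toolchain and idf.py on PATH.
# Source it only if a checkout actually exists at the assumed path.
if [ -f "${IDF_PATH}/export.sh" ]; then
    . "${IDF_PATH}/export.sh"
else
    echo "No ESP-IDF checkout found at ${IDF_PATH}"
fi
```

Because the variable is exported only in the current shell, other terminals keep using whatever ESP-IDF environment they already had.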
The examples folder contains sample applications using the ESP-Skainet API.
We recommend starting with the wake_word_detection example:
- Enter the example folder:
cd examples/wake_word_detection
- Compile and flash the project:
idf.py flash monitor
- Advanced users can add or modify speech commands using:
idf.py menuconfig
For more details, see the README in each example folder.
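The steps above can be sketched as one short script, run from inside `examples/wake_word_detection`. This assumes an ESP32-S3 board and a serial port at `/dev/ttyUSB0`; both values, and the variable names `SKAINET_TARGET` and `SKAINET_PORT`, are illustrative placeholders rather than anything this project defines.

```shell
# Illustrative values: change the chip target and serial port for your board.
SKAINET_TARGET="esp32s3"
SKAINET_PORT="/dev/ttyUSB0"

if command -v idf.py >/dev/null 2>&1; then
    # Select the chip the project is built for.
    idf.py set-target "${SKAINET_TARGET}"
    # Build, flash over the given port, then open the serial monitor.
    idf.py -p "${SKAINET_PORT}" flash monitor
else
    echo "idf.py not found; set up the ESP-IDF environment first"
fi
```

Passing the port explicitly with `-p` avoids relying on auto-detection when more than one serial device is connected.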
- View the Issues section on GitHub. If you find a bug or have a feature request, please check existing issues before opening a new one.
- Interested in contributing? See the Contributions Guide.