ESP-Skainet [中文]

ESP-Skainet is Espressif's intelligent voice assistant, which currently supports the Wake Word Engine and Speech Commands Recognition.

ESP32-S3, which supports AI instructions and high-speed octal SPI PSRAM, is recommended for running speech commands recognition. The latest models will be deployed on ESP32-S3 first.

Overview

ESP-Skainet makes it convenient to develop wake word detection and speech command recognition applications on Espressif's ESP32 series chips.

ESP-Skainet has the following features:

Input Voice Stream

The input audio stream can come from a microphone or from wav/pcm files stored in flash or SD card.

Wake Word Engine

The WakeNet engine is designed to provide high performance and low memory usage for wake word detection, enabling devices to always listen for wake words such as "Alexa", “天猫精灵” (Tian Mao Jing Ling), and “小爱同学” (Xiao Ai Tong Xue).

Espressif provides wake words such as "Hi, Lexin" and "Hi, ESP" for free, and also supports custom wake words. For details, see the Espressif Speech Wake Words Customization Process.

Speech Commands Recognition

The MultiNet model provides flexible offline speech command recognition. You can easily add your own commands without retraining the model.

Currently, MultiNet supports up to 200 Chinese or English speech commands, such as “打开空调” (Turn on the air conditioner) and “打开卧室灯” (Turn on the bedroom light).

Audio Front End

The Audio Front-End (AFE) integrates AEC (Acoustic Echo Cancellation), VAD (Voice Activity Detection), BSS (Blind Source Separation), and NS (Noise Suppression).

Our two-mic AFE has been qualified as a “Software Audio Front-End Solution” for Amazon Alexa Built-in devices.

Quick Start with ESP-Skainet

Hardware Preparation

To run ESP-Skainet, you need an ESP32 or ESP32-S3 development board with an integrated audio input module.

| Example Name | Latest Models | Supported Boards |
| --- | --- | --- |
| cn_speech_commands_recognition | MultiNet7 | ESP32-Korvo, ESP32-S3-Korvo-1, ESP-BOX, ESP-S3-Korvo-2, ESP32-S3-EYE, ESP32-P4-Function-EV |
| en_speech_commands_recognition | MultiNet7 | ESP32-S3-Korvo-1, ESP-BOX, ESP-S3-Korvo-2, ESP32-S3-EYE, ESP32-P4-Function-EV |
| wake_word_detection | WakeNet9 | ESP32-Korvo, ESP32-S3-Korvo-1, ESP-BOX, ESP-S3-Korvo-2, ESP32-S3-EYE, ESP32-P4-Function-EV |
| chinese_tts | esp-tts-v1.7 | ESP32-Korvo, ESP32-S3-Korvo-1, ESP-BOX, ESP-S3-Korvo-2, ESP32-P4-Function-EV |
| usb_mic_recorder | | ESP-BOX, ESP-S3-Korvo-2 |

For application configuration, please refer to the README.md in each example folder.

Software Preparation

ESP-Skainet

Clone this project:

git clone https://github.com/espressif/esp-skainet.git

ESP-IDF

ESP-IDF v4.4 and ESP-IDF v5.0 are supported. If you have already set up ESP-IDF and do not want to change your existing environment, you can set the IDF_PATH environment variable to the new ESP-IDF path.

Note: If you need to use ESP-IDF v3.2 or earlier, please refer to esp-skainet v0.2.0.

For setup details, see the Getting Started Guide for ESP-IDF v4.4.
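The IDF_PATH switch and the version constraint described in this section can be sketched as two small shell helpers. This is a minimal sketch, not part of ESP-Skainet: the function names `use_idf` and `idf_version_supported` are hypothetical, and the checkout path is whatever you pass in. It relies only on `export.sh`, which ships in the root of an ESP-IDF checkout.

```shell
# Sketch: point the current shell at a specific ESP-IDF checkout.
# use_idf is a hypothetical helper name, not an ESP-IDF or ESP-Skainet command.
use_idf() {
    idf_dir="$1"
    if [ ! -f "$idf_dir/export.sh" ]; then
        echo "no export.sh found in '$idf_dir'" >&2
        return 1
    fi
    export IDF_PATH="$idf_dir"
    # export.sh ships with ESP-IDF and adds idf.py and the toolchain to PATH.
    . "$IDF_PATH/export.sh"
}

# Sketch: return 0 when a version string such as "ESP-IDF v5.0.2" belongs to
# a release series listed above as supported (v4.4 or v5.0).
idf_version_supported() {
    case "$1" in
        *v4.4|*v4.4.*|*v4.4-*|*v5.0|*v5.0.*|*v5.0-*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, assuming your checkout lives at `~/esp/esp-idf` (an illustrative path), you could run `use_idf ~/esp/esp-idf` and then `idf_version_supported "$(idf.py --version)" || echo "unsupported ESP-IDF version"`. Because the helper only changes the current shell, your existing environment stays untouched in other terminals.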

Examples

The examples folder contains sample applications using the ESP-Skainet API.

We recommend starting with the wake_word_detection example:

Enter the example folder:
```
cd examples/wake_word_detection
```
Compile and flash the project:
```
idf.py flash monitor
```
Advanced users can add or modify speech commands using:
```
idf.py menuconfig
```

For more details, see the README in each example folder.
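When the board uses a different chip than the one the project is currently configured for, the target usually has to be selected before flashing. The following is a hedged sketch of that sequence wrapped in a hypothetical helper, `flash_example`; the `IDF_PY_CMD` variable is also hypothetical and exists only so the command can be swapped out for a dry run. The `set-target` subcommand and the `-p` port option are standard `idf.py` features, while the port name and target in the usage note are assumptions about your setup.

```shell
# Sketch: select the chip target, then build, flash, and open the monitor.
# flash_example and IDF_PY_CMD are hypothetical names used for illustration.
flash_example() {
    example_dir="$1"   # e.g. examples/wake_word_detection
    target="$2"        # e.g. esp32s3
    port="$3"          # serial port of the board; depends on your host
    (
        # Run in a subshell so the caller's working directory is unchanged.
        cd "$example_dir" || exit 1
        # set-target reconfigures the project for the chosen chip.
        "${IDF_PY_CMD:-idf.py}" set-target "$target" &&
        "${IDF_PY_CMD:-idf.py}" -p "$port" flash monitor
    )
}
```

A typical call might look like `flash_example examples/wake_word_detection esp32s3 /dev/ttyUSB0`. Note that `set-target` clears the build directory and regenerates `sdkconfig`, so run `idf.py menuconfig` after selecting the target rather than before.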

Resources

View the Issues section on GitHub. If you find a bug or have a feature request, please check existing issues before opening a new one.
Interested in contributing? See the Contributions Guide.
