ESP-Skainet

ESP-Skainet is Espressif's intelligent voice assistant, which currently supports the Wake Word Engine and Speech Commands Recognition.

ESP32-S3, which supports AI instructions and high-speed octal SPI PSRAM, is recommended for running speech command recognition. The latest models will be deployed on ESP32-S3 first.

Overview

ESP-Skainet lets you build wake word detection and speech command recognition applications on Espressif's ESP32 series chips.

ESP-Skainet has the following features:

Input Voice Stream

The input audio stream can come from a microphone or from wav/pcm files stored in flash or SD card.

Wake Word Engine

The WakeNet engine is designed to provide high performance and low memory usage for wake word detection, enabling devices to always listen for wake words such as "Alexa", “天猫精灵” (Tian Mao Jing Ling), and “小爱同学” (Xiao Ai Tong Xue).

Espressif provides wake words such as "Hi, Lexin" and "Hi, ESP" for free, and also supports custom wake words. For details, see the Espressif Speech Wake Words Customization Process.

Speech Commands Recognition

The MultiNet model provides flexible offline speech command recognition. You can easily add your own commands without retraining the model.

Currently, MultiNet supports up to 200 Chinese or English speech commands, such as “打开空调” (Turn on the air conditioner) and “打开卧室灯” (Turn on the bedroom light).

Audio Front End

The Audio Front-End (AFE) integrates AEC (Acoustic Echo Cancellation), VAD (Voice Activity Detection), BSS (Blind Source Separation), and NS (Noise Suppression).

Our two-mic AFE has been qualified as a “Software Audio Front-End Solution” for Amazon Alexa Built-in devices.

Quick Start with ESP-Skainet

Hardware Preparation

To run ESP-Skainet, you need an ESP32 or ESP32-S3 development board with an integrated audio input module.

| Example Name | Latest Models | Supported Board |
| --- | --- | --- |
| cn_speech_commands_recognition | MultiNet7 | ESP32-Korvo, ESP32-S3-Korvo-1, ESP-BOX, ESP-S3-Korvo-2, ESP32-S3-EYE, ESP32-P4-Function-EV |
| en_speech_commands_recognition | MultiNet7 | ESP32-S3-Korvo-1, ESP-BOX, ESP-S3-Korvo-2, ESP32-S3-EYE, ESP32-P4-Function-EV |
| wake_word_detection | WakeNet9 | ESP32-Korvo, ESP32-S3-Korvo-1, ESP-BOX, ESP-S3-Korvo-2, ESP32-S3-EYE, ESP32-P4-Function-EV |
| chinese_tts | esp-tts-v1.7 | ESP32-Korvo, ESP32-S3-Korvo-1, ESP-BOX, ESP-S3-Korvo-2, ESP32-P4-Function-EV |
| usb_mic_recorder | | ESP-BOX, ESP-S3-Korvo-2 |

For application configuration, please refer to the README.md in each example folder.

Software Preparation

ESP-Skainet

Clone this project:

git clone https://github.com/espressif/esp-skainet.git

ESP-IDF

ESP-IDF v4.4 and ESP-IDF v5.0 are supported. If you have already set up ESP-IDF and do not want to change your existing environment, you can set the IDF_PATH environment variable to the new ESP-IDF path.
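
As a minimal sketch of the IDF_PATH approach described above, the following points the current shell at a separate ESP-IDF checkout without touching an existing installation. The directory `$HOME/esp/esp-idf-v4.4` is a hypothetical location chosen for illustration; replace it with wherever your checkout actually lives.

```shell
# Hypothetical location of a separate ESP-IDF checkout; adjust to your setup.
export IDF_PATH="$HOME/esp/esp-idf-v4.4"

# Load the ESP-IDF tools into the current shell only if the checkout exists.
if [ -f "$IDF_PATH/export.sh" ]; then
    . "$IDF_PATH/export.sh"
else
    echo "ESP-IDF not found at $IDF_PATH" >&2
fi
```

Because the variable is exported only in the current shell session, other terminals keep using whatever ESP-IDF environment they already had.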

Note: If you need to use ESP-IDF v3.2 or earlier, please refer to esp-skainet v0.2.0.

For setup details, see the Getting Started Guide for ESP-IDF v4.4.

Examples

The examples folder contains sample applications using the ESP-Skainet API.

We recommend starting with the wake_word_detection example:

  1. Enter the example folder:
    cd examples/wake_word_detection
    
  2. Compile and flash the project:
    idf.py flash monitor
    
  3. Advanced users can add or modify speech commands using:
    idf.py menuconfig
    

For more details, see the README in each example folder.
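
The steps above can be combined into one short session. This is a sketch under assumptions: it targets an ESP32-S3 board (use `esp32` for ESP32-Korvo), it is run from the root of the cloned esp-skainet repository, and the ESP-IDF environment has already been exported in the current shell. The `TARGET` and `EXAMPLE_DIR` variables are names introduced here for illustration only.

```shell
# Assumed target chip; use esp32 for ESP32-Korvo.
TARGET=esp32s3
# Example directory, relative to the esp-skainet repository root.
EXAMPLE_DIR=examples/wake_word_detection

if command -v idf.py >/dev/null 2>&1 && [ -d "$EXAMPLE_DIR" ]; then
    cd "$EXAMPLE_DIR"
    idf.py set-target "$TARGET"   # select the chip before building
    idf.py flash monitor          # build, flash, then open the serial monitor
else
    echo "Run this from the esp-skainet root with ESP-IDF exported" >&2
fi
```

If the board is not detected automatically, `idf.py -p PORT flash monitor` lets you name the serial port explicitly.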
