Espressif’s ESP32-S3 is a low-power system-on-chip (SoC) that supports 2.4 GHz Wi-Fi (IEEE 802.11b/g/n), Bluetooth 5 Low Energy, and Bluetooth mesh connectivity. This device is suitable for a wide range of applications, including:
- Smart home
- White goods
- Industrial automation
- Consumer electronics
- Smart agriculture
- Video streaming cameras
- Speech and image recognition
- Touch sensing
The ESP32-S3 contains a high-performance dual-core CPU, the 32-bit Xtensa LX7, which can run at up to 240 MHz and is supported by 384 kB ROM, 512 kB SRAM, and 16 kB SRAM in the RTC module. Standard interfaces are included to allow connection to flash and external RAM.
This device includes a power management unit with multiple power modes and ultra-low-power co-processors. Peak current consumption during active RF operation is 340 mA, falling to 7 µA in deep-sleep mode. In deep sleep, the main CPU and most peripherals are powered down, while RTC memory, which holds Wi-Fi connection data, stays powered.
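To put the two current figures above in perspective, the following minimal sketch estimates the average current of a device that wakes periodically and sleeps the rest of the time. It is a rough upper-bound illustration, not a measured result: it assumes the device draws the full 340 mA peak for its entire awake window (real active current is usually lower), and the awake time, wake period, and battery capacity in the example are hypothetical values chosen only for illustration.

```python
# Rough duty-cycle power estimate using the two figures quoted in the text.
ACTIVE_PEAK_MA = 340.0   # peak current in active RF operation (from the text)
DEEP_SLEEP_MA = 0.007    # 7 µA deep-sleep current (from the text)


def average_current_ma(awake_s, period_s,
                       active_ma=ACTIVE_PEAK_MA, sleep_ma=DEEP_SLEEP_MA):
    """Time-weighted average current over one wake/sleep cycle, in mA."""
    if period_s <= 0 or not 0 <= awake_s <= period_s:
        raise ValueError("need 0 <= awake_s <= period_s and period_s > 0")
    duty = awake_s / period_s
    return duty * active_ma + (1.0 - duty) * sleep_ma


def battery_life_days(capacity_mah, avg_ma):
    """Idealized battery life; ignores self-discharge and regulator losses."""
    return capacity_mah / avg_ma / 24.0


if __name__ == "__main__":
    # Hypothetical scenario: awake 2 s every 10 minutes, 2000 mAh battery.
    avg = average_current_ma(awake_s=2, period_s=600)
    print(f"average current: {avg:.3f} mA")
    print(f"estimated life:  {battery_life_days(2000, avg):.0f} days")
```

Even with this pessimistic assumption, the average is dominated by the short awake window rather than the sleep current, which is why shortening the time spent with the radio active tends to matter more than the exact deep-sleep figure.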
The ESP32-S3 also contains a highly integrated RF module that supports wireless functionality, and an internal co-existence mechanism allows Wi-Fi and Bluetooth to share the same antenna. Other peripheral interfaces include 45 programmable GPIOs, a Digital Video Port (DVP) 8- to 16-bit camera interface, an LED PWM controller with up to 8 channels, 2 × 12-bit SAR ADCs with up to 20 channels, and 14 touch-sensing IOs. Secure operation of the ESP32-S3 is supported with secure boot, flash encryption, digital signature, and more.
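As a small illustration of what 12-bit resolution means for the SAR ADCs, the sketch below converts a raw reading to a voltage under an ideal linear model. The full-scale voltage is a hypothetical parameter here: on real hardware it depends on the configured input attenuation, and readings normally need calibration, so treat this as arithmetic rather than a driver.

```python
ADC_BITS = 12                    # resolution stated in the text
ADC_MAX = (1 << ADC_BITS) - 1    # largest raw code: 4095


def raw_to_millivolts(raw, full_scale_mv):
    """Map a raw ADC code to millivolts, assuming an ideal linear response.

    full_scale_mv is an assumed value supplied by the caller; it is not
    taken from the datasheet.
    """
    if not 0 <= raw <= ADC_MAX:
        raise ValueError(f"raw code must be in 0..{ADC_MAX}")
    return raw * full_scale_mv / ADC_MAX


if __name__ == "__main__":
    # Hypothetical 1000 mV full-scale range: one step is about 0.24 mV.
    print(f"step size: {raw_to_millivolts(1, 1000):.3f} mV")
    print(f"mid-scale: {raw_to_millivolts(2048, 1000):.1f} mV")
```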
For more specifications, see the full datasheet for the ESP32-S3 series.
The ESP32-S3-BOX is an AIoT development kit for testing the ESP32-S3 in smart device applications that may require offline or online voice assistants. This kit simplifies product development by enabling users to quickly create proof-of-concept designs for end-device HMIs. To this end, the ESP32-S3-BOX integrates a variety of peripherals, including a 2.4-inch display (320 × 240 px) with capacitive touch sensing, dual microphones, a speaker, and two Pmod™-compatible headers for hardware expansion. Various sensors, an infrared controller, and a smart gateway are also included, along with 3 buttons for reset, boot mode, and mute control. A USB Type-C connector supplies 5 V power and also supports programming as well as serial and JTAG debugging.
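When prototyping an HMI on the 320 × 240 display, it can help to know how much memory a full-screen frame buffer would occupy relative to the 512 kB of on-chip SRAM mentioned earlier. The sketch below does that arithmetic. The 16-bit RGB565 pixel format is an assumption made for illustration (the text does not state the panel's color depth), and 512 kB is interpreted as 512 × 1024 bytes.

```python
WIDTH_PX = 320                  # display width from the text
HEIGHT_PX = 240                 # display height from the text
SRAM_BYTES = 512 * 1024         # on-chip SRAM, read as 512 KiB (assumption)


def framebuffer_bytes(width, height, bits_per_pixel):
    """Bytes needed for one full frame, rounded up to a whole byte."""
    return (width * height * bits_per_pixel + 7) // 8


if __name__ == "__main__":
    # Assumed RGB565 format: 16 bits per pixel.
    size = framebuffer_bytes(WIDTH_PX, HEIGHT_PX, 16)
    print(f"frame buffer: {size} bytes")            # 153600 bytes
    print(f"share of SRAM: {size / SRAM_BYTES:.1%}")
```

Under these assumptions a single full frame takes a sizeable share of internal SRAM, which is one reason designs of this kind often use partial-screen buffers or place frame buffers in external RAM.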
To support AI voice functions, the ESP32-S3-BOX runs Espressif’s audio front-end (AFE) algorithm, which supports acoustic echo cancellation (AEC), blind source separation (BSS), and noise suppression (NS) for good far-field performance in noisy environments.
Included with the ESP32-S3-BOX is ESP-Skainet, Espressif’s offline voice-assistant SDK, which supports over 200 custom voice commands. Users can wake the device by voice at any time, even while it is speaking or playing music, and can talk to it continuously once it is awake. The ESP32-S3-BOX also supports integration of third-party online voice assistants such as Alexa.
The main functions of the ESP32-S3-BOX include AI image processing, Wi-Fi human-body detection, and wireless image transmission. Espressif’s complete AIoT platform, ESP RainMaker®, can also be used with ESP32-S3-BOX for configuring GPIOs and offline commands while providing control via mobile apps and/or a voice assistant.
For slightly different functionality, a lite version of this kit is available. The ESP32-S3-BOX-Lite retains the 2.4-inch display but without capacitive touch functionality, and it replaces the mute button with 3 configurable buttons. The ESP32-S3-BOX-Lite also omits the desktop stand and dock and does not support AEC or ESP-Skainet’s voice-wake functions. Please specify if this option is preferred when filling out the application form below.