Unfortunately, the available memory on microcontrollers is quite restricted, and we do not get very far by storing an (uncompressed) WAV file, e.g. in program/flash memory, so I started to look into compressed audio formats.
On the desktop we can use the FFmpeg project, which comes with a rich set of functionality. Unfortunately, the situation is much more fragmented for microcontrollers.
I started to collect the relevant libraries, and to make them simple to use I also added a simple C++ API on top of them:
- libhelix: an MP3 and AAC decoder from RealNetworks
- fdk-aac: an AAC encoder and decoder from the Fraunhofer Institute
- libmad: an open source MP3 decoder from Underbit
- liblame: an open source MP3 encoder using LAME
All these projects can be used as Arduino libraries, but they also compile and work outside of Arduino with the help of cmake.
I am also providing an integration into my Arduino Audio Tools where you can use these libraries with the EncodedAudioStream class:
#include "Arduino.h"
#include "AudioTools.h"
#include "AudioCodecs/CodecMP3Helix.h"
#include "BabyElephantWalk60_mp3.h"

using namespace audio_tools;

MemoryStream mp3(BabyElephantWalk60_mp3, BabyElephantWalk60_mp3_len); // MP3 data source
I2SStream i2s; // final output of decoded stream
EncodedAudioStream dec(&i2s, new MP3DecoderHelix()); // decoding stream writing to i2s
StreamCopy copier(dec, mp3); // copies from mp3 (in) to dec (out)

void setup(){
  Serial.begin(115200);
  dec.setNotifyAudioChange(i2s); // let the decoder update the I2S settings
  dec.begin();
  i2s.begin();
}

void loop(){
  if (mp3) { // copy while MP3 data is still available
    copier.copy();
  }
}
The setNotifyAudioChange method makes sure that the I2S settings are updated with the audio information (channels, bits per sample, sample rate) provided by the decoder.
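The idea behind such a notification can be sketched in plain C++. This is only an illustration of the pattern, not the AudioTools implementation: AudioFormat, FormatListener, DemoOutput and DemoDecoder are hypothetical stand-ins that I made up for this example.

```cpp
#include <cstdint>

// Hypothetical stand-ins, NOT the AudioTools classes: they only illustrate
// how a decoder can push format changes to an output.
struct AudioFormat {
  uint32_t sample_rate = 0;
  uint16_t channels = 0;
  uint16_t bits_per_sample = 0;
};

// Anything that wants to hear about format changes implements this.
struct FormatListener {
  virtual void onFormatChange(const AudioFormat& fmt) = 0;
  virtual ~FormatListener() = default;
};

// Plays the role of the I2S output: it takes over the new format when notified.
struct DemoOutput : FormatListener {
  AudioFormat active;
  int reconfigurations = 0;
  void onFormatChange(const AudioFormat& fmt) override {
    active = fmt;
    ++reconfigurations;
  }
};

// Plays the role of the decoder: it reports the format found in the stream.
class DemoDecoder {
 public:
  void setListener(FormatListener& listener) { listener_ = &listener; }

  // Assumed to be called whenever a frame header has been parsed.
  void onFrameHeader(const AudioFormat& fmt) {
    const bool changed = fmt.sample_rate != last_.sample_rate ||
                         fmt.channels != last_.channels ||
                         fmt.bits_per_sample != last_.bits_per_sample;
    last_ = fmt;
    // Only notify when the format really differs from the previous frame.
    if (changed && listener_ != nullptr) listener_->onFormatChange(fmt);
  }

 private:
  FormatListener* listener_ = nullptr;
  AudioFormat last_;
};
```

In this sketch the output is registered once with setListener() and is reconfigured only when the decoder sees a format that differs from the previous frame, so repeated frames with the same format do not trigger a new setup.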
8 Comments
Bromium · 10. January 2023 at 22:19
Just what I needed! Thank you for covering this important base in the audio sphere. What I am trying to do is conceptually simple. I have a bunch of MP3 tracks on an ESP32 + VS1053, which I would like to mix. So I would need to: 1- fade each track in and out, 2- mix (overlap, add) the first track with the second track at the places where they are fading, 3- do the mix in real time.
Can this be done without decoding/re-encoding?
If not, does the zip file contain a decoder?
Thanks a lot, Bromium
pschatzmann · 11. January 2023 at 3:50
To my knowledge this cannot be done with MP3 directly, and you would need to decode the MP3 to PCM to perform the mixing.
If you look into the documentation of the AudioTools project you can find that it supports
– Decoding
– Mixing
– Output to VS1053
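To illustrate why the mixing has to happen on PCM, here is a minimal sketch of a linear crossfade over the overlapping region of two already decoded tracks. It assumes mono 16-bit samples; crossfade() is a hypothetical helper written for this example and not a function of the AudioTools library.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Linear crossfade of two mono 16-bit PCM buffers: `outgoing` fades from
// full volume to silence while `incoming` fades in from silence.
// If the buffers differ in length, only the shorter length is mixed.
// crossfade() is a hypothetical helper, not part of any library.
std::vector<int16_t> crossfade(const std::vector<int16_t>& outgoing,
                               const std::vector<int16_t>& incoming) {
  const std::size_t n =
      outgoing.size() < incoming.size() ? outgoing.size() : incoming.size();
  std::vector<int16_t> mixed(n);
  for (std::size_t i = 0; i < n; ++i) {
    // Gain of the incoming track rises from 0.0 to 1.0 across the overlap.
    const float gain_in =
        (n > 1) ? static_cast<float>(i) / static_cast<float>(n - 1) : 1.0f;
    const float gain_out = 1.0f - gain_in;
    float sample = gain_out * outgoing[i] + gain_in * incoming[i];
    // Defensive clamp to the 16-bit range before converting back.
    if (sample > 32767.0f) sample = 32767.0f;
    if (sample < -32768.0f) sample = -32768.0f;
    mixed[i] = static_cast<int16_t>(sample);
  }
  return mixed;
}
```

For real-time use the same per-sample weighting would be applied block by block as the two decoders deliver data, instead of on complete buffers.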
John · 23. December 2021 at 0:54
Hi,
I would like to send some 44.1 kHz stereo audio data to my PC (and eventually Sonos) using the web server on an ESP32.
I tried using the “sine” wave example, and after bumping the rates to 44100/2CH, I get drop-outs when I stream it to Chrome on my PC.
I then tried updating it to use the “AACEncoderFDK”, however, I get a bunch of calloc errors when attempting to connect.
Is it possible to get this working?
Here is a pastebin link to what I currently have:
https://pastebin.com/PFPFk3VP
pschatzmann · 23. December 2021 at 9:24
The decoders are still a work in progress, and I have my doubts that this architecture fits your intended purpose.
The audio webserver is just a pretty hack that I implemented because I was tired of wiring up I2S decoders; it is not intended to be used as a hi-fi streaming source.
I was looking into other streaming approaches, but that failed miserably (see https://github.com/pschatzmann/live555).
So far the only reliable way is to use A2DP Bluetooth, but I am afraid that will not help you with Sonos.
Unfortunately, my decoders haven’t been fully tested yet and I did not have the time to look into these memory issues.
Alex · 6. October 2021 at 15:20
Is it possible to compile your example (arduino-fdk-aac/examples/encode/encode.ino) in the Arduino IDE (Windows 7)? When executed, I get an error in the terminal:
“[E] AACEncoderFDK.h : 320 – Unable to open encoder
starting…
[E] AACEncoderFDK.h : 199 – The encoder is not open
512 samples of random data written
[E] AACEncoderFDK.h : 199 – The encoder is not open
512 samples of random data written
[E] AACEncoderFDK.h : 199 – The encoder is not open”
….
An error occurs when calling “Get_AacEncoder ()”, but this function is not in the source code.
John Taipei · 2. September 2021 at 19:20
Hello Sir:
my English is poor
I have an idea for a long-range walkie-talkie (500~1000 meters)
audio–>INMP441–>esp32–>LoRa–>TX 443MHz
RX 443MHz–>LoRa–>esp32–>Max98357–>audio
Hardware is ready
https://www.facebook.com/groups/797613180807626/permalink/950282998873976/
but I can not write code
I am good at hardware not software
May I have chance to work with you??
Sincerely
John
pschatzmann · 3. September 2021 at 11:01
If you are not in a hurry I might be able to help you. What have you tried so far?
I suggest having a look at my https://github.com/pschatzmann/arduino-audio-tools library. You should find all the necessary building blocks there, and you might even exchange the ESP32 e.g. with a Raspberry Pi Pico.
I am just worried about LoRa: telephone quality needs a sample rate of at least 8000 Hz. With 16-bit mono data this gives 8000 × 16 = 128 kbps, which is way above what you can achieve with LoRa.
So we might be forced to use MP3 or AAC, which will add complexity and delays…
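The arithmetic behind this can be written down as a small C++ sketch. pcmBitRate() and requiredCompression() are hypothetical helper names chosen for this example, and the link rate is a parameter you have to fill in for your own radio module.

```cpp
#include <cstdint>

// Raw (uncompressed) PCM bit rate in bits per second.
constexpr uint32_t pcmBitRate(uint32_t sample_rate_hz,
                              uint32_t bits_per_sample,
                              uint32_t channels) {
  return sample_rate_hz * bits_per_sample * channels;
}

// Minimum compression ratio needed to squeeze a PCM stream into a link
// with the given raw data rate (protocol overhead is ignored here).
constexpr double requiredCompression(uint32_t pcm_bps, uint32_t link_bps) {
  return static_cast<double>(pcm_bps) / static_cast<double>(link_bps);
}
```

pcmBitRate(8000, 16, 1) evaluates to 128000 bits per second. For a link with a raw rate of 40 kbps (the figure quoted for the SX1278 further down in this thread), requiredCompression(128000, 40000) gives 3.2, so the audio would have to be compressed by at least that factor, and in practice by more, because the usable throughput is lower than the raw rate.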
John Taipei · 7. September 2021 at 5:06
Thank you, first of all.
I am not in a hurry
https://www.youtube.com/watch?v=vq7mPgecGKA
(wifi walkie talkie)
https://www.youtube.com/watch?v=d_h38X4_eQQ
(ESP NOW walkie talkie)
but the range is too short
Yes, the SX1278 data rate is only 40 kbps
I will change to the SX1280 2.4 GHz module;
its data rate is 200 kbps