The goal of this blog is to give a quick introduction to using TensorFlow Lite for Microcontrollers to create audio output with the help of the audio-tools library.
Hello World
The starting point is the good overview provided by the “Hello World” example of TensorFlow Lite, which describes how to create, train and use a model based on the sine function.
Converting to Audio
We can use this model to output the result as a tone with the help of the following sketch:
#include "AudioTools.h"
#include "AudioLibs/TfLiteAudioStream.h"
#include "model.h"
TfLiteSineReader tf_reader(20000,0.3); // Audio generation logic
TfLiteAudioStream tf_stream; // Audio source -> no classification so N is 0
I2SStream i2s; // Audio destination
StreamCopy copier(i2s, tf_stream); // copy tf_stream to i2s
int channels = 1;
int samples_per_second = 16000;
void setup() {
Serial.begin(115200);
AudioLogger::instance().begin(Serial, AudioLogger::Warning);
// Setup tensorflow input
auto tcfg = tf_stream.defaultConfig();
tcfg.channels = channels;
tcfg.sample_rate = samples_per_second;
tcfg.kTensorArenaSize = 2 * 1024;
tcfg.model = g_model;
tcfg.input = &tf_reader;
tf_stream.begin(tcfg);
// setup Audioi2s output
auto cfg = i2s.defaultConfig(TX_MODE);
cfg.channels = channels;
cfg.sample_rate = samples_per_second;
i2s.begin(cfg);
}
void loop() { copier.copy(); }
Like in any other audio sketch, we copy the audio data from the source (TfLiteAudioStream) to the sink (I2SStream).
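Since the copy pattern does not care what the sink is, you can swap the I2SStream for any other output stream. Here is a minimal variation, assuming the AnalogAudioStream class from the AudioTools library (which drives the ESP32’s built-in DAC and follows the same defaultConfig()/begin() pattern):

#include "AudioTools.h"
#include "AudioLibs/TfLiteAudioStream.h"
#include "model.h"

TfLiteSineReader tf_reader(20000, 0.3);   // Audio generation logic
TfLiteAudioStream tf_stream;              // Audio source
AnalogAudioStream analog_out;             // Audio destination: internal DAC
StreamCopy copier(analog_out, tf_stream); // copy tf_stream to analog_out

void setup() {
  Serial.begin(115200);

  // Setup TensorFlow input as before
  auto tcfg = tf_stream.defaultConfig();
  tcfg.channels = 1;
  tcfg.sample_rate = 16000;
  tcfg.kTensorArenaSize = 2 * 1024;
  tcfg.model = g_model;
  tcfg.input = &tf_reader;
  tf_stream.begin(tcfg);

  // Setup analog output instead of I2S
  auto cfg = analog_out.defaultConfig(TX_MODE);
  cfg.channels = 1;
  cfg.sample_rate = 16000;
  analog_out.begin(cfg);
}

void loop() { copier.copy(); }

Everything else stays the same: the copier pulls samples from the model and pushes them to the new sink.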
The TfLiteSineReader class
The heart of the processing is the TfLiteSineReader class, which is provided by the framework and is defined as follows:
class TfLiteSineReader : public TfLiteReader {
  public:
    TfLiteSineReader(int16_t range=32767, float increment=0.01){
      this->increment = increment;
      this->range = range;
    }

    virtual int read(TfLiteAudioStream *parent, int16_t* data, int sampleCount) {
      int channels = parent->config().channels;
      float two_pi = 2 * PI;
      // setup on first call
      if (p_interpreter == nullptr){
        p_interpreter = parent->interpreter();
        input = p_interpreter->input(0);
        output = p_interpreter->output(0);
      }
      for (int j = 0; j < sampleCount; j += channels){
        // Quantize the input from floating-point to integer
        input->data.int8[0] = TfQuantizer::quantize(actX, input->params.scale, input->params.zero_point);
        // Invoke the TF model
        TfLiteStatus invoke_status = p_interpreter->Invoke();
        if (invoke_status != kTfLiteOk){
          LOGE("Invoke failed");
          return 0;
        }
        // Dequantize the output and convert it to the int16 range
        data[j] = TfQuantizer::dequantizeToNewRange(output->data.int8[0], output->params.scale, output->params.zero_point, range);
        // Use the same value for all channels
        for (int i = 1; i < channels; i++){
          data[j+i] = data[j];
        }
        // Increment X and wrap it at 2*PI
        actX += increment;
        if (actX > two_pi){
          actX -= two_pi;
        }
      }
      return sampleCount;
    }

  protected:
    float increment;   // step size for x per generated sample
    float actX = 0.0f; // current model input x in [0, 2*PI)
    int16_t range;     // target amplitude of the generated samples
    tflite::MicroInterpreter* p_interpreter = nullptr;
    TfLiteTensor* input = nullptr;
    TfLiteTensor* output = nullptr;
};
As you can see, we just provide an array of int16_t data generated by the TensorFlow model!
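The TfQuantizer helpers used above wrap TensorFlow Lite’s standard affine int8 quantization, where real = scale * (quantized - zero_point). The exact implementation in the library may differ in details such as clamping, but conceptually the two calls boil down to something like this:

#include <cmath>
#include <cstdint>

// Quantize a float to int8 using the tensor's scale and zero point
int8_t quantize(float value, float scale, int zero_point) {
  int q = static_cast<int>(std::round(value / scale)) + zero_point;
  if (q < -128) q = -128; // clamp to the int8 range
  if (q > 127) q = 127;
  return static_cast<int8_t>(q);
}

// Dequantize an int8 to float (roughly in [-1, 1] for this model)
// and rescale it to the requested int16 amplitude
int16_t dequantizeToNewRange(int8_t value, float scale, int zero_point, int16_t range) {
  float f = scale * (value - zero_point);
  return static_cast<int16_t>(f * range);
}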
Summary
This is a pretty bad way to generate a sine tone, and the audio-tools library provides better ways to do this. However, the goal was to give a simple introduction as a stepping stone…
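For comparison, here is a sketch of the more direct approach, assuming the SineWaveGenerator and GeneratedSoundStream classes shown in the AudioTools examples (the begin() signatures may differ between library versions):

#include "AudioTools.h"

SineWaveGenerator<int16_t> sine_wave(32000);    // amplitude
GeneratedSoundStream<int16_t> sound(sine_wave); // Audio source
I2SStream i2s;                                  // Audio destination
StreamCopy copier(i2s, sound);                  // copy sound to i2s

void setup() {
  Serial.begin(115200);

  auto cfg = i2s.defaultConfig(TX_MODE);
  cfg.channels = 1;
  cfg.sample_rate = 16000;
  i2s.begin(cfg);

  sine_wave.begin(cfg.channels, cfg.sample_rate, N_B4); // note B4 (~494 Hz)
}

void loop() { copier.copy(); }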
Dependencies
- Arduino Audio Tools
- tflite-micro-arduino-examples
- arduino-audiokit – Optional if you use an AudioKit board
GitHub
The full example can be found on GitHub.
1 Comment
Stuart Naylor · 24 October 2022 at 12:46
It would be interesting to see if you could build a KWS on the newer esp32-s3 and get the new vector instructions and beefed up capability.
A CNN or DS-CNN is a likely model due to the lack of recurrent LSTM or GRU layers.
https://github.com/google-research/google-research/tree/master/kws_streaming has a really good framework that allows model creation specifically for the tflite4micro frontend.
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/experimental/microfrontend
https://github.com/42io has some interesting repos for KWS, but it would be really interesting to push the newer ML capability of the S3 with a broadcast after KW KWS.