The goal of this blog is to give a quick introduction to using TensorFlow Lite for Microcontrollers to create audio output with the help of the audio-tools library.
Hello World
The starting point is the good overview provided by the “Hello World” example of TensorFlow Lite, which describes how to create, train and use a model based on the sine function.
Converting to Audio
We can use this model to output the result as a tone with the help of the following sketch:
#include "AudioTools.h"
#include "AudioLibs/TfLiteAudioStream.h"
#include "model.h"
TfLiteSineReader tf_reader(20000,0.3); // Audio generation logic
TfLiteAudioStream tf_stream; // Audio source -> no classification so N is 0
I2SStream i2s; // Audio destination
StreamCopy copier(i2s, tf_stream); // copy tf_stream to i2s
int channels = 1;
int samples_per_second = 16000;
void setup() {
Serial.begin(115200);
AudioLogger::instance().begin(Serial, AudioLogger::Warning);
// Setup tensorflow input
auto tcfg = tf_stream.defaultConfig();
tcfg.channels = channels;
tcfg.sample_rate = samples_per_second;
tcfg.kTensorArenaSize = 2 * 1024;
tcfg.model = g_model;
tcfg.input = &tf_reader;
tf_stream.begin(tcfg);
// setup Audioi2s output
auto cfg = i2s.defaultConfig(TX_MODE);
cfg.channels = channels;
cfg.sample_rate = samples_per_second;
i2s.begin(cfg);
}
void loop() { copier.copy(); }
Like in any other audio sketch, we copy the audio data from the source (TfLiteAudioStream) to the sink (I2SStream).
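Since the copy pattern does not care what the sink is, you can swap the I2SStream for any other output stream. Here is a minimal variation, assuming the AnalogAudioStream class from the AudioTools library (which drives the ESP32’s built-in DAC and follows the same defaultConfig()/begin() pattern):

#include "AudioTools.h"
#include "AudioLibs/TfLiteAudioStream.h"
#include "model.h"

TfLiteSineReader tf_reader(20000, 0.3);   // Audio generation logic
TfLiteAudioStream tf_stream;              // Audio source
AnalogAudioStream analog_out;             // Audio destination: internal DAC
StreamCopy copier(analog_out, tf_stream); // copy tf_stream to analog_out

void setup() {
  Serial.begin(115200);

  // Setup TensorFlow input as before
  auto tcfg = tf_stream.defaultConfig();
  tcfg.channels = 1;
  tcfg.sample_rate = 16000;
  tcfg.kTensorArenaSize = 2 * 1024;
  tcfg.model = g_model;
  tcfg.input = &tf_reader;
  tf_stream.begin(tcfg);

  // Setup analog output instead of I2S
  auto cfg = analog_out.defaultConfig(TX_MODE);
  cfg.channels = 1;
  cfg.sample_rate = 16000;
  analog_out.begin(cfg);
}

void loop() { copier.copy(); }

Everything else stays the same: the copier pulls samples from the model and pushes them to the new sink.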
The TfLiteSineReader class
The heart of the processing is the TfLiteSineReader class, which is provided by the framework and is defined as follows:
class TfLiteSineReader : public TfLiteReader {
  public:
    TfLiteSineReader(int16_t range=32767, float increment=0.01){
      this->increment = increment;
      this->range = range;
    }

    virtual int read(TfLiteAudioStream *parent, int16_t* data, int sampleCount) {
      int channels = parent->config().channels;
      float two_pi = 2 * PI;
      // setup on first call
      if (p_interpreter == nullptr){
        p_interpreter = parent->interpreter();
        input = p_interpreter->input(0);
        output = p_interpreter->output(0);
      }
      for (int j = 0; j < sampleCount; j += channels){
        // Quantize the input from floating-point to integer
        input->data.int8[0] = TfQuantizer::quantize(actX, input->params.scale, input->params.zero_point);
        // Invoke the TF model
        TfLiteStatus invoke_status = p_interpreter->Invoke();
        if (invoke_status != kTfLiteOk){
          LOGE("Invoke failed");
          return 0;
        }
        // Dequantize the output and convert it to the int16 range
        data[j] = TfQuantizer::dequantizeToNewRange(output->data.int8[0], output->params.scale, output->params.zero_point, range);
        // Use the same value for all channels
        for (int i = 1; i < channels; i++){
          data[j+i] = data[j];
        }
        // Increment X and wrap it at 2*PI
        actX += increment;
        if (actX > two_pi){
          actX -= two_pi;
        }
      }
      return sampleCount;
    }

  protected:
    float increment;   // step size for x per generated sample
    float actX = 0.0f; // current model input x in [0, 2*PI)
    int16_t range;     // target amplitude of the generated samples
    tflite::MicroInterpreter* p_interpreter = nullptr;
    TfLiteTensor* input = nullptr;
    TfLiteTensor* output = nullptr;
};
As you can see, we just provide an array of int16_t data generated by the TensorFlow model!
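The TfQuantizer helpers used above wrap TensorFlow Lite’s standard affine int8 quantization, where real = scale * (quantized - zero_point). The exact implementation in the library may differ in details such as clamping, but conceptually the two calls boil down to something like this:

#include <cmath>
#include <cstdint>

// Quantize a float to int8 using the tensor's scale and zero point
int8_t quantize(float value, float scale, int zero_point) {
  int q = static_cast<int>(std::round(value / scale)) + zero_point;
  if (q < -128) q = -128; // clamp to the int8 range
  if (q > 127) q = 127;
  return static_cast<int8_t>(q);
}

// Dequantize an int8 to float (roughly in [-1, 1] for this model)
// and rescale it to the requested int16 amplitude
int16_t dequantizeToNewRange(int8_t value, float scale, int zero_point, int16_t range) {
  float f = scale * (value - zero_point);
  return static_cast<int16_t>(f * range);
}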
Summary
This is a pretty bad way to generate a sine tone, and the audio-tools library provides better ways to do this. However, the goal was to give a simple introduction as a stepping stone…
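For comparison, here is a sketch of the more direct approach, assuming the SineWaveGenerator and GeneratedSoundStream classes shown in the AudioTools examples (the begin() signatures may differ between library versions):

#include "AudioTools.h"

SineWaveGenerator<int16_t> sine_wave(32000);    // amplitude
GeneratedSoundStream<int16_t> sound(sine_wave); // Audio source
I2SStream i2s;                                  // Audio destination
StreamCopy copier(i2s, sound);                  // copy sound to i2s

void setup() {
  Serial.begin(115200);

  auto cfg = i2s.defaultConfig(TX_MODE);
  cfg.channels = 1;
  cfg.sample_rate = 16000;
  i2s.begin(cfg);

  sine_wave.begin(cfg.channels, cfg.sample_rate, N_B4); // note B4 (~494 Hz)
}

void loop() { copier.copy(); }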
Dependencies
- Arduino Audio Tools
- tflite-micro-arduino-examples
- arduino-audiokit – Optional if you use an AudioKit board
GitHub
The full example can be found on GitHub.
1 Comment
Stuart Naylor · 24 October 2022 at 12:46
It would be interesting to see if you could build a KWS on the newer esp32-s3 and get the new vector instructions and beefed up capability.
A CNN or DS-CNN is a likely model due to the lack of recurrent LSTM or GRU layers.
https://github.com/google-research/google-research/tree/master/kws_streaming has a really good framework that allows model creation specifically for the tflite4micro frontend.
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/experimental/microfrontend
https://github.com/42io has some interesting repos for KWS, but it would be really interesting to push the newer ML capability of the S3 with a broadcast after KW KWS.