A new Architecture for Talkie: TalkiePCM

There are quite a few popular TTS libraries for Arduino but most of them suffer from the same problem: The TTS function is not properly separated from the output function.

Architecture

In a properly architected solution we would have two separate parts:

A TTS function which produces platform-independent PCM data
An output function or library to process the PCM data

In Arduino we have the abstract Print class, which is implemented by everything that you can output data to, and the abstract Stream class, which inherits from Print and adds the functionality to read data. Good examples of Streams are the File class and the HardwareSerial class (used by the Serial object).

So a flexible and portable way to define the output of a TTS function is to give it an instance of a class that implements Print, to which it can write the generated PCM samples. This supports, for example, output to

Serial
Files
I2S (if your platform supports it)
and many more
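
To make this separation concrete, here is a minimal sketch in plain C++ that can be compiled off-device. ByteSink is a hypothetical stand-in for Arduino's Print (only a write(buffer, length) method is modeled), and VectorSink and SampleSource are illustrative names that are not part of Talkie or AudioTools. The point is only that the producer never needs to know where the bytes end up.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <vector>

// Hypothetical stand-in for Arduino's Print: anything that accepts bytes.
struct ByteSink {
  virtual ~ByteSink() = default;
  virtual size_t write(const uint8_t* data, size_t len) = 0;
};

// Example sink (assumption for illustration): collects the bytes in memory.
struct VectorSink : ByteSink {
  std::vector<uint8_t> bytes;
  size_t write(const uint8_t* data, size_t len) override {
    bytes.insert(bytes.end(), data, data + len);
    return len;
  }
};

// Illustrative PCM producer: it only knows the abstract sink.
class SampleSource {
 public:
  explicit SampleSource(ByteSink& out) : out_(out) {}

  // Writes one 16-bit sample in native byte order; returns the bytes written.
  size_t emit(int16_t sample) {
    return out_.write(reinterpret_cast<const uint8_t*>(&sample), sizeof(sample));
  }

 private:
  ByteSink& out_;
};
```

Swapping VectorSink for a sink that writes to a file or a serial port would not require any change to SampleSource, which is the property we want from the TTS side.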

My AudioTools library is the perfect fit for the second part. You can e.g. output the audio

with the help of PWM
to the internal DAC
to an external DAC
to a Bluetooth Speaker
to Serial as CSV or hex data
to the network using different protocols
and many more

The Talkie Library

I wanted to extend the quite popular Talkie TTS library to provide PCM data, so that we can send it e.g. to a Bluetooth speaker.

Unfortunately, I was looking at a big mess of #ifdefs all over the place that was impossible to untangle. So I decided to go back to the original Talkie library from going-digital: here there was at least a chance to understand what’s going on, because only one platform was supported and the code was quite well structured. Even so, it took me far too long to figure out how the timer callback and the sample generation work together.

After an embarrassingly long time, I managed to grok the inner workings, and after restructuring the code a bit, I finally got my PCM generation working.

The generated audio is 16 bits with a sampling rate of 8000 Hz, and you can define how many channels you want to generate. E.g. for I2S, which is a stereo protocol, you can just generate data on 2 channels.
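
As a rough illustration of what this format means for a sink, the sketch below computes the raw byte rate and shows one way a mono sample can be duplicated across channels. The names PcmFormat, bytesPerSecond and interleave are hypothetical and not part of TalkiePCM, and copying the same sample into every channel is an assumption about how multi-channel output could be filled.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <vector>

// Hypothetical description of a PCM format (not a TalkiePCM type).
struct PcmFormat {
  uint32_t sampleRate;     // frames per second
  uint16_t channels;       // samples per frame
  uint16_t bitsPerSample;  // e.g. 16
};

// Raw PCM bytes produced per second for the given format.
constexpr uint32_t bytesPerSecond(const PcmFormat& f) {
  return f.sampleRate * f.channels * (f.bitsPerSample / 8u);
}

// Assumption: each mono sample is copied into every channel of a frame.
std::vector<int16_t> interleave(const std::vector<int16_t>& mono,
                                uint16_t channels) {
  std::vector<int16_t> frames;
  frames.reserve(mono.size() * channels);
  for (int16_t sample : mono) {
    for (uint16_t ch = 0; ch < channels; ++ch) {
      frames.push_back(sample);
    }
  }
  return frames;
}
```

For 8000 Hz, 2 channels and 16 bits this gives 8000 * 2 * 2 = 32000 bytes per second, which is a small enough data rate for most of the sinks listed above.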

I decided to roll my own version of the Library and I called it TalkiePCM to avoid any naming conflicts with the existing libraries. I also added a CMakeLists.txt to make it usable outside of Arduino.

An Example Arduino Sketch

Here is a simple example sketch:

#include "AudioTools.h"
#include "AudioTools/AudioLibs/AudioBoardStream.h" 
#include "TalkiePCM.h" 
#include "Vocab_US_Large.h"

const AudioInfo info(8000, 2, 16);
AudioBoardStream out(AudioKitEs8388V1);  // Audio sink
//CsvOutput<int16_t> out(Serial); // alternative: output as CSV on screen
TalkiePCM voice(out, info.channels);

void setup() {
  Serial.begin(115200);
  AudioLogger::instance().begin(Serial, AudioLogger::Info);
  // setup AudioKit
  auto cfg = out.defaultConfig();
  cfg.copyFrom(info);
  out.begin(cfg);

  Serial.println("Talking...");
}

void loop() {
  voice.say(sp2_DANGER);
  voice.say(sp2_DANGER);
  voice.say(sp2_RED);
  voice.say(sp2_ALERT);
  voice.say(sp2_MOTOR);
  voice.say(sp2_IS);
  voice.say(sp2_ON);
  voice.say(sp2_FIRE);
  voice.silence(1000);
}

1) We define the output object out, an AudioBoardStream which uses the AudioKitEs8388V1 driver; you can replace this with any supported audio sink class (e.g. I2SStream).
2) We define a TalkiePCM object, passing it the output object above and telling it to generate stereo audio.
3) In setup() we open the AudioBoardStream with the audio format defined in info.
4) In loop() we generate the PCM data with the help of the say() method, which automatically writes the samples to the assigned AudioBoardStream.

Dependencies

For this example, you need to have the following libraries installed:

arduino-audio-tools to output the Audio
arduino-audio-driver to support the AudioKit
TalkiePCM Library to generate the audio from the text

Published by pschatzmann on 21. October 2024
