In my last blog posts I looked at SAM and Arduino/TTS. I had high hopes for CMU Flite:
CMU Flite (festival-lite) is a small, fast run-time open source text to speech synthesis engine developed at CMU and primarily designed for small embedded machines and/or large servers. Flite is designed as an alternative text to speech synthesis engine to Festival for voices built using the FestVox suite of voice building tools.
I also extended the project to provide a simple API and added some additional output options, so that I can receive the audio data as a stream. My extended project can be found on GitHub.
As with SAM and TTS, the Arduino sketch for the web server is equally small (thanks to my arduino-audio-tools):
#include "flite_arduino.h"
#include "AudioServer.h"

using namespace audio_tools;

AudioWAVServer server("ssid", "password");

// Callback which provides the audio data
void outputData(Stream &out) {
  Serial.print("providing data...");
  Flite flite(out);
  flite.say("Hallo, my name is Alice");
}

void setup() {
  Serial.begin(115200);
  // start data sink
  server.begin(outputData, 8000, 1, 16);
}

// Arduino loop
void loop() {
  // Handle new connections
  server.doLoop();
}
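To get a feeling for how much data the server has to push out, here is a small sketch of the arithmetic. It assumes that the arguments 8000, 1, 16 passed to server.begin() are the sample rate in Hz, the number of channels and the bits per sample; the helper names pcmBytesPerSecond and pcmBytes are made up for this illustration and are not part of Flite or arduino-audio-tools.

```cpp
#include <cstdint>

// Hypothetical helper (not part of any library): bytes of raw PCM audio
// produced per second for a given format.
constexpr uint32_t pcmBytesPerSecond(uint32_t sampleRate,
                                     uint32_t channels,
                                     uint32_t bitsPerSample) {
  return sampleRate * channels * (bitsPerSample / 8);
}

// Hypothetical helper: bytes of raw PCM audio for a clip of the given
// length in milliseconds. 64-bit math avoids overflow on long clips.
constexpr uint32_t pcmBytes(uint32_t sampleRate, uint32_t channels,
                            uint32_t bitsPerSample, uint32_t milliseconds) {
  return static_cast<uint32_t>(
      static_cast<uint64_t>(pcmBytesPerSecond(sampleRate, channels, bitsPerSample)) *
      milliseconds / 1000);
}

// Assumed format from the sketch: 8000 Hz, mono, 16 bit.
static_assert(pcmBytesPerSecond(8000, 1, 16) == 16000,
              "8 kHz mono 16-bit audio is 16000 bytes per second");
```

Under these assumptions a two second sentence is only about 32 kB of audio, so the network side is not the bottleneck here.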
I was not disappointed: this is the best voice quality so far.
But it comes at the cost of size:
Sketch uses 2730326 bytes (86%) of program storage space. Maximum is 3145728 bytes.
Global variables use 38956 bytes (11%) of dynamic memory, leaving 288724 bytes for local variables. Maximum is 327680 bytes.
This might be just at the limit for an ESP32, but it is already too much for a Raspberry Pi Pico…
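The numbers from the build output can be checked with a few lines of arithmetic. This is only a rough sketch: the constant and function names are made up for illustration, the 2 MB figure assumes a standard Raspberry Pi Pico board, and the binary was built for the ESP32, so a build for the Pico would not have exactly the same size.

```cpp
#include <cstdint>

// Values copied from the ESP32 build output above.
constexpr uint32_t kSketchBytes = 2730326;
constexpr uint32_t kProgramStorageBytes = 3145728;
constexpr uint32_t kGlobalBytes = 38956;
constexpr uint32_t kDynamicMemoryBytes = 327680;

// Assumption: a standard Raspberry Pi Pico with 2 MB of on-board flash.
constexpr uint32_t kPicoFlashBytes = 2u * 1024u * 1024u;

// Whole-number percentage, rounded down as in the build output.
constexpr uint32_t percentUsed(uint32_t used, uint32_t total) {
  return static_cast<uint32_t>(static_cast<uint64_t>(used) * 100 / total);
}

// True if a binary of the given size fits into the available space.
constexpr bool fits(uint32_t used, uint32_t total) {
  return used <= total;
}

static_assert(percentUsed(kSketchBytes, kProgramStorageBytes) == 86, "flash usage");
static_assert(percentUsed(kGlobalBytes, kDynamicMemoryBytes) == 11, "RAM usage");
static_assert(kDynamicMemoryBytes - kGlobalBytes == 288724, "RAM left for locals");
static_assert(!fits(kSketchBytes, kPicoFlashBytes), "larger than the Pico flash");
```

So even before any file system or other data is placed in flash, a binary of this size is bigger than the whole flash chip of the Pico.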