In my introductory blog post, I mentioned that it would be cool to remove the silence from the recorded samples. If we look at one of the recordings, we can see that it starts and ends with some silence.

I have created a SilenceRemovalConverter which does exactly this. The first parameter is the number of samples we consider (if the last n samples were silence, we remove the sample); the second is the threshold value under which a sample is treated as silence. Here is an Arduino sketch that shows how to use it:

#include "SimpleTTS.h"
#include "AudioCodecs/CodecMP3Helix.h"
#include "Desktop.h"

I2SStream i2s;
SilenceRemovalConverter<int16_t> rem(8, 2);
ConvertedStream<int16_t,SilenceRemovalConverter<int16_t>> out(i2s, rem); 

MP3DecoderHelix mp3;
AudioDictionary dictionary(ExampleAudioDictionaryValues);
NumberUnitToText utt;
TextToSpeech tts(utt, out, mp3, dictionary);

double number = 1.1;

void setup(){
    Serial.begin(115200);
    AudioLogger::instance().begin(Serial, AudioLogger::Info);
    // setup out
    auto cfg = i2s.defaultConfig(); 
    cfg.sample_rate = 24000;
    cfg.channels = 1;
    i2s.begin(cfg);
}

void increment() {
    number +=1;
}

void loop() {
    // speech output
    utt.say(number, "usd");

    increment();
    delay(1000);
}

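If we run the sketch, we can see in the log that it is actually working. Each "filtered silence from X -> Y" line reports the size of a block before and after filtering, so "1152 -> 0" means the whole block was dropped as silence and "1152 -> 1152" means it passed through unchanged: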

11:26:06.707 -> [I] AudioCopy.h : 121 - StreamCopy::copy 1024 -> 1024 -> 1024 bytes - in 1 hops
11:26:06.707 -> [I] Converter.h : 676 - filtered silence from 1152 -> 0
11:26:06.741 -> [I] Converter.h : 676 - filtered silence from 1152 -> 0
11:26:06.741 -> [I] Converter.h : 676 - filtered silence from 1152 -> 0
11:26:06.741 -> [I] Converter.h : 676 - filtered silence from 1152 -> 0
11:26:06.776 -> [I] Converter.h : 676 - filtered silence from 1152 -> 0
11:26:06.776 -> [I] Converter.h : 676 - filtered silence from 1152 -> 0
11:26:06.813 -> [I] Converter.h : 676 - filtered silence from 1152 -> 0
11:26:06.813 -> [I] Converter.h : 676 - filtered silence from 1152 -> 0
11:26:06.813 -> [I] Converter.h : 676 - filtered silence from 1152 -> 0
11:26:06.850 -> [I] Converter.h : 676 - filtered silence from 1152 -> 0
11:26:06.850 -> [I] Converter.h : 676 - filtered silence from 1152 -> 458
11:26:06.850 -> [I] AudioCopy.h : 121 - StreamCopy::copy 1024 -> 1024 -> 1024 bytes - in 1 hops
11:26:06.850 -> [I] Converter.h : 676 - filtered silence from 1152 -> 1152
11:26:06.888 -> [I] Converter.h : 676 - filtered silence from 1152 -> 1152
11:26:06.888 -> [I] Converter.h : 676 - filtered silence from 1152 -> 1152
11:26:06.922 -> [I] Converter.h : 676 - filtered silence from 1152 -> 1152
11:26:06.922 -> [I] Converter.h : 676 - filtered silence from 1152 -> 1152
11:26:06.957 -> [I] Converter.h : 676 - filtered silence from 1152 -> 1152
11:26:06.957 -> [I] Converter.h : 676 - filtered silence from 1152 -> 1152
11:26:06.957 -> [I] Converter.h : 676 - filtered silence from 1152 -> 1152
11:26:06.990 -> [I] Converter.h : 676 - filtered silence from 1152 -> 1152
11:26:06.990 -> [I] Converter.h : 676 - filtered silence from 1152 -> 1152
11:26:07.026 -> [I] AudioCopy.h : 121 - StreamCopy::copy 1024 -> 1024 -> 1024 bytes - in 1 hops
11:26:07.026 -> [I] Converter.h : 676 - filtered silence from 1152 -> 1152
11:26:07.026 -> [I] Converter.h : 676 - filtered silence from 1152 -> 1152
11:26:07.059 -> [I] Converter.h : 676 - filtered silence from 1152 -> 1152
11:26:07.059 -> [I] Converter.h : 676 - filtered silence from 1152 -> 1152
11:26:07.093 -> [I] Converter.h : 676 - filtered silence from 1152 -> 1152
11:26:07.093 -> [I] Converter.h : 676 - filtered silence from 1152 -> 1152
11:26:07.126 -> [I] Converter.h : 676 - filtered silence from 1152 -> 1152
11:26:07.126 -> [I] Converter.h : 676 - filtered silence from 1152 -> 1152
11:26:07.126 -> [I] Converter.h : 676 - filtered silence from 1152 -> 1152
11:26:07.160 -> [I] Converter.h : 676 - filtered silence from 1152 -> 1152
11:26:07.160 -> [I] Converter.h : 676 - filtered silence from 1152 -> 1152
11:26:07.193 -> [I] AudioCopy.h : 121 - StreamCopy::copy 992 -> 992 -> 992 bytes - in 1 hops
11:26:07.193 -> [I] TextToSpeech.h : 63 - say: dollars
11:26:07.193 -> [I] AudioCopy.h : 74 - buffer_size=1024
11:26:07.193 -> [I] Converter.h : 676 - filtered silence from 1152 -> 1152
11:26:07.193 -> [I] Converter.h : 676 - filtered silence from 1152 -> 1152
11:26:07.231 -> [I] Converter.h : 676 - filtered silence from 1152 -> 1152
11:26:07.231 -> [I] Converter.h : 676 - filtered silence from 1152 -> 1152
11:26:07.265 -> [I] Converter.h : 676 - filtered silence from 1152 -> 1152
11:26:07.298 -> [I] Converter.h : 676 - filtered silence from 1152 -> 72
11:26:07.333 -> [I] Converter.h : 676 - filtered silence from 1152 -> 0
11:26:07.333 -> [I] Converter.h : 676 - filtered silence from 1152 -> 0
11:26:07.333 -> [I] Converter.h : 676 - filtered silence from 1152 -> 0
11:26:07.367 -> [I] Converter.h : 676 - filtered silence from 1152 -> 0
11:26:07.367 -> [I] AudioCopy.h : 121 - StreamCopy::copy 1024 -> 1024 -> 1024 bytes - in 1 hops
11:26:07.367 -> [I] Converter.h : 676 - filtered silence from 1152 -> 0
11:26:07.400 -> [I] Converter.h : 676 - filtered silence from 1152 -> 0
11:26:07.400 -> [I] Converter.h : 676 - filtered silence from 1152 -> 0
11:26:07.433 -> [I] Converter.h : 676 - filtered silence from 1152 -> 0
11:26:07.433 -> [I] Converter.h : 676 - filtered silence from 1152 -> 230
11:26:07.433 -> [I] Converter.h : 676 - filtered silence from 1152 -> 0
11:26:07.466 -> [I] Converter.h : 676 - filtered silence from 1152 -> 34
11:26:07.466 -> [I] Converter.h : 676 - filtered silence from 1152 -> 946
11:26:07.502 -> [I] Converter.h : 676 - filtered silence from 1152 -> 1150
11:26:07.502 -> [I] Converter.h : 676 - filtered silence from 1152 -> 1152
11:26:07.502 -> [I] Converter.h : 676 - filtered silence from 1152 -> 1152
11:26:07.537 -> [I] AudioCopy.h : 121 - StreamCopy::copy 1024 -> 1024 -> 1024 bytes - in 1 hops
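
To make the two constructor parameters a bit more concrete, here is a minimal standalone sketch of the idea in plain C++. It is not the library's actual implementation: removeSilence is a hypothetical helper, and I am assuming that a sample counts as silence when its absolute value is at or below the threshold, and that it is dropped once it and the n-1 samples before it were all silent.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical helper (not part of the library): returns a copy of `in`
// without the samples that belong to a long enough run of silence.
// Assumption: a sample is "quiet" when |sample| <= threshold, and it is
// dropped once the run of quiet samples, counting itself, has reached n.
std::vector<int16_t> removeSilence(const std::vector<int16_t>& in,
                                   std::size_t n, int threshold) {
    std::vector<int16_t> out;
    std::size_t quietRun = 0;  // consecutive quiet samples seen so far
    for (int16_t s : in) {
        // widen to int first so that -32768 does not overflow in abs()
        bool quiet = std::abs(static_cast<int>(s)) <= threshold;
        quietRun = quiet ? quietRun + 1 : 0;
        // loud samples are always kept; quiet ones only while the run is short
        if (!quiet || quietRun < n) {
            out.push_back(s);
        }
    }
    return out;
}
```

With n = 3 and threshold = 2, the input {0, 0, 0, 0, 0, 100, 0, -200, 0, 0, 0, 0} comes back as {0, 0, 100, 0, -200, 0, 0}: the single quiet sample between the two loud ones survives, while the longer quiet runs at the start and end are cut down. This is the reason for looking at n samples instead of just one: a waveform passes through zero all the time, and we do not want to remove those samples.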

