Disco Narrator - Data Formatting

152334H 3220 words 16 minutes published on September 25, 2022 included in Tech

With the raw data in tow, we can construct a proper TTS Dataset with the use of a few Python scripts.

Phonecoding

152334H 1014 words 5 minutes published on September 18, 2022 included in Tech

Back when 2019’s Advent of Code was going on, I started a habit of what I now call phonecoding – the act of programming using a run-of-the-mill Android smartphone.

Disco Narrator - Data Scraping

152334H 1208 words 6 minutes published on September 11, 2022 included in Tech

To train an AI Text-To-Speech (TTS) model, we’ll need to obtain a Labelled Dataset with two things:

Clean audio files, containing only the voice we’re cloning
The dialogue transcript (text) for each audio file

Append a newline to your shell prompt

152334H 139 words One minute published on September 4, 2022 included in Tech

The default shell prompt (for Ubuntu 20) looks like this:

username@hostname:/path/to/cwd$ █

@cache without @cache

152334H 801 words 4 minutes published on August 28, 2022 included in Tech

Memoization is part of the standard toolkit of “things I can use to solve the algorithm question in my next job interview”. Most of the time, I like to use functools.cache for this:
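As a minimal sketch of that pattern (the `fib` function here is a hypothetical illustration, not taken from the original post, and assumes Python 3.9 or newer, where `functools.cache` is available):

```python
from functools import cache


@cache  # remembers the result for every distinct argument seen so far
def fib(n: int) -> int:
    """Return the n-th Fibonacci number, with fib(0) == 0 and fib(1) == 1."""
    if n < 2:
        return n
    # Without the cache this recursion takes exponential time;
    # with it, each value of n is computed only once.
    return fib(n - 1) + fib(n - 2)


print(fib(50))  # 12586269025
```

The decorator stores results in an unbounded dictionary keyed on the call arguments, so the arguments have to be hashable.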