November 15, 2019·Jan Tyl·1 min read·Archive 2019

New AI Clones Your Voice from Just 5 Seconds of Audio Recording!

New research presents an AI system for text-to-speech (TTS) synthesis that can speak in the voice of a person it has heard for only a few seconds. As is usual for this kind of work, it is built on neural networks. It consists of three main components:

A speaker encoder network, trained on thousands of speakers, which is how the system learns what a human voice sounds like. From a few seconds of reference speech it produces a fixed-size embedding of the target voice.
A sequence-to-sequence synthesis network based on Tacotron 2, which generates a mel spectrogram from text, conditioned on the speaker embedding.
An auto-regressive vocoder based on WaveNet, which converts the spectrogram into a sequence of waveform samples.
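
To make the data flow between the three components concrete, here is a minimal Python sketch. It is a toy illustration only, not the implementation from the paper: every function name, constant, and computation below is a hypothetical stand-in chosen for readability, and none of the stubs contains a neural network. The only thing it is meant to show is what each stage takes in and hands on: the reference audio becomes a fixed-size vector describing the voice (a speaker embedding), text plus that vector become spectrogram frames, and the frames become waveform samples.

```python
from typing import List, Sequence

# Toy sizes, assumed for illustration only; the real models are far larger.
EMBED_DIM = 4   # length of the speaker embedding
N_MELS = 3      # spectrogram channels per frame
HOP = 2         # waveform samples produced per spectrogram frame


def speaker_encoder(reference_audio: Sequence[float]) -> List[float]:
    """Stand-in for the speaker encoder: audio of any length -> fixed-size vector."""
    if not reference_audio:
        raise ValueError("reference audio is empty")
    chunk = max(1, len(reference_audio) // EMBED_DIM)
    embedding = []
    for i in range(EMBED_DIM):
        part = list(reference_audio[i * chunk:(i + 1) * chunk]) or [0.0]
        embedding.append(sum(part) / len(part))  # placeholder "feature"
    return embedding


def synthesizer(text: str, speaker_embedding: Sequence[float]) -> List[List[float]]:
    """Stand-in for the Tacotron 2 based network: text + embedding -> spectrogram frames."""
    bias = sum(speaker_embedding) / len(speaker_embedding)
    # One fake frame per character, shifted by the speaker-dependent bias.
    return [[(ord(ch) % 32) / 32.0 + bias] * N_MELS for ch in text]


def vocoder(spectrogram: Sequence[Sequence[float]]) -> List[float]:
    """Stand-in for the WaveNet-based vocoder: spectrogram frames -> waveform samples.

    Auto-regressive only in a loose sense: each sample depends on the previous one.
    """
    samples: List[float] = []
    previous = 0.0
    for frame in spectrogram:
        target = sum(frame) / len(frame)
        for _ in range(HOP):
            previous = 0.5 * previous + 0.5 * target
            samples.append(previous)
    return samples


def clone_voice(text: str, reference_audio: Sequence[float]) -> List[float]:
    """Chain the three stages in the order described above."""
    embedding = speaker_encoder(reference_audio)
    spectrogram = synthesizer(text, embedding)
    return vocoder(spectrogram)
```

Calling `clone_voice("hi", reference_audio)` with any non-empty list of floats returns a list of samples. Because the embedding is computed once and passed to the synthesizer as a separate input, the same text can be rendered with a different reference recording without touching the other two stages, which is the property that makes cloning from a short clip possible.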

More information can be found in the links.

Demonstration and basic explanation: https://www.youtube.com/watch?v=0sR1rU3gLzQ

Paper: https://arxiv.org/abs/1806.04558

Originally published on Facebook.