New AI Clones Your Voice from Just 5 Seconds of Audio Recording!

New research introduces a text-to-speech (TTS) system that can speak in a voice it has heard for only a few seconds. As is common for this kind of task, it is built on neural networks. On closer inspection, it consists of three main components:
- A speaker encoder network, trained on thousands of speakers. This is how the system learns what a human voice sounds like, and it turns a short reference recording into a compact description of the speaker's voice.
- A sequence-to-sequence synthesis network based on Tacotron 2, which generates a spectrogram from text, conditioned on that voice description.
- An autoregressive vocoder based on WaveNet, which converts the spectrogram into a waveform, that is, a sequence of audio samples.
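To make the data flow between the three components concrete, here is a minimal sketch in Python. It is not the method from the paper: each function is a toy stand-in for a trained neural network. All names and constants (`speaker_encoder`, `synthesizer`, `vocoder`, `clone_voice`, `EMBED_DIM`, `N_MELS`, `HOP`) are hypothetical and chosen only for illustration. The sketch shows only which component consumes what and what it hands to the next one.

```python
import math
from typing import List

# Hypothetical toy sizes; real models use far larger values.
EMBED_DIM = 4   # length of the speaker embedding
N_MELS = 8      # channels per spectrogram frame
HOP = 16        # audio samples produced per spectrogram frame


def speaker_encoder(reference_audio: List[float]) -> List[float]:
    """Toy stand-in: audio of any length -> fixed-size, unit-length vector."""
    chunk = max(1, len(reference_audio) // EMBED_DIM)
    raw = [
        sum(abs(x) for x in reference_audio[i * chunk:(i + 1) * chunk]) / chunk
        for i in range(EMBED_DIM)
    ]
    norm = math.sqrt(sum(v * v for v in raw)) or 1.0
    return [v / norm for v in raw]


def synthesizer(text: str, embedding: List[float]) -> List[List[float]]:
    """Toy stand-in: text + speaker embedding -> list of spectrogram frames.

    Emits one frame per character, which a real synthesizer does not do.
    """
    frames = []
    for ch in text:
        base = (ord(ch) % 32) / 32.0
        frames.append([base + embedding[m % EMBED_DIM] for m in range(N_MELS)])
    return frames


def vocoder(spectrogram: List[List[float]]) -> List[float]:
    """Toy stand-in: spectrogram -> audio samples, generated autoregressively.

    Each new sample depends on the previous sample and the current frame.
    """
    samples = []
    prev = 0.0
    for frame in spectrogram:
        energy = sum(frame) / len(frame)
        for _ in range(HOP):
            prev = math.tanh(0.5 * prev + 0.1 * energy)
            samples.append(prev)
    return samples


def clone_voice(reference_audio: List[float], text: str) -> List[float]:
    """Chain the three stages: encoder -> synthesizer -> vocoder."""
    embedding = speaker_encoder(reference_audio)
    spectrogram = synthesizer(text, embedding)
    return vocoder(spectrogram)


# 5 seconds of a 220 Hz tone at 16 kHz stands in for the reference recording.
reference = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(5 * 16000)]
audio = clone_voice(reference, "hello")
print(len(audio))  # 5 characters * 16 samples per frame = 80
```

The point of the split is that the speaker encoder runs once per voice, while the synthesizer and vocoder run for every new sentence, so the same few seconds of reference audio can be reused for any text.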
More information can be found in the links.
Demonstration and basic explanation: https://www.youtube.com/watch?v=0sR1rU3gLzQ
Paper: https://arxiv.org/abs/1806.04558
Originally published on Facebook.