Cloning the voice and speech of Piotr Fronczewski for Polish speech synthesis
Abstract:
The quality of synthetically generated speech has improved significantly in recent years, largely owing to advances in speech synthesis systems, in particular those based on deep neural networks (DNNs). However, conveying emotion in synthesized speech remains a challenge. Most existing speech synthesis systems do not convey the emotional context that pervades human-human interaction, and this lack of expression limits their emotional intelligence. This work aimed to develop a recording method for preparing a balanced corpus of emotional recordings in the Polish language for use in speech synthesis based on artificial intelligence (AI) algorithms. An essential aspect of the work was selecting a voice-over artist whose recordings would capture the full spectrum of an actor's voice, emphasizing the actor's interpretation and the emotions derived from the content. The outstanding actor Piotr Fronczewski was chosen for the role.