Speech synthesis with voice transformations

Alexander Sorin

Abstract

A text-to-speech synthesis (TTS) system can speak in only a few voices, each derived from audio recordings of a real person. TTS voice transformations that change the perceived speaker identity in a controllable way are an attractive alternative to the expensive, lengthy, and labor-intensive recording and processing of new speech datasets. Foreseen entertainment applications in particular will require multitudes of distinct TTS voices to be created on demand, which makes voice transformation the only viable option.

I will present the state of the art in TTS voice transformation research and our work on endowing a product-level TTS system with instant, externally configurable voice transformation capabilities.

Speaker

Alexander Sorin is a senior researcher in the Speech Technologies group at IBM Haifa Research Lab (HRL). He has authored numerous articles and holds 7 patents. He received his M.Sc. degree in Applied Mathematics from the Automation and Computers Department of Moscow Oil and Gas Institute, USSR, in 1979. Since 1988 he has worked at IBM HRL on numerous research projects in speech and image processing, including concatenative and statistical text-to-speech synthesis, voice-based emotion detection, automatic speech transcription, and distributed speech recognition. He has led the IBM team in several European research projects and currently leads a research project in the area of speech synthesis and modeling.

Lecture languages

English, Hebrew, Russian

Topics

AI / Automation, Digital Reinvention

Duration options

1 hour

Travel/delivery options

In-country; Outside of country: Open for discussion; Remote via video conference

Country

Israel
