Emotion in Speech Synthesis
This chapter emphasizes that human speech is listener-centred: speech is intended to be heard and understood. Expressive and emotive content gives the listener information about the speaker’s identity (gender, age, education, etc.), the speaker’s attitude and feelings toward the listener, and the nature of what is being said. Adding such content to synthesis presents problems, including determining the most useful type of synthesizer, incorporating a proposed prosodic wrapper for speech, and linking parameters of emotive content both with acoustic parameters and with underlying theory constructs, such as category labels or parameters for driving the synthesizer. The chapter also discusses the relationship between high- and low-level synthesis and how to incorporate a range of emotive content and voice quality.
Keywords: listener-centred, speaker identity, prosodic wrapper, category labels, acoustic parameters, theory constructs, high-level synthesis, low-level synthesis, range of emotive content