US $8.94
94856
Neural Text-to-Speech Synthesis
Author: Xu Tan
Year: 2023
Format: PDF
File size: 9.1 MB
Language: English
Text-to-speech (TTS) synthesis is an Artificial Intelligence (AI) technique that renders preferably natural-sounding speech given an arbitrary text. It is a key technological component in many important applications, including virtual assistants, AI-generated audiobooks, speech-to-speech translation, AI news reporters, audible driving guidance, and digital humans. In the past decade, we have observed significant progress in TTS. These new developments are mainly attributed to Deep Learning techniques and are usually referred to as neural TTS. Many neural TTS systems have achieved human quality for the tasks they are designed for.

This book first introduces the history of TTS technologies and gives an overview of neural TTS, then provides preliminary knowledge on language and speech processing, neural networks and Deep Learning, and deep generative models. It then introduces neural TTS from the perspective of key components (text analyses, acoustic models, vocoders, and end-to-end models) and advanced topics (expressive and controllable, robust, model-efficient, and data-efficient TTS). It also points out some future research directions and collects some resources related to TTS.

Although many TTS books have been published, this book is the first of its kind to provide a comprehensive introduction to neural TTS, including but not limited to the key components, such as text analysis, acoustic model, and vocoder; the key milestone models, such as Tacotron, DeepVoice, and FastSpeech; and the more advanced techniques, such as expressive and controllable TTS, robust TTS, and efficient TTS.

Xu Tan, the author of this book, has contributed significantly to the recent advances in TTS. He has developed several impactful neural TTS systems, such as FastSpeech 1/2, DelightfulTTS, and NaturalSpeech, the last of which has achieved human parity on the TTS benchmark dataset. His knowledge of the domain and his first-hand experience with the topic allow him to organize the contents effectively and make them more accessible to readers, and to describe the key concepts, the basic methods, and the state-of-the-art techniques and their relationships clearly and in detail. I am very glad that he introduced and clarified many key concepts and background knowledge at the beginning of this book, so that people with little or no knowledge of TTS can also read and understand the book effectively.

This is a very well-written book and certainly one that provides useful and thoughtful information to readers at various levels. I believe this book is a great reference for all researchers, practitioners, and students who are interested in quickly grasping the history, the state of the art, and the future directions of speech synthesis, or in gaining insightful ideas on the development of TTS.