Reinforcement Learning Methods in Speech and Language Technology (Signals and Communication Technology)

Current price: $99.99
Publication Date: November 12th, 2024
Publisher:
Springer
ISBN:
9783031537196
Pages:
0

Description

This book offers a comprehensive guide to reinforcement learning (RL) and bandits for speech and language technology. The book first provides an overview of RL and bandit methods and their applications to various speech and language tasks. The author then covers essential topics such as the formulation of speech and language tasks as RL problems and RL-based solutions in automatic speech recognition, speaker recognition, diarization, natural language understanding, text-to-speech synthesis, natural language generation, and conversational recommendation systems. The book also presents emerging strategies in RL methods, along with open questions and challenges in RL-based speech and language technology. With a focus on real-world applications, the book provides step-by-step guidance on how to use RL and bandit methods to solve problems in speech and language technology, and it includes case studies and practical tips to help readers apply these methods to their own projects. The book is a timely resource for speech and language researchers, engineers, students, and practitioners who are interested in learning how RL methods can improve the performance of speech and language systems and provide new interactive learning paradigms from an interface design point of view.

About the Author

Baihan Lin is an AI researcher and neuroscientist at Columbia University, specializing in speech and natural language processing (NLP). With a PhD in computational biology from Columbia University and an MS in applied mathematics from the University of Washington, Baihan has dedicated his research to developing intelligent speech and text-based systems that can augment human-AI and human-human interactions in healthcare, and has held research positions at IBM, Google, Microsoft, Amazon and BGI Genomics. He has created and deployed various pioneering machine learning solutions in the speech and language domains, such as the first-ever online and reinforcement learning (RL)-based speaker diarization system and RL-based interactive spoken language understanding (SLU) systems for children with speech and communication disorders. Baihan's research on deep learning, RL and NLP has led to deployed real-world applications, such as AI companions for therapists and surrounding-aware virtual realities. He has authored 50+ peer-reviewed publications and patents, with an H-index of 13, and has served on program committees or as a reviewer for over 15 conferences, including INTERSPEECH and NeurIPS, as well as over 20 journals. Baihan was the chair of the conference tutorials at INTERSPEECH-22 and WACV-22 on RL and bandits for speech, NLP, computer vision and multi-fidelity signal processing, and the chair of the IJCAI-23 workshop on knowledge-based compositional generalization. His research has also contributed to the development of RSAToolbox, open-source software that performs statistical inference to understand neural systems, and to the theory of neural networks.