IEEE ICASSP 2022

2022 IEEE International Conference on Acoustics, Speech and Signal Processing

7-13 May 2022
  • Virtual (all paper presentations)
22-27 May 2022
  • Main Venue: Marina Bay Sands Expo & Convention Center, Singapore
27-28 October 2022
  • Satellite Venue: Crowne Plaza Shenzhen Longgang City Centre, Shenzhen, China

IEP-18: Methods for Reliable & Responsible Neural Conversational AI
Fri, 13 May, 22:00 - 22:45 China Time (UTC +8)
Fri, 13 May, 14:00 - 14:45 UTC
Location: Gather Area P
Virtual
Gather.Town
Expert
Presented by: Ahmad Beirami, Meta AI (formerly known as Facebook AI)

We will give an overview of CAIRaoke, an effort to build neural conversational AI models to power the next generation of task-oriented virtual digital assistants with augmented/virtual reality capabilities. We then discuss some of the challenges we faced in training CAIRaoke dialog models: noisy training data, which results in faulty models, and a lack of variation in training conversation flows, which leads to poor generalization to real-world conditions. These challenges prompted us to create new public benchmarks for quantifying robustness in task-oriented dialog, as well as new modeling techniques. We will survey state-of-the-art efforts across industry and academia to tackle such challenges and briefly present three of our own recent efforts. The first (CheckDST: https://arxiv.org/pdf/2112.08321.pdf) provides new metrics for quantifying real-world generalization of dialog state tracking performance. The second (TERM: https://arxiv.org/pdf/2007.01162.pdf) is a simple tweak to the widely used empirical risk minimization framework that can promote robustness against noisy outlier samples. The third (DAIR: https://arxiv.org/pdf/2110.11205.pdf) is a simple regularization add-on that targets performance consistency when data augmentation is used for better generalization to unseen examples. We will end the talk with an overview of open challenges in the field that we hope the signal processing society can tackle.

Relevance to ICASSP: This presentation gives an overview of the challenges of realizing neural conversational models and presents theoretical methods, inspired by information theory and signal processing, that address some of them. We end by presenting further open challenges that we hope the ICASSP community can tackle to help realize neural conversational models. The signal processing (SP) society has been a pioneer in image, speech, and video processing techniques, so we believe it is uniquely positioned to lead the way in solving these challenges. The main goal of this talk is to expose these problems and forge concrete connections between industrial research and the SP society to this effect.
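
As a rough illustration of the TERM idea mentioned in the abstract, the sketch below assumes the tilted objective from the TERM paper, (1/t) * log((1/N) * sum_i exp(t * loss_i)), which reduces to the ordinary average loss as t approaches 0. With a negative tilt t, samples with very large losses (such as noisy outliers) contribute less to the aggregate. The function name `tilted_loss`, the sample losses, and the tilt values are illustrative assumptions, not taken from the talk.

```python
import math


def tilted_loss(losses, t):
    """Tilted aggregate of per-sample losses: (1/t) * log(mean(exp(t * loss))).

    t == 0 is treated as the limiting case, i.e. the plain average (ERM).
    """
    n = len(losses)
    if t == 0:
        return sum(losses) / n
    # Log-sum-exp with a max shift, so large |t * loss| values do not overflow.
    scaled = [t * loss for loss in losses]
    shift = max(scaled)
    log_sum = shift + math.log(sum(math.exp(s - shift) for s in scaled))
    return (log_sum - math.log(n)) / t


# Hypothetical per-sample losses; the last one plays the role of a noisy outlier.
losses = [0.1, 0.2, 0.15, 5.0]

erm = tilted_loss(losses, 0.0)      # plain average, dominated by the outlier
robust = tilted_loss(losses, -2.0)  # negative tilt downweights the outlier
worst = tilted_loss(losses, 2.0)    # positive tilt emphasizes the outlier

print(f"ERM: {erm:.4f}  t=-2: {robust:.4f}  t=+2: {worst:.4f}")
```

In this toy setting the negative-tilt value stays close to the typical losses while the plain average is pulled up by the single outlier; in actual training the same aggregate would be applied to model losses and minimized over the model parameters.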

Biography

Ahmad Beirami is a research scientist at Meta AI, leading research to power the next generation of virtual digital assistants with AR/VR capabilities. His research broadly involves learning models with robustness and fairness considerations in large-scale systems. Previously, he led the AI agent research program for automated playtesting of video games at Electronic Arts. Before moving to industry in 2018, he held a joint postdoctoral fellowship at Harvard and MIT, focused on problems at the intersection of core machine learning and information theory. He received the Sigma Xi Best PhD Thesis Award from Georgia Tech in 2014 for his work on the fundamental limits of redundancy elimination from network traffic data.