IEEE ICASSP 2022

2022 IEEE International Conference on Acoustics, Speech and Signal Processing

7-13 May 2022
  • Virtual (all paper presentations)
22-27 May 2022
  • Main Venue: Marina Bay Sands Expo & Convention Center, Singapore
27-28 October 2022
  • Satellite Venue: Crowne Plaza Shenzhen Longgang City Centre, Shenzhen, China

AUD-29.4

END-TO-END NEURAL SPEECH CODING FOR REAL-TIME COMMUNICATIONS

Xue Jiang, Chengyu Zheng, Yuan Zhang, Communication University of China, China; Xiulian Peng, Huaying Xue, Yan Lu, Microsoft Research Asia, China

Session:
Audio and Speech Coding

Track:
Audio and Acoustic Signal Processing

Location:
Gather Area K

Presentation Time:
Thu, 12 May, 22:00 - 22:45 China Time (UTC +8)
Thu, 12 May, 14:00 - 14:45 UTC

Session Chair:
Jan Skoglund, Google
Session AUD-29
AUD-29.1: PSYCHOACOUSTIC CALIBRATION OF LOSS FUNCTIONS FOR EFFICIENT END-TO-END NEURAL AUDIO CODING
Kai Zhen, Minje Kim, Indiana University, United States of America; Mi Suk Lee, Jongmo Sung, Seungkwon Beack, Electronics and Telecommunications Research Institute, Korea, Republic of
AUD-29.2: Frequency Domain Long-Term Prediction for Low Delay General Audio Coding
Ning Guo, Bernd Edler, International Audio Laboratories Erlangen, Germany
AUD-29.3: ARCHITECTURE FOR VARIABLE BITRATE NEURAL SPEECH CODEC WITH CONFIGURABLE COMPUTATION COMPLEXITY
Tejas Jayashankar, Massachusetts Institute of Technology, United States of America; Thilo Koehler, Kaustubh Kalgaonkar, Zhiping Xiu, Jilong Wu, Ju Lin, Prabhav Agrawal, Qing He, Facebook AI, United States of America
AUD-29.4: END-TO-END NEURAL SPEECH CODING FOR REAL-TIME COMMUNICATIONS
Xue Jiang, Chengyu Zheng, Yuan Zhang, Communication University of China, China; Xiulian Peng, Huaying Xue, Yan Lu, Microsoft Research Asia, China
AUD-29.5: DEEP NEURAL NETWORK (DNN) AUDIO CODER USING A PERCEPTUALLY IMPROVED TRAINING METHOD
Seungmin Shin, Joon Byun, Youngcheol Park, Intelligent Signal Processing Lab, Yonsei University, Wonju, Korea, Republic of; Jongmo Sung, Seungkwon Beack, Electronics and Telecommunications Research Institute (ETRI), Daejeon, Korea, Republic of
AUD-29.6: PROGRESSIVE MULTI-STAGE NEURAL AUDIO CODING WITH GUIDED REFERENCES
Chanwoo Lee, Hyungseob Lim, Jihyun Lee, Hong-Goo Kang, Yonsei University, Korea, Republic of; Inseon Jang, Electronics and Telecommunications Research Institute, Korea, Republic of