IEEE ICASSP 2022

2022 IEEE International Conference on Acoustics, Speech and Signal Processing

7-13 May 2022
  • Virtual (all paper presentations)
22-27 May 2022
  • Main Venue: Marina Bay Sands Expo & Convention Center, Singapore
27-28 October 2022
  • Satellite Venue: Crowne Plaza Shenzhen Longgang City Centre, Shenzhen, China

SPE-39: Speech Synthesis: Front-end
Tue, 10 May, 21:00 - 21:45 China Time (UTC +8)
Tue, 10 May, 13:00 - 13:45 UTC
Location: Gather Area D
Session Chair: Daniele Giacobello, Apple Inc.
Track: Speech and Language Processing

SPE-39.1: PART-OF-SPEECH MODELS COMPRESSION METHODS FOR ON-DEVICE GRAPHEME-TO-PHONEME CONVERSION

Marek Kubis, Paweł Skórzewski, Adam Mickiewicz University in Poznań, Poland; Maxime Méloux, Marcin Lewandowski, Samsung R&D Institute Poland, Poland; Gunu Jho, Hyoungmin Park, Samsung Electronics, Republic of Korea

SPE-39.2: AN END-TO-END CHINESE TEXT NORMALIZATION MODEL BASED ON RULE-GUIDED FLAT-LATTICE TRANSFORMER

Wenlin Dai, Changhe Song, Xiang Li, Zhiyong Wu, Tsinghua University, China; Huashan Pan, Xiulin Li, Databaker Technology Co., Ltd, China; Helen Meng, The Chinese University of Hong Kong, China

SPE-39.3: CHINESE SPELLING TEXT GENERATION OF MATHEMATICAL FORMULAS

Su Dong, Sicen Liu, Buzhou Tang, Harbin Institute of Technology, China; Shan Liu, Tencent Technology Company, China

SPE-39.4: POLYPHONE DISAMBIGUATION AND ACCENT PREDICTION USING PRE-TRAINED LANGUAGE MODELS IN JAPANESE TTS FRONT-END

Rem Hida, Masaki Hamada, Emiru Tsunoo, Toshiyuki Sekiya, Sony Group Corporation, Japan; Chie Kamada, Toshiyuki Kumakura, Sony Corporation of America, United States of America

SPE-39.5: INCREMENTAL TEXT-TO-SPEECH SYNTHESIS USING PSEUDO LOOKAHEAD WITH LARGE PRETRAINED LANGUAGE MODEL

Takaaki Saeki, Shinnosuke Takamichi, Hiroshi Saruwatari, The University of Tokyo, Japan

SPE-39.6: DATA AUGMENTATION FOR LONG-TAILED AND IMBALANCED POLYPHONE DISAMBIGUATION IN MANDARIN

Yang Zhang, Haitong Zhang, Yue Lin, NetEase Games AI Lab, China