Technical Program

ASR IV

Session Type: Poster
Time: Friday, December 21, 13:30 - 15:30
Location: Kallirhoe Hall
 
TOWARD DOMAIN-INVARIANT SPEECH RECOGNITION VIA LARGE SCALE TRAINING
         Arun Narayanan; Google
         Ananya Misra; Google
         Khe Chai Sim; Google
         Golan Pundak; Google
         Anshuman Tripathi; Google
         Mohamed Elfeky; Google
         Parisa Haghani; Google
         Trevor Strohman; Google
         Michiel Bacchiani; Google
 
TRANSLITERATION BASED APPROACHES TO IMPROVE CODE-SWITCHED SPEECH RECOGNITION PERFORMANCE
         Jesse Emond; Google
         Bhuvana Ramabhadran; Google
         Brian Roark; Google
         Pedro Moreno; Google
         Min Ma; Google
 
EXPLORING LAYER TRAJECTORY LSTM WITH DEPTH PROCESSING UNITS AND ATTENTION
         Jinyu Li; Microsoft
         Liang Lu; Microsoft
         Changliang Liu; Microsoft
         Yifan Gong; Microsoft
 
MULTICHANNEL ASR WITH KNOWLEDGE DISTILLATION AND GENERALIZED CROSS CORRELATION FEATURE
         Wenjie Li; Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics
         Yu Zhang; Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics
         Pengyuan Zhang; Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics
         Fengpei Ge; Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics
 
OPTIMIZING THE QUALITY OF SYNTHETICALLY GENERATED PSEUDOWORDS FOR THE TASK OF MINIMAL-PAIR DISTINCTION
         Heiko Holz; University of Tübingen
         Maria Chinkina; University of Tübingen
         Laura Vetter; Ludwig Maximilian University of Munich
 
LEVERAGING SEQUENCE-TO-SEQUENCE SPEECH SYNTHESIS FOR ENHANCING ACOUSTIC-TO-WORD SPEECH RECOGNITION
         Masato Mimura; Kyoto University
         Sei Ueno; Kyoto University
         Hirofumi Inaguma; Kyoto University
         Shinsuke Sakai; Kyoto University
         Tatsuya Kawahara; Kyoto University
 
HIERARCHICAL MULTITASK LEARNING WITH CTC
         Ramon Sanabria; Carnegie Mellon University
         Florian Metze; Carnegie Mellon University
 
A K-NEAREST NEIGHBOURS APPROACH TO UNSUPERVISED SPOKEN TERM DISCOVERY
         Alexis Thual; ENS
         Corentin Dancette; ENS
         Julien Karadayi; ENS
         Juan Benjumea; ENS
         Emmanuel Dupoux; ENS
 
A NEW TIMIT BENCHMARK FOR CONTEXT-INDEPENDENT PHONE RECOGNITION USING TURBO FUSION
         Timo Lohrenz; TU Braunschweig
         Wei Li; TU Braunschweig
         Tim Fingscheidt; TU Braunschweig
 
EFFICIENT IMPLEMENTATION OF RECURRENT NEURAL NETWORK TRANSDUCER IN TENSORFLOW
         Tom Bagby; Google
         Kanishka Rao; Google
         Khe Chai Sim; Google
 
AUDIO-VISUAL SPEECH RECOGNITION WITH A HYBRID CTC/ATTENTION ARCHITECTURE
         Stavros Petridis; Imperial College London
         Themos Stafylakis; University of Nottingham
         Pingchuan Ma; Imperial College London
         Georgios Tzimiropoulos; University of Nottingham
         Maja Pantic; Imperial College London
 
MULTILINGUAL SEQUENCE-TO-SEQUENCE SPEECH RECOGNITION: ARCHITECTURE, TRANSFER LEARNING, AND LANGUAGE MODELING
         Jaejin Cho; Johns Hopkins University
         Murali Karthick Baskar; Brno University of Technology
         Ruizhi Li; Johns Hopkins University
         Matthew Wiesner; Johns Hopkins University
         Sri Harish Mallidi; Amazon
         Nelson Yalta; Waseda University
         Martin Karafiat; Brno University of Technology
         Shinji Watanabe; Johns Hopkins University
         Takaaki Hori; Mitsubishi Electric Research Laboratories
 
SPEAKER SELECTIVE BEAMFORMER WITH KEYWORD MASK ESTIMATION
         Yusuke Kida; Yahoo Japan Corporation
         Dung Tran; Yahoo Japan Corporation
         Motoi Omachi; Yahoo Japan Corporation
         Toru Taniguchi; Yahoo Japan Corporation
         Yuya Fujita; Yahoo Japan Corporation
 
SPEAKER ADAPTED BEAMFORMING FOR MULTI-CHANNEL AUTOMATIC SPEECH RECOGNITION
         Tobias Menne; RWTH Aachen University
         Ralf Schlüter; RWTH Aachen University
         Hermann Ney; RWTH Aachen University
 
SPEAKER ADAPTATION FOR END-TO-END CTC MODELS
         Ke Li; Johns Hopkins University
         Jinyu Li; Microsoft AI and Research
         Yong Zhao; Microsoft AI and Research
         Kshitiz Kumar; Microsoft AI and Research
         Yifan Gong; Microsoft AI and Research
 
AN EXPLORATION OF MIMIC ARCHITECTURES FOR RESIDUAL NETWORK BASED SPECTRAL MAPPING
         Peter Plantinga; The Ohio State University
         Deblin Bagchi; The Ohio State University
         Eric Fosler-Lussier; The Ohio State University
 
MULTI-CHANNEL MULTI-SPEAKER OVERLAPPED SPEECH RECOGNITION WITH LOCATION GUIDED SPEECH EXTRACTION NETWORK
         Zhuo Chen; Microsoft Cloud & AI
         Xiong Xiao; Microsoft Cloud & AI
         Takuya Yoshioka; Microsoft Cloud & AI
         Jinyu Li; Microsoft Cloud & AI
         Hakan Erdogan; Microsoft Cloud & AI
         Yifan Gong; Microsoft Cloud & AI
 
A STUDY ON SPEECH ENHANCEMENT USING EXPONENT-ONLY FLOATING POINT QUANTIZED NEURAL NETWORK (EOFP-QNN)
         Yi-Te Hsu; Academia Sinica
         Yu-Chen Lin; National Taiwan University
         Szu-Wei Fu; National Taiwan University
         Yu Tsao; Academia Sinica
         Tei-Wei Kuo; National Taiwan University
 
RAPID SPEAKER ADAPTATION OF NEURAL NETWORK BASED FILTERBANK LAYER FOR AUTOMATIC SPEECH RECOGNITION
         Hiroshi Seki; Toyohashi University of Technology
         Kazumasa Yamamoto; Chubu University
         Tomoyosi Akiba; Toyohashi University of Technology
         Seiichi Nakagawa; Chubu University
 
FAR-FIELD ASR USING LOW-RANK AND SPARSE SOFT TARGETS FROM PARALLEL DATA
         Pranay Dighe; Idiap Research Institute, EPFL
         Afsaneh Asaei; Idiap Research Institute
         Herve Bourlard; Idiap Research Institute, EPFL
 
DEEP VIEW2VIEW MAPPING FOR VIEW-INVARIANT LIPREADING
         Alexandros Koumparoulis; National Technical University of Athens
         Gerasimos Potamianos; University of Thessaly