AUD-21.4
HTS-AT: A HIERARCHICAL TOKEN-SEMANTIC AUDIO TRANSFORMER FOR SOUND CLASSIFICATION AND DETECTION
Ke Chen, Taylor Berg-Kirkpatrick, Shlomo Dubnov, University of California San Diego, United States of America; Xingjian Du, Bilei Zhu, Zejun Ma, Bytedance Inc., China
Session:
Detection and Classification of Acoustic Scenes and Events V: Classification
Track:
Audio and Acoustic Signal Processing
Location:
Gather Area K
Presentation Time:
Wed, 11 May, 21:00 - 21:45 China Time (UTC +8)
Wed, 11 May, 13:00 - 13:45 UTC
Session Chair:
Scott Wisdom, Google
Session AUD-21
AUD-21.1: ON THE IMPACT OF NORMALIZATION STRATEGIES IN UNSUPERVISED ADVERSARIAL DOMAIN ADAPTATION FOR ACOUSTIC SCENE CLASSIFICATION
Michel Olvera, Emmanuel Vincent, Université de Lorraine, CNRS, Inria, Loria, France; Gilles Gasso, LITIS, Université & INSA Rouen Normandie, France
AUD-21.2: IMPROVING BIRD CLASSIFICATION WITH UNSUPERVISED SOUND SEPARATION
Tom Denton, Google, United States of America; Scott Wisdom, John R. Hershey, Google Research, United States of America
AUD-21.3: SCALABLE NEURAL ARCHITECTURES FOR END-TO-END ENVIRONMENTAL SOUND CLASSIFICATION
Francesco Paissan, Alberto Ancilotto, Alessio Brutti, Elisabetta Farella, Fondazione Bruno Kessler, Italy
AUD-21.4: HTS-AT: A HIERARCHICAL TOKEN-SEMANTIC AUDIO TRANSFORMER FOR SOUND CLASSIFICATION AND DETECTION
Ke Chen, Taylor Berg-Kirkpatrick, Shlomo Dubnov, University of California San Diego, United States of America; Xingjian Du, Bilei Zhu, Zejun Ma, Bytedance Inc., China
AUD-21.5: HYBRID ATTENTION-BASED PROTOTYPICAL NETWORKS FOR FEW-SHOT SOUND CLASSIFICATION
You Wang, David Anderson, Georgia Institute of Technology, United States of America
AUD-21.6: AUDIO SCENE MONITORING USING REDUNDANT AD-HOC MICROPHONE ARRAYS
Peter Gerstoft, Yihan Hu, Michael J. Bianco, Chaitanya Patil, Ardel Alegre, Yoav Freund, Francois Grondin, University of California, San Diego, United States of America