IEEE ICASSP 2022 || Singapore || 7-13 May 2022 Virtual; 22-27 May 2022 In-Person

AUD-21.4

HTS-AT: A HIERARCHICAL TOKEN-SEMANTIC AUDIO TRANSFORMER FOR SOUND CLASSIFICATION AND DETECTION

Ke Chen, Taylor Berg-Kirkpatrick, Shlomo Dubnov, University of California San Diego, United States of America; Xingjian Du, Bilei Zhu, Zejun Ma, Bytedance Inc., China

Session:

Detection and Classification of Acoustic Scenes and Events V: Classification

Location:

Gather Area K

Presentation Time:

Wed, 11 May, 21:00 - 21:45 China Time (UTC +8)
Wed, 11 May, 13:00 - 13:45 UTC

Session Chair:

Scott Wisdom, Google

Session AUD-21

AUD-21.1: ON THE IMPACT OF NORMALIZATION STRATEGIES IN UNSUPERVISED ADVERSARIAL DOMAIN ADAPTATION FOR ACOUSTIC SCENE CLASSIFICATION

Michel Olvera, Emmanuel Vincent, Université de Lorraine, CNRS, Inria, Loria, France; Gilles Gasso, LITIS, Université & INSA Rouen Normandie, France

AUD-21.2: IMPROVING BIRD CLASSIFICATION WITH UNSUPERVISED SOUND SEPARATION

Tom Denton, Google, United States of America; Scott Wisdom, John R. Hershey, Google Research, United States of America

AUD-21.3: SCALABLE NEURAL ARCHITECTURES FOR END-TO-END ENVIRONMENTAL SOUND CLASSIFICATION

Francesco Paissan, Alberto Ancilotto, Alessio Brutti, Elisabetta Farella, Fondazione Bruno Kessler, Italy

AUD-21.4: HTS-AT: A HIERARCHICAL TOKEN-SEMANTIC AUDIO TRANSFORMER FOR SOUND CLASSIFICATION AND DETECTION

Ke Chen, Taylor Berg-Kirkpatrick, Shlomo Dubnov, University of California San Diego, United States of America; Xingjian Du, Bilei Zhu, Zejun Ma, Bytedance Inc., China

AUD-21.5: HYBRID ATTENTION-BASED PROTOTYPICAL NETWORKS FOR FEW-SHOT SOUND CLASSIFICATION

You Wang, David Anderson, Georgia Institute of Technology, United States of America

AUD-21.6: AUDIO SCENE MONITORING USING REDUNDANT AD-HOC MICROPHONE ARRAYS

Peter Gerstoft, Yihan Hu, Michael J. Bianco, Chaitanya Patil, Ardel Alegre, Yoav Freund, Francois Grondin, University of California, San Diego, United States of America

IEEE ICASSP 2022

2022 IEEE International Conference on Acoustics, Speech and Signal Processing

7-13 May 2022

Virtual (all paper presentations)

22-27 May 2022

Main Venue: Marina Bay Sands Expo & Convention Center, Singapore

27-28 October 2022

Satellite Venue: Crowne Plaza Shenzhen Longgang City Centre, Shenzhen, China
