| Paper ID | MLSP-11.3 |
| Paper Title | UNSUPERVISED DISCRIMINATIVE LEARNING OF SOUNDS FOR AUDIO EVENT CLASSIFICATION |
| Authors | Sascha Hornauer, Ke Li, Stella Yu, University of California, Berkeley, United States; Shabnam Ghaffarzadegan, Liu Ren, Robert Bosch LLC, United States |
| Session | MLSP-11: Self-supervised Learning for Speech Processing |
| Location | Gather.Town |
| Session Time | Tuesday, 08 June, 16:30 - 17:15 |
| Presentation Time | Tuesday, 08 June, 16:30 - 17:15 |
| Presentation | Poster |
| Topic | Machine Learning for Signal Processing: [MLR-SSUP] Self-supervised and semi-supervised learning |
  
	
| Abstract | Recent progress in network-based audio event classification has shown the benefit of pre-training models on visual data such as ImageNet. While this process allows knowledge transfer across domains, training a model on large-scale visual datasets is time consuming. On several audio event classification benchmarks, we show a fast and effective alternative that pre-trains the model in an unsupervised manner on audio data alone, yet delivers performance on par with ImageNet pre-training. Furthermore, we show that our discriminative audio learning can be used to transfer knowledge across audio datasets and can optionally be combined with ImageNet pre-training. |