Paper ID | AUD-30.3 |
Paper Title | SOUND EVENT DETECTION AND SEPARATION: A BENCHMARK ON DESED SYNTHETIC SOUNDSCAPES |
Authors | Nicolas Turpault, Romain Serizel, Université de Lorraine, CNRS, Inria, Loria, France; Scott Wisdom, Hakan Erdogan, John R. Hershey, Google Research, United States; Eduardo Fonseca, Universitat Pompeu Fabra, Spain; Prem Seetharaman, Descript, Inc., United States; Justin Salamon, Adobe Research, United States |
Session | AUD-30: Detection and Classification of Acoustic Scenes and Events 5: Scenes |
Location | Gather.Town |
Session Time | Friday, 11 June, 13:00 - 13:45 |
Presentation Time | Friday, 11 June, 13:00 - 13:45 |
Presentation | Poster |
Topic | Audio and Acoustic Signal Processing: [AUD-CLAS] Detection and Classification of Acoustic Scenes and Events |
Abstract | We propose a benchmark of state-of-the-art sound event detection (SED) systems. We designed synthetic evaluation sets to focus on specific sound event detection challenges. We analyze the performance of the submissions to DCASE 2020 Task 4 under time-related modifications (the time position of an event and the length of clips), and we study the impact of non-target sound events and reverberation. We show that localizing sound events in time is still a problem for SED systems. We also show that reverberation and non-target sound events severely degrade the performance of SED systems. For non-target sound events, sound separation seems like a promising solution. |