Technical Program

Paper Detail

Presentation #8
Session: Detection, Paralinguistics and Coding
Location: Kallirhoe Hall
Session Time: Wednesday, December 19, 13:30 - 15:30
Presentation Time: Wednesday, December 19, 13:30 - 15:30
Presentation: Poster
Topic: Speech recognition and synthesis:
Paper Title: AN EXPERIMENTAL STUDY ON AUDIO REPLAY ATTACK DETECTION USING DEEP NEURAL NETWORKS
Authors: Bekir Bakar, Cemal Hanilci, Bursa Technical University, Turkey
Abstract: Automatic speaker verification (ASV) systems can be easily spoofed by previously recorded speech, synthesized speech, and speech signals artificially generated by voice conversion techniques. To increase the reliability of ASV systems, detecting spoofing attacks, that is, deciding whether a given speech signal is genuine or spoofed, plays an important role. In this paper, we consider the detection of replay attacks, which are the most easily implementable attack type against ASV systems. To this end, we utilize a deep neural network (DNN) based classifier using features extracted from the long-term average spectrum (LTAS). The experiments are conducted on the latest edition of the Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVspoof 2017) database. The results obtained using the DNN classifier are compared with the ASVspoof 2017 baseline system provided by the organizers, which consists of a Gaussian mixture model (GMM) with constant-Q transform cepstral coefficients (CQCC), and with a GMM using standard mel-frequency cepstral coefficient (MFCC) features. Experimental results reveal that the DNN considerably outperforms the well-known and successful GMM classifier. LTAS-based features are found to give better spoofing detection performance than CQCC and MFCC. Finally, we find that high-frequency components convey much more discriminative information, independent of the features and classifier used.
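
As a rough illustration of the long-term average spectrum mentioned in the abstract, the sketch below averages the short-time power spectrum of a signal over all frames and returns it in decibels. It is not the authors' feature extraction: the function name ltas, the frame length, hop size, Hann window, test tone, and the 4 kHz cut-off used to pick out a high-frequency band are all assumptions chosen for illustration.

```python
import numpy as np

def ltas(signal, frame_len=512, hop=256, eps=1e-10):
    """Illustrative LTAS: time-averaged power spectrum in dB.

    frame_len, hop and the Hann window are assumed values,
    not settings taken from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    # Pad very short signals so that at least one full frame exists.
    if signal.size < frame_len:
        signal = np.pad(signal, (0, frame_len - signal.size))
    n_frames = 1 + (signal.size - frame_len) // hop
    # Index matrix of shape (n_frames, frame_len) for overlapping frames.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(frame_len)
    # Power spectrum of each frame, one row per frame.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Average over time, then convert to dB.
    return 10.0 * np.log10(power.mean(axis=0) + eps)

# Usage with a synthetic 1 kHz tone (hypothetical test signal).
fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 1000 * t)
feat = ltas(x)

# Keep only the bins above an assumed 4 kHz cut-off.
freqs = np.fft.rfftfreq(512, d=1.0 / fs)
high_band = feat[freqs >= 4000]
```

The resulting vector has one value per frequency bin, so a whole utterance is summarized by a single fixed-length feature vector that could be passed to a classifier.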