2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information

IEEE Signal Processing Society

Institute of Electrical and Electronics Engineers (IEEE)

Paper Detail

Paper ID	MLSP-28.2
Paper Title	MULTIPHISH: MULTI-MODAL FEATURES FUSION NETWORKS FOR PHISHING DETECTION
Authors	Lei Zhang, Peng Zhang, Luchen Liu, Jianlong Tan, Institute of Information Engineering, Chinese Academy of Sciences, China
Session	MLSP-28: ML and Time Series
Location	Gather.Town
Session Time	Thursday, 10 June, 14:00 - 14:45
Presentation Time	Thursday, 10 June, 14:00 - 14:45
Presentation	Poster
Topic	Machine Learning for Signal Processing: [MLR-APPL] Applications of machine learning
Abstract	Phishing is an increasingly serious cybercrime. Phishers create phishing websites that mimic legitimate websites to deceive users and steal their personal information. Most existing detection methods struggle with the proliferation of phishing websites and with increasingly advanced camouflage techniques. In this paper, we propose MultiPhish, a feature fusion network and the first study to fuse multi-modal features with neural networks for the phishing detection task. In this end-to-end network, the domain and favicon of the website are represented via deep neural networks, and the representation of the website identity is obtained through multi-modal feature fusion. In addition, a variational autoencoder (VAE) is introduced to optimize the representation. In the phishing detection module, we incorporate URL features to handle cases where a phishing website cannot be detected solely by estimating whether the website identity is disguised. Extensive experiments on a recently collected dataset show that our model outperforms related methods. Moreover, MultiPhish is completely language-independent, so it can perform phishing detection regardless of the text language of the website.
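
The data flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify the fusion operator, layer sizes, or encoders, so the sketch assumes concatenation as the fusion step, uses fixed vectors in place of the domain and favicon encoders, and runs with untrained random weights. All names (`fuse_identity`, `phishing_score`, the dimension constants) are hypothetical.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def init_layer(rng, n_in, n_out):
    """Small random weights and zero biases (untrained, for illustration only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def fuse_identity(domain_vec, favicon_vec, params, rng):
    """Fuse the two modality embeddings and sample a VAE-style latent vector."""
    joint = domain_vec + favicon_vec  # assumption: fusion by concatenation
    mu = linear(joint, *params["mu"])
    log_var = linear(joint, *params["log_var"])
    # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1)
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def phishing_score(identity_vec, url_features, params):
    """Combine the identity representation with URL features, then apply a sigmoid."""
    logit = linear(identity_vec + url_features, *params["clf"])[0]
    return 1.0 / (1.0 + math.exp(-logit))


rng = random.Random(0)
DOMAIN_DIM, FAVICON_DIM, LATENT_DIM, URL_DIM = 4, 4, 3, 2  # hypothetical sizes
params = {
    "mu": init_layer(rng, DOMAIN_DIM + FAVICON_DIM, LATENT_DIM),
    "log_var": init_layer(rng, DOMAIN_DIM + FAVICON_DIM, LATENT_DIM),
    "clf": init_layer(rng, LATENT_DIM + URL_DIM, 1),
}

# Placeholder embeddings standing in for the outputs of the domain and favicon encoders.
domain_vec = [0.2, -0.1, 0.4, 0.0]
favicon_vec = [0.1, 0.3, -0.2, 0.5]
url_features = [1.0, 0.0]  # hypothetical hand-crafted URL features

identity = fuse_identity(domain_vec, favicon_vec, params, rng)
score = phishing_score(identity, url_features, params)
print(f"phishing probability (untrained weights): {score:.3f}")
```

In a trained model, the encoders, the VAE layers, and the classifier would be learned jointly end to end. Here the score only demonstrates that the identity representation and the URL features feed a single detection output.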