Technical Program

Paper Detail

Presentation #9
Session: Natural Language Processing
Location: Kallirhoe Hall
Session Time: Thursday, December 20, 13:30 - 15:30
Presentation Time: Thursday, December 20, 13:30 - 15:30
Presentation: Poster
Topic: Natural language processing
Paper Title: Generating Semantic Similarity Atlas for Natural Languages
Authors: Lütfi Kerem Şenel, İhsan Utlu, Veysel Yücesoy, Aykut Koç, ASELSAN, Turkey; Tolga Çukur, Bilkent University, Turkey
Abstract: Cross-lingual studies are attracting growing interest in natural language processing (NLP) research, and several studies have shown that knowledge transfers more effectively between similar languages than between fundamentally different ones. Researchers from different domains have proposed various measures of language similarity. However, a similarity measure that focuses on the semantic structures of languages can be useful for selecting language pairs or groups to work with, especially for tasks that require semantic knowledge, such as sentiment analysis or word sense disambiguation. To this end, we leverage a recently proposed word-embedding-based method to generate a language similarity atlas for 76 languages from around the world. This atlas can help researchers select similar language pairs or groups in cross-lingual applications. Our findings suggest that the semantic similarity between two languages is strongly correlated with the geographic proximity of the countries in which they are used.