HLT-10: Multi-modality in Language |
Session Type: Poster |
Time: Wednesday, 9 June, 16:30 - 17:15 |
Location: Gather.Town |
Session Chair: Mahnoosh Mehrabani, Interactions Research |
HLT-10.1: INCORPORATING SYNTACTIC AND PHONETIC INFORMATION INTO MULTIMODAL WORD EMBEDDINGS USING GRAPH CONVOLUTIONAL NETWORKS |
Wenhao Zhu; Shanghai University |
Shuang Liu; Shanghai University |
Chaoming Liu; Shanghai University |
HLT-10.2: LIFI: TOWARDS LINGUISTICALLY INFORMED FRAME INTERPOLATION |
Aradhya Mathur; IIIT Delhi |
Devansh Batra; IIIT Delhi |
Yaman Kumar Singla; IIIT Delhi; Adobe; State University of New York at Buffalo |
Rajiv Ratn Shah; IIIT Delhi |
Changyou Chen; State University of New York at Buffalo |
Roger Zimmermann; NUS |
HLT-10.3: TRIPLE SEQUENCE GENERATIVE ADVERSARIAL NETS FOR UNSUPERVISED IMAGE CAPTIONING |
Yucheng Zhou; Fudan University |
Wei Tao; Fudan University |
Wenqiang Zhang; Fudan University |
HLT-10.4: ALIGN OR ATTEND? TOWARD MORE EFFICIENT AND ACCURATE SPOKEN WORD DISCOVERY USING SPEECH-TO-IMAGE RETRIEVAL |
Liming Wang; University of Illinois, Urbana-Champaign |
Xinsheng Wang; Delft University of Technology |
Mark Hasegawa-Johnson; University of Illinois, Urbana-Champaign |
Odette Scharenborg; Delft University of Technology |
Najim Dehak; Johns Hopkins University |
HLT-10.5: TOWARDS PRACTICAL LIPREADING WITH DISTILLED AND EFFICIENT MODELS |
Pingchuan Ma; Imperial College London |
Brais Martinez; Samsung AI Research Center |
Stavros Petridis; Imperial College London |
Maja Pantic; Imperial College London |
HLT-10.6: END-TO-END AUDIO-VISUAL SPEECH RECOGNITION WITH CONFORMERS |
Pingchuan Ma; Imperial College London |
Stavros Petridis; Imperial College London |
Maja Pantic; Imperial College London |