| Paper ID | IVMSP-29.3 |
| Paper Title | CGAN-NET: CLASS-GUIDED ASYMMETRIC NON-LOCAL NETWORK FOR REAL-TIME SEMANTIC SEGMENTATION |
| Authors | Hanlin Chen, National University of Defense Technology, China; Qingyong Hu, University of Oxford, United Kingdom; Jungang Yang, Jing Wu, National University of Defense Technology, China; Yulan Guo, National University of Defense Technology, Sun Yat-sen University, China |
| Session | IVMSP-29: Semantic Segmentation |
| Location | Gather.Town |
| Session Time | Friday, 11 June, 13:00 - 13:45 |
| Presentation Time | Friday, 11 June, 13:00 - 13:45 |
| Presentation | Poster |
| Topic | Image, Video, and Multidimensional Signal Processing: [IVTEC] Image & Video Processing Techniques |
| Abstract | By introducing various non-local blocks to capture long-range dependencies, remarkable progress has been achieved in semantic segmentation recently. However, the improvement in segmentation accuracy usually comes at the price of a significant reduction in network efficiency, as non-local blocks typically require expensive computation and memory for dense pixel-to-pixel correlations. In this paper, we introduce a Class-Guided Asymmetric Non-local Network (CGAN-Net) to enhance the class-discriminability of the learned feature maps while maintaining real-time efficiency. The key to our approach is to compute the dense similarity matrix on coarse semantic prediction maps rather than on high-dimensional latent feature maps. This is not only computationally and memory efficient, but also helps to learn query-dependent global context. Experiments conducted on Cityscapes and CamVid demonstrate the compelling performance of our CGAN-Net. In particular, our network achieves 76.8% mean IoU on the Cityscapes test set at 38 FPS for 1024×2048 images on a single Tesla V100 GPU. |
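The core idea stated in the abstract, computing the non-local affinity on coarse class prediction maps instead of high-dimensional features, can be illustrated with a short sketch. The PyTorch module below is a minimal, hypothetical illustration rather than the paper's actual architecture: it assumes the query/key affinity is taken over softmax-normalised class scores (K channels, K much smaller than the feature width C), that keys and values are spatially subsampled with adaptive average pooling (the "asymmetric" part), and that the aggregated context is fused back by a 1×1 projection plus addition. The module name, the `key_size` parameter, and the fusion scheme are illustrative choices, not details from the paper.

```python
# Hypothetical sketch of a class-guided asymmetric non-local block.
# Assumptions (not taken from the paper): affinity is computed on
# softmax-normalised coarse class scores, keys/values are spatially
# subsampled via adaptive average pooling, and the context is fused
# back with a 1x1 projection and a residual addition.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ClassGuidedAsymmetricNonLocal(nn.Module):
    def __init__(self, in_channels: int, key_size: int = 16):
        super().__init__()
        self.key_size = key_size                      # spatial side of the subsampled key/value grid
        self.out_proj = nn.Conv2d(in_channels, in_channels, kernel_size=1)

    def forward(self, feat: torch.Tensor, coarse_logits: torch.Tensor) -> torch.Tensor:
        # feat:          (B, C, H, W) latent features from the backbone
        # coarse_logits: (B, K, H, W) coarse semantic prediction map (K classes, K << C)
        b, c, h, w = feat.shape

        # Query: per-pixel class distribution at full resolution -> (B, H*W, K)
        query = F.softmax(coarse_logits, dim=1).flatten(2).transpose(1, 2)

        # Key/value: subsampled to S = key_size**2 positions (the asymmetric part)
        key = F.adaptive_avg_pool2d(coarse_logits, self.key_size)      # (B, K, s, s)
        key = F.softmax(key, dim=1).flatten(2)                         # (B, K, S)
        value = F.adaptive_avg_pool2d(feat, self.key_size).flatten(2)  # (B, C, S)
        value = value.transpose(1, 2)                                  # (B, S, C)

        # The similarity matrix lives in the K-dimensional class space, so its
        # cost is O(N * S * K) instead of the O(N^2 * C) of a standard non-local block.
        affinity = torch.softmax(query @ key, dim=-1)                  # (B, H*W, S)

        # Aggregate query-dependent global context and fuse it into the input features.
        context = (affinity @ value).transpose(1, 2).reshape(b, c, h, w)
        return feat + self.out_proj(context)


if __name__ == "__main__":
    block = ClassGuidedAsymmetricNonLocal(in_channels=128)
    feat = torch.randn(2, 128, 64, 128)       # backbone features
    coarse = torch.randn(2, 19, 64, 128)      # coarse prediction (19 Cityscapes classes)
    print(block(feat, coarse).shape)          # torch.Size([2, 128, 64, 128])
```

Under these assumptions, the affinity matrix has shape (H·W) × S with K-dimensional dot products, which is what keeps the attention step cheap enough for real-time inference compared with a full pixel-to-pixel non-local block.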