Paper ID | SMR-4.6 | ||
Paper Title | KNOWLEDGE-BASED REASONING NETWORK FOR OBJECT DETECTION | ||
Authors | Huigang Zhang, Liuan Wang, Jun Sun, Fujitsu R&D Center, Co., LTD, China | ||
Session | SMR-4: Image and Video Sensing, Modeling, and Representation | ||
Location | Area F | ||
Session Time | Wednesday, 22 September, 08:00 - 09:30 | ||
Presentation Time | Wednesday, 22 September, 08:00 - 09:30 | ||
Presentation | Poster | ||
Topic | Image and Video Sensing, Modeling, and Representation: Structural-model based methods | ||
Abstract | Mainstream object detection algorithms recognize object instances individually and do not consider the high-level relationships among objects in context. This inevitably leads to biased detection results, because the detectors lack the commonsense knowledge that humans often use to assist object identification. In this paper, we present a novel reasoning module that endows current detection systems with commonsense knowledge. Specifically, we use a graph attention network (GAT) to represent the knowledge among objects, which covers both visual and semantic relations. Through iterative updates of the GAT, the object features are enriched. Experiments on the COCO detection benchmark indicate that our knowledge-based reasoning network achieves consistent improvements over various CNN detectors. With ResNet50-FPN as the backbone, we achieve 1.9 and 1.8 points higher Average Precision (AP) than Faster-RCNN and Mask-RCNN, respectively. |
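
The abstract describes object features being enriched through iterative GAT updates but gives no architectural details. The sketch below is only an illustration of that idea, not the authors' implementation. It assumes the standard single-head graph attention formulation (a LeakyReLU score over concatenated projected features, followed by a softmax over neighbours) and omits the output nonlinearity for brevity. The names `gat_update` and `enrich`, the residual addition, the 0/1 `adjacency` matrix standing in for the knowledge graph, and the requirement that `W` be square are all hypothetical choices made for this example.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU used for attention scores in the standard GAT formulation."""
    return x if x > 0 else slope * x


def matvec(W, h):
    """Multiply matrix W (out_dim x in_dim, list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def gat_update(features, adjacency, W, a):
    """One single-head graph attention step over N object features.

    features  : list of N feature vectors (one per detected object)
    adjacency : N x N list of 0/1; adjacency[i][j] == 1 means object j is
                related to object i (assumed stand-in for the knowledge graph)
    W         : projection matrix, out_dim x in_dim
    a         : attention vector of length 2 * out_dim
    """
    projected = [matvec(W, h) for h in features]
    updated = []
    for i, z_i in enumerate(projected):
        # Every node attends at least to itself, so the softmax is never empty.
        neighbours = [j for j, linked in enumerate(adjacency[i]) if linked or j == i]
        # Unnormalised score: LeakyReLU(a . [z_i || z_j])
        scores = [
            leaky_relu(sum(w * x for w, x in zip(a, z_i + projected[j])))
            for j in neighbours
        ]
        # Numerically stable softmax over the neighbourhood.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        # Attention-weighted sum of the neighbours' projected features.
        updated.append([
            sum(alpha * projected[j][k] for alpha, j in zip(alphas, neighbours))
            for k in range(len(z_i))
        ])
    return updated


def enrich(features, adjacency, W, a, steps=2):
    """Apply gat_update repeatedly, adding each message to the features.

    The residual addition is an assumption of this sketch and requires W to
    be square so that the message and the feature have the same length.
    """
    h = features
    for _ in range(steps):
        messages = gat_update(h, adjacency, W, a)
        h = [[x + m for x, m in zip(h_i, m_i)] for h_i, m_i in zip(h, messages)]
    return h
```

In a detector, `features` would correspond to per-region feature vectors and the enriched output would feed the classification and box-regression heads. How the visual and semantic relations are actually encoded is not stated in the abstract.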