| Paper ID | ARS-7.7 | ||
| Paper Title | Multi-Task Learning by a Top-Down Control Network | ||
| Authors | Hila Levi, Shimon Ullman, Weizmann Institute of Science, Israel | ||
| Session | ARS-7: Image and Video Interpretation and Understanding 2 | ||
| Location | Area H | ||
| Session Time | Wednesday, 22 September, 08:00 - 09:30 | ||
| Presentation Time | Wednesday, 22 September, 08:00 - 09:30 | ||
| Presentation | Poster | ||
| Topic | Image and Video Analysis, Synthesis, and Retrieval: Image & Video Interpretation and Understanding | ||
| Abstract | As the range of tasks performed by a general vision system expands, executing multiple tasks accurately and efficiently in a single network has become an important and still-open problem. Recent computer vision approaches address this problem with branching networks or with channel-wise modulation of the network's feature maps by task-specific vectors. We present a novel architecture that uses a dedicated top-down control network to modify the activations of all units in the main recognition network in a manner that depends on the selected task, the image content, and the spatial location. We show the effectiveness of our scheme by achieving significantly better results than alternative state-of-the-art approaches on four datasets. We further demonstrate its advantages in terms of task selectivity, scaling with the number of tasks, and interpretability. | ||
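
The abstract contrasts two ways of conditioning a network on a task: channel-wise modulation with a task-specific vector, and modulation of every unit depending on task, image content, and spatial location. The toy sketch below illustrates only that contrast on nested Python lists. It is not the authors' implementation: the function names, the sigmoid gain, and the formula inside `toy_control_map` are all hypothetical stand-ins for what would be a learned control network.

```python
import math


def channel_wise_modulation(features, task_vector):
    """Scale every location of channel c by the same task-specific gain."""
    return [
        [[value * task_vector[c] for value in row] for row in channel]
        for c, channel in enumerate(features)
    ]


def toy_control_map(features, task_id):
    """Hypothetical stand-in for a control network (assumption, not from the paper).

    Produces one gain per unit. Each gain depends on the task id, the
    feature value (image content), and the (y, x) position.
    """
    return [
        [
            [
                1.0 / (1.0 + math.exp(-(value + 0.1 * (y + x) + task_id - c)))
                for x, value in enumerate(row)
            ]
            for y, row in enumerate(channel)
        ]
        for c, channel in enumerate(features)
    ]


def element_wise_modulation(features, control_map):
    """Scale each unit (channel, y, x) by its own gain."""
    return [
        [
            [value * gain for value, gain in zip(f_row, g_row)]
            for f_row, g_row in zip(f_channel, g_channel)
        ]
        for f_channel, g_channel in zip(features, control_map)
    ]


# Two channels, each a 2x2 feature map.
features = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.5, 0.5], [0.5, 0.5]],
]

# Channel-wise: one gain per channel, identical at every location.
channel_out = channel_wise_modulation(features, task_vector=[2.0, 0.0])

# Element-wise: one gain per unit, varying with task, content, and location.
control_task_1 = toy_control_map(features, task_id=1)
control_task_0 = toy_control_map(features, task_id=0)
element_out = element_wise_modulation(features, control_task_1)
```

In this sketch the channel-wise variant can only switch whole channels up or down, while the element-wise variant assigns a different gain to each location even where the input features are identical, and the gains change when the task id changes.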