Abstract: Vision-language models (VLMs) can solve complex tasks such as visual question answering by integrating visual and linguistic information. Their performance has improved significantly with ...
1 University of Science and Technology of China 2 WeChat, Tencent Inc. 1. A Novel Parameter Space Alignment Paradigm Recent MLLMs follow an input space alignment paradigm that aligns visual features ...
Abstract: Visual target navigation is a critical capability for autonomous robots operating in unknown environments, particularly in human-robot interaction scenarios. While classical and ...
Yele Akin-Johnson, a Nigerian visual artist, is at the forefront of global digital culture, re-imagining African visual language in a global digital economy. Akin-Johnson is working at the porous ...