About Me

Hi! I’m a Research Scientist at Orby AI. Previously, I worked at Google Research / DeepMind.
Research Interests
My recent focus is multimodal LLM post-training, specifically multi-turn reinforcement learning (RL) for GUI agents. In the past, I worked on multimodal LLMs for GUI understanding and user modeling.
Selected Publications
- Gang Li and Yang Li. Spotlight: Mobile UI understanding using vision-language models with a focus. In The Eleventh International Conference on Learning Representations, 2023. [Google Research Blog].
- Youwei Liang, Junfeng He, Gang Li*, Peizhao Li, Arseniy Klimovskiy, Nicholas Carolan, Jiao Sun, Jordi Pont-Tuset, Sarah Young, Feng Yang, Junjie Ke, Krishnamurthy Dj Dvijotham, Katie Collins, Yiwen Luo, Yang Li, Kai J Kohlhoff, Deepak Ramachandran, and Vidhya Navalpakkam. Rich human feedback for text-to-image generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024. *Co-first authors. Best Paper Award. [Google Research Blog].
- Bryan Wang, Gang Li, and Yang Li. Enabling conversational interaction with mobile UI using large language models. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, pages 1–17, 2023. [Google Research Blog].
The full list of my publications is available on my Google Scholar page.