I am a final-year PhD student in the Computer Vision Group at the School of Electronic Engineering and Computer Science, Queen Mary University of London, supervised by Prof. Shaogang (Sean) Gong.

Before that, I received my M.S. degree from IIE, Chinese Academy of Sciences, in Beijing, China, under the supervision of Prof. Yu Zhou.

My research interests lie in computer vision and machine learning, with a focus on learning dynamic visual environments through natural language for understanding, generation, and reasoning.

Selected Publications

ViMo: A Generative Visual GUI World Model for App Agent [Project Page]
Dezhao Luo*, Bohan Tang*, Kang Li, Georgios Papoudakis, Jifei Song, Shaogang Gong, Jianye Hao, Jun Wang, Kun Shao
Under review, 2025.

Beyond Syntax: Action Semantics Learning for App Agents [Paper]
Bohan Tang*, Dezhao Luo*, Jingxuan Chen, Shaogang Gong, Jianye Hao, Jun Wang, Kun Shao
Under review, 2025.

Generative Video Diffusion for Unseen Cross-Domain Video Moment Retrieval [Project Page]
Dezhao Luo, Shaogang Gong, Jiabo Huang, Hailin Jin, Yang Liu
Proceedings of the AAAI Conference on Artificial Intelligence, 2025 (AAAI'25).

The Role of Video Generation in Enhancing Data-Limited Action Understanding [Paper]
Wei Li, Dezhao Luo, Dongbao Yang, Zhenhang Li, Weiping Wang, Yu Zhou
Proceedings of the Thirty‐Fourth International Joint Conference on Artificial Intelligence, 2025 (IJCAI'25).

Zero-Shot Video Moment Retrieval from Frozen Vision-Language Models
Dezhao Luo, Jiabo Huang, Shaogang Gong, Hailin Jin, Yang Liu
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024: 5464–5473 (WACV'24).

Towards Generalisable Video Moment Retrieval: Visual-Dynamic Injection to Image-Text Pre-Training
Dezhao Luo, Jiabo Huang, Shaogang Gong, Hailin Jin, Yang Liu
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023: 23045–23055 (CVPR'23).

Exploring Relations in Untrimmed Videos for Self-Supervised Learning
Dezhao Luo, Bo Fang, Yu Zhou, Yucan Zhou, Dayan Wu, Weiping Wang
ACM Transactions on Multimedia Computing, Communications, and Applications, 2022, 18(1s): 1–21 (TOMM'22).

Video Cloze Procedure for Self-Supervised Spatio-Temporal Learning
Dezhao Luo, Chang Liu, Yu Zhou, Dongbao Yang, Can Ma, Qixiang Ye, Weiping Wang
Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34(07): 11701–11708 (AAAI'20, Oral presentation).

Video Playback Rate Perception for Self-Supervised Spatio-Temporal Representation Learning
Yuan Yao, Chang Liu, Dezhao Luo, Yu Zhou, Qixiang Ye
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020: 6548–6557 (CVPR'20).