🤖 About Me

I’m a Scientist at the Agency for Science, Technology and Research (A*STAR) in Singapore.
My research focuses on perception and planning in embodied AI, aiming to empower autonomous systems with the ability to perceive, understand, and navigate complex 3D environments. I work at the intersection of 3D scene understanding, motion planning, and vision-language models, with the goal of bridging visual perception and intelligent decision-making. I have authored over 20 peer-reviewed papers in leading venues, including TPAMI, CVPR, ICCV, RA-L, ICRA, and IROS.

I’m always open to collaboration and in-depth discussions. If you’re interested in my research, feel free to get in touch!

🔥 News

2026: One paper was accepted to ICLR, and one paper was accepted to ICASSP.
2025-10: We won 2nd place in the Challenge of Multimodal Robot Learning in InternUtopia and Real World, and 1st place in its simulation stage, at IROS 2025.
2025: Two papers were accepted to ICCV, and one paper was accepted to CVPR.
2024-04: I successfully defended my PhD thesis and began my new role as a Scientist at A*STAR.

📖 Experience

Education

2019.10 ‑ 2024.4, Doctor
Supervisor: Prof. Dr. Juergen Gall
University of Bonn @ Bonn, Germany

2016.09 ‑ 2019.06, Master
Supervisor: Prof. Dr. Ming-Ming Cheng
Nankai University @ Tianjin, China

2012.09 ‑ 2016.06, Bachelor
Supervisor: Prof. Dr. Hong Cheng
University of Electronic Science and Technology of China @ Chengdu, China

Working Experience

2024.03 ‑ Present, Scientist
Agency for Science, Technology and Research (A*STAR) @ Singapore

2023.07 ‑ 2023.11, Research Intern
Qualcomm AI Research @ Amsterdam, Netherlands

2023.1 ‑ 2023.5, Research Intern
Intel Labs @ Munich, Germany

2018.10 ‑ 2018.12, Visiting Student
Technical University of Munich @ Munich, Germany

2018.5 ‑ 2018.8, Research Intern
Alibaba DAMO Academy @ Beijing, China

2017.6 ‑ 2018.2, Research Intern
UISEE @ Beijing, China

📝 Publications

Preprint

[arXiv 2025] SpatialReasoner: Active Perception for Large-Scale 3D Scene Understanding [PDF]

Hongpei Zheng, Shijie Li, Yanran Li, Hujun Yin
[arXiv 2025] Thinking Ahead: Foresight Intelligence in MLLMs and World Models [PDF]

Zhantao Gong, Liaoyuan Fan, Qing Guo, Xun Xu, Xulei Yang, Shijie Li
[arXiv 2025] MonoSR: Open-Vocabulary Spatial Reasoning from Monocular Images [PDF]

Qirui Wang, Jingyi He, Yining Pan, Si Yong Yeo, Xulei Yang, Shijie Li
[arXiv 2025] Improving the Generalization of Segmentation Foundation Models via Weakly-Supervised and Unsupervised Adaptation

Haojie Zhang, Yongyi Su, Nanqing Liu, Shijie Li, Xulei Yang, Xiangyu Yue, Kui Jia, Xun Xu
[arXiv 2025] DiffPCN: Latent Diffusion Model Based on Multi-view Depth Images for Point Cloud Completion [PDF]

Zijun Li, Hongyu Yan, Shijie Li, Kunming Luo, Li Lu, Xulei Yang, Weisi Lin
[arXiv 2025] Multi-View Industrial Anomaly Detection with Epipolar Constrained Cross-View Fusion [PDF]

Yifan Liu, Xun Xu, Shijie Li, Jingyi Liao, Xulei Yang

2026

[ICASSP 2026] SAM-Guided Multi-View Fusion for Weakly Supervised 3D Point Cloud Segmentation

Yuena Qiao, Nanqing Liu, Yongyi Su, Shijie Li, Xulei Yang, Bihan Wen, Nancy Chen, Tianrui Li, Xun Xu
[ICLR 2026] Patch-as-Decodable-Token: Towards Unified Multi-Modal Vision Tasks in MLLMs [PDF] [code]

Yongyi Su, Haojie Zhang, Shijie Li, Nanqing Liu, Jingyi Liao, Junyi Pan, Yuan Liu, Xiaofen Xing, Chong Sun, Chen Li, Nancy F. Chen, Shuicheng Yan, Xulei Yang, Xun Xu

2025

[ICCV 2025] Global-Aware Monocular Semantic Scene Completion with State Space Models [PDF] [code]

Shijie Li, Zhongyao Cheng, Rong Li, Shuai Li, Juergen Gall, Xun Xu, Xulei Yang
[ICCV 2025] Future-Aware Interaction Network For Motion Forecasting [PDF] [code]

Shijie Li, Xun Xu, Si Yong Yeo, Xulei Yang
[CVPR 2025] SeeGround: See and Ground for Zero-Shot Open-Vocabulary 3D Visual Grounding [PDF] [code]

Rong Li, Shijie Li, Lingdong Kong, Xulei Yang, Junwei Liang
[WACV 2025] VaLID: Variable-Length Input Diffusion for Novel View Synthesis [PDF]

Shijie Li, Farhad G. Zanjani, Haitam Ben Yahia, Yuki M. Asano, Juergen Gall, Amirhossein Habibian
[Neurocomputing] SiFH: Siamese Frequency Harmonization Self-Supervised Learning for Motion Forecasting [PDF]

Chunyu Liu, Zeyu Liu, Tiechui Yao, Shijie Li

2024

[CVPRW 2024] TFNet: Exploiting Temporal Cues for Fast and Accurate LiDAR Semantic Segmentation [PDF]

Rong Li, Shijie Li, Xieyuanli Chen, Teli Ma, Juergen Gall, Junwei Liang

2023

[TPAMI] MS-TCN++: Multi-Stage Temporal Convolutional Network for Action Segmentation [PDF] [code]

Shijie Li, Yazan Abu Farha, Yun Liu, Ming-Ming Cheng, Juergen Gall
[CCDC 2023] Foresight Social-aware Reinforcement Learning for Robot Navigation [PDF]

Yanying Zhou, Shijie Li, Jochen Garcke

2022

[BMVC 2022] Dual Pyramid Generative Adversarial Networks for Semantic Image Synthesis [PDF] [code]

Shijie Li, Ming-Ming Cheng, Juergen Gall

2021

[RA-L 2021] Moving object segmentation in 3D LiDAR data: A learning-based approach exploiting sequential data [PDF] [code]

Xieyuanli Chen, Shijie Li, Benedikt Mersch, Lukas Wiesmann, Juergen Gall, Jens Behley, Cyrill Stachniss
[RA-L 2021] Multi-scale interaction for real-time lidar data segmentation on an embedded platform [PDF] [code]

Shijie Li, Xieyuanli Chen, Yun Liu, Dengxin Dai, Cyrill Stachniss, Juergen Gall
[TNNLS] Rethinking 3-D LiDAR point cloud segmentation [PDF] [code]

Shijie Li, Yun Liu, Juergen Gall
[ICCV 2021] Spatial-temporal consistency network for low-latency trajectory forecasting [PDF]

Shijie Li, Yanying Zhou, Jinhui Yi, Juergen Gall

2020

[AAAI 2020] MiniSeg: An Extremely Minimum Network for Efficient COVID-19 Segmentation [PDF] [code]

Yu Qiu, Yun Liu, Shijie Li, Jing Xu
[RA-L 2020] Pose Refinement Graph Convolutional Network for Skeleton-based Action Recognition [PDF] [code]

Shijie Li, Jinhui Yi, Yazan Abu Farha, Juergen Gall
[Neurocomputing] Refinedbox: Refining for fewer and high-quality object proposals [PDF]

Yun Liu, Shijie Li, Ming-Ming Cheng

2019

[Frontiers of Computer Science] Joint salient object detection and existence prediction [PDF]

Huaizu Jiang, Ming-Ming Cheng, Shijie Li, Ali Borji, Jingdong Wang

2018

[IJCAI 2018] Del: Deep embedding learning for efficient image segmentation [PDF] [code]

Yun Liu, Peng-Tao Jiang, Vahan Petrosyan, Shijie Li, Jiawang Bian, Le Zhang, Ming-Ming Cheng
[ICRA 2018] Direct line guidance odometry [PDF]

Shijie Li, Bo Ren, Yun Liu, Ming-Ming Cheng, Duncan Frost, Victor Adrian Prisacariu
[IROS 2018] Structured skip list: A compact data structure for 3D reconstruction [PDF]

Shijie Li, Ming-Ming Cheng, Yun Liu, Shao-Ping Lu, YaHui Wang, Victor Adrian Prisacariu

🌏 Misc.

I enjoy exploring different places—here’s a list of countries I’ve visited so far 😊

Asia: 🇨🇳 🇸🇬 🇲🇾 🇮🇩 🇶🇦 🇦🇪 🇰🇷
Europe: 🇸🇪 🇵🇱 🇬🇧 🇨🇿 🇦🇹 🇮🇹 🇲🇨 🇫🇷 🇵🇹 🇪🇸 🇨🇭 🇧🇪 🇩🇪 🇳🇱 🇱🇺
North America: 🇺🇸
Oceania: 🇦🇺

Shijie Li

李仕杰