詹锟 | Kun Zhan

Email: zk_1028@aliyun.com

WeChat: KevinZhan1990

Beijing, China


🔭 About Me

I’m Kun Zhan, Senior Director of Li Auto’s Vision-Language-Action (VLA) organization and Site Manager of the company’s Silicon Valley R&D center. I build scalable teams that take frontier research in multimodal models, reinforcement learning, and world modeling all the way into mass-production vehicles.

My journey started with a master’s degree in Automation from Beihang University, followed by leading Baidu Apollo’s behavior prediction team. Since joining Li Auto in 2021, I have architected and deployed three generations of autonomous-driving stacks, culminating in the current VLA framework that unifies large-model cognition with automotive-grade reliability.

My mission is to realize physical-world AGI, with autonomous driving as the proving ground for safe, embodied intelligence.

Key Highlights

Scaled leadership: Direct a 100+ person global org spanning perception, prediction, planning, simulation, and foundation-model training.
Production impact: Delivered Highway NoA (2022), City NoA (2023), End-to-End + VLM dual-system (2024), and the new VLA stack (2025) to mass-produced Li Auto vehicles.
Global footprint: Built Li Auto’s U.S. research hub, aligning Silicon Valley exploration with Beijing HQ execution.

🌟 Research Interests

Autonomous Driving: Vision-Language-Action (VLA) models, end-to-end driving systems, decision-making and planning
Computer Vision: Object detection/tracking, scene understanding, BEV perception
3D & World Models: Dynamic scene reconstruction, generative simulation, reinforcement learning at fleet scale
Multimodal LLMs: Applying large vision-language models to cognition, planning, and driver-vehicle interaction

💼 Work Experience

Li Auto | Apr 2021 – Present

Senior Director, Head of VLA Models & Algorithm Owner

Built Li Auto’s end-to-end autonomous stack from scratch, progressing from E2E → VLM → VLA architectures that now operate on hundreds of thousands of customer vehicles.
Assemble and mentor a 100+ member org across perception, planning, foundation models, simulation, and on-vehicle deployment with a strong applied-research culture.
Established a dedicated world-model and RL group to accelerate closed-loop learning and reduce real-world testing costs by a double-digit percentage.

Site Manager, U.S. R&D Center (San Jose, CA)

Launched Li Auto’s overseas research hub, covering local strategy, budgeting, and talent acquisition.
Bridge Silicon Valley innovation with Beijing execution by running cross-border program reviews, ensuring roadmaps stay synchronized.

Baidu Apollo | Apr 2016 – Mar 2021

Algorithm Lead, L4 Prediction & Planning

Led the L4 prediction pre-decision algorithms for robo-taxi pilots, improving motion-forecasting reliability for complex urban interactions.
Shipped planning-and-control modules and deep-learning onboard components that powered Baidu’s autonomous fleets in Beijing and Guangzhou.

📚 Academic Achievements

Citation Snapshot

Top papers: 45
Total citations: 1130
h-index: 15
i10-index: 17

Selected Publications

DriveVLM: The convergence of autonomous driving and large vision-language models (CoRL 2024)
X. Tian, J. Gu, B. Li, Y. Liu, Y. Wang, Z. Zhao, K. Zhan, P. Jia, X. Lang, H. Zhao.
Street Gaussians: Modeling dynamic urban scenes with Gaussian splatting (ECCV 2024)
Y. Yan, H. Lin, C. Zhou, W. Wang, H. Sun, K. Zhan, X. Lang, X. Zhou, S. Peng.
ReconDreamer: Crafting world models for driving scene reconstruction via online restoration (CVPR 2025)
C. Ni, G. Zhao, X. Wang, Z. Zhu, W. Qin, G. Huang, C. Liu, Y. Chen, Y. Wang, K. Zhan, X. Lang, X. Wang, W. Mei.
PlanAgent: A multimodal LLM agent for closed-loop vehicle motion planning (IEEE ITS 2024)
Y. Zheng, Z. Xing, Q. Zhang, B. Jin, P. Li, Y. Zheng, Z. Xia, K. Zhan, X. Lang, D. Zhao.
Unleashing generalization of end-to-end autonomous driving with controllable long video generation (arXiv 2024)
E. Ma, L. Zhou, T. Tang, Z. Zhang, D. Han, J. Jiang, K. Zhan, P. Jia, X. Lang, K. Yu.

Full publication list available on Google Scholar.

Patents & Service

20 patents granted (18 CN, 2 US) across perception, planning, and HD mapping pipelines.
Reviewer: CVPR, ICCV, ECCV, NeurIPS, AAAI, IROS (conferences); TPAMI, T-ITS, T-IV (journals).
Organizer: CVPR 2023 Autonomous Driving Workshop; frequent speaker on VLA model deployment.