詹锟 | Kun Zhan
Senior Director of VLA Models at Li Auto
Email: zk_1028@aliyun.com
WeChat: KevinZhan1990
Beijing, China
🔭 About Me
I’m Kun Zhan, Senior Director of Li Auto’s Vision-Language-Action (VLA) organization and Site Manager of the company’s Silicon Valley R&D center. I build scalable teams that take frontier research in multimodal models, reinforcement learning, and world modeling all the way into mass-production vehicles.
My journey started with a master’s degree in Automation from Beihang University, after which I led Baidu Apollo’s behavior prediction team. Since joining Li Auto in 2021, I have been responsible for architecting and deploying three generations of autonomous-driving stacks, culminating in the current VLA framework that unifies large-model cognition with automotive-grade reliability.
My mission is to realize physical-world AGI, with autonomous driving as the proving ground for safe, embodied intelligence.
Key Highlights
- Scaled leadership: Direct a 100+ person global org spanning perception, prediction, planning, simulation, and foundation-model training.
- Production impact: Delivered Highway NoA (2022), City NoA (2023), End-to-End + VLM dual-system (2024), and the new VLA stack (2025) to mass-produced Li Auto vehicles.
- Global footprint: Built Li Auto’s U.S. research hub, aligning Silicon Valley exploration with Beijing HQ execution.
🌟 Research Interests
- Autonomous Driving: Vision-Language-Action (VLA) models, end-to-end driving systems, decision-making and planning
- Computer Vision: Object detection/tracking, scene understanding, BEV perception
- 3D & World Models: Dynamic scene reconstruction, generative simulation, reinforcement learning at fleet scale
- Multimodal LLMs: Applying large vision-language models to cognition, planning, and driver-vehicle interaction
💼 Work Experience
Li Auto | Apr 2021 – Present
Senior Director, Head of VLA Models & Algorithm Owner
- Built Li Auto’s end-to-end autonomous stack from scratch, progressing from E2E → VLM → VLA architectures that now operate on hundreds of thousands of customer vehicles.
- Lead and mentor a 100+ member org across perception, planning, foundation models, simulation, and on-vehicle deployment, fostering a strong applied-research culture.
- Established a dedicated world-model and RL group to accelerate closed-loop learning and cut real-world testing costs by a double-digit percentage.
Site Manager, U.S. R&D Center (San Jose, CA)
- Launched Li Auto’s overseas research hub, covering local strategy, budgeting, and talent acquisition.
- Bridge Silicon Valley innovation with Beijing execution by running cross-border program reviews, ensuring roadmaps stay synchronized.
Baidu Apollo | Apr 2016 – Mar 2021
Algorithm Lead, L4 Prediction & Planning
- Led L4 prediction and pre-decision algorithms for robo-taxi pilots, improving motion-forecasting reliability in complex urban interactions.
- Shipped planning-and-control modules and deep-learning onboard components that powered Baidu’s autonomous fleets in Beijing and Guangzhou.
📚 Academic Achievements
Citation Snapshot
- Top papers: 39
- Total citations: 969
- h-index: 12
- i10-index: 15
Selected Publications
- DriveVLM: The convergence of autonomous driving and large vision-language models (arXiv 2024)
X. Tian, J. Gu, B. Li, Y. Liu, Y. Wang, Z. Zhao, K. Zhan, P. Jia, X. Lang, H. Zhao
- Street Gaussians: Modeling dynamic urban scenes with Gaussian splatting (ECCV 2024)
Y. Yan, H. Lin, C. Zhou, W. Wang, H. Sun, K. Zhan, X. Lang, X. Zhou, S. Peng
- PlanAgent: A multimodal LLM agent for closed-loop vehicle motion planning (arXiv 2024)
Y. Zheng, Z. Xing, Q. Zhang, B. Jin, P. Li, Y. Zheng, Z. Xia, K. Zhan, X. Lang, D. Zhao
- Unleashing generalization of end-to-end autonomous driving with controllable long video generation (arXiv 2024)
E. Ma, L. Zhou, T. Tang, Z. Zhang, D. Han, J. Jiang, K. Zhan, P. Jia, X. Lang, K. Yu
- ReconDreamer: Crafting world models for driving scene reconstruction via online restoration (CVPR 2025)
C. Ni, G. Zhao, X. Wang, Z. Zhu, W. Qin, G. Huang, C. Liu, Y. Chen, Y. Wang, K. Zhan, X. Lang, X. Wang, W. Mei
Full publication list available on Google Scholar.
Patents & Service
- 20 patents granted: 18 CN and 2 US, covering perception, planning, and HD-mapping pipelines.
- Reviewer: CVPR, ICCV, ECCV, NeurIPS, AAAI, IROS; journals: TPAMI, T-ITS, T-IV.
- Organizer: CVPR 2023 Autonomous Driving Workshop; frequent speaker on VLA model deployment.