詹锟 | Kun Zhan

Senior Director of VLA Models at Li Auto | Autonomous Driving Expert | AI Researcher

Email: zk_1028@aliyun.com

WeChat: KevinZhan1990

Beijing, China

🔭 About Me

I’m Kun Zhan, Senior Director of the Vision-Language-Action (VLA) team at Li Auto and Site Manager of Li Auto’s U.S. R&D Center in Silicon Valley.

I received my bachelor’s degree in Automation from Beihang University. I joined Baidu Apollo in 2017, leading work on behavior prediction. In 2021, I joined Li Auto to build our autonomous-driving technology stack from the ground up. Since then, our team has delivered a series of milestones — Highway NoA (2022), City NoA (2023), End-to-End + VLM dual-system architecture (2024), and the VLA framework (2025).

Over the years, I’ve led development across the full autonomous-driving pipeline — from behavior prediction and static/dynamic perception to large-scale foundation-model research in world models, VLMs, and reinforcement learning. Our team pioneered the dual-system architecture that unifies large-model cognition with engineering reliability.

My long-term mission is to realize physical-world AGI; autonomous driving is only the first step toward more intelligent, dependable, and embodied AI systems.

🌟 Research Interests

  • Autonomous Driving: Vision-Language-Action (VLA) models, end-to-end autonomous driving systems, decision-making and planning
  • Computer Vision: Object detection and tracking, scene understanding
  • 3D Vision: 3D perception, reconstruction, and modeling
  • Large Language Models: Applications of multimodal large models in autonomous driving
  • World Models: Environment modeling and prediction, reinforcement learning

💼 Work Experience

Li Auto | Apr 2021 – Present

Head of VLA Models, VLA Algorithm Owner

  • Direct an organization of more than 100 people covering perception, planning, foundation-model training, and on-vehicle inference.
  • Delivered three generations of AI stacks (E2E → VLM → VLA) into mass-production pipelines.
  • Established and lead a dedicated world-model group for reinforcement learning and closed-loop simulation.
  • Previously headed the Prediction, Static BEV, Driving Perception, and Large-Model teams.

Site Manager, U.S. R&D Center (San Jose, CA)

  • Built Li Auto’s overseas research hub.
  • Own local strategy, budgeting, and cross-border collaboration with Beijing HQ, accelerating global talent acquisition and technology transfer.

Baidu | Apr 2016 – Mar 2021

Autonomous-Driving Engineering, Apollo L4 Team

  • Served as Algorithm Lead for Prediction; designed L4 prediction pre-decision algorithms for robo-taxi operations.
  • Developed planning-and-control (PnC) algorithms and deep-learning onboard modules for L4 pilots.

📚 Academic Achievements

Citation Statistics

  • Top Papers: 36
  • Total Citations: 868
  • h-index: 12
  • i10-index: 14

Selected Publications

  1. DriveVLM: The convergence of autonomous driving and large vision-language models (2024). X Tian, J Gu, B Li, Y Liu, Y Wang, Z Zhao, K Zhan, P Jia, X Lang, H Zhao. arXiv preprint arXiv:2402.12289 | Citations: 308

  2. Street Gaussians: Modeling dynamic urban scenes with Gaussian splatting (2024). Y Yan, H Lin, C Zhou, W Wang, H Sun, K Zhan, X Lang, X Zhou, S Peng. European Conference on Computer Vision, pp. 156-173 | Citations: 250

  3. PlanAgent: A multi-modal large language agent for closed-loop vehicle motion planning (2024). Y Zheng, Z Xing, Q Zhang, B Jin, P Li, Y Zheng, Z Xia, K Zhan, X Lang, D Zhao. arXiv preprint arXiv:2406.01587 | Citations: 37

  4. Unleashing generalization of end-to-end autonomous driving with controllable long video generation (2024). E Ma, L Zhou, T Tang, Z Zhang, D Han, J Jiang, K Zhan, P Jia, X Lang, K Yu. arXiv preprint arXiv:2406.01349 | Citations: 33

  5. ReconDreamer: Crafting world models for driving scene reconstruction via online restoration (2025). C Ni, G Zhao, X Wang, Z Zhu, W Qin, G Huang, C Liu, Y Chen, Y Wang, K Zhan, X Lang, X Wang, W Mei. Proceedings of the Computer Vision and Pattern Recognition Conference, pp. 1559-1569 | Citations: 29

Patents

  • 18 Chinese Patents
  • 2 US Patents

Academic Service

  • Program Committee/Reviewer: CVPR, ICCV, ECCV, NeurIPS, AAAI, IROS
  • Journal Reviewer: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), IEEE Transactions on Intelligent Transportation Systems (T-ITS), IEEE Transactions on Intelligent Vehicles (T-IV)
  • Workshop Organizer: Autonomous Driving Workshop at CVPR 2023