He Zhu

I am He Zhu (朱赫), an M.Sc. student in Smart City and Data Sciences at Peking University, advised by Prof. Wenjia Zhang and Prof. Guanhua Chen. I received my B.E. in Computer Science from Southern University of Science and Technology, advised by Prof. Zipei Fan and Prof. Xuan Song.

My research centers on post-training data for LLMs: what makes it good, how to synthesize and select it at scale (FANNO, InstructDiff, Tag-Instruct, AlignDiff), and how to design training objectives that acquire capabilities efficiently without forgetting what the model already knows (ASFT, CFT). I have also applied this perspective to domain-specific foundation models for urban intelligence (PlanGPT, PlanGPT-VL, UrbanClaw). I am currently a research intern at Microsoft Research Asia, GenAI Group, supervised by Dr. Li Dong, and I serve as an Area Chair for EMNLP 2026.

Email  /  CV  /  Google Scholar  /  GitHub  /  Phone

Open to discussion or collaboration. Feel free to drop me an email if you're interested in my research.

News

2026.06: I will serve as an Area Chair for EMNLP 2026.
2026.04: InstructDiff was accepted by ACL 2026 Main.
2026.01: ASFT was accepted by ICLR 2026.
2025.11: Joined MSRA GenAI Group as a Research Intern.
2025.10: PlanGPT-VL was accepted by EMNLP Industry 2025.
2025.08: PlanGPT was selected as an ACL Industry 2025 Oral (Top 1%).
2025.05: Three first-author papers (FANNO, Tag-Instruct, PlanGPT) were accepted by ACL 2025.

Publications & Preprints
* denotes equal contribution.   † denotes corresponding author.

First Author / Corresponding Author

Anchored Supervised Fine-Tuning
He Zhu*, Junyou Su*, Peng Lai, Wenjia Zhang, Linyi Yang, Guanhua Chen†

ICLR, 2026
arXiv / code

Anchored SFT studies fine-tuning objectives that acquire new capabilities while preserving a model's original knowledge.

PlanGPT: Enhancing Urban Planning with a Tailored Language Model
He Zhu, Guanhua Chen, Wenjia Zhang†

ACL Industry, 2025 (Oral, Top 1%)
arXiv / Project Page

The first systematic study of LLMs for urban planning, covering data, models, benchmarks, and applications.

FANNO: Augmenting High-Quality Instruction Data with Open-Sourced LLMs Only
He Zhu, Yifan Ding, Yicheng Tao, Zhiwen Ruan, Yixia Li, Wenjia Zhang, Yun Chen, Guanhua Chen†

ACL Findings, 2025
paper / code

Tag-Instruct: Controlled Instruction Complexity Enhancement
He Zhu, Zhiwen Ruan, Junyou Su, Xingwei He, Yun Chen, Wenjia Zhang, Guanhua Chen†

ACL Findings, 2025
paper / code

Personalized Individual Trajectory Prediction via Meta-Learning
He Zhu, Liyu Zhang, Zipei Fan

SIGSPATIAL, 2022 (Oral)
paper

PlanGPT-VL: Vision-Language Model for Urban Planning Maps
He Zhu*, Junyou Su*, Minxin Chen*, Yun Chen, Guanhua Chen, Wenjia Zhang†

EMNLP Industry, 2025
arXiv / code / Project Page

InstructDiff: Domain-Adaptive Data Selection via Differential Entropy for Efficient LLM Fine-Tuning
Junyou Su*, He Zhu*†, Guanhua Chen†

ACL Main, 2026
arXiv / code

A domain-adaptive data selection method for efficient LLM fine-tuning.

AlignDiff: Exploiting Model-Intrinsic Information for Better Data Selection
Peng Lai*, He Zhu*, Zhiwen Ruan, Dongdong Zhang, Yun Chen, Peng Li, Furu Wei, Yang Liu, Guanhua Chen†

Under Review, 2026
paper

Towards Fair and Comprehensive Evaluation of Routers in Collaborative LLM Systems
Wanxing Wu*, He Zhu*, Yixia Li*, Yun Chen, Guanhua Chen†

Under Review, 2026
paper / code

Can AI Reason Like an Urban Planner? Benchmarking Large Language Models Against Professional Judgment
Yijie Deng*, He Zhu*, Wen Wang, Junyou Su, Minxin Chen, Wenjia Zhang

arXiv, 2026
arXiv / pdf / code

UPBench evaluates whether LLMs can reason with professional planning judgment across planning knowledge and cognitive levels.

PlanBench-V: A Spatial Planning Map Benchmark for Vision-Language Models
Minxin Chen*, He Zhu*, Junyou Su, Wen Wang, Yijie Deng, Wenjia Zhang

arXiv, 2026
arXiv / pdf / code / Project Page

A spatial planning map benchmark for evaluating VLM perception, reasoning, association, and implementation in planning contexts.

Other selected works: Dripper, KDD 2026  ·  CFT, Under Review 2026  ·  Topic Over Source, Under Review 2026  ·  LayAlign, NAACL Findings 2025  ·  ToolExpNet, ACL Findings 2025  ·  HHGNN, ICRA 2024

Projects
PlanGPT Series  ·  plangpt.github.io
I lead the PlanGPT series, a suite of tailored foundation models for urban planning, including PlanGPT, PlanGPT-VL, PlanGPT-R1, UP-Bench, and PlanBench-V. The models have been deployed at planning and design institutes across China to support real-world planning workflows.

UrbanClaw  ·  app.urbanclaw.net
I lead UrbanClaw, an AI-powered urban planning assistant with multi-agent collaboration, tool use, and vision capabilities. It is deployed at planning and design institutes across China for day-to-day planning work.

Education
Peking University  ·  2024–Present
M.Sc. in Smart City & Data Sciences.
Advisors: Prof. Wenjia Zhang & Prof. Guanhua Chen.
Summer Research Intern at UC Berkeley.

Southern University of Science and Technology  ·  2020–2024
B.E. in Computer Science  ·  GPA 90.2/100 (top 10%).
Advisors: Prof. Zipei Fan & Prof. Xuan Song.
Research Assistant at SUSTech-NLP, UTokyo CSIS, and NUS SoC.

Experience
Microsoft Research Asia · GenAI Group, Beijing

• Research Intern, supervised by Dr. Li Dong
• Nov. 2025 to Present

Shanghai AI Laboratory · OpenData Lab, Shanghai

• Research Intern, foundation language models
• May 2025 to Oct. 2025

Previously a Research Intern at SenseTime, Foundation Language Model Center (Jun.–Sep. 2024) and a Project Assistant at LocationMind, Tokyo (Jun.–Dec. 2023).

Honors & Service

Outstanding Youth League Secretary, Peking University  ·  2025
Peking University Student Representative  ·  2025
Outstanding Graduate & Outstanding Thesis, SUSTech CS (Top 5%)  ·  2024
Annual Outstanding Student, SUSTech  ·  2021, 2022, 2023
Area Chair: EMNLP 2026
Reviewer: ACL, EMNLP, NAACL  ·  2024–2026

Layout from the academic homepage of Ruihan Yang and the classic Jon Barron template.