He Zhu

I am He Zhu (朱赫), an M.Sc. student in Smart City and Data Sciences at Peking University, advised by Prof. Wenjia Zhang and Prof. Guanhua Chen. I received my B.E. in Computer Science from Southern University of Science and Technology, advised by Prof. Zipei Fan and Prof. Xuan Song.

My research centers on post-training data for LLMs: what makes it good, how to synthesize and select it at scale (FANNO, InstructDiff, Tag-Instruct, AlignDiff), and how to design training objectives that acquire capabilities efficiently without forgetting what the model already knows (ASFT, CFT). I have also applied this perspective to domain-specific foundation models for urban intelligence (PlanGPT, PlanGPT-VL, UrbanClaw). I am currently a research intern at Microsoft Research Asia, GenAI Group, supervised by Dr. Li Dong, and I serve as an Area Chair for EMNLP 2026.

Email / CV / Google Scholar / GitHub / Phone

Open to discussion or collaboration. Feel free to drop me an email if you're interested in my research.

News

2026.06: I will serve as an Area Chair for EMNLP 2026.
2026.04: InstructDiff was accepted to the ACL 2026 main conference.
2026.01: ASFT was accepted to ICLR 2026.
2025.11: Joined MSRA GenAI Group as a Research Intern.
2025.10: PlanGPT-VL was accepted to EMNLP Industry 2025.
2025.08: PlanGPT was selected as an ACL Industry 2025 Oral (Top 1%).
2025.05: Three first-author papers (FANNO, Tag-Instruct, PlanGPT) were accepted to ACL 2025.

Publications & Preprints

* denotes equal contribution. † denotes corresponding author.

First Author / Corresponding Author

Anchored Supervised Fine-Tuning
He Zhu*, Junyou Su*, Peng Lai, Wenjia Zhang, Linyi Yang, Guanhua Chen†

ICLR, 2026
arXiv / code

Anchored SFT studies fine-tuning objectives that acquire new capabilities while preserving a model's original knowledge.

PlanGPT: Enhancing Urban Planning with a Tailored Language Model
He Zhu, Guanhua Chen, Wenjia Zhang†

ACL Industry, 2025 (Oral, Top 1%)
arXiv / Project Page

The first systematic study of LLMs for urban planning, covering data, models, benchmarks, and applications.

FANNO: Augmenting High-Quality Instruction Data with Open-Sourced LLMs Only
He Zhu, Yifan Ding, Yicheng Tao, Zhiwen Ruan, Yixia Li, Wenjia Zhang, Yun Chen, Guanhua Chen†

ACL Findings, 2025
paper / code

Tag-Instruct: Controlled Instruction Complexity Enhancement
He Zhu, Zhiwen Ruan, Junyou Su, Xingwei He, Yun Chen, Wenjia Zhang, Guanhua Chen†

ACL Findings, 2025
paper / code

Personalized Individual Trajectory Prediction via Meta-Learning
He Zhu, Liyu Zhang, Zipei Fan

SIGSPATIAL, 2022 (Oral)
paper

PlanGPT-VL: Vision-Language Model for Urban Planning Maps
He Zhu*, Junyou Su*, Minxin Chen*, Yun Chen, Guanhua Chen, Wenjia Zhang†

EMNLP Industry, 2025
arXiv / code / Project Page

InstructDiff: Domain-Adaptive Data Selection via Differential Entropy for Efficient LLM Fine-Tuning
Junyou Su*, He Zhu*†, Guanhua Chen†

ACL (Main), 2026
arXiv / code

A domain-adaptive data selection method for efficient LLM fine-tuning.

AlignDiff: Exploiting Model-Intrinsic Information for Better Data Selection
Peng Lai*, He Zhu*, Zhiwen Ruan, Dongdong Zhang, Yun Chen, Peng Li, Furu Wei, Yang Liu, Guanhua Chen†

Under Review, 2026
paper

Towards Fair and Comprehensive Evaluation of Routers in Collaborative LLM Systems
Wanxing Wu*, He Zhu*, Yixia Li*, Yun Chen, Guanhua Chen†

Under Review, 2026
paper / code

Can AI Reason Like an Urban Planner? Benchmarking Large Language Models Against Professional Judgment
Yijie Deng*, He Zhu*, Wen Wang, Junyou Su, Minxin Chen, Wenjia Zhang

arXiv, 2026
arXiv / pdf / code

UPBench evaluates whether LLMs can reason with professional planning judgment across planning knowledge and cognitive levels.

PlanBench-V: A Spatial Planning Map Benchmark for Vision-Language Models
Minxin Chen*, He Zhu*, Junyou Su, Wen Wang, Yijie Deng, Wenjia Zhang

arXiv, 2026
arXiv / pdf / code / Project Page

A spatial planning map benchmark for evaluating VLM perception, reasoning, association, and implementation in planning contexts.

Other selected works: Dripper, KDD 2026 · CFT, Under Review 2026 · Topic Over Source, Under Review 2026 · LayAlign, NAACL Findings 2025 · ToolExpNet, ACL Findings 2025 · HHGNN, ICRA 2024

Projects

PlanGPT Series · plangpt.github.io
I lead the PlanGPT series, a suite of tailored foundation models for urban planning, including PlanGPT, PlanGPT-VL, PlanGPT-R1, UP-Bench, and PlanBench-V. The models have been deployed at planning and design institutes across China to support real-world planning workflows.

UrbanClaw · app.urbanclaw.net
I lead UrbanClaw, an AI-powered urban planning assistant with multi-agent collaboration, tool use, and vision capabilities. It is deployed at planning and design institutes across China for day-to-day planning work.

Education

Peking University · 2024–Present
M.Sc. in Smart City & Data Sciences.
Advisors: Prof. Wenjia Zhang & Prof. Guanhua Chen.
Summer Research Intern at UC Berkeley.

Southern University of Science and Technology · 2020–2024
B.E. in Computer Science · GPA 90.2/100 (top 10%).
Advisors: Prof. Zipei Fan & Prof. Xuan Song.
Research Assistant at SUSTech-NLP, UTokyo CSIS, and NUS SoC.

Experience

Microsoft Research Asia · GenAI Group, Beijing

• Research Intern, supervised by Dr. Li Dong
• Nov. 2025 to Present

Shanghai AI Laboratory · OpenData Lab, Shanghai

• Research Intern, foundation language models
• May 2025 to Oct. 2025

Previously a Research Intern at SenseTime, Foundation Language Model Center (Jun.–Sep. 2024) and a Project Assistant at LocationMind, Tokyo (Jun.–Dec. 2023).

Honors & Service

Outstanding Youth League Secretary, Peking University 2025

Peking University Student Representative 2025

Outstanding Graduate & Outstanding Thesis, SUSTech CS (Top 5%) 2024

Annual Outstanding Student, SUSTech 2021, 2022, 2023

Area Chair: EMNLP 2026

Reviewer: ACL, EMNLP, NAACL 2024–2026

Layout from the academic homepage of Ruihan Yang and the classic Jon Barron template.