Research
My research goal is to equip AI with strong reasoning capabilities so that it can reliably help humans solve real-world problems and, eventually, accelerate scientific discovery. My current focus is on investigating:
(1) The intrinsic bottlenecks of AI reasoning, from both architectural and algorithmic perspectives. [ICLR 2025]
(2) Improving AI reasoning through post-training, especially with reinforcement learning. [ICLR 2024]
Previously, I worked on representation learning and neuro-symbolic learning during my undergraduate studies.
Highlighted Recent Publications (See full list on Google Scholar)
* denotes equal contribution
Eliminating Position Bias of Language Models: A Mechanistic Approach
Ziqi Wang, Hanlin Zhang, Xiner Li, Kuan-Hao Huang, Chi Han, Shuiwang Ji, Sham M. Kakade, Hao Peng, Heng Ji
ICLR, 2025
Paper / Twitter
We propose a method to eliminate position bias in LMs, which helps them reason better.
Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-Constraint
Wei Xiong*, Hanze Dong*, Chenlu Ye*, Ziqi Wang, Han Zhong, Heng Ji, Nan Jiang, Tong Zhang
ICML, 2024; ICLR ME-FoMo, 2024 (Oral Presentation)
Paper / Twitter
On-policy sampling matters for Direct Preference Optimization!
Enabling Language Models to Implicitly Learn Self-Improvement
Ziqi Wang, Le Hou, Tianjian Lu, Yuexin Wu, Yunxuan Li, Hongkun Yu, Heng Ji
ICLR, 2024
Paper / Slides / Twitter
Teaching models self-improvement with reinforcement learning.
Service
Reviewer: ICLR, NeurIPS (Top reviewer in 2024), ICML, ACL, EMNLP, NAACL, Pattern Recognition
Miscellanea
In my spare time, I am interested in physics and astronomy (I was a physics student before being trained as a computer science student). I like repair work and DIY because they help me understand how things work underneath. I am a big fan of John Carmack. I have also learned a lot of life advice from Elon Musk.
This website is adapted from Jon Barron's. Last updated: Feb 2025.