Ruihang Li (李睿航)

Ruihang Li is a PhD student in a joint program between the University of Science and Technology of China (USTC) and the Shanghai Innovation Institute (SII), advised by Prof. Wenjie Wang and Dr. Jiaqi Wang. His research focuses on Reinforcement Learning, Visual Generation, and Unified Multimodal Models.

He is currently a research intern at Baidu ERNIE, focusing on training unified multimodal models. He was a research intern at Tencent Hunyuan Frontier Lab, where he worked on RL and evaluation for visual generation. He also implemented a robust RL pipeline for the unified multimodal model DeepGen, which has gained 580+ GitHub stars and 2,500+ Hugging Face downloads.

Previously, he interned at Microsoft Research Asia, closely collaborating with Han Hu, Zheng Zhang, and Houwen Peng on LLM pretraining. He received his B.S. from USTC in 2023. He enjoys vibe coding and exploring the unknown, and aspires to push the boundaries of multimodal machine intelligence.

Research

His research interests include reinforcement learning, visual generation, unified multimodal models, and evaluation for generative systems.

Optimizing Visual Generative Models via Distribution-wise Rewards


Ruihang Li, Mengde Xu, Shuyang Gu, Leigang Qu, Fuli Feng, Han Hu, Wenjie Wang
ICML 2026 Main

Proposes a distribution-wise RL framework for visual generation that mitigates reward hacking and mode collapse. Using an efficient subset-replace strategy, the approach significantly improves both the visual quality and the diversity of samples from the SiT model.

DeepGen 1.0: A Lightweight Unified Multimodal Model for Advancing Image Generation and Editing


Dianyi Wang*, Ruihang Li*, Feng Han*, Chaofan Ma*, Wei Song*, Siyuan Wang*, Yibin Wang*, Yi Xin, Hongjian Liu, Zhixiong Zhang, Shengyuan Ding, Tianhang Wang, Zhenglin Cheng, Tao Lin, Cheng Jin, Kaicheng Yu, Jingjing Chen, Wenjie Wang, Zhongyu Wei, Jiaqi Wang
Technical Report, 2026
paper / page / code / blog (量子位)

Presents a lightweight unified multimodal model for image generation and editing. Uses MR-GRPO for stable 1,500-step RL training, improved text rendering, and stronger generation and editing performance from the 5B model.

GenArena: How Can We Achieve Human-Aligned Evaluation for Visual Generation Tasks?


Ruihang Li, Leigang Qu, Jingxu Zhang, Dongnan Gui, Mengde Xu, Xiaosong Zhang, Han Hu, Wenjie Wang, Jiaqi Wang
Preprint, 2026
paper / page / code

Introduces GenArena, a pairwise comparison framework that uses open-source VLM judges to evaluate visual generation. It improves evaluation accuracy by 25.4% and produces rankings with an 86% match rate with LMArena across 15+ models.

ScalingFilter: Assessing Data Quality through Inverse Utilization of Scaling Laws


Ruihang Li, Yixuan Wei, Miaosen Zhang, Nenghai Yu, Han Hu, Houwen Peng
EMNLP 2024 Main
paper / code

Presents ScalingFilter, a reference-free text data filtering method for LLM pretraining. By applying scaling laws in reverse, it improves downstream performance by 1.12% over previous state-of-the-art filtering methods while preserving greater semantic diversity.


Design and source code from Jon Barron's website