Pingbang Hu

胡平邦

I speak TeX
Illinois, United States

About Me

I'm a third-year Ph.D. candidate at the University of Illinois Urbana-Champaign (UIUC) 🌽, advised by Jiaqi Ma. During my Ph.D., I've had the immense delight of interning at Amazon AWS AI Lab 🗽 and the National Institute of Informatics 🇯🇵. Previously, I obtained my Master's degree from UIUC 🌽 and dual Bachelor's degrees from the University of Michigan 〽️ and Shanghai Jiao Tong University 🇨🇳.

I'm interested in the broad area of machine learning and artificial intelligence, with the goal of drawing theoretical insights from practical problems and developing algorithms with provable guarantees and desirable properties such as efficiency, robustness, and fairness. Recently, my research has focused on understanding data, including the following three aspects:

  1. Data Attribution: Understanding how training data influences AI models.

  2. Data Curation: How can we curate, generate, or augment (synthetic) data to help models generalize better?

  3. Data-Centric Privacy: Can the above be done without compromising privacy when safety-critical or sensitive data is involved? This includes (differential) privacy, machine unlearning, etc.

Previously, I worked on graph neural networks and fast graph algorithms. Generally speaking, I have a strong interest in theoretical topics that involve geometry.

🗞️ News

  • [Jun. 2026] 🚀 Incoming intern @SIG Deep Learning team, come hang out in Philly 🦅!

  • [Jan. 2026] 👨‍🎓 Incoming fellow @Anthropic Alignment Science team, come hang out in San Francisco 🌉!

  • [Oct. 2025] 📚 We are organizing the Symposium on Information Retrieval and Language Models at UIUC!

  • [Oct. 2025] 🎙️ Giving a tutorial on recent tricks in computing gradient-based data attribution, including GraSS!

  • [Sep. 2025] 🍻 Please check out our new survey paper 📝 on data attribution!

  • [Sep. 2025] 🍻 Two papers 📝 accepted by NeurIPS 2025 with one first-authored and one co-first-authored!

  • [Aug. 2025] 🎓 Received my M.S. degree in Applied Math from UIUC!

  • [Jul. 2025] 📚 We are organizing the 3rd Workshop on Regulatable Machine Learning in conjunction with NeurIPS 2025!

  • [Jul. 2025] 🎙️ Giving a talk on Data Attribution at the Guided Generation Group (GGG)!

  • [Jun. 2025] 🤖 Attending the first AI Startup School held by Y Combinator, see you in San Francisco 🌉!

  • [Mar. 2025] 🚀 Interning @Amazon AWS AI Deep Engine Science team, come hang out in New York 🗽!

  • [Jan. 2025] 🍻 One paper 📝 accepted by ICLR 2025.

  • [Nov. 2024] 🏆 Received the Graduate Conference Travel Award from UIUC!

  • [Oct. 2024] 🏆 Received the NeurIPS 2024 Scholar Award, see you in Vancouver!

  • [Sep. 2024] 🍻 Two papers 📝 accepted by NeurIPS 2024 with one Spotlight.

  • [Jun. 2024] 📚 We launched the ongoing Data Attribution Reading Group.

  • [May 2024] 🚀 Interning @National Institute of Informatics, come hang out in Tokyo 🇯🇵!

🔖 Misc

I'm from Taiwan 🇹🇼! In my spare time, I enjoy street photography 📷 and playing drums 🥁.

Education

Aug. 2023 - Present
Ph.D. in Information Science
University of Illinois Urbana-Champaign
Aug. 2023 - Aug. 2025
M.S. in Mathematics
University of Illinois Urbana-Champaign
Sep. 2021 - Apr. 2023
B.S. in Computer Science | minor in Mathematics
University of Michigan
Sep. 2019 - Aug. 2023
B.E. in Electrical and Computer Engineering | minor in Computer Science
Shanghai Jiao Tong University

Interests

Data Valuation
Algorithmic Design
Mathematical Statistics
Mathematical Analysis
Learning Theory
Graph Theory
Geometry & Topology

Awards & Scholarships

Graduate Conference Travel Award
Nov. 2024
NeurIPS 2024 Scholar Award
Oct. 2024
Excellent Internship Award at National Institute of Informatics
Aug. 2024
Hong Kong, Macao and Taiwan Overseas Chinese Student Scholarship
Oct. 2021
Undergraduate Excellent Scholarship
Nov. 2020
Bao Gang Excellent Scholarship
Jun. 2020
Hong Kong, Macao and Taiwan Overseas Chinese Student Scholarship
Dec. 2019

Selected Research

A Reliable Cryptographic Framework for Empirical Machine Unlearning Evaluation
Yiwen Tu*, Pingbang Hu*, Jiaqi W. Ma
Sep 18th 2025
NeurIPS 2025
#Trustworthy
#Data Attribution
#Unlearning

We design the first efficient machine unlearning evaluation metric with provable guarantees.

arXiv

GraSS: Scalable Data Attribution with Gradient Sparsification and Sparse Projection
Pingbang Hu, Joseph Melkonian, Weijing Tang, Han Zhao, Jiaqi W. Ma
Sep 18th 2025
NeurIPS 2025
#Data Attribution
#Optimization

We propose an efficient gradient compression algorithm to accelerate and scale gradient-based data attribution methods to billion-scale models.

arXiv | Talk | Poster | Slide | GitHub

Most Influential Subset Selection: Challenges, Promises, and Beyond
Yuzheng Hu, Pingbang Hu, Han Zhao, Jiaqi W. Ma
Sep 25th 2024
NeurIPS 2024
#Data Attribution
#Learning Theory

We provide a comprehensive study of the common practices in the Most Influential Subset Selection (MISS) problem.

arXiv | Poster | GitHub

Last Updated on Oct 21st 2025