Yi Hu (胡逸)


Email: huyi2002 (at) stu (dot) pku (dot) edu (dot) cn

I am currently a first-year Ph.D. student at the Institute for Artificial Intelligence, Peking University, advised by Prof. Muhan Zhang. Feel free to visit our group page: Graphpku!

I received my B.S. in Physics from the School of Physics, Peking University, where I was fortunate to work with Prof. Huichao Song on machine learning applications in heavy-ion collisions.

My research focuses on the reasoning mechanisms of large language models (LLMs) and on how to raise their reasoning capabilities to the level of human experts. I am also broadly interested in other LLM topics, including efficiency, alignment, and applications in various downstream domains. If you share these research interests, please feel free to get in touch!

(I’ve recently moved to this site; it is still under construction.)

News

Nov 07, 2024 Why is 9.9 > 9.11 so hard for LLMs? 😫 Check out our new paper: Number Cookbook: Number Understanding of Language Models and How to Improve It, where we introduce a comprehensive benchmark covering four common numerical representations and 17 distinct numerical tasks, and investigate the numerical understanding and processing ability (NUPA) of LLMs.
Sep 10, 2024 I begin my Ph.D. at the Institute for Artificial Intelligence, Peking University! ✨🥳
Jul 20, 2024 ✋ I will be at ICML 2024 @ Vienna, Austria, presenting our paper: Case-Based or Rule-Based: How Do Transformers Do the Math?. 🤩 Looking forward to meeting researchers and having discussions!
Jul 02, 2024 I graduate from the School of Physics, Peking University, and receive my Bachelor's degree today 🎓🥳!
May 02, 2024 Our paper Case-Based or Rule-Based: How Do Transformers Do the Math? is accepted by ICML 2024!!! In this paper, we demonstrate that in math reasoning tasks, current large language models perform case-based reasoning rather than rule-based reasoning as humans do, which reveals an intrinsic reason for their limited length generalization. To bridge this gap and shift the models' reasoning paradigm toward rule-based reasoning, we propose Rule-Following Fine-Tuning, which enhances the models' ability to follow rules and thereby improves their length generalization.

Selected publications

  1. ICML 2024
    Case-Based or Rule-Based: How Do Transformers Do the Math?
    Yi Hu, Xiaojuan Tang, Haotong Yang, and Muhan Zhang
    2024
  2. preprint
    Number Cookbook: Number Understanding of Language Models and How to Improve It
    Haotong Yang, Yi Hu, Shijia Kang, Zhouchen Lin, and Muhan Zhang
    2024