250423_phybench
🔥🔥 In collaboration with the School of Physics at Peking University, we release PHYBench, a physical reasoning benchmark for modern LLMs. The benchmark covers mechanics, electromagnetism, thermodynamics, optics, modern physics, and advanced physics, with difficulty ranging from high school exercises through undergraduate problems to Physics Olympiad challenges. Check out our paper: PHYBench: Holistic Evaluation of Physical Perception and Reasoning in Large Language Models.