Emergence of Reasoning Abilities During Training of Large Language Models
Supervisors
Suitable for
Abstract
Prerequisites: Strong ML background; interest in learning/optimization dynamics and representation learning. Some familiarity with training neural networks is recommended but not necessary.
Background
● Reasoning-like behavior in LLMs often appears to emerge gradually rather than being explicitly programmed. Understanding when and how such capabilities arise during training is a key open question in modern AI. This topic connects empirical deep learning with theoretical questions about emergence, generalization, and learning dynamics.
Focus
● This project studies the temporal development of reasoning-related behaviors in LLMs or smaller proxy models. The main question is: how do reasoning capabilities evolve over the course of training, and what patterns characterize their emergence?
● The expected contribution is insight into the dynamics of capability formation in large models.
Method
The student will draw on literature related to emergence in neural networks, scaling behavior, and training dynamics. Experiments may involve analyzing checkpoints, training curves, or simplified models that exhibit reasoning-like behavior.
[1] Kaplan, Jared, et al. "Scaling laws for neural language models." arXiv preprint arXiv:2001.08361 (2020).
[2] Wei, Jason, et al. "Emergent Abilities of Large Language Models." Transactions on Machine Learning Research (2022).
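A minimal sketch of one possible experimental loop, assuming the EleutherAI Pythia suite (whose intermediate training checkpoints are published as Hugging Face revisions); the model name, checkpoint steps, and probe items below are illustrative placeholders, not a fixed protocol:

```python
"""Sketch: track accuracy on a small two-choice reasoning probe across
training checkpoints. Model, steps, and probe items are placeholders."""
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "EleutherAI/pythia-160m"      # assumed small proxy model
STEPS = [1000, 8000, 64000, 143000]   # assumed available checkpoint steps
PROBE = [                             # toy (prompt, correct, incorrect) items
    ("Tom is taller than Sam. Sam is taller than Al. The tallest is", " Tom", " Al"),
    ("2 + 2 * 3 =", " 8", " 10"),
]

def choice_logprob(model, tok, prompt, completion):
    """Sum of log-probs the model assigns to `completion` given `prompt`.
    Assumes prompt tokens are a prefix of the (prompt + completion) tokens."""
    ids = tok(prompt + completion, return_tensors="pt").input_ids
    n_prompt = tok(prompt, return_tensors="pt").input_ids.shape[1]
    with torch.no_grad():
        logp = model(ids).logits.log_softmax(-1)
    tgt = ids[0, n_prompt:]                       # completion tokens only
    return logp[0, n_prompt - 1:-1].gather(-1, tgt[:, None]).sum().item()

for step in STEPS:
    rev = f"step{step}"
    tok = AutoTokenizer.from_pretrained(MODEL, revision=rev)
    model = AutoModelForCausalLM.from_pretrained(MODEL, revision=rev).eval()
    correct = sum(
        choice_logprob(model, tok, p, good) > choice_logprob(model, tok, p, bad)
        for p, good, bad in PROBE
    )
    print(f"step {step:>7}: probe accuracy = {correct / len(PROBE):.2f}")
```

A real study would use an established reasoning benchmark and many more items, but the same loop structure (fixed probe, varying checkpoint) applies.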
Goals:
● Essential: Review work on emergent abilities and training dynamics in LLMs.
● Essential: Analyze how performance on reasoning tasks evolves during training or over model scale (see the plotting sketch after this list).
● Stretch: Connect empirical findings to theoretical intuitions about generalization and emergence.
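A minimal plotting sketch for that analysis, assuming performance numbers are collected with an evaluation loop like the one in the Method section; the function name and signature are illustrative, and the x-axis can be either training step or parameter count:

```python
"""Sketch: plot probe accuracy against training step or model scale on a
log x-axis, the usual view for spotting emergence-style jumps."""
import matplotlib.pyplot as plt

def plot_emergence(results: dict[float, float], xlabel: str = "Model parameters") -> None:
    """`results` maps an x value (training step or parameter count) to probe accuracy."""
    xs = sorted(results)
    ys = [results[x] for x in xs]
    plt.plot(xs, ys, marker="o")
    plt.xscale("log")
    plt.xlabel(xlabel)
    plt.ylabel("Probe accuracy")
    plt.tight_layout()
    plt.show()
```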