Do LLMs truly understand the world? More importantly, do they need to?
Large language models like GPT-4 can write, argue, explain, and persuade, compressing centuries of human linguistic behavior into billions of parameters.[1] But does this imitation imply understanding?
From my viewpoint, understanding means describing in simple terms. By that measure, LLMs qualify: they compress vast linguistic patterns into a neural network. But not all compression is equal. True understanding reveals why things work. Newton didn't just summarize observed motion with F = ma; he uncovered the law beneath it: a few variables that predict any trajectory.[2] That's compression that generalizes. LLMs feel similar. They compress linguistic patterns exceptionally well. But do they reveal the why behind language? Or are they just alchemy at scale?
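To make "compression that generalizes" concrete, here is a minimal sketch (my own illustration, not from any referenced work; the function name and the initial velocity and acceleration values are arbitrary choices): from just two numbers, Newton's law predicts a projectile's position at any time, with no table of past observations required.

```python
# Minimal sketch: F = ma as "compression that generalizes".
# Two parameters (initial velocity, constant acceleration) predict
# an entire trajectory, including points never observed.

def position(t: float, v0: float = 20.0, a: float = -9.81) -> float:
    """Height at time t under constant acceleration: x(t) = v0*t + (1/2)*a*t^2."""
    return v0 * t + 0.5 * a * t**2

# The same two parameters answer queries about any time, seen or unseen.
for t in (0.5, 1.0, 2.0, 3.5):
    print(f"t = {t:>4} s -> height = {position(t):7.2f} m")
```

A lookup table of past throws would be compression too, but it could never answer a question about a time it hadn't recorded. The law can.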
I don’t have the answers. But I think the questions matter:
- Can LLMs predict linguistic phenomena we've never observed, as Newton's laws do for motion?
- Is there a simpler model, an “F = ma” for language, waiting to be discovered?
- Will LLMs lead us to that insight, or distract us from finding it?
But do LLMs need to truly understand the world to be useful?
Perhaps not. Euler wrote 866 papers, many applying integration and differential equations before the theory was rigorous.[3] His work let engineers calculate how beams would bend and how fluids would flow. The math worked in practice a century before Cauchy and Weierstrass proved why.[4]
LLMs may be the same: immensely useful now, even if true understanding comes later. So don't wait. When Cauchy and Weierstrass finally put calculus on rigorous footing, engineers kept using it exactly as before for most practical purposes. Understanding may come for LLMs too, and it may change nothing about how we use them.
— Mohit
The ideas in this post are my own. Generative AI was used to assist with drafting and editing.
References
1. OpenAI has not disclosed GPT-4's exact parameter count; estimates suggest hundreds of billions to over a trillion parameters. See the GPT-4 Technical Report.
2. Newton's laws of motion, published in Philosophiæ Naturalis Principia Mathematica (1687).
3. Euler's collected works comprise 866 distinct publications. His work on beam deflection (E8, 1732) and fluid dynamics (E226, 1757) remains foundational in engineering. See the Euler Archive.
4. Rigorous foundations of calculus were established by Augustin-Louis Cauchy (1820s) and Karl Weierstrass (1860s), nearly a century after Euler's major contributions.