The Box

The hardest box to escape is the one you cannot see.

For years, the entire field optimized next-token prediction. Lower perplexity meant a better model; it was the metric, the target, and the definition of progress. Then o1 was trained to reason rather than to predict, and it turned out to be better at the things we actually cared about.

The box was only visible once someone stepped outside.

Constraints never announce themselves. They look like the way things are done, and because everyone works within them, there is no contrast to make them visible. Token prediction looked like the objective. Fixed context looked like a hardware limitation. Dense rewards looked like a training requirement. From inside, every wall looks like the edge of the world.

The field moves when someone breaks through a wall.

Before o1, reasoning as a training objective was not on the map. Within months of its release, every major lab had reproduced it. The path had always existed, but someone had to walk it first.

Benchmarks cannot answer what else is possible because they were built inside the current box. The important problems are often invisible for the same reason. The box shapes what registers as a problem at all.