People say a model "learns." It sounds dignified. The reality is more like a kid with flashcards who keeps guessing wrong until they finally guess right.
The whole process, in one paragraph
Show the model an example. Let it guess. Compare its guess to the right answer. Nudge its internal settings (millions or even billions of tiny knobs called weights) slightly in the direction that would have made the guess better. Repeat. Billions, sometimes trillions, of times.
That is training. That is the whole thing.
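To make the loop concrete, here is a minimal sketch in Python with a single knob instead of millions. Everything in it is made up for illustration: the function name `train_one_knob`, the hidden rule (the right answer is always 3 times the input), and the learning rate. Real training follows the same guess, compare, nudge cycle, only with far more knobs and far more examples.

```python
import random


def train_one_knob(steps=1000, learning_rate=0.1, seed=0):
    """Tune a single weight w so that w * x approximates 3 * x.

    The rule "3 * x" is a hypothetical stand-in for "the right answer".
    """
    rng = random.Random(seed)
    w = 0.0  # the one knob, starting out knowing nothing
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)     # show the model an example
        target = 3.0 * x               # the right answer for that example
        guess = w * x                  # let it guess
        error = guess - target         # compare the guess to the right answer
        gradient = 2.0 * error * x     # slope of the squared error with respect to w
        w -= learning_rate * gradient  # nudge the knob slightly in the better direction
    return w


if __name__ == "__main__":
    print(train_one_knob())
```

Calling `train_one_knob()` should return a value very close to 3.0. The update step is never told the rule directly; it only ever sees how wrong each guess was.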
A concrete example
Suppose you want to teach a model to recognize cats in photos.
- Show it a photo. Let it guess: "is this a cat?"
- It guesses "no" (it has no idea yet).
- The right answer is "yes."
- The training process tweaks the knobs so that, next time it sees a similar photo, it is slightly more likely to guess "yes."
Show it ten million photos like this, some with cats and some without. Eventually, the knobs settle into a configuration that gets cats right most of the time.
The model has not learned what a cat is. It has learned what cats look like, statistically, in pixels.
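The cat example above can be sketched in a few dozen lines, with heavy simplifications. Instead of real photos, each "photo" here is just two invented numbers (imagine hypothetical "ear pointiness" and "whisker density" scores), and the model is a tiny logistic classifier with two weights and a bias. The names `make_example`, `train_cat_detector`, and `accuracy`, along with the synthetic data, are assumptions for illustration, not how a real image model is built.

```python
import math
import random


def sigmoid(z):
    """Squash any number into a probability between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_example(rng):
    """Return (features, label) for one synthetic 'photo'.

    Cats cluster around (1, 1), non-cats around (-1, -1). Purely made up.
    """
    is_cat = rng.random() < 0.5
    center = 1.0 if is_cat else -1.0
    features = [rng.gauss(center, 0.7), rng.gauss(center, 0.7)]
    return features, (1.0 if is_cat else 0.0)


def score(weights, bias, features):
    return sum(w * f for w, f in zip(weights, features)) + bias


def train_cat_detector(num_examples=5000, learning_rate=0.1, seed=0):
    rng = random.Random(seed)
    weights = [0.0, 0.0]  # the knobs, all starting at zero
    bias = 0.0
    for _ in range(num_examples):
        features, label = make_example(rng)              # show it a photo
        guess = sigmoid(score(weights, bias, features))  # probability of "cat"
        error = guess - label                            # compare with the right answer
        for i, f in enumerate(features):
            weights[i] -= learning_rate * error * f      # tweak each knob a little
        bias -= learning_rate * error
    return weights, bias


def accuracy(weights, bias, num_examples=1000, seed=1):
    """Fraction of fresh synthetic photos the model labels correctly."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(num_examples):
        features, label = make_example(rng)
        predicted = 1.0 if score(weights, bias, features) > 0 else 0.0
        if predicted == label:
            correct += 1
    return correct / num_examples


if __name__ == "__main__":
    weights, bias = train_cat_detector()
    print("untrained accuracy:", accuracy([0.0, 0.0], 0.0))
    print("trained accuracy:  ", accuracy(weights, bias))
```

Before training, the model answers "no" to everything and is right only about half the time. After a few thousand corrections it gets most of the synthetic photos right, yet all it holds is three numbers. Nothing in those numbers knows what a cat is.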
Why this matters
- The model is only as good as the examples it was trained on. Garbage examples → garbage model.
- The model can do things its training data implied but never explicitly showed it, such as recognizing a cat in a pose it has never seen. This is the surprising, useful part.
- The model cannot do things its training data has no information about. This is the boring, often-ignored part.
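The last point can be seen directly in a toy model. The sketch below is an invented illustration, not a real training setup: the helper `train_linear`, the hidden rule `2*a + 5*b`, and the data are all assumptions. The second input `b` is always zero in the training examples, so the knob attached to it never receives a correction.

```python
import random


def train_linear(examples, learning_rate=0.1, epochs=50):
    """Fit guess = w0 * a + w1 * b by repeated small corrections."""
    weights = [0.0, 0.0]
    for _ in range(epochs):
        for features, target in examples:
            guess = sum(w * f for w, f in zip(weights, features))
            error = guess - target
            for i, f in enumerate(features):
                # A feature that is always 0 contributes a correction of 0.
                weights[i] -= learning_rate * error * f
    return weights


rng = random.Random(0)

# Hypothetical hidden rule: target = 2*a + 5*b.
# In the training data, b is always 0, so the data says nothing about the "5".
examples = []
for _ in range(200):
    a = rng.uniform(-1.0, 1.0)
    examples.append(([a, 0.0], 2.0 * a + 5.0 * 0.0))

weights = train_linear(examples)

# Ask about an input where b is not zero. The hidden rule would give 7.0.
unseen = [1.0, 1.0]
guess = sum(w * f for w, f in zip(weights, unseen))

if __name__ == "__main__":
    print("weights:", weights)
    print("guess for unseen input:", guess)
```

The first weight ends up close to 2, because the examples carried that information. The second stays at exactly 0.0, so the guess for the unseen input lands near 2 rather than 7. No amount of extra repetition over the same examples would change that.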
The bottom line
Training is guessing → grading → adjusting → repeating. There is no insight, no aha moment, no understanding. Just an enormous number of small corrections that add up to something that looks, from the outside, a lot like skill.
Next: why the training data is the most important thing in the entire pipeline — and why almost no one talks about it.