Left Null Space — The Error Your Model Cannot Learn
At some point a model stops improving, but not in a dramatic way. The loss doesn’t blow up. It doesn’t fluctuate. It simply settles.
You tweak the learning rate. You train longer. You restart with better initialization.
Nothing changes.
The model isn’t unstable — it has just reached the edge of what it can represent.
What remains isn’t a training issue anymore. It’s a structural limitation.
Linear algebra gives that leftover error a name: the left null space.
What’s Actually Happening
We usually picture training as a simple loop:
change the weights → predictions change → error reduces
But here the pattern shifts:
change the weights → predictions move slightly → the error remains
The optimizer is still running, updates are still happening, yet progress feels cosmetic.
At this point you’re not really improving the model anymore — you’re moving inside the space it already understands.
The remaining difference isn’t due to bad training. It exists because the model has no way to represent it.
A tiny math example (no fear)
When Ax Can’t Reach b
Let’s imagine we collect two features for every data point:
a value x and a second value y
At first it looks like we have two independent measurements. But on closer inspection, we notice something interesting:
the second feature is always twice the first.
So although the data appears two-dimensional, it really isn’t. All points sit along a single direction — like dots drawn along a straight path.
Writing the data as a matrix
We can place a few samples together into a matrix. Taking, for instance, the samples x = 1, 2, 3 (so the second feature is 2, 4, 6):

A = [ 1  2
      2  4
      3  6 ]

This matrix represents the inputs given to the model.
The model learns weights:

w = [ w1
      w2 ]

And predictions come from multiplying them:

ŷ = A w

What the model is actually capable of
Carrying out the multiplication:

A w = [ 1·w1 + 2·w2
        2·w1 + 4·w2
        3·w1 + 6·w2 ]

If you look carefully, every row shares the same pattern. We can rewrite it as:

A w = (w1 + 2·w2) · [ 1
                      2
                      3 ]

This reveals an important limitation:
no matter how the weights change, the model can only move along one direction.
Training can slide predictions forward or backward on that line, but it cannot leave the line.
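To see this concretely, here is a minimal numpy sketch. The sample values (x = 1, 2, 3, with the second feature always twice the first) are illustrative:

```python
import numpy as np

# Feature matrix: the second column is always twice the first,
# so the data secretly spans a single direction.
A = np.array([[1., 2.],
              [2., 4.],
              [3., 6.]])

# Whatever weights we pick, the prediction is a multiple of [1, 2, 3].
for w in [np.array([1., 0.]), np.array([0., 1.]), np.array([3., -1.])]:
    pred = A @ w
    scale = w[0] + 2 * w[1]
    print(w, "->", pred, "=", scale, "* [1, 2, 3]")
```

Every weight vector collapses to a single scalar `w1 + 2·w2` times the same direction — training only ever moves along that line.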
When reality asks for more
Now suppose the true output is, for instance:

b = [ 1
      2
      4 ]

We would like the equation

A w = b

to have a solution.
But that would require b to lie on the same line as the model’s predictions. It doesn’t.
So no choice of weights can reproduce it exactly.
What training really does
Instead, the model settles for the nearest possible output — we’ll call it b̂, the best approximation it can produce.
The remaining difference

r = b − b̂

never disappears — not because training failed, but because the model has no way to express it.
That leftover gap is the part of reality outside the model’s language. In linear algebra, it lives in the left null space.
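A short numpy sketch makes the geometry tangible. The target b = [1, 2, 4] is an illustrative choice that does not sit on the line the model can reach:

```python
import numpy as np

A = np.array([[1., 2.],
              [2., 4.],
              [3., 6.]])
b = np.array([1., 2., 4.])   # a target that does not lie on span([1, 2, 3])

# Least squares finds the closest reachable output b_hat = A @ w.
w, *_ = np.linalg.lstsq(A, b, rcond=None)
b_hat = A @ w
r = b - b_hat

# The leftover error is orthogonal to every column of A:
# A.T @ r == 0, which is exactly the definition of the left null space.
print("residual:", r)
print("A.T @ r:", A.T @ r)   # numerically zero
```

The residual is not zero, yet `A.T @ r` is: no direction available to the model overlaps with the leftover error, so no gradient step can shrink it.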
Real life version
Imagine predicting salary using two features:

years of experience and months of experience

But:

months = 12 × years

Your model really only has one degree of freedom.
Now HR introduces a performance bonus.
Your training keeps running — but the error never disappears.
The model isn’t lazy.
It literally has no way to represent “bonus”.
That missing expressiveness is the left null space.
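Here is the salary story as a runnable sketch. The numbers (base salary, slope, bonus spread) are made up purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
years = rng.uniform(1, 10, size=n)
months = 12 * years                     # perfectly redundant feature
X = np.column_stack([years, months])    # the model's entire world

bonus = rng.normal(0, 5, size=n)        # signal the features cannot carry
salary = 40 + 3 * years + bonus         # made-up ground truth

# "Train" to convergence: exact least squares is the best any optimizer can do.
w, *_ = np.linalg.lstsq(X, salary, rcond=None)
residual = salary - X @ w

print("residual std:", residual.std())                       # stays well above zero
print("max |X.T @ residual|:", np.abs(X.T @ residual).max())  # ~0: nothing left to reduce
```

The second print shows the model has fully converged (the residual is orthogonal to everything it can see), while the first shows the error it can never remove.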
A calmer analogy
Fitting a straight ruler onto a curved road.
You slide it. Rotate it. Press harder.
Eventually nothing improves.
You didn’t stop optimizing — you reached the limit of what a straight line can do.
The remaining gap is the left null space.
Why this matters in ML
This explains the moments when the loss plateaus, when longer training changes nothing, and when restarts land in the same place.
Training adjusts parameters, but architecture decides possibility. The left null space is where reality lives outside your model’s language.
The quiet takeaway
Null space means:
the model can move without changing behavior
Left null space means:
reality can change without the model being able to follow
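The contrast can be checked directly in numpy, using a small matrix whose second column is twice the first (the same redundancy as the example):

```python
import numpy as np

A = np.array([[1., 2.],
              [2., 4.],
              [3., 6.]])

# Null space: moving the weights along [2, -1] changes nothing the model outputs.
w = np.array([5., 1.])
w_moved = w + 10 * np.array([2., -1.])
print(A @ w, A @ w_moved)        # identical predictions

# Left null space: a direction in output space no choice of weights can produce.
v = np.array([2., -1., 0.])
print(A.T @ v)                   # [0, 0]: changes to the target along v are invisible
```

One direction lets the weights wander without consequence; the other lets reality drift somewhere the model can never follow.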
