This is a quote from Astral Storm in the comments:
"If we were to hold to high fidelity target, the real benchmark would be an accurately recorded (as in measurement microphones) live performance. This means as close to vanilla performance as possible, making any mastering issues moot."
This makes sense, so why isn't the definition of neutral or a flat response directly correlated to this sort of replication of sound?
To me it is directly related. When a signal is recorded 'flat' and a headphone reproduces it 'flat', the two should sound tonally the same, but they will still differ in spatial and timing aspects even when the chain is flawless.
To record music accurately, the recording would have to be holographic over a huge area; in other words, the direction of the wavefronts would have to be recorded as well.
This can't be done with current techniques.
The same goes for reproduction: it would have to be holographic as well, have the exact same SPL as at the recording, and be flawless to come really close to the original.
Stereo reproduction, even when flawless, is but a very meagre representation of the sound pressure that was present during the recording.
Also, NOT everyone appreciates 'flat'.
Colouration of many sorts is often preferred over accuracy.
It's the reason so many different headphones/speakers exist... to suit one's taste and wallet.
For me 'flat' and accurate is the way to go, but I don't mind some flavouring now and then, depending on the recording.
IMO the recording quality is the real bottleneck, closely followed by the transducers.