Correct. But the same holds true for dummy heads. Only if you measure the response of top speakers in an acoustically treated room with a dummy/human head will you have a good reference. Otherwise you can only make comparisons, and even then you can't be 100% sure.
I think what Marv and others are doing with their rigs is trying to remove the ear from the equation and use the results as a basis for comparison. I like this 'big picture' approach more; I have many issues correlating what I hear with what Tyll's and other HATS-based plots show beyond 2 kHz, for example. Maybe C.U.N.T + In-Ear/HATS measurements combined can show a more detailed picture, but the latter will always have a big YMMV factor associated.