The main problem that designers of post hoc tests try to deal with is α-inflation. This refers to the fact that the more tests you conduct at α = .05, the more likely you are to claim a significant result when you shouldn't (i.e., to make a Type I error). Doing all possible pairwise comparisons on the above five means (i.e., 10 comparisons) would increase the overall chance of a Type I error to
α* = 1 - (1 - α)^10 = 1 - (.95)^10 = 1 - .599 = .401
i.e., a 40.1% chance of making a Type I error somewhere among your ten t-tests (instead of 5%)! And you can never tell which results are wrong and which are right! [This figure is a maximum; it assumes 10 independent comparisons, which these are not, since each mean appears in several comparisons.] Tukey's HSD, and the other tests, try to overcome this problem in various ways.
The overall chance of making a Type I error in a particular experiment is referred to as the experimentwise error rate (sometimes called the familywise error rate). In the above example, if you wanted to make all possible pairwise comparisons among the five means, the experimentwise error rate would be 40.1%. Each comparison uses an error rate of 5%, but by the time we have done ten comparisons, the overall chance of making at least one Type I error has increased to 40.1%.
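The arithmetic above can be checked with a short Python sketch. The function name familywise_error_rate is made up for illustration and does not come from any statistics package. The calculation assumes the comparisons are independent, so it gives the upper figure quoted above rather than the exact rate for pairwise t-tests.

```python
from math import comb


def familywise_error_rate(alpha: float, n_comparisons: int) -> float:
    """Chance of at least one Type I error across n_comparisons tests.

    Assumes the tests are independent, which pairwise comparisons on the
    same set of means are not, so treat the result as a maximum.
    """
    return 1.0 - (1.0 - alpha) ** n_comparisons


n_means = 5
n_pairs = comb(n_means, 2)  # number of distinct pairs among five means
rate = familywise_error_rate(0.05, n_pairs)

print(n_pairs)          # 10
print(round(rate, 3))   # 0.401
```

Changing n_means shows how quickly the rate grows: with six means there are 15 pairwise comparisons, and the same formula gives roughly a 54% chance of at least one Type I error.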