Chapter 5: Analysing the Data Part II: Inferential Statistics

# Probability

It is important for you to have a grasp of some basic ideas in probability. Although you will not be required to learn all of the combination/permutation rules presented in Howell, some notions are fundamental to your understanding of psychological research. This is because a misunderstanding of probability is at the root of many fallacious ideas.

An idea central to the relative frequency view of probability is that of sampling a vast number of times, and estimating some parameter on the basis of how our sampling procedures have turned out. There are other views of probability which are described in Howell but are not mentioned in Ray. There is a reason for that. While subjective probability is a fascinating area, and important for understanding how people anticipate their environments, it has historically been more of a subject of study for psychologists than a tool of research. Nevertheless, you will often see people relying on subjective estimates of how likely things are when they don't have ready empirical estimates in hand, often citing them as if they were actual data.

If you were to toss a coin five times, and it turned up "heads" on every throw, would you be willing to bet that the next one must be a tail, since there have been too many heads in a row already? If so, you are committing the gambler's fallacy discussed in both Ray (p. 124) and Howell (p. 132). This belief is false because tosses of a coin are independent of one another, i.e., the result of one toss does not affect the result of the next. While five heads in a row is unlikely, and six in a row only half as likely as that, such sequences are not impossible, and in fact are likely to occur if you continue tossing the coin day in and day out for a long time. This idea of the long run is what we tend to ignore in our day-to-day thinking.
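The independence of coin tosses can be checked with a small simulation. What follows is only a sketch, assuming a fair coin modelled by Python's standard `random` module; the function name `heads_rate_after_streak`, the seed, and the number of tosses are illustrative choices and do not come from Ray or Howell. It tosses the coin many times and records what happens on the toss immediately after every run of five heads.

```python
import random

def heads_rate_after_streak(n_tosses=200_000, streak=5, seed=0):
    """Estimate p(heads) on the toss that follows `streak` heads in a row."""
    rng = random.Random(seed)   # fixed seed (an assumption) so the run is repeatable
    run = 0                     # length of the current run of heads
    followers = []              # outcomes of tosses that came right after a full streak
    for _ in range(n_tosses):
        toss = rng.random() < 0.5       # True = heads, for an assumed fair coin
        if run >= streak:               # the previous `streak` tosses were all heads
            followers.append(toss)
        run = run + 1 if toss else 0    # extend the run on heads, reset on tails
    return sum(followers) / len(followers), len(followers)

rate, n_streaks = heads_rate_after_streak()
print(f"{n_streaks} runs of five heads; proportion of heads on the next toss = {rate:.3f}")
```

If the gambler's fallacy were correct, the proportion of heads after a streak would fall well below one half. In a long run of simulated tosses it stays close to .5, which is what independence predicts.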

Six heads in a row is half as likely as five. The derivation of this depends on the multiplication rule of probability for independent events. The probability of heads is .5, and using this rule, we find that the probability of two heads in a row is .5 x .5 or .25. As a rule, the probability of n heads in a row is .5^n. Thus, p(5 heads in a row) is .5^5 or .03125. Likewise, p(6 heads in a row) is .5^6 or .015625. You will find that p(5H) / p(6H) = .03125/.015625 = 2.
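The arithmetic above is easy to reproduce. The snippet below is a minimal sketch of the multiplication rule for independent events; the function name `p_heads_in_a_row` is made up for illustration, and a fair coin (p = .5) is assumed.

```python
def p_heads_in_a_row(n, p_heads=0.5):
    """Multiplication rule for independent tosses: p(n heads in a row) = p_heads ** n."""
    return p_heads ** n

p5 = p_heads_in_a_row(5)   # five heads in a row
p6 = p_heads_in_a_row(6)   # six heads in a row
ratio = p5 / p6            # how many times more likely five heads is than six

print(p5, p6, ratio)
```

Each extra head multiplies the probability by .5, so every additional toss in the streak halves the chance of the whole sequence.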

Another rule discussed in your texts is the addition rule. This simply states that for mutually exclusive events, the probability of either one or the other of the two events occurring is equal to the sum of the two probabilities. Trivially, this means that the probability of getting either a head or a tail on a given coin toss is equal to p(H) + p(T) = .5 + .5 = 1.0. (This of course assumes that it is impossible to have it land on the edge! If you wish to explore the latter probability, you should toss a coin until it does exactly that, which may mean sitting and tossing it for a very long time indeed.)

The assumption of independence becomes very important later on when we discuss χ² (chi-square), because this test relies on showing that two events are not independent, i.e., that the probabilities don't "add up," much as in Howell's example of being female and having a feminine name. Showing that two things are not independent is a vital first step in establishing some kind of scientific law that relates two or more constructs. The ideas of joint and conditional probability are also important to grasp. As Howell notes, an observed joint probability may be compared with the value that the multiplication rule predicts under independence, to determine whether two events (say, having no children and not wearing a seatbelt) are independent. In Howell's example, p(no children) x p(not wear a belt) = .8 x .3 = .24; however, the joint probability that was actually observed was 15/100 or .15, just a little more than half the expected value. This discrepancy argues against the independence of the variables.
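The comparison between the expected and the observed joint probability can be written out in a few lines. This is only a sketch using the figures quoted above from Howell's example; the variable names are illustrative, and the comparison is descriptive rather than a formal test of independence.

```python
import math

# Figures as quoted in the text from Howell's example
p_no_children = 0.8         # p(no children)
p_no_belt = 0.3             # p(not wearing a seatbelt)
observed_joint = 15 / 100   # observed p(no children and no seatbelt)

# Multiplication rule: the joint probability we would expect IF the events were independent
expected_joint = p_no_children * p_no_belt

# Ratio of observed to expected; a value near 1 would be consistent with independence
ratio = observed_joint / expected_joint

print(f"expected under independence: {expected_joint:.2f}")
print(f"observed: {observed_joint:.2f}")
print(f"observed / expected: {ratio:.3f}")
```

The observed value falls well short of the value expected under independence, which is the discrepancy described above.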

The general rule for the probability that at least one of two events occurs is

p(A or D) = p(A) + p(D) - p(A and D)

If the two events A and D are mutually exclusive, the probability of them both occurring is zero, and the above equation reduces to just the sum of the separate probabilities for event A and event D.

Conditional probability is also worth understanding. Conditional probability is the probability that something will happen given that something else has already happened. For example, in a room there are 50 women and 50 men, all with different names. The probability of selecting a woman is 50/100 = 0.5. The probability of selecting the person with the name "Jane" is 1/100 = 0.01. The chance of selecting a person named "Jane" given that we have already selected a woman is 1/50 = 0.02. But the chance of selecting a person named "Jane", given that we have already selected a man, is considerably less than this, probably close to zero! In notation, we write p(A|B) to denote "the probability of A, given B." Consider another example that will come up again in the context of hypothesis testing. What is the probability that someone was Hanged, given that they are Dead, or p(H|D)? Nowadays, I'd guess around .00001. What is the probability that a person is Dead, given that they were Hanged, or p(D|H)? Rather substantial, I should say... around .99999. So not only should you never confuse joint and conditional probabilities, but you should never confuse p(A|B) with p(B|A).
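The room example can be worked through by simple counting. The sketch below assumes, for illustration, that the one person named "Jane" is a woman, so that p(Jane | man) is exactly zero rather than merely close to it; the placeholder names and the helper functions `p`, `is_woman`, `is_man` and `is_jane` are invented for this example.

```python
from fractions import Fraction

# Hypothetical room: 50 women and 50 men, all with different names.
# Assumption: the single "Jane" is one of the women.
people = (
    [("woman", "Jane")]
    + [("woman", f"W{i}") for i in range(49)]
    + [("man", f"M{i}") for i in range(50)]
)

def is_woman(person):
    return person[0] == "woman"

def is_man(person):
    return person[0] == "man"

def is_jane(person):
    return person[1] == "Jane"

def p(event, given=None):
    """p(event | given): restrict to people satisfying `given`, then count `event`."""
    pool = [person for person in people if given is None or given(person)]
    return Fraction(sum(1 for person in pool if event(person)), len(pool))

p_woman = p(is_woman)                        # 50 out of 100
p_jane = p(is_jane)                          # 1 out of 100
p_jane_given_woman = p(is_jane, is_woman)    # 1 out of the 50 women
p_jane_given_man = p(is_jane, is_man)        # 0 out of the 50 men
p_woman_given_jane = p(is_woman, is_jane)    # the only Jane is a woman

print(p_jane_given_woman, p_woman_given_jane)
```

Conditioning simply shrinks the pool being counted. Note that p(Jane | woman) and p(woman | Jane) come out very differently, which is the same asymmetry as in the Hanged and Dead example.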

One useful tool for thinking about probabilities is the Venn diagram. In more advanced statistics courses (and in later chapters of Howell, pp. 529-530) these are used to illustrate complex correlations. However, they can be used for much simpler demonstrations as well. For example, in Figure 5.1 below, we consider the probability that we will randomly select either a person with an Anxiety Disorder (A) or a person with a Dissociative Disorder (D) from a group of 100 patients in a psychiatric hospital. There are 20 people in group A, and 20 people in group D, but p(A or D) is not .4, as you might expect from the addition rule for mutually exclusive events. Rather, it is .35. This is because there are 5 people who have both an anxiety disorder and a dissociative disorder, illustrating the principle that for events that are not mutually exclusive, p(A or D) = p(A) + p(D) - p(A and D). You can see how p(A and D) = 0 for events that are mutually exclusive (such as the probability of selecting someone who is both male and female, ignoring the small logical problem of the very rare occurrence of hermaphrodites!).

Figure 5.1 Venn diagram for simple probability. Note that p(A and D) is a joint probability. Five people were diagnosed with both an Anxiety Disorder and a Dissociative Disorder. The two events are not mutually exclusive. A and D are not independent events.
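The counts in the Venn diagram can be mimicked with sets. In this sketch the patient ID numbers are arbitrary and purely illustrative; all that matters is that each group has 20 members and that 5 patients belong to both.

```python
from fractions import Fraction

N_PATIENTS = 100

# Arbitrary, illustrative patient IDs: 20 per diagnosis, with 5 (IDs 15 to 19) in both groups
anxiety = set(range(0, 20))
dissociative = set(range(15, 35))

def p(group):
    """Probability of randomly selecting a member of `group` from the 100 patients."""
    return Fraction(len(group), N_PATIENTS)

p_a = p(anxiety)                      # p(A)
p_d = p(dissociative)                 # p(D)
p_both = p(anxiety & dissociative)    # p(A and D), the overlap in the Venn diagram
p_either = p(anxiety | dissociative)  # p(A or D), counted directly from the union

# The general addition rule gives the same answer as counting the union
assert p_either == p_a + p_d - p_both

print(float(p_a + p_d), float(p_either))   # naive sum versus the correct value
print(float(p_a * p_d), float(p_both))     # expected under independence versus observed
```

Simply adding p(A) and p(D) counts the five overlapping patients twice; subtracting p(A and D) removes the double count. The last line also shows why A and D are not independent here: p(A) x p(D) is .04, whereas the observed joint probability is .05.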

 © Copyright 2000 University of New England, Armidale, NSW, 2351. All rights reserved Maintained by Dr Ian Price Email: iprice@turing.une.edu.au