
Chapter 6: Analysing the Data
Part III: Common Statistical Tests

 

Testing the significance of Pearson's r

We have looked at Pearson's r as a useful descriptor of the degree of linear association between two variables, and learned that it has two key properties: magnitude and direction. When r is near zero there is little or no linear relationship, but as it approaches -1 or +1 there is a strong negative or positive relationship (respectively) between the variables. But how do we know when a correlation is sufficiently different from zero to assert that a real relationship exists?

What we need is some estimate of how much variation in r we can expect by random chance alone. That is, we need to construct a sampling distribution for r and determine its standard error. In practice, all variables are correlated to some extent; a sample correlation will rarely be exactly zero. So we need to be able to draw a line: above that line a correlation is considered real, and below it the correlation is considered probably due to chance alone.

The illustrations that follow show the distribution of the correlation coefficient between an X and a Y variable for 100 random samples simulated on a computer, using N's of different sizes. They show that for small N's, r can vary markedly even when the null hypothesis is true (i.e., when chance is a reasonable explanation for the correlation). For larger sample sizes the correlations cluster more tightly around zero, but there is still a considerable range of values. When N = 5, we can see almost the full range from -1 to +1. This is less apparent for N = 10, 20, 30 and so on, until with samples of 100 cases there is little variability around zero.

Figure 6.1 The distribution of correlations between two random variables when sample size = 5.

Figure 6.2 The distribution of correlations between two random variables when sample size = 10.

Figure 6.3 The distribution of correlations between two random variables when sample size = 20.

Figure 6.4 The distribution of correlations between two random variables when sample size = 100.

From the above figures, you can see that as the sample size increases, the correlations cluster more and more tightly around zero. Because we have simply correlated two random variables, there should be no systematic or real relationship between them; yet, just by chance, some of the sample correlations will appear real.
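The simulation behind the figures is easy to reproduce yourself. The following is a minimal sketch in Python with NumPy (an assumption on our part; the software used for the original simulations is not specified). It draws 100 pairs of independent random samples at each N and summarises the spread of the resulting correlations:

# Correlate two *independent* random variables many times and watch how
# the spread of r shrinks as the sample size N grows (cf. Figures 6.1-6.4).
import numpy as np

rng = np.random.default_rng(42)  # seed chosen arbitrarily, for reproducibility

def simulate_r(n, n_samples=100):
    """Return n_samples correlations between two independent N(0,1) variables."""
    rs = []
    for _ in range(n_samples):
        x = rng.standard_normal(n)
        y = rng.standard_normal(n)
        rs.append(np.corrcoef(x, y)[0, 1])  # Pearson's r
    return np.array(rs)

for n in (5, 10, 20, 100):
    rs = simulate_r(n)
    print(f"N = {n:3d}: min r = {rs.min():+.2f}, max r = {rs.max():+.2f}, "
          f"SD of r = {rs.std():.2f}")

Because the variables are generated independently, any non-zero r the sketch produces is due to chance alone, which is exactly the behaviour the figures display.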

If we look more carefully at the above figures for r = -.65, we see that in Figure 6.1, with samples of size 5, 18 out of the 100 samples had a correlation equal to or more extreme than -.65 (4 at -.95, 5 at -.85, 2 at -.75, and 7 at -.65). In Figure 6.2, with samples of size 10, only 6 samples had a correlation of -.65 or more extreme. And in Figures 6.3 and 6.4, with samples of size 20 and 100, no samples had a correlation of -.65 or more extreme. So a correlation of -.65 is not an unusual result when samples are small. However, correlations of this size are quite rare when we use samples of size 20 or more.
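These empirical counts have a theoretical counterpart: under the null hypothesis (two uncorrelated normal variables), the quantity t = r * sqrt(N - 2) / sqrt(1 - r^2) follows a t distribution with N - 2 degrees of freedom. A sketch of that calculation using SciPy follows; note that counts from one particular run of only 100 samples, like those in the figures, will only roughly match these probabilities:

# Probability of observing r <= -.65 by chance alone, for several sample
# sizes, via the t transformation of r under the null hypothesis.
import math
from scipy import stats

def p_r_at_most(r_cut, n):
    """P(r <= r_cut) for two uncorrelated normal variables, sample size n."""
    df = n - 2
    t = r_cut * math.sqrt(df) / math.sqrt(1 - r_cut ** 2)
    return stats.t.cdf(t, df)

for n in (5, 10, 20, 100):
    print(f"N = {n:3d}: P(r <= -.65) = {p_r_at_most(-0.65, n):.4f}")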

The following table gives the critical values of Pearson's correlation at various significance levels, for different sample sizes.


Table D. Critical values for Pearson r
(df = N - 2, where N = number of pairs)

          Level of significance for one-tailed test
            .05       .025      .01       .005
          Level of significance for two-tailed test
   df       .10       .05       .02       .01
    1      .988      .997     .9995     .9999
    2      .900      .950      .980      .990
    3      .805      .878      .934      .959
    4      .729      .811      .882      .917
    5      .669      .754      .833      .874
    6      .622      .707      .789      .834
    7      .582      .666      .750      .798
    8      .549      .632      .716      .765
    9      .521      .602      .685      .735
   10      .497      .576      .658      .708
   11      .476      .553      .634      .684
   12      .458      .532      .612      .661
   13      .441      .514      .592      .641
   14      .426      .497      .574      .628
   15      .412      .482      .558      .606
   16      .400      .468      .542      .590
   17      .389      .456      .528      .575
   18      .378      .444      .516      .561
   19      .369      .433      .503      .549
   20      .360      .423      .492      .537
   21      .352      .413      .482      .526
   22      .344      .404      .472      .515
   23      .337      .396      .462      .505
   24      .330      .388      .453      .495
   25      .323      .381      .445      .487
   26      .317      .374      .437      .479
   27      .311      .367      .430      .471
   28      .306      .361      .423      .463
   29      .301      .355      .416      .456
   30      .296      .349      .409      .449
   35      .275      .325      .381      .418
   40      .257      .304      .358      .393
   45      .243      .288      .338      .372
   50      .231      .273      .322      .354
   60      .211      .250      .295      .325
   70      .195      .232      .274      .302
   80      .183      .217      .256      .284
   90      .173      .205      .242      .267
  100      .164      .195      .230      .254
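Tables like this were essential before computers; today the same critical values can be computed directly by inverting the t transformation given above (r = t / sqrt(df + t^2)). A sketch, again assuming SciPy is available:

# Reproduce (or extend, for any N) the critical values in Table D.
import math
from scipy import stats

def critical_r(df, alpha, two_tailed=True):
    """Smallest |r| significant at level alpha with the given df (= N - 2)."""
    p = alpha / 2 if two_tailed else alpha
    t_crit = stats.t.ppf(1 - p, df)
    return t_crit / math.sqrt(df + t_crit ** 2)

# A few spot checks against the two-tailed .05 column of Table D:
for df in (10, 30, 100):
    print(f"df = {df:3d}: critical r = {critical_r(df, 0.05):.3f}")
# Table D gives .576, .349, and .195 for these rows.

In practice, a function such as scipy.stats.pearsonr returns both r and its two-tailed p-value directly, so the table is now mainly useful for hand calculation.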

