Chapter 4: Analysing the Data Part II: Descriptive Statistics

# Frequency Distributions

From the data in Figure 4.1, a large number of questions can be answered. For example, how many students had only 1 sexual partner in the past year? What was the maximum number of sexual partners? Did more students abstain from sex in the past year than did not? What percent of students who responded had fewer than 5 sexual partners in the past year?

Some of these questions can be answered easily and others with more difficulty. For example, to answer the first question, you'd have to count up all the ones. To answer the second question, you would search through the data to find the maximum. How easy this is depends on how distinctive the maximum is and how many numbers you have to search through. The answer to the third question would require first counting up how many students responded with zero and how many responded with other numbers. The last question would require knowing how many 0s, 1s, 2s, 3s, and 4s there are in the data compared to higher numbers.

Frequency distributions are a way of displaying this chaos of numbers in an organised manner so that such questions can be answered easily. A frequency distribution is simply a table that, at minimum, displays how many times each response or "score" occurs in a data set. A good frequency distribution will display more information than this, although many other pieces of information can be computed from this minimum alone.

Frequency distributions usually display information from top to bottom, with the scores in either ascending or descending order (SPSS displays the data in ascending order unless you tell it otherwise). In Output 4.1, the variable has been named "sexparts" and the range of possible values for this variable is displayed in the left-hand column. The number of times each score occurred in the data set is displayed under "Frequency." So 83 of the 177 respondents had only one sexual partner last year, which is the answer to the first question. This is derived from the "83" in the "1.00" row. You can see that the most frequent response was "1", with "0" and "2" occurring next most frequently and about equally often. Interestingly, 3 people had 10 or more sexual partners last year. This is derived by noting that there is only 1 "10" response, 1 "14" response, and 1 "15" response, which sums to 3 responses greater than or equal to 10.
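As a rough illustration of how such a table is built from raw scores, here is a minimal Python sketch. The `responses` list is hypothetical and is not the survey data behind Output 4.1; the function name `frequency_table` is also just an illustrative choice. The sketch counts each score, lists the scores in ascending order, and then answers the same kinds of questions posed at the start of this section.

```python
from collections import Counter

# Hypothetical responses for illustration only (NOT the data in Output 4.1).
responses = [1, 0, 2, 1, 1, 3, 0, 1, 2, 5, 1, 0, 10, 1, 2]


def frequency_table(scores):
    """Return (score, frequency) pairs sorted by ascending score."""
    counts = Counter(scores)
    return sorted(counts.items())


table = frequency_table(responses)

print(f"{'Score':>5} {'Frequency':>9}")
for score, freq in table:
    print(f"{score:>5} {freq:>9}")

counts = dict(table)
n = len(responses)

# How many respondents gave the score 1?
only_one = counts.get(1, 0)
# What was the largest score reported?
maximum = max(counts)
# Did more respondents report 0 than report any other score combined?
abstained = counts.get(0, 0)
more_abstained = abstained > n - abstained
# What percent of respondents reported a score below 5?
fewer_than_5 = sum(freq for score, freq in table if score < 5)
pct_fewer_than_5 = 100 * fewer_than_5 / n

print(only_one, maximum, more_abstained, round(pct_fewer_than_5, 1))
```

Once the counts are tabulated, each question becomes a lookup or a short sum over rows of the table rather than a search through the raw data.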

In general, it would be more useful to answer these questions with proportions or percentages. It is quite easy to convert these absolute frequencies into proportions or percentages. A proportion, sometimes called a relative frequency, is simply the number of times the score occurs in the data, divided by the total number of responses. So the relative frequency for the "3" response is 13/177 or about 0.073. Notice that the relative frequency, while not displayed in this frequency distribution, is simply the percent divided by 100. So the relative frequency for "0" is 0.158. Relative frequencies, being proportions, must be between 0 and 1 inclusive.
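The conversion from absolute to relative frequency can be sketched in a few lines of Python. The counts below are only those quoted in this section (28 responses of "0", 83 of "1", and 13 of "3", out of 177); the remaining rows of Output 4.1 are not reproduced here, and the function name `relative_frequency` is an illustrative choice.

```python
N_RESPONDENTS = 177

# Only the frequencies quoted in the text; other rows of Output 4.1 are omitted.
frequencies = {0: 28, 1: 83, 3: 13}


def relative_frequency(count, total):
    """Return count / total, a proportion between 0 and 1 inclusive."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= count <= total:
        raise ValueError("count must lie between 0 and total")
    return count / total


for score, count in frequencies.items():
    rel = relative_frequency(count, N_RESPONDENTS)
    # The percent is the relative frequency multiplied by 100.
    print(f"score {score}: relative frequency {rel:.3f}, percent {100 * rel:.1f}")
```

The range check inside the function mirrors the point that a relative frequency can never fall outside the interval from 0 to 1.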

A more meaningful way of expressing a relative frequency is as a percentage. This is displayed in SPSS under the "Percent" column (ignore the column labelled "Valid Percent"). As can be seen, 15.8 percent (from 100*28/177) of the students who responded didn't have a sexual partner last year. Because the percentages have to add up to 100, we know then that 100% - 15.8% or 84.2% of students who responded reported having at least one sexual partner in the last year. Thus, the answer to the third question is "No." Based on the responses we received, more students had sex last year than abstained from sex entirely.

Note the difference between reporting absolute values and reporting percentages. If we simply report that "3 people had 10 or more sexual partners for the year" we are very limited in drawing generalisations from this. We don't know if "3" is a small number or a large number. We can't draw any inferences about the general population from this information. It all depends on the total number in our sample. If the total was 177 as here, then we can conclude that about 1.7% of the student population has 10 or more sexual partners in one year. If the total was 30, then we would have a completely different story! Whenever we conduct research we are always interested in drawing inferences from our sample to the population at large.
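To make the dependence on sample size concrete, the short Python sketch below converts the same count of 3 respondents into a percentage for the two sample sizes mentioned above (177 and 30). The function name `percent_of_sample` is an illustrative choice.

```python
def percent_of_sample(count, sample_size):
    """Express an absolute count as a percentage of the sample."""
    return 100 * count / sample_size


COUNT = 3  # the same absolute frequency in both scenarios

for sample_size in (177, 30):
    pct = percent_of_sample(COUNT, sample_size)
    print(f"{COUNT} of {sample_size} respondents = {pct:.1f}%")
```

The same absolute frequency works out to about 1.7% of a sample of 177 but 10% of a sample of 30, which is why the count alone supports no generalisation.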

Output 4.1 Summarize → Frequencies…

Output 4.1 The result of using SPSS Summarize → Frequencies… on the "number of sex partners last year" variable.

Another useful statistic, which can be derived from a frequency distribution, is the "cumulative percent". The cumulative percent for a given score or data value is the percent of people who responded with that score or a lower one. The cumulative percent for 2 is 79.1%, so 79.1 percent of the respondents had no more than 2 sexual partners. If you defined a "promiscuous" person as someone who had more than 5 sexual partners in a year, then you would claim that, from these data, 6.2 percent of UNE students could be called promiscuous (notice the generalisation). This comes from the fact that the cumulative percent for 5 is 93.8%. That is, 93.8% of students had 5 or fewer sexual partners last year. So 100.0% - 93.8% or 6.2% of students had more than 5 sexual partners.
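The cumulative percent column can be computed as a running total of the frequencies, as in the minimal Python sketch below. The `toy_table` values are hypothetical and are not the frequencies in Output 4.1; only the final line uses a figure from the text (the cumulative percent of 93.8 for a score of 5). The function name `cumulative_percents` is an illustrative choice.

```python
from itertools import accumulate


def cumulative_percents(freq_table):
    """Map each score to the percent of responses at or below that score."""
    scores = sorted(freq_table)
    total = sum(freq_table.values())
    running_totals = accumulate(freq_table[score] for score in scores)
    return {score: 100 * running / total
            for score, running in zip(scores, running_totals)}


# Hypothetical frequency table for illustration (score -> frequency).
toy_table = {0: 4, 1: 10, 2: 3, 3: 2, 6: 1}

cum = cumulative_percents(toy_table)
for score, pct in cum.items():
    print(f"{score:>5} {pct:>6.1f}")

# Percent of responses ABOVE a score is 100 minus its cumulative percent.
pct_more_than_3 = 100 - cum[3]
print(pct_more_than_3)

# The same subtraction with the figure quoted in the text for a score of 5.
print(round(100.0 - 93.8, 1))
```

Because the last row always reaches 100%, the percent of respondents above any score is obtained by subtracting that score's cumulative percent from 100.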