## How Do Psychologists Analyze and Interpret Research Data?

Statistics is often used in Psychology because research data are usually gathered quantitatively. There are two types of Statistics: descriptive and inferential. The classification depends on the role that statistical principles play in analyzing and interpreting research data.

### Descriptive Statistics

Descriptive Statistics is used to summarize the characteristics of data sets by describing their central tendency or their variability.

**Measures of Central Tendency:**

*Mean* is the statistical term for average. One can obtain the mean by adding all the scores and dividing the sum by the number of scores. Suppose one class section gets the scores 30, 29, 29, 25, 20, 18, 26, 3 on the first quiz in Psychology. The mean in this example is the sum of the scores (180) divided by 8 (because there are 8 scores overall), which is 22.5. Because the mean takes all the scores into account, it is strongly affected by extreme scores. In the example above, the student who scored 3 pulled down the class average, and with it the class's standing relative to other sections. Had that student studied harder and scored 20, the class average would have risen to about 24.6. The opposite can happen in a section with poor scores and one very high score. In a section where 7 students get a score of 5 and 1 student gets a score of 60, the class average shoots up to about 11.9, well above what most students actually scored. Thus, the problem with using the mean to measure the overall tendency of a set of scores is that it is easily distorted by extremely low or high scores.
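The arithmetic above can be checked with a short sketch. It uses Python's standard `statistics` module; the variable names are illustrative only, and the scores are the ones given in the example.

```python
from statistics import mean

# Quiz scores from the example above.
quiz_scores = [30, 29, 29, 25, 20, 18, 26, 3]
class_mean = mean(quiz_scores)            # 180 / 8 = 22.5

# Replace the extreme low score (3) with 20 to see how far the mean moves.
adjusted_scores = [20 if score == 3 else score for score in quiz_scores]
adjusted_mean = mean(adjusted_scores)     # 197 / 8 = 24.625

# Seven students score 5 and one student scores 60.
skewed_scores = [5] * 7 + [60]
skewed_mean = mean(skewed_scores)         # 95 / 8 = 11.875

print(class_mean, adjusted_mean, skewed_mean)
```

In each case a single score changes the mean by several points, which is the distortion described above.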

*Median* is the middlemost score in a set of scores. Picture a class of kindergarten students lining up according to their height. The student in the middle of the line has the median height. Thus, to find the median of a set of scores, arrange the scores from highest to lowest, or vice versa, and locate the middlemost score. For example, in a score set of {7, 7, 5, 4, 1}, the median is 5. When a data set has an even number of scores, so that no single middlemost score exists, the median is the average of the two middlemost scores. Thus, in a score set of {27, 25, 20, 19}, the median is the sum of 25 and 20 divided by 2, resulting in 22.5. Notice that, unlike the mean, the median depends on the order of the scores rather than on how large or small the outlying values are. Because of this, the median is unaffected by extreme scores and is useful when such scores are present in a set of data.
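As a minimal sketch, the two median examples can be reproduced with `statistics.median`, which sorts the scores and handles the even-sized case by averaging the two middle values. The last part reuses the quiz scores from the mean example to illustrate that replacing the extreme score of 3 leaves the median unchanged; the variable names are illustrative.

```python
from statistics import median

odd_set = [7, 7, 5, 4, 1]
even_set = [27, 25, 20, 19]

median_odd = median(odd_set)      # middle value after sorting: 5
median_even = median(even_set)    # (20 + 25) / 2 = 22.5

# Quiz scores from the mean example, with and without the extreme score of 3.
quiz_scores = [30, 29, 29, 25, 20, 18, 26, 3]
adjusted_scores = [20 if score == 3 else score for score in quiz_scores]

median_quiz = median(quiz_scores)          # (25 + 26) / 2 = 25.5
median_adjusted = median(adjusted_scores)  # still 25.5

print(median_odd, median_even, median_quiz, median_adjusted)
```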

*Mode* is the most frequently occurring score in a set of scores. Imagine a typical elementary school classroom in Africa, where most of the children are likely to be Black: the most common category in a group is its mode. Similarly, in a score set of {5, 4, 4, 3, 3, 3, 2, 1, 1}, the mode is 3, because it occurs three times, more often than any other score. Just like the median, the mode is unaffected by extreme scores.
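A small sketch of the same count, using `collections.Counter` to tally each score and `statistics.mode` to pick the most frequent one. `statistics.multimode` is included because a score set can have more than one mode; that case is not discussed above and is shown only as a precaution.

```python
from collections import Counter
from statistics import mode, multimode

scores = [5, 4, 4, 3, 3, 3, 2, 1, 1]

frequencies = Counter(scores)     # how many times each score occurs
most_common = mode(scores)        # the single most frequent score: 3
all_modes = multimode(scores)     # every score tied for most frequent: [3]

print(frequencies[3], most_common, all_modes)
```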

**Measures of Variability:**

*Range* is the distance between the highest and the lowest score. It gives a very basic, general idea of the variation in a score set. The formula for the range is H-L, where H refers to the highest score and L refers to the lowest score. For example, suppose person A can jump as high as 3 yards, while person B can jump as high as 4 yards. If person A's jump score set is {0, 3, 2, 1.5, 2, 2.5} and person B's score set is {0, 4, 2, 2, 2, 1}, person A's range is 3-0, or 3, while person B's range is 4-0, or 4. Person B's scores therefore vary more widely than person A's, even though both have the same mean score of about 1.83.
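The comparison can be sketched as follows. The helper name `score_range` is an illustrative choice, not a standard library function; it applies the H-L formula directly.

```python
from statistics import mean

def score_range(scores):
    """Return H - L: the highest score minus the lowest score."""
    return max(scores) - min(scores)

person_a = [0, 3, 2, 1.5, 2, 2.5]
person_b = [0, 4, 2, 2, 2, 1]

range_a = score_range(person_a)   # 3 - 0 = 3
range_b = score_range(person_b)   # 4 - 0 = 4

# Both sets sum to 11 over 6 jumps, so the means are equal (about 1.83).
mean_a = mean(person_a)
mean_b = mean(person_b)

print(range_a, range_b, round(mean_a, 2), round(mean_b, 2))
```

Equal means with different ranges show why a measure of central tendency alone does not describe a score set.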

*Standard Deviation* describes how far the scores typically fall from the mean. To compute it, get the mean of the scores first, then get the difference of each score from that mean, square each difference, average the squared differences, and take the square root of that average. The formula for the standard deviation is √(Σ(X-M)²/N), where Σ(X-M)² refers to the sum of the squared differences of all the scores (X) from the mean (M), and N refers to the number of scores. (When the scores are a sample used to estimate a larger population, N-1 is commonly used in place of N.) For example, in a score set of {10, 8, 8, 7, 5, 3, 3, 2, 2, 2}, the mean is 5.0. The sum of the squared differences from the mean is (10-5)²+(8-5)²+(8-5)²+(7-5)²+(5-5)²+(3-5)²+(3-5)²+(2-5)²+(2-5)²+(2-5)², or 25+9+9+4+0+4+4+9+9+9, which is equal to 82. Because there are 10 scores, the average squared difference is 82/10, or 8.2, and the standard deviation is √8.2, or about 2.86. A simpler, related measure is the mean absolute deviation, (Σ|X-M|)/N, which averages the absolute values of the differences instead of squaring them: |10-5|+|8-5|+|8-5|+|7-5|+|5-5|+|3-5|+|3-5|+|2-5|+|2-5|+|2-5|, or 5+3+3+2+0+2+2+3+3+3, which is equal to 26, and 26/10 is 2.6. Thus, the average distance of the scores from the mean is 2.6, while the standard deviation, which gives more weight to scores far from the mean, is about 2.86.
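A minimal sketch computing two quantities for the score set above: the average absolute difference from the mean, (Σ|X-M|)/N, usually called the mean absolute deviation, and the standard deviation as conventionally defined, the square root of the average squared difference. It assumes the ten scores are treated as a whole population, so it uses `statistics.pstdev`; `statistics.stdev` would divide by N-1 instead.

```python
from statistics import mean, pstdev

scores = [10, 8, 8, 7, 5, 3, 3, 2, 2, 2]
m = mean(scores)                                   # 50 / 10 = 5

# Mean absolute deviation: average of |X - M|.
absolute_differences = [abs(x - m) for x in scores]
mean_abs_dev = sum(absolute_differences) / len(scores)   # 26 / 10 = 2.6

# Population standard deviation: sqrt of the average of (X - M)^2.
squared_differences = [(x - m) ** 2 for x in scores]
variance = sum(squared_differences) / len(scores)        # 82 / 10 = 8.2
std_dev = pstdev(scores)                                 # sqrt(8.2), about 2.86

print(mean_abs_dev, variance, round(std_dev, 2))
```

The standard deviation is slightly larger than the mean absolute deviation here because squaring gives the score of 10 more weight.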

Note: The concern over extreme scores applies to the mean and to all measures of variability.

### Inferential Statistics

Inferential Statistics is commonly used in Experimental Research to compare results from the experimental group and the control group. The difference between the two groups is conventionally considered statistically significant only if it meets a significance level of 0.05 or, more strictly, 0.01. This means that, if there were really no difference between the two groups, a difference at least as large as the one observed would arise by chance no more than 5% (or 1%) of the time.
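One way to see what "due to chance" means is a permutation test, sketched below. The text does not name a specific test, so this is a swapped-in illustration rather than the method psychologists necessarily use, and the two score lists are made-up numbers, not data from any study. The idea: if group membership made no difference, the group labels could be shuffled freely, so the sketch tries every possible relabeling and counts how often the difference in means is at least as large as the one observed.

```python
from itertools import combinations

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test on the difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)

    extreme = 0   # relabelings with a difference at least as large as observed
    splits = 0    # all possible relabelings
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        difference = abs(sum_a / n_a - (total - sum_a) / n_b)
        splits += 1
        if difference >= observed - 1e-9:   # small tolerance for float rounding
            extreme += 1
    return extreme / splits

# Hypothetical scores, invented purely for illustration.
experimental_group = [12, 15, 14, 16, 13]
control_group = [10, 11, 9, 12, 10]

p_value = permutation_p_value(experimental_group, control_group)
is_significant = p_value <= 0.05

print(round(p_value, 4), is_significant)
```

With these invented scores, only 4 of the 252 possible relabelings produce a difference as large as the observed one, so the resulting proportion falls below the 0.05 level.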

An important part of reporting a significant difference between two groups is indicating the central tendency and variability of each group separately (as discussed under Descriptive Statistics). For example, Benbow and Stanley (1983) found that the SAT scores of males and females are significantly different from each other. However, the difference is too small to be considered important. In addition, the overlap between the scores is so extensive that very few males outperform the highest-scoring females. It is therefore wrong to conclude from this research that all males do better than females on the SAT.