Finding and Using Health Statistics

2. Common Terms and Equations

Correlation

Correlation is a statistical measure of the extent to which two variables are related to one another. One commonly used measure of the linear correlation between two continuous variables is Pearson’s correlation coefficient (denoted by the symbol ρ for a population or the letter r for a sample).

Given the values of two variables for a set of observations (X is usually used to denote the independent variable and Y the dependent variable), Pearson’s correlation coefficient is calculated by dividing the covariance of X and Y by the product of their standard deviations. Because of this scaling, its value always lies between -1 and 1, inclusive.
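The text does not write the formula out, so the following is offered as the usual textbook form of the sample coefficient rather than as the exact expression the authors had in mind. The symbols n (number of observations), x_i and y_i (the paired values), and x̄ and ȳ (the sample means) are notation introduced here for illustration.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The numerator measures how X and Y vary together, and the denominator rescales it by how much each variable varies on its own. By the Cauchy-Schwarz inequality the numerator can never exceed the denominator in absolute value, which is why r stays between -1 and 1.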

If Pearson’s correlation coefficient for a sample is positive (r > 0), then the independent variable (X) and the dependent variable (Y) are positively correlated. In other words, larger values of X tend to correspond to larger values of Y, and smaller values of X to smaller values of Y. If r < 0 (negative correlation coefficient), then X and Y are negatively correlated. In other words, larger values of X tend to correspond to smaller values of Y, and vice versa. If r = 0, there is no linear relationship between the variables, although a nonlinear relationship may still exist. The strength of the correlation is indicated by how close r is to 1 or -1.
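The sign of r can be checked on small made-up datasets. The sketch below assumes the usual sample definition of Pearson’s r. The function name pearson_r and all data values are hypothetical and chosen only for illustration.

```python
import math


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient for paired observations.

    Illustrative helper, not taken from the source text.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of products of deviations (proportional to the covariance).
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (proportional to the variances).
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)

    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable does not vary")

    return co_dev / math.sqrt(ss_x * ss_y)


# Made-up data: Y rises with X, Y falls with X, and Y = X squared.
x = [1, 2, 3, 4, 5]
r_rising = pearson_r(x, [2, 4, 6, 8, 10])    # perfectly positive
r_falling = pearson_r(x, [10, 8, 6, 4, 2])   # perfectly negative

x_sym = [-2, -1, 0, 1, 2]
r_curved = pearson_r(x_sym, [v ** 2 for v in x_sym])  # no linear trend

print(r_rising, r_falling, r_curved)
```

In the third dataset Y is completely determined by X, yet r comes out as zero because the relationship is curved rather than linear. A scatter plot is a useful companion to the coefficient for this reason.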

For example, if r = -0.8, X and Y have a strong negative correlation. However, if r = 0.3, the correlation is positive but not strong.

As a rule of thumb, correlation coefficients greater than 0.7 or less than -0.7 are considered strong.
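The rule of thumb can be written as a small helper. This is a sketch only: the function name describe_correlation and the strong_cutoff parameter are hypothetical, and the 0.7 default simply mirrors the threshold quoted above.

```python
def describe_correlation(r, strong_cutoff=0.7):
    """Label a correlation coefficient by direction and strength.

    Illustrative helper. The 0.7 cutoff is the rule of thumb from the text.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    if r == 0:
        return "no linear correlation"

    direction = "positive" if r > 0 else "negative"
    # "Greater than 0.7 or less than -0.7" is the same as |r| > 0.7.
    strength = "strong" if abs(r) > strong_cutoff else "not strong"
    return f"{direction}, {strength}"


print(describe_correlation(-0.8))  # negative, strong
print(describe_correlation(0.3))   # positive, not strong
```

The two calls reproduce the examples given earlier: r = -0.8 is labeled a strong negative correlation, and r = 0.3 is labeled positive but not strong.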