When a change in one quantity (or data set) is accompanied by a change in another quantity (or data set) then we can say both quantities (or data sets) are correlated.
You need to take care not to think in terms of cause and effect.
The existence of correlation does not imply that there is a causal relationship between the two quantities.
Where there is cause and effect there will be correlation; however, there can be correlation without cause and effect.
If the two quantities are plotted against each other and all the points lie on a straight line, then we can say there is perfect correlation.
It is possible to measure correlation precisely, using the coefficient of correlation.
If you have a correlation coefficient of 1 then there is perfect positive correlation: the two sets of data increase together.
If you have a correlation coefficient of -1 then there is perfect negative correlation: one set of data decreases as the other increases.
If you have a correlation coefficient of 0 then the two sets of data have no correlation.
The correlation coefficient between two sets of data will lie between +1 and -1.
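By various mathematical manipulations (that we don’t need to know) it can be shown that the correlation coefficient r between two sets of data (x and y) is given by the following, where $\bar{x}$ and $\bar{y}$ are the means of x and y:

$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}}$$
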
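As a rough illustration, the coefficient r can be computed from the deviations of each data set about its mean. The sketch below uses only the Python standard library. The function name `pearson_r` and the sample values are made up for this example and are not taken from any real data set.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Correlation coefficient r from the deviations about the means."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]  # deviations of x from its mean
    dy = [y - mean_y for y in ys]  # deviations of y from its mean
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one series is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical data: both y series lie exactly on a straight line in x.
x = [1, 2, 3, 4, 5]
rising = [3, 5, 7, 9, 11]    # y = 2x + 1, increases with x
falling = [10, 8, 6, 4, 2]   # y = 12 - 2x, decreases as x increases

print(pearson_r(x, rising))   # perfect positive correlation
print(pearson_r(x, falling))  # perfect negative correlation
```

Because every point in each series lies on a straight line, the two results sit at the extremes of the range, +1 and -1.
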
Instead of calculating the coefficient from the data values, it is possible to rank the data and calculate the coefficient from the rankings. This rank correlation coefficient, rs, is given below, where n is the number of data items and d is the difference between the ranks of corresponding items in the two ranked series:

$$r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$

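The rank-based calculation can be sketched in a few lines of Python. This is a minimal version that assumes no two values in a series are equal. Tied values would need averaged ranks, which this sketch does not handle. The names `ranks` and `spearman_rs` and the marks used below are invented for illustration.

```python
def ranks(values):
    """Rank each value from 1 (smallest) to n; assumes there are no ties."""
    if len(set(values)) != len(values):
        raise ValueError("tied values are not handled in this sketch")
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman_rs(xs, ys):
    """Rank correlation: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 items")
    n = len(xs)
    # d is the difference between the ranks of corresponding items
    sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * sum_d_squared / (n * (n * n - 1))


# Hypothetical marks for five students in two subjects.
maths = [56, 75, 45, 71, 62]
physics = [66, 70, 40, 60, 65]

print(ranks(maths))    # [2, 5, 1, 4, 3]
print(ranks(physics))  # [4, 5, 1, 2, 3]
print(spearman_rs(maths, physics))  # sum of d^2 is 8, so rs is about 0.6
```

Only the order of the values matters here, so replacing a mark with any other value that keeps the same ranking leaves rs unchanged.
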
Correlation is not a good indicator of a causal connection, so it should not be used as a basis for prediction.
The reliability of r and rs increases as the number of data items increases, so neither should be relied upon for small sets of data.