Correlation does not imply causation.
If you work in data science, you have probably heard this quote quite a few times: “Correlation does not imply causation.” Let’s dig into it to find the crux of the saying.
There is an interesting XKCD comic about correlation.
What is correlation?
In statistics, correlation is the association between any two random variables in a dataset. Most of the time, what we measure is the strength of the linear relationship between the two variables.
E.g., A and B (two variables whose values tend to change together)
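To make the "linear relationship" concrete, here is a minimal sketch of the Pearson correlation coefficient, one common way to measure it. The function name `pearson_r` and the sample lists `a`, `b`, and `c` are made up for illustration; they do not come from any particular library or dataset.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (covariance, up to a constant factor).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (variances, up to the same factor).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy data: b rises with a, c falls as a rises.
a = [1, 2, 3, 4, 5]
b = [2, 4, 6, 8, 10]
c = [10, 8, 6, 4, 2]

print(pearson_r(a, b))  # perfectly linear, increasing: close to +1
print(pearson_r(a, c))  # perfectly linear, decreasing: close to -1
```

The coefficient always lies between -1 and +1. Note that nothing in the calculation knows or cares which variable, if any, drives the other.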
What is causation?
Causality, also referred to as causation, is a property that connects one process with another: the first is at least partly responsible for the second, and the second depends on the first. Causation therefore gives at least a partial guarantee that, given event A (the cause), event B (the effect) follows in sequence.
E.g., A -> B (A implies B: if A, then B)
Let’s get back to our original statement: “Correlation does not imply Causation.”
Correlation is a mathematical quantity, whereas causation is a physical one (an observation about how the world behaves).
Since there is no inherent link between these two kinds of quantity, we cannot conclude that two random variables have a cause-and-effect relationship merely because they are correlated.
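One way to see this is with a small simulation. The sketch below assumes a hypothetical hidden variable `z` that drives both `a` and `b`; this setup, the noise levels, and all the names are invented for illustration and are not taken from real data. `a` and `b` end up strongly correlated even though neither causes the other, and the correlation disappears once we set `a` ourselves, independently of `z`.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# Assumed data-generating process: a hidden common cause z feeds both a and b.
z = [rng.gauss(0, 1) for _ in range(n)]
a = [zi + rng.gauss(0, 0.5) for zi in z]
b = [zi + rng.gauss(0, 0.5) for zi in z]
r_observed = pearson_r(a, b)

# "Intervention": we choose a ourselves, ignoring z. b is still produced
# from z exactly as before, because a was never an input to b.
a_forced = [rng.gauss(0, 1) for _ in range(n)]
b_after = [zi + rng.gauss(0, 0.5) for zi in z]
r_intervened = pearson_r(a_forced, b_after)

print(f"observed correlation:   {r_observed:.2f}")    # strongly positive
print(f"after forcing a:        {r_intervened:.2f}")  # near zero
```

If A really caused B, changing A by hand would change B as well. Here it does not, so the observed correlation reflects the shared hidden cause rather than a causal link between A and B.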