Experimentation and Causal Inference
Why Data Scientists Should Learn Causal Inference
Climb up the ladder of causation
Nobel Prize Goes To …
By now, you have probably heard that three economists (David Card, Joshua Angrist, and Guido Imbens) won the 2021 Nobel Prize in Economics. Their contributions to research methodology (specifically, Causal Inference) have both excited and puzzled the data community:
What is Causal Inference anyway?
How does it differ from other tracks of Data Science?
As an ex-academic working in the tech sector, I’ve been exposed to both sides of the fence and have become quite familiar with their distinctive use cases. In today’s post, let’s start by clarifying the concepts and showing why causal reasoning is central to business decision-making. Then, we’ll move on to why Data Scientists should adopt a causal mindset and how they can do so.
Data Science as a Field
Data Science is an umbrella term that covers a wide range of sub-fields, each requiring different data skills. Broadly, these sub-fields follow either a correlation-based or a causation-based track. Machine Learning is probably the poster boy of the correlational track and is stealing the thunder right now. In contrast, its causal sister is less prominent but deserves more attention in the industry.
As Prof. Judea Pearl, the 2011 Turing Award winner, puts it:
“Machine Learning systems have made astounding progress at analyzing data patterns, but that is the low-hanging fruit of Artificial Intelligence.
To reach the higher fruit, AI needs a ladder, which we call the Ladder of Causation.”
From his WSJ essay “AI Can’t Reason Why”
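To make the gap between spotting a pattern and answering a causal question a bit more concrete, here is a minimal simulation sketch. Everything in it is an assumption made up for illustration: the hidden confounder `z`, the coefficients, and the helper names `pearson` and `simulate` do not come from Pearl's essay or from any real dataset. In the observational data, `x` and `y` are strongly correlated even though `x` has no effect on `y`; once `x` is assigned at random, as in an experiment, the correlation should shrink toward zero.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate(n=20_000, seed=0):
    """Return (observational correlation, experimental correlation).

    Hypothetical setup: z is an unobserved confounder that drives both
    x and y. By construction, x has NO causal effect on y.
    """
    rng = random.Random(seed)
    obs_x, obs_y, exp_x, exp_y = [], [], [], []
    for _ in range(n):
        z = rng.gauss(0, 1)  # hidden confounder

        # Observational world: x is influenced by z, and so is y.
        obs_x.append(z + rng.gauss(0, 1))
        obs_y.append(2 * z + rng.gauss(0, 1))

        # Experimental world: x is assigned at random, independent of z.
        exp_x.append(rng.gauss(0, 1))
        exp_y.append(2 * z + rng.gauss(0, 1))
    return pearson(obs_x, obs_y), pearson(exp_x, exp_y)


if __name__ == "__main__":
    obs_corr, exp_corr = simulate()
    print(f"observational correlation: {obs_corr:.3f}")
    print(f"randomized correlation:    {exp_corr:.3f}")
```

A purely correlational reading of the first number would suggest that changing `x` moves `y`; the randomized version suggests otherwise. This is only a toy setup, but it hints at why "related" and "actionable" are different things.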
In many real-life scenarios, merely knowing two things are related is not actionable; instead, we want to move up the ladder of causation and answer these “what if” questions: