Data can show you what happened, and it can project into the future with some degree of confidence, but it can never tell you why things are the way they are.
No matter how massive your dataset, how sophisticated your pipeline, or how tight your confidence intervals, raw data alone cannot answer the questions that actually move a business forward: What would happen if we changed something? Would revenue go up if we launched this feature? Would churn go down if we changed the pricing page? Would engagement improve if we redesigned the onboarding flow?
These are causal questions. And data, on its own, is silent on causality.
The brilliance of the turkey problem
I love this example, originally from Bertrand Russell and later sharpened by Nassim Taleb. Imagine a turkey fed by a butcher every single day for a thousand days. Every data point reinforces the same conclusion: the butcher is a generous, reliable provider.
The trend is undeniable and the confidence intervals get tighter with each passing day. Any reasonable statistical model would predict that tomorrow will bring yet another meal.
But then comes the Wednesday before Thanksgiving: the turkey undergoes what Taleb calls a “sharp revision of belief.”
The turkey’s model was predictively excellent, right up until the moment it mattered most. It captured the pattern perfectly but understood nothing about the mechanism behind it. This is the difference between prediction and understanding, between correlation and causation. And in business, getting this wrong is how you end up scaling the wrong strategy with full confidence.
What Causal Inference actually is
At its core, causality is about understanding what would happen under different scenarios: one where you act, and one where you don’t. Causal inference simply gives us a structured way to approach this comparison.
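To make the two-scenario idea concrete, here is a minimal simulation sketch in Python. Every number in it is invented for illustration: the spend levels, the self-selection rule, the effect size. The point is only that each user is ever observed in one scenario, never both, and that a naive treated-versus-untreated comparison can drift far from the true effect.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000

# Hypothetical potential outcomes for each user: what they would spend
# with the intervention (y1) and without it (y0). In real data we never
# observe both for the same user.
y0 = rng.normal(loc=50, scale=10, size=n)
y1 = y0 + 5  # assumed true effect of acting: +5 per user

# Suppose users who would spend more anyway are also more likely to be
# treated (self-selection) -- exactly what breaks naive comparisons.
treated = rng.random(n) < (y0 - y0.min()) / (y0.max() - y0.min())

observed = np.where(treated, y1, y0)  # we only ever see one scenario

naive_diff = observed[treated].mean() - observed[~treated].mean()
true_effect = (y1 - y0).mean()

print(f"naive treated-vs-untreated gap: {naive_diff:.2f}")
print(f"true causal effect:             {true_effect:.2f}")
```

The naive gap comes out well above the true effect of 5, because the users who act are the ones who would have spent more anyway.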
We start by formulating a precise causal question before looking at the data. This matters more than people think. Hypothesizing after results are known is one of the most common ways analyses go wrong. It feels like insight, but it’s actually pattern-matching dressed up as discovery.
Now let’s talk about the other ingredients of a good causal analysis, starting with how you frame the question.
John Dewey said, “A problem well stated is a problem half solved.” He was right, and this is where most causal analyses either succeed or fail: not in the modeling, but in the framing.
For example, a vague causal question in a CRM context might be: “Are our push notifications driving sales?”
A sharper version would be: “Does sending a ‘We Miss You’ coupon to users who have been inactive for 60 days lead to more purchases over the next 24 hours compared to sending no notification at all?”
Notice that a well-defined causal question has four components.
What is the intervention in question? What specific action, policy, or feature change are you introducing? This needs to be concrete and measurable. “Improving the user experience” is not a treatment. “Displaying a redesigned checkout page with a single-step payment flow” is.
Here, the intervention is sending the ‘We Miss You’ coupon.
What outcome are you measuring? What KPI do you expect the treatment to move? Daily active users, average order value, churn rate, conversion: pick the metric that matters most for your business question. You can track secondary outcomes, but be honest about which one is primary.
The clearly defined outcome is purchases over the next 24 hours.
Who is your population, and over what time period? “All users” is almost never precise enough. Think about it: your user base includes people who signed up three years ago and never came back. Be specific. Are you studying new sign-ups from the last 30 days? High-value customers in a particular region? Active users during a promotional window?
The population is users who have been inactive for 60 days.
What is the comparison? This is the counterfactual: the scenario you’re measuring against. Are you comparing against users who saw the old feature, users who received no intervention at all, or users who received some kind of placebo? The choice of counterfactual shapes everything that follows.
A cleaner comparison would be to have a control group that receives no notification at all, essentially an A/B test.
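Framed this way, the question maps almost line by line onto an analysis. Here is a rough sketch of what that could look like in Python. The CRM columns, purchase rates, and 50/50 split are hypothetical assumptions, not numbers from any real system; the data is simulated so the example runs end to end.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)

# Hypothetical CRM snapshot; column names and values are illustrative.
users = pd.DataFrame({
    "user_id": np.arange(5_000),
    "days_inactive": rng.integers(0, 200, size=5_000),
})

# Population: users who have been inactive for 60 days.
population = users[users["days_inactive"] >= 60].copy()

# Intervention vs. comparison: randomly send the 'We Miss You' coupon
# to half the population; the other half receives no notification at all.
population["got_coupon"] = rng.random(len(population)) < 0.5

# Outcome: purchase within 24 hours of assignment. Simulated here with a
# made-up two-point lift for the coupon group.
base_rate = 0.03
got = population["got_coupon"].to_numpy()
population["purchased_24h"] = rng.random(len(population)) < (base_rate + 0.02 * got)

lift = (
    population.loc[population["got_coupon"], "purchased_24h"].mean()
    - population.loc[~population["got_coupon"], "purchased_24h"].mean()
)
print(f"estimated lift in 24h purchase rate: {lift:.2%}")
```

Because assignment is random, the no-notification group stands in for the counterfactual: what the treated users would have done without the coupon.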
The tools are the easy part
Once you have a well-framed question, you apply one of several well-established methods to estimate the causal effect. A/B tests, difference-in-differences, regression discontinuity, instrumental variables, synthetic controls: these are the workhorses of causal inference, and they each have their place depending on the situation.
But here’s the thing: learning the tools is genuinely the straightforward part. Books like The Effect by Nick Huntington-Klein, Causal Inference: The Mixtape by Scott Cunningham, or Everyday Causal Inference cover these methods in detail. The hard part is everything that comes before: asking the right question, choosing the right comparison, and being honest about what your data can and cannot support.
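To give a flavor of one of those workhorses, here is a minimal difference-in-differences sketch using statsmodels on simulated data. The groups, time periods, and effect size are all made up; the only point is that the interaction term recovers the causal effect, provided the parallel-trends assumption holds.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Simulated panel: a treated group and a control group, each observed
# before and after an intervention. Numbers are invented for illustration.
n = 4_000
df = pd.DataFrame({
    "treated_group": rng.integers(0, 2, n),
    "post_period": rng.integers(0, 2, n),
})
true_effect = 3.0
df["revenue"] = (
    20
    + 2.0 * df["treated_group"]        # the groups differ at baseline
    + 1.5 * df["post_period"]          # everyone drifts over time
    + true_effect * df["treated_group"] * df["post_period"]
    + rng.normal(0, 2, n)
)

# Difference-in-differences: the interaction coefficient is the causal
# estimate, under the parallel-trends assumption.
model = smf.ols("revenue ~ treated_group * post_period", data=df).fit()
print(model.params["treated_group:post_period"])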
The part most people skip: trying to prove yourself wrong
After you estimate your effect, the work isn’t done. In fact, the most important step comes next: falsification.
One of the core values of science is actively trying to prove yourself wrong. You run robustness checks, you test placebo outcomes, you check how your conclusions would change if your assumptions were violated. You stress-test every angle you can think of.
Only if your conclusion still stands after repeated, genuine attempts to break it do you have something worth acting on. Even then, you hold it provisionally. You make decisions based on the best available evidence, knowing that stronger evidence might come along tomorrow and change the picture.
This is not a weakness of the method. It’s the entire point. The goal was never certainty. The goal is making better decisions under uncertainty, and knowing exactly how uncertain you are.
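One concrete way to try to break your own result is a permutation check: scramble the treatment labels so they mean nothing, and see how often pure noise reproduces a lift as large as the one you observed. Here is a minimal sketch on simulated purchase data; the rates and sample sizes are invented.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated outcomes: treated users purchase slightly more often.
treated = rng.random(2_000) < 0.05   # assumed 5% purchase rate with the coupon
control = rng.random(2_000) < 0.03   # assumed 3% without it
observed_lift = treated.mean() - control.mean()

# Placebo / permutation check: shuffle the labels so "treatment" is
# meaningless, and count how often random relabeling produces a lift
# at least as large as the one we observed.
pooled = np.concatenate([treated, control])
placebo_lifts = []
for _ in range(5_000):
    rng.shuffle(pooled)
    placebo_lifts.append(pooled[:2_000].mean() - pooled[2_000:].mean())

p_value = np.mean(np.abs(placebo_lifts) >= abs(observed_lift))
print(f"observed lift: {observed_lift:.3%}, permutation p-value: {p_value:.3f}")
```

If random relabelings beat your observed lift on a regular basis, the “effect” is probably noise, and it is far better to learn that from your own placebo test than from a failed rollout.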
Being explicit about assumptions is the opposite of fragility. What sets causal inference apart is that it forces you to declare what you know and what you believe about the world before claiming you’ve learned something from the data.
That kind of transparency is far less common in traditional analytics, where the temptation is to look at results first and then build a narrative around them. Explicit assumptions can be questioned, tested, and defended. Hidden assumptions conceal bias, and everything works fine until it doesn’t, with no way to understand why.



