Causal Inference at Twitter

By Alex Tabarrok

Twitter engineering had a nice tweet thread on how they use econometrics and causal inference:

 You may have heard about this year’s Economics Nobel Prize winners – David Card, Josh Angrist (@metrics52) & Guido Imbens.

Their publicly available work has helped us solve tough problems @Twitter, and we’re excited to celebrate by sharing how their findings have inspired us. Understanding causal relationships is core to our work on identifying growth opportunities and measuring impact.

This year’s winners laid the foundation for cutting-edge techniques we use to understand where Twitter can improve and how changes affect our platform experience.
To share a few exciting causal inference applications at Twitter:

While online experimentation is helpful to understand the impact of a product change, it may not be the most efficient way to measure long-term impact. We built a causal estimation framework on the idea of statistical ‘surrogacy’ (Athey et al 2016) – when we can’t wait to observe long-run outcomes, we create a model based on intermediate data.
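To make the surrogacy idea concrete, here is a minimal sketch of a surrogate-index estimate. It is not Twitter's framework: the data are simulated, every name (`fit_ols`, `surrogate_index_effect`, `simulate`) is hypothetical, and plain least squares stands in for whatever model a real team would use. It assumes the treatment moves the long-run outcome only through the surrogates, and that the link between surrogates and outcome is the same in the historical data as in the experiment.

```python
import random

def fit_ols(rows, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented system [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def surrogate_index_effect(hist_s, hist_y, exp_s, exp_w):
    """Estimate the long-run effect from short-run surrogates only."""
    # Step 1: learn E[outcome | surrogates] where the long-run outcome is observed.
    beta = fit_ols(hist_s, hist_y)
    # Step 2: predict the long-run outcome (the surrogate index) in the experiment.
    index = [beta[0] + sum(b * s for b, s in zip(beta[1:], row)) for row in exp_s]
    # Step 3: compare the average index between treated and control units.
    treated = [v for v, w in zip(index, exp_w) if w]
    control = [v for v, w in zip(index, exp_w) if not w]
    return sum(treated) / len(treated) - sum(control) / len(control)

def simulate(n=5000, seed=0):
    """Simulated data (an assumption for illustration): true long-run effect is 0.7."""
    rng = random.Random(seed)
    # Historical data: two surrogates and the long-run outcome they drive.
    hist_s = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(n)]
    hist_y = [1 + 2 * a + 0.5 * b + rng.gauss(0, 0.5) for a, b in hist_s]
    # Experiment: treatment shifts the surrogates; the long-run outcome is unobserved.
    exp_w = [rng.random() < 0.5 for _ in range(n)]
    exp_s = [(0.3 * w + rng.gauss(0, 0.5), 0.2 * w + rng.gauss(0, 0.5)) for w in exp_w]
    return hist_s, hist_y, exp_s, exp_w

if __name__ == "__main__":
    print(surrogate_index_effect(*simulate()))
```

In this simulation the treatment raises the first surrogate by 0.3 and the second by 0.2, so the implied long-run effect is 2 × 0.3 + 0.5 × 0.2 = 0.7, and the estimate should land close to that without ever observing the long-run outcome in the experiment.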

Estimating Treatment Effects using Multiple Surrogates: The Role of the Surrogate Score and the Surrogate Index

Estimating the long-term effects of treatments is of interest in many fields. A common challenge in estimating such treatment effects is that long-term outcomes are unobserved in the time frame needed.

We combine this framework with our online experimentation platform to form a feedback/validation loop and to help accurately infer product success.

One of the challenges we face is understanding the impact of different actions at Twitter (likes, Retweets, etc.). Engagement actions often occur sequentially and at different surface areas. Disentangling the effects of multiple actions presents many challenges.
We use Double Machine Learning to understand the causal impact of engagement actions.

Our work leverages research by Chernozhukov et al. (2018), and is influenced by Imbens & Rubin (2015).
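For readers who have not seen Double Machine Learning, here is a minimal sketch of the cross-fitted "partialling-out" estimator for a partially linear model. It is an illustration under stated assumptions, not Twitter's implementation: the data are simulated, all names (`fit_bin_means`, `dml_effect`, `simulate`) are hypothetical, and a simple bin-average regression stands in for the machine learning models that would normally predict the action and the outcome from the confounders. Think of `x` as a user's baseline activity, `d` as an engagement action such as likes, and `y` as a later outcome.

```python
import random

def fit_bin_means(x, y, n_bins=20):
    """Stand-in learner: predict y by its average within equal-width bins of x."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / n_bins or 1.0

    def bin_of(v):
        # Values outside the training range are clipped to the edge bins.
        return min(n_bins - 1, max(0, int((v - lo) / width)))

    sums, counts = [0.0] * n_bins, [0] * n_bins
    for xi, yi in zip(x, y):
        sums[bin_of(xi)] += yi
        counts[bin_of(xi)] += 1
    overall = sum(y) / len(y)
    means = [s / c if c else overall for s, c in zip(sums, counts)]
    return lambda v: means[bin_of(v)]

def dml_effect(x, d, y, n_folds=2, seed=0):
    """Cross-fitted estimate of theta in y = theta * d + g(x) + noise."""
    n = len(x)
    order = list(range(n))
    random.Random(seed).shuffle(order)
    folds = [order[k::n_folds] for k in range(n_folds)]
    d_res, y_res = [0.0] * n, [0.0] * n
    for held_out in folds:
        held = set(held_out)
        train = [i for i in order if i not in held]
        # Fit both nuisance models on the other folds only (cross-fitting).
        predict_d = fit_bin_means([x[i] for i in train], [d[i] for i in train])
        predict_y = fit_bin_means([x[i] for i in train], [y[i] for i in train])
        for i in held_out:
            d_res[i] = d[i] - predict_d(x[i])
            y_res[i] = y[i] - predict_y(x[i])
    # Regress outcome residuals on action residuals.
    return sum(a * b for a, b in zip(d_res, y_res)) / sum(a * a for a in d_res)

def naive_slope(d, y):
    """Slope from regressing y on d alone, ignoring the confounder."""
    md, my = sum(d) / len(d), sum(y) / len(y)
    cov = sum((a - md) * (b - my) for a, b in zip(d, y))
    return cov / sum((a - md) ** 2 for a in d)

def simulate(n=4000, theta=1.0, seed=1):
    """Simulated data (an assumption): x drives both the action d and the outcome y."""
    rng = random.Random(seed)
    x = [rng.uniform(-2, 2) for _ in range(n)]
    d = [xi + rng.gauss(0, 1) for xi in x]
    y = [theta * di + 3 * xi + rng.gauss(0, 1) for di, xi in zip(d, x)]
    return x, d, y

if __name__ == "__main__":
    x, d, y = simulate()
    print("naive:", naive_slope(d, y), "dml:", dml_effect(x, d, y))
```

Because `x` drives both the action and the outcome, the naive slope comes out well above the true effect of 1.0, while the residual-on-residual estimate should sit close to it. Fitting the nuisance models on one fold and computing residuals on the other is what keeps overfitting in the learners from leaking into the effect estimate.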

Causal Inference for Statistics, Social, and Biomedical Sciences

This framework helps the team to interpret search experiments and make Twitter a better place to serve the public conversation. These applications promote a better understanding of tradeoffs among competing signals, helping our engineering team to iterate fast under more principled measurement and decision frameworks, making Twitter a better platform to create and share ideas and information.

We’re grateful for the role that academic research plays in driving innovation across society. We couldn’t do this work without the methodological foundation of the winners’ work and contributions across academia. Work like this inspires product innovation and engineering ideas alike, and we look forward to all that is yet to come.

More details on Twitter Data Science work will be introduced in our upcoming Engineering Blog posts.