Controlled experiments, also called randomized experiments and A/B tests, have had a profound influence on multiple fields, including medicine, agriculture, manufacturing, and advertising. While the theoretical aspects of offline controlled experiments have been well studied and documented, the practical aspects of running them in online settings, such as web sites and services, are still being developed. Online experiments are most useful in conjunction with agile development methodologies, where the necessary ingredients exist for rapid feedback and improvement cycles. As the usage of controlled experiments grows in these online settings, it is becoming more important to understand the opportunities and pitfalls one might face when using them in practice. Multiple lessons learned from running controlled experiments online were documented in the Practical Guide to Controlled Experiments on the Web (Kohavi, et al., 2007). In this follow-on paper, we focus on “advanced” pitfalls we have seen, which include a wide range of topics from incorrectly computing confidence intervals when reporting percent effects (as opposed to absolute effects) to surprising observations about the impact of the choice of metrics on statistical power, to the influence of robots and ways to remove them (a problem unique to online settings). Online experiments allow for techniques like gradual ramp-up of treatments to avoid the possibility of exposing many customers to a bad (e.g., buggy) Treatment. With that ability, we discovered that it’s easy to incorrectly identify the winning Treatment because of Simpson’s paradox. We also share some results from an actual experiment on the MSN portal, where the value of additional ads was evaluated using controlled experiments, and how a monetary Overall Evaluation Criterion (OEC) was developed.
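
To make the ramp-up pitfall concrete, here is a minimal sketch of how Simpson's paradox can arise when the share of users assigned to the Treatment changes between days. The counts, the day labels, and the function names (`rate`, `daily_winners`, `pooled_rates`) are illustrative assumptions chosen for this example; they are not results or code from the experimentation platform.

```python
# Illustrative counts only: (conversions, users) per day and variant.
# Friday runs the Treatment on about 1% of users; Saturday ramps up to 50%.
days = {
    "Friday (1% ramp)": {
        "control": (20_000, 990_000),
        "treatment": (230, 10_000),
    },
    "Saturday (50% ramp)": {
        "control": (5_000, 500_000),
        "treatment": (6_000, 500_000),
    },
}


def rate(conversions, users):
    """Conversion rate as a fraction of users."""
    return conversions / users


def daily_winners(days):
    """For each day, return the variant with the higher conversion rate."""
    return {
        day: max(variants, key=lambda name: rate(*variants[name]))
        for day, variants in days.items()
    }


def pooled_rates(days):
    """Naively sum counts across days, ignoring the change in ramp percentage."""
    totals = {}
    for variants in days.values():
        for name, (conversions, users) in variants.items():
            c, u = totals.get(name, (0, 0))
            totals[name] = (c + conversions, u + users)
    return {name: rate(c, u) for name, (c, u) in totals.items()}


winners = daily_winners(days)
pooled = pooled_rates(days)
pooled_winner = max(pooled, key=pooled.get)

for day, winner in winners.items():
    print(f"{day}: higher rate = {winner}")
print(f"Pooled over both days: higher rate = {pooled_winner}")
```

In this sketch the Treatment has the higher conversion rate on each day taken separately, yet the pooled totals favor the Control. The reversal happens because most Treatment users come from the lower-converting day, so comparing variants within periods that share the same ramp percentage avoids the trap.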
 
 
What others are saying
 
  • MS Experimentation team makes another winner

The MS team creating their own online experimentation platform continues to impress me with their willingness to share both their learning and their expertise.

It’s one thing to pump out “best practices” which are watered down samples of simple a/b findings. These guys not only talk about the complexities of multivariate testing, but give concrete examples and the statistics behind them. Offermatica, Optimost, even Memetrics had good stats behind their systems, but they rarely talked about the meaty stuff. The MS team has really been a great resource to help those past the simple A/B.

You don’t need to be a super stats-head to get the basic issues here, but if you don’t even take the time to understand the basics, you will waste lots of time and money. Well worth the read.

The Exp Platform, led by Ronny Kohavi, at MSFT publishes from this position of strength. Their latest, 7 pitfalls to controlled experiments on the web, is a solid read for those aspiring to live in this space.
I've blogged the guide to practical web experiments and it's also highly recommended. It provides an overview of the key issues to deal with in setting things up including sampling, failure versus success evaluation, and common pitfalls like day of the week effects.

Seven Pitfalls to Avoid when Running Controlled Experiments on the Web is a great white paper by Thomas Crook, Brian Frasca, Ronny Kohavi, and Roger Longbotham from Microsoft. Check out the site for the MSFT Experimentation Platform while you're at it. Cool stuff.