Online Experimentation at Microsoft, Sept 2009
By Ron Kohavi, Thomas Crook, and Roger Longbotham

The paper won 3rd place at the Third Workshop on Data Mining Case Studies and Practice Prize.

Abstract: Knowledge Discovery and Data Mining techniques are now commonly used to find novel, potentially useful, patterns in data (Fayyad, et al., 1996; Chapman, et al., 2000). Most KDD applications involve post-hoc analysis of data and are therefore mostly limited to the identification of correlations. Recent seminal work on Quasi-Experimental Designs (Jensen, et al., 2008) attempts to identify causal relationships. Controlled experiments are a standard technique used in multiple fields. Through randomization and proper design, experiments allow establishing causality scientifically, which is why they are the gold standard in drug tests. In software development, multiple techniques are used to define product requirements; controlled experiments provide a way to assess the impact of new features on customer behavior. The Data Mining Case Studies workshop calls for describing completed implementations related to data mining. Over the last three years, we built an experimentation platform system (ExP) at Microsoft, capable of running and analyzing controlled experiments on web sites and services. The goal is to accelerate innovation through trustworthy experimentation and to enable a more scientific approach to planning and prioritization of features and designs (Foley, 2008). Along the way, we ran many experiments on over a dozen Microsoft properties and had to tackle both technical and cultural challenges. We previously surveyed the literature on controlled experiments and shared technical challenges (Kohavi, et al., 2009). This paper focuses on problems not commonly addressed in technical papers: cultural challenges, lessons, and the ROI of running controlled experiments.

Longer version of the above

Online Experimentation at Microsoft (sanitized Thinkweek 2009)

By Ron Kohavi, Thomas Crook, Roger Longbotham, Brian Frasca, Randy Henne, Juan Lavista Ferres, Tamir Melamed

This paper was recognized as a top-30 ThinkWeek paper at Microsoft.

What others have said about this paper:

  • Jared Spool wrote (tech.groups.yahoo.com/group/webanalytics): Amazingly wonderful and brilliant paper!

    If that won 3rd place, I need to read the first two, because that was outstanding. Congrats!

Jared M. Spool
User Interface Engineering

  • Greg Linden wrote: ... if you have not seen it, the paper "Online Experimentation at Microsoft" that was presented at a workshop at KDD 2009 has great tales of experimentation woe at the Redmond giant. Section 7 on "Cultural Challenges" particularly is worth a read.
Quicklink to this page: http://bit.ly/expMicrosoft