Microsoft's Experimentation Platform

Accelerating software innovation through trustworthy experimentation

Talks & Presentations
 

4/9/2010

 

SD Forum 2010: Online Controlled Experiments: Listening to the Customers, not to the HiPPO 

 

On-Demand: ‘Best of’ Which Test Won's 2009 Webinars: Top 7 Testing Pitfalls
Nov 18, 2009
 
Webinar On-Demand Audio
Slides from talk: PPT
 
by Ronny Kohavi, GM Microsoft Experimentation Platform
 

 
Experimentation Platform at Microsoft - Diversity Job Fair
Sept 23, 2009
 
Tamir Melamed, Principal Dev Manager, ExP
Seth Eliot, Senior Test Manager, ExP
Leslie Corneto, Microsoft Staffing
 
Presentation: PPTX, PDF
 
For more info on Microsoft's Industry-Leading Benefits click here.
 

 
 
Talk at Seattle Tech Startups: Online Experimentation at Microsoft
Sept 9, 2009
 
 
Abstract: Knowledge Discovery and Data Mining techniques are now commonly used to find novel, potentially useful, patterns in data (Fayyad, et al., 1996; Chapman, et al., 2000). Most KDD applications involve post-hoc analysis of data and are therefore mostly limited to the identification of correlations. Recent seminal work on Quasi-Experimental Designs (Jensen, et al., 2008) attempts to identify causal relationships. Controlled experiments are a standard technique used in multiple fields. Through randomization and proper design, experiments allow establishing causality scientifically, which is why they are the gold standard in drug tests. In software development, multiple techniques are used to define product requirements; controlled experiments provide a way to assess the impact of new features on customer behavior.  The Data Mining Case Studies workshop calls for describing completed implementations related to data mining. Over the last three years, we built an experimentation platform system (ExP) at Microsoft, capable of running and analyzing controlled experiments on web sites and services. The goal is to accelerate innovation through trustworthy experimentation and to enable a more scientific approach to planning and prioritization of features and designs (Foley, 2008).  Along the way, we ran many experiments on over a dozen Microsoft properties and had to tackle both technical and cultural challenges. We previously surveyed the literature on controlled experiments and shared technical challenges (Kohavi, et al., 2009).  This paper focuses on problems not commonly addressed in technical papers: cultural challenges, lessons, and the ROI of running controlled experiments.  
 
By Ronny Kohavi, Thomas Crook, and Roger Longbotham
  

 
KDD 2009 Tutorial: Planning, Running, and Analyzing Controlled Experiments on the Web
June 2009
 

Tutorial Description

The web provides an unprecedented opportunity to evaluate ideas quickly using controlled experiments, also called randomized experiments, A/B tests (and their generalizations), split tests, and MultiVariable Tests (MVT). Controlled experiments embody the best scientific design for establishing a causal relationship between changes and their influence on user-observable behavior. Data Mining and Knowledge Discovery techniques can then be used to analyze the data from such experiments. The tutorial will provide a survey and practical guide to running controlled experiments based on the recently published survey article in the Data Mining and Knowledge Discovery Journal, co-authored by two of the tutorial co-presenters (http://exp-platform.com/dmkd_survey.aspx), and on the book “Always Be Testing”, co-authored by the third tutorial co-presenter (http://www.amazon.com/Always-Be-Testing-Complete-Optimizer/dp/0470290633). The book covers the use of industry tools, such as Google Website Optimizer, and recently ranked #2 on Amazon’s sales rank for computers/e-commerce books. The tutorial includes multiple real-world examples of actual controlled experiments (many with surprising results), a review of the theory and the statistics used to plan and analyze such experiments, and a discussion of the limitations and pitfalls that experimenters might face.
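
As a rough illustration of the kind of statistics used to analyze an A/B test, the sketch below compares conversion rates in Control and Treatment with a two-sided two-proportion z-test. It is a minimal example, not the tutorial's own material: the function name, its parameters, and the counts in the usage lines are all hypothetical, and a real analysis would also need to consider sample-size planning and the pitfalls the tutorial discusses.

```python
import math

def two_proportion_z_test(conv_c, n_c, conv_t, n_t):
    """Two-sided z-test for a difference in conversion rates.

    conv_c, n_c: conversions and users in Control (hypothetical inputs)
    conv_t, n_t: conversions and users in Treatment (hypothetical inputs)
    Returns (difference in rates, z statistic, two-sided p-value).
    """
    p_c = conv_c / n_c
    p_t = conv_t / n_t
    # Pooled rate under the null hypothesis that both variants convert equally.
    p_pool = (conv_c + conv_t) / (n_c + n_t)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_c + 1 / n_t))
    z = (p_t - p_c) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_t - p_c, z, p_value

# Made-up counts, for illustration only.
diff, z, p_value = two_proportion_z_test(conv_c=100, n_c=1000, conv_t=130, n_t=1000)
print(f"difference={diff:.3f}, z={z:.2f}, p-value={p_value:.4f}")
```

The normal approximation used here is only reasonable when each variant has a fairly large number of users and conversions.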

 

A video of a related talk can be found on the videolectures website: http://videolectures.net/cikm08_kohavi_pgtce/

KDD 2009 Data Mining Case Studies Workshop, 3rd place winner: Online Experimentation at Microsoft (June 28, 2009)

KDD 2009 tutorial: Planning, Running, and Analyzing Controlled Experiments on the Web (June 28, 2009)

 

Slides from this tutorial:
  • Controlled Experiments tutorial part 1: PPTX, PDF
  • Controlled Experiments tutorial part 2: PPTX, PDF
  • Controlled Experiments tutorial part 3: PPTX, PDF

 

By Ronny Kohavi, Roger Longbotham, John Quarto-vonTivadar

 


 
Oct 29, 2008
 
ACM 17th Conference on Information and Knowledge Management (CIKM 2008); a video is also available.
 

 
Talk at IMTC 2008
4/3/2008
 
 
Randy Henne, Principal Group Program Manager, Experimentation Platform EMEA, Microsoft Corporation

 
Talk at Stanford
Jan 25, 2008
 
 
Recording available (about 90 minutes; the talk starts at 3:10, after an intro by Terry Winograd)
 

 
Talk at Emetrics 2007 in Washington DC

 

Practical Guide to Controlled Experiments (includes multiple new examples).

 
Ronny Kohavi, General Manager, Experimentation Platform, Microsoft Corporation

 
Talk at eBay Research Labs
June 6, 2007
 
Practical Guide to Controlled Experiments on the Web: Listen to Your Customers not to the HiPPO (Paper)

Ronny Kohavi, General Manager, Experimentation Platform, Microsoft Corporation

The web provides an unprecedented opportunity to evaluate ideas quickly using controlled experiments, also called randomized experiments (single-factor or factorial designs), A/B tests (and their generalizations), split tests, Control/Treatment tests, and parallel flights. Controlled experiments embody the best scientific design for establishing a causal relationship between changes and their influence on user-observable behavior. We will look at several real-world examples with significant return-on-investment (ROI), where it was clearly better to experiment and listen to the customers and not to the Highest Paid Person’s Opinion (HiPPO). We will review the important ingredients of running controlled experiments, and discuss their limitations (both technical and organizational). We describe common architectures for experimentation systems and analyze their advantages and disadvantages. Based on our extensive practical experience with multiple systems and organizations, we share key lessons that will help practitioners in running trustworthy controlled experiments.
 
Based on paper accepted to KDD 2007, co-authored with Randy Henne and Dan Sommerfield.
 
 

 
ACM Data Mining SIG

June 14, 2006

This was the first talk of the Data Mining SIG of the local Bay Area ACM chapter in Palo Alto, CA.

 

Focus the Mining Beacon: Lessons and Challenges from the World of E-Commerce (PPT)
More information at the event site: http://sfbayacm.org/events/2006-06-14.php

 

Ron Kohavi, General Manager, Experimentation Platform, Microsoft Corporation

 

Electronic Commerce is now entering its second decade, with Amazon.com and eBay now in existence for ten years.  With massive amounts of data, an actionable domain, and measurable ROI, multiple companies use data mining and knowledge discovery to understand their customers and improve interactions.  The talk will cover important lessons and challenges using e-commerce examples across two dimensions: (i) business-level to technical, and (ii) the mining lifecycle from data collection, data warehouse construction, to discovery and deployment.   The talk will include examples from real-world A/B tests and Simpson's paradox.
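
For readers unfamiliar with Simpson's paradox in the context of A/B tests, the following sketch shows how it can arise when the share of users sent to Treatment changes between days. The counts are made up purely for illustration and do not come from the talk; the day names, variant labels, and helper functions are hypothetical.

```python
# Illustrative, made-up counts: (conversions, users) per day and variant.
# Treatment receives about 1% of users on Friday and 50% on Saturday.
data = {
    "Friday":   {"control": (20_000, 990_000), "treatment": (230, 10_000)},
    "Saturday": {"control": (5_000, 500_000),  "treatment": (6_000, 500_000)},
}

def rate(conversions, users):
    """Conversion rate for one (conversions, users) pair."""
    return conversions / users

def pooled_rate(variant):
    """Conversion rate for a variant after naively pooling all days."""
    conversions = sum(day[variant][0] for day in data.values())
    users = sum(day[variant][1] for day in data.values())
    return conversions / users

# Within each day, Treatment converts better than Control.
for day, variants in data.items():
    c = rate(*variants["control"])
    t = rate(*variants["treatment"])
    print(f"{day}: control {c:.2%}, treatment {t:.2%}")

# After pooling the two days, the direction reverses.
print(f"Pooled: control {pooled_rate('control'):.2%}, "
      f"treatment {pooled_rate('treatment'):.2%}")
```

The reversal happens because most Treatment users come from the lower-converting day, so the pooled comparison mixes the effect of the change with the effect of the day.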


 

 
Approaches to Online MVTs
December 12, 2007

Roger Longbotham, Principal Statistician, Experimentation Platform, Microsoft Corporation

This position paper introduces two approaches to carrying out MVTs online that I believe are superior to the most common approach but are rarely, if ever, used. By publishing this position paper I hope to get feedback from other practitioners on where they agree and disagree with my arguments. I realize that some disagreements may arise from different priorities or technical challenges. It will be helpful to hear many perspectives and the reasons for the agreement or disagreement. Please email me at Roger.Longbotham@microsoft.com.