The document discusses A/B testing and experimentation at Wix, an online website building platform. It describes Wix's third generation experiment system called PETRI, which allows them to A/B test every new feature on a percentage of users before fully releasing. It highlights challenges like unexpected failures, maintaining a consistent user experience, and managing hundreds of concurrent experiments. The document also provides an overview of PETRI's capabilities and why it was developed as an open source project.
Experimenting on Humans - Advanced A/B Tests - QCon SF 2014
1. Experimenting on Humans
Aviran Mordo
Head of Back-end Engineering
@aviranm
http://www.linkedin.com/in/aviran
http://www.aviransplace.com
Talya Gendler
Back-end Team Leader
www.linkedin.com/in/talyagendler
2.
3. Wix In Numbers
Over 55M users + 1M new users/month
Static storage is >1.5 PB of data
3 data centers + 3 clouds (Google, Amazon, Azure)
1.5B HTTP requests/day
800 people work at Wix, of whom ~300 are in R&D
4.
5. Agenda
Basic A/B testing
Experiment driven development
PETRI – Wix’s 3rd generation open source experiment system
Challenges and best practices
Complexities and effect on product
21. Conclusion
EVERY new feature is A/B tested
We open the new feature to a % of users
Measure success
If it is better, we keep it
If worse, we check why and improve
If flawed, the impact is just for % of our users
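The "open to a % of users" step above is often implemented with deterministic hash bucketing, so a user's exposure is stable across requests. A minimal sketch, assuming a hashing scheme of my own choosing (this is not PETRI's actual code):

```python
import hashlib

def in_rollout(user_id: str, experiment: str, percent: int) -> bool:
    # Hash user + experiment into a stable bucket in 0..99; the user is
    # exposed when the bucket falls below the rollout percentage.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percent

# Because the bucket is stable, widening the rollout from 5% to 20%
# keeps every already-exposed user exposed.
assert in_rollout("user-42", "new-gallery", 100)
assert not in_rollout("user-42", "new-gallery", 0)
```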
24. Sh*t happens (Test could fail)
New code can have bugs
Conversion can drop
Usage can drop
Unexpected cross test dependencies
25. Minimize affected users
(in case of failure)
Gradual exposure (percentage of…)
Language
GEO
Browser
User-agent
OS
Company employees
User roles
Any other criteria you have
(extendable)
All users
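The filter criteria listed above (language, GEO, browser, ...) amount to an eligibility check before assignment. A small sketch under my own assumptions about attribute names; PETRI's real filters are extendable and richer:

```python
def eligible(user: dict, filters: dict) -> bool:
    # Every filter key (language, geo, browser, ...) must match one of
    # the allowed values; missing attributes fail the check.
    return all(user.get(key) in allowed for key, allowed in filters.items())

experiment_filters = {"language": {"en"}, "geo": {"US", "GB"}}
assert eligible({"language": "en", "geo": "US"}, experiment_filters)
assert not eligible({"language": "de", "geo": "US"}, experiment_filters)
```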
26. Not all users are equal
First-time visitors = never visited wix.com
New registered users = untainted users
31. Halting the test results in loss of data.
What can we do about it?
32. Solution – Pause the experiment!
• Maintain NEW experience for already exposed users
• No additional users will be exposed to the NEW feature
33. PETRI’s pause implementation
Use cookies to persist assignment
If the user changes browsers, the assignment is unknown
Server-side persistence solves this
but you pay in performance & scalability
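The pause semantics above (already-exposed users keep the new experience, no new exposures) can be sketched with the cookie approach the slide describes. Cookie naming and the A/B labels are illustrative assumptions, not PETRI's implementation:

```python
def assignment_when_paused(cookies: dict, experiment: str) -> str:
    # During a pause, users already exposed (cookie present with the
    # NEW group) keep the new experience; everyone else gets the old one.
    cookie_name = f"petri_{experiment}"       # hypothetical convention
    if cookies.get(cookie_name) == "B":       # already saw the new feature
        return "B"
    return "A"                                # no new users are exposed

assert assignment_when_paused({"petri_new-gallery": "B"}, "new-gallery") == "B"
assert assignment_when_paused({}, "new-gallery") == "A"
```

As the slide notes, this breaks down when the user switches browsers (the cookie is gone), which is what server-side persistence fixes at a performance cost.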
34. Decision
Keep feature
Improve code & resume experiment
Drop feature:
• Keep backwards compatibility for exposed users forever?
• Migrate users to another equivalent feature
• Drop it altogether (users lose data/work)
35.
36. Reaching statistical significance
Numbers look good but sample size is small
We need more data!
Expand
Control Group (A)
Test Group (B)
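Whether the sample is big enough is commonly checked with a two-proportion z-test; this is standard statistics, not something the slides prescribe:

```python
import math

def z_score(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    # Two-proportion z-test on conversion counts; |z| > 1.96 means
    # significant at the 95% level.
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p = (conv_a + conv_b) / (n_a + n_b)                 # pooled rate
    se = math.sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# Small samples: rates look different but the result is not significant,
# so we expand the experiment and collect more data.
assert abs(z_score(10, 100, 14, 100)) < 1.96
# Same rates, 10x the sample: now the difference is significant.
assert abs(z_score(100, 1000, 140, 1000)) > 1.96
```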
38. Keeping persistent UX
Signed-in user (Editor)
Test group assignment is determined by the user ID
Guarantee toss persistency across browsers
Anonymous user (Home page)
Test group assignment is randomly determined
Cannot guarantee a persistent experience if the user changes browsers
11% of Wix users use more than one desktop browser
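Deriving the toss from the user ID, as described for signed-in users, can be sketched like this; the hash and group labels are my own illustrative choices:

```python
import hashlib

def toss(user_id: str, experiment: str) -> str:
    # The group is a pure function of (experiment, user ID), so a
    # signed-in user gets the same group on every browser and device.
    h = int(hashlib.sha256(f"{experiment}/{user_id}".encode()).hexdigest(), 16)
    return "B" if h % 2 else "A"

# Same user, any browser, any time -> same group.
assert toss("uid-1001", "editor-test") == toss("uid-1001", "editor-test")
assert toss("uid-1001", "editor-test") in ("A", "B")
```

For anonymous users there is no stable ID to hash, which is why the assignment is random and only as persistent as the browser's cookie.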
39.
40. Always exclude robots
Don’t let Google index a losing page
Don’t let bots affect statistics
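A crude version of the bot exclusion above is a User-Agent check; the marker list here is illustrative and far from exhaustive (real systems also use IP lists and behavioral signals):

```python
BOT_MARKERS = ("googlebot", "bingbot", "crawler", "spider", "bot/")

def is_bot(user_agent: str) -> bool:
    # Substring match against known bot markers in the User-Agent.
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)

# Bots should always get the control (A) and be excluded from the stats,
# so Google never indexes a losing page.
assert is_bot("Mozilla/5.0 (compatible; Googlebot/2.1)")
assert not is_bot("Mozilla/5.0 (Windows NT 10.0) Firefox/110.0")
```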
41.
42. Possible states >= 2^(# experiments)
# of active experiments    Possible # of states
10                         1,024
20                         1,048,576
30                         1,073,741,824
Wix has ~200 active experiments → 2^200 ≈ 1.6×10^60 states
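The numbers above are just powers of two, since each binary experiment doubles the number of distinct user states:

```python
def possible_states(active_experiments: int) -> int:
    # Lower bound: experiments with more than two variants multiply by more.
    return 2 ** active_experiments

assert possible_states(10) == 1024
assert possible_states(20) == 1_048_576
assert possible_states(30) == 1_073_741_824
assert possible_states(200) > 10 ** 60   # ~1.6e60 for ~200 experiments
```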
43. Managing an ever-changing production env.
Supporting 2^N different users is challenging
How do you know which experiment causes errors?
51. Possible solutions
Enable features by existing content
Enable features by document owner’s assignment
Exclude experimental features from shared documents
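The "assign by document owner" solution above can be sketched by keying the toss on the site owner's ID instead of the viewer's, so every collaborator on a shared site sees the same features. Names and hashing are my own illustrative assumptions:

```python
import hashlib

def group_for_shared_document(owner_id: str, experiment: str) -> str:
    # Assignment depends only on the owner, never on who is viewing,
    # so collaborators cannot end up in conflicting groups on one site.
    h = int(hashlib.sha256(f"{experiment}:{owner_id}".encode()).hexdigest(), 16)
    return "B" if h % 2 else "A"

# Owner and any collaborator resolve to the same group for the same site.
assert (group_for_shared_document("owner-7", "new-widget")
        == group_for_shared_document("owner-7", "new-widget"))
```

As the speaker notes point out, this requires server-side state, since only the owner's user ID is known when a collaborator opens the site.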
52.
53. Petri is more than just an A/B test framework
Feature toggle
A/B Test
Internal testing
Personalization
Continuous deployment
Jira integration
Experiments
Dynamic configuration
QA
Automated testing
54. Petri is now an open source project
https://github.com/wix/petri
55. Q&A
http://goo.gl/L7pHnd
Aviran Mordo
Head of Back-end Engineering
@aviranm
http://www.linkedin.com/in/aviran
http://www.aviransplace.com
Talya Gendler
Back-end Team Leader
www.linkedin.com/in/talyagendler
https://github.com/wix/petri
57. Why Petri
Modeled experiment lifecycle
Open source (developed using TDD from day 1)
Running at scale on production
No deployment necessary
Both back-end and front-end experiments
Flexible architecture
Who here does A/B tests?
Who plans to do A/B tests?
A/B test is embedded in our development process
Petri is based on our experience and lessons we learned
You divide your users into groups and measure which one reached your goal
What does "better" mean?
What is your goal?
Measure conversion to register
The theory – we can make a better gallery
Our goal – make it easier for our users to build their sites (converting to premium)
It is not about winning, it's about not losing
Lessons learned from 4 years of experience
Petri allows PMs to manage their tests
A screenshot of the UI we built on top of PETRI
Premium link in the editor
If we shorten the funnel more users will reach the purchase page, thus increasing our sales
Why did it fail?
T-Shirt time
Who thinks we should start with 50%?
Remember a test could fail
Product manager defines a limited new experiment
We also test new must have features
There is no A version.
The control group just doesn't get it.
We need to improve before releasing to all users.
Lose mobile view ?
Unable to update ?
Pause is a temporary state until the system improves and the test resumes
Server-side state – performance vs. correctness, cross-datacenter replicas
The end result of every A/B test is reaching a decision. For this we need enough numbers.
Add %, countries, etc.
As discussed in the pause scenario, here too we cannot take away the ‘new’ experience
For anonymous users – this is the best we can do. This means sometimes (~11%) users will see different experiences.
What would you expect the result should be for a bot? A? B?
Second T-shirt time!
Production is never in a ‘known’ state
At least 2^N (some experiments have more than two options)
It is hard to know and we don’t always know exactly.
Try to understand what was opened recently / recreate and eliminate
Overrides also support a list of users.
The obvious answer may be – allow the friend to edit the component if it’s already in the site
But then – what if the friend deletes the component by mistake (or on purpose)? Then if he’s assigned to A he won’t be able to add it back.
Possible solution – assign by site owner instead of by user
(this means you must implement server-side state) (why? Because you don't know what language/geo etc. the site owner was in when he got assigned – you only know his user ID)
Not perfect, user may experience something else on his own document
Expose features internally to company employees
Select assignment by sites (not only by users)