The document discusses A/B testing and experimentation at Wix, an online website building platform. It describes Wix's third generation experiment system called PETRI, which allows them to A/B test every new feature on a percentage of users before fully releasing. It highlights challenges like unexpected failures, maintaining a consistent user experience, and managing hundreds of concurrent experiments. The document also provides an overview of PETRI's capabilities and why it was developed as an open source project.
Experimenting on Humans - Advanced A/B Tests - QCon SF 2014
1. Experimenting on Humans
Aviran Mordo
Head of Back-end Engineering
@aviranm
http://www.linkedin.com/in/aviran
http://www.aviransplace.com
Talya Gendler
Back-end Team Leader
www.linkedin.com/in/talyagendler
2.
3. Wix In Numbers
Over 55M users + 1M new users/month
Static storage is >1.5 PB of data
3 data centers + 3 clouds (Google, Amazon, Azure)
1.5B HTTP requests/day
800 people work at Wix, of whom ~300 are in R&D
4.
5. Agenda
Basic A/B testing
Experiment driven development
PETRI – Wix’s 3rd generation open source experiment system
Challenges and best practices
Complexities and effect on product
21. Conclusion
EVERY new feature is A/B tested
We open the new feature to a % of users
Measure success
If it is better, we keep it
If worse, we check why and improve
If flawed, the impact is just for % of our users
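The "open to a % of users" step above is often implemented with deterministic hash bucketing, so a user's exposure is stable across requests. A minimal sketch, assuming a hashing scheme of my own choosing (this is not PETRI's actual code):

```python
import hashlib

def in_rollout(user_id: str, experiment: str, percent: int) -> bool:
    # Hash user + experiment into a stable bucket in 0..99; the user is
    # exposed when the bucket falls below the rollout percentage.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percent

# Because the bucket is stable, widening the rollout from 5% to 20%
# keeps every already-exposed user exposed.
assert in_rollout("user-42", "new-gallery", 100)
assert not in_rollout("user-42", "new-gallery", 0)
```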
24. Sh*t happens (Test could fail)
New code can have bugs
Conversion can drop
Usage can drop
Unexpected cross test dependencies
25. Minimize affected users
(in case of failure)
Gradual exposure (percentage of…)
Language
GEO
Browser
User-agent
OS
Company employees
User roles
Any other criteria you have
(extendable)
All users
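The filter criteria listed above (language, GEO, browser, ...) amount to an eligibility check before assignment. A small sketch under my own assumptions about attribute names; PETRI's real filters are extendable and richer:

```python
def eligible(user: dict, filters: dict) -> bool:
    # Every filter key (language, geo, browser, ...) must match one of
    # the allowed values; missing attributes fail the check.
    return all(user.get(key) in allowed for key, allowed in filters.items())

experiment_filters = {"language": {"en"}, "geo": {"US", "GB"}}
assert eligible({"language": "en", "geo": "US"}, experiment_filters)
assert not eligible({"language": "de", "geo": "US"}, experiment_filters)
```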
26. Not all users are equal
First-time visitors = never visited wix.com
New registered users = untainted users
31. Halting the test results in loss of data.
What can we do about it?
32. Solution – Pause the experiment!
• Maintain NEW experience for already exposed users
• No additional users will be exposed to the NEW feature
33. PETRI’s pause implementation
Use cookies to persist assignment
If the user changes browsers, the assignment is unknown
Server-side persistence solves this
but you pay in performance & scalability
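The pause semantics above (already-exposed users keep the new experience, no new exposures) can be sketched with the cookie approach the slide describes. Cookie naming and the A/B labels are illustrative assumptions, not PETRI's implementation:

```python
def assignment_when_paused(cookies: dict, experiment: str) -> str:
    # During a pause, users already exposed (cookie present with the
    # NEW group) keep the new experience; everyone else gets the old one.
    cookie_name = f"petri_{experiment}"       # hypothetical convention
    if cookies.get(cookie_name) == "B":       # already saw the new feature
        return "B"
    return "A"                                # no new users are exposed

assert assignment_when_paused({"petri_new-gallery": "B"}, "new-gallery") == "B"
assert assignment_when_paused({}, "new-gallery") == "A"
```

As the slide notes, this breaks down when the user switches browsers (the cookie is gone), which is what server-side persistence fixes at a performance cost.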
34. Decision
Keep feature
Improve code & resume experiment
Drop feature:
• Keep backwards compatibility for exposed users forever?
• Migrate users to another equivalent feature
• Drop it altogether (users lose data/work)
35.
36. Reaching statistical significance
Numbers look good but sample size is small
We need more data!
Expand
Control Group (A)
Test Group (B)
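Whether the sample is big enough is commonly checked with a two-proportion z-test; this is standard statistics, not something the slides prescribe:

```python
import math

def z_score(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    # Two-proportion z-test on conversion counts; |z| > 1.96 means
    # significant at the 95% level.
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p = (conv_a + conv_b) / (n_a + n_b)                 # pooled rate
    se = math.sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# Small samples: rates look different but the result is not significant,
# so we expand the experiment and collect more data.
assert abs(z_score(10, 100, 14, 100)) < 1.96
# Same rates, 10x the sample: now the difference is significant.
assert abs(z_score(100, 1000, 140, 1000)) > 1.96
```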
38. Keeping persistent UX
Signed-in user (Editor)
Test group assignment is determined by the user ID
Guarantee toss persistency across browsers
Anonymous user (Home page)
Test group assignment is randomly determined
Cannot guarantee a persistent experience if the user changes browsers
11% of Wix users use more than one desktop browser
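Deriving the toss from the user ID, as described for signed-in users, can be sketched like this; the hash and group labels are my own illustrative choices:

```python
import hashlib

def toss(user_id: str, experiment: str) -> str:
    # The group is a pure function of (experiment, user ID), so a
    # signed-in user gets the same group on every browser and device.
    h = int(hashlib.sha256(f"{experiment}/{user_id}".encode()).hexdigest(), 16)
    return "B" if h % 2 else "A"

# Same user, any browser, any time -> same group.
assert toss("uid-1001", "editor-test") == toss("uid-1001", "editor-test")
assert toss("uid-1001", "editor-test") in ("A", "B")
```

For anonymous users there is no stable ID to hash, which is why the assignment is random and only as persistent as the browser's cookie.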
39.
40. Always exclude robots
Don’t let Google index a losing page
Don’t let bots affect statistics
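A crude version of the bot exclusion above is a User-Agent check; the marker list here is illustrative and far from exhaustive (real systems also use IP lists and behavioral signals):

```python
BOT_MARKERS = ("googlebot", "bingbot", "crawler", "spider", "bot/")

def is_bot(user_agent: str) -> bool:
    # Substring match against known bot markers in the User-Agent.
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)

# Bots should always get the control (A) and be excluded from the stats,
# so Google never indexes a losing page.
assert is_bot("Mozilla/5.0 (compatible; Googlebot/2.1)")
assert not is_bot("Mozilla/5.0 (Windows NT 10.0) Firefox/110.0")
```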
41.
42. Possible states >= 2^(# experiments)
# of active experiments    Possible # of states
10                         1,024
20                         1,048,576
30                         1,073,741,824
Wix has ~200 active experiments → 2^200 ≈ 1.6×10^60 states
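The numbers above are just powers of two, since each binary experiment doubles the number of distinct user states:

```python
def possible_states(active_experiments: int) -> int:
    # Lower bound: experiments with more than two variants multiply by more.
    return 2 ** active_experiments

assert possible_states(10) == 1024
assert possible_states(20) == 1_048_576
assert possible_states(30) == 1_073_741_824
assert possible_states(200) > 10 ** 60   # ~1.6e60 for ~200 experiments
```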
43. Managing an ever-changing production env.
Supporting 2^N different users is challenging
How do you know which experiment causes errors?
51. Possible solutions
Enable features by existing content
Enable features by document owner’s assignment
Exclude experimental features from shared documents
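The "assign by document owner" solution above can be sketched by keying the toss on the site owner's ID instead of the viewer's, so every collaborator on a shared site sees the same features. Names and hashing are my own illustrative assumptions:

```python
import hashlib

def group_for_shared_document(owner_id: str, experiment: str) -> str:
    # Assignment depends only on the owner, never on who is viewing,
    # so collaborators cannot end up in conflicting groups on one site.
    h = int(hashlib.sha256(f"{experiment}:{owner_id}".encode()).hexdigest(), 16)
    return "B" if h % 2 else "A"

# Owner and any collaborator resolve to the same group for the same site.
assert (group_for_shared_document("owner-7", "new-widget")
        == group_for_shared_document("owner-7", "new-widget"))
```

As the speaker notes point out, this requires server-side state, since only the owner's user ID is known when a collaborator opens the site.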
52.
53. Petri is more than just an A/B test framework
Feature toggle
A/B Test
Internal testing
Personalization
Continuous deployment
Jira integration
Experiments
Dynamic configuration
QA
Automated testing
54. Petri is now an open source project
https://github.com/wix/petri
55. Q&A
http://goo.gl/L7pHnd
Aviran Mordo
Head of Back-end Engineering
@aviranm
http://www.linkedin.com/in/aviran
http://www.aviransplace.com
Talya Gendler
Back-end Team Leader
www.linkedin.com/in/talyagendler
https://github.com/wix/petri
57. Why Petri
Modeled experiment lifecycle
Open source (developed using TDD from day 1)
Running at scale on production
No deployment necessary
Both back-end and front-end experiments
Flexible architecture
Who here does A/B tests?
Who plans to do A/B tests?
A/B test is embedded in our development process
Petri is based on our experience and lessons we learned
You divide your users into groups and measure which one reached your goal
What does "better" mean?
What is your goal?
Measure conversion to register
The theory – we can make a better gallery
Our goal – make it easier for our users to build their sites (converting to premium)
It is not about winning, it's about not losing
Lessons learned from 4 years of experience
Petri allows PMs to manage their tests
A screenshot of the UI we built on top of PETRI
Premium link in the editor
If we shorten the funnel more users will reach the purchase page, thus increasing our sales
Why did it fail?
T-Shirt time
Who thinks we should start with 50%?
Remember a test could fail
Product manager defines a limited new experiment
We also test new must have features
There is no A version.
The control group just doesn't get it.
We need to improve before releasing to all users.
Lose mobile view ?
Unable to update ?
Pause is a temporary state until the system improves and the test resumes
Server-side state – performance vs. correctness, cross-datacenter replicas
The end result of every A/B test is reaching a decision. For this we need enough numbers.
Add %, countries, etc.
As discussed in the pause scenario, here too we cannot take away the ‘new’ experience
For anonymous users – this is the best we can do. This means sometimes (~11%) users will see different experiences.
What would you expect the result should be for a bot? A? B?
Second T-shirt time!
Production is never in a ‘known’ state
At least 2^N (some experiments have more than two options)
It is hard to know and we don’t always know exactly.
Try to understand what was opened recently / recreate and eliminate
Overrides also support a list of users.
The obvious answer may be – allow the friend to edit the component if it’s already in the site
But then – what if the friend deletes the component by mistake (or on purpose)? Then if he’s assigned to A he won’t be able to add it back.
Possible solution – assign by site owner instead of by user
(this means you must implement server-side state) (why? Because you don't know what language/geo etc. the site owner was in when he got assigned – you only know his user ID)
Not perfect, user may experience something else on his own document
Expose features internally to company employees
Select assignment by sites (not only by users)