Scaling up to 30M users
Scaling Software, Scaling Data & Scaling People
The Wix Experience

Devcon TLV Feb 2013

 Aviran Mordo
 Server Group Manager
 Wix
 @aviranm
About Wix
Wix in Numbers




•   Wix was founded in 2006
•   30M registered users from most countries
•   Over 1,000,000 new users every month
•   ~1,000,000 new websites every month
•   Over 150 TByte of user media files
     – More than 1 billion user media files
     – More than 1.5 TByte of files uploaded daily
• Over 300 servers in 2+1 datacenters, plus Google and Amazon
Wix Initial Architecture

   • Tomcat, Hibernate, custom web framework
       –   Everything generated from HBM files
       –   Built for fast development
       –   Stateful login (Tomcat session), EHCache, file uploads
       –   No consideration of performance, scalability, fast feature rollout, or evaluation
       –   It reflected the fact that we didn’t really know what our business was
       –   We knew that we would need to replace it when we grew.
       –   However, we failed to understand how difficult that can be!
   [Diagram: Wix (Tomcat) and MySQL DB]




[Timeline figure: years 2006 to 2013, with markers for Flash and HTML 5]
Wix Initial Architecture


After two years, we found that
• Our initial architecture allowed us to progress very fast
• However, as we progressed, we slowed down
• So, we learned that
    –   Don’t worry about ‘building it right from the start’ – you won’t
    –   You are going to replace the things you build in the initial stages
    –   Be ready to do it
    –   Get it to customers as fast as you can. Get feedback. Evolve.
    –   Our mistake was not planning for a gradual rewrite
    –   Build for a gradual rewrite as you learn the problems and find the right
        solutions
Distributed Cache

   Next, we added EHCache as a Hibernate second-level cache
   • Why?
       – Because it was in the design
   • How was it?
       –   A black-box cache
       –   How do we know the state of our system?
       –   How do we invalidate the cache?
       –   When do we invalidate it?
       –   How does “operations” manage the cache?
   • Did we really need it? No!
   • We eventually dropped it

Editor & Public Segments

   • The Challenge - Updates to our Server imposed downtime on our
     customers’ websites
       – Any Server or Database update had the potential to bring down all Wix sites
       – This is a symptom of a larger issue
   • The Server served two different concerns
       – Wix Users editing websites
       – Viewing Wix Sites, the sites created by the Wix editor
   • The two concerns require different SLAs
       – Wix Sites should never ever have downtime!
       – Wix Sites should work as fast as possible, always!
       – However, an editing system does not require this level of SLA.


Editor & Public Segments

• The two concerns evolve independently
    – Releases of Editing features should have no impact on the
      operation of existing Wix sites!
• Our Solution
    – Split the Server into two Segments – Public and Editor
• The Public segment targets serving websites for
  Wix Users
    – Has a mostly read-only usage pattern – only updated
      when a site is published
    – Simple publishing system
    – Simple and read-only means it is easier to have a higher SLA and DRP
    – MySQL used as NoSQL – single large table with XML text fields
• The Editor segment
    – Exposes the Wix Editing APIs, as well as user account and gallery
      management APIs.
    – Has a different release schedule from the Public segment
[Diagram: Public (Tomcat) with Public DB; Editor (Tomcat) with Editor DB]
Editor & Public Segments

What we have learned
• MySQL is a damn good NoSQL engine
    –   Our public DB was (mainly) one huge table
    –   Queries & Updates are by primary key
    –   Instead of relations, we use text/xml or text/json columns
    –   No updates for Blobs – immutable data
    –   No Transactions
• Use an indirection table that points to the blob table
    – Insert a new blob value, update the pointer to the new blob, then delete the old blob asynchronously
• MySQL auto-generated keys cause problems
    – Locks on key generation
    – Require a single instance to generate keys
• We use GUID keys
    – Can be generated by any client
    – No locks in key value generation
    – Enabler for Master-Master replication
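
The blob, pointer and GUID ideas above can be sketched roughly as follows. This is a minimal illustration only, not Wix's actual schema: it uses Python's built-in sqlite3 in place of MySQL, and the table names (blobs, sites), column names and function names are all assumptions made for the example.

```python
import sqlite3
import uuid

def open_store():
    """Create an in-memory store: an immutable blob table plus a pointer table.
    Table and column names are hypothetical, chosen for this sketch."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE blobs (blob_id TEXT PRIMARY KEY, body TEXT NOT NULL)")
    db.execute("CREATE TABLE sites (site_id TEXT PRIMARY KEY, blob_id TEXT NOT NULL)")
    return db

def publish(db, site_id, body):
    """Insert a new blob, repoint the site to it, and return the old blob id (or None)."""
    new_blob_id = str(uuid.uuid4())  # GUID generated by the client, no DB-side key lock
    db.execute("INSERT INTO blobs (blob_id, body) VALUES (?, ?)", (new_blob_id, body))
    row = db.execute("SELECT blob_id FROM sites WHERE site_id = ?", (site_id,)).fetchone()
    old_blob_id = row[0] if row else None
    # Existing blobs are never updated; only the pointer row changes.
    db.execute("INSERT OR REPLACE INTO sites (site_id, blob_id) VALUES (?, ?)",
               (site_id, new_blob_id))
    db.commit()
    return old_blob_id

def read_site(db, site_id):
    """Two primary-key lookups: the pointer first, then the blob it points to."""
    pointer = db.execute("SELECT blob_id FROM sites WHERE site_id = ?", (site_id,)).fetchone()
    if pointer is None:
        return None
    blob = db.execute("SELECT body FROM blobs WHERE blob_id = ?", (pointer[0],)).fetchone()
    return blob[0] if blob else None

def delete_blob(db, blob_id):
    """Cleanup of a superseded blob; a real system would run this asynchronously."""
    db.execute("DELETE FROM blobs WHERE blob_id = ?", (blob_id,))
    db.commit()
```

A reader that resolves the pointer sees either the old blob or the new one, never a half-written value, which is presumably why the old blob can be removed later without coordination.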
Wix on Managed Hosting




   Co-Location                                      | Managed Hosting                        | Cloud
   Own and maintain your own hardware               | Lease both hardware and maintenance    | Instantly lease hardware
   Provisioning == buy and deliver your new server  | Overnight provisioning                 | Instant provisioning, unlimited resources
   Reliable software on reliable hardware           | Reliable software on reliable hardware | Reliable software on unreliable hardware


Wix Media Segment

   • The Challenge – Our static storage reached over 500 GByte of small files
       – The “upload to app server, post-process files, copy to lighttpd server, serve by
         lighttpd” pattern proved inefficient, slow and error-prone
       – Disk IO became slow and inefficient as the number of files increased
       – We needed a solution that could grow with the
            • number of HTTP connections
            • number of files
       – We needed control over caching and HTTP headers
   • We needed dynamic image manipulation
       –   Rebuilding a few million media files is not simple




Prospero – Wix Media Storage

• Our Solution
   – Lighttpd based
   – Sharded on the file name
   – Two copies of each file
   [Diagram: a request such as "get 37D815B5.jpg" goes to the servers for the range containing 37, with a fallback if the file is not found there. Servers in a range are connected over HTTP.]
   Range     Servers
   00-1f     0.static, 1.static
   20-3f     2.static, 3.static
   40-5f     4.static, 5.static
   60-7f     6.static, 7.static
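
The lookup in the diagram can be sketched as below. This is a hedged illustration, not Prospero's code: it assumes each range is 0x20 wide, that range number i is served by hosts 2i.static and 2i+1.static (read off the diagram), and that the even-numbered host is tried first. The function names and the `get` callback are hypothetical.

```python
def shard_hosts(file_name, range_width=0x20):
    """Return (first, fallback) host labels based on the file name's first two hex digits."""
    prefix = int(file_name[:2], 16)   # "37D815B5.jpg" -> 0x37
    shard = prefix // range_width     # 0x37 falls in the 20-3f range -> shard 1
    return (f"{2 * shard}.static", f"{2 * shard + 1}.static")

def fetch(file_name, get):
    """Try each copy in order. `get(host, file_name)` is a caller-supplied function
    that returns the file bytes, or None when the host does not have the file."""
    for host in shard_hosts(file_name):
        data = get(host, file_name)
        if data is not None:
            return data
    return None
```

Because the shard is derived from the name alone, any client can locate a file without consulting a central index.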
Prospero – Wix Media Storage

• Dynamic Image processing
    – Picture Pyramid
    – Picture resize, crop and sharpen “on the fly”
    – Thumbnail generation
• Eventual Consistency solutions scale
    – But you have to build for the moments when eventual consistency is not yet consistent
• Media file caching headers are critical
    – Max-age, ETag, If-Modified-Since, etc.
    – Think about how to tune those parameters for media files to fit your specific needs
• We tried Amazon S3 and Google for secondary storage
    – However, Amazon proved unreliable (connections, availability)
• We found that using a CDN in front of Prospero is very effective
• Initially, files were stored on the filesystem
• We added a Tokyo Tyrant backend for small files (marked T in the architecture diagram)
• We added a Memcached (Redis) layer for “in transit” files (marked M in the architecture diagram)
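
As a rough illustration of the caching headers mentioned above, the sketch below builds a max-age and an ETag for a media file and answers a conditional request with 304. It is an assumption-laden example, not Prospero's behavior: the one-year max-age, the SHA-1 based ETag and the function names are all chosen for the example, and it checks If-None-Match (the ETag counterpart of If-Modified-Since).

```python
import hashlib

def media_headers(body, max_age_seconds=31536000):
    """Build caching headers for a media file whose content never changes."""
    etag = '"' + hashlib.sha1(body).hexdigest() + '"'  # content-derived validator
    return {"Cache-Control": f"public, max-age={max_age_seconds}", "ETag": etag}

def respond(body, request_headers):
    """Return (status, headers, payload); 304 with an empty payload when the
    client's If-None-Match value equals the current ETag."""
    headers = media_headers(body)
    if request_headers.get("If-None-Match") == headers["ETag"]:
        return 304, headers, b""
    return 200, headers, body
```

A long max-age keeps most requests away from the origin, and the ETag makes the remaining revalidations cheap.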
Prospero – Wix Media Storage

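• Our current architecture
   [Diagram: a request such as "get 37D815B5.jpg" goes to the CDN; if the file is not in the CDN, the request goes to Austin; the first fallback is Chicago; the second fallback is Google Cloud Storage. Austin and Chicago are each drawn as stacked server groups labeled x36, x36 and x32, with M and T layers.]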
CDN

• Use a CDN!
• A CDN acts as a great connection manager
    – We have CDN hit ratios of over 99.9%
• Use the “Cache Killer” pattern
    –   http://static.wix.com/client/css/viewer.css?v=327
    –   http://static.wix.com/client/1.3.2/css/viewer.css
    –   Makes flushing files from the CDN redundant
    –   Enabler for longer caching periods
• There are many vendors
    – We started with one CDN vendor
    – We are now working with two CDN vendors
    – Different CDN vendors have advantages in different geographies
• Tune HTTP Headers per CDN Vendor
    – CDN Vendors interpret HTTP headers differently
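
The two “Cache Killer” URL styles shown above can be generated with something like the following. The helper names are hypothetical; only the two URL shapes come from the slide.

```python
def query_versioned(base, path, version):
    """Style 1: the version travels as a query parameter."""
    return f"{base}/{path}?v={version}"

def path_versioned(base, root, version, path):
    """Style 2: the version is a path segment under the root folder."""
    return f"{base}/{root}/{version}/{path}"

# Example: each release changes the URL, so the CDN treats it as a new object.
print(query_versioned("http://static.wix.com", "client/css/viewer.css", 327))
print(path_versioned("http://static.wix.com", "client", "1.3.2", "css/viewer.css"))
```

Since a new version always gets a new URL, old entries can simply age out of the CDN instead of being flushed.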
Development Velocity

   • The Challenge – Our codebase became large and entangled
       – Feature rollout became harder over time, requiring longer and longer manual
         regression
       – The longer the regression was, the harder it became to make “a good release”
       – Strange full-table-scan queries generated by Hibernate; we still have no
         idea what code is responsible for them…
   • The solution
       –   Mid 2010 – Wix Framework – modern base libraries
       –   Beginning 2011 – CI / CD / TDD techniques + DevOps culture
       –   Mid 2011 – Scala
       –   SOA Architecture (not WSDL)



People are the key

• Train the people you already have
    – We sent our entire QA department to learn Java
    – Developers learn TDD and CI/CD methodologies.
• Hiring the right people is key to success
    –   Hire only the best developers (only seniors)
    –   Don’t rely only on the interview; you need to test actual coding
    –   Anyone who interviews can reject a candidate
    –   Hire people who will challenge you (no “yes men”)
    –   Get people you can trust with “root” access to production
• Never stop hiring
    – If we find an excellent person, we will create a position for them even if we do
      not have one open.
• Wix is doubling its size every year
    – Yes, we are currently hiring.
    – We’re considering hiring and training junior developers.
Wix’s CI / CD / TDD + DevOps model

• Abandon the “VERSION” paradigm – move to a feature-centric life cycle
• Make small and frequent releases as soon as possible
    – Today we release about 10 times a day, gaining velocity
• Empower the developer
    –   The developer is responsible for a feature from product idea to 100,000 active users
    –   Remove every obstacle in the developer’s path
    –   Big cultural change from waterfall – affects the whole company
    –   The developer is responsible for operating their own app
• Automate everything – CI/CD/TDD
    – CI – Continuous Integration
    – CD – Continuous Delivery / Deployment
    – TDD – Test-Driven Development: automated unit tests, integration tests, GUI tests
• Measure Everything (The lean startup way)
    – A/B test every new feature
    – Monitor real KPIs (business, not CPU)
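
One common way to A/B test a feature is to assign each user to a group deterministically by hashing. The sketch below shows that general technique; it is not taken from the talk and is not a description of Wix's experiment system. The function name, the experiment label and the 50/50 default split are assumptions.

```python
import hashlib

def ab_bucket(user_id, experiment, treatment_share=0.5):
    """Deterministically assign a user to group "A" (control) or "B" (treatment).
    The same user and experiment always map to the same group."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    fraction = int.from_bytes(digest[:8], "big") / 2**64  # value in [0, 1)
    return "B" if fraction < treatment_share else "A"
```

Hashing the experiment name together with the user id keeps assignments stable across requests and independent between experiments, so business KPIs can be compared per group.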
CI / CD @ Wix – Release Process

• Make an RC
   – Runs build, unit-tests, integration tests
CI / CD @ Wix – Release Process

• Deploy as GA
   – Using Chef, Noah, Artifactory
   – Runs Self-Tests
CI / CD @ Wix – Release Process

• Monitor
   – Deployment, NewRelic, App-Info, Recent Events
• Rollback
Products we’ve built (partial list)
   • Wix Mobile
   • Wix HTML5
       – Full HTML 5 support – total rewrite of our Flash product
   • Third Party Applications (TPAs)
       – With over 200,000 installations in the first 3 months
   • Answers
       – Wix’s unique support system
   • Wix Billing System (PCI Compliant)
       – Supports complex business models for TPAs
       – Supports diverse geographies
   • eCommerce
       – Based on Magento
   • BI
   [Timeline figure: years 2006 to 2013, with markers for Flash, HTML 5, Mobile, Answers, App Builder, eCommerce, TPA and Billing]
Wix Hackathon

• http://www.wix.com/publicevents/hackathon2013
Scaling up to 30M users - The Wix Story

More Related Content

What's hot

Scaling wix with microservices and multi cloud - 2015
Scaling wix with microservices and multi cloud - 2015Scaling wix with microservices and multi cloud - 2015
Scaling wix with microservices and multi cloud - 2015Aviran Mordo
 
Scaling wix with microservices architecture devoxx London 2015
Scaling wix with microservices architecture devoxx London 2015Scaling wix with microservices architecture devoxx London 2015
Scaling wix with microservices architecture devoxx London 2015Aviran Mordo
 
Experimenting on Humans - Advanced A/B Tests - QCon SF 2014
Experimenting on Humans - Advanced A/B Tests - QCon SF 2014Experimenting on Humans - Advanced A/B Tests - QCon SF 2014
Experimenting on Humans - Advanced A/B Tests - QCon SF 2014Aviran Mordo
 
Concurrency at Scale: Evolution to Micro-Services
Concurrency at Scale:  Evolution to Micro-ServicesConcurrency at Scale:  Evolution to Micro-Services
Concurrency at Scale: Evolution to Micro-ServicesRandy Shoup
 
Microservices Architecture for Content Management Systems using AWS Lambda an...
Microservices Architecture for Content Management Systems using AWS Lambda an...Microservices Architecture for Content Management Systems using AWS Lambda an...
Microservices Architecture for Content Management Systems using AWS Lambda an...Mitoc Group
 
The Next Big Thing: Serverless
The Next Big Thing: ServerlessThe Next Big Thing: Serverless
The Next Big Thing: ServerlessDoug Vanderweide
 
Cloud Native Camel Riding
Cloud Native Camel RidingCloud Native Camel Riding
Cloud Native Camel RidingChristian Posta
 
ITLCHN 18 - Automation & DevOps - Automic
ITLCHN 18 -  Automation & DevOps - AutomicITLCHN 18 -  Automation & DevOps - Automic
ITLCHN 18 - Automation & DevOps - AutomicIT Expert Club
 
DCSF 19 Modern Orchestrated IT for Enterprise CMS
DCSF 19  Modern Orchestrated IT for Enterprise CMSDCSF 19  Modern Orchestrated IT for Enterprise CMS
DCSF 19 Modern Orchestrated IT for Enterprise CMSDocker, Inc.
 
Fuse integration-services
Fuse integration-servicesFuse integration-services
Fuse integration-servicesChristian Posta
 
Baking Stash in the AWS Cloud at Netflix
Baking Stash in the AWS Cloud at NetflixBaking Stash in the AWS Cloud at Netflix
Baking Stash in the AWS Cloud at NetflixAtlassian
 
Gradual migration to MicroProfile
Gradual migration to MicroProfileGradual migration to MicroProfile
Gradual migration to MicroProfileRudy De Busscher
 
Community day 2013 applied architectures
Community day 2013   applied architecturesCommunity day 2013   applied architectures
Community day 2013 applied architecturesPanagiotis Kefalidis
 
An evolution of application networking: service mesh
An evolution of application networking: service meshAn evolution of application networking: service mesh
An evolution of application networking: service meshChristian Posta
 
Tour to docgen lightning experience
Tour to docgen lightning experienceTour to docgen lightning experience
Tour to docgen lightning experienceKadharBashaJ
 
Implementing Large Scale Digital Asset Repositories with Adobe Experience Man...
Implementing Large Scale Digital Asset Repositories with Adobe Experience Man...Implementing Large Scale Digital Asset Repositories with Adobe Experience Man...
Implementing Large Scale Digital Asset Repositories with Adobe Experience Man...devang-dsshah
 
Java one kubernetes, jenkins and microservices
Java one   kubernetes, jenkins and microservicesJava one   kubernetes, jenkins and microservices
Java one kubernetes, jenkins and microservicesChristian Posta
 

What's hot (20)

Scaling wix with microservices and multi cloud - 2015
Scaling wix with microservices and multi cloud - 2015Scaling wix with microservices and multi cloud - 2015
Scaling wix with microservices and multi cloud - 2015
 
Scaling wix with microservices architecture devoxx London 2015
Scaling wix with microservices architecture devoxx London 2015Scaling wix with microservices architecture devoxx London 2015
Scaling wix with microservices architecture devoxx London 2015
 
Experimenting on Humans - Advanced A/B Tests - QCon SF 2014
Experimenting on Humans - Advanced A/B Tests - QCon SF 2014Experimenting on Humans - Advanced A/B Tests - QCon SF 2014
Experimenting on Humans - Advanced A/B Tests - QCon SF 2014
 
Concurrency at Scale: Evolution to Micro-Services
Concurrency at Scale:  Evolution to Micro-ServicesConcurrency at Scale:  Evolution to Micro-Services
Concurrency at Scale: Evolution to Micro-Services
 
Microservices Architecture for Content Management Systems using AWS Lambda an...
Microservices Architecture for Content Management Systems using AWS Lambda an...Microservices Architecture for Content Management Systems using AWS Lambda an...
Microservices Architecture for Content Management Systems using AWS Lambda an...
 
The Next Big Thing: Serverless
The Next Big Thing: ServerlessThe Next Big Thing: Serverless
The Next Big Thing: Serverless
 
Cloud Native Camel Riding
Cloud Native Camel RidingCloud Native Camel Riding
Cloud Native Camel Riding
 
ITLCHN 18 - Automation & DevOps - Automic
ITLCHN 18 -  Automation & DevOps - AutomicITLCHN 18 -  Automation & DevOps - Automic
ITLCHN 18 - Automation & DevOps - Automic
 
Microservices in Azure
Microservices in AzureMicroservices in Azure
Microservices in Azure
 
From Heroku to Amazon AWS
From Heroku to Amazon AWSFrom Heroku to Amazon AWS
From Heroku to Amazon AWS
 
SOA to Microservices
SOA to MicroservicesSOA to Microservices
SOA to Microservices
 
DCSF 19 Modern Orchestrated IT for Enterprise CMS
DCSF 19  Modern Orchestrated IT for Enterprise CMSDCSF 19  Modern Orchestrated IT for Enterprise CMS
DCSF 19 Modern Orchestrated IT for Enterprise CMS
 
Fuse integration-services
Fuse integration-servicesFuse integration-services
Fuse integration-services
 
Baking Stash in the AWS Cloud at Netflix
Baking Stash in the AWS Cloud at NetflixBaking Stash in the AWS Cloud at Netflix
Baking Stash in the AWS Cloud at Netflix
 
Gradual migration to MicroProfile
Gradual migration to MicroProfileGradual migration to MicroProfile
Gradual migration to MicroProfile
 
Community day 2013 applied architectures
Community day 2013   applied architecturesCommunity day 2013   applied architectures
Community day 2013 applied architectures
 
An evolution of application networking: service mesh
An evolution of application networking: service meshAn evolution of application networking: service mesh
An evolution of application networking: service mesh
 
Tour to docgen lightning experience
Tour to docgen lightning experienceTour to docgen lightning experience
Tour to docgen lightning experience
 
Implementing Large Scale Digital Asset Repositories with Adobe Experience Man...
Implementing Large Scale Digital Asset Repositories with Adobe Experience Man...Implementing Large Scale Digital Asset Repositories with Adobe Experience Man...
Implementing Large Scale Digital Asset Repositories with Adobe Experience Man...
 
Java one kubernetes, jenkins and microservices
Java one   kubernetes, jenkins and microservicesJava one   kubernetes, jenkins and microservices
Java one kubernetes, jenkins and microservices
 

Viewers also liked

Lessons Learned Monitoring Production
Lessons Learned Monitoring ProductionLessons Learned Monitoring Production
Lessons Learned Monitoring ProductionAviran Mordo
 
Scaling Wix engineering
Scaling Wix engineering Scaling Wix engineering
Scaling Wix engineering Aviran Mordo
 
Strategies in continuous delivery
Strategies in continuous deliveryStrategies in continuous delivery
Strategies in continuous deliveryAviran Mordo
 
The Art of A/B Testing
The Art of A/B TestingThe Art of A/B Testing
The Art of A/B TestingAviran Mordo
 
Wix Dev-Centric Culture And Continuous Delivery
Wix Dev-Centric Culture And Continuous DeliveryWix Dev-Centric Culture And Continuous Delivery
Wix Dev-Centric Culture And Continuous DeliveryAviran Mordo
 
Road to Continuous Delivery - Wix.com
Road to Continuous Delivery - Wix.comRoad to Continuous Delivery - Wix.com
Road to Continuous Delivery - Wix.comAviran Mordo
 
Wix.com Back-end Engineering Guild Manifesto
Wix.com Back-end Engineering Guild ManifestoWix.com Back-end Engineering Guild Manifesto
Wix.com Back-end Engineering Guild ManifestoAviran Mordo
 
Scaling Wix with microservices architecture and multi-cloud platforms - Reve...
 Scaling Wix with microservices architecture and multi-cloud platforms - Reve... Scaling Wix with microservices architecture and multi-cloud platforms - Reve...
Scaling Wix with microservices architecture and multi-cloud platforms - Reve...Aviran Mordo
 
Introduction to HTTP protocol
Introduction to HTTP protocolIntroduction to HTTP protocol
Introduction to HTTP protocolAviran Mordo
 
Intercom's first pitch deck!
Intercom's first pitch deck!Intercom's first pitch deck!
Intercom's first pitch deck!Eoghan McCabe
 

Viewers also liked (11)

Lessons Learned Monitoring Production
Lessons Learned Monitoring ProductionLessons Learned Monitoring Production
Lessons Learned Monitoring Production
 
Scaling Wix engineering
Scaling Wix engineering Scaling Wix engineering
Scaling Wix engineering
 
Strategies in continuous delivery
Strategies in continuous deliveryStrategies in continuous delivery
Strategies in continuous delivery
 
The Art of A/B Testing
The Art of A/B TestingThe Art of A/B Testing
The Art of A/B Testing
 
Wix Dev-Centric Culture And Continuous Delivery
Wix Dev-Centric Culture And Continuous DeliveryWix Dev-Centric Culture And Continuous Delivery
Wix Dev-Centric Culture And Continuous Delivery
 
Road to Continuous Delivery - Wix.com
Road to Continuous Delivery - Wix.comRoad to Continuous Delivery - Wix.com
Road to Continuous Delivery - Wix.com
 
Wix.com Back-end Engineering Guild Manifesto
Wix.com Back-end Engineering Guild ManifestoWix.com Back-end Engineering Guild Manifesto
Wix.com Back-end Engineering Guild Manifesto
 
Weebly Website Blog
Weebly Website BlogWeebly Website Blog
Weebly Website Blog
 
Scaling Wix with microservices architecture and multi-cloud platforms - Reve...
 Scaling Wix with microservices architecture and multi-cloud platforms - Reve... Scaling Wix with microservices architecture and multi-cloud platforms - Reve...
Scaling Wix with microservices architecture and multi-cloud platforms - Reve...
 
Introduction to HTTP protocol
Introduction to HTTP protocolIntroduction to HTTP protocol
Introduction to HTTP protocol
 
Intercom's first pitch deck!
Intercom's first pitch deck!Intercom's first pitch deck!
Intercom's first pitch deck!
 

Similar to Scaling up to 30M users - The Wix Story

Continuous Delivery at Wix
Continuous Delivery at WixContinuous Delivery at Wix
Continuous Delivery at WixYoav Avrahami
 
Moving to the Cloud: AWS, Zend, RightScale
Moving to the Cloud: AWS, Zend, RightScaleMoving to the Cloud: AWS, Zend, RightScale
Moving to the Cloud: AWS, Zend, RightScalemmoline
 
Workflow driven development
Workflow driven developmentWorkflow driven development
Workflow driven developmentDmitryDemyankov
 
What's New for the Windows Azure Developer? Lots!!
What's New for the Windows Azure Developer?  Lots!!What's New for the Windows Azure Developer?  Lots!!
What's New for the Windows Azure Developer? Lots!!Michael Collier
 
Scaling the Platform for Your Startup
Scaling the Platform for Your StartupScaling the Platform for Your Startup
Scaling the Platform for Your StartupAmazon Web Services
 
VA Smalltalk Update
VA Smalltalk UpdateVA Smalltalk Update
VA Smalltalk UpdateESUG
 
Varrow Q4 Lunch & Learn Presentation - Virtualizing Business Critical Applica...
Varrow Q4 Lunch & Learn Presentation - Virtualizing Business Critical Applica...Varrow Q4 Lunch & Learn Presentation - Virtualizing Business Critical Applica...
Varrow Q4 Lunch & Learn Presentation - Virtualizing Business Critical Applica...Andrew Miller
 
AAI-2075 Evolving an IBM WebSphere Topology to Manage a Changing Workloa
AAI-2075 Evolving an IBM WebSphere Topology to Manage a Changing WorkloaAAI-2075 Evolving an IBM WebSphere Topology to Manage a Changing Workloa
AAI-2075 Evolving an IBM WebSphere Topology to Manage a Changing WorkloaWASdev Community
 
Mobile and IBM Worklight Best Practices
Mobile and IBM Worklight Best PracticesMobile and IBM Worklight Best Practices
Mobile and IBM Worklight Best PracticesAndrew Ferrier
 
Ora mysql bothGetting the best of both worlds with Oracle 11g and MySQL Enter...
Ora mysql bothGetting the best of both worlds with Oracle 11g and MySQL Enter...Ora mysql bothGetting the best of both worlds with Oracle 11g and MySQL Enter...
Ora mysql bothGetting the best of both worlds with Oracle 11g and MySQL Enter...Ivan Zoratti
 
My sql roadmap 2008 2009
My sql roadmap 2008 2009My sql roadmap 2008 2009
My sql roadmap 2008 2009xKinAnx
 
IBM Connect 2017: Your Data In the Major Leagues: A Practical Guide to REST S...
IBM Connect 2017: Your Data In the Major Leagues: A Practical Guide to REST S...IBM Connect 2017: Your Data In the Major Leagues: A Practical Guide to REST S...
IBM Connect 2017: Your Data In the Major Leagues: A Practical Guide to REST S...Serdar Basegmez
 
From vagrant to production - Mark Eijsermans
From vagrant to production - Mark EijsermansFrom vagrant to production - Mark Eijsermans
From vagrant to production - Mark EijsermansDevopsdays
 
The Kubernetes WebLogic revival (part 1)
The Kubernetes WebLogic revival (part 1)The Kubernetes WebLogic revival (part 1)
The Kubernetes WebLogic revival (part 1)Simon Haslam
 
Simple Cloud with Amazon Lightsail - Mike Coleman
Simple Cloud with Amazon Lightsail - Mike ColemanSimple Cloud with Amazon Lightsail - Mike Coleman
Simple Cloud with Amazon Lightsail - Mike ColemanAmazon Web Services
 

Similar to Scaling up to 30M users - The Wix Story (20)

Continuous Delivery at Wix
Continuous Delivery at WixContinuous Delivery at Wix
Continuous Delivery at Wix
 
Moving to the Cloud: AWS, Zend, RightScale
Moving to the Cloud: AWS, Zend, RightScaleMoving to the Cloud: AWS, Zend, RightScale
Moving to the Cloud: AWS, Zend, RightScale
 
Workflow driven development
Workflow driven developmentWorkflow driven development
Workflow driven development
 
What's New for the Windows Azure Developer? Lots!!
What's New for the Windows Azure Developer?  Lots!!What's New for the Windows Azure Developer?  Lots!!
What's New for the Windows Azure Developer? Lots!!
 
Scaling the Platform for Your Startup
Scaling the Platform for Your StartupScaling the Platform for Your Startup
Scaling the Platform for Your Startup
 
VA Smalltalk Update
VA Smalltalk UpdateVA Smalltalk Update
VA Smalltalk Update
 
Varrow Q4 Lunch & Learn Presentation - Virtualizing Business Critical Applica...
Varrow Q4 Lunch & Learn Presentation - Virtualizing Business Critical Applica...Varrow Q4 Lunch & Learn Presentation - Virtualizing Business Critical Applica...
Varrow Q4 Lunch & Learn Presentation - Virtualizing Business Critical Applica...
 
AAI-2075 Evolving an IBM WebSphere Topology to Manage a Changing Workloa
AAI-2075 Evolving an IBM WebSphere Topology to Manage a Changing WorkloaAAI-2075 Evolving an IBM WebSphere Topology to Manage a Changing Workloa
AAI-2075 Evolving an IBM WebSphere Topology to Manage a Changing Workloa
 
Mobile and IBM Worklight Best Practices
Mobile and IBM Worklight Best PracticesMobile and IBM Worklight Best Practices
Mobile and IBM Worklight Best Practices
 
Ora mysql bothGetting the best of both worlds with Oracle 11g and MySQL Enter...
Ora mysql bothGetting the best of both worlds with Oracle 11g and MySQL Enter...Ora mysql bothGetting the best of both worlds with Oracle 11g and MySQL Enter...
Ora mysql bothGetting the best of both worlds with Oracle 11g and MySQL Enter...
 
Create cloud service on AWS
Create cloud service on AWSCreate cloud service on AWS
Create cloud service on AWS
 
Owd multi repo-v2
Owd multi repo-v2Owd multi repo-v2
Owd multi repo-v2
 
My sql roadmap 2008 2009
My sql roadmap 2008 2009My sql roadmap 2008 2009
My sql roadmap 2008 2009
 
React Tech Salon
React Tech SalonReact Tech Salon
React Tech Salon
 
Dibi Conference 2012
Dibi Conference 2012Dibi Conference 2012
Dibi Conference 2012
 
IBM Connect 2017: Your Data In the Major Leagues: A Practical Guide to REST S...
IBM Connect 2017: Your Data In the Major Leagues: A Practical Guide to REST S...IBM Connect 2017: Your Data In the Major Leagues: A Practical Guide to REST S...
IBM Connect 2017: Your Data In the Major Leagues: A Practical Guide to REST S...
 
From vagrant to production - Mark Eijsermans
From vagrant to production - Mark EijsermansFrom vagrant to production - Mark Eijsermans
From vagrant to production - Mark Eijsermans
 
The Kubernetes WebLogic revival (part 1)
The Kubernetes WebLogic revival (part 1)The Kubernetes WebLogic revival (part 1)
The Kubernetes WebLogic revival (part 1)
 
Simple Cloud with Amazon Lightsail - Mike Coleman
Simple Cloud with Amazon Lightsail - Mike ColemanSimple Cloud with Amazon Lightsail - Mike Coleman
Simple Cloud with Amazon Lightsail - Mike Coleman
 
Db trends final
Db trends   finalDb trends   final
Db trends final
 

Scaling up to 30M users - The Wix Story

  • 1. Scaling up to 30M users: Scaling Software, Scaling Data & Scaling People. The Wix Experience. Devcon TLV, Feb 2013. Aviran Mordo, Server Group Manager, Wix, @aviranm
  • 3. Wix in Numbers
      • Wix was founded in 2006
      • 30M registered users from most countries
      • Over 1,000,000 new users every month
      • ~1,000,000 new websites every month
      • Over 150 TByte of users media files
          – More than 1 billion users media files
          – More than 1.5 TByte uploaded files daily
      • Over 300 Servers in 2+1 datacenters + Google + Amazon
  • 4. Wix Initial Architecture
      [Diagram: Wix (Tomcat) → MySQL DB]
      • Tomcat, Hibernate, Custom web framework
          – Everything generated from HBM files
          – Built for fast development
          – Stateful login (Tomcat session), EHCache, File uploads
          – Not considering performance, scalability, fast feature rollout, evaluate
          – It reflected the fact that we didn’t really know what our business was
          – We knew that we would need to replace it when we grew
          – However, we failed to understand how difficult that can be!
      [Timeline: 2006 to 2013; Flash, then HTML 5]
  • 5. Wix Initial Architecture
      After two years, we found out that
      • Our initial architecture allowed us to progress very fast
      • However, as we progressed, we slowed down
      • So, we learned that
          – Don’t worry about ‘building it right from the start’ – you won’t
          – You are going to replace stuff you are building in the initial stages
          – Be ready to do it
          – Get it up to customers as fast as you can. Get feedback. Evolve.
          – Our mistake was not planning for gradual re-write
          – Build for gradual re-write as you learn the problems and find the right solutions
  • 6. Distributed Cache
      Next we added EHCache as Hibernate 2nd-level cache
      • Why?
          – Because it was in the design
      • How was it?
          – Black Box cache
          – How do we know what is the state of our system?
          – How to invalidate the cache?
          – When to invalidate it?
          – How does “operations” manage the cache?
      • Did we really need it? No!
      • We eventually dropped it
  • 7. Editor & Public Segments
      • The Challenge: updates to our Server imposed downtime for our customers’ websites
          – Any Server or Database update has the potential of bringing down all Wix sites
          – This is a symptom of a larger issue
      • The Server served two different concerns
          – Wix Users editing websites
          – Viewing Wix Sites, the sites created by the Wix editor
      • The two concerns require different SLAs
          – Wix Sites should never ever have downtime!
          – Wix Sites should work as fast as possible, always!
          – However, an editing system does not require this level of SLA.
  • 8. Editor & Public Segments
      [Diagram: Public (Tomcat) → Public DB; Editor (Tomcat) → Editor DB]
      • The two concerns evolve independently
          – Releases of Editing features should have no impact on existing Wix sites operations!
      • Our Solution
          – Split the Server into two Segments: Public and Editor
      • The Public segment targets serving websites for Wix Users
          – Has mostly read-only usage pattern: only updated when a site is published
          – Simple publishing system
          – Simple and read-only means it is easier to have higher SLA and DRP
          – MySQL used as NoSQL: single large table with XML text fields
      • The Editor segment
          – Exposes the Wix Editing APIs, as well as user account and galleries management APIs
          – Has a different release schedule compared to the Public segment
  • 9. Editor & Public Segments: What we have learned
      [Diagram: Public (Tomcat) → Public DB; Editor (Tomcat) → Editor DB]
      • MySQL is a damn good NoSQL engine
          – Our public DB was (mainly) one huge table
          – Queries & Updates are by primary key
          – Instead of relations, we use text/xml or text/json columns
          – No updates for Blobs: immutable data
          – No Transactions
      • Use indirection table to blob table
          – Insert a new blob value, update the pointer to the new blob, async delete
      • MySQL auto-generated keys cause problems
          – Locks on key generation
          – Require a single instance to generate keys
      • We use GUID keys
          – Can be generated by any client
          – No locks in key value generation
          – Enabler for Master-Master replication
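
The indirection-table and GUID-key points above can be sketched roughly as follows. This is a minimal illustration, not Wix's actual schema or code: the table names `blobs` and `pointers`, the helper functions, and the use of Python's built-in `sqlite3` in place of MySQL are all assumptions made so that the example is self-contained.

```python
import sqlite3
import uuid
from typing import Optional


def new_key() -> str:
    """Client-side GUID: any client can generate one, with no DB-side key lock."""
    return uuid.uuid4().hex


def open_store() -> sqlite3.Connection:
    # Hypothetical schema: immutable blobs plus a pointer (indirection) table.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE blobs (blob_id TEXT PRIMARY KEY, body TEXT NOT NULL)")
    db.execute("CREATE TABLE pointers (site_id TEXT PRIMARY KEY, blob_id TEXT NOT NULL)")
    return db


def publish(db: sqlite3.Connection, site_id: str, body: str) -> Optional[str]:
    """Insert a new blob, repoint the site to it, and return the stale blob id (if any)."""
    blob_id = new_key()
    db.execute("INSERT INTO blobs (blob_id, body) VALUES (?, ?)", (blob_id, body))
    old = db.execute(
        "SELECT blob_id FROM pointers WHERE site_id = ?", (site_id,)
    ).fetchone()
    db.execute(
        "INSERT OR REPLACE INTO pointers (site_id, blob_id) VALUES (?, ?)",
        (site_id, blob_id),
    )
    db.commit()
    return old[0] if old else None


def read(db: sqlite3.Connection, site_id: str) -> Optional[str]:
    """Two primary-key lookups instead of a relational join."""
    ptr = db.execute(
        "SELECT blob_id FROM pointers WHERE site_id = ?", (site_id,)
    ).fetchone()
    if ptr is None:
        return None
    row = db.execute("SELECT body FROM blobs WHERE blob_id = ?", (ptr[0],)).fetchone()
    return row[0] if row else None


def delete_blob(db: sqlite3.Connection, blob_id: str) -> None:
    """Meant to run later (asynchronously), once no pointer refers to the blob."""
    db.execute("DELETE FROM blobs WHERE blob_id = ?", (blob_id,))
    db.commit()
```

Because a blob is never updated in place, readers see either the old or the new version, and the stale row can be cleaned up whenever convenient.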
  • 10. Wix on Managed Hosting
      • Co-Location
          – Own and maintain your own hardware
          – Provisioning == buy and deliver your new server
          – Reliable software on reliable hardware
      • Managed Hosting
          – Lease both hardware and maintenance
          – Overnight provisioning
          – Reliable software on reliable hardware
      • Cloud
          – Instantly lease hardware
          – Instant provisioning
          – Unlimited resources
          – Reliable software on unreliable hardware
  • 11. Wix Media Segment
      • The Challenge
          – Our static storage reached over 500 GByte of small files
          – The “upload to app server, post process files, copy to lighttpd server, serve by lighttpd” pattern proved inefficient, slow and error prone
          – Disk IO became slow and inefficient as the number of files increased
          – We needed a solution we can grow with
              • HTTP connections
              • number of files
          – We needed control over caching and HTTP headers
      • We needed dynamic image manipulations
          – Rebuilding a few million media files is not simple
  • 12. Prospero – Wix Media Storage
      • Our Solution
          – Lighttpd based
          – Sharded on the file name
          – Two copies of each file
      [Diagram: get 37D815B5.jpg → go to the “37” range servers; fallback if not found. Ranges shown: 00-1f, 20-ef, 40-5f, 60-7f. Servers shown: 0.static to 7.static, reached over HTTP]
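
Sharding on the file name can be sketched as a lookup from the first two hex digits of the name to a pair of servers. The shard map below is an assumption for illustration only: it reads the second range on the slide as 20-3f, pairs the servers as (0, 1), (2, 3) and so on, and covers only the four ranges visible in the diagram.

```python
from typing import List, Tuple

# Hypothetical shard map: (low prefix, high prefix, (first copy, second copy)).
SHARDS: List[Tuple[int, int, Tuple[str, str]]] = [
    (0x00, 0x1F, ("0.static", "1.static")),
    (0x20, 0x3F, ("2.static", "3.static")),
    (0x40, 0x5F, ("4.static", "5.static")),
    (0x60, 0x7F, ("6.static", "7.static")),
]


def servers_for(file_name: str) -> Tuple[str, str]:
    """Return the two servers assumed to hold copies of the given file."""
    prefix = int(file_name[:2], 16)  # "37D815B5.jpg" -> 0x37
    for low, high, hosts in SHARDS:
        if low <= prefix <= high:
            return hosts
    raise LookupError(f"no shard configured for prefix {file_name[:2]!r}")
```

A request for `37D815B5.jpg` would go to the first server of the pair and fall back to the second if the file is not found there.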
  • 13. Prospero – Wix Media Storage
      • Dynamic Image processing
          – Picture Pyramid
          – Picture resize, crop and sharpen “on the fly”
          – Thumbnail generation
      • Eventual Consistency solutions scale
          – But you have to build for when eventual consistency is not consistent
      • Media files caching headers are critical
          – Max-age, ETag, if-modified-since, etc.
          – Think how to tune those parameters for media files, as per your specific needs
      • We tried Amazon S3 and Google for secondary storage
          – However, Amazon proved unreliable (connections, availability)
      • We found that using a CDN in front of Prospero is very effective
      • Initially, files were stored on the filesystem
      • (T) We added a Tokyo Tyrant backend for small files
      • (M) We added a Memcached (Redis) layer for “in transit” files
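
To illustrate why the caching headers matter, here is a minimal sketch of a conditional response for a media file. It is not Prospero's implementation: the function name, the SHA-1 based ETag and the one-year default `max_age` are assumptions, and it uses the `If-None-Match` request header, the standard counterpart of `ETag`.

```python
import hashlib
from typing import Dict, Optional, Tuple


def media_response(
    body: bytes, if_none_match: Optional[str], max_age: int = 31536000
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a media file request."""
    # A content hash works as an ETag when the file content is immutable.
    etag = '"' + hashlib.sha1(body).hexdigest() + '"'
    headers = {"ETag": etag, "Cache-Control": f"public, max-age={max_age}"}
    if if_none_match == etag:
        # The client or CDN already holds this exact content: send no body.
        return 304, headers, b""
    return 200, headers, body
```

A long `max-age` keeps repeat requests away from the origin, and the ETag lets a cache revalidate cheaply once the entry expires.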
  • 14. Prospero – Wix Media Storage
      • Our current architecture
      [Diagram: get 37D815B5.jpg → CDN; if not in CDN → Austin; first fallback → Chicago; second fallback → Google Cloud Storage. Austin and Chicago each show server groups labelled x36, x36 and x32, with M and T layers]
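
The fallback order in the diagram can be sketched as an ordered list of storage tiers that are tried in turn. The tier names and the lookup callables below are hypothetical stand-ins; the real system talks HTTP to the datacenters and to cloud storage.

```python
from typing import Callable, Optional, Sequence, Tuple

# A tier is a name plus a lookup that returns the file bytes, or None on a miss.
Tier = Tuple[str, Callable[[str], Optional[bytes]]]


def fetch_with_fallback(file_name: str, tiers: Sequence[Tier]) -> Tuple[str, bytes]:
    """Try each tier in order and return (tier name, bytes) from the first hit."""
    for name, lookup in tiers:
        body = lookup(file_name)
        if body is not None:
            return name, body
    raise FileNotFoundError(file_name)
```

For example, with `tiers = [("austin", austin.get), ("chicago", chicago.get), ("gcs", gcs.get)]`, where each store is a plain dict in a test, a file missing from Austin is served from Chicago.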
  • 15. CDN
      • Use a CDN!
      • CDN acts as a great connection manager
          – We have CDN hit ratios of over 99.9%
      • Use the “Cache Killer” pattern
          – http://static.wix.com/client/css/viewer.css?v=327
          – http://static.wix.com/client/1.3.2/css/viewer.css
          – Makes flushing files from the CDN redundant
          – Enabler for longer caching periods
      • There are many vendors
          – We started with 1 CDN vendor
          – We are now working with two CDN vendors
          – Different CDN vendors have advantages in different geographies
      • Tune HTTP Headers per CDN Vendor
          – CDN Vendors interpret HTTP headers differently
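
The two “Cache Killer” URL shapes shown above can be generated with small helpers like these. The function names and parameters are illustrative assumptions; only the resulting URL shapes come from the slide.

```python
def versioned_query(base: str, path: str, version: int) -> str:
    """Version carried in the query string, e.g. .../viewer.css?v=327."""
    return f"{base}/{path}?v={version}"


def versioned_path(base: str, prefix: str, version: str, path: str) -> str:
    """Version carried in the path, e.g. .../client/1.3.2/css/viewer.css."""
    return f"{base}/{prefix}/{version}/{path}"
```

Since every release produces a new URL, the CDN never needs to be flushed, and each URL can be cached for a long period.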
  • 16. Development Velocity
      • The Challenge
          – Our codebase became large and entangled
          – Feature rollout became harder over time, requiring longer and longer manual regression
          – The longer the regression was, the harder it became to make “a good release”
          – Strange full-table-scan queries generated by Hibernate, which we still have no idea what code is responsible for…
      • The solution
          – Mid 2010: Wix Framework (modern base libraries)
          – Beginning 2011: CI / CD / TDD techniques + DevOps culture
          – Mid 2011: Scala
          – SOA Architecture (not WSDL)
      [Timeline labels: Framework, CI / CD / TDD + DevOps, Scala]
  • 17. People are the key
      • Train the people you already have
          – We sent our entire QA department to learn Java
          – Developers learn TDD and CI/CD methodologies
      • Hiring the right people is key to success
          – Hire only the best developers (only seniors)
          – Don’t count only on the interview, you need to test actual coding
          – Anyone who interviews can drop a candidate
          – Hire people who will challenge you (no “yes men”)
          – Get people you can trust with “root” access to production
      • Never stop hiring
          – If we find an excellent person, we will create a position for them even if we do not have one open
      • Wix is doubling its size every year
          – Yes, we are currently hiring
          – We’re considering hiring and training junior developers
  • 18. Wix’s CI / CD / TDD + DevOps model
      • Abandon the “VERSION” paradigm: move to a feature-centric life
      • Make small and frequent releases as soon as possible
          – Today we release about 10 times a day, gaining velocity
      • Empower the developer
          – The developer is responsible from product idea to 100,000 active users
          – Remove every obstacle in the developer’s path
          – Big cultural change from waterfall: affects the whole company
          – The developer is responsible for their app’s operations
      • Automate everything: CI/CD/TDD
          – CI: Continuous Integration
          – CD: Continuous Delivery / Deployment
          – TDD: Automated unit-tests, integration tests, GUI tests
      • Measure Everything (The lean startup way)
          – A/B test every new feature
          – Monitor real KPIs (business, not CPU)
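
One common way to A/B test a feature is to assign each user to a bucket deterministically, so the same user always sees the same variant. The sketch below shows that general technique; it is not taken from the talk, and the function name, the SHA-256 hashing and the 0 to 99 bucket range are assumptions.

```python
import hashlib


def in_experiment(experiment: str, user_id: str, percent: int) -> bool:
    """Return True if the user falls into the first `percent` of 100 buckets."""
    # Hashing experiment and user together gives each experiment its own split.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percent
```

Raising `percent` gradually widens the rollout without moving users who already have the feature back out of it.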
  • 19. CI / CD @ Wix – Release Process
      • Make an RC
          – Runs build, unit-tests, integration tests
  • 20. CI / CD @ Wix – Release Process
      • Deploy as GA
          – Using Chef, Noah, Artifactory
          – Runs Self-Tests
  • 21. CI / CD @ Wix – Release Process
      • Monitor
          – Deployment, NewRelic, App-Info, Recent Events
      • Rollback
  • 22. Products we’ve built (partial list)
      • Wix Mobile
      • Wix HTML5
          – Full HTML 5 support: total rewrite of our Flash product
      • Third Party Applications (TPAs)
          – With over 200,000 installations in the first 3 months
      • Answers
          – Wix unique support system
      • Wix Billing System (PCI Compliant)
          – Support complex business models for TPAs
          – Support diverse geo
      • eCommerce
          – Based on Magento
      • BI
      [Timeline labels: Billing, TPA, eCommerce, App Builder, HTML 5, Answers, Mobile]

Editor's Notes

  1. Managed Hosting costs: $0.13 / GByte storage (counting two copies), which includes 100 TByte traffic per host (effectively free traffic). Cloud costs: S3, $0.06 / GByte (Standard Storage) + $0.05 / GByte
  2. Akamai (Cotendo) and Level3
  3. Key performance indicators