Mathew Beane discusses strategies for optimizing and scaling Magento applications on clustered infrastructure. Some key points include:
- Using Puppetmaster to build out clusters with standard webnodes and database configurations.
- Magento supports huge stores and is very flexible and scalable. Redis is preferred over Memcache for caching.
- Important to have application optimization, testing protocols and deployment pipelines in place before scaling.
- Common components for scaling include load balancers, proxying web traffic, clustering Redis with Sentinel and Twemproxy, adding read servers and auto-scaling.
4. • It is not ready yet….
• We are finalizing a release of the core configuration soon.
• Moving to RHEL / CentOS 7
• Performance improvements from move to 7 are worth the wait.
• Hoping to see it ready for Lonestar PHP
Sneak Peek:
• Uses Puppetmaster to build out cluster
• Initial release should be the standard (n)webnodes + single db config
• Vagrant boxes will be available to allow for development environments
5. • Open-source e-commerce platform
• Based on Zend Framework 1
• Very flexible, it’s built to modify
• Extremely scalable, supports huge stores
• Market leader and still growing
• Magento 2 is right around the corner
• It is in development beta now
• http://magento.com/developers/magento2
6.
7. Before you start:
Application is optimized and clean
Development Pipeline in place
Deployment infrastructure is solid
Rigorous testing protocols are in place
Get started:
Add a Load Balancer
Proxy Web Traffic to web(n) servers
Cluster Redis using Sentinel / Twemproxy
Add Varnish if application permits
Add MySQL Read servers
Build in auto-scaling and you’re done
8. • PHP 5.4+ - Must use PHP-FPM instead of mod_php
• Nginx is easier to use in a clustered environment
– The configurations are cleaner
– Proxying to PHP-FPM nodes is fast, clean and easy
• Nexcess Whitepaper: https://github.com/nexcess/ee-whitepaper-v1-configs
• Redis is the preferred cache for Magento
– Using Sentinel and twemproxy you can horizontally scale
– Cluster twemproxy to eliminate all single threaded bottlenecks and single points of failure
• Turn off MySQL Query Caching – This is single threaded and not of use with Magento
• Varnish is great, however your application must be suitable
• Use Zend Server 8 Z-Ray to profile and tune your Magento Application
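As a rough illustration of the Nginx-to-PHP-FPM proxying mentioned above, the sketch below renders the two Nginx fragments involved: an `upstream` block listing the PHP-FPM nodes and a `location` block that passes PHP requests to it. The function name `render_php_fpm_config`, the upstream name `php_fpm` and the node addresses are assumptions made for the example; they do not come from the talk or from the Nexcess configs.

```python
def render_php_fpm_config(nodes, upstream_name="php_fpm"):
    """Render nginx fragments that proxy PHP requests to a pool of PHP-FPM nodes.

    nodes: list of (host, port) tuples, one per PHP-FPM node (hypothetical values).
    Returns (upstream_block, location_block). The upstream block belongs in the
    http context and the location block inside a server block.
    """
    if not nodes:
        raise ValueError("at least one PHP-FPM node is required")

    # One "server host:port;" line per PHP-FPM node in the pool.
    upstream_lines = [f"upstream {upstream_name} {{"]
    for host, port in nodes:
        upstream_lines.append(f"    server {host}:{port};")
    upstream_lines.append("}")

    # PHP requests are handed to the upstream pool over FastCGI.
    location_lines = [
        "location ~ \\.php$ {",
        "    include fastcgi_params;",
        "    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;",
        f"    fastcgi_pass {upstream_name};",
        "}",
    ]
    return "\n".join(upstream_lines), "\n".join(location_lines)


upstream_block, location_block = render_php_fpm_config(
    [("10.0.0.11", 9000), ("10.0.0.12", 9000)]  # example addresses only
)
print(upstream_block)
print(location_block)
```

Generating the fragments from a node list keeps every web node's configuration identical, which fits the advice to keep builds simple and repeatable.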
9. • Application Packaging: You must be able to package your application in order to deploy it to a cluster.
• Composer / Modman: Maintain separate packages for core Magento and all of your extensions.
• Keep a clean core: Never edit core files. Ensure that you can see any core file changes.
• Release / Feature Branch: Choose a branching methodology and build around it.
• Testing: Build testing into all of your development cycles; never release without complete testing. Build unit tests where possible.
• Use Pull Requests: Make pull requests part of your workflow.
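One way to "see any core file changes" is to record checksums of a pristine core and compare a working copy against them. The sketch below is only an illustration of that idea, not a tool named in the talk; the function names and the manifest format are assumptions.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(core_dir):
    """Map each file's path (relative to core_dir) to its checksum."""
    core_dir = Path(core_dir)
    return {
        path.relative_to(core_dir).as_posix(): file_digest(path)
        for path in sorted(core_dir.rglob("*"))
        if path.is_file()
    }


def diff_against_manifest(core_dir, manifest):
    """Report files that were modified, removed or added since the manifest was built."""
    current = build_manifest(core_dir)
    return {
        "modified": sorted(k for k in manifest if k in current and current[k] != manifest[k]),
        "missing": sorted(k for k in manifest if k not in current),
        "added": sorted(k for k in current if k not in manifest),
    }
```

A manifest built from a clean Magento download could be stored with the deployment scripts, so a release can be stopped when `modified` is not empty.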
10. • When deploying an application to a cluster of application servers this is a requirement
• There are many choices:
– Capistrano: Written in Ruby but well accepted and great to work with
– Jenkins: Another example of Deployment Automation
– Bamboo: Part of the Atlassian stack also includes testing and other features.
– Roll Your Own: This is more common; using bash scripts and other tools you can build a project deployment tool fairly easily.
I highly suggest researching Fabrizio Branca’s work on the subject:
http://www.slideshare.net/aoepeople/rock-solid-magento
Also check out Joshua Warren’s slides on Test Driven Development and his stem to stern tutorial, Rock Solid Magento Development.
http://www.slideshare.net/joshuaswarren/
11. "Don't rate potential over performance."
- Jim Fassel
• BlazeMeter: Using BlazeMeter you can easily build repeatable tests, with very nice graphs (based on JMeter).
• Gatling: http://gatling.io/ On par with BlazeMeter.
• JMeter: Very effective, without having to purchase a SaaS.
• Siege: Can be used minimally to simulate some types of load.
12. Hardware vs. Software
SOFTWARE
• HAProxy: a free, fast and reliable solution featuring high availability, load balancing and proxying for TCP and HTTP-based applications.
• HAProxy can be used for web servers, database servers, Redis and any other TCP-based application.
HARDWARE
• F5 hardware load balancers are a standard
• Rackspace offers a very easy to use web interface to maintain a hybrid infrastructure
• Hardware load balancers offer a turn-key mature solution.
14. Typical Cluster Components
• High Availability: Hardware Load Balancer, HAProxy Load Balancer, Sentinel / twemproxy
• Database: MySQL / Percona, Master/Slave, Single Write / Multiple Read Servers
• Web: Apache / Nginx, PHP 5.4+, Multiple Web Servers, Varnish
• Other: File Server (NFS / NAS), Redis / Memcache, Deployment Tools, Monitoring
15. • Expensive, with many features out of the box that will make for an easy turnkey solution.
• Using Sentinel and twemproxy wrapped by your load balancer will give you amazing performance out of your Redis cluster.
• Read servers can be load balanced with Magento. This is easy to achieve, but it doesn’t solve the checkout issue.
• De facto software load balancing and high availability server.
• This is the most common use for a load balancer, and is a great place to start your setup.
• Budget concerns will drive this decision
• Hosting Choices will affect availability, costs and toolsets
• Start locally with HAProxy and build test clusters using vagrant
• HAProxy can still be used, with a hardware load balancer in place.
16. • Simple to Load Balance, most sites start here
• Challenges include the following:
– Session Management: should be a no-brainer if you’re using Redis for sessions
– Shared filesystem: Use NFS for media, keep all code local to web servers
– Fencing: easily solved with either software or hardware
– Log Collection: Rsyslog or Splunk
• Keep the system builds simple, repeatable and automate if possible
• How to automate:
– Create a complete image to work from; include Puppet so it can pull from Puppetmaster
– Puppetmaster spins up your webserver stack
– You have your deployment process in place, so tie it into puppet and pull the code
• Be prepared to lose nodes; the more you have, the more likely failure is
• When a node runs amok, you must be prepared to kill it dead
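The fencing idea above (take a misbehaving node out of rotation) can be sketched as a small round-robin pool that removes a node after repeated failed health checks. This is a toy model under stated assumptions, not how HAProxy or a hardware load balancer implements it; `NodePool`, `max_failures` and the node names are hypothetical.

```python
class NodePool:
    """Round-robin pool that fences a node after repeated failed health checks."""

    def __init__(self, nodes, max_failures=3):
        self.active = list(nodes)      # nodes currently receiving traffic
        self.fenced = []               # nodes removed from rotation
        self.max_failures = max_failures
        self._failures = {node: 0 for node in nodes}
        self._next = 0

    def report(self, node, healthy):
        """Record one health-check result; fence the node after too many failures in a row."""
        if node not in self.active:
            return
        if healthy:
            self._failures[node] = 0
            return
        self._failures[node] += 1
        if self._failures[node] >= self.max_failures:
            self.active.remove(node)
            self.fenced.append(node)

    def pick(self):
        """Return the next active node in round-robin order."""
        if not self.active:
            raise RuntimeError("no healthy nodes left in the pool")
        node = self.active[self._next % len(self.active)]
        self._next += 1
        return node


pool = NodePool(["web1", "web2", "web3"], max_failures=2)
pool.report("web2", healthy=False)
pool.report("web2", healthy=False)   # second consecutive failure fences web2
print(pool.active, pool.fenced)
```

In a real cluster the fenced node would also be shut down or rebuilt from the base image, since a node that is only removed from the pool can still hold shared resources.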
“Redis clustering using sentinel is easy to set up. Adding twemproxy allows for a highly scalable Redis cluster and you get auto fail over and a ton of other benefits with this configuration. This arrangement can also remove your Redis single point of failure.”
http://aepod.com/clustering-magento-redis-caching-with-sentinel-keepalived-twemproxy-and-twemproxy-agent/
• Sentinel Monitors Redis clusters
• twemproxy handles sharding
• twemproxy agent monitors sentinel
• Very robust once set up: nothing is single threaded, everything is HA, and the speed…
• Pretty much transparent to Magento despite the complexity
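To give a feel for what "twemproxy handles sharding" means, here is a minimal consistent-hashing sketch: each cache key maps to one Redis node, and removing a node only moves the keys that lived on it. This is a generic illustration and not twemproxy's actual implementation; the class `HashRing`, the replica count and the node names are assumptions.

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent-hash ring mapping cache keys to Redis nodes."""

    def __init__(self, nodes, replicas=100):
        self.replicas = replicas   # virtual points per node, to smooth the distribution
        self._ring = []            # sorted list of (hash, node) points
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(key):
        return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)

    def add(self, node):
        """Place the node's virtual points on the ring."""
        for i in range(self.replicas):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove(self, node):
        """Drop all of a node's points, for example after a failover."""
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def node_for(self, key):
        """Return the node owning the first ring point at or after the key's hash."""
        if not self._ring:
            raise RuntimeError("ring is empty")
        index = bisect.bisect(self._ring, (self._hash(key), ""))
        return self._ring[index % len(self._ring)][1]


ring = HashRing(["redis-a", "redis-b", "redis-c"])
print(ring.node_for("example_cache_key"))
```

Because only the removed node's keys are remapped, losing one Redis instance invalidates a fraction of the cache rather than all of it.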
18. • Percona XtraDB to cluster MySQL
• Percona XtraBackup for duplicating and rebuilding nodes
• Percona Toolkit to help debug any issues you’re running into
• Difficult to scale Write Servers
• Scale out your read servers as needed, but MySQL reads are rarely the bottleneck
• Typically Slave server is used for backup and hot swap, NOT clustering.
A couple quick tips:
• Not all tables in Magento are InnoDB; converting the MyISAM and Memory tables is OK
• Over time you will usually need to be able to kill read servers and refresh them (STONITH)
• Use your master server as a read server in the load balancer pool; when you kill all your read servers, it can fall back to master.
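The single-write / multiple-read arrangement with a master fallback can be sketched as a small router. This is a simplified model, assuming reads are recognized by a leading SELECT and that the master is used for reads only when no read server is left; it is not Magento's connection handling, and `QueryRouter` and the server names are hypothetical.

```python
class QueryRouter:
    """Send writes to the master and spread reads across read servers."""

    def __init__(self, master, read_servers):
        self.master = master
        self.read_servers = list(read_servers)
        self._counter = 0

    def kill_read_server(self, server):
        """Remove a read server from rotation (for example, to refresh it)."""
        if server in self.read_servers:
            self.read_servers.remove(server)

    def route(self, sql):
        """Return the server that should run this statement."""
        words = sql.split(None, 1)
        verb = words[0].upper() if words else ""
        # Anything that is not a plain SELECT goes to the single write server.
        # Note: this naive check would also send SELECT ... FOR UPDATE to a
        # read server; a real router must treat locking reads as writes.
        if verb != "SELECT" or not self.read_servers:
            return self.master
        server = self.read_servers[self._counter % len(self.read_servers)]
        self._counter += 1
        return server


router = QueryRouter("db-master", ["db-read1", "db-read2"])
print(router.route("SELECT 1"))                  # goes to a read server
print(router.route("INSERT INTO t VALUES (1)"))  # goes to the master
```

With every read server killed, `route` returns the master for reads as well, which mirrors the fallback described in the tips above.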
19. • Detailed Auto-Scaling will have to wait until LonestarPHP
– 3 Hour tutorial, which will include web server auto-scaling using puppet
– April 16th 2015 in Dallas
• Insert puzzle building analogy joke here: http://www.wikihow.com/Assemble-Jigsaw-Puzzles
• Each hosting environment has its own quirks; add the business logic requirements on top of that and you will almost always have a unique infrastructure for every client
• Build small pieces and work them into the larger picture; you can get a lot of performance with a few minor changes.
• Test everything you do, keep detailed notes on the configurations and compare against the previous tests
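Comparing a test run against the previous one is easier with a few fixed numbers per run. The sketch below summarizes response times and reports the percentage change between a baseline and a candidate run; it assumes you have already exported response times in milliseconds from your load testing tool, and the function names are made up for the example.

```python
import statistics


def summarize(response_times_ms):
    """Return mean, median and 95th percentile (nearest-rank) of a load test run."""
    ordered = sorted(response_times_ms)
    if not ordered:
        raise ValueError("no samples to summarize")
    # Nearest-rank percentile: ceil(0.95 * n), computed with integer arithmetic.
    rank = (95 * len(ordered) + 99) // 100
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "p95": ordered[rank - 1],
    }


def compare(baseline_ms, candidate_ms):
    """Percentage change per metric; negative values mean the candidate run is faster."""
    base = summarize(baseline_ms)
    cand = summarize(candidate_ms)
    # Assumes the baseline metrics are non-zero, which holds for real response times.
    return {name: (cand[name] - base[name]) / base[name] * 100.0 for name in base}
```

Keeping the summary of each run next to the configuration notes makes it possible to tell which change produced which gain.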
20. • Mathew Beane <mbeane@robofirm.com>
• Twitter: @aepod
• Blog: http://aepod.com/
Rate this talk:
https://joind.in/13837
(Slides will be available)
Thanks to the following:
My Family
The Magento Community
Robofirm
Fabrizio Branca (deployments)
Thijs Feryn (sentinel)
Rackspace
Editor's Notes
Sits on projector while people file in.
Mathew Beane, Director, Systems Engineering at Robofirm
Magento Certified developer and part of the Zend Z-Team, where I am contributing to Zend Server
PHP developer since 2000, Magento developer since 2009.
Robofirm is a Magento Solutions provider and a Magento Partner focused on Mid-level to Enterprise clients.
Based out of New York City, however the bulk of our developers are in Dallas or Minneapolis.
We are currently hiring talented Magento developers. Talk with Mike or Ryan after this presentation, they will be the guys heckling me and shooting nerf darts at me.
About magento.
Dislike magento because of: EAV, Zend Framework 1?, Too bloated, sites are a mess, may like some other platform
Talk about how sites need to scale because they almost all inevitably grow. This is pretty exciting work to be doing as a developer.
Obviously Magento Mountain Climbers, you can tell by the color of their tents?
Anyhow, talk a little about optimization in a moment. We will get to load balancers later on; we have already talked about metrics and testing a little, and will be coming back if there is time.
Needs to highlight using metrics to prove performance as you go. Applications will vary, so you need to methodically use metrics to judge performance.
Breaking Points, where you need to adjust first.
php-fpm on its own nodes, proxied by nginx.
Varnish / Turpentine
NFS becomes bottleneck
Redis Clustering pairing twemproxy (nutcracker) with redis is a no-brainer as it multiplexes the redis queries, even if you just put it on the same node as redis is running. and
Database
Rinse and Repeat
Load balancer Breakout
Fencing is the process of isolating a node of a computer cluster or protecting shared resources when a node appears to be malfunctioning.
As the number of nodes goes up, so does the likelihood of failure
Kill a node: an errant node which might have run amok with cluster resources is simply shot in the head
Scalearc and write-ahead-logging as a solution for single writes