7. Nuvola
scuoladigitale.info
● > 3M HTTP requests / day
● > 1000 databases
● ~ 0.5 TB of MySQL data
● ~ 180M queries / day
● ~ 105M media files
● ~ 18 TB of media files
● From ~5k to ~200k sessions in 5 minutes
8. Scalability
Your app is scalable if it can adapt to
support an increasing amount of data
or a growing number of users.
9. “But… I don’t have an increasing load”
(http://www.freepik.com/free-photos-vectors/smile - Smile vector designed by Freepik)
10. “Scalability doesn’t matter to you.”
12. “I do have an increasing load”
29. 6 main areas
1. web server
2. sessions
3. database
4. filesystem
5. async tasks
6. logging
There are some more (HTTP caching, frontend, etc.): next talk!
36. Web server / Cache
PHP CACHE
APPLICATION CACHE
DOCTRINE CACHE
37. Web server / PHP cache
OPcache
OPcache improves PHP performance by storing precompiled script bytecode
in shared memory, thereby removing the need for PHP to load and parse
scripts on each request.
http://php.net/manual/en/intro.opcache.php
39. Web server / PHP cache
OPcache
Bytecode caching
opcache.enable = On
opcache.validate_timestamps = 0
Need to manually reset OPcache on deploy!
https://tideways.io/profiler/blog/fine-tune-your-opcache-configuration-to-avoid-caching-suprises
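With opcache.validate_timestamps = 0, PHP never re-checks files on disk, so a deploy has to clear the cache explicitly. A minimal sketch of such a deploy hook, assuming it runs inside the web server's PHP process (the CLI keeps a separate OPcache); the function name resetOpcacheAfterDeploy is hypothetical:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical deploy hook: clears OPcache so newly deployed code is loaded.
 * Returns a short status string instead of failing hard.
 */
function resetOpcacheAfterDeploy(): string
{
    // The extension may be missing entirely (e.g. a minimal CLI build).
    if (!function_exists('opcache_reset')) {
        return 'opcache-unavailable';
    }
    // opcache_reset() returns false when OPcache is disabled or a restart is pending.
    return opcache_reset() ? 'opcache-reset' : 'opcache-disabled';
}

echo resetOpcacheAfterDeploy(), PHP_EOL;
```

Reloading PHP-FPM after the release switch is the other common way to get the same effect.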
40. PHP code / Application cache
● Put application cache in RAM
● Use cache warmers during deploy
releaseN/var/cache -> /var/www/project/cache/releaseN
“/etc/fstab”
tmpfs /var/www/project/cache tmpfs size=512m
41. PHP code / Doctrine cache
● Configure Doctrine to use cache
● Disable Doctrine logging and profiling on prod
doctrine.orm.default_metadata_cache:
    type: apcu
doctrine.orm.default_query_cache:
    type: apcu
doctrine.orm.default_result_cache:
    type: apcu
44. PHP code / Profiling
Blackfire
New Relic
Tideways
47. PHP code / Recap
● Easy
● No need to change your PHP code
● It’s mostly configuration and tuning
● You can apply the changes one by one and measure how each affects performance
● Need to monitor and profile: New Relic for PHP
● Don’t waste time on micro-optimization
Take away: use cache!
48. Sessions
● Think of session management as a service
● Use centralized Memcached or Redis (EC2
or ElastiCache on AWS)
● Avoid sticky sessions (load balancer set up)
49. Session / Memcached
No bundle required
https://labs.madisoft.it/scaling-symfony-sessions-with-memcached
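"No bundle required" works because PHP's session extensions can talk to a shared store directly. A minimal sketch of the ini settings involved, assuming ext-memcached or ext-redis (phpredis) is installed; the helper sessionIniSettings and the host name are hypothetical:

```php
<?php
declare(strict_types=1);

/**
 * Builds the PHP session ini settings for a shared session store.
 * $backend is 'memcached' (ext-memcached) or 'redis' (ext-redis / phpredis).
 */
function sessionIniSettings(string $backend, string $host, int $port): array
{
    switch ($backend) {
        case 'memcached':
            // ext-memcached expects "host:port".
            $savePath = sprintf('%s:%d', $host, $port);
            break;
        case 'redis':
            // phpredis expects a URI such as "tcp://host:port".
            $savePath = sprintf('tcp://%s:%d', $host, $port);
            break;
        default:
            throw new InvalidArgumentException("Unsupported backend: $backend");
    }

    return [
        'session.save_handler' => $backend,
        'session.save_path' => $savePath,
    ];
}

// Hypothetical host name; these values normally go into php.ini.
foreach (sessionIniSettings('memcached', 'sessions.internal', 11211) as $key => $value) {
    echo $key, ' = ', $value, PHP_EOL;
}
```

Every web server then reads and writes the same store, which is what makes sticky sessions unnecessary.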
54. Session / Recap
● Very easy
● No need to change your PHP code
● Redis is better than Memcached: it has persistence and many other features
● Let AWS scale for you and deal with failover and sysadmin stuff
Take away: use Redis
62. Database / Big db problems
● Very slow backup. High lock time
● If mysql crashes, restart takes time
● It takes time to download and restore in dev
● You need expensive hardware (mostly RAM)
63. Database / Short-term solutions
Use a managed db service like AWS RDS
● It scales for you
● It handles failover and backup for you
But:
● It’s expensive for big db
● Problems are only mitigated but they are still there
66. Database / Sharding
● Very fast backup. Low lock time
● If mysql crashes, restart takes little time
● Fast to download and restore in dev
● No need for expensive hardware
● You arrange your dbs on many machines
67. Database / Sharding
● How can Symfony deal with them?
● How to execute a cli command on one of them?
● How to apply a migration (e.g. add a column) to 1000 dbs?
● …
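The "migration on 1000 dbs" question comes down to running the same step once per database without letting one failure stop the rest. A minimal sketch under stated assumptions: migrateAll, the customer_NNNN naming scheme and the simulated failure are all hypothetical, and the callback stands in for a real connection plus ALTER TABLE:

```php
<?php
declare(strict_types=1);

/**
 * Runs $migrate once per database and collects failures,
 * so a single broken shard does not abort the whole run.
 *
 * @param string[] $databases
 * @return array<string, string> database name => error message
 */
function migrateAll(array $databases, callable $migrate): array
{
    $failures = [];
    foreach ($databases as $database) {
        try {
            $migrate($database);
        } catch (Throwable $e) {
            $failures[$database] = $e->getMessage();
        }
    }

    return $failures;
}

// Hypothetical naming scheme: customer_0001 ... customer_1000.
$databases = array_map(
    fn (int $n): string => sprintf('customer_%04d', $n),
    range(1, 1000)
);

$failures = migrateAll($databases, function (string $database): void {
    // A real callback would connect to $database and apply the schema change.
    if ($database === 'customer_0042') {
        throw new RuntimeException('simulated failure');
    }
});

printf("%d migrated, %d failed\n", count($databases) - count($failures), count($failures));
```

The loop is sequential; with this many databases the per-database calls are what you would hand to parallel workers.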
69. Database / Sharding
Define a DBAL connection and an ORM
entity manager for each db
https://symfony.com/doc/current/doctrine/multiple_entity_managers.html
73. Database / Doctrine sharding
● Suited for multi-tenant applications
● Global database to store shared data (e.g. user data)
● Need to use UUIDs: auto-increment ids would collide across shards
http://docs.doctrine-project.org/projects/doctrine-dbal/en/latest/reference/sharding.html
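UUIDs matter here because each shard would otherwise hand out the same auto-increment ids. A minimal sketch of generating a random (version 4) UUID in plain PHP; the function name uuidV4 is hypothetical, and a real project would more likely use a library or Doctrine's own UUID support:

```php
<?php
declare(strict_types=1);

/** Generates a random (version 4) UUID in the RFC 4122 layout. */
function uuidV4(): string
{
    $bytes = random_bytes(16);
    // Set the version (4) in the high nibble of byte 6.
    $bytes[6] = chr((ord($bytes[6]) & 0x0f) | 0x40);
    // Set the variant (binary 10xx) in the two high bits of byte 8.
    $bytes[8] = chr((ord($bytes[8]) & 0x3f) | 0x80);

    // 32 hex characters split into 8 groups of 4, joined as 8-4-4-4-12.
    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($bytes), 4));
}

echo uuidV4(), PHP_EOL;
```

Ids generated this way can be created on any shard, or in the application, without coordination.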
75. Database / Sharding
ShardManager Interface
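use Doctrine\DBAL\Sharding\PoolingShardManager;

// $conn is assumed to be a PoolingShardConnection configured with the global db and its shards
$shardManager = new PoolingShardManager($conn);
$currentCustomerId = 3;
$shardManager->selectShard($currentCustomerId);
// all queries after this call hit the shard that holds
// the customer with id 3
$shardManager->selectGlobal();
// the global db is selected again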
76. Database / Sharding
● It works but it’s complex to manage
● No documentation anywhere
● Need to manage shard configuration: adding a new
shard?
● Need to parallelize shard migrations: Gearman?
● Deal with sharding in test environment
77. Database / Recap
● NoSQL is not a way to scale SQL: they have different purposes. You can
use both.
● Sharding is difficult to implement
● Need to change your code
● The short-term solution is to use AWS to offload some maintenance
● Doctrine DBAL sharding works well but you need to write code and
wrappers. Best suited for multi-tenant apps
● When it’s done, you can scale without any limit
Take away: do sharding if you REALLY need it
85. Filesystem / Recap
● Easy
● Need to change your PHP code
● Ready-made bundles
● Avoid local filesystem and NAS
Take away: use FlysystemBundle with S3
93. RabbitMQ
Putting some machines (containers) inside an
auto-scaling group!
They can scale based on:
● Hardware parameters: cpu / memory
● Number of queue items
● Add your custom metrics!
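Scaling on the number of queue items amounts to a small rule that maps backlog to a consumer count. A minimal sketch, assuming a custom metric reports the queue length; desiredConsumers and all thresholds are illustrative assumptions, not RabbitMQ or AWS defaults:

```php
<?php
declare(strict_types=1);

/**
 * Desired number of consumers for a given queue backlog,
 * clamped between a minimum and a maximum group size.
 */
function desiredConsumers(int $queuedMessages, int $messagesPerConsumer, int $min, int $max): int
{
    if ($messagesPerConsumer < 1 || $min < 0 || $max < $min) {
        throw new InvalidArgumentException('Invalid scaling parameters');
    }
    // Round up so a partial batch still gets a consumer.
    $needed = (int) ceil($queuedMessages / $messagesPerConsumer);

    return max($min, min($max, $needed));
}

foreach ([0, 250, 5000, 100000] as $backlog) {
    printf("%6d queued -> %d consumers\n", $backlog, desiredConsumers($backlog, 500, 1, 20));
}
```

The minimum keeps one consumer warm when the queue is empty, and the maximum caps cost during a spike.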
94. Async tasks / Recap
● You need an external system and some new machines / containers
● Need to change your PHP code
● Ready-made bundles and libraries
● Avoid blocking sync tasks. Put the message on the queue and move on.
Take away: use RabbitMQ with auto-scaling
consumers
97. Logging
● You need an external system
● Take a look at managed ones: loggly.com, logz.io, scalyr.com
● Don’t need to change your PHP code
● You can’t avoid it in a distributed system
Take away: use a managed service
98. Scaling / Recap
● Sessions and filesystem: easy. Do it
● PHP code: not difficult. Think of it. Save money.
● Database: very hard. Think a lot
● Async tasks: think of it if you have many of them.
● Logging: necessary. Easy to implement if you choose a
managed service