This document provides tips for optimizing MySQL performance. It discusses setting up configuration files to log slow queries and enable the performance schema. It describes using indexes to optimize queries and proper SQL techniques to avoid N+1 problems and handle duplicates. The document also covers transactions, isolation levels, and advanced MySQL features like JSON, computed columns, and CHECK constraints. The overall message is that following best practices for configuration, indexing, and writing efficient SQL can help boost application performance.
4. € whoami
● Federico Razzoli
● Freelance consultant
● Writing SQL since MySQL 2.23
hello@federico-razzoli.com
● I love open source, sharing, collaboration, win-win, etc
● I love MariaDB, MySQL, Postgres, etc
○ Even Db2, somehow
5. This talk applies to...
● MySQL
● Percona Server
● MariaDB
And most information applies, with some changes, to:
● All other relational DBMSs
6. This talk is not about...
● ORMs
● PHP code
● Query optimisation
● SQL_MODE
● MySQL characteristics that I don’t want to advertise
○ For a reason
○ But you are allowed to ask questions that I hope you don’t ask
○ I will still say “thank you for your question”
7. Why do I want to talk about MySQL at a PHP event?
10. Slow log
ls -1 "$( mysql -NBe 'SELECT @@datadir' )" | grep slow
● Empty the slow log before a test
○ echo '' > /path/to/slowlog
● Check the slow log when you want to check your queries
○ Includes query duration, rows returned and some details on the execution plan
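The slow log can also be inspected and switched on at runtime instead of through the configuration file. The following is a minimal sketch, assuming an account that is allowed to set global variables; the threshold of 0 seconds (log every query) is an illustrative choice for a test environment, not a value taken from the talk.

```sql
-- Check whether the slow log is enabled, where it is written,
-- and the current threshold in seconds.
SELECT @@slow_query_log, @@slow_query_log_file, @@long_query_time;

-- Enable the slow log at runtime. This is lost on restart
-- unless the same setting is also written to the configuration file.
SET GLOBAL slow_query_log = ON;

-- Assumption: in a test environment we want every query logged.
-- New connections pick up the global value; existing sessions keep theirs.
SET GLOBAL long_query_time = 0;
```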
13. Queries with no results
SELECT *
FROM performance_schema.events_statements_summary_by_digest
WHERE
(
TRIM(DIGEST_TEXT) LIKE 'SELECT%'
OR TRIM(DIGEST_TEXT) LIKE 'CREATE%TABLE%SELECT%'
OR TRIM(DIGEST_TEXT) LIKE 'DELETE%'
OR TRIM(DIGEST_TEXT) LIKE 'UPDATE%'
OR TRIM(DIGEST_TEXT) LIKE 'REPLACE%'
)
AND SUM_ROWS_SENT = 0
AND SUM_ROWS_AFFECTED = 0
ORDER BY SUM_ROWS_EXAMINED DESC
LIMIT 10
\G
16. An index is an ordered data structure
● Think of a phone book
● It is a table with an index on (last_name, first_name)
● First takeaway: the order of columns matters
● Your mind contains a pretty good SQL optimiser
● When you want to know which queries can be optimised with a certain index, think of a phone book
17. Which queries can be optimised?
● WHERE last_name = 'Baker'
● WHERE first_name = 'Tom'
● WHERE first_name = 'Tom' AND last_name = 'Baker'
● WHERE last_name = 'Baker' AND first_name = 'Tom'
18. Rule #1:
A query can use a whole index
Or a leftmost part of an index
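Rule #1 can be checked with EXPLAIN. The sketch below assumes a hypothetical phone_book table (the table, column, and index names are illustrative, not from the talk) stored in InnoDB. In the EXPLAIN output, the key column should name idx_name when the index is usable and be NULL when it is not.

```sql
-- Hypothetical phone book table with an index on (last_name, first_name).
CREATE TABLE phone_book (
    id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    last_name VARCHAR(50) NOT NULL,
    first_name VARCHAR(50) NOT NULL,
    phone VARCHAR(20) NOT NULL,
    INDEX idx_name (last_name, first_name)
);

-- Leftmost part of the index: idx_name is usable.
EXPLAIN SELECT * FROM phone_book WHERE last_name = 'Baker';

-- Whole index: the order of the conditions in WHERE does not matter.
EXPLAIN SELECT * FROM phone_book
    WHERE first_name = 'Tom' AND last_name = 'Baker';

-- Not a leftmost part: idx_name cannot be used to locate the rows.
EXPLAIN SELECT * FROM phone_book WHERE first_name = 'Tom';
```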
19. Which queries can be optimised?
● WHERE last_name = 'Baker'
● WHERE last_name <> 'Baker'
● WHERE last_name > 'Baker'
● WHERE last_name >= 'Baker'
● WHERE last_name < 'Baker'
● WHERE last_name <= 'Baker'
20. Which queries can be optimised?
● WHERE last_name > 'B' AND last_name < 'C'
● WHERE last_name BETWEEN 'B' AND 'BZZZZZZZZZZZ';
● WHERE last_name LIKE 'B%'
● WHERE last_name LIKE '%ake%'
● WHERE last_name LIKE '%r'
21. Rule #2:
You can use an index to find a value
Or a (closed/open) range
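Rule #2 can be illustrated the same way. This sketch again assumes a hypothetical phone_book table with an index on (last_name, first_name); all names are illustrative. Even when a range is usable, the optimiser may still prefer a full scan if the range covers a large part of the table, so the EXPLAIN output is worth checking on realistic data.

```sql
-- Hypothetical table; created only if it does not exist yet.
CREATE TABLE IF NOT EXISTS phone_book (
    id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    last_name VARCHAR(50) NOT NULL,
    first_name VARCHAR(50) NOT NULL,
    phone VARCHAR(20) NOT NULL,
    INDEX idx_name (last_name, first_name)
);

-- A prefix match is a range in the ordered index: idx_name is usable.
EXPLAIN SELECT * FROM phone_book WHERE last_name LIKE 'B%';

-- A similar range, written with explicit bounds.
EXPLAIN SELECT * FROM phone_book
    WHERE last_name >= 'B' AND last_name < 'C';

-- Leading wildcard: there is no starting point in the ordered index,
-- so idx_name cannot be used to narrow down the rows.
EXPLAIN SELECT * FROM phone_book WHERE last_name LIKE '%r';
```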
22. Which queries can be optimised?
● WHERE last_name = 'Nimoy' OR first_name = 'Leonard'
● WHERE last_name = 'Nimoy' OR last_name = 'Shatner'
24. Which queries can be optimised?
● WHERE last_name = 'Nimoy' AND first_name = 'Leonard'
● WHERE last_name = 'Nimoy' AND first_name > 'Leonard'
● WHERE last_name > 'Nimoy' AND first_name = 'Leonard'
● WHERE last_name > 'Nimoy' AND first_name > 'Leonard'
27. N + 1 problem
Don’t:
foreach ( SELECT * FROM author WHERE last_name LIKE 'P%'; )
SELECT * FROM book WHERE author_id = ?;
Do:
SELECT a.first_name, a.last_name, b.*
FROM book b
JOIN author a
ON b.author_id = a.id
WHERE a.last_name LIKE 'P%';
28. Dealing with duplicates
INSERT INTO product (id, ...) VALUES (24, ...);
INSERT IGNORE INTO product (id, ...) VALUES (24, ...);
INSERT INTO product (id, ...) VALUES (24, ...)
ON DUPLICATE KEY UPDATE name = 'Sonic screwdriver';
REPLACE INTO product (id, ...) VALUES (24, ...);
DELETE IGNORE ...
UPDATE IGNORE ...
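The difference between these statements is easier to see on a concrete row. The sketch below assumes a hypothetical two-column product table; the column list and the sample values are illustrative only.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE product (
    id INT UNSIGNED PRIMARY KEY,
    name VARCHAR(50) NOT NULL
);
INSERT INTO product (id, name) VALUES (24, 'Sonic screwdriver');

-- Plain INSERT: fails with a duplicate-key error, the row is unchanged.
-- (When run as a script, the client stops here unless told to continue.)
INSERT INTO product (id, name) VALUES (24, 'Laser screwdriver');

-- INSERT IGNORE: the error becomes a warning, the row is unchanged.
INSERT IGNORE INTO product (id, name) VALUES (24, 'Laser screwdriver');
SHOW WARNINGS;

-- ON DUPLICATE KEY UPDATE: the existing row is updated in place.
INSERT INTO product (id, name) VALUES (24, 'Laser screwdriver')
    ON DUPLICATE KEY UPDATE name = 'Laser screwdriver';

-- REPLACE: the existing row is deleted, then the new row is inserted.
REPLACE INTO product (id, name) VALUES (24, 'Sonic screwdriver');
```

Because REPLACE deletes and re-inserts, it can fire DELETE triggers and cascade through foreign keys, which ON DUPLICATE KEY UPDATE does not.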
29. Insert many rows
INSERT INTO user
(first_name, last_name, email)
VALUES
('William', 'Hartnell', 'first@bbc.co.uk'),
('Tom', 'Baker', 'tom@gmail.com'),
('Jodie', 'Whittaker', 'first_lady@hotmail.com');
INSERT INTO `order` (user_id, product_id) VALUES
(LAST_INSERT_ID(), 24);
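One detail worth keeping in mind when combining a multi-row INSERT with LAST_INSERT_ID(): the function returns the value generated for the first row of the statement, not the last one. A minimal sketch, assuming a hypothetical user table with an AUTO_INCREMENT id:

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE user (
    id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL
);

INSERT INTO user (first_name, last_name) VALUES
    ('William', 'Hartnell'),
    ('Tom', 'Baker');

-- Returns the id generated for 'William', the FIRST row inserted
-- by the statement above, even though 'Tom' was inserted after it.
SELECT LAST_INSERT_ID();
```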
30. Delete/Update many tables
DELETE `order`, user
FROM `order`
INNER JOIN user
ON `order`.user_id = user.id
WHERE user.id = 24;
UPDATE `order`
INNER JOIN user
ON `order`.user_id = user.id
SET `order`.status = 'CANCELLED'
WHERE user.id = 24;
32. Creating table with rows
CREATE TABLE past_order LIKE `order`;
INSERT INTO past_order
SELECT * FROM `order`
WHERE status IN ('SHIPPED', 'CANCELLED');
Or:
CREATE TABLE customer
SELECT u.id, u.first_name, u.last_name
FROM user u JOIN `order` o
ON u.id = o.user_id
WHERE o.status <> 'CANCELLED';
34. What are transactions?
ACID
● Atomicity
○ All writes in a transaction succeed or fail together.
● Consistency
○ Data always moves from one consistent state to another.
● Isolation
○ Transactions are logically sequential.
● Durability
○ Data changes persist after system failures (crashes).
35. What are transactions?
START TRANSACTION;
SELECT … ;
UPDATE … ;
INSERT … ;
COMMIT;
START TRANSACTION;
DELETE … ;
INSERT … ;
ROLLBACK;
SET SESSION autocommit := 1; -- this is the default
DELETE … ;
36. What are transactions?
START TRANSACTION;
SELECT qty
FROM product
-- why did we use "qty > 0"?
WHERE id = 240 AND qty > 0
-- what is this?
LOCK IN SHARE MODE;
INSERT INTO `order` (user_id, product_id) VALUES (24, 240);
UPDATE product SET qty = qty - 1 WHERE id = 240;
COMMIT;
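One possible reading of the two questions in the comments: the qty > 0 condition makes the SELECT return no row when the product is out of stock, and the locking clause stops other transactions from changing the row until COMMIT. The variant below is a sketch, not the talk's own example. It swaps the shared lock for an exclusive one with FOR UPDATE, since two transactions that both hold a shared lock on the same row and then both try to update it can deadlock.

```sql
START TRANSACTION;

-- Exclusive row lock: a concurrent transaction running the same
-- statement waits here until we COMMIT or ROLLBACK.
SELECT qty
    FROM product
    WHERE id = 240 AND qty > 0
    FOR UPDATE;

-- The application continues only if the SELECT returned a row;
-- otherwise it issues ROLLBACK instead of the statements below.
INSERT INTO `order` (user_id, product_id) VALUES (24, 240);
UPDATE product SET qty = qty - 1 WHERE id = 240;

COMMIT;
```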
37. Isolation levels
● READ UNCOMMITTED
○ You could see not-yet-committed changes.
● READ COMMITTED
○ Each query acquires a separate snapshot.
● REPEATABLE READ (default)
○ One snapshot for the whole transaction.
● SERIALIZABLE
○ Like REPEATABLE READ, but SELECTs are implicitly LOCK IN SHARE MODE
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
START TRANSACTION;
...
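The difference between the two most common levels shows up with two concurrent sessions. The sketch below assumes an InnoDB product table with id and qty columns, as in the earlier slides; the statements are meant to be run by hand in two separate connections, in the order shown.

```sql
-- Session A
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
START TRANSACTION;
SELECT qty FROM product WHERE id = 240;  -- the snapshot is taken here

-- Session B (autocommit enabled)
UPDATE product SET qty = qty - 1 WHERE id = 240;

-- Session A again
SELECT qty FROM product WHERE id = 240;  -- same value as the first SELECT
COMMIT;
```

If session A uses READ COMMITTED instead, its second SELECT acquires a new snapshot and sees the value written by session B.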
41. Use cases for READ COMMITTED?
● Delete/update rows by id from multiple tables
● Show user’s payment history
● For each exam, show users who passed it
42. READ ONLY transactions
● Make sense with REPEATABLE READ
● 2 SELECTs will see consistent data
● Attempts to change data will return an error
● Performance optimisation
○ But not as much as READ UNCOMMITTED
START TRANSACTION READ ONLY;
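A short sketch of how this looks in practice, assuming the `order` table with id and status columns used on the earlier slides:

```sql
START TRANSACTION READ ONLY;

-- Under REPEATABLE READ, both queries read from the same snapshot,
-- so the two counts are consistent with each other.
SELECT COUNT(*) FROM `order`;
SELECT COUNT(*) FROM `order` WHERE status = 'SHIPPED';

-- Rejected with an error: writes are not allowed in this transaction.
UPDATE `order` SET status = 'CANCELLED' WHERE id = 24;

COMMIT;
```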
43. Ways to kill MySQL
Having SELECT privilege is enough to kill MySQL!
(or any RDBMS)
Method 1:
START TRANSACTION;
SELECT * FROM `order`;
SELECT SLEEP(3600 * 12);
44. Ways to kill MySQL
Having SELECT privilege is enough to kill MySQL!
(or any RDBMS)
Method 2:
START TRANSACTION;
SELECT * FROM `order` WHERE id = 24 FOR UPDATE;
SELECT SLEEP(3600 * 12);
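Both methods rely on a transaction that stays open for hours: the first keeps an old snapshot alive, the second keeps a row lock. A hedged sketch of how such transactions can be spotted on InnoDB follows; the one-hour threshold and the thread id in the commented-out KILL are illustrative assumptions.

```sql
-- List InnoDB transactions that have been open for more than an hour.
SELECT trx_id, trx_state, trx_started, trx_mysql_thread_id, trx_query
    FROM information_schema.INNODB_TRX
    WHERE trx_started < NOW() - INTERVAL 1 HOUR
    ORDER BY trx_started;

-- After checking with the owner of the connection, the transaction can be
-- ended by killing it, using the trx_mysql_thread_id value found above.
-- 12345 is a placeholder, not a real id.
-- KILL 12345;
```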
46. CHECK constraints
MySQL 8.0.16+, MariaDB 10.2+
CREATE TABLE person (
id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
email VARCHAR(100) NOT NULL,
CHECK (email LIKE '_%@_%.__%'),
birth_date DATE NOT NULL,
death_date DATE,
CHECK (birth_date <= death_date OR death_date IS NULL)
);
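To see the constraints at work, here is a sketch of three inserts against the person table defined above, assuming a server version that enforces CHECK constraints. The sample values are made up for illustration.

```sql
-- Accepted: the email matches the pattern and death_date is NULL.
INSERT INTO person (first_name, last_name, email, birth_date)
    VALUES ('Tom', 'Baker', 'tom@example.com', '1950-01-01');

-- Rejected: the email does not match '_%@_%.__%'.
INSERT INTO person (first_name, last_name, email, birth_date)
    VALUES ('Tom', 'Baker', 'not-an-email', '1950-01-01');

-- Rejected: death_date is earlier than birth_date.
INSERT INTO person (first_name, last_name, email, birth_date, death_date)
    VALUES ('Tom', 'Baker', 'tom@example.com', '1950-01-01', '1940-01-01');
```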
47. Computed columns
CREATE TABLE person (
id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
full_name VARCHAR(101) GENERATED ALWAYS AS
(CONCAT(first_name, ' ', last_name)),
email VARCHAR(100) NOT NULL,
birth_date DATE NOT NULL,
death_date DATE,
is_alive BOOL GENERATED ALWAYS AS (death_date IS NULL)
);
48. DEFAULT clauses
MySQL 8.0.13+, MariaDB 10.2+
CREATE TABLE person (
id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
full_name VARCHAR(100) NOT NULL
DEFAULT (CONCAT(first_name, ' ', last_name)),
email VARCHAR(100) NOT NULL,
birth_date DATE NOT NULL,
death_date DATE,
is_alive BOOL NOT NULL DEFAULT (death_date IS NULL)
);
49. DEFAULT v. Computed columns
● Computed values cannot be changed
● Both regular and computed columns can be indexed
○ Your DBA will not consider this option
● Indexed computed columns will work “implicitly”:
SELECT ...
WHERE CONCAT(first_name, ' ', last_name) =
'Peter Capaldi';
...but not in MariaDB
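A sketch of an indexed computed column follows, using a hypothetical person_gc table so that it does not clash with the person table from the other slides. Whether the optimiser substitutes the indexed column for the bare expression can depend on the server version and on the expression matching the column definition exactly, so the second EXPLAIN is something to verify rather than assume.

```sql
-- Hypothetical table with an index on a computed column.
CREATE TABLE person_gc (
    id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    full_name VARCHAR(101) GENERATED ALWAYS AS
        (CONCAT(first_name, ' ', last_name)),
    INDEX idx_full_name (full_name)
);

-- Explicit use of the computed column: idx_full_name is usable.
EXPLAIN SELECT id FROM person_gc
    WHERE full_name = 'Peter Capaldi';

-- "Implicit" use: the query repeats the expression instead of naming
-- the column. Check the key column of the output to see whether
-- idx_full_name was picked up.
EXPLAIN SELECT id FROM person_gc
    WHERE CONCAT(first_name, ' ', last_name) = 'Peter Capaldi';
```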
50. DEFAULT v. Computed columns
On a computed column, you can also build:
● CHECK constraints
○ Enforce a minimum length for full_name
○ Reject dead users
● UNIQUE indexes
○ REPLACE(last_name, ' ', ''), first_name
51. How to kill MySQL
In this case, it’s hard.
● DEFAULTs are normally lightweight
● The same is true for CHECKs
○ You cannot use a SELECT as a CHECK
● You can still try to make writes slow and fill the disk with computed columns that produce big values
52. How to kill MySQL
For write-intensive workloads:
● UNIQUE may cause many disk reads
● FOREIGN KEYs cause many extra checks
54. Compose JSON values
CREATE TABLE person (
id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
... ,
data JSON NOT NULL DEFAULT (
JSON_OBJECT(
'id', id,
'emails', JSON_ARRAY(email_main, email_emergency),
'full_name', JSON_OBJECT(
'first', first_name,
'last', last_name
)
)
)
);
55. Extract values from JSON
CREATE TABLE person (
id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
email_main VARCHAR(100) NOT NULL
DEFAULT (JSON_UNQUOTE(JSON_EXTRACT(data, '$.emails[0]'))),
email_emergency VARCHAR(100) NOT NULL
DEFAULT (JSON_UNQUOTE(JSON_EXTRACT(data, '$.emails[1]'))),
first_name VARCHAR(50) NOT NULL
DEFAULT (JSON_UNQUOTE(JSON_EXTRACT(data, '$.full_name.first'))),
last_name VARCHAR(50) NOT NULL
DEFAULT (JSON_UNQUOTE(JSON_EXTRACT(data, '$.full_name.last'))),
data JSON NOT NULL ...
);
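The JSON functions can be tried without creating a table. The sketch below builds a document shaped like the one on the previous slide in a user variable (the sample values are made up) and shows why string values are usually unquoted after extraction.

```sql
-- Build a sample document; the values are illustrative.
SET @data = JSON_OBJECT(
    'emails', JSON_ARRAY('main@example.com', 'emergency@example.com'),
    'full_name', JSON_OBJECT('first', 'Peter', 'last', 'Capaldi')
);

-- JSON_EXTRACT returns a JSON value, so a string comes back
-- wrapped in double quotes.
SELECT JSON_EXTRACT(@data, '$.emails[0]');

-- JSON_UNQUOTE strips the quotes and yields a plain string,
-- suitable for storing in a VARCHAR column.
SELECT JSON_UNQUOTE(JSON_EXTRACT(@data, '$.full_name.last'));
```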