SlideShare a Scribd company logo
1 of 17
Download to read offline
Linux IO internals
for database administrators
Ilya Kosmodemiansky
ik@postgresql-consulting.com
Why this talk
• Linux is a most common OS for databases
• DBAs often run into IO problems
• Most of the information on topic is written by kernel
developers (for kernel developers) or is checklist-style
• Checklists are useful, but up to certain workload
The main IO problem for databases
• How to maximize page throughput between memory and
disks
• Things involved:
Disks
Memory
CPU
IO Schedulers
Filesystems
Database itself
• IO problems for databases are not always only about disks
A typical database
DRAM
Disks
Shared memory
Database
Linux
Page cache
User space
Kernel space
WAL buffer
WAL Datafile
Key things about such workload
• Shared memory segment can be very large
• Keeping in-memory pages synchronized with disk generates
huge IO
• WAL should be written fast and safe
• One and every layer of OS IO stack involved
Memory allocation and mapping
CPU L1
MMU TLB
Memory
L2 L3
page
table
Virtual addressing
Translation
Physical addressing
DBAs takeaways:
• Database with huge shared memory segment benefits from
huge pages
• But not from transparent huge pages
Databases operate large continuous shared memory segments
THP defragmentation can lead to severe performance
degradation in such cases
Freeing memory
Page cache
Swap
Free list Free list Free list
alloc()free()
page-out
vm.min_free_kbytes
reclaim
vm.swapiness
OOM-killer
vm.panic_on_oom
Page-out
• Page-out happens if:
someone calls fsync
30 sec timeout exceeded (vm.dirty_expire_centisecs)
Too many dirty pages (vm.dirty_background_ratio and
vm.dirty_ratio)
• It is reasonable to tune vm.dirty_* on rotating disks with
RAID controller, but helps a little on server class SSDs
DBAs takeaways:
• vm.overcommit_memory
0 - heuristic overcommit, reduces swap usage
1 - always overcommit
2 - do not overcommit (vm.overcommit_ratio = 50 by
default)
• vm.min_free_kbytes - reasonably high (can be easy 1000000
on a server with enough memory)
• vm.swappiness = 1
0 - swap disabled
60 - default
100 - swap preffered instead of other reaping mechanisms
• Your database would not like OOM-killer
OOM-killer: plague and cholera
• vm.panic_on_oom effectively disables OOM-killer, but that is
probably not the result you desire
• Or for a certain process: echo -17 > /proc/12465/oom_adj
but again
Filesystems: write barriers
Journal
Kernel buffer
Journalated fylesystem
Data
Journal entry
Data
DBAs takeaways:
• ext4 or xfs
• Disable write barrier (only if SSD/controller cache is protected
by capacitor/battery)
IO stack
shared_buffers
Page cache
VFS
EXT4
Block device interface
Disks
Buffer cache
Elevator/IO Scheduler
Elevators
• noop or none - then disks throughput is so high, that it can
not benefit from keen scheduling
PCIe SSDs
SAN disk arrays
• deadline - rotating disks
• CFQ - universal, default one
Thanks
to my collegues Alexey Lesovsky and Max Boguk for a lot of
research on this topic
Questions?
ik@postgresql-consulting.com

More Related Content

What's hot

Replication Solutions for PostgreSQL
Replication Solutions for PostgreSQLReplication Solutions for PostgreSQL
Replication Solutions for PostgreSQLPeter Eisentraut
 
The Magic of Tuning in PostgreSQL
The Magic of Tuning in PostgreSQLThe Magic of Tuning in PostgreSQL
The Magic of Tuning in PostgreSQLAshnikbiz
 
PostgreSQL Scaling And Failover
PostgreSQL Scaling And FailoverPostgreSQL Scaling And Failover
PostgreSQL Scaling And FailoverJohn Paulett
 
Best Practices of HA and Replication of PostgreSQL in Virtualized Environments
Best Practices of HA and Replication of PostgreSQL in Virtualized EnvironmentsBest Practices of HA and Replication of PostgreSQL in Virtualized Environments
Best Practices of HA and Replication of PostgreSQL in Virtualized EnvironmentsJignesh Shah
 
Postgres on OpenStack
Postgres on OpenStackPostgres on OpenStack
Postgres on OpenStackEDB
 
Tuning Linux for Databases.
Tuning Linux for Databases.Tuning Linux for Databases.
Tuning Linux for Databases.Alexey Lesovsky
 
Postgres in Amazon RDS
Postgres in Amazon RDSPostgres in Amazon RDS
Postgres in Amazon RDSDenish Patel
 
Deploying postgre sql on amazon ec2
Deploying postgre sql on amazon ec2 Deploying postgre sql on amazon ec2
Deploying postgre sql on amazon ec2 Denish Patel
 
Out of the box replication in postgres 9.4
Out of the box replication in postgres 9.4Out of the box replication in postgres 9.4
Out of the box replication in postgres 9.4Denish Patel
 
PostgreSQL9.3 Switchover/Switchback
PostgreSQL9.3 Switchover/SwitchbackPostgreSQL9.3 Switchover/Switchback
PostgreSQL9.3 Switchover/SwitchbackVibhor Kumar
 
Architecture for building scalable and highly available Postgres Cluster
Architecture for building scalable and highly available Postgres ClusterArchitecture for building scalable and highly available Postgres Cluster
Architecture for building scalable and highly available Postgres ClusterAshnikbiz
 
Streaming Replication (Keynote @ PostgreSQL Conference 2009 Japan)
Streaming Replication (Keynote @ PostgreSQL Conference 2009 Japan)Streaming Replication (Keynote @ PostgreSQL Conference 2009 Japan)
Streaming Replication (Keynote @ PostgreSQL Conference 2009 Japan)Masao Fujii
 
Right-Sizing your SQL Server Virtual Machine
Right-Sizing your SQL Server Virtual MachineRight-Sizing your SQL Server Virtual Machine
Right-Sizing your SQL Server Virtual Machineheraflux
 
PostgreSQL WAL for DBAs
PostgreSQL WAL for DBAs PostgreSQL WAL for DBAs
PostgreSQL WAL for DBAs PGConf APAC
 
MySQL Server Backup, Restoration, and Disaster Recovery Planning
MySQL Server Backup, Restoration, and Disaster Recovery PlanningMySQL Server Backup, Restoration, and Disaster Recovery Planning
MySQL Server Backup, Restoration, and Disaster Recovery PlanningLenz Grimmer
 
Tool it Up! - Session #3 - MySQL
Tool it Up! - Session #3 - MySQLTool it Up! - Session #3 - MySQL
Tool it Up! - Session #3 - MySQLtoolitup
 
Countdown to PostgreSQL v9.5 - Foriegn Tables can be part of Inheritance Tree
Countdown to PostgreSQL v9.5 - Foriegn Tables can be part of Inheritance Tree Countdown to PostgreSQL v9.5 - Foriegn Tables can be part of Inheritance Tree
Countdown to PostgreSQL v9.5 - Foriegn Tables can be part of Inheritance Tree Ashnikbiz
 

What's hot (20)

Replication Solutions for PostgreSQL
Replication Solutions for PostgreSQLReplication Solutions for PostgreSQL
Replication Solutions for PostgreSQL
 
The Magic of Tuning in PostgreSQL
The Magic of Tuning in PostgreSQLThe Magic of Tuning in PostgreSQL
The Magic of Tuning in PostgreSQL
 
PostgreSQL Scaling And Failover
PostgreSQL Scaling And FailoverPostgreSQL Scaling And Failover
PostgreSQL Scaling And Failover
 
Best Practices of HA and Replication of PostgreSQL in Virtualized Environments
Best Practices of HA and Replication of PostgreSQL in Virtualized EnvironmentsBest Practices of HA and Replication of PostgreSQL in Virtualized Environments
Best Practices of HA and Replication of PostgreSQL in Virtualized Environments
 
Postgres on OpenStack
Postgres on OpenStackPostgres on OpenStack
Postgres on OpenStack
 
Tuning Linux for Databases.
Tuning Linux for Databases.Tuning Linux for Databases.
Tuning Linux for Databases.
 
Postgres in Amazon RDS
Postgres in Amazon RDSPostgres in Amazon RDS
Postgres in Amazon RDS
 
Really Big Elephants: PostgreSQL DW
Really Big Elephants: PostgreSQL DWReally Big Elephants: PostgreSQL DW
Really Big Elephants: PostgreSQL DW
 
Deploying postgre sql on amazon ec2
Deploying postgre sql on amazon ec2 Deploying postgre sql on amazon ec2
Deploying postgre sql on amazon ec2
 
Shootout at the AWS Corral
Shootout at the AWS CorralShootout at the AWS Corral
Shootout at the AWS Corral
 
Out of the box replication in postgres 9.4
Out of the box replication in postgres 9.4Out of the box replication in postgres 9.4
Out of the box replication in postgres 9.4
 
PostgreSQL9.3 Switchover/Switchback
PostgreSQL9.3 Switchover/SwitchbackPostgreSQL9.3 Switchover/Switchback
PostgreSQL9.3 Switchover/Switchback
 
Architecture for building scalable and highly available Postgres Cluster
Architecture for building scalable and highly available Postgres ClusterArchitecture for building scalable and highly available Postgres Cluster
Architecture for building scalable and highly available Postgres Cluster
 
Streaming Replication (Keynote @ PostgreSQL Conference 2009 Japan)
Streaming Replication (Keynote @ PostgreSQL Conference 2009 Japan)Streaming Replication (Keynote @ PostgreSQL Conference 2009 Japan)
Streaming Replication (Keynote @ PostgreSQL Conference 2009 Japan)
 
Right-Sizing your SQL Server Virtual Machine
Right-Sizing your SQL Server Virtual MachineRight-Sizing your SQL Server Virtual Machine
Right-Sizing your SQL Server Virtual Machine
 
PostgreSQL WAL for DBAs
PostgreSQL WAL for DBAs PostgreSQL WAL for DBAs
PostgreSQL WAL for DBAs
 
MySQL Server Backup, Restoration, and Disaster Recovery Planning
MySQL Server Backup, Restoration, and Disaster Recovery PlanningMySQL Server Backup, Restoration, and Disaster Recovery Planning
MySQL Server Backup, Restoration, and Disaster Recovery Planning
 
Rit 2011 ats
Rit 2011 atsRit 2011 ats
Rit 2011 ats
 
Tool it Up! - Session #3 - MySQL
Tool it Up! - Session #3 - MySQLTool it Up! - Session #3 - MySQL
Tool it Up! - Session #3 - MySQL
 
Countdown to PostgreSQL v9.5 - Foriegn Tables can be part of Inheritance Tree
Countdown to PostgreSQL v9.5 - Foriegn Tables can be part of Inheritance Tree Countdown to PostgreSQL v9.5 - Foriegn Tables can be part of Inheritance Tree
Countdown to PostgreSQL v9.5 - Foriegn Tables can be part of Inheritance Tree
 

Viewers also liked

PostgreSQL worst practices, version FOSDEM PGDay 2017 by Ilya Kosmodemiansky
PostgreSQL worst practices, version FOSDEM PGDay 2017 by Ilya KosmodemianskyPostgreSQL worst practices, version FOSDEM PGDay 2017 by Ilya Kosmodemiansky
PostgreSQL worst practices, version FOSDEM PGDay 2017 by Ilya KosmodemianskyPostgreSQL-Consulting
 
Infrastructure Monitoring with Postgres
Infrastructure Monitoring with PostgresInfrastructure Monitoring with Postgres
Infrastructure Monitoring with PostgresSteven Simpson
 
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015PostgreSQL-Consulting
 
Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna
Autovacuum, explained for engineers, new improved version PGConf.eu 2015 ViennaAutovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna
Autovacuum, explained for engineers, new improved version PGConf.eu 2015 ViennaPostgreSQL-Consulting
 
Как PostgreSQL работает с диском
Как PostgreSQL работает с дискомКак PostgreSQL работает с диском
Как PostgreSQL работает с дискомPostgreSQL-Consulting
 
PostgreSQL Meetup Berlin at Zalando HQ
PostgreSQL Meetup Berlin at Zalando HQPostgreSQL Meetup Berlin at Zalando HQ
PostgreSQL Meetup Berlin at Zalando HQPostgreSQL-Consulting
 
10 things, an Oracle DBA should care about when moving to PostgreSQL
10 things, an Oracle DBA should care about when moving to PostgreSQL10 things, an Oracle DBA should care about when moving to PostgreSQL
10 things, an Oracle DBA should care about when moving to PostgreSQLPostgreSQL-Consulting
 
Streaming replication in practice
Streaming replication in practiceStreaming replication in practice
Streaming replication in practiceAlexey Lesovsky
 
Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Alexey Lesovsky
 
Troubleshooting PostgreSQL Streaming Replication
Troubleshooting PostgreSQL Streaming ReplicationTroubleshooting PostgreSQL Streaming Replication
Troubleshooting PostgreSQL Streaming ReplicationAlexey Lesovsky
 
Как PostgreSQL работает с диском, Илья Космодемьянский (PostgreSQL-Consulting)
Как PostgreSQL работает с диском, Илья Космодемьянский (PostgreSQL-Consulting)Как PostgreSQL работает с диском, Илья Космодемьянский (PostgreSQL-Consulting)
Как PostgreSQL работает с диском, Илья Космодемьянский (PostgreSQL-Consulting)Ontico
 
Максим Богук. Postgres-XC
Максим Богук. Postgres-XCМаксим Богук. Postgres-XC
Максим Богук. Postgres-XCPostgreSQL-Consulting
 
Илья Космодемьянский. Использование очередей асинхронных сообщений с PostgreSQL
Илья Космодемьянский. Использование очередей асинхронных сообщений с PostgreSQLИлья Космодемьянский. Использование очередей асинхронных сообщений с PostgreSQL
Илья Космодемьянский. Использование очередей асинхронных сообщений с PostgreSQLPostgreSQL-Consulting
 
Highload 2014. PostgreSQL: ups, DevOps.
Highload 2014. PostgreSQL: ups, DevOps.Highload 2014. PostgreSQL: ups, DevOps.
Highload 2014. PostgreSQL: ups, DevOps.Alexey Lesovsky
 
#RuPostges в Yandex, эпизод 3. Что же нового в PostgreSQL 9.6
#RuPostges в Yandex, эпизод 3. Что же нового в PostgreSQL 9.6#RuPostges в Yandex, эпизод 3. Что же нового в PostgreSQL 9.6
#RuPostges в Yandex, эпизод 3. Что же нового в PostgreSQL 9.6Nikolay Samokhvalov
 
Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Alexey Lesovsky
 
PostgreSQL Vacuum: Nine Circles of Hell
PostgreSQL Vacuum: Nine Circles of HellPostgreSQL Vacuum: Nine Circles of Hell
PostgreSQL Vacuum: Nine Circles of HellAlexey Lesovsky
 

Viewers also liked (20)

PostgreSQL worst practices, version FOSDEM PGDay 2017 by Ilya Kosmodemiansky
PostgreSQL worst practices, version FOSDEM PGDay 2017 by Ilya KosmodemianskyPostgreSQL worst practices, version FOSDEM PGDay 2017 by Ilya Kosmodemiansky
PostgreSQL worst practices, version FOSDEM PGDay 2017 by Ilya Kosmodemiansky
 
Infrastructure Monitoring with Postgres
Infrastructure Monitoring with PostgresInfrastructure Monitoring with Postgres
Infrastructure Monitoring with Postgres
 
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015
 
Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna
Autovacuum, explained for engineers, new improved version PGConf.eu 2015 ViennaAutovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna
Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna
 
Как PostgreSQL работает с диском
Как PostgreSQL работает с дискомКак PostgreSQL работает с диском
Как PostgreSQL работает с диском
 
PostgreSQL Meetup Berlin at Zalando HQ
PostgreSQL Meetup Berlin at Zalando HQPostgreSQL Meetup Berlin at Zalando HQ
PostgreSQL Meetup Berlin at Zalando HQ
 
10 things, an Oracle DBA should care about when moving to PostgreSQL
10 things, an Oracle DBA should care about when moving to PostgreSQL10 things, an Oracle DBA should care about when moving to PostgreSQL
10 things, an Oracle DBA should care about when moving to PostgreSQL
 
Streaming replication in practice
Streaming replication in practiceStreaming replication in practice
Streaming replication in practice
 
Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.
 
Troubleshooting PostgreSQL Streaming Replication
Troubleshooting PostgreSQL Streaming ReplicationTroubleshooting PostgreSQL Streaming Replication
Troubleshooting PostgreSQL Streaming Replication
 
Как PostgreSQL работает с диском, Илья Космодемьянский (PostgreSQL-Consulting)
Как PostgreSQL работает с диском, Илья Космодемьянский (PostgreSQL-Consulting)Как PostgreSQL работает с диском, Илья Космодемьянский (PostgreSQL-Consulting)
Как PostgreSQL работает с диском, Илья Космодемьянский (PostgreSQL-Consulting)
 
Иван Фролков. Tricky SQL
Иван Фролков. Tricky SQLИван Фролков. Tricky SQL
Иван Фролков. Tricky SQL
 
Максим Богук. Postgres-XC
Максим Богук. Postgres-XCМаксим Богук. Postgres-XC
Максим Богук. Postgres-XC
 
Илья Космодемьянский. Использование очередей асинхронных сообщений с PostgreSQL
Илья Космодемьянский. Использование очередей асинхронных сообщений с PostgreSQLИлья Космодемьянский. Использование очередей асинхронных сообщений с PostgreSQL
Илья Космодемьянский. Использование очередей асинхронных сообщений с PostgreSQL
 
Kosmodemiansky wr 2013
Kosmodemiansky wr 2013Kosmodemiansky wr 2013
Kosmodemiansky wr 2013
 
Highload 2014. PostgreSQL: ups, DevOps.
Highload 2014. PostgreSQL: ups, DevOps.Highload 2014. PostgreSQL: ups, DevOps.
Highload 2014. PostgreSQL: ups, DevOps.
 
#RuPostges в Yandex, эпизод 3. Что же нового в PostgreSQL 9.6
#RuPostges в Yandex, эпизод 3. Что же нового в PostgreSQL 9.6#RuPostges в Yandex, эпизод 3. Что же нового в PostgreSQL 9.6
#RuPostges в Yandex, эпизод 3. Что же нового в PostgreSQL 9.6
 
Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.
 
Pgconfru 2015 kosmodemiansky
Pgconfru 2015 kosmodemianskyPgconfru 2015 kosmodemiansky
Pgconfru 2015 kosmodemiansky
 
PostgreSQL Vacuum: Nine Circles of Hell
PostgreSQL Vacuum: Nine Circles of HellPostgreSQL Vacuum: Nine Circles of Hell
PostgreSQL Vacuum: Nine Circles of Hell
 

Similar to Linux internals for Database administrators at Linux Piter 2016

Linux IO internals for database administrators (SCaLE 2017 and PGDay Nordic 2...
Linux IO internals for database administrators (SCaLE 2017 and PGDay Nordic 2...Linux IO internals for database administrators (SCaLE 2017 and PGDay Nordic 2...
Linux IO internals for database administrators (SCaLE 2017 and PGDay Nordic 2...PostgreSQL-Consulting
 
PGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar Ahmed
PGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar AhmedPGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar Ahmed
PGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar AhmedEqunix Business Solutions
 
Tuning Linux for your database FLOSSUK 2016
Tuning Linux for your database FLOSSUK 2016Tuning Linux for your database FLOSSUK 2016
Tuning Linux for your database FLOSSUK 2016Colin Charles
 
VMworld 2014: Advanced SQL Server on vSphere Techniques and Best Practices
VMworld 2014: Advanced SQL Server on vSphere Techniques and Best PracticesVMworld 2014: Advanced SQL Server on vSphere Techniques and Best Practices
VMworld 2014: Advanced SQL Server on vSphere Techniques and Best PracticesVMworld
 
Oracle Performance On Linux X86 systems
Oracle  Performance On Linux  X86 systems Oracle  Performance On Linux  X86 systems
Oracle Performance On Linux X86 systems Baruch Osoveskiy
 
Deployment Strategies
Deployment StrategiesDeployment Strategies
Deployment StrategiesMongoDB
 
VMworld 2013: Just Because You Could, Doesn't Mean You Should: Lessons Learne...
VMworld 2013: Just Because You Could, Doesn't Mean You Should: Lessons Learne...VMworld 2013: Just Because You Could, Doesn't Mean You Should: Lessons Learne...
VMworld 2013: Just Because You Could, Doesn't Mean You Should: Lessons Learne...VMworld
 
Deployment Strategy
Deployment StrategyDeployment Strategy
Deployment StrategyMongoDB
 
Databases love nutanix
Databases love nutanixDatabases love nutanix
Databases love nutanixNEXTtour
 
Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)MongoDB
 
Storage and performance- Batch processing, Whiptail
Storage and performance- Batch processing, WhiptailStorage and performance- Batch processing, Whiptail
Storage and performance- Batch processing, WhiptailInternet World
 
VDI storage and storage virtualization
VDI storage and storage virtualizationVDI storage and storage virtualization
VDI storage and storage virtualizationSisimon Soman
 
XPDDS18: NVDIMM Overview - George Dunlap, Citrix
XPDDS18: NVDIMM Overview - George Dunlap, Citrix XPDDS18: NVDIMM Overview - George Dunlap, Citrix
XPDDS18: NVDIMM Overview - George Dunlap, Citrix The Linux Foundation
 
2015 deploying flash in the data center
2015 deploying flash in the data center2015 deploying flash in the data center
2015 deploying flash in the data centerHoward Marks
 
MySQL Performance Tuning London Meetup June 2017
MySQL Performance Tuning London Meetup June 2017MySQL Performance Tuning London Meetup June 2017
MySQL Performance Tuning London Meetup June 2017Ivan Zoratti
 
2015 deploying flash in the data center
2015 deploying flash in the data center2015 deploying flash in the data center
2015 deploying flash in the data centerHoward Marks
 
OSSEU18: NVDIMM and Virtualization - George Dunlap, Citrix
OSSEU18: NVDIMM and Virtualization  - George Dunlap, CitrixOSSEU18: NVDIMM and Virtualization  - George Dunlap, Citrix
OSSEU18: NVDIMM and Virtualization - George Dunlap, CitrixThe Linux Foundation
 

Similar to Linux internals for Database administrators at Linux Piter 2016 (20)

Linux IO internals for database administrators (SCaLE 2017 and PGDay Nordic 2...
Linux IO internals for database administrators (SCaLE 2017 and PGDay Nordic 2...Linux IO internals for database administrators (SCaLE 2017 and PGDay Nordic 2...
Linux IO internals for database administrators (SCaLE 2017 and PGDay Nordic 2...
 
Running MySQL on Linux
Running MySQL on LinuxRunning MySQL on Linux
Running MySQL on Linux
 
PGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar Ahmed
PGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar AhmedPGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar Ahmed
PGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar Ahmed
 
Tuning Linux for your database FLOSSUK 2016
Tuning Linux for your database FLOSSUK 2016Tuning Linux for your database FLOSSUK 2016
Tuning Linux for your database FLOSSUK 2016
 
VMworld 2014: Advanced SQL Server on vSphere Techniques and Best Practices
VMworld 2014: Advanced SQL Server on vSphere Techniques and Best PracticesVMworld 2014: Advanced SQL Server on vSphere Techniques and Best Practices
VMworld 2014: Advanced SQL Server on vSphere Techniques and Best Practices
 
Oracle Performance On Linux X86 systems
Oracle  Performance On Linux  X86 systems Oracle  Performance On Linux  X86 systems
Oracle Performance On Linux X86 systems
 
Deployment Strategies
Deployment StrategiesDeployment Strategies
Deployment Strategies
 
VMworld 2013: Just Because You Could, Doesn't Mean You Should: Lessons Learne...
VMworld 2013: Just Because You Could, Doesn't Mean You Should: Lessons Learne...VMworld 2013: Just Because You Could, Doesn't Mean You Should: Lessons Learne...
VMworld 2013: Just Because You Could, Doesn't Mean You Should: Lessons Learne...
 
Deployment Strategy
Deployment StrategyDeployment Strategy
Deployment Strategy
 
Databases love nutanix
Databases love nutanixDatabases love nutanix
Databases love nutanix
 
Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)
 
Storage and performance- Batch processing, Whiptail
Storage and performance- Batch processing, WhiptailStorage and performance- Batch processing, Whiptail
Storage and performance- Batch processing, Whiptail
 
VDI storage and storage virtualization
VDI storage and storage virtualizationVDI storage and storage virtualization
VDI storage and storage virtualization
 
XPDDS18: NVDIMM Overview - George Dunlap, Citrix
XPDDS18: NVDIMM Overview - George Dunlap, Citrix XPDDS18: NVDIMM Overview - George Dunlap, Citrix
XPDDS18: NVDIMM Overview - George Dunlap, Citrix
 
CPU Caches
CPU CachesCPU Caches
CPU Caches
 
2015 deploying flash in the data center
2015 deploying flash in the data center2015 deploying flash in the data center
2015 deploying flash in the data center
 
MySQL Performance Tuning London Meetup June 2017
MySQL Performance Tuning London Meetup June 2017MySQL Performance Tuning London Meetup June 2017
MySQL Performance Tuning London Meetup June 2017
 
2015 deploying flash in the data center
2015 deploying flash in the data center2015 deploying flash in the data center
2015 deploying flash in the data center
 
OSSEU18: NVDIMM and Virtualization - George Dunlap, Citrix
OSSEU18: NVDIMM and Virtualization  - George Dunlap, CitrixOSSEU18: NVDIMM and Virtualization  - George Dunlap, Citrix
OSSEU18: NVDIMM and Virtualization - George Dunlap, Citrix
 
Deployment
DeploymentDeployment
Deployment
 

Recently uploaded

DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 

Recently uploaded (20)

DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 

Linux internals for Database administrators at Linux Piter 2016

  • 1. Linux IO internals for database administrators Ilya Kosmodemiansky ik@postgresql-consulting.com
  • 2. Why this talk • Linux is a most common OS for databases • DBAs often run into IO problems • Most of the information on topic is written by kernel developers (for kernel developers) or is checklist-style • Checklists are useful, but up to certain workload
  • 3. The main IO problem for databases • How to maximize page throughput between memory and disks • Things involved: Disks Memory CPU IO Schedulers Filesystems Database itself • IO problems for databases are not always only about disks
  • 4. A typical database DRAM Disks Shared memory Database Linux Page cache User space Kernel space WAL buffer WAL Datafile
  • 5. Key things about such workload • Shared memory segment can be very large • Keeping in-memory pages synchronized with disk generates huge IO • WAL should be written fast and safe • One and every layer of OS IO stack involved
  • 6. Memory allocation and mapping CPU L1 MMU TLB Memory L2 L3 page table Virtual addressing Translation Physical addressing
  • 7. DBAs takeaways: • Database with huge shared memory segment benefits from huge pages • But not from transparent huge pages Databases operate large continuous shared memory segments THP defragmentation can lead to severe performance degradation in such cases
  • 8. Freeing memory Page cache Swap Free list Free list Free list alloc()free() page-out vm.min_free_kbytes reclaim vm.swapiness OOM-killer vm.panic_on_oom
  • 9. Page-out • Page-out happens if: someone calls fsync 30 sec timeout exceeded (vm.dirty_expire_centisecs) Too many dirty pages (vm.dirty_background_ratio and vm.dirty_ratio) • It is reasonable to tune vm.dirty_* on rotating disks with RAID controller, but helps a little on server class SSDs
  • 10. DBAs takeaways: • vm.overcommit_memory 0 - heuristic overcommit, reduces swap usage 1 - always overcommit 2 - do not overcommit (vm.overcommit_ratio = 50 by default) • vm.min_free_kbytes - reasonably high (can be easy 1000000 on a server with enough memory) • vm.swappiness = 1 0 - swap disabled 60 - default 100 - swap preffered instead of other reaping mechanisms • Your database would not like OOM-killer
  • 11. OOM-killer: plague and cholera • vm.panic_on_oom effectively disables OOM-killer, but that is probably not the result you desire • Or for a certain process: echo -17 > /proc/12465/oom_adj but again
  • 12. Filesystems: write barriers Journal Kernel buffer Journalated fylesystem Data Journal entry Data
  • 13. DBAs takeaways: • ext4 or xfs • Disable write barrier (only if SSD/controller cache is protected by capacitor/battery)
  • 14. IO stack shared_buffers Page cache VFS EXT4 Block device interface Disks Buffer cache Elevator/IO Scheduler
  • 15. Elevators • noop or none - then disks throughput is so high, that it can not benefit from keen scheduling PCIe SSDs SAN disk arrays • deadline - rotating disks • CFQ - universal, default one
  • 16. Thanks to my collegues Alexey Lesovsky and Max Boguk for a lot of research on this topic