SlideShare a Scribd company logo
1 of 26
What Every Data Programmer  Needs to Know about Disks OSCON Data – July, 2011 - Portland Ted Dziuba @dozba tjdziuba@gmail.com Not proprietary or confidential. In fact, you’re risking a career by listening to me.
Who are you and why are you talking? First job: Like college but they pay you to go. A few years ago: Technical troll for The Register. Recently: Co-founder of Milo.com, local shopping engine. Present: Senior Technical Staff for eBay Local
The Linux Disk Abstraction Volume /mnt/volume File System xfs, ext Block Device HDD, HW RAID array
What happens when you read from a file? f = open(“/home/ted/not_pirated_movie.avi”, “rb”) avi_header = f.read(56) f.close() Disk controller user buffer page cache platter
What happens when you read from a file? Disk controller user buffer page cache ,[object Object]
Latency: 100 nanoseconds
Throughput: 12GB/sec on good hardwareplatter
What happens when you read from a file? Disk controller user buffer page cache ,[object Object]
Latency: 10 milliseconds
Throughput: 768 MB/sec on SATA 3
(Faster if you have a lot of money)platter
Sidebar: The Horror of a 10ms Seek Latency A disk read is 100,000 times slower than a memory read. 100 nanoseconds Time it takes you to write a really clever tweet 10 milliseconds Time it takes to write a novel, working full time
What happens when you write to a file? f = open(“/home/ted/nosql_database.csv”, “wb”) f.write(key) f.write(“,”) f.write(value) f.close() Disk controller user buffer page cache platter
What happens when you write to a file? f = open(“/home/ted/nosql_database.csv”, “wb”) f.write(key) f.write(“,”) f.write(value) f.close() Disk controller user buffer page cache platter You need to make this part happen Mark the page dirty, call it a day and go have a smoke.
Aside: Stick your finger in the Linux Page Cache Pre-Linux 2.6 used “pdflush”, now per-Backing Device Info (BDI) flush threads Dirty pages: grep –i “dirty” /proc/meminfo /proc/sys/vmLove: ,[object Object]
dirty_ratio: flush after some percent of memory is used
dirty_writeback_centisecs: how often to wake up and start flushingClear your page cache: echo 1 > /proc/sys/vm/drop_caches Crusty sysadmin’s hail-Mary pass: sync; sync; sync
Fsync: force a flush to disk f = open(“/home/ted/nosql_database.csv”, “wb”) f.write(key) f.write(“,”) f.write(value) os.fsync(f.fileno()) f.close() Disk controller user buffer page cache platter Also note, fsync() has a cousin, fdatasync() that does not sync metadata.
Aside: point and laugh at MongoDB Mongo’s “fsync” command: > db.runCommand({fsync:1,async:true});  wat. Also supports “journaling”, like a WAL in the SQL world, however… ,[object Object]
It’s not enabled by default.,[object Object]
Fsync: bitter lies platter …it’s a cache! Disk controller page cache ,[object Object]
Writeback is the demon,[object Object]
(Just dropped in) to see what condition your caches are in A Good Server platter Writethrough cache on controller Writethrough cache on disk Disk controller
(Just dropped in) to see what condition your caches are in An Even Better Server platter Battery-backedwriteback cache on controller Writethrough cache on disk Disk controller
(Just dropped in) to see what condition your caches are in The Demon Setup platter Battery-backed writeback cache or Writethrough cache Writeback cache on disk Disk controller
Disks in a virtual environment The Trail of Tears to the Platter Host page cache Virtual controller user buffer page cache Physical controller Hypervisor platter

More Related Content

What's hot

OSS についてあれこれ
OSS についてあれこれOSS についてあれこれ
OSS についてあれこれTakuto Wada
 
スケールアウトするPostgreSQLを目指して!その第一歩!(NTTデータ テクノロジーカンファレンス 2020 発表資料)
スケールアウトするPostgreSQLを目指して!その第一歩!(NTTデータ テクノロジーカンファレンス 2020 発表資料)スケールアウトするPostgreSQLを目指して!その第一歩!(NTTデータ テクノロジーカンファレンス 2020 発表資料)
スケールアウトするPostgreSQLを目指して!その第一歩!(NTTデータ テクノロジーカンファレンス 2020 発表資料)NTT DATA Technology & Innovation
 
kube-system落としてみました
kube-system落としてみましたkube-system落としてみました
kube-system落としてみましたShuntaro Saiba
 
Zaim 500万ユーザに向けて〜Aurora 編〜
Zaim 500万ユーザに向けて〜Aurora 編〜Zaim 500万ユーザに向けて〜Aurora 編〜
Zaim 500万ユーザに向けて〜Aurora 編〜Wataru Nishimoto
 
Solaris ディープダイブセミナー #4: A-3 障害管理アーキテクチャー Solaris Fault Manager
 Solaris ディープダイブセミナー #4: A-3 障害管理アーキテクチャー Solaris Fault Manager Solaris ディープダイブセミナー #4: A-3 障害管理アーキテクチャー Solaris Fault Manager
Solaris ディープダイブセミナー #4: A-3 障害管理アーキテクチャー Solaris Fault ManagerSolarisJP
 
その Pod 突然落ちても大丈夫ですか!?(OCHaCafe5 #5 実験!カオスエンジニアリング 発表資料)
その Pod 突然落ちても大丈夫ですか!?(OCHaCafe5 #5 実験!カオスエンジニアリング 発表資料)その Pod 突然落ちても大丈夫ですか!?(OCHaCafe5 #5 実験!カオスエンジニアリング 発表資料)
その Pod 突然落ちても大丈夫ですか!?(OCHaCafe5 #5 実験!カオスエンジニアリング 発表資料)NTT DATA Technology & Innovation
 
3種類のTEE比較(Intel SGX, ARM TrustZone, RISC-V Keystone)
3種類のTEE比較(Intel SGX, ARM TrustZone, RISC-V Keystone)3種類のTEE比較(Intel SGX, ARM TrustZone, RISC-V Keystone)
3種類のTEE比較(Intel SGX, ARM TrustZone, RISC-V Keystone)Kuniyasu Suzaki
 
Database Security for PCI DSS
Database Security for PCI DSSDatabase Security for PCI DSS
Database Security for PCI DSSOhyama Masanori
 
The InnoDB Storage Engine for MySQL
The InnoDB Storage Engine for MySQLThe InnoDB Storage Engine for MySQL
The InnoDB Storage Engine for MySQLMorgan Tocker
 
ソーシャルゲームの為のデータベース設計
ソーシャルゲームの為のデータベース設計ソーシャルゲームの為のデータベース設計
ソーシャルゲームの為のデータベース設計kaminashi
 
不揮発メモリ(NVDIMM)とLinuxの対応動向について
不揮発メモリ(NVDIMM)とLinuxの対応動向について不揮発メモリ(NVDIMM)とLinuxの対応動向について
不揮発メモリ(NVDIMM)とLinuxの対応動向についてYasunori Goto
 
C/C++プログラマのための開発ツール
C/C++プログラマのための開発ツールC/C++プログラマのための開発ツール
C/C++プログラマのための開発ツールMITSUNARI Shigeo
 
Kubernetes ネットワーキングのすべて
Kubernetes ネットワーキングのすべてKubernetes ネットワーキングのすべて
Kubernetes ネットワーキングのすべてLINE Corporation
 
Streaming Millions of Contact Center Interactions in (Near) Real-Time with Pu...
Streaming Millions of Contact Center Interactions in (Near) Real-Time with Pu...Streaming Millions of Contact Center Interactions in (Near) Real-Time with Pu...
Streaming Millions of Contact Center Interactions in (Near) Real-Time with Pu...StreamNative
 
Ruby on Rails の特徴とそのエコシステム
Ruby on Rails の特徴とそのエコシステムRuby on Rails の特徴とそのエコシステム
Ruby on Rails の特徴とそのエコシステムTomoya Kawanishi
 
イベント駆動プログラミングとI/O多重化
イベント駆動プログラミングとI/O多重化イベント駆動プログラミングとI/O多重化
イベント駆動プログラミングとI/O多重化Gosuke Miyashita
 
なかったらINSERTしたいし、あるならロック取りたいやん?
なかったらINSERTしたいし、あるならロック取りたいやん?なかったらINSERTしたいし、あるならロック取りたいやん?
なかったらINSERTしたいし、あるならロック取りたいやん?ichirin2501
 

What's hot (20)

OSS についてあれこれ
OSS についてあれこれOSS についてあれこれ
OSS についてあれこれ
 
スケールアウトするPostgreSQLを目指して!その第一歩!(NTTデータ テクノロジーカンファレンス 2020 発表資料)
スケールアウトするPostgreSQLを目指して!その第一歩!(NTTデータ テクノロジーカンファレンス 2020 発表資料)スケールアウトするPostgreSQLを目指して!その第一歩!(NTTデータ テクノロジーカンファレンス 2020 発表資料)
スケールアウトするPostgreSQLを目指して!その第一歩!(NTTデータ テクノロジーカンファレンス 2020 発表資料)
 
kube-system落としてみました
kube-system落としてみましたkube-system落としてみました
kube-system落としてみました
 
Zaim 500万ユーザに向けて〜Aurora 編〜
Zaim 500万ユーザに向けて〜Aurora 編〜Zaim 500万ユーザに向けて〜Aurora 編〜
Zaim 500万ユーザに向けて〜Aurora 編〜
 
Solaris ディープダイブセミナー #4: A-3 障害管理アーキテクチャー Solaris Fault Manager
 Solaris ディープダイブセミナー #4: A-3 障害管理アーキテクチャー Solaris Fault Manager Solaris ディープダイブセミナー #4: A-3 障害管理アーキテクチャー Solaris Fault Manager
Solaris ディープダイブセミナー #4: A-3 障害管理アーキテクチャー Solaris Fault Manager
 
Fig 9-02
Fig 9-02Fig 9-02
Fig 9-02
 
その Pod 突然落ちても大丈夫ですか!?(OCHaCafe5 #5 実験!カオスエンジニアリング 発表資料)
その Pod 突然落ちても大丈夫ですか!?(OCHaCafe5 #5 実験!カオスエンジニアリング 発表資料)その Pod 突然落ちても大丈夫ですか!?(OCHaCafe5 #5 実験!カオスエンジニアリング 発表資料)
その Pod 突然落ちても大丈夫ですか!?(OCHaCafe5 #5 実験!カオスエンジニアリング 発表資料)
 
3種類のTEE比較(Intel SGX, ARM TrustZone, RISC-V Keystone)
3種類のTEE比較(Intel SGX, ARM TrustZone, RISC-V Keystone)3種類のTEE比較(Intel SGX, ARM TrustZone, RISC-V Keystone)
3種類のTEE比較(Intel SGX, ARM TrustZone, RISC-V Keystone)
 
Database Security for PCI DSS
Database Security for PCI DSSDatabase Security for PCI DSS
Database Security for PCI DSS
 
The InnoDB Storage Engine for MySQL
The InnoDB Storage Engine for MySQLThe InnoDB Storage Engine for MySQL
The InnoDB Storage Engine for MySQL
 
ソーシャルゲームの為のデータベース設計
ソーシャルゲームの為のデータベース設計ソーシャルゲームの為のデータベース設計
ソーシャルゲームの為のデータベース設計
 
不揮発メモリ(NVDIMM)とLinuxの対応動向について
不揮発メモリ(NVDIMM)とLinuxの対応動向について不揮発メモリ(NVDIMM)とLinuxの対応動向について
不揮発メモリ(NVDIMM)とLinuxの対応動向について
 
いつやるの?Git入門
いつやるの?Git入門いつやるの?Git入門
いつやるの?Git入門
 
C/C++プログラマのための開発ツール
C/C++プログラマのための開発ツールC/C++プログラマのための開発ツール
C/C++プログラマのための開発ツール
 
Kubernetes ネットワーキングのすべて
Kubernetes ネットワーキングのすべてKubernetes ネットワーキングのすべて
Kubernetes ネットワーキングのすべて
 
Streaming Millions of Contact Center Interactions in (Near) Real-Time with Pu...
Streaming Millions of Contact Center Interactions in (Near) Real-Time with Pu...Streaming Millions of Contact Center Interactions in (Near) Real-Time with Pu...
Streaming Millions of Contact Center Interactions in (Near) Real-Time with Pu...
 
Ruby on Rails の特徴とそのエコシステム
Ruby on Rails の特徴とそのエコシステムRuby on Rails の特徴とそのエコシステム
Ruby on Rails の特徴とそのエコシステム
 
イベント駆動プログラミングとI/O多重化
イベント駆動プログラミングとI/O多重化イベント駆動プログラミングとI/O多重化
イベント駆動プログラミングとI/O多重化
 
なかったらINSERTしたいし、あるならロック取りたいやん?
なかったらINSERTしたいし、あるならロック取りたいやん?なかったらINSERTしたいし、あるならロック取りたいやん?
なかったらINSERTしたいし、あるならロック取りたいやん?
 
Lockfree Queue
Lockfree QueueLockfree Queue
Lockfree Queue
 

Similar to What every data programmer needs to know about disks

Memory management in Linux
Memory management in LinuxMemory management in Linux
Memory management in LinuxRaghu Udiyar
 
Let Me Pick Your Brain - Remote Forensics in Hardened Environments
Let Me Pick Your Brain - Remote Forensics in Hardened EnvironmentsLet Me Pick Your Brain - Remote Forensics in Hardened Environments
Let Me Pick Your Brain - Remote Forensics in Hardened EnvironmentsNicolas Collery
 
Logical volume manager xfs
Logical volume manager xfsLogical volume manager xfs
Logical volume manager xfsSarwar Javaid
 
Data hiding and finding on Linux
Data hiding and finding on LinuxData hiding and finding on Linux
Data hiding and finding on LinuxAnton Chuvakin
 
Tuning Linux for Databases.
Tuning Linux for Databases.Tuning Linux for Databases.
Tuning Linux for Databases.Alexey Lesovsky
 
Алексей Лесовский "Тюнинг Linux для баз данных. "
Алексей Лесовский "Тюнинг Linux для баз данных. "Алексей Лесовский "Тюнинг Linux для баз данных. "
Алексей Лесовский "Тюнинг Linux для баз данных. "Tanya Denisyuk
 
Live memory forensics
Live memory forensicsLive memory forensics
Live memory forensicsMehedi Hasan
 
Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)MongoDB
 
MySQL Oslayer performace optimization
MySQL  Oslayer performace optimizationMySQL  Oslayer performace optimization
MySQL Oslayer performace optimizationLouis liu
 
Hadoop architecture (Delhi Hadoop User Group Meetup 10 Sep 2011)
Hadoop architecture (Delhi Hadoop User Group Meetup 10 Sep 2011)Hadoop architecture (Delhi Hadoop User Group Meetup 10 Sep 2011)
Hadoop architecture (Delhi Hadoop User Group Meetup 10 Sep 2011)Hari Shankar Sreekumar
 
Bringing up Android on your favorite X86 Workstation or VM (AnDevCon Boston, ...
Bringing up Android on your favorite X86 Workstation or VM (AnDevCon Boston, ...Bringing up Android on your favorite X86 Workstation or VM (AnDevCon Boston, ...
Bringing up Android on your favorite X86 Workstation or VM (AnDevCon Boston, ...Ron Munitz
 
Recipe of a linux Live CD (archived)
Recipe of a linux Live CD (archived)Recipe of a linux Live CD (archived)
Recipe of a linux Live CD (archived)Bud Siddhisena
 
Unix fundamentals
Unix fundamentalsUnix fundamentals
Unix fundamentalsBimal Jain
 

Similar to What every data programmer needs to know about disks (20)

Linux Recovery
Linux RecoveryLinux Recovery
Linux Recovery
 
Memory management in Linux
Memory management in LinuxMemory management in Linux
Memory management in Linux
 
Let Me Pick Your Brain - Remote Forensics in Hardened Environments
Let Me Pick Your Brain - Remote Forensics in Hardened EnvironmentsLet Me Pick Your Brain - Remote Forensics in Hardened Environments
Let Me Pick Your Brain - Remote Forensics in Hardened Environments
 
Vmfs
VmfsVmfs
Vmfs
 
Logical volume manager xfs
Logical volume manager xfsLogical volume manager xfs
Logical volume manager xfs
 
Hardware
HardwareHardware
Hardware
 
Data hiding and finding on Linux
Data hiding and finding on LinuxData hiding and finding on Linux
Data hiding and finding on Linux
 
Tuning Linux for Databases.
Tuning Linux for Databases.Tuning Linux for Databases.
Tuning Linux for Databases.
 
Алексей Лесовский "Тюнинг Linux для баз данных. "
Алексей Лесовский "Тюнинг Linux для баз данных. "Алексей Лесовский "Тюнинг Linux для баз данных. "
Алексей Лесовский "Тюнинг Linux для баз данных. "
 
Live memory forensics
Live memory forensicsLive memory forensics
Live memory forensics
 
Clear cache memory
Clear cache memoryClear cache memory
Clear cache memory
 
Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)Deployment Strategies (Mongo Austin)
Deployment Strategies (Mongo Austin)
 
MySQL Oslayer performace optimization
MySQL  Oslayer performace optimizationMySQL  Oslayer performace optimization
MySQL Oslayer performace optimization
 
Hadoop architecture (Delhi Hadoop User Group Meetup 10 Sep 2011)
Hadoop architecture (Delhi Hadoop User Group Meetup 10 Sep 2011)Hadoop architecture (Delhi Hadoop User Group Meetup 10 Sep 2011)
Hadoop architecture (Delhi Hadoop User Group Meetup 10 Sep 2011)
 
FreeBSD Portscamp, Kuala Lumpur 2016
FreeBSD Portscamp, Kuala Lumpur 2016FreeBSD Portscamp, Kuala Lumpur 2016
FreeBSD Portscamp, Kuala Lumpur 2016
 
Bringing up Android on your favorite X86 Workstation or VM (AnDevCon Boston, ...
Bringing up Android on your favorite X86 Workstation or VM (AnDevCon Boston, ...Bringing up Android on your favorite X86 Workstation or VM (AnDevCon Boston, ...
Bringing up Android on your favorite X86 Workstation or VM (AnDevCon Boston, ...
 
Recipe of a linux Live CD (archived)
Recipe of a linux Live CD (archived)Recipe of a linux Live CD (archived)
Recipe of a linux Live CD (archived)
 
Hadoop Architecture
Hadoop ArchitectureHadoop Architecture
Hadoop Architecture
 
Linux filesystemhierarchy
Linux filesystemhierarchyLinux filesystemhierarchy
Linux filesystemhierarchy
 
Unix fundamentals
Unix fundamentalsUnix fundamentals
Unix fundamentals
 

More from iammutex

Scaling Instagram
Scaling InstagramScaling Instagram
Scaling Instagramiammutex
 
Redis深入浅出
Redis深入浅出Redis深入浅出
Redis深入浅出iammutex
 
深入了解Redis
深入了解Redis深入了解Redis
深入了解Redisiammutex
 
NoSQL误用和常见陷阱分析
NoSQL误用和常见陷阱分析NoSQL误用和常见陷阱分析
NoSQL误用和常见陷阱分析iammutex
 
MongoDB 在盛大大数据量下的应用
MongoDB 在盛大大数据量下的应用MongoDB 在盛大大数据量下的应用
MongoDB 在盛大大数据量下的应用iammutex
 
8 minute MongoDB tutorial slide
8 minute MongoDB tutorial slide8 minute MongoDB tutorial slide
8 minute MongoDB tutorial slideiammutex
 
Thoughts on Transaction and Consistency Models
Thoughts on Transaction and Consistency ModelsThoughts on Transaction and Consistency Models
Thoughts on Transaction and Consistency Modelsiammutex
 
Rethink db&tokudb调研测试报告
Rethink db&tokudb调研测试报告Rethink db&tokudb调研测试报告
Rethink db&tokudb调研测试报告iammutex
 
redis 适用场景与实现
redis 适用场景与实现redis 适用场景与实现
redis 适用场景与实现iammutex
 
Introduction to couchdb
Introduction to couchdbIntroduction to couchdb
Introduction to couchdbiammutex
 
redis运维之道
redis运维之道redis运维之道
redis运维之道iammutex
 
Realtime hadoopsigmod2011
Realtime hadoopsigmod2011Realtime hadoopsigmod2011
Realtime hadoopsigmod2011iammutex
 
[译]No sql生态系统
[译]No sql生态系统[译]No sql生态系统
[译]No sql生态系统iammutex
 
Couchdb + Membase = Couchbase
Couchdb + Membase = CouchbaseCouchdb + Membase = Couchbase
Couchdb + Membase = Couchbaseiammutex
 
Redis cluster
Redis clusterRedis cluster
Redis clusteriammutex
 
Redis cluster
Redis clusterRedis cluster
Redis clusteriammutex
 
Hadoop introduction berlin buzzwords 2011
Hadoop introduction   berlin buzzwords 2011Hadoop introduction   berlin buzzwords 2011
Hadoop introduction berlin buzzwords 2011iammutex
 

More from iammutex (20)

Scaling Instagram
Scaling InstagramScaling Instagram
Scaling Instagram
 
Redis深入浅出
Redis深入浅出Redis深入浅出
Redis深入浅出
 
深入了解Redis
深入了解Redis深入了解Redis
深入了解Redis
 
NoSQL误用和常见陷阱分析
NoSQL误用和常见陷阱分析NoSQL误用和常见陷阱分析
NoSQL误用和常见陷阱分析
 
MongoDB 在盛大大数据量下的应用
MongoDB 在盛大大数据量下的应用MongoDB 在盛大大数据量下的应用
MongoDB 在盛大大数据量下的应用
 
8 minute MongoDB tutorial slide
8 minute MongoDB tutorial slide8 minute MongoDB tutorial slide
8 minute MongoDB tutorial slide
 
skip list
skip listskip list
skip list
 
Thoughts on Transaction and Consistency Models
Thoughts on Transaction and Consistency ModelsThoughts on Transaction and Consistency Models
Thoughts on Transaction and Consistency Models
 
Rethink db&tokudb调研测试报告
Rethink db&tokudb调研测试报告Rethink db&tokudb调研测试报告
Rethink db&tokudb调研测试报告
 
redis 适用场景与实现
redis 适用场景与实现redis 适用场景与实现
redis 适用场景与实现
 
Introduction to couchdb
Introduction to couchdbIntroduction to couchdb
Introduction to couchdb
 
Ooredis
OoredisOoredis
Ooredis
 
Ooredis
OoredisOoredis
Ooredis
 
redis运维之道
redis运维之道redis运维之道
redis运维之道
 
Realtime hadoopsigmod2011
Realtime hadoopsigmod2011Realtime hadoopsigmod2011
Realtime hadoopsigmod2011
 
[译]No sql生态系统
[译]No sql生态系统[译]No sql生态系统
[译]No sql生态系统
 
Couchdb + Membase = Couchbase
Couchdb + Membase = CouchbaseCouchdb + Membase = Couchbase
Couchdb + Membase = Couchbase
 
Redis cluster
Redis clusterRedis cluster
Redis cluster
 
Redis cluster
Redis clusterRedis cluster
Redis cluster
 
Hadoop introduction berlin buzzwords 2011
Hadoop introduction   berlin buzzwords 2011Hadoop introduction   berlin buzzwords 2011
Hadoop introduction berlin buzzwords 2011
 

Recently uploaded

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 

Recently uploaded (20)

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 

What every data programmer needs to know about disks

  • 1. What Every Data Programmer Needs to Know about Disks OSCON Data – July, 2011 - Portland Ted Dziuba @dozba tjdziuba@gmail.com Not proprietary or confidential. In fact, you’re risking a career by listening to me.
  • 2. Who are you and why are you talking? First job: Like college but they pay you to go. A few years ago: Technical troll for The Register. Recently: Co-founder of Milo.com, local shopping engine. Present: Senior Technical Staff for eBay Local
  • 3. The Linux Disk Abstraction Volume /mnt/volume File System xfs, ext Block Device HDD, HW RAID array
  • 4. What happens when you read from a file? f = open(“/home/ted/not_pirated_movie.avi”, “rb”) avi_header = f.read(56) f.close() Disk controller user buffer page cache platter
  • 5.
  • 7. Throughput: 12GB/sec on good hardwareplatter
  • 8.
  • 11. (Faster if you have a lot of money)platter
  • 12. Sidebar: The Horror of a 10ms Seek Latency A disk read is 100,000 times slower than a memory read. 100 nanoseconds Time it takes you to write a really clever tweet 10 milliseconds Time it takes to write a novel, working full time
  • 13. What happens when you write to a file? f = open(“/home/ted/nosql_database.csv”, “wb”) f.write(key) f.write(“,”) f.write(value) f.close() Disk controller user buffer page cache platter
  • 14. What happens when you write to a file? f = open(“/home/ted/nosql_database.csv”, “wb”) f.write(key) f.write(“,”) f.write(value) f.close() Disk controller user buffer page cache platter You need to make this part happen Mark the page dirty, call it a day and go have a smoke.
  • 15.
  • 16. dirty_ratio: flush after some percent of memory is used
  • 17. dirty_writeback_centisecs: how often to wake up and start flushingClear your page cache: echo 1 > /proc/sys/vm/drop_caches Crusty sysadmin’s hail-Mary pass: sync; sync; sync
  • 18. Fsync: force a flush to disk f = open(“/home/ted/nosql_database.csv”, “wb”) f.write(key) f.write(“,”) f.write(value) os.fsync(f.fileno()) f.close() Disk controller user buffer page cache platter Also note, fsync() has a cousin, fdatasync() that does not sync metadata.
  • 19.
  • 20.
  • 21.
  • 22.
  • 23. (Just dropped in) to see what condition your caches are in A Good Server platter Writethrough cache on controller Writethrough cache on disk Disk controller
  • 24. (Just dropped in) to see what condition your caches are in An Even Better Server platter Battery-backedwriteback cache on controller Writethrough cache on disk Disk controller
  • 25. (Just dropped in) to see what condition your caches are in The Demon Setup platter Battery-backed writeback cache or Writethrough cache Writeback cache on disk Disk controller
  • 26. Disks in a virtual environment The Trail of Tears to the Platter Host page cache Virtual controller user buffer page cache Physical controller Hypervisor platter
  • 27.
  • 30. How are the caches configured?
  • 31. How big are the caches?
  • 35. Aside: Amazon EBS MySQL Amazon EBS Please stop doing this.
  • 36. What’s Killing That Box? ted@u235:~$ iostat -x Linux 2.6.32-24-generic (u235) 07/25/2011 _x86_64_ (8 CPU) avg-cpu: %user %nice %system %iowait %steal %idle 0.15 0.14 0.05 0.00 0.00 99.66 Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz%util sda 0.00 3.27 0.01 2.38 0.58 45.23 19.21 0.24
  • 37.
  • 38. Negligible seek time vs 10ms seek time
  • 39.
  • 44.
  • 48.

Editor's Notes

  1. Note that the page is actually in memory twice. Mmaped files fix this, but it’s beyond the scope of this discussion.Also this is why read performance on a lot of memory only NoSQL databases beats disk-backed SQL. Duh.
  2. Equate 100 nanoseconds to about 100 seconds. Then 10 milliseconds is about 3 months.
  3. This is where a lot of NoSQL databases get their performance, but more on that in a few minutes.
  4. There are threads that wake up every now and then to flush pages to disk.
  5. Fsync blocks until the data has been written to disk.
  6. With a battery-backed RAID controller, fsync can return very quickly with little risk of data loss.
  7. You need to dive into your vendor’s control tool to find this out.
  8. VMWare server is faithful to fsync, VMWare workstation is not. Xen usually queues I/O requests after they have been issued. The point is that you have no way of knowing. Your visibility of what happens to your data after you write or fsync ends at the hypervisor.
  9. Newer intel chips have the northbridge controller on-die. Southbridge bandwidth is usually <= 10GB/sec, and you are sharing this with other customers’ network and disk I/O. That, and you may be sharing drive spindles.
  10. EBS lies about the result of fsync. This is why Reddit is down all the time. You have been warned.
  11. EBS lies about the result of fsync. This is why Reddit is down all the time. You have been warned.