47. Caching in Amazon RDS for MySQL
o “14.13.1.5 Preloading the InnoDB Buffer
Pool for Faster Restart”
http://bit.ly/19tkg6y
o To avoid a lengthy warmup period after
restarting the server, particularly for
instances with large InnoDB buffer pools,
you can save the InnoDB buffer pool
state at server shutdown and restore
the buffer pool to the same state at
server startup.
For this warmup to work, the buffer pool state has to be captured at shutdown.
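48. Amazon Aurora DB cluster
An Aurora DB cluster is not the shared-nothing cluster that is
common in database cluster configurations. It is a simple
configuration whose defining feature is that data reads and writes
are handled asymmetrically. The key is the Cluster Volume, which
adopts a log-structured file system.
Given current database access patterns, and technology in which CPUs
have become fast and very large memory is available, I think it is a
realistic choice with good cost performance.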
49. Amazon Aurora DB cluster
[Figure: Cluster Volume containing three boxes labeled Data Copy]
92. [Figure: Cluster Volume with copies labeled Data Copy1 and Data Copy2]
Creating replicas is probably not the job of the Primary Instance;
the Cluster Volume itself presumably does it, working with the cache.
Data Copy can be done per segment rather than per disk.
110. [Figure labels: A, A’, A’’, MA, MA’, MA’’]
Checkpoint Region
An index to the inode maps.
It is stored in a fixed area, separate from the log.
If the Checkpoint Region selects inode map MA’’’, we get the values of
Foo’ and Bar; if it selects inode map MA’, the old value of Foo can be
restored together with Bar.
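The selection described above can be sketched in a few lines of Python. This is a toy model, not Sprite LFS: the class name ToyLFS, its methods, and the choice to snapshot the inode map on every write are all assumptions made for illustration. It only shows that old data stays in the log and that picking an older inode map recovers an older file version.

```python
class ToyLFS:
    """Toy append-only log with inode-map snapshots (illustrative only)."""

    def __init__(self):
        self.log = []                # append-only sequence of blocks
        self.inode_map = {}          # current map: file name -> log address
        self.checkpoint_region = []  # fixed area: addresses of inode-map snapshots

    def write(self, name, data):
        # New data is appended; earlier versions are never overwritten.
        self.log.append(("data", data))
        self.inode_map = {**self.inode_map, name: len(self.log) - 1}
        # The updated inode map is appended to the log as well, and the
        # checkpoint region records where that snapshot lives.
        self.log.append(("imap", dict(self.inode_map)))
        self.checkpoint_region.append(len(self.log) - 1)

    def read(self, name, checkpoint=-1):
        # Choosing a different snapshot selects a different file version.
        _, imap = self.log[self.checkpoint_region[checkpoint]]
        return self.log[imap[name]][1]


fs = ToyLFS()
fs.write("Foo", "foo v1")
fs.write("Bar", "bar v1")
fs.write("Foo", "foo v2")
latest_foo = fs.read("Foo")             # newest inode map
old_foo = fs.read("Foo", checkpoint=1)  # older inode map: old Foo, same Bar
```

A real LFS writes checkpoints only periodically and keeps the region at a fixed disk location; the list used here merely stands in for that fixed area.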
113. THE DESIGN AND IMPLEMENTATION
OF A LOG-STRUCTURED
FILE SYSTEM
M. Rosenblum and J. K. Ousterhout
University of California, Berkeley
ACM SIGOPS Operating Systems Review
1991 http://bit.ly/1LmyTJx
121. Problems with Existing File
Systems
o For example, the Berkeley Unix fast file system
…. takes at least five separate disk I/Os, each
preceded by a seek, to create a new file in Unix
FFS: two different accesses to the file’s
attributes plus one access each for the file’s
data, the directory’s data, and the directory’s
attributes. When writing small files in such a
system, less than 5% of the disk’s potential
bandwidth is used for new data; the rest of the
time is spent seeking.
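In the Unix file system, creating a new file takes at least five disk
I/Os, each preceded by a seek. As a result, less than 5% of the disk's
bandwidth is used for new data; the rest of the time is spent seeking.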
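A rough back-of-the-envelope calculation shows how a figure below 5% can arise. The numbers below (15 ms per seek plus rotation, 1.3 MB/s transfer rate, a 4 KB file) are hypothetical values chosen for illustration, not measurements from the paper, and the model counts only the new file data as useful transfer.

```python
def new_data_bandwidth_fraction(num_ios, access_ms, new_bytes, bandwidth_mb_s):
    """Fraction of elapsed time spent transferring new data.

    Simplified model: every I/O pays one positioning delay (access_ms),
    and only new_bytes counts as useful transfer.
    """
    transfer_ms = new_bytes / (bandwidth_mb_s * 1_000_000) * 1000
    total_ms = num_ios * access_ms + transfer_ms
    return transfer_ms / total_ms


# Hypothetical parameters: five I/Os to create one small file.
fraction = new_data_bandwidth_fraction(
    num_ios=5, access_ms=15.0, new_bytes=4096, bandwidth_mb_s=1.3
)
print(f"{fraction:.1%} of the time moves new data")
```

With these assumed values the positioning delays dominate, and the fraction comes out near 4%.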
122. Problems with Existing File
Systems
o The second problem with current file systems
is that they tend to write synchronously: the
application must wait for the write to complete,
rather than continuing while the write is
handled in the background. For example even
though Unix FFS writes file data blocks
asynchronously, file system metadata
structures such as directories and inodes are
written synchronously.
Current file systems tend to write synchronously. In Unix FFS, file
data blocks are written asynchronously, but directories and inodes
are written synchronously.
151. directory operation log
o To restore consistency between directories and
inodes, Sprite LFS outputs a special record in the
log for each directory change. The record
includes an operation code (create, link, rename
or unlink), the location of the directory entry (i-
number for the directory and the position within
the directory), the contents of the directory
entry (name and i-number), and the new
reference count for the inode named in the entry.
These records are collectively called the directory
operation log; Sprite LFS guarantees that each
directory operation log entry appears in the log
before the corresponding directory block or
inode.
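The record layout and the ordering guarantee described above could be modelled as follows. The field names, the Op enum, and the ToyLog class are hypothetical; only the list of fields and the rule that the record precedes the directory block and inode come from the quoted text.

```python
from dataclasses import dataclass
from enum import Enum


class Op(Enum):
    CREATE = "create"
    LINK = "link"
    RENAME = "rename"
    UNLINK = "unlink"


@dataclass(frozen=True)
class DirOpRecord:
    op: Op              # operation code
    dir_inumber: int    # location: i-number of the directory ...
    position: int       # ... and position within the directory
    name: str           # entry contents: name ...
    inumber: int        # ... and i-number
    new_ref_count: int  # new reference count for the named inode


class ToyLog:
    """Append-only log that enforces 'record first, then blocks'."""

    def __init__(self):
        self.entries = []

    def append_dir_change(self, record, dir_block, inode):
        # The operation record is appended before the directory block
        # and inode it describes, so recovery always sees it first.
        self.entries.append(("dirop", record))
        self.entries.append(("dir_block", dir_block))
        self.entries.append(("inode", inode))


log = ToyLog()
rec = DirOpRecord(Op.CREATE, dir_inumber=2, position=0,
                  name="foo", inumber=17, new_ref_count=1)
log.append_dir_change(rec, dir_block=b"<dir data>", inode=b"<inode 17>")
```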
153. Heuristic Cleaning Algorithms
in Log-Structured File Systems
Trevor Blackwell, Jeffrey Harris,
Margo Seltzer
Harvard University
http://bit.ly/1EAsv8w
154. Abstract
o Research results show that while Log-Structured
File Systems (LFS) offer the potential for
dramatically improved file system performance,
the cleaner can seriously degrade performance,
by as much as 40% in transaction processing
workloads [9]. Our goal is to examine trace
data from live file systems and use those to
derive simple heuristics that will permit the
cleaner to run without interfering with normal
file access. Our results show that trivial
heuristics perform very well, allowing 97% of
all cleaning on the most heavily loaded system
we studied to be done in the background.
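The abstract does not spell out the heuristics, so the following is only an assumed example of what a "trivial" idle-time rule might look like: wait until the system has been idle for wait_s seconds, then clean segments until the idle gap ends. The function name, parameters, and trace values are all hypothetical.

```python
def segments_cleaned_in_background(idle_gaps_s, wait_s, clean_time_s):
    """Count segments a cleaner could process inside idle gaps.

    Assumed rule (not taken from the paper): the cleaner starts only
    after wait_s seconds of idleness and needs clean_time_s per segment.
    """
    cleaned = 0
    for gap in idle_gaps_s:
        usable = gap - wait_s
        if usable > 0:
            cleaned += int(usable // clean_time_s)
    return cleaned


# Hypothetical trace: three idle gaps, in seconds.
cleaned = segments_cleaned_in_background(
    idle_gaps_s=[0.5, 10.0, 3.0], wait_s=2.0, clean_time_s=1.0
)
```

Short gaps are skipped entirely, which is what keeps the cleaner from colliding with bursts of normal file access in this model.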
155. The two traces depict
FFS and LFS processing
cycles. Both systems
are writing the same
amount of data, but
FFS must issue its
writes synchronously,
resulting in a much
longer total execution
time. LFS issues its
writes asynchronously
and cleans during
processing periods,
thus achieving much
shorter total execution
time.
In LFS, writes are issued asynchronously and cleaning runs during
processing periods, so the total execution time is shorter.
177. LevelDB: SSTables and Log
Structured Merge Trees
o On-disk SSTable indexes are always
loaded into memory
o All writes go directly to
the MemTable index
o Reads check the MemTable first and
then the SSTable indexes
o Periodically, the MemTable is flushed to
disk as an SSTable
o Periodically, on-disk SSTables are
“collapsed together”
http://bit.ly/18tMGf9
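The read and write paths listed above can be sketched as a toy in-memory model. This is not LevelDB's API: the ToyLSM class, its method names, and the flush threshold are assumptions, and the "SSTables" here are plain sorted Python lists rather than on-disk files with in-memory indexes.

```python
class ToyLSM:
    """Toy MemTable + SSTables model (illustrative only)."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}   # all writes land here first
        self.sstables = []   # oldest first; each is a sorted list of (key, value)
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Write the MemTable out as an immutable, sorted table.
        if self.memtable:
            self.sstables.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Reads check the MemTable first, then tables from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            for k, v in table:  # linear scan; a real table would use its index
                if k == key:
                    return v
        return None

    def compact(self):
        # "Collapse" all tables into one, letting newer values win.
        merged = {}
        for table in self.sstables:  # oldest first, so later updates overwrite
            merged.update(table)
        self.sstables = [sorted(merged.items())] if merged else []


db = ToyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)   # reaches the limit, so the MemTable is flushed
db.put("a", 3)
db.put("c", 4)   # second flush
db.compact()
```

Triggering the flush by entry count is a simplification; the slide only says the flush happens periodically.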
178. “SSTable and Log Structured Storage: LevelDB”
http://bit.ly/18tMGf9
191. Flash file system
o A flash file system is a file system designed
for storing files on flash memory–based
storage devices. While the flash file systems
are closely related to file systems in general,
they are optimized for the nature and
characteristics of flash memory (such as to
avoid write amplification), and for use in
particular operating systems.
192. While a block device layer can emulate a disk
drive so that a general-purpose file system can be
used on a flash-based storage device, this is
suboptimal for several reasons:
o Erasing blocks: flash memory blocks have to
be explicitly erased before they can be written
to. The time taken to erase blocks can be
significant, thus it is beneficial to erase unused
blocks while the device is idle.
o Random access: general-purpose file systems
are optimized to avoid disk seeks whenever
possible, due to the high cost of seeking. Flash
memory devices impose no seek latency.
o Wear leveling: flash memory devices tend to
wear out when a single block is repeatedly
overwritten; flash file systems are designed to
spread out writes evenly.
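The wear-leveling point can be illustrated with a deliberately simple policy: always erase and rewrite the block with the fewest erases so far. The WearLeveler class and its methods are hypothetical, and real flash file systems use more elaborate schemes; this only shows what "spreading out writes evenly" means.

```python
class WearLeveler:
    """Toy policy: always reuse the least-erased block (illustrative only)."""

    def __init__(self, num_blocks):
        self.erase_counts = [0] * num_blocks

    def pick_block(self):
        # Least-worn block; ties go to the lowest block index.
        return min(range(len(self.erase_counts)),
                   key=self.erase_counts.__getitem__)

    def erase_and_write(self):
        block = self.pick_block()
        self.erase_counts[block] += 1
        return block


wl = WearLeveler(num_blocks=4)
order = [wl.erase_and_write() for _ in range(8)]
```

Under this policy the erase counts of any two blocks never differ by more than one.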
193. Flash file system and LFS
o Log-structured file systems have all
the desirable properties for a flash file
system. Such file systems
include JFFS2 and YAFFS.
194. Log-structured file systems:
There's one in every SSD
o When you say "log-structured file system,"
most storage developers will immediately think
of Ousterhout and Rosenblum's classic paper.
o Linux developers might think of JFFS2, NILFS,
or LogFS, three of several modern log-
structured file systems specialized for use with
solid state devices (SSDs).
o Few people, however, will think of SSD
firmware. The flash translation layer in a
modern, full-featured SSD resembles a log-
structured file system in several important
ways.
http://lwn.net/Articles/353411/
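One way to see the resemblance is a toy flash translation layer that never overwrites in place: each write goes to the next free page, and a mapping table redirects the logical address. The ToyFTL class below is an assumed simplification, not any vendor's firmware; it omits erase blocks and garbage collection entirely.

```python
class ToyFTL:
    """Toy flash translation layer with out-of-place writes (illustrative only)."""

    def __init__(self, num_pages):
        self.pages = [None] * num_pages   # physical flash pages
        self.valid = [False] * num_pages  # which pages hold live data
        self.mapping = {}                 # logical block address -> physical page
        self.head = 0                     # next free page: the head of the log

    def write(self, lba, data):
        if self.head >= len(self.pages):
            raise RuntimeError("log full: a real FTL would garbage-collect here")
        old = self.mapping.get(lba)
        if old is not None:
            self.valid[old] = False       # the old copy becomes garbage
        self.pages[self.head] = data      # append at the log head
        self.valid[self.head] = True
        self.mapping[lba] = self.head
        self.head += 1

    def read(self, lba):
        return self.pages[self.mapping[lba]]


ftl = ToyFTL(num_pages=4)
ftl.write(7, "a")
ftl.write(7, "b")   # same logical address, new physical page
```

The invalidated pages play the role of dead blocks in an LFS segment: reclaiming them is the job of a cleaner, which this sketch leaves out.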