3. SkySQL
•Leading provider of open source databases, services and solutions
•Home for the founders and the original developers of the core of MySQL
•The creators of MariaDB, the drop-in, innovative replacement for MySQL
Tuesday, 26 November 13
4. Agenda
NOTE: We refer to MariaDB 10.0.5
•What’s new in MariaDB 10
•The Spider storage engine
•MariaDB 10 Optimizer
•The Cassandra storage engine
•MariaDB 10 Administration
•MariaDB replication
•The CONNECT storage engine
•All the rest!
6. First: What Is MariaDB 10?
•A fork of MySQL 5.5 with extra features, requested by the users
•It is backward compatible with MySQL 5.5 and MariaDB 5.5 for file formats, replication and configuration files
•Other key points:
•A project that enhances collaboration
• Hundreds of thousands of lines of code from community contributors!
•100% GPL
•100% Free - commercial add-ons and extensions are welcome, but they are clearly identified and separated
•It is application and data file compatible with MySQL 5.6
7. MariaDB Inheritance
• MariaDB 5.1 (MySQL-5.1 base)
  • Table elimination
  • new storage engines
  • cleanup
  • better tests
  • pool of threads
• MariaDB 5.2 (MariaDB-5.1 base)
  • Virtual columns
  • extended user statistics
  • segmented MyISAM keycache
• MariaDB 5.3 (MariaDB-5.2 base)
  • Biggest changes to optimizer (faster subqueries, joins, etc.)
  • Microsecond precision
  • faster HANDLER (HANDLER READ 50% faster w/ 530,000 qps)
  • dynamic columns
  • Better replication (group commit, etc.)
  • HandlerSocket
• MariaDB 5.5 = MariaDB 5.3 + MySQL 5.5
  • Opensource, more efficient threadpool
  • Non-blocking client library
  • New LIMIT ROWS EXAMINED option
  • Extended keys for XtraDB/InnoDB
  • New SphinxSE
  • Dynamic replication settings
  • Lots of security fixes, new status variables, etc.
8. MySQL 5.6 Backported and Reimplemented Features
• InnoDB/XtraDB (5.6.5)
• Optimised read only transaction, with support for TRANSACTION READ ONLY
• PERFORMANCE_SCHEMA (5.6.5)
• Online ALTER TABLE (in-progress)
• Privileges on temporary tables
• --plugin-load-add
• GET DIAGNOSTICS statement
• Character set extensions
• Temporal literals
• Filesort optimisation for ORDER BY...LIMIT (shows only a few rows of a resultset)
• Reimplemented:
  • Error messages (w/ system error string)
  • CURRENT_TIMESTAMP as DEFAULT for DATETIME columns
  • Global Transaction ID
  • Parallel replication
  • EXISTS-TO-IN optimisation
9. MariaDB 10 only
•SHOW EXPLAIN FOR <thread_id>
•Faster ALTER TABLE with unique keys for Aria & MyISAM
•Per-thread memory usage
•INFORMATION_SCHEMA.PROCESSLIST now with MEMORY_USAGE & EXAMINED_ROWS
•SHOW STATUS with memory usage
10. Additions to MariaDB 10
•MariaDB Galera Cluster
•Connectors and Drivers
•MariaDB C Client Library
•MariaDB JDBC Driver
•MariaDB Enterprise
•MariaDB Manager API
•MariaDB Audit Plugin
15. EXPLAIN and Query Plans
•--log_slow_verbosity=query_plan
  •Query plan in the slow query log
•SHOW EXPLAIN FOR thread_id
  •Query plan for a running query
•EXPLAIN ANALYZE
  •Query plan and query execution
•EXPLAIN ... JSON
  •Extra information (WHERE and attached conditions) added to the query plan
•EXPLAIN UPDATE/DELETE
  •Query plan for UPDATE and DELETE statements
16. Engine-independent Statistics
•Collected/used on SQL layer, not at storage engine layer
•No auto updates, only ANALYZE TABLE
•100% accuracy
•More Statistics
•Index statistics
•Table statistics
•Column statistics
• MIN/MAX values
• NULL/NON NULL values
• Histograms
• use_stat_tables
  • never
  • complementary
  • preferably
• ANALYZE TABLE tbl PERSISTENT FOR COLUMNS (col1,col2,...) INDEXES (idx1,idx2,...);
• System tables
  • mysql.table_stat
  • mysql.index_stat
  • mysql.column_stat
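As a rough illustration of the workflow on this slide, the statements below switch the optimizer to the engine-independent statistics and collect them for a subset of columns and indexes. The table `ontime`, its columns `origin` and `dest`, and the index name `Origin` are borrowed from later slides purely as placeholders; `histogram_size` is assumed to be available in the 10.0 release you run. The system table names are looked up rather than hard-coded, because they may differ between releases.

```sql
-- Prefer engine-independent statistics over the storage engine's own.
SET SESSION use_stat_tables = 'preferably';

-- Assumption: histogram_size controls the histogram size in bytes (0 disables histograms).
SET SESSION histogram_size = 100;

-- Collect persistent statistics only for the listed columns and indexes.
-- Table, column and index names are illustrative.
ANALYZE TABLE ontime PERSISTENT FOR COLUMNS (origin, dest) INDEXES (Origin);

-- List the statistics tables present in this server version.
SHOW TABLES FROM mysql LIKE '%stat%';
```

Since there are no automatic updates, the ANALYZE TABLE statement has to be re-run after significant data changes.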
17. Index Condition Pushdown (ICP) - 5.5
• Without ICP:
  • Read index
  • Read record
  • Check the WHERE condition
• With ICP:
  • Read index
  • Check the WHERE condition on index
  • Read record
  • Check the WHERE condition
• In the query plan
  • “Using index condition”
• Performance gain (example)
  • Cold Buffer: 1min vs. 5min (5x)
  • Hot Buffer: 0.07sec vs. 0.19 sec (2.7x)
18. Index Merge/Union - 5.5
SELECT * FROM ontime WHERE ( origin='LHR' OR dest='LHR' );
•Used for OR in the WHERE clause
•They cannot be resolved with a single index
•Can be turned off globally
•optimizer_switch='index_merge=off'
19. Index Merge/Intersection - 5.5
EXPLAIN SELECT AVG(arrdelay) FROM ontime WHERE depdel15=1 AND origin='LHR';
+--+-----------+------+-----------+---------------+---------------+-------+----+-----+--------------------------------------------+
|id|select_type|table |type       |possible_keys  |key            |key_len|ref |rows |Extra                                       |
+--+-----------+------+-----------+---------------+---------------+-------+----+-----+--------------------------------------------+
| 1|SIMPLE     |ontime|index_merge|Origin,DepDel15|Origin,DepDel15|3,5    |NULL|76952|Using intersect(Origin,DepDel15);Using where|
+--+-----------+------+-----------+---------------+---------------+-------+----+-----+--------------------------------------------+
•Used for AND in the WHERE clause and no composite index
•Must be turned on
•optimizer_switch='index_merge_sort_intersection=on'
20. Index Merge/Sort Intersection - 5.5
SELECT AVG(arrdelay) FROM ontime WHERE depdel15=1 AND OriginState IN ( 'CA', 'GB' );
•Used for AND and IN in the WHERE clause
•Must be turned on
•optimizer_switch='index_merge_sort_intersection=on'
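The switch named above can be tried per session before changing it globally. A minimal sketch, reusing the slide's `ontime` query and assuming single-column indexes exist on `depdel15` and `OriginState`:

```sql
-- Enable sort-intersection for this session only.
SET SESSION optimizer_switch = 'index_merge_sort_intersection=on';

-- Check whether the plan now reports an index_merge access type.
EXPLAIN SELECT AVG(arrdelay) FROM ontime
WHERE depdel15 = 1 AND OriginState IN ('CA', 'GB');

-- Restore the server default for this flag.
SET SESSION optimizer_switch = 'index_merge_sort_intersection=default';
```

Whether the optimizer actually picks the intersection depends on the index statistics, so the EXPLAIN output should be compared with the flag on and off.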
22. Subquery Optimization - 5.5
SELECT COUNT(*) FROM customer
WHERE c_acctbal > 0.8 * ( SELECT MAX(c_acctbal) FROM customer C
WHERE C.c_nationkey=customer.c_nationkey
GROUP BY c_nationkey );
•Cache ON/OFF: 1.01 sec / 1hr 31m 43.33 sec
•Cache Hit/Miss: 149975 / 25
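The hit/miss figures quoted above can be observed with the subquery cache status counters. The sketch below assumes the slide's `customer` table exists; it resets the session counters, runs the query with the cache enabled, and reads the counters back.

```sql
-- Enable the subquery cache for this session.
SET SESSION optimizer_switch = 'subquery_cache=on';

-- Reset session status counters so the figures reflect only the next query.
FLUSH STATUS;

SELECT COUNT(*) FROM customer
WHERE c_acctbal > 0.8 * ( SELECT MAX(c_acctbal) FROM customer C
                          WHERE C.c_nationkey = customer.c_nationkey
                          GROUP BY c_nationkey );

-- Subquery_cache_hit and Subquery_cache_miss show how often the cache was used.
SHOW STATUS LIKE 'Subquery_cache%';
```

A high hit count relative to misses, as on the slide, indicates that the correlated subquery was evaluated only once per distinct `c_nationkey` value.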
23. Semi-join Materialization - 5.5
SELECT * FROM Country
WHERE Country.code IN ( SELECT City.Country
FROM City
WHERE City.Population > 7*1000*1000 )
AND Country.continent='Europe'
Materialization scan vs. Materialization lookup
24. Multi-range Read
•Faster disk access by sorting record read requests and then doing one ordered disk sweep.
•Particularly efficient for legacy spindles, but still reduces I/O for SSDs and flash storage
•optimizer_switch='mrr=on'
25. Multi-range Read - Range Access
EXPLAIN SELECT * FROM tbl WHERE tbl.key1 BETWEEN 1000 AND 2000;
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------------------------------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                                     |
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------------------------------------+
|  1 | SIMPLE      | tbl   | range | key1          | key1 | 5       | NULL |  960 | Using index condition; Rowid-ordered scan |
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------------------------------------+
26. Multi-range Read - Batched Key Access
EXPLAIN SELECT * FROM t1,t2 WHERE t2.key1=t1.col1;
+----+-------------+-------+------+---------------+------+---------+--------------+------+--------------------------------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref          | rows | Extra                                                  |
+----+-------------+-------+------+---------------+------+---------+--------------+------+--------------------------------------------------------+
|  1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL         | 1000 | Using where                                            |
|  1 | SIMPLE      | t2    | ref  | key1          | key1 | 5       | test.t1.col1 |    1 | Using join buffer (flat, BKA join); Rowid-ordered scan |
+----+-------------+-------+------+---------------+------+---------+--------------+------+--------------------------------------------------------+
•Similar to the sorting buffer in the range buffer, but applied to row IDs
•Same benefits as for Range access
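A plan like the one above normally requires both MRR and a BKA-capable join cache level. The following sketch shows one way to enable them for a session; `t1` and `t2` are the slide's tables, and the assumption is that `join_cache_level` 5 selects the flat BKA algorithm shown in the Extra column.

```sql
-- Turn on Multi-range Read for this session.
SET SESSION optimizer_switch = 'mrr=on';

-- Assumption: levels 5 and above allow Batched Key Access; 5 is the flat variant.
SET SESSION join_cache_level = 5;

-- The Extra column should then mention "BKA join" and a rowid-ordered scan
-- when the optimizer considers it worthwhile.
EXPLAIN SELECT * FROM t1, t2 WHERE t2.key1 = t1.col1;
```

The optimizer may still choose a different plan on small tables, so the result is best checked on realistic data volumes.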
28. New and Improved Commands
•Progress reporting
•SHOW EXPLAIN
•SHOW PLUGIN SONAME
•SHUTDOWN
•Per-connection memory accounting
•Roles
29. Progress reporting
•MariaDB 5.3 and later support progress reporting for some long running commands.
•INFORMATION_SCHEMA.PROCESSLIST has three new columns for progress reporting: STAGE, MAX_STAGE, and PROGRESS
•There is a new column Progress in SHOW PROCESSLIST which shows the total progress (0-100 %)
•The client receives progress messages which it can display to the user to indicate how long the command will take.
•Valid for:
•ALTER TABLE, ADD/DROP INDEX, LOAD DATA INFILE
•CHECK/REPAIR/ANALYZE/OPTIMIZE TABLE
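The progress columns listed above can be polled from a second connection while a long command runs. A small sketch, using only the columns named on the slide plus the standard ID, USER and INFO columns:

```sql
-- Show only the connections that are currently reporting progress.
SELECT ID, USER, STAGE, MAX_STAGE, PROGRESS, INFO
FROM INFORMATION_SCHEMA.PROCESSLIST
WHERE PROGRESS > 0;
```

STAGE and MAX_STAGE tell which step of a multi-step command is running, while PROGRESS refers to the current stage.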
30. SHOW EXPLAIN
•The SHOW EXPLAIN command returns the EXPLAIN output (that is, a printout of the query plan) of a query running in a given thread.
•The syntax is:
SHOW EXPLAIN FOR <thread_id>;
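In practice the thread id comes from the process list. The sketch below first finds long-running queries and then asks for the plan of one of them; the id `42` and the 10 second threshold are placeholders, not values from the slides.

```sql
-- Find statements that have been running for more than 10 seconds.
SELECT ID, TIME, INFO
FROM INFORMATION_SCHEMA.PROCESSLIST
WHERE COMMAND = 'Query' AND TIME > 10;

-- Ask for the plan of one of them (42 is a hypothetical thread id).
SHOW EXPLAIN FOR 42;
```

If the target thread is not running a statement that can be explained, the command returns an error instead of a plan.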
31. SHOW PLUGIN SONAME
•It displays information about compiled-in plugins and all server plugins in the @@plugin_dir directory, including those that are not installed.
32. SHUTDOWN
•SQL statement to shut the server down
•Same as mysqladmin shutdown
•Requires SHUTDOWN privilege
34. Roles
•Contribution from Vicentiu Ciorbaru at Google Summer of Code
•SQL standard implementation of roles
•CREATE ROLE, DROP ROLE
•GRANT role TO user, GRANT role TO role
•SET ROLE
•DEFINER=role
•CURRENT_ROLE
•INFORMATION_SCHEMA tables
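The statements listed above combine roughly as follows. This is only a sketch: the role `analyst`, the user `alice` and her password are invented for the example, and `employees` is the sample schema used elsewhere in these slides.

```sql
-- As an administrator: create a role and give it privileges.
CREATE ROLE analyst;
GRANT SELECT ON employees.* TO analyst;

-- Create a hypothetical user and grant the role to her.
CREATE USER 'alice'@'localhost' IDENTIFIED BY 'change_me';
GRANT analyst TO 'alice'@'localhost';

-- As alice: a granted role is not active until it is set.
SET ROLE analyst;
SELECT CURRENT_ROLE();

-- Drop back to the user's own privileges.
SET ROLE NONE;
```

Because a role is inactive until SET ROLE is issued, privileges can be scoped to the part of a session that needs them.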
36. The CONNECT Storage Engine
•What is the CONNECT storage engine?
•CONNECT is a storage engine that enables MariaDB to use external data as if they were standard tables in the server
•Data is not loaded into MariaDB
•History of the CONNECT storage engine
•The engine has been mainly developed by Olivier Bertrand, an ex IBM database researcher, with the intent to have a more versatile way to access external data sources in various formats
•The idea dates back to 2004 and Olivier has been in touch with MySQL and MariaDB since
•Today CONNECT is a standard engine in MariaDB 10
37. CONNECT Engine Usage
•The CONNECT engine:
•Integrates/accesses data directly in many non-MariaDB formats
•Simplifies the ETL procedures in Business Intelligence and Business Analytics
•Simplifies the export/import of data from/to MariaDB, to/from other data sources
•There are strong similarities with the CSV engine and some similarities with FederatedX and Merge
• The CSV engine does not allow indexing and block reading
• The FederatedX engine imposes several restrictions and performance limitations
•FILE privilege is required
38. CONNECT Engine - Advantages
•Multiple File Table (option multiple=[0|1|2]) processes sequentially files of the same type
•Indexing for most of the table types
•Data compression
•Block reading boosts read performance
•Condition push down
•Valid for ODBC, MYSQL, TBL and WMI
• set optimizer_switch='engine_condition_pushdown=on' ...or... --engine_condition_pushdown=on
•Even more possibilities with the OEM file type
39. CONNECT Table Types
Type: Description
DOS: The table is contained in one or several files. The file format can be refined by some other options of the command or more often using a specific type as many of those described below. Otherwise, it is a flat text file where columns are placed at a fixed offset within each record, the last column being of variable length.
FIX: Text file arranged like DOS but with fixed length records.
BIN: Binary file with numeric values in platform representation, also with columns at fixed offset within records.
VEC: Binary file organized in vectors, in which column values are grouped consecutively, either split in separate files or in a unique file.
DBF*: File having the dBASE format.
CSV*: "Comma Separated Values" file in which each variable length record contains column values separated by a specific character (defaulting to the comma).
FMT: File in which each record contains the column values in a non-standard format (the same for each record). This format is specified in the column definition.
INI: File having the format of the initialization or configuration files used by many applications.
XML: File having the XML or HTML format.
ODBC*: Table extracted from an application accessible via ODBC or unixODBC. For example from another DBMS or from an Excel spreadsheet.
MYSQL*: Table accessed using the MySQL API like the FEDERATED engine.
PROXY*: A table based on another table existing on the current server.
TBL*: Accessing a collection of tables as one table (like the MERGE engine does for MyISAM tables).
XCOL*: A table based on another table existing on the current server with one of its columns containing comma separated values.
OCCUR: A table based on another table existing on the current server, several columns of the object table containing values that can be grouped in only one column.
DIR: Virtual table that returns a file list like the Unix ls or DOS dir command.
WMI*: Windows Management Instrumentation table displaying information coming from a WMI provider. This type enables to get in tabular format all sort of information about the machine hardware and operating system (Windows only).
MAC: Virtual table returning information about the machine and network cards (Windows only).
OEM: Table of any other formats not directly handled by CONNECT but whose access is implemented by an external plugin module (DLL or Shared Library).
(*) Auto discovery of the table structure
40. CONNECT Table Options
Option: Description
TABLE_TYPE: The type of the external table: DOS, FIX, BIN, CSV, FMT, XML, INI, DBF, VEC, ODBC, MYSQL, TBL, DIR, WMI, MAC and OEM. Defaults to DOS.
FILE_NAME: The file (path) name for all table types based on files. Can be absolute or relative to the current data directory.
XFILE_NAME: The file (path) base name for the index files of a table. Can be absolute or relative to the data directory. Defaults to the file name.
TABNAME: The target table or node for ODBC, MYSQL, XML or catalog tables.
TABLE_LIST: The comma separated list of the sub-tables of a TBL table.
DBNAME: The target database for ODBC, MYSQL or catalog tables.
DATA_CHARSET: The character set used in the external file or data source.
SEP_CHAR: Specifies the field separator character of a CSV table.
QCHAR: Specifies the character used for quoting some fields of a CSV table or the identifiers of an ODBC table.
MODULE: The (path) name of the DLL or shared lib implementing the access of a non-standard (OEM) table type.
SUBTYPE: The subtype of an OEM table type.
CATFUNC: The catalog function used by a catalog table.
OPTION_LIST: Used to specify all other options not yet directly defined.
MAPPED: Specifies whether “file mapping” is used to handle the table file.
HUGE: To specify that a table file can be larger than 2GB.
COMPRESS: True if the data file is compressed. Defaults to NO.
READONLY: True if the data file must not be modified or erased.
SEPINDEX: When true, indexes are saved in separate files.
SPLIT: True for a VEC table when the columns are in separate files.
LRECL: The file record size (often calculated by default).
BLOCK_SIZE: The number of rows each block of FIX, BIN, DBF or VEC tables contains. For an ODBC table this is the RowSet size option.
MULTIPLE: Used to specify multiple file tables.
HEADER: Applies to CSV, VEC and HTML files. Its meaning depends on the table type.
QUOTED: The level of quoting used in CSV table files.
ENDING: End of line length. Defaults to 1 for Unix/Linux and 2 for Windows.
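To make the options more concrete, here is a hedged sketch of a CONNECT table mapped onto a semicolon-separated text file. The file path, table name and columns are invented for the example; it assumes the CONNECT plugin is installed and that the account has the FILE privilege.

```sql
-- Hypothetical CSV file with a header line and ';' as field separator.
CREATE TABLE people_csv (
  name VARCHAR(32) NOT NULL,
  city VARCHAR(32) NOT NULL
) ENGINE=CONNECT
  TABLE_TYPE=CSV
  FILE_NAME='/var/lib/mysql_connect/people.csv'
  HEADER=1
  SEP_CHAR=';'
  QUOTED=1;

-- The file is read in place; nothing is loaded into MariaDB.
SELECT city, COUNT(*) AS people FROM people_csv GROUP BY city;
```

Options that have no dedicated keyword can be passed through OPTION_LIST as comma-separated name=value pairs.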
41. CONNECT Engine - Features
•Table “auto-creation” when the file does not exist or it is not specified
•Large tables support (>2GB)
•Available for FIX, BIN and VEC
•Use option_list='huge=1'
•Compression - gzlib format
•Available for DOS, FIX, BIN, CSV and FMT
•ODBC Format
•WHERE conditions are pushed to the ODBC source
•Multiple tables/files from data sources can be consolidated into a single table
•MYSQL Format - to access local or remote MySQL tables
•Allows defining a subset of the source columns and type conversion
•Condition LIMIT push down
•Access to ODBC and UnixODBC data sources
42. CONNECT Engine - Features
•TBL - Table List Table
•Collection of tables seen as one
•No limitation on the storage engine: tables can be from different storage engines (including CONNECT)
•The tables may have different column structure
•The “split format” allows columns to be split in multiple files
• XML - for XML and HTML data
•HTML attributes can be defined in the option_list
•VEC - Column Store
•Data is stored in binary files as vectors
•I/O optimization, as CONNECT reads only columns that are requested by the query
43. CONNECT Engine - Hands On
•Install the plugin
INSTALL PLUGIN CONNECT SONAME 'ha_connect';
•Note: The library may not be in the rpm on RH/CentOS; you can find it in the standard tarball
•Create an HTML table and export data
CREATE TABLE employees engine=connect table_type=XML
file_name='/var/lib/mysql_connect/employees.html' header=yes
option_list='name=TABLE,coltype=HTML,attribute=border=1;cellpadding=5,headattr=bgcolor=yellow'
SELECT emp_no, birth_date, first_name, last_name, hire_date FROM employees.employees;
Query OK, 300024 rows affected (20.99 sec)
Records: 300024 Duplicates: 0 Warnings: 0
49. The Spider Storage Engine
•What is the Spider storage engine?
•Spider is a storage engine based on the MySQL partitioning features, with built-in sharding capabilities
•Tables of different MariaDB instances are handled as if they were on the same instance
•It supports XA transactions and multiple storage engines (InnoDB, MyISAM etc.)
•Developed by Kentoku Shiba, available on Launchpad, first introduced in 2008 and now available in MariaDB 10
50. Spider Engine - Under the Bonnet
•Spider expands local partitioning, linking each partition to remote MariaDB tables
•Links are stored in Spider system tables (created during the installation)
•Batched Key Access Support
•Support for HandlerSocket
•XA Transactions guarantee atomicity within the cluster
51. Current Limitations
•Query cache must be disabled
•Efficiency depends on how effective partition pruning is, as with standard partitioning
•If partition pruning is not effective, Spider will access all the remote servers, with significant overhead
•Data can also be partitioned in each shard
•Still experimental!
52. Spider Engine - An Example
Table Colors
Partition 1: color = 'red' -> ServerA
Partition 2: color = 'black' -> ServerB
Partition 3: color = 'white' -> ServerC
Partition 4: color = 'yellow' -> ServerD
54. Spider Engine - Hands On
•Install the plugin and create Spider objects:
/usr/local/mariadb-10.0.5-linux-x86_64/share/install_spider.sql
...or...
/usr/share/mysql/install_spider.sql
MariaDB [mysql]> show tables like 'spider%';
+---------------------------+
| Tables_in_mysql (spider%) |
+---------------------------+
| spider_link_failed_log    |
| spider_link_mon_servers   |
| spider_tables             |
| spider_xa                 |
| spider_xa_failed_log      |
| spider_xa_member          |
+---------------------------+
6 rows in set (0.00 sec)
install_spider.sql creates the procedures:
spider_fix_one_table
spider_fix_system_tables
spider_plugin_installer
The procedures are removed after the installation. If you see them in the DB instance, something went wrong...
55. Spider Engine - Hands On
1
CREATE TABLE employees (
emp_no int(11) NOT NULL,
birth_date date NOT NULL,
first_name varchar(14) NOT NULL,
last_name varchar(16) NOT NULL,
gender char(1) NOT NULL,
hire_date date NOT NULL,
PRIMARY KEY ( emp_no ) )
ENGINE=MyISAM;
2
CREATE TABLE employees (
emp_no int(11) NOT NULL,
birth_date date NOT NULL,
first_name varchar(14) NOT NULL,
last_name varchar(16) NOT NULL,
gender char(1) NOT NULL,
hire_date date NOT NULL,
PRIMARY KEY ( emp_no ) )
ENGINE=spider
COMMENT 'wrapper "mysql",
user "spider", password "spider",
database "spider_test", table "employees",
port "3306"'
PARTITION BY HASH( emp_no )
( PARTITION p1 COMMENT='host "Sky1"',
PARTITION p2 COMMENT='host "Sky2"' );
3
INSERT INTO employees
SELECT * FROM employees.employees;
Diagram labels: Sky4; Sky1 Database; Sky2 Database
56. The Cassandra Storage Engine
57. The Cassandra Storage Engine
•What is the Cassandra storage engine?
•Cassandra is a storage engine that makes a Cassandra column family appear as a table in MariaDB
•The engine acts as a “window” from MariaDB to look into Cassandra
•Possible operations are:
• INSERT, UPDATE, SELECT
• Joins between Cassandra tables and other storage engines
58. A quick look at Cassandra
•Open Source distributed NoSQL database
•Initially developed at Facebook, with lots of influence from Amazon Dynamo
•Written in Java, first release in 2008, now part of the Apache Foundation
•Key features:
•Top 1 feature: fast inserts/writes with linear scalability
•Large volume of data automatically distributed on a cluster (ring)
•Asynchronous masterless distribution, customizable replication
•Support for geographical distribution and replication
59. Cassandra Data Model
•Distributed key/value store (limited range scan support)
•Optionally flexible schema
• Pre-defined “static” columns
• Ad-hoc dynamic columns
•Automatic sharding/replication
•Eventual consistency
•Column families are like “tables”
•Row key -> column mapping
•Supercolumns are not supported in the storage engine
61. Cassandra Storage Engine vs. Cassandra Query Language (CQL)
• Cassandra Query Language (CQL) is the default and primary interface into the Cassandra DBMS
• The concept of a table having rows and columns is almost the same in CQL and SQL
• CQL queries are tightly bound to the way Cassandra accesses its data internally
• MariaDB SQL queries have the standard MySQL format
• Cassandra does not support joins or subqueries, except for batch analysis through Hive
• The Cassandra storage engine supports column families as tables that can be joined with other tables
• No GROUP BY, ORDER BY must be able to use available indexes
• The Cassandra storage engine supports the GROUP BY clause
• WHERE clause must represent an index lookup
• The Cassandra storage engine provides the standard flexibility of the other engines in the WHERE clause
62. Command Mapping
•MariaDB SQL Commands:
•SELECT -> GET/Scan
•INSERT -> PUT (upsert)
•UPDATE/DELETE -> read + write
•INSERT works as “INSERT or UPDATE”
•If a row with the same PK exists, it overwrites the row
•INSERT...SELECT and multi-line INSERT write data in batches
•UPDATE works as a standard SQL UPDATE
•DELETE maps to the truncate(column_family) call
•A DELETE...WHERE will do a per-row deletion
•SELECT works as a standard MySQL SELECT
•Batched Key Access is supported
•The batch size is controlled by @@cassandra_insert_batch_size
63. Issues and limitations
•Cassandra 1.2 has slightly changed its data model
•It is described here: http://www.datastax.com/dev/blog/thrift-to-cql3
•This has caused some Thrift-based clients to no longer work
• for example, here's a problem experienced by Pig: CASSANDRA-5234
•Currently, the Cassandra SE is only able to access Cassandra 1.2 column families that were defined with the WITH COMPACT STORAGE attribute.
64. Cassandra Engine - Hands On
•Install the plugin
INSTALL PLUGIN cassandra SONAME 'ha_cassandra.so';
•Create a table, which is a view of a column family
SET GLOBAL cassandra_default_thrift_host='Cassandra';
CREATE TABLE cas_table ( col1 VARCHAR(36) PRIMARY KEY,
col2 VARCHAR(68),
col3 bigint )
ENGINE=cassandra
keyspace='mariadbtest'
thrift_host='Cassandra'
column_family='cf1';
•This command does not create a new table in Cassandra, only a view in the Cassandra Storage Engine
•By using the default thrift host and not the explicit host in the table definition, the table can be remapped dynamically to a different Cassandra cluster
•A primary key is mandatory
•Columns map to Cassandra static columns
68. Replication in MariaDB 10
•Group Commit
•Global Transaction ID
•Parallel Slave
•Multi-source replication (contribution from Lixun Peng at TaoBao)
•MySQL/MariaDB 5.X compatibility
69. MariaDB Replication Benefits
•Allows safe failover or switchover
•GTIDs are unique and replication can safely resume when the master changes from one node to another
•Allows more sophisticated topologies
•The slave is crash safe for single and multi-source replication
•GTIDs are saved in a transactional system table
•Faster with group commit and parallel replication
70. Group Commit
•binlog_commits
  •Total number of transactions committed to the binary log
•binlog_group_commits
  •Total number of groups of transactions committed to the binary log
  •When sync_binlog=1 it is the number of fsync()’s
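The two counters are most useful as a ratio: the higher the number of commits per group, the more fsync() calls are being saved. A sketch of how this could be read on a master with the binary log enabled:

```sql
-- Raw counters.
SHOW GLOBAL STATUS LIKE 'Binlog%commits';

-- Average number of transactions per group commit (NULL if no group commit yet).
SELECT c.VARIABLE_VALUE / g.VARIABLE_VALUE AS avg_transactions_per_group
FROM INFORMATION_SCHEMA.GLOBAL_STATUS AS c,
     INFORMATION_SCHEMA.GLOBAL_STATUS AS g
WHERE c.VARIABLE_NAME = 'BINLOG_COMMITS'
  AND g.VARIABLE_NAME = 'BINLOG_GROUP_COMMITS';
```

A ratio close to 1 suggests little concurrency at commit time, so group commit has little to batch.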
71. Global Transaction ID - GTID
•Available since 10.0.2
•Enabled by default
•Treated as a new event in an event group
•i.e. 1 GTID for each event group
•Unique across a group of servers
•Makes the life of the DBAs much easier
•Makes a transaction unique throughout the whole organisation
•Easy to failover to a slave
•Easy to identify transactions
•Makes multi-master replication safe
72. Global Transaction ID - GTID
A GTID has three parts, written as XXX-YYY-ZZZ:
Domain ID (XXX) - 32bit unsigned - The domain_id value for the replication stream
Server ID (YYY) - 32bit unsigned - The server_id value for the MariaDB instance
Sequence Number (ZZZ) - 64bit unsigned - It increases monotonically at each commit. It is applied for each Event Group, i.e., for each BEGIN/COMMIT or for groups that have no BEGIN/COMMIT (for example DDL commands, TRUNCATE and others)
#131109 14:59:59 server id 1 end_log_pos 4151 GTID XXX-YYY-ZZZ
/*!100001 SET @@session.gtid_seq_no=ZZZ*//*!*/;
73. MariaDB Replication Commands
-- Start replication from a given position
-- On the Master:
SELECT BINLOG_GTID_POS("master-bin.000001", 600);
-- On the Slave:
SET GLOBAL gtid_slave_pos = "0-1-2";
CHANGE MASTER TO master_host="Node1", master_port=3306,
master_user="root", master_use_gtid=slave_pos;
START SLAVE;
-- Switch from old-style replication to the new GTID
STOP SLAVE;
CHANGE MASTER TO master_host="Node1", master_port=3306,
master_user="root", master_use_gtid=current_pos;
START SLAVE;
-- Change Master
STOP SLAVE;
CHANGE MASTER TO master_host='Node2', master_port=3306;
START SLAVE;
-- The slave will stop at a given position
START SLAVE UNTIL master_gtid_pos = "1-11-100,2-21-50";
74. Parallel Slave
•Sponsored by Google
•Transactions are applied in parallel if they have been executed in parallel on the master.
•Works beyond the boundaries of MySQL 5.6 parallel slave
•Parallel threads apply to:
•Queries that are run on the master in one group commit.
•Queries that are from different domains.
•Queries from different masters (when using multi-source replication).
•slave_parallel_threads
•Number of parallel threads on the slave node
•slave_parallel_max_queued
•Maximum memory used per thread to queue events read ahead from the relay log
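A minimal sketch of how the two variables could be set on a slave. The values are arbitrary examples rather than recommendations, and the assumption is that the slave threads must be stopped while the thread pool size is changed.

```sql
-- Stop replication before resizing the worker pool.
STOP SLAVE;

-- Example value: four worker threads applying transactions in parallel.
SET GLOBAL slave_parallel_threads = 4;

-- Example value: bytes of relay log events each worker may queue ahead.
SET GLOBAL slave_parallel_max_queued = 131072;

START SLAVE;
```

With multi-source replication, STOP ALL SLAVES and START ALL SLAVES would take the place of the single-connection statements.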
75. Multi-source Replication
• Data partitioned over many masters can be pulled together onto one slave for analytical queries
• Many masters can replicate to the same slave and a complete backup can be done on the slave
• Newer hardware usually provides more performance. Usually all hardware isn’t upgraded at once and multi-source can be used for replicating many masters to a powerful new slave.
• Up to 64 masters
(Diagram: Stream 1, Stream 2 and Stream 3 replicating into one slave)
76. Multi-source Replication
•New Syntax
• You specify which master connection you want to work with by either specifying the connection name in the command or setting default_master_connection to the connection you want to work with
CHANGE MASTER ["connection_name"] ...
FLUSH RELAY LOGS ["connection_name"]
MASTER_POS_WAIT(...., ["connection_name"])
RESET SLAVE ["connection_name"]
SHOW RELAYLOG ["connection_name"] EVENTS
SHOW SLAVE ["connection_name"] STATUS
SHOW ALL SLAVES STATUS
START SLAVE ["connection_name"] ...
START ALL SLAVES ...
STOP SLAVE ["connection_name"] ...
STOP ALL SLAVES ...
77. GTID New Files and Structures
• gtid.info (Data Directory)
  • On a master, fsync() the current seq_no at shutdown, it will be read at startup
• mysql.gtid_slave_pos table
  • GTID of the last transaction applied on the slave
• gtid variables
  • gtid_binlog_pos
    • GTID of the last event group written to the binlog
  • gtid_binlog_state
    • Used to restore the state of the binlog after a RESET MASTER and PURGE BINARY LOGS
  • gtid_current_pos
    • Used when an ex master server rejoins a cluster
  • gtid_domain_id
    • ID of the stream
  • gtid_slave_pos
    • Used when a slave starts
    • Equal to gtid_current_pos when the binlog is disabled on the slave
  • gtid_strict_mode
    • Errors may be generated if the GTID is manually changed
• BINLOG_GTID_POS(binlogFile, binlogPos)
  • Provides the GTID associated to a binlog file and a position within the file
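The variables and the system table listed on this slide can be inspected together on any node. A short sketch, assuming a 10.0 release in which all of these variables are present:

```sql
-- Current GTID state of this server.
SELECT @@global.gtid_domain_id   AS domain_id,
       @@global.gtid_binlog_pos  AS binlog_pos,
       @@global.gtid_slave_pos   AS slave_pos,
       @@global.gtid_current_pos AS current_pos,
       @@global.gtid_strict_mode AS strict_mode;

-- Last transactions applied by the slave threads, as stored in the system table.
SELECT * FROM mysql.gtid_slave_pos;
```

Comparing binlog_pos on the master with slave_pos on a slave gives a quick view of how far the slave is behind in GTID terms.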
78. Multi-source Replication
•Set the servers in my.cnf as for standard replication
[mariadb-10.0]
gtid_domain_id=1
server_id=1
bind_address=192.168.56.21
log_bin
•Assuming we will use Node 1 and Node 2 as masters and Node 3 as slave:
SET @@default_master_connection='Node1';
CHANGE MASTER TO MASTER_HOST = '192.168.56.21', MASTER_USER = 'root';
CHANGE MASTER 'Node2' TO MASTER_HOST = '192.168.56.22', MASTER_USER = 'root';
START ALL SLAVES;
SHOW ALL SLAVES STATUS;
90. Virtual & Dynamic Columns
VIRTUAL COLUMNS
•For InnoDB, MyISAM and Aria
•PERSISTENT (stored) or VIRTUAL (generated)
CREATE TABLE t1 (
c1 INT NOT NULL,
c2 VARCHAR(32),
c3 INT AS ( c1 MOD 10 ) VIRTUAL,
c4 VARCHAR(5) AS ( LEFT(c2,5) ) PERSISTENT);
DYNAMIC COLUMNS
•Implement a schemaless, document store
• COLUMN_ CREATE, ADD, GET, LIST, JSON, EXISTS, CHECK, DELETE
• Nested columns are allowed
• Main datatypes are allowed
• Max 1GB documents
CREATE TABLE assets (
item_name VARCHAR(32) PRIMARY KEY,
dynamic_cols BLOB );
INSERT INTO assets VALUES (
'MariaDB T-shirt',
COLUMN_CREATE( 'color', 'blue',
'size', 'XL' ) );
INSERT INTO assets VALUES (
'Thinkpad Laptop',
COLUMN_CREATE( 'color', 'black',
'price', 500 ) );
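Once rows like the two above are stored, the other COLUMN_ functions read and modify the blob. The statements below are a sketch against the slide's `assets` table; the `warranty` attribute is invented for the example.

```sql
-- Read one dynamic column; the target type must be given explicitly.
SELECT item_name, COLUMN_GET(dynamic_cols, 'color' AS CHAR) AS color
FROM assets;

-- Render all dynamic columns of each row as a JSON document.
SELECT item_name, COLUMN_JSON(dynamic_cols) AS attributes
FROM assets;

-- Add or overwrite an attribute on a single row.
UPDATE assets
SET dynamic_cols = COLUMN_ADD(dynamic_cols, 'warranty', '3 years')
WHERE item_name = 'Thinkpad Laptop';

-- Find the rows that carry a given attribute.
SELECT item_name FROM assets WHERE COLUMN_EXISTS(dynamic_cols, 'price');
```

Rows that lack the requested attribute return NULL from COLUMN_GET, which is what allows each row to carry a different set of columns.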
91. TokuDB Storage Engine
• Fast inserts/increased performance
• Increased Compression
• Online administration
• No Index rebuild
92. SSD capabilities and Atomic Writes with FusionIO
innodb_use_atomic_writes
innodb_doublewrite=0
innodb_flush_method=O_DIRECT | ALL_O_DIRECT | O_DIRECT_NO_FSYNC
•Used with DirectFS
•Lower latency
•Increased Flash life
•Less write amplification
93. Audit Plugin
•Logs the server activity
•Who connected to the server
•What queries a user ran
•What tables a user touched
•Stored to a rotating log file or sent to the local syslogd.
•https://mariadb.com/kb/en/server_audit-plugin/
95. MariaDB Manager?
•MariaDB Manager is a set of servers used to provision, administer and monitor MariaDB servers
•The servers can be co-located with MariaDB, they can be installed and co-located on a separate machine or they can be installed on many separate machines.
97. Provision a new node
1 HTTP POST method - Create a node
URI:
• .../restfulapi/system/systemid/node
• Parameters:
• name
• hostname
• publicip, privateip, port
• instanceid
• dbusername, dbpassword
• ...
100. Provision a new node
1 HTTP POST method - Create a node
URI:
• .../restfulapi/system/systemid/node
• Parameters:
• name
• hostname
• publicip, privateip, port
• instanceid
• dbusername, dbpassword
• ...
2 HTTP POST method - Run the command: Connect to the new node
URI:
• .../restfulapi/command/connect
• Parameters:
• systemid
• nodeid
• rootpassword
3 HTTP POST method - Run the command: Probe the state of the node
URI:
• .../restfulapi/command/probe
• Parameters:
• systemid
• nodeid
4 HTTP POST method - Run the command: Provision the node
URI:
• .../restfulapi/command/provision
• Parameters:
• systemid
• nodeid
101. Start a node
HTTP POST method - Run the command: Start the node
URI:
• .../restfulapi/command/start
• Parameters:
• systemid
• nodeid
102. Retrieve the status of a cluster
• .../restfulapi/system/systemid
103. Monitor the # of connections of a node
• .../restfulapi/system/systemid/node/nodeid/monitor/{monitorid}/data
104. For More Information...
• Web, Doc & Knowledge Base: www.mariadb.org, www.mariadb.com
• Launchpad: https://launchpad.net/maria, https://code.launchpad.net/mariadb-native-client
• Jira (bugs and development): mariadb.atlassian.net
• IRC: irc.freenode.net #maria, webchat.freenode.net/?randomnick=1&channels=maria
105. Some pictures courtesy of:
• http://www.worksmartmompreneurs.com
• http://www.nydailynews.com
Thank You!
www.skysql.com
ivan@skysql.com
izoratti.blogspot.com
www.slideshare.net/skysql
www.slideshare.net/izoratti