2. Generic Architecture of SQL Server
Source : http://blog.sqlauthority.com/2007/12/08/sql-server-generic-architecture-image
3. Physical Structure of Database
File Group Transaction Log File
Data Files
Tables and Indexes
Extent
Page
4. Physical v/s Logical Architecture
Physical Architecture: Tells about how the data is actually stored in the file system of an operating system. Page, extent, database files, transaction log files etc. are core components of physical architecture.
Logical Architecture: Tells about how the data is logically grouped and presented to the user. Tables, constraints, views, stored procedures, functions, triggers etc. are core components of logical architecture.
5. Components of Physical Architecture
Page
Extent
Table
Index
Database File
Database File Group
Transaction Log File
6. Page
A page is the smallest unit of storage in SQL Server.
The page size is 8 KB.
Each page begins with a 96-byte header that stores the page
number, page type, the amount of free space on the
page, and the allocation unit ID of the object that
owns the page. After the header, data rows are stored
one after another. At the end of the page is a row offset
table, with one entry for each data row on the page.
Layout of a page: page header, then data rows, then the row offset table.
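The page layout lends itself to a quick capacity estimate. The sketch below is only an illustration: the 8 KB page size and 96-byte header come from the slide, while the 2 bytes per row offset entry and the helper name rows_per_page are assumptions, and per-row overhead is ignored.

```python
PAGE_SIZE = 8 * 1024   # 8 KB page, as stated on the slide
HEADER_SIZE = 96       # page header size, as stated on the slide
SLOT_SIZE = 2          # assumption: bytes used by one row offset table entry


def rows_per_page(row_size: int) -> int:
    """Estimate how many fixed-size rows fit on one page.

    Each row costs its own bytes plus one row offset table entry.
    Per-row overhead inside the row itself is ignored in this sketch.
    """
    if row_size <= 0:
        raise ValueError("row_size must be positive")
    usable = PAGE_SIZE - HEADER_SIZE
    return usable // (row_size + SLOT_SIZE)


if __name__ == "__main__":
    for size in (100, 500, 4000):
        print(size, "byte rows per page:", rows_per_page(size))
```

The estimate is an upper bound; real rows carry additional bookkeeping bytes, so fewer rows may fit in practice.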
7. Page
The table below shows the types of pages used in a
data file and the contents stored in each type.
Page type: Contents
Data: Data rows with all data, except text, ntext, image, nvarchar(max), varchar(max), varbinary(max), and xml data, when text in row is set to ON.
Index: Index entries.
Text/Image: Large object data types (text, ntext, image, nvarchar(max), varchar(max), varbinary(max), and xml data), and variable length columns when the data row exceeds 8 KB (varchar, nvarchar, varbinary, and sql_variant).
Global Allocation Map, Shared Global Allocation Map: Information about whether extents are allocated.
Page Free Space: Information about page allocation and free space available on pages.
Index Allocation Map: Information about extents used by a table or index per allocation unit.
Bulk Changed Map: Information about extents modified by bulk operations since the last BACKUP LOG statement per allocation unit.
Differential Changed Map: Information about extents that have changed since the last BACKUP DATABASE statement per allocation unit.
8. Extent
An extent consists of 8 adjacent pages.
There are two types of extents –
Extent type: Contents
Uniform Extent: All 8 pages in the extent are owned and used by a single object.
Mixed Extent: Each of the 8 pages in the extent may be owned and used by different objects.
A new table or index is generally allocated pages
from mixed extents. When the table or index grows
to the point that it has eight pages, it then switches
to use uniform extents for subsequent allocations.
When an index is created on an existing table that has
enough rows to generate eight pages in the index,
all allocations to the index are in uniform extents.
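The switch from mixed to uniform extents can be sketched in a few lines. This is a simplified illustration, not the engine's actual logic; the function names are hypothetical, and numbering extents as page number divided by 8 assumes extents are aligned on 8-page boundaries.

```python
PAGES_PER_EXTENT = 8      # an extent is 8 adjacent pages
PAGE_SIZE = 8 * 1024      # 8 KB per page


def extent_of(page_id: int) -> int:
    """Extent number that holds a page (assumes 8-page alignment)."""
    return page_id // PAGES_PER_EXTENT


def allocation_source(pages_already_owned: int) -> str:
    """Kind of extent the next page of a table or index would come from.

    Follows the rule on the slide: mixed extents until the object has
    eight pages, uniform extents afterwards.
    """
    return "mixed" if pages_already_owned < PAGES_PER_EXTENT else "uniform"


if __name__ == "__main__":
    print("extent size in bytes:", PAGES_PER_EXTENT * PAGE_SIZE)
    for owned in (0, 7, 8, 20):
        print(owned, "pages owned, next page from a", allocation_source(owned), "extent")
```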
9. Extent Allocation
Two allocation maps are used for extent allocation –
Allocation map: Description
Global Allocation Map (GAM): Uses 1 bit per extent to tell whether the extent is free or not (1 = free, 0 = allocated). Each GAM covers 64000 extents.
Shared Global Allocation Map (SGAM): Uses 1 bit per extent to tell whether the extent is a mixed extent with free pages (1) or not (0). Each SGAM covers 64000 extents.
To allocate a uniform extent, the DB Engine searches the
GAM for a 1 bit and sets it to 0.
To find a mixed extent with free pages, it searches the
SGAM for a 1 bit.
To allocate a mixed extent, it searches the GAM for a 1
bit, sets it to 0, and then also sets the corresponding bit
in the SGAM to 1.
To deallocate an extent, the DB engine makes sure that
the GAM bit is set to 1 and the SGAM bit is set to 0.
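The four bitmap rules above can be modelled with two lists of bits. The class below is a toy model under stated assumptions: the name ExtentAllocator and its methods are hypothetical, plain Python lists stand in for the bitmap pages, and marking a mixed extent as full is left out.

```python
class ExtentAllocator:
    """Toy model of GAM and SGAM bits for a handful of extents."""

    def __init__(self, extent_count: int):
        self.gam = [1] * extent_count    # 1 = extent is free, 0 = allocated
        self.sgam = [0] * extent_count   # 1 = mixed extent with free pages

    def allocate_uniform(self) -> int:
        """Find a 1 bit in the GAM and set it to 0."""
        extent = self.gam.index(1)       # raises ValueError if nothing is free
        self.gam[extent] = 0
        return extent

    def allocate_mixed(self) -> int:
        """Find a 1 bit in the GAM, set it to 0, and set the SGAM bit to 1."""
        extent = self.gam.index(1)
        self.gam[extent] = 0
        self.sgam[extent] = 1
        return extent

    def find_mixed_with_free_pages(self):
        """Search the SGAM for a 1 bit; return None when there is none."""
        return self.sgam.index(1) if 1 in self.sgam else None

    def deallocate(self, extent: int) -> None:
        """Make sure the GAM bit is 1 and the SGAM bit is 0."""
        self.gam[extent] = 1
        self.sgam[extent] = 0


if __name__ == "__main__":
    alloc = ExtentAllocator(4)
    uniform = alloc.allocate_uniform()
    mixed = alloc.allocate_mixed()
    print("uniform extent:", uniform, "mixed extent:", mixed)
    print("GAM:", alloc.gam, "SGAM:", alloc.sgam)
```

Because each state change touches at most two bits, allocation and deallocation stay cheap regardless of how large the file is.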
10. Free Space Management
Page Free Space (PFS) pages record allocation
status of each page, along with amount of free space
on the page.
PFS uses 1 byte for each page to record whether the page
is allocated and, if so, whether it is empty, 1 to 50 percent
full, 51 to 80 percent full, 81 to 95 percent full, or 96
to 100 percent full.
The amount of free space in a page is only
maintained for heap and Text/Image pages.
DB Engine uses PFS to find a page with free space
available to hold a newly inserted row.
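The fullness categories recorded by PFS amount to a simple bucketing of page usage. The following sketch shows that bucketing only; the function name pfs_fullness is hypothetical, the default capacity of 8096 bytes (8 KB minus the 96-byte header) is an assumption, and the real PFS byte encoding is not modelled.

```python
def pfs_fullness(used_bytes: int, capacity: int = 8096) -> str:
    """Map the bytes used on an allocated page to a PFS fullness category.

    capacity defaults to 8 KB minus the 96-byte header (an assumption
    made for this sketch).
    """
    if not 0 <= used_bytes <= capacity:
        raise ValueError("used_bytes must be between 0 and capacity")
    if used_bytes == 0:
        return "empty"
    percent = 100 * used_bytes / capacity
    if percent <= 50:
        return "1 to 50 percent full"
    if percent <= 80:
        return "51 to 80 percent full"
    if percent <= 95:
        return "81 to 95 percent full"
    return "96 to 100 percent full"


if __name__ == "__main__":
    for used in (0, 1000, 5000, 7000, 8096):
        print(used, "bytes used:", pfs_fullness(used))
```

Coarse buckets are enough for the engine to shortlist candidate pages for an insert without reading each page.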
11. Handling Large Rows
Usually rows cannot span pages, but for large rows part of
the row can be moved to another page.
Rows in tables containing varchar, nvarchar, varbinary, or
sql_variant columns can exceed the 8,060 byte row size limit.
When a row exceeds that limit, the DB engine moves one
or more variable length columns to another page in
the ROW_OVERFLOW_DATA allocation unit.
A 24-byte pointer to the new page is kept on the original page.
Each individual column must still not exceed 8,000 bytes.
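The row overflow mechanism can be sketched as a loop that pushes variable length columns off the row until the row fits. This is a hedged illustration: the function name place_row and the column names are hypothetical, the limit of 8,060 bytes is the row size limit quoted elsewhere in this deck, and moving the widest column first is an assumption about ordering.

```python
ROW_LIMIT = 8060     # row size limit in bytes
POINTER_SIZE = 24    # bytes left in the row for each moved column


def place_row(fixed_bytes: int, var_columns: dict):
    """Return (in-row size, names of columns moved to row overflow pages).

    var_columns maps a column name to the size of its value in bytes.
    Widest columns are moved first (an assumption of this sketch).
    """
    in_row = dict(var_columns)
    moved = []
    for name in sorted(var_columns, key=var_columns.get, reverse=True):
        if fixed_bytes + sum(in_row.values()) <= ROW_LIMIT:
            break                      # the row already fits on the page
        if var_columns[name] <= POINTER_SIZE:
            break                      # moving this column would not save space
        in_row[name] = POINTER_SIZE    # the value is replaced by a pointer
        moved.append(name)
    return fixed_bytes + sum(in_row.values()), moved


if __name__ == "__main__":
    size, moved = place_row(100, {"notes": 5000, "bio": 4000})
    print("in-row size:", size, "moved columns:", moved)
```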
12. Table
A table is contained in one or more partitions and each
partition contains data rows in either a heap or a
clustered index structure.
The pages of the heap or clustered index are managed in
one or more allocation units, depending on the column
types in the data rows.
Partition – is a user-defined unit of data organization that
resides in a single file group.
Heap – is a table without a clustered index.
Clustered Table – is a table with a clustered index.
13. Index
SQL Server uses a B-tree data structure for indexes.
The top node of the B-tree is called the root node. The
bottom level nodes are called leaf nodes. Any index
levels in between are known as intermediate levels.
Indexes are of two types –
Clustered Index – The leaf level of a clustered index
contains the data pages of the underlying table.
Non-clustered Index – The leaf level of a non-clustered
index contains index pages instead of data pages.
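The difference between the two leaf levels can be illustrated with sorted lists. The sketch below is a deliberate simplification: a sorted list searched with binary search stands in for a multi-level B-tree, the sample rows and function names are hypothetical, and using the clustered key as the row locator in the non-clustered index is an assumption that applies to clustered tables.

```python
import bisect

# Clustered index on id: the leaf level IS the data rows, kept in key order.
rows = [(1, "Ann", "Oslo"), (2, "Bob", "Rome"), (3, "Cid", "Lima")]
clustered_keys = [row[0] for row in rows]

# Non-clustered index on name: leaf entries hold (key, row locator) only.
name_index = sorted((row[1], row[0]) for row in rows)
name_keys = [entry[0] for entry in name_index]


def clustered_seek(key):
    """Binary search on the clustered key; the leaf entry is the full row."""
    i = bisect.bisect_left(clustered_keys, key)
    if i < len(rows) and clustered_keys[i] == key:
        return rows[i]
    return None


def nonclustered_seek(name):
    """Find the index entry, then follow its locator to the data row."""
    i = bisect.bisect_left(name_keys, name)
    if i < len(name_index) and name_keys[i] == name:
        return clustered_seek(name_index[i][1])
    return None


if __name__ == "__main__":
    print(clustered_seek(2))
    print(nonclustered_seek("Cid"))
```

The extra hop in nonclustered_seek is the cost of keeping the index entries small.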
14. Allocation Unit
An allocation unit is a collection of pages within a
heap or B-tree.
It manages data according to page types.
A heap or B-tree can have only one allocation unit of
each type in a specific partition.
Allocation unit type: Is used to manage
IN_ROW_DATA: Data or index rows that contain all data, except large object (LOB) data. Pages are of type Data or Index.
LOB_DATA: Large object data stored in one or more of these data types: text, ntext, image, xml, varchar(max), nvarchar(max), varbinary(max), or CLR user-defined types (CLR UDT). Pages are of type Text/Image.
ROW_OVERFLOW_DATA: Variable length data stored in varchar, nvarchar, varbinary, or sql_variant columns that exceed the 8,060 byte row size limit. Pages are of type Text/Image.
15. Database File
SQL Server uses three types of files to map a database
to file system of operating system –
Primary Data File (.mdf) – It is the starting point of the database
and points to the other files in the database. Every database
has one primary data file.
Secondary Data File (.ndf) – Any data file other than primary
data file. A database may or may not have secondary data files.
Log file (.ldf) - Log files hold all the log information that is used
to recover the database. There must be at least one log file for
each database.
SQL Server database files have two types of names –
logical_file_name – used in T-SQL to refer to the physical file
os_file_name – full name (including folder path) of physical file
16. Database File Organization
The first page in a database file is the file header, which
contains metadata about the file.
The next page is a PFS page, followed by a GAM page and
an SGAM page. The DCM and BCM pages come a few pages later.
The next PFS page appears approx. 8000 pages after the first.
The next GAM, SGAM, DCM and BCM pages appear
after an interval of 64000 extents in the file.
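Because the tracking pages repeat at fixed intervals, the page that tracks any given page can be computed arithmetically. The sketch below assumes the commonly documented exact intervals of 8,088 pages per PFS page and 511,232 pages (63,904 extents) per GAM or SGAM page; the slide only gives the approximate figures, and the function names are hypothetical.

```python
PFS_INTERVAL = 8088        # assumption: pages tracked by one PFS page
GAM_INTERVAL = 511_232     # assumption: pages tracked by one GAM or SGAM page


def pfs_page_for(page_id: int) -> int:
    """Page number of the PFS page that tracks page_id."""
    start = (page_id // PFS_INTERVAL) * PFS_INTERVAL
    return 1 if start == 0 else start      # page 0 is the file header


def gam_page_for(page_id: int) -> int:
    """Page number of the GAM page that tracks page_id's extent."""
    start = (page_id // GAM_INTERVAL) * GAM_INTERVAL
    return 2 if start == 0 else start


def sgam_page_for(page_id: int) -> int:
    """Page number of the SGAM page; it sits right after the GAM page."""
    start = (page_id // GAM_INTERVAL) * GAM_INTERVAL
    return 3 if start == 0 else start + 1


if __name__ == "__main__":
    for page in (100, 10_000, 600_000):
        print(page, pfs_page_for(page), gam_page_for(page), sgam_page_for(page))
```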
17. Database File Group
Database files can be grouped into file groups for
allocation and administration purposes.
File groups are of two types –
Primary – It contains the primary data file and the pages of
system tables. Any other file is placed in the primary file
group if no other file group is specified when it is created.
User-defined – These are created by database users.
One file can be part of only one file group.
When objects such as tables and indexes are created in
a file group, all their pages are allocated from that
file group; alternatively, an object can be partitioned
across multiple file groups.
18. Transaction Log File
DB engine uses transaction logs to maintain integrity of
database and for data recovery.
The transaction log file consists of log records of the operations
performed, stored sequentially.
The following types of operations are logged –
The start and end of each transaction.
Every data modification (insert, update, or delete), including
changes made by system stored procedures or data definition
language (DDL) statements to any table, including system tables.
Every extent and page allocation or deallocation.
Creating or dropping a table or index.
Rollback operations
A log record for an operation holds either of the following –
The logical operation performed
The before and after images of the modified data
19. Transaction Log Features
The transaction log supports the following operations:
Recovery of individual transactions.
Recovery of all incomplete transactions when SQL Server is
started.
Rolling a restored database, file, filegroup, or page forward to
the point of failure.
Supporting transactional replication.
Supporting standby-server solutions.
The log cache is managed separately from the buffer
cache for data pages.
The format of log records and pages is not constrained to
follow the format of data pages.
The mechanism to reuse the space in log files is quick
and has minimal effect on transaction throughput.
20. Transaction Log Storage
The transaction log maps to one or more physical files.
Each physical log file is internally divided into a number of
virtual log files; neither their number nor their size is fixed.
The size of the virtual log files is not configurable; the DB engine
decides it dynamically while creating or extending log files.
Each log record is identified by a log sequence number (LSN).
Each new log record is written to the logical end of the log with an LSN
higher than that of the record before it.
Each log record contains the ID of the transaction that it
belongs to. For each transaction, all log records associated
with the transaction are individually linked in a chain using
backward pointers that speed the rollback of the transaction.
Each transaction reserves space on the transaction log to
support a rollback in case of an explicit rollback or an error.
Reserved space is freed on transaction completion.
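The LSN ordering and the per-transaction backward chain can be modelled with a small append-only list. This is a sketch under stated assumptions: the class and field names are hypothetical, LSNs are plain increasing integers here rather than the engine's real LSN format, and space reservation is not modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogRecord:
    lsn: int                  # log sequence number, increasing
    txn_id: int               # transaction the record belongs to
    operation: str            # description of the logged operation
    prev_lsn: Optional[int]   # backward pointer within the same transaction


class TransactionLog:
    """Append-only log with a backward chain per transaction."""

    def __init__(self):
        self.records = []
        self.last_lsn_by_txn = {}

    def append(self, txn_id: int, operation: str) -> int:
        """Write a record at the end with an LSN higher than the last."""
        lsn = len(self.records) + 1
        prev = self.last_lsn_by_txn.get(txn_id)
        self.records.append(LogRecord(lsn, txn_id, operation, prev))
        self.last_lsn_by_txn[txn_id] = lsn
        return lsn

    def rollback_order(self, txn_id: int):
        """Follow the backward pointers from the newest record to the oldest."""
        operations = []
        lsn = self.last_lsn_by_txn.get(txn_id)
        while lsn is not None:
            record = self.records[lsn - 1]
            operations.append(record.operation)
            lsn = record.prev_lsn
        return operations


if __name__ == "__main__":
    log = TransactionLog()
    log.append(1, "begin")
    log.append(2, "begin")
    log.append(1, "insert A")
    log.append(2, "update B")
    log.append(1, "delete C")
    print(log.rollback_order(1))
```

Following the chain visits only the records of one transaction, which is why the backward pointers speed up rollback when many transactions interleave in the log.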
21. Transaction Log Terms
MinLSN – is the log sequence number of the oldest log
record that is required for a successful database-wide
rollback.
Active Log – is the section of the log file from the
MinLSN to the last-written log record. No part of the active
log can be truncated.
Checkpoint – occurs when dirty data pages from the
buffer cache of the current database are flushed to disk.
Dirty Page – is a page modified in the cache, but not yet
written to disk.
Page Flush – is the process of writing a modified data page
from the buffer cache to disk.
22. Write-ahead Transaction Log
Process of writing data modifications to disk
SQL Server maintains a buffer cache into which it reads data
pages when data must be retrieved.
Data modification is made to the copy of page in buffer cache.
A log record is built in the log cache that records the modification.
Log records are written to disk when transactions are committed.
The modification is written to disk when a checkpoint occurs in
DB, or the modification must be written to disk so the buffer can be
used to hold a new page.
If the dirty page is flushed before the log record is written, the dirty
page creates a modification on the disk that cannot be rolled back
if the server fails before the log record is written to disk.
SQL Server uses write-ahead log (WAL) to guarantee that log
records are written to disk before the associated data
modifications are written to disk.
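The write-ahead rule can be shown with a toy buffer manager that refuses to write a dirty page before its log records are on disk. The sketch is illustrative only: the class WalBuffer and its methods are hypothetical, in-memory lists and dictionaries stand in for the log file and data file, and LSNs are simple integers.

```python
class WalBuffer:
    """Toy buffer cache and log cache that enforce write-ahead logging."""

    def __init__(self):
        self.log_cache = []    # (lsn, description) records not yet on disk
        self.log_disk = []     # records already written to the log file
        self.dirty = {}        # page_id -> (value, lsn of last modification)
        self.data_disk = {}    # page_id -> value written to the data file
        self.next_lsn = 1

    def modify(self, page_id, value):
        """Change the cached page and build a log record in the log cache."""
        lsn = self.next_lsn
        self.next_lsn += 1
        self.log_cache.append((lsn, f"page {page_id} <- {value}"))
        self.dirty[page_id] = (value, lsn)
        return lsn

    def flush_log(self, up_to_lsn):
        """Write cached log records with LSN <= up_to_lsn to disk, in order."""
        while self.log_cache and self.log_cache[0][0] <= up_to_lsn:
            self.log_disk.append(self.log_cache.pop(0))

    def commit(self):
        """On commit, all log records written so far go to disk."""
        self.flush_log(self.next_lsn - 1)

    def flush_page(self, page_id):
        """Write a dirty page, but only after its log records are on disk."""
        value, lsn = self.dirty.pop(page_id)
        self.flush_log(lsn)               # the write-ahead rule
        self.data_disk[page_id] = value

    def checkpoint(self):
        """Flush every dirty page in the cache."""
        for page_id in list(self.dirty):
            self.flush_page(page_id)


if __name__ == "__main__":
    wal = WalBuffer()
    wal.modify(7, "x")
    wal.modify(9, "y")
    wal.flush_page(9)
    print("log on disk:", wal.log_disk)
    print("data on disk:", wal.data_disk)
```

Flushing page 9 forces both log records to disk first, so a crash right after the page write still leaves enough information in the log to undo the change.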