[Pgday.Seoul 2018] AWS Cloud 환경에서 PostgreSQL 구축하기

BespinGlobal 컨설팅 본부
김동수 부장(dongsu.kim@bespinglobal.com)
AWS Cloud 환경에서 PostgreSQL 구축 하기

Copyright © 2018 BESPIN GLOBAL Co., Ltd. All rights reserved | Confidential
http://www.bespinglobal.com
Cloud 환경으로 RDB 마이그레이션
AWS 환경에서 PostgreSQL
Aurora PostgreSQL
Commercial DB(Oracle)를 PostgreSQL로 마이그레이션

Cloud 환경으로 RDB 마이그레이션

최근의 DB 에서의 변화들
• 서비스 특성에 맞는 Data Store 고려
• Cloud환경의 PaaS형 DB 서비스 고려
• 비용 절감을 위한 Open source DB 고려

온-프레미스 환경을 돌아보면…
Difficulty of
resource
expansion
Resource sizing
is important
Multiple
businesses are
running on one
RDB
Various business
types are
running on RDB
Considering the
5-years work-
load increase
Refer to Tpmc.
Resource
over-spec
sizing
Oracle RAC
is preferred
Infra Cost
Efficiency
Performance
test is important

CLOUD 환경에서 달라진 개념
 프로젝트 초기 단계의 인프라의 용량 산정은 상대적으로 덜 중요 하다.
 그러나, 매월 인프라 비용의 적절성을 매월 확인해야 한다.
 튜닝은 원활한 서비스 제공 측면 외에 직접적인 인프라 비용과 직결된다.
 DB 인스턴스는 서비스 단위 별로 철저히 분리 되어야 한다.
 또한, 각 서비스는 비즈니스 요구에 적합한 Data Store를 선택 해야 한다.
 분리된 업무로, 장애 범위가 작아지고, Physical 장비 대비 빠른 fail-over
시간으로 Oracle RAC의 필요성은 상대적으로 줄어 들었다.
 용량 산정을 위한 TPMC 참조는 의미가 적어 졌다.
 성능 테스트의 여러 의미 중 사이징 검증의 의미가 상대적으로 덜 중요해졌다.• 언제든지, 즉시 deploy…
 차세대 프로젝트에서만 가능하던, Re-architecture는 언제든지 가능하다
• Managed, Server-less
• 사용한 만큼 지불
• 필요한 만큼 만 sizing…
• …..

DB Migration Case - RDB to RDB
Commercial
Open Source
Commercial
Open Source
Commercial
Open Source
Same RDB
New RDB
Same RDB
PaaS
IaaS
PaaS
IaaS
New RDB
IaaS
PaaS
New RDB
AWS cloud
corporate data center

DB Migration Case - Modification
Modification range/difficulty
Business RDB DW NoSQL
Migration
From RDB DW RDB NoSQL RDB
To Same RDB New RDB Same DW New DW Same NoSQL New NoSQL
On IaaS PaaS IaaS PaaS IaaS PaaS IaaS PaaS
Application code same same same same same same update same same update
Data model same same same same same update update same transform new
SQL/DML same same update update same same transform same new new
Data same same update update same transform transform same new new
DB Object same same update update same new new same new new
DBMS configuration update update new new update new new update new new
DBMS same same new new same new new same new new
HA/DR new - new - new - - new - -
Backup new - new - new - - new - -
OS/Hypervisor - - - - - - - - - -
H/W - - - - - - - - - -
Amazon
DynamoDB
Amazon
Redshift
Amazon
PostgreSQL
Amazon
RDS, Aurora
same Update Transform New
Cloud provider
Managed

AWS 환경에서 PostgreSQL

AWS 환경의 PostgreSQL 선택
AWS
AuroraEC2
PostgreSQL
RDS
PostgreSQL PostgreSQL
PaaSIaaS
1. EnterpriseDB vs. PostgreSQL?
2. IaaS vs. PaaS?
3. RDS vs. Aurora ?
EC2
EDB

AWS PaaS RDB - Managed Service
AWS managedCustom managed
Easy and fast DBA works
- DBMS install/patch
- Backup/Recovery
- DR/HA with Multi-AZ
- Scale-out with Read Replica
- Storage IOPS and expansion

AWS PaaS RDB - HA/DR, Scale-out
What For RDS Aurora
Multi-AZ HA/DR
Data is real-time copied by storage based sync-mode replication. Infra disaster is auto detected and
fail-over to multi-AZ(availability zone) that have replicated data.
Read
Replica
Read traffic
distribution
(scale-out)
Data is logically replicated to another instances by CDC
based Async-mode replication.
Data is replicated to another instances by shared Multi-AZ
storage. It means read only traffic always read same data
as master node.
Availability Zone A
Sync mode
Block based
Replication
Primary Standby
Application
Read Only
Availability Zone B
Auto
Fail-over
Read/Write
Read Only
Async mode
CDC based
Replication
Availability Zone C
RDS
Availability Zone A
Primary Standby
/ ReadOnly
Application
Availability Zone B
Auto
Fail-over
Read/Write
Read Only
Standby
/ ReadOnly
Standby
/ ReadOnly
Standby
/ ReadOnly
Standby
/ ReadOnly
Availability Zone C
Shared Multi-AZ storage
Aurora

AWS PaaS RDB - Backup/Recovery
Amazon S3
Back-up
- Backup concept is Snapshot
- Backup modified block only
- Automatically only required blocks are retained by backup policy
- Backup is stored on Amazon S3
- Manual, auto-backup (Backup time, schedule, retention)
Snapshot Backup
Recovery
- Easy point-in-time recovery within backup retention period.
 Snapshot backup restore + DB Log apply
- From Snapshot backup, easy provisioning to new instance.
 Suitable for provisioning to QA/Test DB
Point in-time recovery
Point in time RecoverySnapshot 7
Restore
Log Apply

TIP. Snapshot Backup의 활용
Cloud 환경의 특성과 Snapshot backup의 특징을 활용한 Snapshot backup의 활용방안
• PaaS형 RDB (RDS) 특징  Snapshot Backup 활용 방안
Snapshot
Auto Scheduled Backup
RDS RDS
Periodically Auto Create
RDS
Recovery for production crash
RDS
Production
For Recovered Production
For Up-to-dated Test/QA
Temporary Recovery
For temporary data recovery
(complete recovery)
(point-in-time recovery)
(point-in-time recovery)

TIP. Snapshot Backup의 활용
• PaaS형 RDB (RDS) 특징  Snapshot Backup 활용 방안
What is most important for successful project and stable system
management?
What makes Test/QA difficult?
Cost Effort Timelydata and schema
of specific time-stamp
… … … …Test/QA Database !
… … … …Test/QA Environment ! …
… … … …Test/QA !
What is most difficult for Test/QA environment ?
Why Database environment for Test/QA is
difficult?
참고: 최신의 Test/QA환경의 중요성과 이를 위한 Database환경의 어려움

AWS PaaS RDB 지원범위
RDB HA/DR
Backup/
Recovery
Read
Replica
DB
Security
DB
Monitoring
Data
Migration
IaaS
Commercial any
Open Source any
PaaS
Commercial
RDS
•Oracle
•MS-SQL
AWS AWS
AWS
(File Encryption)
AWS
(Performance
Insight,
Cloud Watch)
AWS
Open
Source
•MySQL
•Maria
•PostgreSQL
AWS
Aurora
•MySQL comp
•PostgreSQL comp

AWS RDB PaaS 고려사항 - 제약 사항들
 아래의 제약 조건들은 항목별로 서로 관련이 있으므로, 충분한 검토가 필요하다.
 추가적으로, 지원 가능한 최소 버전, Option group, Parameter group, Character set 등도 검토가 되어야 한다.
(2018.10)
RDB Engine Newest Version
Maximum Specification
vCPU M/M Storage IOPS N/W
Commercial
RDS
Oracle
12.1.0.2.v13
11.2.0.4.v17
128 3,904G 32TB 40,000
25G
MS-SQL 2017 14.00.3035.2.v1
64 488G
16TB
32,000
Open
Source
PostgreSQL
10.5-R1
9.6.10-R1
40,000
MySQL 5.7.23
Maria 10.2.15
Aurora
MySQL 5.6.10a
64TB
(auto expand)
Managed
PostgreSQL
10.4-R1(compatible)
9.6.9-R1(compatible)

Aurora PostgreSQL

Traditional approaches to scale databases
Each architecture is limited by the monolithic mindset
SQL
Transactions
Caching
Logging
SQL
Transactions
Caching
Logging
Application Application
SQL
Transactions
Caching
Logging
SQL
Transactions
Caching
Logging
Storage
Application
Storage Storage
SQL
Transactions
Caching
Logging
Storage
SQL
Transactions
Caching
Logging
Storage

Reimagining the relational database
What if you were inventing the database today?
You would break apart the stack
You would build something that:
 Lets layers scale out independently…
 Is self-healing…
 Leverages distributed services…

A service-oriented architecture applied to the database
Move the logging and storage layer into a
multitenant, scale-out, database-optimized
storage service
Integrate with other AWS services such as S3,
EC2, VPC, DynamoDB, SWF, and Route 53
for control & monitoring
Make it a managed service – using Amazon
RDS. Takes care of management and
administrative functions.
Amazon
DynamoDB
Amazon SWF
Amazon Route
53
Logging + Storage
SQL
Transactions
Caching
Amazon S3
1
2
3
Amazon RDS

Scale-out, distributed, log structured storage
Master Replica Replica Replica
Availability Zone 1
Shared Storage Volume – Transaction Aware
Primary
Database
Node
Read
Replica /
Secondary
Node
Read
Replica /
Secondary
Node
Read
Replica /
Secondary
Node
Availability Zone 2 Availability Zone 3
AWS Region
Storage
Monitoring
Database and
Instance
Monitoring

Amazon Aurora Storage Engine Overview
Data is replicated 6 times across 3 Availability
Zones
Continuous backup to Amazon S3
(built for 11 9s durability)
Continuous monitoring of nodes and disks for
repair
10 GB segments as unit of repair or hotspot
rebalance
Quorum system for read/write; latency tolerant
Quorum membership changes do not stall writes
Storage volume automatically grows up to 64 TB
AZ 1 AZ 2 AZ 3
Amazon S3
Database
Node
Storage
Node
Storage
Node
Storage
Node
Storage
Node
Storage
Node
Storage
Node
Storage
Monitoring

What can fail?
Segment failures (disks)
Node failures (machines)
AZ failures (network or datacenter)
Optimizations
4 out of 6 write quorum
3 out of 6 read quorum
Peer-to-peer replication for repairs
SQL
Transaction
AZ 1 AZ 2 AZ 3
Caching
Amazon Aurora Storage Engine Fault-tolerance
SQL
Transaction
AZ 1 AZ 2 AZ 3
Caching

Amazon Aurora Replicas
Availability
Failing database nodes are automatically
detected and replaced
Failing database processes are
automatically detected and recycled
Replicas are automatically promoted to
primary if needed (failover)
Customer specifiable fail-over order
AZ 1 AZ 3AZ 2
Primary
Node
Primary
Node
Primary
Database
Node
Primary
Node
Primary
Node
Read
Replica
Primary
Node
Primary
Node
Read
Replica
Database
and Instance
Monitoring
Performance
Customer applications can scale out read
traffic across read replicas
Read balancing across read replicas

Faster, more predictable failover with Amazon Aurora
App
RunningFailure Detection DNS Propagation
Recovery
Database
Failure
Amazon RDS for PostgreSQL is good: failover times of ~60 seconds
Replica-Aware App Running
Failure Detection DNS Propagation
Recovery
Database
Failure
Amazon Aurora is better: failover times < 30 seconds
1 5 - 2 0 s e c 3 - 1 0 s e c
App
Running
1 5 - 2 0 s e c 3 0 - 4 0 s e c

Amazon Aurora Continuous Backup
Segment snapshot Log records
Recovery point
Segment 1
Segment 2
Segment 3
Time
• Take periodic snapshot of each segment in parallel; stream the logs to Amazon S3
• Backup happens continuously without performance or availability impact
• At restore, retrieve the appropriate segment snapshots and log streams from S3 to
storage nodes
• Apply log streams to segment snapshots in parallel and asynchronously

Traditional databases
Have to replay logs since the last
checkpoint
Typically 5 minutes between checkpoints
Single-threaded in MySQL and
PostgreSQL; requires a large number of
disk accesses
Amazon Aurora
No replay at startup because storage system
is transaction-aware
Underlying storage replays log records
continuously, whether in recovery or not
Coalescing is parallel, distributed, and
asynchronous
Checkpointed Data Log
Crash at T0 requires
a re-application of the
SQL in the log since
last checkpoint
T0 T0
Crash at T0 will result in logs being applied to
each segment on demand, in parallel,
asynchronously
Amazon Aurora Instant Crash Recovery

High Performance
Easy
to Operate &
Compatible
High Availability
Secure
by Design
Amazon Aurora with PostgreSQL compatibility
 2x-3x more throughput than PostgreSQL
 Up to 64 TB of storage per instance
 Write jitter reduction
 Near synchronous replicas
 Reader endpoint
 Enhanced OS monitoring
 Performance Insights
 Push button migration
 Auto-scaling storage
 Continuous backup and PITR
 Easy provisioning / patching
 All PostgreSQL features
 All RDS for PostgreSQL extensions
 AWS DMS supported inbound
 Failover in less than 30 seconds
 Customer specifiable failover order
 Up to 15 readable failover targets
 Instant crash recovery
 Survivable buffer cache
 X-region snapshot copy
 Encryption at rest (AWS KMS)
 Encryption in transit (SSL)
 Amazon VPC by default
 Row Level Security

Amazon
Aurora
Available
Durable
The Amazon Aurora Database Family
AWS DMS Amazon RDS
AWS IAM, KMS
& VPC
Amazon
S3
Convenient
Compatible
Automatic
Failover
Read
Replicas
X
6 Copies
High
Performance
& Scale
Secure
Encryption at rest
and in transit
Enterprise
Performance
64TB
Storage
PostgreSQL
MySQL

Comparison of RDS & Aurora
Category RDS Aurora
Performance Good performance 5x faster for MySQL
3x faster for PostgreSQL
Scalability Up to 5 read replicas
Lag measured in seconds
Up to 15 read replicas
Lag measured in milliseconds
Failover Around 60 seconds Less than 30 seconds
Storage Scales up to 16 TB Scales up to 64 TB. Auto-scales in 10 GB
increments.
High Availability Multi-AZ is available Writes 6 copies to 3 AZs
Backup Takes daily snapshot during backup window
& captures transaction logs
Continuous, asynchronous backup to S3 (no
backup windows)
Instance Types M3, M4, R3, T2 R3, R4, T2 for MySQL
R4 for PostgreSQL
Innovations with Aurora Fast Database Cloning
Fast DDL
Advanced Auditing
Performance Insights (Preview)
Parallel Query (Preview)
Serverless (Preview)
Multi-Master (Coming in 2018)

Commercial DB(Oracle)를 PostgreSQL로 마이그레이션

Open Source RDB 고려 사항  고객의 걱정
Requirement
Functional
• Supporting Object
• Supporting SQL
• Supporting Function and …
Non-Functional
• Stability from failure
• Performance
• HA/DR, Backup/Recovery and
 상용 DB(Oracle)를 Open Source RDB로 migration 하는 가장 큰 동기는 이다.
 즉, Open Source RDB 선택의 판단 기준은
형 를 추천 합니다.
Risk
Migration
• Easy to convert Objects
• Easy to convert SQL
• How to data migration
Management
• Who support RDB
• Bug fix ownership
• Number of DB expert

Why PostgreSQL - MySQL vs. PostgreSQL
https://db-engines.com/en/ranking/relational+dbms
RDS-
MySQL
Maria
RDS-
PostgreSQL
Open Source RDB의 가격 비교
10 10.6
RDS-
MySQL
Maria
RDS-
Oracle
RDS-
PostgreSQL
Oracle과의 Open Source RDB 종합 평점
10
?
?

Why PostgreSQL > Avoidance of RISK
https://db-engines.com/en/ranking/relational+dbms
Require
ment
Stability from failure Reference, Reputation
Performance Performance Test
Supporting Object and SQL Supporting matrix
HA/DR, Backup/Recovery, … AWS
Risk
Easy to convert Objects AWS SCT
Easy to convert SQL AWS SCT BespinGlobal
How to data migration AWS DMS
RDB Support AWS BespinGlobal
Bug fix ownership
RDS Community EnterpriseDB
Aurora AWS
Number of DB expert Small
AWS PostgreSQL 평가
http://www.sql-workbench.net/dbms_comparison.html
베스핀 성능 비교 TEST 결과 참조

참고: DB 성능 테스트 결과

참고: DB 기능 비교
Last updated: 2017-11-12
This comparison focuses on
SQL features that can be
used in SQL statements or
self-contained SQL scripts
that don't require additional
software (e.g. a compiler) to
be usable. Features for
database administration or
deployment are also not the
focus of this comparison.
http://www.sql-
workbench.net/dbms_comp
arison.html

Why PostgreSQL > Avoidance of RISK
RDB Support
Bug fix
ownership
IaaS
Commercial BYOL
AWS
or
MSP
DB vendor
Open Source Community
PaaS
RDS
Commercial
Included AWS
BYOL DB vendor
Open Source
Community
Aurora AWS
EPAS on EC2
RDS /
Aurora
License Subscription X
Oracle SQL
Compatibility
High Medium
Merit of PaaS X O
RDB 지원 EPAS와 비교
• 일반적인 DB에 대한 지원과 문제 상황에 대한 지원은 AWS 혹은 MSP 업체가
진행합니다.
• 단, 문제에 대한 최종 결론이 RDB의 Bug fix일 경우, Open source의 특성상,
ownership은 달라집니다.
• 가급적, Bug Fix ownership 관점에서는 Included License나 Aurora 사용이
유리 합니다.
기존 시스템 마이그레이션 시, EPAS대비 PostgreSQL
단점인 Oracle SQL과의 호환성 차이는 
BespinGlobal은 process, tool 그리고 Know-how를 통해
지원합니다.

Migration to PostgreSQL - Process & Tool
Object Migration
PL/SQL Migration
Schema
Check
XML SQL Migration
XML SQL
Check
Function Check
Manual
SQL rewrite
Data Migration Data Check
AWS SCT AWS DMS
Establish
alternative
SCT Summary
report
Start
End
AWS SCT
BespinGlobal Script
BespinGlobal XMT BespinGlobal QCT
BespinGlobal QCT
SQL rewrite Guide

AWS SCT Summary Report (Sample)

Conversion Guide > SQL rewrite guide (Issue Type 별)

BespinGlobal QCT ( Query Check Tool )
yes
Load Target Info
(XML, Function)
시작
PostgreSQL
XML Files
Parse Target
extract Info
1. XML Query
2. XML / Function Param Info
parameter setting
need param
change ?
save data
1. default param info
Run Query Test
no
save data
1. test log
2. (if success then) successed param info
3. (if new param then) default param info
Monitor
(session, lock)
Log View
session monitor
1. kill session
2. add except from run list
lock monitor
1. kill waiting session
log view
1. brief view
2. detail view
Step0. 옵션 설정
1. xml, procedure, 접속정보 등 옵션 설정
Step1. XML 파싱 / 함수 정보 로드
xml 파싱 / 함수 정보 로드
1. xml 파싱 쿼리 추출 / 객체 정보 조회 함수 정보 로드
Step2. Parameter Setting
파라미터 정보 셋팅
1. 파라미터 정보 추출 및 부재시 기본 파라미터 입력
2. 기본 파라미터 수정 가능
3. 성공 파라미터 제거 가능
Step3. XML 쿼리 / 함수 콜 테스트 수행
1. 선택된 대상 xml sql / 함수 수행
2. 성공 파라미터 우선 사용
3. 파라미터 치환 후 문장 수행
4. 결과 및 성공 파라미터 셋 저장
Step4. Result
1. 현재 수행 결과 확인
2. 누적 수행 결과 확인

BespinGlobal XMT ( XML Migration tool )
• 개발 환경 : Electron (Node.JS,
Chromium 기반 크로스 데스크 탑 앱)
• 지원 OS : Windows, Linux, Mac
시작
XML Parsing
Build
Create Stmt
Oracle
SCT Target Xml
Parsing
SCT Conversion
Xml Parsing
Xml BuildTransList.json
Create
Procedure
XML
Save Project
Target.xml
SCT 시작
SCT 종료
Convert
XML종료
readread
read
write
write
create
write
Create Procedure Build XML
AWS Schema Conversion Tool
loop per
xml files
loop per
xml files
Extract
Stmt & Param
Extract
Stmt & Param
Create
XML File
Converted
Stmt list
write
read
Step2. execute AWS SCT
Step1. Create Procedure
XML SQL로 Oracle에 Procedure를
Oracle에 생성
read
Step3. Build XML
이기종 DB로 Converting된 Function의
SQL을 XML로 생성
SCT를 이용해 Oracle의 Function을
이기종 DB로 migration

Migration to PostgreSQL – Milestone 예
2nd. Migration
(Object + Data)
1st. Migration
(Object + Data)
Object and Data
verification
SQL Conversion
Guide
Developer Seminar
Dev DB Setup
Start Object version
management
PL/SQL conversion
SQL conversion
Last Migration
(Data only)
CDC Start
AS-IS Schema
freezing
Service cut-off
QA DB Setup
Service Open
Final check
(Object and SQL)

BespinGlobal CDP (Cloud Data Platform) 팀 소개
https://www.youtube.com/watch?v=0GBdUXlhsdI
클라우드 세상에서 IT 관리자로 살아남기
We are hiring
 On-premise  AWS Migration
 Oracle  AWS PostgreSQL
 RDB을 넘어, AWS 환경에서의 Data Architect를 지향
 국내 최대(大) Data관련 전문가 보유

[Pgday.Seoul 2018] AWS Cloud 환경에서 PostgreSQL 구축하기

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to [Pgday.Seoul 2018] AWS Cloud 환경에서 PostgreSQL 구축하기

Similar to [Pgday.Seoul 2018] AWS Cloud 환경에서 PostgreSQL 구축하기 (20)

More from PgDay.Seoul

More from PgDay.Seoul (20)

Recently uploaded

Recently uploaded (20)

[Pgday.Seoul 2018] AWS Cloud 환경에서 PostgreSQL 구축하기