April 21, 2015
Seattle, WA
Big Data Collection & Storage
Amazon DynamoDB
•  Managed NoSQL database service
•  Supports both document and key-value data models
•  Highly scalable – no table size or throughput limits
•  Consistent, single-digit millisecond latency at any scale
•  Highly available—3x replication
•  Simple and powerful API
DynamoDB Table
Table > Items > Attributes
Hash key (mandatory)
•  Key-value access pattern
•  Determines data distribution
Range key (optional)
•  Model 1:N relationships
•  Enables rich query capabilities
Query capabilities
•  All items for a hash key
•  ==, <, >, >=, <=
•  "begins with"
•  "between"
•  sorted results
•  counts
•  top/bottom N values
•  paged responses
DynamoDB API
Table API
•  CreateTable
•  UpdateTable
•  DeleteTable
•  DescribeTable
•  ListTables
Item API
•  PutItem
•  UpdateItem
•  DeleteItem
•  BatchWriteItem
•  GetItem
•  Query
•  Scan
•  BatchGetItem
Stream API (new)
•  ListStreams
•  DescribeStream
•  GetShardIterator
•  GetRecords
Data types
String (S)
Number (N)
Binary (B)
String Set (SS)
Number Set (NS)
Binary Set (BS)
Boolean (BOOL)
Null (NULL)
List (L)
Map (M)
Used for storing nested JSON documents
Hash table
•  Hash key uniquely identifies an item
•  Hash key is used for building an unordered hash index
•  Table can be partitioned for scale
Key space: 00 to FF (partition boundaries shown in the figure: 00-54, 55-A9, AA-FF)
Id = 1, Name = Jim: Hash (1) = 7B
Id = 2, Name = Andy, Dept = Engg: Hash (2) = 48
Id = 3, Name = Kim, Dept = Ops: Hash (3) = CD
Partitions are three-way replicated
[Figure: Partition 1, Partition 2, ... Partition N, each stored on Replica 1, Replica 2, and Replica 3]
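The placement of items on partitions can be sketched as a range lookup. This is only an illustration: the hash values 7B, 48 and CD are the ones quoted on the slide, the boundaries 00-54, 55-A9, AA-FF are read off the key-space figure and assumed to be inclusive, and `partition_for` is a hypothetical helper. The hash function DynamoDB actually uses is internal and is not reproduced here.

```python
# Partition boundaries as read from the slide's key-space figure
# (assumed inclusive ranges): 00-54, 55-A9, AA-FF.
PARTITION_RANGES = [(0x00, 0x54), (0x55, 0xA9), (0xAA, 0xFF)]

# Hash values quoted on the slide for Id = 1, 2, 3. The real hash
# function is internal to DynamoDB, so the values are hard-coded.
SLIDE_HASHES = {1: 0x7B, 2: 0x48, 3: 0xCD}


def partition_for(hash_value: int) -> int:
    """Return the 1-based partition whose range contains hash_value."""
    for number, (low, high) in enumerate(PARTITION_RANGES, start=1):
        if low <= hash_value <= high:
            return number
    raise ValueError(f"hash value {hash_value:#04x} is outside the key space")


placement = {item_id: partition_for(h) for item_id, h in SLIDE_HASHES.items()}
print(placement)  # {1: 2, 2: 1, 3: 3}
```

Under these assumptions Id = 2 lands on the first partition, Id = 1 on the second and Id = 3 on the third, which is why consecutive key values do not end up next to each other.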
Hash-range table
•  Hash key and range key together uniquely identify an Item
•  Within unordered hash index, data is sorted by the range key
•  No limit on the number of items (∞) per hash key
–  Except if you have local secondary indexes
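Key space: 00:0 to FF:∞ (partition boundaries shown in the figure: 00:0-54:∞, 55-A9:∞, AA-FF:∞)
Hash (1) = 7B
•  Customer# = 1, Order# = 10, Item = Toy
•  Customer# = 1, Order# = 11, Item = Boots
Hash (2) = 48
•  Customer# = 2, Order# = 10, Item = Pen
•  Customer# = 2, Order# = 11, Item = Shoes
Hash (3) = CD
•  Customer# = 3, Order# = 10, Item = Book
•  Customer# = 3, Order# = 11, Item = Paper
[Figure: the items of each customer stored together, sorted by Order#, across Partition 1, Partition 2, and Partition 3]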
DynamoDB table examples
case class CameraRecord(
  cameraId: Int, // hash key
  ownerId: Int,
  subscribers: Set[Int],
  hoursOfRecording: Int,
  ...
)
case class Cuepoint(
  cameraId: Int, // hash key
  timestamp: Long, // range key
  `type`: String, // backticks required: type is a reserved word in Scala
  ...
)
HashKey   RangeKey   Value
Key       Segment    1234554343254
Key       Segment1   1231231433235
Local Secondary Index (LSI)
•  Alternate range key + same hash key
•  Index and table data are co-located (same partition)
•  10 GB max per hash key, i.e., LSIs limit the number of range keys!
Global Secondary Index
•  Any attribute indexed as new hash and/or range key
•  RCUs/WCUs provisioned separately for GSIs
•  Online indexing
LSI or GSI?
•  LSI can be modeled as a GSI
•  If data size in an item collection > 10 GB, use GSI
•  If eventual consistency is okay for your scenario, use GSI!
DynamoDB Streams
•  Stream of updates to a table
•  Asynchronous
•  Exactly once
•  Strictly ordered
–  Per item
•  Highly durable
•  Scale with table
•  24-hour lifetime
•  Sub-second latency
DynamoDB Streams and AWS Lambda
Emerging Architecture Pattern
Scaling
•  Throughput
–  Provision any amount of throughput to a table
•  Size
–  Add any number of items to a table
•  Max item size is 400 KB
•  LSIs limit the number of range keys due to 10 GB limit
•  Scaling is achieved through partitioning
Throughput
•  Provisioned at the table level
–  Write capacity units (WCUs) are measured in 1 KB per second
–  Read capacity units (RCUs) are measured in 4 KB per second
•  RCUs measure strongly consistent reads
•  Eventually consistent reads cost 1/2 of consistent reads
•  Read and write throughput limits are independent
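The capacity-unit rules above can be turned into a small calculator. This is a minimal sketch: the function names are made up for illustration, and item sizes are assumed to be rounded up to the next 1 KB (writes) or 4 KB (reads) boundary before multiplying by the request rate.

```python
import math


def write_capacity_units(item_size_bytes: int, writes_per_second: int) -> int:
    """WCUs needed: each write costs one unit per 1 KB of item, rounded up."""
    return math.ceil(item_size_bytes / 1024) * writes_per_second


def read_capacity_units(item_size_bytes: int, reads_per_second: int,
                        eventually_consistent: bool = False) -> int:
    """RCUs needed: one unit per 4 KB of item, rounded up.

    Eventually consistent reads cost half as much as consistent reads.
    """
    units = math.ceil(item_size_bytes / 4096) * reads_per_second
    return math.ceil(units / 2) if eventually_consistent else units


# A 6 KB item written and read 10 times per second.
print(write_capacity_units(6 * 1024, 10))                             # 60
print(read_capacity_units(6 * 1024, 10))                              # 20
print(read_capacity_units(6 * 1024, 10, eventually_consistent=True))  # 10
```

The example also shows why keeping items small matters: a 6 KB item costs six write units per write but only two read units per consistent read.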
Partitioning example
Table size = 8 GB, RCUs = 5000, WCUs = 500
\[ \#\text{ of partitions (for size)} = \left\lceil \frac{8\ \text{GB}}{10\ \text{GB}} \right\rceil = \lceil 0.8 \rceil = 1 \]
\[ \#\text{ of partitions (for throughput)} = \left\lceil \frac{5000_{RCU}}{3000\ \text{RCU}} + \frac{500_{WCU}}{1000\ \text{WCU}} \right\rceil = \lceil 2.17 \rceil = 3 \]
\[ \#\text{ of partitions (total)} = \max(1\ \text{for size},\ 3\ \text{for throughput}) = 3 \]
RCUs per partition = 5000/3 = 1666.67
WCUs per partition = 500/3 = 166.67
Data/partition = 8/3 = 2.67 GB
RCUs and WCUs are uniformly spread across partitions
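The same arithmetic can be written as a short script. It is a sketch of the slide's rule of thumb only: the constants 10 GB, 3000 RCU and 1000 WCU per partition come from the example above, the function names are hypothetical, and the way DynamoDB really allocates partitions is internal and may differ.

```python
import math

# Per-partition limits as stated in the slide's example.
MAX_GB_PER_PARTITION = 10
MAX_RCU_PER_PARTITION = 3000
MAX_WCU_PER_PARTITION = 1000


def partitions_for_size(table_size_gb: float) -> int:
    return max(1, math.ceil(table_size_gb / MAX_GB_PER_PARTITION))


def partitions_for_throughput(rcus: int, wcus: int) -> int:
    load = rcus / MAX_RCU_PER_PARTITION + wcus / MAX_WCU_PER_PARTITION
    return max(1, math.ceil(load))


def total_partitions(table_size_gb: float, rcus: int, wcus: int) -> int:
    return max(partitions_for_size(table_size_gb),
               partitions_for_throughput(rcus, wcus))


n = total_partitions(table_size_gb=8, rcus=5000, wcus=500)
print(n)                       # 3
print(round(5000 / n, 2))      # 1666.67 RCUs per partition
print(round(500 / n, 2))       # 166.67 WCUs per partition
```

Because provisioned throughput is divided evenly, a single hot hash key can use at most the per-partition share, not the table total.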
DynamoDB Best Practices
Amazon DynamoDB Best Practices
•  Keep item size small
•  Store metadata in Amazon DynamoDB and large blobs in Amazon S3
•  Use a table with a hash key for extremely high scale
•  Use table per day, week, month etc. for storing time series data
•  Use conditional updates for de-duping
•  Use hash-range table and/or GSI to model
–  1:N, M:N relationships
•  Avoid hot keys and hot partitions
Events_table_2012
Event_id (hash key) | Timestamp (range key) | Attribute1 | …. | Attribute N
Events_table_2012_05_week1
Event_id (hash key) | Timestamp (range key) | Attribute1 | …. | Attribute N
Events_table_2012_05_week2
Event_id (hash key) | Timestamp (range key) | Attribute1 | …. | Attribute N
Events_table_2012_05_week3
Event_id (hash key) | Timestamp (range key) | Attribute1 | …. | Attribute N
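A table-per-period layout needs a rule for picking the table that a given event belongs to. The helper below is a hypothetical sketch that reproduces the naming pattern of the tables above. The week-of-month definition (days 1 to 7 are week 1, days 8 to 14 are week 2, and so on) is an assumption, since the slides do not say how weeks are counted.

```python
from datetime import date


def weekly_table_name(day: date, prefix: str = "Events_table") -> str:
    """Return the name of the weekly table that holds events for `day`.

    Assumes week N of a month covers days 7*(N-1)+1 through 7*N.
    """
    week_of_month = (day.day - 1) // 7 + 1
    return f"{prefix}_{day.year}_{day.month:02d}_week{week_of_month}"


print(weekly_table_name(date(2012, 5, 3)))   # Events_table_2012_05_week1
print(weekly_table_name(date(2012, 5, 16)))  # Events_table_2012_05_week3
```

With this layout older tables can be given lower provisioned throughput, or archived and deleted whole, without touching the table that takes current writes.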
  
www.youtube.com/watch?v=VuKu23oZp9Q
http://www.slideshare.net/AmazonWebServices/deep-dive-amazon-dynamodb
Amazon S3
•  Stores objects in buckets
•  Designed for 99.999999999% durability
[Figure: S3 event notifications sent to an SNS topic, an SQS queue, and a Lambda function]
•  Compress data files
–  Reduces Bandwidth
•  Avoid small files
–  Hadoop mappers proportional to number of files
–  S3 PUT cost quickly adds up
Algorithm   % Space Remaining   Encoding Speed   Decoding Speed
GZIP        13%                 21 MB/s          118 MB/s
LZO         20%                 135 MB/s         410 MB/s
Snappy      22%                 172 MB/s         409 MB/s
•  Use S3DistCp to combine smaller files together
•  S3DistCp takes a pattern and target path to combine smaller input files to larger ones
"--groupBy,.*XABCD12345678.([0-9]+-[0-9]+-[0-9]+-[0-9]+).*"
•  Supply a target size and compression codec
"--targetSize,128","--outputCodec,lzo"
s3://myawsbucket/cf/XABCD12345678.2012-02-23-01.HLUS3JKx.gz
s3://myawsbucket/cf/XABCD12345678.2012-02-23-01.I9CNAZrg.gz
s3://myawsbucket/cf/XABCD12345678.2012-02-23-02.YRRwERSA.gz
s3://myawsbucket/cf/XABCD12345678.2012-02-23-02.dshVLXFE.gz
s3://myawsbucket/cf/XABCD12345678.2012-02-23-02.LpLfuShd.gz
s3://myawsbucket/cf1/2012-02-23-01.lzo
s3://myawsbucket/cf1/2012-02-23-02.lzo
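The effect of the `--groupBy` pattern can be previewed locally before running a job. The sketch below applies the same regular expression with Python's `re` module to the five input paths listed above and groups them by the captured date-hour string. It only illustrates the grouping idea; it does not call S3DistCp, and `group_files` is a hypothetical helper.

```python
import re
from collections import defaultdict

# The pattern from the --groupBy argument above; files whose captured
# group is identical are candidates for being combined into one output.
GROUP_BY = re.compile(r".*XABCD12345678.([0-9]+-[0-9]+-[0-9]+-[0-9]+).*")

INPUT_FILES = [
    "s3://myawsbucket/cf/XABCD12345678.2012-02-23-01.HLUS3JKx.gz",
    "s3://myawsbucket/cf/XABCD12345678.2012-02-23-01.I9CNAZrg.gz",
    "s3://myawsbucket/cf/XABCD12345678.2012-02-23-02.YRRwERSA.gz",
    "s3://myawsbucket/cf/XABCD12345678.2012-02-23-02.dshVLXFE.gz",
    "s3://myawsbucket/cf/XABCD12345678.2012-02-23-02.LpLfuShd.gz",
]


def group_files(paths):
    """Group paths by the first capture group of GROUP_BY."""
    groups = defaultdict(list)
    for path in paths:
        match = GROUP_BY.match(path)
        if match:  # paths that do not match the pattern are skipped
            groups[match.group(1)].append(path)
    return dict(groups)


groups = group_files(INPUT_FILES)
for key, members in sorted(groups.items()):
    print(key, len(members))
```

The two groups, `2012-02-23-01` with two files and `2012-02-23-02` with three, correspond to the two `.lzo` outputs listed above.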
[Figure: data moving from a corporate data center to Amazon S3 and Amazon EC2 (Availability Zone, AWS Region) over AWS Import/Export, AWS Direct Connect, or the Internet]
•  Using AWS for Multi-instance, Multi-part Uploads
•  Moving Big Data into the Cloud with Tsunami UDP
•  Moving Big Data Into The Cloud with ExpeDat Gateway for Amazon S3
Amazon Kinesis
[Figure: Producer 1 through Producer N write records with Key = Red, Green, Blue, and Violet into Shard or Partition 1 and Shard or Partition 2. Consumer 1 reports Count of Red = 4 and Count of Violet = 4; Consumer 2 reports Count of Blue = 4 and Count of Green = 4.]
Amazon Kinesis
Managed service for streaming data ingestion and processing
•  Millions of sources producing 100s of terabytes per hour
•  Front end: authentication, authorization
•  Durable, highly consistent storage replicates data across three data centers (Availability Zones)
•  Ordered stream of events supports multiple readers
•  Real-time dashboards and alarms
•  Machine learning algorithms or sliding window analytics
•  Aggregate analysis in Hadoop or a data warehouse
•  Aggregate and archive to S3
•  Inexpensive: $0.028 per million puts
Sending & Reading Data from Kinesis Streams
Sending
•  HTTP Post
•  AWS SDK
•  AWS Mobile SDK
•  LOG4J
•  Flume
•  Fluentd
Consuming
•  Get* APIs
•  Kinesis Client Library + Connector Library
•  Apache Storm
•  Amazon Elastic MapReduce
Kinesis Stream & Shards
[Figure: producers send 2 KB * 500 TPS = 1000 KB/s to each of two shards (1 MB/s per shard); the Payment Processing Application reads 1 MB/s from each shard]
Theoretical minimum of 2 shards required
[Figure: the same two shards read by the Payment Processing Application, the Fraud Detection Application, and the Recommendation Engine Application]
Egress bottleneck
MergeShards: takes two adjacent shards in a stream and combines them into a single shard to reduce the stream's capacity
X-Amz-Target: Kinesis_20131202.MergeShards
{
  "StreamName": "exampleStreamName",
  "ShardToMerge": "shardId-000000000000",
  "AdjacentShardToMerge": "shardId-000000000001"
}
SplitShard: splits a shard into two new shards in the stream, to increase the stream's capacity
X-Amz-Target: Kinesis_20131202.SplitShard
{
  "StreamName": "exampleStreamName",
  "ShardToSplit": "shardId-000000000000",
  "NewStartingHashKey": "10"
}
•  Both are online operations
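When a shard is split, `NewStartingHashKey` decides where the parent's hash key range is cut. The value "10" in the request above is a placeholder. A common choice is the midpoint of the parent's range, so that both children receive about half of the key space. The helper below is a hypothetical sketch that assumes the range bounds have already been read from the stream description; it relies on Kinesis hash keys being 128-bit unsigned integers passed as decimal strings.

```python
MAX_HASH_KEY = 2**128 - 1  # Kinesis hash keys are 128-bit unsigned integers


def even_split_starting_hash_key(start: int, end: int) -> str:
    """Return a NewStartingHashKey that halves the range [start, end].

    The first child keeps [start, key - 1]; the second gets [key, end].
    """
    if not 0 <= start < end <= MAX_HASH_KEY:
        raise ValueError("expected 0 <= start < end <= 2**128 - 1")
    return str((start + end + 1) // 2)


# Splitting a shard that covers the whole key space.
print(even_split_starting_hash_key(0, MAX_HASH_KEY))
# 170141183460469231731687303715884105728
```

An uneven cut is also valid, for example to isolate a range that receives a disproportionate share of traffic.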
Putting Data into Kinesis
Simple Put interface to store data in Kinesis
[Figure: many producers writing into Shard 1 through Shard n of a Kinesis stream]
Determine Your Partition Key Strategy
•  Kinesis as a managed buffer or a streaming map-reduce?
•  Ensure a high cardinality for Partition Keys with respect to shards, to prevent a "hot shard" problem
–  Generate Random Partition Keys
•  Streaming Map-Reduce: Leverage Partition Keys for business specific logic as applicable
–  Partition Key per billing customer, per DeviceId, per stock symbol
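The "hot shard" problem can be demonstrated offline. Kinesis maps a partition key to a 128-bit hash key with MD5, and each shard owns a contiguous slice of that key space. The sketch below assumes the shards split the key space evenly, which holds for a freshly created stream but not necessarily after resharding; `shard_index` and `traffic_per_shard` are hypothetical helpers.

```python
import hashlib
import uuid

HASH_KEY_SPACE = 2**128


def hash_key(partition_key: str) -> int:
    """128-bit hash key for a partition key (MD5, as Kinesis uses)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, byteorder="big")


def shard_index(partition_key: str, shard_count: int) -> int:
    """0-based shard index, assuming evenly split hash key ranges."""
    return hash_key(partition_key) * shard_count // HASH_KEY_SPACE


def traffic_per_shard(partition_keys, shard_count: int):
    counts = [0] * shard_count
    for key in partition_keys:
        counts[shard_index(key, shard_count)] += 1
    return counts


# One constant key: every record lands on the same shard.
print(traffic_per_shard(["myPartitionKey"] * 1000, 4))
# Random keys: records spread across all shards.
print(traffic_per_shard([str(uuid.uuid4()) for _ in range(1000)], 4))
```

Business keys such as a customer or device ID sit between these two extremes, so it is worth running real key samples through a check like this before fixing the shard count.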
Provisioning Adequate Shards
•  For ingress needs
•  Egress needs for all consuming applications: If more than 2 simultaneous consumers
•  Include head-room for catching up with data in stream in the event of application failures
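The sizing rules can be combined into one estimate. The sketch below uses the 1 MB/s per-shard write rate shown in the slides and adds two figures that are not in the slides and come from the Kinesis service limits: 1,000 records per second written per shard and 2 MB/s read per shard. It follows the slides in treating 1000 KB as 1 MB, and it leaves out head-room, which should be added on top. `shards_needed` is a hypothetical helper.

```python
import math

SHARD_INGRESS_MB_PER_S = 1.0        # shown in the slides
SHARD_INGRESS_RECORDS_PER_S = 1000  # assumption: Kinesis per-shard limit
SHARD_EGRESS_MB_PER_S = 2.0         # assumption: Kinesis per-shard limit


def shards_needed(record_size_kb: float, records_per_second: int,
                  consumer_count: int) -> int:
    """Minimum shard count, before head-room, for the given workload."""
    ingress_mb = record_size_kb * records_per_second / 1000
    # Every consuming application reads the full stream.
    egress_mb = ingress_mb * consumer_count
    return max(
        1,
        math.ceil(ingress_mb / SHARD_INGRESS_MB_PER_S),
        math.ceil(records_per_second / SHARD_INGRESS_RECORDS_PER_S),
        math.ceil(egress_mb / SHARD_EGRESS_MB_PER_S),
    )


# The slides' workload: 2 KB records at 2 * 500 TPS.
print(shards_needed(2, 1000, consumer_count=1))  # 2
print(shards_needed(2, 1000, consumer_count=3))  # 3
```

With one or two consumers the ingress rate sets the shard count; from the third consumer on, egress becomes the limit, which matches the bottleneck described for the three-application case.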
Pre-Batch before Puts for better efficiency
# KINESIS appender
log4j.logger.KinesisLogger=INFO, KINESIS
log4j.additivity.KinesisLogger=false
log4j.appender.KINESIS=com.amazonaws.services.kinesis.log4j.KinesisAppender
# DO NOT use a trailing %n unless you want a newline to be
# transmitted to KINESIS after every message
log4j.appender.KINESIS.layout=org.apache.log4j.PatternLayout
log4j.appender.KINESIS.layout.ConversionPattern=%m
# mandatory properties for KINESIS appender
log4j.appender.KINESIS.streamName=testStream
# optional, defaults to UTF-8
log4j.appender.KINESIS.encoding=UTF-8
# optional, defaults to 3
log4j.appender.KINESIS.maxRetries=3
# optional, defaults to 2000
log4j.appender.KINESIS.bufferSize=1000
# optional, defaults to 20
log4j.appender.KINESIS.threadCount=20
# optional, defaults to 30 seconds
log4j.appender.KINESIS.shutdownTimeout=30
https://github.com/awslabs/kinesis-log4j-appender
Pre-Batch before Puts for better efficiency
•  Retry if rise in input rate is temporary
•  Reshard to increase number of shards
•  Monitor CloudWatch metrics: PutRecord.Bytes and GetRecords.Bytes metrics keep track of shard usage
Metric              Units
PutRecord.Bytes     Bytes
PutRecord.Latency   Milliseconds
PutRecord.Success   Count
•  Keep track of your metrics
•  Log hashkey values generated by your partition keys
•  Log Shard-Ids
•  Determine which shard receives the most (hashkey) traffic
// Set the partition key on the request before the put
putRecordRequest.setPartitionKey(String.format("myPartitionKey"));
// After the put returns, read back the shard that stored the record
String shardId = putRecordResult.getShardId();
Options:
•  stream-name - The name of the stream to be scaled
•  scaling-action - The action to be taken to scale. Must be one of "scaleUp", "scaleDown" or "resize"
•  count - Number of shards by which to absolutely scale up or down, or resize to, or:
•  pct - Percentage of the existing number of shards by which to scale up or down
https://github.com/awslabs/amazon-kinesis-scaling-utils
many small files billion during peak
total size 1.5 TB per month
Request rate (writes/sec)   Object size (bytes)   Total size (GB/month)   Objects per month
300                         2,048                 1,483                   777,600,000
Cost Conscious Design
Example: Should I use Amazon S3 or Amazon DynamoDB?
Request rate (writes/sec)   Object size (bytes)   Total size (GB/month)   Objects per month
300                         2,048                 1,483                   777,600,000
Amazon S3 or
Amazon
DynamoDB?
Scenario     Request rate (writes/sec)   Object size (bytes)   Total size (GB/month)   Objects per month
Scenario 1   300                         2,048                 1,483                   777,600,000
Scenario 2   300                         32,768                23,730                  777,600,000
[Figure: one scenario is marked "use Amazon S3" and the other "use Amazon DynamoDB"]
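The monthly figures in the scenario table follow directly from the request rate and object size. The sketch below reproduces them, assuming a 30-day month and binary gigabytes (2^30 bytes), which is what the table's numbers imply; `monthly_volume` is a hypothetical helper. Pricing is left out because the slides do not state the rates used.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # a 30-day month, as the table implies


def monthly_volume(writes_per_second: int, object_size_bytes: int):
    """Return (objects per month, total size in GB per month)."""
    objects = writes_per_second * SECONDS_PER_MONTH
    total_gb = objects * object_size_bytes / 2**30
    return objects, total_gb


for name, size in (("Scenario 1", 2048), ("Scenario 2", 32768)):
    objects, total_gb = monthly_volume(300, size)
    print(name, f"{objects:,}", int(total_gb))
# Scenario 1 777,600,000 1483
# Scenario 2 777,600,000 23730
```

Both scenarios issue the same number of requests; only the stored volume changes, so the choice of store depends on how each service prices requests compared with storage.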
               Hot         Warm      Cold
Volume         MB–GB       GB–TB     PB
Item size      B–KB        KB–MB     KB–TB
Latency        ms          ms, sec   min, hrs
Durability     Low–High    High      Very High
Request rate   Very High   High      Low
Cost/GB        $$-$        $-¢¢      ¢
[Figure: Amazon RDS, Amazon Redshift, Amazon Glacier, Amazon DynamoDB, Amazon Kinesis, and Amazon S3 placed by structure (low to high) against request rate (high to low), cost/GB (high to low), latency (low to high), and data volume (low to high)]
November 14, 2014 | Las Vegas, NV
Valentino Volonghi, CTO, AdRoll
Siva Raghupathy, Principal Solutions Architect, AWS
60 billion requests/day
We must stay up
1% downtime = >$1M
No infinitely deep pockets
100 ms max latency
Paris-New York: ~6000 km
Speed of light in fiber: 200,000 km/s
RTT latency without hops and copper: 60 ms
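The 60 ms figure is the round trip at the speed of light in fiber, with no routing hops or copper segments. A minimal check, where `round_trip_ms` is a hypothetical helper:

```python
def round_trip_ms(distance_km: float, speed_km_per_s: float) -> float:
    """Round-trip time in milliseconds over a straight link."""
    return 2 * distance_km * 1000 / speed_km_per_s


print(round_trip_ms(6000, 200_000))  # 60.0
```

Against a 100 ms latency ceiling this leaves about 40 ms for everything else, which is the argument for serving requests from a region close to the caller.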
Data Collection
•  Amazon EC2, Elastic Load Balancing, Auto Scaling
Store
•  Amazon S3 + Amazon Kinesis
Global Distribution
•  Apache Storm on Amazon EC2
Bid Store
•  DynamoDB
Bidding
•  Amazon EC2, Elastic Load Balancing, Auto Scaling
[Figure: Data Collection and Bidding tiers. Ad Network 1 and Ad Network 2 reach Auto Scaling groups through Elastic Load Balancing; the groups run application versions v1 through V4; Apache Storm, Amazon S3, and Amazon Kinesis sit between the tiers; both tiers read from and write to DynamoDB.]
Data Collection = Batch Layer
Bidding = Speed Layer
Data Collection, Data Storage, Global Distribution, Bid Storage, Bidding
[Figure: US East region with the Data Collection and Bidding tiers, each an Auto Scaling group of instances across two Availability Zones behind Elastic Load Balancing, together with Amazon S3, Amazon Kinesis, Apache Storm, and DynamoDB]
AWS Data Collection & Storage

Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

More from Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Recently uploaded

Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 

Recently uploaded (20)

Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 

AWS Data Collection & Storage

  • 1. April 21, 2015 Seattle, WA Big Data Collection & Storage
  • 7. Amazon DynamoDB •  Managed NoSQL database service •  Supports both document and key-value data models •  Highly scalable – no table size or throughput limits •  Consistent, single-digit millisecond latency at any scale •  Highly available—3x replication •  Simple and powerful API
  • 8. DynamoDB Table: a table contains items, and items contain attributes. Hash key: mandatory; key-value access pattern; determines data distribution. Range key: optional; models 1:N relationships; enables rich query capabilities: all items for a hash key; ==, <, >, >=, <=; “begins with”; “between”; sorted results; counts; top/bottom N values; paged responses
  • 10. Data types String (S) Number (N) Binary (B) String Set (SS) Number Set (NS) Binary Set (BS) Boolean (BOOL) Null (NULL) List (L) Map (M) Used for storing nested JSON documents
  • 11. Hash table •  Hash key uniquely identifies an item •  Hash key is used for building an unordered hash index •  Table can be partitioned for scale. [Figure: key space 00 to FF with partition boundaries at 54/55 and A9/AA. Id = 1 (Name = Jim) has Hash (1) = 7B; Id = 2 (Name = Andy, Dept = Engg) has Hash (2) = 48; Id = 3 (Name = Kim, Dept = Ops) has Hash (3) = CD.]
  • 12. Partitions are three-way replicated. [Figure: Partition 1, Partition 2, ... Partition N, each stored on Replica 1, Replica 2, and Replica 3; the example items (Id = 1, Name = Jim; Id = 2, Name = Andy, Dept = Engg; Id = 3, Name = Kim, Dept = Ops) appear on all three replicas.]
  • 13. Hash-range table •  Hash key and range key together uniquely identify an item •  Within the unordered hash index, data is sorted by the range key •  No limit on the number of items (∞) per hash key –  Except if you have local secondary indexes. [Figure: key space 00:0 to FF:∞ split into Partition 1 (00:0 to 54:∞), Partition 2 (55 to A9:∞), and Partition 3 (AA to FF:∞). Partition 1 holds Hash (2) = 48: Customer# = 2, Order# = 10, Item = Pen; Customer# = 2, Order# = 11, Item = Shoes. Partition 2 holds Hash (1) = 7B: Customer# = 1, Order# = 10, Item = Toy; Customer# = 1, Order# = 11, Item = Boots. Partition 3 holds Hash (3) = CD: Customer# = 3, Order# = 10, Item = Book; Customer# = 3, Order# = 11, Item = Paper.]
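The mapping from a hash value to a partition can be sketched in a few lines of Python. This is only an illustration of range-based placement using the one-byte hash values and the boundaries (54/55 and A9/AA) shown in the figure; the real DynamoDB hash function is internal to the service, and `partition_for_hash` is a hypothetical helper name.

```python
from bisect import bisect_left

# Inclusive upper bound of each partition's slice of the 00-FF key space,
# read off the figure: 00-54, 55-A9, AA-FF.
PARTITION_UPPER_BOUNDS = [0x54, 0xA9, 0xFF]


def partition_for_hash(hash_value: int) -> int:
    """Return the 1-based partition number that owns a one-byte hash value."""
    if not 0x00 <= hash_value <= 0xFF:
        raise ValueError("hash value must fit in one byte")
    return bisect_left(PARTITION_UPPER_BOUNDS, hash_value) + 1


# Hash values for Id = 1, 2, 3 as given on the slide.
slide_hashes = {1: 0x7B, 2: 0x48, 3: 0xCD}
placements = {item_id: partition_for_hash(h) for item_id, h in slide_hashes.items()}
```

With the slide's values, each of the three items lands in a different partition, which is the even spread a well-chosen hash key is meant to produce.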
  • 14. DynamoDB table examples
      case class CameraRecord(
        cameraId: Int, // hash key
        ownerId: Int,
        subscribers: Set[Int],
        hoursOfRecording: Int,
        ...
      )
      case class Cuepoint(
        cameraId: Int, // hash key
        timestamp: Long, // range key
        `type`: String, // backticks are needed because `type` is a reserved word in Scala
        ...
      )
    [Figure: table layout with HashKey, RangeKey, and Value columns; sample rows "Key, Segment, 1234554343254" and "Key, Segment1, 1231231433235"]
  • 15. Local Secondary Index (LSI): alternate range key + same hash key; index and table data are co-located (same partition). 10 GB max per hash key, i.e. LSIs limit the # of range keys!
  • 16. Global Secondary Index: any attribute indexed as new hash and/or range key. RCUs/WCUs provisioned separately for GSIs. Online indexing
  • 17. LSI or GSI? •  LSI can be modeled as a GSI •  If data size in an item collection > 10 GB, use GSI •  If eventual consistency is okay for your scenario, use GSI!
  • 18. DynamoDB Streams •  Stream of updates to a table •  Asynchronous •  Exactly once •  Strictly ordered –  Per item •  Highly durable •  Scale with table •  24-hour lifetime •  Sub-second latency
  • 19. DynamoDB Streams and AWS Lambda
  • 21. Scaling •  Throughput –  Provision any amount of throughput to a table •  Size –  Add any number of items to a table •  Max item size is 400 KB •  LSIs limit the number of range keys due to 10 GB limit •  Scaling is achieved through partitioning
  • 22. Throughput •  Provisioned at the table level –  One write capacity unit (WCU) covers one write per second of up to 1 KB –  One read capacity unit (RCU) covers one read per second of up to 4 KB •  RCUs measure strictly consistent reads •  Eventually consistent reads cost 1/2 of consistent reads •  Read and write throughput limits are independent
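As a rough illustration of the unit sizes above, the following Python sketch estimates the capacity a workload would need. It assumes item sizes are rounded up to the next 1 KB (writes) or 4 KB (reads) before counting units; the function names are hypothetical and not part of any AWS SDK.

```python
import math


def write_capacity_units(item_size_bytes: int, writes_per_second: float) -> int:
    """WCUs needed: each write costs one unit per started 1 KB of item size."""
    units_per_write = math.ceil(item_size_bytes / 1024)
    return math.ceil(units_per_write * writes_per_second)


def read_capacity_units(item_size_bytes: int, reads_per_second: float,
                        eventually_consistent: bool = False) -> int:
    """RCUs needed: each consistent read costs one unit per started 4 KB."""
    units_per_read = math.ceil(item_size_bytes / 4096)
    total = units_per_read * reads_per_second
    if eventually_consistent:
        total /= 2  # eventually consistent reads cost half as much
    return math.ceil(total)
```

For example, 300 writes per second of 2,048-byte items works out to 600 WCUs under these assumptions, while 100 eventually consistent reads per second of the same items needs 50 RCUs.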
  • 23. Partitioning example. Table size = 8 GB, RCUs = 5000, WCUs = 500.
      $$\#\text{ of partitions (for size)} = \left\lceil \frac{8\ \text{GB}}{10\ \text{GB}} \right\rceil = \lceil 0.8 \rceil = 1$$
      $$\#\text{ of partitions (for throughput)} = \left\lceil \frac{5000_{\text{RCU}}}{3000\ \text{RCU}} + \frac{500_{\text{WCU}}}{1000\ \text{WCU}} \right\rceil = \lceil 2.17 \rceil = 3$$
      $$\#\text{ of partitions (total)} = \max(1\ \text{for size},\ 3\ \text{for throughput}) = 3$$
    RCUs per partition = 5000/3 = 1666.67. WCUs per partition = 500/3 = 166.67. Data/partition = 8/3 = 2.67 GB. RCUs and WCUs are uniformly spread across partitions
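The arithmetic on this slide can be reproduced with a short Python sketch. The divisors (10 GB, 3000 RCUs, and 1000 WCUs per partition) are the ones used on the slide; the function names and the floor of one partition are assumptions made for illustration.

```python
import math


def partitions_for_size(table_size_gb: float) -> int:
    """Partitions needed so that none holds more than 10 GB."""
    return max(1, math.ceil(table_size_gb / 10))


def partitions_for_throughput(rcus: int, wcus: int) -> int:
    """Partitions needed at 3000 RCUs and 1000 WCUs per partition."""
    return max(1, math.ceil(rcus / 3000 + wcus / 1000))


def partition_count(table_size_gb: float, rcus: int, wcus: int) -> int:
    """Total partitions: the larger of the size and throughput requirements."""
    return max(partitions_for_size(table_size_gb),
               partitions_for_throughput(rcus, wcus))


# The slide's example: 8 GB, 5000 RCUs, 500 WCUs.
n = partition_count(8, 5000, 500)
rcus_per_partition = round(5000 / n, 2)
wcus_per_partition = round(500 / n, 2)
```

Because provisioned throughput is divided evenly, a table that is partitioned for throughput gives each hash key access to only a fraction of the table-level RCUs and WCUs.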
  • 25. Amazon DynamoDB Best Practices •  Keep item size small •  Store metadata in Amazon DynamoDB and large blobs in Amazon S3 •  Use a table with a hash key for extremely high scale •  Use table per day, week, month etc. for storing time series data •  Use conditional updates for de-duping •  Use hash-range table and/or GSI to model –  1:N, M:N relationships •  Avoid hot keys and hot partitions. [Figure: time-series tables Events_table_2012, Events_table_2012_05_week1, Events_table_2012_05_week2, and Events_table_2012_05_week3, each with columns Event_id (hash key), Timestamp (range key), Attribute1, …, Attribute N.]
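The table-per-week pattern needs a rule for picking the table name from an event date. A minimal sketch follows; the naming scheme copies the figure (`Events_table_2012_05_week1`), while the week-of-month rule (days 1 to 7 are week 1, and so on) and the helper name are assumptions.

```python
from datetime import date


def weekly_table_name(day: date, prefix: str = "Events_table") -> str:
    """Build a per-week table name such as Events_table_2012_05_week1."""
    week_of_month = (day.day - 1) // 7 + 1  # assumed rule: days 1-7 -> week 1
    return f"{prefix}_{day.year}_{day.month:02d}_week{week_of_month}"
```

Writers route each event to the table for its date, and old tables can be dropped or have their throughput lowered once they go cold.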
  • 27. objects buckets •  Designed for 99.999999999% durability
  • 29. S3 Events: notifications can be delivered to an SNS topic, an SQS queue, or a Lambda function (Foo() { … })
  • 35. File •  Compress data files –  Reduces bandwidth •  Avoid small files –  Hadoop mappers proportional to number of files –  S3 PUT cost quickly adds up
      Algorithm   % Space Remaining   Encoding Speed   Decoding Speed
      GZIP        13%                 21 MB/s          118 MB/s
      LZO         20%                 135 MB/s         410 MB/s
      Snappy      22%                 172 MB/s         409 MB/s
  • 36. •  Use S3DistCp to combine smaller files together •  S3DistCp takes a pattern and target path to combine smaller input files into larger ones: "--groupBy,.*XABCD12345678.([0-9]+-[0-9]+-[0-9]+-[0-9]+).*" •  Supply a target size and compression codec: "--targetSize,128","--outputCodec,lzo"
      Input files:
        s3://myawsbucket/cf/XABCD12345678.2012-02-23-01.HLUS3JKx.gz
        s3://myawsbucket/cf/XABCD12345678.2012-02-23-01.I9CNAZrg.gz
        s3://myawsbucket/cf/XABCD12345678.2012-02-23-02.YRRwERSA.gz
        s3://myawsbucket/cf/XABCD12345678.2012-02-23-02.dshVLXFE.gz
        s3://myawsbucket/cf/XABCD12345678.2012-02-23-02.LpLfuShd.gz
      Output files:
        s3://myawsbucket/cf1/2012-02-23-01.lzo
        s3://myawsbucket/cf1/2012-02-23-02.lzo
  • 37. [Figure: data moves from a corporate data center to Amazon S3 and Amazon EC2 (in an Availability Zone of an AWS Region) via AWS Import/Export, AWS Direct Connect, or the Internet.]
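To see how the `--groupBy` pattern sorts the five input files into two outputs, the grouping step can be imitated with Python's `re` module. This only mimics the grouping logic (files whose capture group matches are combined); it is not how S3DistCp itself is invoked, and `group_files` is a hypothetical helper.

```python
import re
from collections import defaultdict

# The pattern from the slide; the capture group becomes the output name.
GROUP_BY = re.compile(r".*XABCD12345678.([0-9]+-[0-9]+-[0-9]+-[0-9]+).*")

input_files = [
    "s3://myawsbucket/cf/XABCD12345678.2012-02-23-01.HLUS3JKx.gz",
    "s3://myawsbucket/cf/XABCD12345678.2012-02-23-01.I9CNAZrg.gz",
    "s3://myawsbucket/cf/XABCD12345678.2012-02-23-02.YRRwERSA.gz",
    "s3://myawsbucket/cf/XABCD12345678.2012-02-23-02.dshVLXFE.gz",
    "s3://myawsbucket/cf/XABCD12345678.2012-02-23-02.LpLfuShd.gz",
]


def group_files(paths):
    """Group paths by the value captured by the groupBy pattern."""
    groups = defaultdict(list)
    for path in paths:
        match = GROUP_BY.fullmatch(path)
        if match:  # paths that do not match the pattern are skipped
            groups[match.group(1)].append(path)
    return dict(groups)


groups = group_files(input_files)
```

The two keys, `2012-02-23-01` and `2012-02-23-02`, correspond to the two `.lzo` output files listed on the slide.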
  • 39. Using AWS for Multi-instance, Multi-part Uploads; Moving Big Data into the Cloud with Tsunami UDP; Moving Big Data into the Cloud with ExpeDat Gateway for Amazon S3
  • 40. Amazon Kinesis. [Figure: Producers 1, 2, 3, ... N put numbered records with keys Red, Green, Blue, and Violet into Shard or Partition 1 and Shard or Partition 2. Consumer 1: Count of Red = 4, Count of Violet = 4. Consumer 2: Count of Blue = 4, Count of Green = 4.]
  • 41. Amazon Kinesis: managed service for streaming data ingestion and processing. Millions of sources producing 100s of terabytes per hour pass through a front end (authentication, authorization) into durable, highly consistent storage that replicates data across three data centers (Availability Zones). The ordered stream of events supports multiple readers: aggregate and archive to S3; real-time dashboards and alarms; machine learning algorithms or sliding window analytics; aggregate analysis in Hadoop or a data warehouse. Inexpensive: $0.028 per million puts
  • 42. Sending & Reading Data from Kinesis Streams. Sending: HTTP POST, AWS SDK, AWS Mobile SDK, LOG4J, Flume, Fluentd. Consuming: Get* APIs, Kinesis Client Library + Connector Library, Apache Storm, Amazon Elastic MapReduce
  • 44. [Figure: two producers each send 2 KB * 500 TPS = 1000 KB/s; each of the two shards takes in 1 MB/s and delivers 1 MB/s to the Payment Processing Application.] Theoretical minimum of 2 shards required
  • 45. [Figure: the same two producers (2 KB * 500 TPS = 1000 KB/s each) and two shards (1 MB/s in each), now read by three consumers: the Payment Processing Application, the Fraud Detection Application, and the Recommendation Engine Application.] Egress bottleneck
  • 46. MergeShards: takes two adjacent shards in a stream and combines them into a single shard to reduce the stream's capacity. X-Amz-Target: Kinesis_20131202.MergeShards { "StreamName": "exampleStreamName", "ShardToMerge": "shardId-000000000000", "AdjacentShardToMerge": "shardId-000000000001" } SplitShard: splits a shard into two new shards in the stream, to increase the stream's capacity. X-Amz-Target: Kinesis_20131202.SplitShard { "StreamName": "exampleStreamName", "ShardToSplit": "shardId-000000000000", "NewStartingHashKey": "10" } Both are online operations
  • 47. Putting Data into Kinesis: simple Put interface to store data in Kinesis. [Figure: many producers write to Shard 1, Shard 2, Shard 3, Shard 4, ... Shard n.]
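The shard arithmetic behind the two slides above can be sketched as follows. The 1 MB/s ingress figure per shard is taken from the slides; the 2 MB/s read limit per shard is not stated on the slides and is assumed here from the Kinesis service limits, and the sketch also assumes every consuming application reads the full stream.

```python
import math

INGRESS_KB_PER_SHARD = 1000  # 1 MB/s write limit per shard, as on the slides
EGRESS_KB_PER_SHARD = 2000   # assumption: 2 MB/s read limit per shard


def shards_needed(ingress_kb_per_s: float, consumer_apps: int) -> int:
    """Shards required to satisfy both the write rate and all readers."""
    for_ingress = math.ceil(ingress_kb_per_s / INGRESS_KB_PER_SHARD)
    # Each consuming application is assumed to read every record once.
    for_egress = math.ceil(ingress_kb_per_s * consumer_apps / EGRESS_KB_PER_SHARD)
    return max(1, for_ingress, for_egress)


# Two producers, each sending 2 KB records at 500 TPS.
total_ingress_kb = 2 * (2 * 500)
```

With one consumer the ingress side dominates and two shards suffice; with three consumers the egress side dominates, which is the bottleneck the second slide points out.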
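For an even split, `NewStartingHashKey` is usually chosen near the midpoint of the parent shard's hash key range. The sketch below builds such a SplitShard request body as a plain dictionary; it assumes the parent shard covers the whole 128-bit hash key space, and `split_request` is a hypothetical helper rather than an SDK call.

```python
MAX_HASH_KEY = 2**128 - 1  # Kinesis hash keys are 128-bit integers


def split_request(stream_name: str, shard_id: str,
                  start_hash_key: int = 0,
                  end_hash_key: int = MAX_HASH_KEY) -> dict:
    """Build a SplitShard body that splits the shard's range roughly in half."""
    midpoint = (start_hash_key + end_hash_key) // 2
    return {
        "StreamName": stream_name,
        "ShardToSplit": shard_id,
        # The API expects the hash key as a decimal string.
        "NewStartingHashKey": str(midpoint),
    }


request = split_request("exampleStreamName", "shardId-000000000000")
```

The second child shard starts at the chosen key, so a midpoint gives the two children about the same share of partition keys.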
  • 50. Determine Your Partition Key Strategy •  Kinesis as a managed buffer or a streaming map-reduce? •  Ensure a high cardinality for Partition Keys with respect to shards, to prevent a “hot shard” problem –  Generate Random Partition Keys •  Streaming Map-Reduce: Leverage Partition Keys for business specific logic as applicable –  Partition Key per billing customer, per DeviceId, per stock symbol
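The effect of partition key cardinality can be illustrated with a small simulation. Kinesis maps each partition key to a 128-bit hash key with MD5; the sketch below additionally assumes the shards divide that key space evenly and that the digest is read as a big-endian integer, and `shard_index` is a hypothetical helper.

```python
import hashlib


def hash_key(partition_key: str) -> int:
    """Map a partition key to a 128-bit integer via MD5."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def shard_index(partition_key: str, shard_count: int) -> int:
    """Index of the shard owning the key, assuming evenly split hash ranges."""
    return hash_key(partition_key) * shard_count // 2**128


# Many distinct keys spread over all shards; a single key hits one shard.
many_keys = {shard_index(f"device-{i}", 4) for i in range(10_000)}
one_key = {shard_index("customer-42", 4) for _ in range(10_000)}
```

A single hot key always lands on the same shard no matter how many shards exist, which is why resharding alone does not fix a low-cardinality key design.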
  • 51. Provisioning Adequate Shards •  For ingress needs •  Egress needs for all consuming applications: If more than 2 simultaneous consumers •  Include head-room for catching up with data in stream in the event of application failures
  • 52. Pre-Batch before Puts for better efficiency
  • 53. Pre-Batch before Puts for better efficiency (https://github.com/awslabs/kinesis-log4j-appender)
      # KINESIS appender
      log4j.logger.KinesisLogger=INFO, KINESIS
      log4j.additivity.KinesisLogger=false
      log4j.appender.KINESIS=com.amazonaws.services.kinesis.log4j.KinesisAppender
      # DO NOT use a trailing %n unless you want a newline to be transmitted to KINESIS after every message
      log4j.appender.KINESIS.layout=org.apache.log4j.PatternLayout
      log4j.appender.KINESIS.layout.ConversionPattern=%m
      # mandatory properties for KINESIS appender
      log4j.appender.KINESIS.streamName=testStream
      # optional, defaults to UTF-8
      log4j.appender.KINESIS.encoding=UTF-8
      # optional, defaults to 3
      log4j.appender.KINESIS.maxRetries=3
      # optional, defaults to 2000
      log4j.appender.KINESIS.bufferSize=1000
      # optional, defaults to 20
      log4j.appender.KINESIS.threadCount=20
      # optional, defaults to 30 seconds
      log4j.appender.KINESIS.shutdownTimeout=30
  • 54. •  Retry if rise in input rate is temporary •  Reshard to increase number of shards •  Monitor CloudWatch metrics: PutRecord.Bytes and GetRecords.Bytes keep track of shard usage •  Keep track of your metrics •  Log hash key values generated by your partition keys •  Log shard IDs •  Determine which shards receive the most (hash key) traffic
      Metric               Units
      PutRecord.Bytes      Bytes
      PutRecord.Latency    Milliseconds
      PutRecord.Success    Count
      String shardId = putRecordResult.getShardId();
      putRecordRequest.setPartitionKey(String.format("myPartitionKey"));
  • 55. Options: •  stream-name - The name of the Stream to be scaled •  scaling-action - The action to be taken to scale. Must be one of "scaleUp", "scaleDown" or "resize" •  count - Number of shards by which to absolutely scale up or down, or resize to, or: •  pct - Percentage of the existing number of shards by which to scale up or down. https://github.com/awslabs/amazon-kinesis-scaling-utils
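Pre-batching on the producer side amounts to buffering records and sending them in groups rather than one call per record. A minimal, service-agnostic sketch is shown below; the default batch size of 500 is an assumption, and the actual send call is left out because it depends on the client library in use.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batches(records: Iterable[T], max_records: int = 500) -> Iterator[List[T]]:
    """Yield consecutive lists of at most max_records records."""
    if max_records < 1:
        raise ValueError("max_records must be at least 1")
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, max_records))
        if not batch:
            return
        yield batch


# 1,200 buffered log lines become three put calls instead of 1,200.
batch_sizes = [len(b) for b in batches(range(1200))]
```

Each yielded batch would be handed to a single put request, which cuts per-request overhead at the cost of a little extra latency while the buffer fills.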
  • 57. many small files; billion during peak; total size 1.5 TB per month
      Request rate (Writes/sec)   Object size (Bytes)   Total size (GB/month)   Objects per month
      300                         2048                  1483                    777,600,000
  • 58. Cost Conscious Design Example: Should I use Amazon S3 or Amazon DynamoDB?
  • 59. Amazon S3 or Amazon DynamoDB?
      Request rate (Writes/sec)   Object size (Bytes)   Total size (GB/month)   Objects per month
      300                         2,048                 1,483                   777,600,000
  • 60. [Figure callouts: "use Amazon S3", "use Amazon DynamoDB"]
                   Request rate (Writes/sec)   Object size (Bytes)   Total size (GB/month)   Objects per month
      Scenario 1   300                         2,048                 1,483                   777,600,000
      Scenario 2   300                         32,768                23,730                  777,600,000
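The figures in the scenario table follow directly from the request rate and object size. The sketch below reproduces them, assuming a 30-day month and 1 GB = 2^30 bytes (the combination that matches the numbers on the slides); it does not attempt the price comparison itself, since no prices appear in the transcript.

```python
SECONDS_PER_DAY = 86_400


def monthly_volume(writes_per_second: int, object_size_bytes: int, days: int = 30):
    """Return (objects per month, total size in GB) for a steady write rate."""
    objects = writes_per_second * SECONDS_PER_DAY * days
    total_gb = objects * object_size_bytes / 2**30
    return objects, total_gb


scenario_1 = monthly_volume(300, 2_048)
scenario_2 = monthly_volume(300, 32_768)
```

Both scenarios write the same number of objects; only the object size, and therefore the stored volume, differs by a factor of 16.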
  • 62. Data temperature
                     Hot         Warm      Cold
      Volume         MB–GB       GB–TB     PB
      Item size      B–KB        KB–MB     KB–TB
      Latency        ms          ms, sec   min, hrs
      Durability     Low–High    High      Very High
      Request rate   Very High   High      Low
      Cost/GB        $$-$        $-¢¢      ¢
  • 63. [Figure: Amazon Kinesis, Amazon DynamoDB, Amazon RDS, Amazon Redshift, Amazon S3, and Amazon Glacier placed along five axes: Request rate (High to Low), Cost/GB (High to Low), Latency (Low to High), Data volume (Low to High), Structure (Low to High).]
  • 65. November 14, 2014 | Las Vegas, NV Valentino Volonghi, CTO, AdRoll Siva Raghupathy, Principal Solutions Architect, AWS
  • 70. Paris-New York: ~6000 km. Speed of light in fiber: 200,000 km/s. RTT latency without hops and copper: 60 ms (c-RTT)
  • 73. Data Collection: Amazon EC2, Elastic Load Balancing, Auto Scaling. Store: Amazon S3 + Amazon Kinesis. Global Distribution: Apache Storm on Amazon EC2. Bid Store: DynamoDB. Bidding: Amazon EC2, Elastic Load Balancing, Auto Scaling. [Figure: architecture diagram with Ad Network 1 and Ad Network 2; data collection and bidding fleets (application versions v1 to v4) in Auto Scaling groups behind Elastic Load Balancing; Apache Storm; reads and writes against DynamoDB, Amazon S3, and Amazon Kinesis.]
  • 74. Data Collection = Batch Layer; Bidding = Speed Layer. Stages: Data Collection, Data Storage, Global Distribution, Bid Storage, Bidding
  • 75. [Figure: US East region. Data Collection: Elastic Load Balancing in front of instances in an Auto Scaling group spanning two Availability Zones, feeding Amazon S3 and Amazon Kinesis. Apache Storm writes to DynamoDB. Bidding: Elastic Load Balancing in front of an Auto Scaling group spanning two Availability Zones.]
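The 60 ms figure is the round trip at the speed of light in fiber, ignoring routing hops and any copper segments. A short sketch of the calculation, using the slide's rounded distance and speed:

```python
def round_trip_ms(distance_km: float, speed_km_per_s: float = 200_000) -> float:
    """Ideal round-trip time in milliseconds over a fiber path."""
    # Multiply before dividing to keep the arithmetic exact for round inputs.
    return 2 * distance_km * 1000 / speed_km_per_s


paris_new_york_rtt = round_trip_ms(6000)
```

No amount of server tuning can go below this floor, which is the motivation for placing bidding infrastructure in regions close to the ad networks.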