1. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Fanatics ingests streaming data
to a data lake on AWS
July 12, 2018 | 10AM-11AM PDT
2.
Today’s presenters
Paul Sears, Partner Solutions Architect, Amazon Web Services
Jordan Martz, Director of Technology Solutions, Attunity
Alan Chang, Senior Product Manager, Fanatics
3.
Today’s agenda
• Driving innovation with AWS data lake solutions
• Moving data in real time with Attunity Replicate
• How Fanatics leverages data for customer insights
• Q&A/Discussion
4.
Learning objectives
• How to deploy a data lake on Amazon S3
• How to ingest real-time data to a data lake with minimal operational impact
• How to use AWS, Attunity, and Kafka to get more value from your data
5.
The data lake and AWS
Driving business value with disparate types of data
6.
Legacy data warehouses and RDBMS
• Complex to set up and manage
• Do not scale
• Take months to add new data sources
• Queries take too long
• Cost $MM upfront
7.
Should I build a data lake?
Starting by amassing "all your data" and dumping it into a large repository for the data gurus to start finding "insights" is like trying to win the lottery by buying all the tickets.
8.
Rethink how to become a data-driven business
• Business outcomes - start with the insights and actions you want to drive, then work backwards to a streamlined design
• Experimentation - start small, test many ideas, keep the good ones and scale those up, paying only for what you consume
• Agile and timely - deploy data processing infrastructure in minutes, not months. Take advantage of a rich platform of services to respond quickly to changing business needs
9.
Business case determines platform design
[Diagram: pipeline stages Data, Ingest/collect, Store, Process/analyze, Consume/visualize, Answers and insights. Callout: start here, with a business case.]
10.
Experiment and scale based on your business needs
[Diagram: pipeline stages Data, Ingest/collect, Store, Process/analyze, Consume/visualize, Answers and insights. Callout: match available data (metrics and monitoring, workflow logs, ERP transactions).]
11.
Business outcomes on a modern data architecture
Outcome 1: Modernize and consolidate
• Insights to enhance business applications and create new digital services
Outcome 2: Innovate for new revenues
• Personalization, demand forecasting, risk analysis
Outcome 3: Real-time engagement
• Interactive customer experience, event-driven automation, fraud detection
Outcome 4: Automate for expansive reach
• Automation of business processes and physical infrastructure
12.
Data lake on AWS
[Diagram: Amazon S3 at the center, surrounded by AWS Snowball, AWS Snowmobile, Amazon Kinesis Data Firehose, Amazon Kinesis Data Streams, Amazon Kinesis Video Streams, Amazon Redshift, Amazon EMR, Amazon Athena, Amazon Kinesis, Amazon Elasticsearch Service, and AI Services.]
S3
Relational and non-relational data
Schema defined during analysis
Unmatched durability and availability at EB scale
Best security, compliance, and audit capabilities
Run any analytics on the same data without movement
Scale storage and compute independently
Store data at $0.023/GB-month; Query for $0.05/GB scanned
13.
Why Amazon S3 for modern data architecture?
Durable
• Designed for 11 9s of durability
Available
• Designed for 99.99% availability
High performance
• Multiple upload
• Range GET
Scalable
• Store as much as you need
• Scale storage and compute independently
• No minimum usage commitments
Integrated
• Amazon EMR
• Amazon Redshift Spectrum
• Amazon DynamoDB
• Amazon Athena
• AWS Glue
• Amazon Kinesis
• Amazon SageMaker
Easy to use
• Simple REST API
• AWS SDKs
• Read-after-create consistency
• Event notification
• Lifecycle policies
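The Range GET feature listed under high performance lets a client fetch one object as several independent byte ranges, which can then be requested in parallel. The following is a minimal sketch of how such ranges might be planned. The helper name plan_range_requests and the part size are assumptions for illustration; the code only builds standard HTTP Range header values and does not call any AWS API.

```python
from typing import List


def plan_range_requests(object_size: int, part_size: int) -> List[str]:
    """Return HTTP Range header values that together cover the whole object.

    object_size and part_size are in bytes. Both names are hypothetical;
    a real client would pass each value as the Range header of a GET request.
    """
    if object_size < 0 or part_size <= 0:
        raise ValueError("object_size must be >= 0 and part_size must be > 0")

    ranges = []
    start = 0
    while start < object_size:
        # The end offset in an HTTP Range header is inclusive.
        end = min(start + part_size, object_size) - 1
        ranges.append(f"bytes={start}-{end}")
        start = end + 1
    return ranges


if __name__ == "__main__":
    # Split a 10-byte object into parts of at most 4 bytes.
    for header in plan_range_requests(10, 4):
        print(header)
```

Each returned value can be fetched by a separate worker, so throughput for a large object is not limited to a single connection.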
14.
Decouple storage and compute
• Legacy design was large databases or data warehouses with integrated hardware
• Big data architectures often benefit from decoupling storage and compute
15.
Attunity and AWS
Working together to help customers ingest real-time
data into the cloud
16.
AWS Partner Network (APN) Advanced Technology Partner
APN Migration and Big Data Competency Partner
5-star rating on AWS Marketplace
17. © 2017 Attunity
Automate your data lake pipeline with Attunity
Deliver timely transactional data for insights
Ensure quality and governance
• Automate to reduce cost of traditional EDW process
• Adapt data processes and technologies to changing business needs
• Provide near real-time updates of analytics-ready data sets
18.
Data lake capabilities of Attunity Replicate
Requirement | Attunity Replicate capabilities
Low latency | Propagate data and schema changes end-to-end with near-zero latency to enable analytics-ready data sets
Scale | Load from 100s of sources into data lake or Hadoop, with no agents; efficiently control large-scale environments
Flexibility | Universal, optimized platform integration for future flexibility; full load and CDC to big data components
Time to value | Automate to eliminate manual scripting, enabling non-programmers to create analytics-ready ODS/HDS
Performance | Protect production with low-touch, agentless processing of incremental updates
Efficiency | Low-impact CDC eliminates disruptive full loads and re-loads
19.
Attunity differentiators – data lake
• Continuous data ingestion with our CDC technology
• Scales for hundreds of heterogeneous sources
• Pipeline automation – ingestion and Hive merging
20.
ATTUNITY REPLICATE ARCHITECTURE
[Diagram]
Sources: Hadoop, RDBMS, data warehouse, files, mainframe
Capture: batch, CDC
In-memory processing: transfer, filter, transform
Persistent store
Delivery: incremental, batch
Targets: Hadoop, RDBMS, data warehouse, streaming, files
21.
UNIVERSAL PLATFORM COVERAGE – SOURCES
Sources from which Attunity Replicate moves data
DATABASE: Oracle, SQL Server, DB2 iSeries, DB2 z/OS, DB2 LUW, MySQL, PostgreSQL, Sybase ASE, Informix
EDW: Exadata, Teradata, Netezza, Vertica, Pivotal
MAINFRAME: DB2 for z/OS, IMS/DB, VSAM
SAP: ECC on Oracle, ECC on SQL, ECC on DB2, ECC on HANA, S4 HANA
CLOUD: AWS RDS, Amazon Aurora, Amazon Redshift
OTHER LEGACY: SQL/MP, Enscribe, RMS
FLAT FILES: Delimited (e.g., CSV, TSV)
22.
UNIVERSAL PLATFORM COVERAGE – TARGETS
Targets where Attunity Replicate moves data
DATABASE: Oracle, SQL Server, DB2 LUW, MySQL, PostgreSQL, Sybase ASE, Informix, MemSQL
EDW: Exadata, Teradata, Netezza, Vertica, Sybase IQ, Amazon Redshift, Actian Vector, SAP HANA
CLOUD: Amazon RDS, Amazon Redshift, Amazon EMR, Amazon S3, Amazon Aurora, Snowflake
DATA LAKE: Hortonworks, Cloudera, MapR, Amazon EMR, HDInsight
STREAMING: MapR-ES, Kafka
SAP: HANA
FLAT FILES: Delimited (e.g., CSV, TSV)
23.
Fanatics
Using big data to identify customer needs
24.
Fan Gear And Replica Jersey Rights To Meet Today’s On-Demand Culture
Licensing Rights Across All Major Leagues And Numerous NCAA Partners Programs
Over $2B In Sales Through A Multichannel Approach For More Than 300 Global Partners
Event Retail And In-Venue Retail Rights With Top Leagues, Teams And Global Events
25.
Multi-faceted capabilities
• E-commerce
• Tech, data, and mobile
• Hot market
• In-house manufacturing / On-demand
• In-venue event and retail
• Memorabilia and game-used
26.
More than 300 partners worldwide
27.
Diverse destinations
More than just a website
Fanatics delivers a comprehensive, multi-channel technology
and data platform. If you are a sports fan, you have likely had
a Fanatics Experience.
28.
Fanatics: mobile-first, omni-channel company emphasizing
data and technology
Data is important to execute upon this vision
Make data ubiquitous in all aspects of user experience & business operations
Why is data so important at Fanatics?
eCommerce + offline venues
Business ecosystem: dynamic nature of sports business
Sampling of data use cases
– BI insights, including real-time analytics
– User experiences (search, personalization, shipping & delivery, experimentation)
– Paid marketing optimization
– Pricing and promotions
– Merchandising optimization
– Planning and optimization (fulfillment, customer service, manufacturing)
29.
Our Challenge: low overhead and scalable data ingestion solution needed
Variety of apps: homegrown site transactional, ERP, Manufacturing
Support for DBs: SQL Server, Oracle, Postgres, MySQL
Volume: 1,000+ tables; 100+ TB over time
Key decisions:
– Leverage cloud (elasticity, agility, new systems)
– Leverage Amazon S3
Permanent data repository (data lake)
Primary data exchange mechanism across applications, regions, partners
Amazon S3 allows us the flexibility to enable multiple data processing and querying technologies
Continuous data ingestion from on-premises applications to Amazon S3
Flexibility: low overhead to add new tables, sources
30.
Our solution: AWS and Attunity
Sources
• More than five major internal systems
• 100+ SQL Server tables (current)
Micro-batches (15 min intervals)
• Normalize transaction logs
• Parquet output serialization
• Detect DDL changes
Near-real-time batches (30-60 min intervals)
• Create table snapshot from most recent data
Near future
• Install Attunity on AWS
• Take a look at Kafka connector
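The step "create table snapshot from most recent data" can be pictured as replaying change records in order and keeping only the latest version of each row. The sketch below is an illustration under stated assumptions, not the Fanatics or Attunity implementation: the record layout (seq, op, row), the key column name, and the function build_snapshot are all hypothetical.

```python
from typing import Any, Dict, Iterable, List

# Hypothetical change record: {"seq": int, "op": "insert"|"update"|"delete", "row": {...}}
Change = Dict[str, Any]


def build_snapshot(changes: Iterable[Change], key: str = "id") -> List[Dict[str, Any]]:
    """Collapse a stream of change records into the current state of a table.

    Changes are applied in ascending "seq" order. Inserts and updates replace
    the stored row for that key; deletes remove it.
    """
    latest: Dict[Any, Dict[str, Any]] = {}
    for change in sorted(changes, key=lambda c: c["seq"]):
        pk = change["row"][key]
        if change["op"] == "delete":
            latest.pop(pk, None)
        else:
            latest[pk] = change["row"]
    # Return rows ordered by primary key so the output is deterministic.
    return [latest[pk] for pk in sorted(latest)]


if __name__ == "__main__":
    micro_batch = [
        {"seq": 1, "op": "insert", "row": {"id": 1, "status": "new"}},
        {"seq": 3, "op": "delete", "row": {"id": 2}},
        {"seq": 2, "op": "insert", "row": {"id": 2, "status": "new"}},
        {"seq": 4, "op": "update", "row": {"id": 1, "status": "shipped"}},
    ]
    print(build_snapshot(micro_batch))
```

In a pipeline like the one described, the same logic would run over the accumulated micro-batch files for a table at each 30-60 minute snapshot interval.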
31.
Transitioning Attunity with AWS
32.
Future state
Real-time ingestion in AWS
• Kafka as target for Attunity
Real-time integration and reporting
• Apache Spark/Flink applications enrich
and integrate incoming data in real time
33.
Best Practices and Lessons Learned
• Attunity installation and data consumption owned by the same team
• Correctly understand the data frequency requirements
• Evaluate tradeoff between file size and file I/O operations
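The tradeoff between file size and file I/O operations can be made concrete with simple arithmetic: shorter batch intervals produce more, smaller files for the same daily volume. The sketch below is a rough estimate under stated assumptions. The function name estimate_batches and the daily volume are hypothetical; the table count and the 15-minute interval echo figures mentioned earlier in the deck; and it assumes every table writes one file per batch.

```python
import math
from dataclasses import dataclass


@dataclass
class BatchEstimate:
    files_per_day: int
    avg_file_bytes: float


def estimate_batches(daily_bytes: int, tables: int, interval_minutes: int) -> BatchEstimate:
    """Estimate file count and average file size for a given batch interval.

    Assumes (hypothetically) that each table emits exactly one file per batch
    and that the daily volume is spread evenly across tables and batches.
    """
    if tables <= 0 or interval_minutes <= 0:
        raise ValueError("tables and interval_minutes must be positive")
    batches_per_day = math.ceil(24 * 60 / interval_minutes)
    files_per_day = batches_per_day * tables
    return BatchEstimate(files_per_day, daily_bytes / files_per_day)


if __name__ == "__main__":
    daily_bytes = 96_000_000_000  # hypothetical daily change volume
    for minutes in (15, 60):
        est = estimate_batches(daily_bytes, tables=1000, interval_minutes=minutes)
        print(minutes, est.files_per_day, est.avg_file_bytes)
```

Under these assumptions, moving from 15-minute to 60-minute batches cuts the number of files by four and makes each file four times larger, at the cost of data freshness.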
34.
The takeaway
Amazon Web Services
Fanatics selected AWS to take advantage of multiple data platforms
Attunity
Fanatics used Attunity for data integration into multiple open source
and AWS platforms
35.
Q&A
36.
Next steps and further information
Get a free trial of Attunity solutions:
– http://amzn.to/2kBDgtF
Learn more about Attunity solutions on AWS:
– http://bit.ly/2kAU4kJ
Learn more about Fanatics:
– https://www.fanatics.com/