“Teaching Old Data New Tricks™”
Brian Barker • CEO • NorthBay Solutions
John Puopolo • SVP • Engineering • Eliza Corporation
Ali Khan • Director, Business Intelligence and Analytics • Scholastic
Sai Reddy Thangirala • Solutions Architect • Amazon Web Services
Agenda
• Big Data on AWS
• NorthBay
• Eliza Corporation Case Study
• Challenges Eliza Faced
• Strategic Goals
• Why a Data Lake Approach was Chosen
• Outcomes & Benefits Eliza Achieved
• Scholastic Case Study
• Challenges
• Goals
• The AWS/NorthBay Decision
• How the Initiative Unfolded
• Key Learnings
Data is Growing
1.7MB of new data will be created every second for every human being on the planet by 2020
http://www.whizpr.be/upload/medialab/21/company/Media_Presentation_2012_DigiUniverseFINAL1.pdf
58% compound annual growth rate forecasted for the Hadoop market, surpassing $1 billion by 2020
http://www.ap-institute.com/big-data-articles/big-data-what-is-hadoop-%E2%80%93-an-explanation-for-absolutely-anyone.aspx
http://www.marketanalysis.com/?p=279
<0.5% of all data is ever analyzed and used at the moment
http://www.technologyreview.com/news/514346/the-data-made-me-do-it/
Big Data Is for Everyone
The market for Big Data technologies is growing more than six times faster than the
information technology market as a whole….
…and those companies who use their data well win.
Why AWS for Big Data?
• Immediately Available
• Broad and Deep Capabilities
• Trusted and Secure
• Scalable
AWS Provides the Most Complete Platform for Big Data
It’s easy to get data to AWS, store it securely, and analyze it with the engine of your choice, without any long-term commitment or vendor lock-in.
Collect
Import/Export
Snowball
Direct Connect
VM Import/Export
Store
Amazon S3
EMR
Amazon Glacier
Amazon Redshift
DynamoDB
Aurora
Analyze
Amazon Kinesis
Lambda
EMR
EC2
What Can You Do With Big Data on AWS?
• Big Data Repositories
• Clickstream Analysis
• ETL Offload
• Machine Learning
• Online Ad Serving
• BI Applications
“Teaching Old Data New Tricks™” with NorthBay
“Teaching Old Data New Tricks™”
Untapped wealth: companies gain tremendous leverage when “Teaching Old Data New Tricks™”
• So what does that mean?
• You’ll hear 2 exciting customer examples/use cases presented today
Customer Examples/Use Cases
• Building a HIPAA-compliant data lake
• Re-tooling old on-premises technology on the fly
Scholastic Preview of Coming Attractions
• How did an old-school, $1.5B, 100-year-old company re-invent its old-school IBM- and Microsoft-based big data and analytics systems on the fly?
• What was their starting point?
• What factors did they consider when making their decision?
• What did they decide on for technology and partners and why?
• How did they implement?
• What were the results?
• Lessons learned?
AWS & NorthBay Background
Global Provider of Big Data Solutions
• 250+ full-time professionals
• 145+ clients
• 200+ solutions launched
Conceptual Data Lake Architecture
Eliza Preview of Coming Attractions
• How does a high-flying healthcare services company re-platform its Enterprise Data Platform while processing millions of 'interactions' every day?
• Why the need to change?
• What strategic goals had to be achieved?
• What is so tough about "name-value pairs"?
• Why a Data Lake and why NorthBay?
• Which AWS services were chosen to leverage?
• What did they decide on for technology and partners and why?
• How did it turn out?
• What did they learn?
Eliza Corporation
John Puopolo, SVP, Engineering, Eliza Corporation
About Eliza Corporation
• Founded 2000
• Leader in Health Engagement Management (HEM) outreach services
• Hundreds of millions of outreaches for intensive operation and analytics processing
• High-volume semi-structured data, complex business flow of data
• Variety of analytics/consumption needs ranging from portal for customers to ML workloads
Challenges Eliza Faced
• Eliza Corporation analyzes more than 300 million interactions per year
• Outreach questions and responses form a decision tree, and each question and its response are captured as a pair, e.g.: <question, response> = <“Did you visit your physician in the last 30 days?”, “Yes”>
• Diverse downstream consumption requirements
• Challenging to process and analyze data
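The question/response pairs described above can be sketched in Python as follows. This is only an illustration: the class names `QAPair` and `Interaction`, the identifier `demo-001`, and the follow-up question are hypothetical and are not taken from Eliza's actual data model.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class QAPair:
    """One captured <question, response> pair."""
    question: str
    response: str


@dataclass
class Interaction:
    """One outreach: the ordered path taken through the decision tree."""
    interaction_id: str
    pairs: list[QAPair] = field(default_factory=list)

    def record(self, question: str, response: str) -> None:
        # Append the next step on the path through the decision tree.
        self.pairs.append(QAPair(question, response))

    def as_name_value(self) -> dict[str, str]:
        # Flatten the path into name-value form; a later answer to the
        # same question overwrites an earlier one.
        return {p.question: p.response for p in self.pairs}


def example_interaction() -> Interaction:
    call = Interaction("demo-001")  # hypothetical identifier
    call.record("Did you visit your physician in the last 30 days?", "Yes")
    # Hypothetical follow-up question, for illustration only.
    call.record("Would you like a reminder for your next visit?", "No")
    return call
```

Because each interaction can follow a different path through the tree, the set of names varies from record to record, which is one reason a fixed relational layout can be awkward for this kind of data.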
Strategic Goals
• Create next generation data architecture
• Decouple storage and compute
• Ability to process old & new data streams
• Achieve HIPAA compliance
• Ingest & store original datasets
• Allow both real-time & batch processing
• Enable access through entitlements and governance
• Increase self-service for end-users
Conceptual Data Lake Architecture
[Diagram: conceptual data lake architecture, in three stages]
• Data Sources & Ingestion: Streaming Data Sources, Batch Data Sources
• Processing & Storage: Data Lake Storage, Data Lake Archive, ETL, EDW, Catalog & Search & Data Discovery, API & UI, Entitlements & Authorizations, Data Quality & Governance
• Consumption & Analytics: Real Time Analytics, BI tools, Hadoop (Shared services), Business Units (BI UI; Hadoop, SAS, Business Unit Dedicated)
• Across all stages: Monitoring, auditing, management, and alerting; Data System Analytics (Lineage, Profiling)
Benefits of the New Enterprise Data Platform Architecture
• Hub & spoke model for one original copy of all enterprise analytics data
• Quality layer for consistent transformations and cleansing of data
• Governance layer for entitlements and security management
• Enables multiple consumption patterns, called projections
• A purpose-designed schema for an Enterprise Data Warehouse (Redshift) for efficient reporting of known queries
• Streamlined and automated ingestion of source batch and streaming data, reducing human/manual touch points
Technical Architecture
Major AWS Services Used
• Aurora
• Kinesis + Kinesis Streams
• Amazon Redshift
• DynamoDB
• Hive, Presto, Spark on EMR
• CloudSearch, EC2
Benefits of a New Enterprise Data Platform
• Streamlined data load process by enabling schema on read
• Improved business agility by allowing schema on read
• Improved ability to manage costs by allowing separation of costs
• Provided ability to enable resources to scale on-demand
• Reduced end-to-end client analytics time
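The "schema on read" idea mentioned in the bullets above can be sketched in a few lines of Python: raw records are stored as received, and a schema is applied only when a consumer reads them. The sample records, field names, and the `read_with_schema` helper are hypothetical, chosen only to illustrate the technique.

```python
import json

# Raw events are kept exactly as received (hypothetical sample records).
RAW_LINES = [
    '{"member": "m1", "question": "q1", "response": "Yes", "channel": "ivr"}',
    '{"member": "m2", "question": "q1", "response": "No"}',
]

# A "schema" here is just field name -> (converter, default), applied at read time.
REPORT_SCHEMA = {
    "member": (str, None),
    "response": (lambda value: value == "Yes", False),
    "channel": (str, "unknown"),
}


def read_with_schema(lines, schema):
    """Parse raw JSON lines and project each record onto the schema when reading."""
    for line in lines:
        record = json.loads(line)
        yield {
            name: convert(record[name]) if name in record else default
            for name, (convert, default) in schema.items()
        }


rows = list(read_with_schema(RAW_LINES, REPORT_SCHEMA))
```

Under this approach a new consumer can define a different schema over the same stored records, so loading data does not have to wait for a schema to be agreed up front.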
Key Learnings
• The nature of our data is name-value. We were doing too many transformations due to our original storage formats.
• Using mini-PoCs to form hypotheses and prove/disprove them led to an emergent architecture, which pointed us towards a data lake
• A data lake architecture fits our core business and growth plans extremely well
Scholastic
Ali Khan, Director, Business Intelligence and Analytics, Scholastic
About Scholastic
• $1.6B in annual revenue. The world’s largest publisher and distributor of children’s books
• #1 website for U.S. elementary school teachers
• 8,400+ employees globally
• 165+ countries
• 45+ languages
• A leader in comprehensive educational solutions
Existing Platform & Challenges
• We taught old data new tricks
• IBM AS/400 was primary data warehouse platform, supplemented by Microsoft SQL Server to enable business intelligence
• 5,500+ AS/400 workloads, 350+ SQL Server workloads
• Inflexible architecture – slow time to market
• Unable to meet internal SLAs due to performance of daily ETL processes
• Scalability limitations with SQL Server Analysis Services (SSAS) for dashboards/reports
• Limited ability to perform self-service business intelligence
Project Goals
• Improve performance, scalability, availability, logging, security
• Enable self-service business intelligence
• Integrate with existing technology stack
• Align with the tech strategy (DevOps model, Cloud First)
• Leverage the skill set of current team (SQL/relational)
• Team up with an experienced partner
The Decision
• AWS was chosen because of agility, scalability, elasticity, security and alignment with corporate strategy
• Redshift was chosen to replace AS/400 and SQL Server for its relational-style high performance data store
• NorthBay was chosen for their expertise in Big Data and Amazon Redshift migrations
Pilot Plans
• Migrate function area in key business unit during a 3-month pilot
• Demonstrate immediate business value
• Stand up the AWS environment to allow IT to gain competence with AWS
Pilot Outcomes
• Create core framework for migration
• Implement ELT architecture and perform validation
• Establish visualization/self-service capability through Tableau
Technical Architecture
[Diagram: Scholastic pipeline]
• Data flow: AS400 / DB2 (Source DB) → EMR Cluster running Sqoop Script → Output Bucket → EC2 Instance running Copy Command → Redshift (Staging) → Redshift (Data Warehouse) → Tableau (Reporting Tool)
• Orchestration: Data Pipeline; DynamoDB (Pipeline Configurations)
• Notifications: SNS Topic (Pipeline Status, Pipeline Failure) → SNS Email Notification
• Operational statistics: Lambda (Save Pipeline Stats) → RDS MySQL Instance
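The load step in the architecture above (Sqoop output landing in an S3 bucket, followed by a COPY into Redshift staging) might look roughly like the sketch below. It only builds the SQL text for a Redshift `COPY` statement; the table name, bucket, IAM role ARN, and the comma delimiter are assumptions for illustration, and the slides do not show the actual command Scholastic used.

```python
def build_copy_statement(table: str, s3_prefix: str, iam_role_arn: str,
                         delimiter: str = ",") -> str:
    """Return a Redshift COPY statement that loads delimited files from S3.

    Values are interpolated directly, so this is suitable only for trusted,
    configuration-supplied inputs (an assumption of this sketch).
    """
    if not s3_prefix.startswith("s3://"):
        raise ValueError("s3_prefix must start with s3://")
    return (
        f"COPY {table}\n"
        f"FROM '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"DELIMITER '{delimiter}';"
    )


# Hypothetical names, for illustration only.
statement = build_copy_statement(
    table="staging.orders",
    s3_prefix="s3://example-output-bucket/orders/",
    iam_role_arn="arn:aws:iam::123456789012:role/example-redshift-load",
)
```

The resulting statement would be executed against the staging schema by whatever SQL client the loader host uses; the transform step then runs as SQL inside Redshift, which is what makes the process ELT and not ETL.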
Core Framework
• Jobs and job groups are defined as metadata in DynamoDB
• Control-M Scheduler, custom application and Data Pipeline for orchestration
• ELT process: EMR/Sqoop for extraction, Redshift load, then transform the data through SQL scripts
• Core Framework allows for:
  • Restart capability from point of failure
  • Capturing of operational statistics (# of rows updated)
  • Audit capability (which feed caused the fact to change)
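A minimal sketch of the restart-from-point-of-failure idea listed above, in Python. Jobs and job groups are modeled as plain in-memory objects standing in for the metadata the slides say is kept in DynamoDB; the class names, status values, and the `demo` function are hypothetical and do not describe the actual framework.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Job:
    name: str
    status: str = "PENDING"  # PENDING, SUCCEEDED, or FAILED
    rows_updated: int = 0    # operational statistic captured per run


@dataclass
class JobGroup:
    name: str
    jobs: List[Job] = field(default_factory=list)

    def run(self, actions: Dict[str, Callable[[], int]]) -> bool:
        """Run jobs in order, skipping any that already succeeded.

        `actions` maps a job name to a callable returning rows updated.
        """
        for job in self.jobs:
            if job.status == "SUCCEEDED":
                continue  # already done in an earlier attempt
            try:
                job.rows_updated = actions[job.name]()
                job.status = "SUCCEEDED"
            except Exception:
                job.status = "FAILED"
                return False  # stop here; the next run() resumes at this job
        return True


def demo():
    group = JobGroup("daily_sales", [Job("extract"), Job("load"), Job("transform")])
    calls = []
    fail_load = [True]  # flipped to False to simulate fixing the problem

    def extract():
        calls.append("extract")
        return 100

    def load():
        calls.append("load")
        if fail_load[0]:
            raise RuntimeError("simulated load failure")
        return 100

    def transform():
        calls.append("transform")
        return 42

    actions = {"extract": extract, "load": load, "transform": transform}
    first = group.run(actions)   # fails at "load"
    fail_load[0] = False
    second = group.run(actions)  # resumes at "load"; "extract" is not re-run
    return first, second, calls, group
```

Persisting the per-job status outside the process (in DynamoDB, per the slides) is what would let a rerun started by the scheduler pick up where the failed run stopped.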
Data Visualization Through Tableau
• Business users have access to facts/dimensions for standard reports through Tableau
• Power users have access to Staging tables for Ad-Hoc queries through Tableau
• Data Scientists have access to Files in S3 (from all extracts serving as Data Archive) using Hive and/or Presto
Accelerating the Program Timeline
• CTO moved budget forward to:
  • Reduce project timeline by 50%
  • Eliminate overhead of 2 platforms
• Parallel work streams (swim lanes) utilized the same core framework for migrating data for other business units
• NorthBay partners with each of those work streams to accelerate migration
• Users wanted to be on the new platform sooner
Lessons Learned - Technology
• Isolate core framework with project specific code repositories
• Make appropriate schema changes when migrating to new platform
• Customize framework for gathering operational stats (e.g., # of rows loaded)
• Start with test automation tools and Acceptance Test Driven Development (ATDD) earlier in the project
Lessons Learned – Program Execution
• Creating new data platforms and migrating data into them is easy, especially with AWS. Decommissioning existing data platforms is hard!
• “Data Champion” / “Data Guide” partnership is absolutely critical for successful adoption of new platforms and working models
• Strong Agile coaches are important while scaling out Agile teams
Questions & Answers
Brian Barker • CEO • NorthBay Solutions brian.barker@northbaysolutions.com
John Puopolo • SVP • Engineering • Eliza Corporation
Ali Khan • Director, Business Intelligence and Analytics • Scholastic
Sai Reddy Thangirala • Solutions Architect • Amazon Web Services
www.northbaysolutions.com info@northbaysolutions.com

More Related Content

What's hot

WKS402 Well-Architected Workshop
WKS402 Well-Architected WorkshopWKS402 Well-Architected Workshop
WKS402 Well-Architected WorkshopAmazon Web Services
 
AWS Partnership Model - AWS - AWSome Day Zurich - 112016
AWS Partnership Model - AWS - AWSome Day Zurich - 112016AWS Partnership Model - AWS - AWSome Day Zurich - 112016
AWS Partnership Model - AWS - AWSome Day Zurich - 112016Amazon Web Services
 
Fortinet Automates Migration onto Layered Secure Workloads
Fortinet Automates Migration onto Layered Secure WorkloadsFortinet Automates Migration onto Layered Secure Workloads
Fortinet Automates Migration onto Layered Secure WorkloadsAmazon Web Services
 
2016 summits - future of enterprise it
2016 summits - future of enterprise it2016 summits - future of enterprise it
2016 summits - future of enterprise itAmazon Web Services
 
Track 1 Session 6_建立安全高效的資料分析平台加速金融創新_HC+EMQ Cliff(已檢核,上下無黑邊).pptx
Track 1 Session 6_建立安全高效的資料分析平台加速金融創新_HC+EMQ Cliff(已檢核,上下無黑邊).pptxTrack 1 Session 6_建立安全高效的資料分析平台加速金融創新_HC+EMQ Cliff(已檢核,上下無黑邊).pptx
Track 1 Session 6_建立安全高效的資料分析平台加速金融創新_HC+EMQ Cliff(已檢核,上下無黑邊).pptxAmazon Web Services
 
Webinar: Make Your Cloud Strategy Work for 2016
Webinar: Make Your Cloud Strategy Work for 2016Webinar: Make Your Cloud Strategy Work for 2016
Webinar: Make Your Cloud Strategy Work for 2016Alexandra Sasha Tchulkova
 
Session Sponsored by Trend Micro: 3 Secrets to Becoming a Cloud Security Supe...
Session Sponsored by Trend Micro: 3 Secrets to Becoming a Cloud Security Supe...Session Sponsored by Trend Micro: 3 Secrets to Becoming a Cloud Security Supe...
Session Sponsored by Trend Micro: 3 Secrets to Becoming a Cloud Security Supe...Amazon Web Services
 
AWS re:Invent 2016| HLC301 | Data Science and Healthcare: Running Large Scale...
AWS re:Invent 2016| HLC301 | Data Science and Healthcare: Running Large Scale...AWS re:Invent 2016| HLC301 | Data Science and Healthcare: Running Large Scale...
AWS re:Invent 2016| HLC301 | Data Science and Healthcare: Running Large Scale...Amazon Web Services
 
Windows Workloads on AWS - AWS Innovate Toronto
Windows Workloads on AWS - AWS Innovate TorontoWindows Workloads on AWS - AWS Innovate Toronto
Windows Workloads on AWS - AWS Innovate TorontoAmazon Web Services
 
Understanding AWS Managed Databases and Analytic Services - AWS Innovate Otta...
Understanding AWS Managed Databases and Analytic Services - AWS Innovate Otta...Understanding AWS Managed Databases and Analytic Services - AWS Innovate Otta...
Understanding AWS Managed Databases and Analytic Services - AWS Innovate Otta...Amazon Web Services
 
Modernizing upstream workflows with aws storage - john mallory
Modernizing upstream workflows with aws storage -  john malloryModernizing upstream workflows with aws storage -  john mallory
Modernizing upstream workflows with aws storage - john malloryAmazon Web Services
 
Innovating IAM Protection for AWS with Dome9 - Session Sponsored by Dome9
Innovating IAM Protection for AWS with Dome9 - Session Sponsored by Dome9Innovating IAM Protection for AWS with Dome9 - Session Sponsored by Dome9
Innovating IAM Protection for AWS with Dome9 - Session Sponsored by Dome9Amazon Web Services
 
Database and Analytics on the AWS Cloud - AWS Innovate Toronto
Database and Analytics on the AWS Cloud - AWS Innovate TorontoDatabase and Analytics on the AWS Cloud - AWS Innovate Toronto
Database and Analytics on the AWS Cloud - AWS Innovate TorontoAmazon Web Services
 
Are you Well-Architected? - AWS Online Tech Talks
Are you Well-Architected? - AWS Online Tech TalksAre you Well-Architected? - AWS Online Tech Talks
Are you Well-Architected? - AWS Online Tech TalksAmazon Web Services
 
AWS Innovate Montreal Keynote - by Chris Munns
AWS Innovate Montreal Keynote - by Chris MunnsAWS Innovate Montreal Keynote - by Chris Munns
AWS Innovate Montreal Keynote - by Chris MunnsAmazon Web Services
 

What's hot (20)

WKS402 Well-Architected Workshop
WKS402 Well-Architected WorkshopWKS402 Well-Architected Workshop
WKS402 Well-Architected Workshop
 
Serverless Real Time Analytics
Serverless Real Time AnalyticsServerless Real Time Analytics
Serverless Real Time Analytics
 
AWS Partnership Model - AWS - AWSome Day Zurich - 112016
AWS Partnership Model - AWS - AWSome Day Zurich - 112016AWS Partnership Model - AWS - AWSome Day Zurich - 112016
AWS Partnership Model - AWS - AWSome Day Zurich - 112016
 
Fortinet Automates Migration onto Layered Secure Workloads
Fortinet Automates Migration onto Layered Secure WorkloadsFortinet Automates Migration onto Layered Secure Workloads
Fortinet Automates Migration onto Layered Secure Workloads
 
2016 summits - future of enterprise it
2016 summits - future of enterprise it2016 summits - future of enterprise it
2016 summits - future of enterprise it
 
AWS Workloads on AWS
AWS Workloads on AWSAWS Workloads on AWS
AWS Workloads on AWS
 
Track 1 Session 6_建立安全高效的資料分析平台加速金融創新_HC+EMQ Cliff(已檢核,上下無黑邊).pptx
Track 1 Session 6_建立安全高效的資料分析平台加速金融創新_HC+EMQ Cliff(已檢核,上下無黑邊).pptxTrack 1 Session 6_建立安全高效的資料分析平台加速金融創新_HC+EMQ Cliff(已檢核,上下無黑邊).pptx
Track 1 Session 6_建立安全高效的資料分析平台加速金融創新_HC+EMQ Cliff(已檢核,上下無黑邊).pptx
 
Webinar: Make Your Cloud Strategy Work for 2016
Webinar: Make Your Cloud Strategy Work for 2016Webinar: Make Your Cloud Strategy Work for 2016
Webinar: Make Your Cloud Strategy Work for 2016
 
Session Sponsored by Trend Micro: 3 Secrets to Becoming a Cloud Security Supe...
Session Sponsored by Trend Micro: 3 Secrets to Becoming a Cloud Security Supe...Session Sponsored by Trend Micro: 3 Secrets to Becoming a Cloud Security Supe...
Session Sponsored by Trend Micro: 3 Secrets to Becoming a Cloud Security Supe...
 
Financial Services in the Cloud
Financial Services in the CloudFinancial Services in the Cloud
Financial Services in the Cloud
 
AWS re:Invent 2016| HLC301 | Data Science and Healthcare: Running Large Scale...
AWS re:Invent 2016| HLC301 | Data Science and Healthcare: Running Large Scale...AWS re:Invent 2016| HLC301 | Data Science and Healthcare: Running Large Scale...
AWS re:Invent 2016| HLC301 | Data Science and Healthcare: Running Large Scale...
 
Windows Workloads on AWS - AWS Innovate Toronto
Windows Workloads on AWS - AWS Innovate TorontoWindows Workloads on AWS - AWS Innovate Toronto
Windows Workloads on AWS - AWS Innovate Toronto
 
Understanding AWS Managed Databases and Analytic Services - AWS Innovate Otta...
Understanding AWS Managed Databases and Analytic Services - AWS Innovate Otta...Understanding AWS Managed Databases and Analytic Services - AWS Innovate Otta...
Understanding AWS Managed Databases and Analytic Services - AWS Innovate Otta...
 
Modernizing upstream workflows with aws storage - john mallory
Modernizing upstream workflows with aws storage -  john malloryModernizing upstream workflows with aws storage -  john mallory
Modernizing upstream workflows with aws storage - john mallory
 
Getting Started with AWS
Getting Started with AWSGetting Started with AWS
Getting Started with AWS
 
Innovating IAM Protection for AWS with Dome9 - Session Sponsored by Dome9
Innovating IAM Protection for AWS with Dome9 - Session Sponsored by Dome9Innovating IAM Protection for AWS with Dome9 - Session Sponsored by Dome9
Innovating IAM Protection for AWS with Dome9 - Session Sponsored by Dome9
 
Database and Analytics on the AWS Cloud - AWS Innovate Toronto
Database and Analytics on the AWS Cloud - AWS Innovate TorontoDatabase and Analytics on the AWS Cloud - AWS Innovate Toronto
Database and Analytics on the AWS Cloud - AWS Innovate Toronto
 
Are you Well-Architected? - AWS Online Tech Talks
Are you Well-Architected? - AWS Online Tech TalksAre you Well-Architected? - AWS Online Tech Talks
Are you Well-Architected? - AWS Online Tech Talks
 
AWS glue technical enablement training
AWS glue technical enablement trainingAWS glue technical enablement training
AWS glue technical enablement training
 
AWS Innovate Montreal Keynote - by Chris Munns
AWS Innovate Montreal Keynote - by Chris MunnsAWS Innovate Montreal Keynote - by Chris Munns
AWS Innovate Montreal Keynote - by Chris Munns
 

Viewers also liked

AWS re:Invent 2016: Migrating Your Data Warehouse to Amazon Redshift (DAT202)
AWS re:Invent 2016: Migrating Your Data Warehouse to Amazon Redshift (DAT202)AWS re:Invent 2016: Migrating Your Data Warehouse to Amazon Redshift (DAT202)
AWS re:Invent 2016: Migrating Your Data Warehouse to Amazon Redshift (DAT202)Amazon Web Services
 
AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...
AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...
AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...Amazon Web Services
 
Add User Sign in and Management to your Apps with Amazon Cognito
Add User Sign in and Management to your Apps with Amazon CognitoAdd User Sign in and Management to your Apps with Amazon Cognito
Add User Sign in and Management to your Apps with Amazon CognitoAmazon Web Services
 
database migration simple, cross-engine and cross-platform migrations with ...
database migration   simple, cross-engine and cross-platform migrations with ...database migration   simple, cross-engine and cross-platform migrations with ...
database migration simple, cross-engine and cross-platform migrations with ...Amazon Web Services
 
AWS Webinar - Dynamo DB + Redshift 13_09_19
AWS Webinar - Dynamo DB + Redshift 13_09_19AWS Webinar - Dynamo DB + Redshift 13_09_19
AWS Webinar - Dynamo DB + Redshift 13_09_19Amazon Web Services
 
AWS SDK for Go in #jawsmeguro
AWS SDK for Go in #jawsmeguroAWS SDK for Go in #jawsmeguro
AWS SDK for Go in #jawsmeguroKenta Suzuki
 
How to Extend your Datacenter into the Cloud - 2nd Watch - Webinar
How to Extend your Datacenter into the Cloud - 2nd Watch - WebinarHow to Extend your Datacenter into the Cloud - 2nd Watch - Webinar
How to Extend your Datacenter into the Cloud - 2nd Watch - WebinarAmazon Web Services
 
AWS Webcast - Data Integration into Amazon Redshift
AWS Webcast - Data Integration into Amazon RedshiftAWS Webcast - Data Integration into Amazon Redshift
AWS Webcast - Data Integration into Amazon RedshiftAmazon Web Services
 
Security Innovations in the Cloud
Security Innovations in the CloudSecurity Innovations in the Cloud
Security Innovations in the CloudAmazon Web Services
 
Data Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftData Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftAmazon Web Services
 
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar Series
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar SeriesDeep Dive Amazon Redshift for Big Data Analytics - September Webinar Series
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar SeriesAmazon Web Services
 
Deep Dive on Microservices and Amazon ECS
Deep Dive on Microservices and Amazon ECSDeep Dive on Microservices and Amazon ECS
Deep Dive on Microservices and Amazon ECSAmazon Web Services
 
AWS Enterprise Summit Netherlands - Starting Your Journey in the Cloud
AWS Enterprise Summit Netherlands - Starting Your Journey in the CloudAWS Enterprise Summit Netherlands - Starting Your Journey in the Cloud
AWS Enterprise Summit Netherlands - Starting Your Journey in the CloudAmazon Web Services
 
Open Source Framework for Deploying Data Science Models and Cloud Based Appli...
Open Source Framework for Deploying Data Science Models and Cloud Based Appli...Open Source Framework for Deploying Data Science Models and Cloud Based Appli...
Open Source Framework for Deploying Data Science Models and Cloud Based Appli...ETCenter
 
Getting started with Amazon ElastiCache
Getting started with Amazon ElastiCacheGetting started with Amazon ElastiCache
Getting started with Amazon ElastiCacheAmazon Web Services
 
AWS Enterprise Summit Netherlands - Big Data Architectural Patterns & Best Pr...
AWS Enterprise Summit Netherlands - Big Data Architectural Patterns & Best Pr...AWS Enterprise Summit Netherlands - Big Data Architectural Patterns & Best Pr...
AWS Enterprise Summit Netherlands - Big Data Architectural Patterns & Best Pr...Amazon Web Services
 
Rackspace: Best Practices for Security Compliance on AWS
Rackspace: Best Practices for Security Compliance on AWSRackspace: Best Practices for Security Compliance on AWS
Rackspace: Best Practices for Security Compliance on AWSAmazon Web Services
 
AWS Enterprise Summit Netherlands - Enterprise Applications on AWS
AWS Enterprise Summit Netherlands - Enterprise Applications on AWSAWS Enterprise Summit Netherlands - Enterprise Applications on AWS
AWS Enterprise Summit Netherlands - Enterprise Applications on AWSAmazon Web Services
 

Viewers also liked (20)

AWS re:Invent 2016: Migrating Your Data Warehouse to Amazon Redshift (DAT202)
AWS re:Invent 2016: Migrating Your Data Warehouse to Amazon Redshift (DAT202)AWS re:Invent 2016: Migrating Your Data Warehouse to Amazon Redshift (DAT202)
AWS re:Invent 2016: Migrating Your Data Warehouse to Amazon Redshift (DAT202)
 
AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...
AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...
AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...
 
Add User Sign in and Management to your Apps with Amazon Cognito
Add User Sign in and Management to your Apps with Amazon CognitoAdd User Sign in and Management to your Apps with Amazon Cognito
Add User Sign in and Management to your Apps with Amazon Cognito
 
Storage & Content Delivery
Storage & Content Delivery Storage & Content Delivery
Storage & Content Delivery
 
database migration simple, cross-engine and cross-platform migrations with ...
database migration   simple, cross-engine and cross-platform migrations with ...database migration   simple, cross-engine and cross-platform migrations with ...
database migration simple, cross-engine and cross-platform migrations with ...
 
AWS Webinar - Dynamo DB + Redshift 13_09_19
AWS Webinar - Dynamo DB + Redshift 13_09_19AWS Webinar - Dynamo DB + Redshift 13_09_19
AWS Webinar - Dynamo DB + Redshift 13_09_19
 
AWS SDK for Go in #jawsmeguro
AWS SDK for Go in #jawsmeguroAWS SDK for Go in #jawsmeguro
AWS SDK for Go in #jawsmeguro
 
How to Extend your Datacenter into the Cloud - 2nd Watch - Webinar
How to Extend your Datacenter into the Cloud - 2nd Watch - WebinarHow to Extend your Datacenter into the Cloud - 2nd Watch - Webinar
How to Extend your Datacenter into the Cloud - 2nd Watch - Webinar
 
AWS Webcast - Data Integration into Amazon Redshift
AWS Webcast - Data Integration into Amazon RedshiftAWS Webcast - Data Integration into Amazon Redshift
AWS Webcast - Data Integration into Amazon Redshift
 
Security Innovations in the Cloud
Security Innovations in the CloudSecurity Innovations in the Cloud
Security Innovations in the Cloud
 
Getting Started on AWS
Getting Started on AWS Getting Started on AWS
Getting Started on AWS
 
Data Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftData Warehousing with Amazon Redshift
Data Warehousing with Amazon Redshift
 
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar Series
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar SeriesDeep Dive Amazon Redshift for Big Data Analytics - September Webinar Series
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar Series
 
Deep Dive on Microservices and Amazon ECS
Deep Dive on Microservices and Amazon ECSDeep Dive on Microservices and Amazon ECS
Deep Dive on Microservices and Amazon ECS
 
AWS Enterprise Summit Netherlands - Starting Your Journey in the Cloud
AWS Enterprise Summit Netherlands - Starting Your Journey in the CloudAWS Enterprise Summit Netherlands - Starting Your Journey in the Cloud
AWS Enterprise Summit Netherlands - Starting Your Journey in the Cloud
 
Open Source Framework for Deploying Data Science Models and Cloud Based Appli...
Open Source Framework for Deploying Data Science Models and Cloud Based Appli...Open Source Framework for Deploying Data Science Models and Cloud Based Appli...
Open Source Framework for Deploying Data Science Models and Cloud Based Appli...
 
Getting started with Amazon ElastiCache
Getting started with Amazon ElastiCacheGetting started with Amazon ElastiCache
Getting started with Amazon ElastiCache
 
AWS Enterprise Summit Netherlands - Big Data Architectural Patterns & Best Pr...
AWS Enterprise Summit Netherlands - Big Data Architectural Patterns & Best Pr...AWS Enterprise Summit Netherlands - Big Data Architectural Patterns & Best Pr...
AWS Enterprise Summit Netherlands - Big Data Architectural Patterns & Best Pr...
 
Rackspace: Best Practices for Security Compliance on AWS
Rackspace: Best Practices for Security Compliance on AWSRackspace: Best Practices for Security Compliance on AWS
Rackspace: Best Practices for Security Compliance on AWS
 
AWS Enterprise Summit Netherlands - Enterprise Applications on AWS
AWS Enterprise Summit Netherlands - Enterprise Applications on AWSAWS Enterprise Summit Netherlands - Enterprise Applications on AWS
AWS Enterprise Summit Netherlands - Enterprise Applications on AWS
 

Similar to Develop a Custom Data Solution Architecture with NorthBay

Cloud-native Semantic Layer on Data Lake
Cloud-native Semantic Layer on Data LakeCloud-native Semantic Layer on Data Lake
Cloud-native Semantic Layer on Data LakeDatabricks
 
Building Modern Data Platform with Microsoft Azure
Building Modern Data Platform with Microsoft AzureBuilding Modern Data Platform with Microsoft Azure
Building Modern Data Platform with Microsoft AzureDmitry Anoshin
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)James Serra
 
The Death of the Star Schema
The Death of the Star SchemaThe Death of the Star Schema
The Death of the Star SchemaDATAVERSITY
 
Taming the shrew, Optimizing Power BI Options
Taming the shrew, Optimizing Power BI OptionsTaming the shrew, Optimizing Power BI Options
Taming the shrew, Optimizing Power BI OptionsKellyn Pot'Vin-Gorman
 
Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)James Serra
 
[DSC Europe 22] Overview of the Databricks Platform - Petar Zecevic
[DSC Europe 22] Overview of the Databricks Platform - Petar Zecevic[DSC Europe 22] Overview of the Databricks Platform - Petar Zecevic
[DSC Europe 22] Overview of the Databricks Platform - Petar ZecevicDataScienceConferenc1
 
Building Data Warehouse in SQL Server
Building Data Warehouse in SQL ServerBuilding Data Warehouse in SQL Server
Building Data Warehouse in SQL ServerAntonios Chatzipavlis
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureDATAVERSITY
 
Power BI - 2016 - Public
Power BI - 2016 - PublicPower BI - 2016 - Public
Power BI - 2016 - PublicJulian Payne
 
Don’t Struggle with Complex and Rigid Data Migrations, Leverage API Wizard to...
Don’t Struggle with Complex and Rigid Data Migrations, Leverage API Wizard to...Don’t Struggle with Complex and Rigid Data Migrations, Leverage API Wizard to...
Don’t Struggle with Complex and Rigid Data Migrations, Leverage API Wizard to...Vineeth Mylapur
 
Unlocking the Value of Your Data Lake
Unlocking the Value of Your Data LakeUnlocking the Value of Your Data Lake
Unlocking the Value of Your Data LakeDATAVERSITY
 
J1 - Keynote Data Platform - Rohan Kumar
J1 - Keynote Data Platform - Rohan KumarJ1 - Keynote Data Platform - Rohan Kumar
J1 - Keynote Data Platform - Rohan KumarMS Cloud Summit
 
Amazon Redshift with Full 360 Inc.
Amazon Redshift with Full 360 Inc.Amazon Redshift with Full 360 Inc.
Amazon Redshift with Full 360 Inc.Amazon Web Services
 
Feature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine LearningFeature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine LearningProvectus
 
Data Warehouse Optimization
Data Warehouse OptimizationData Warehouse Optimization
Data Warehouse OptimizationCloudera, Inc.
 

Develop a Custom Data Solution Architecture with NorthBay

  • 1. “Teaching Old Data New Tricks™”
      • Brian Barker, CEO, NorthBay Solutions
      • John Puopolo, SVP, Engineering, Eliza Corporation
      • Ali Khan, Director, Business Intelligence and Analytics, Scholastic
      • Sai Reddy Thangirala, Solutions Architect, Amazon Web Services
  • 2. Agenda
      • Big Data on AWS
      • NorthBay
      • Eliza Corporation Case Study
      • Challenges Eliza Faced
      • Strategic Goals
      • Why a Data Lake Approach was Chosen
      • Outcomes & Benefits Eliza Achieved
      • Scholastic Case Study
      • Challenges
      • Goals
      • The AWS/NorthBay Decision
      • How the Initiative Unfolded
      • Key Learnings
  • 3. Data is Growing
      • 1.7MB of new data will be created every second for every human being on the planet by 2020 (http://www.whizpr.be/upload/medialab/21/company/Media_Presentation_2012_DigiUniverseFINAL1.pdf)
      • 58% compound annual growth rate, surpassing $1 billion by 2020, forecasted for the Hadoop market (http://www.ap-institute.com/big-data-articles/big-data-what-is-hadoop-%E2%80%93-an-explanation-for-absolutely-anyone.aspx, http://www.marketanalysis.com/?p=279)
      • <0.5% of all data is ever analyzed and used at the moment (http://www.technologyreview.com/news/514346/the-data-made-me-do-it/)
  • 4. Big Data Is for Everyone
      • The market for Big Data technologies is growing more than six times faster than the information technology market as a whole…
      • …and those companies who use their data well win.
  • 5. Why AWS for Big Data?
      • Immediately Available
      • Broad and Deep Capabilities
      • Trusted and Secure
      • Scalable
  • 6. AWS Provides the Most Complete Platform for Big Data
      • It’s easy to get data to AWS, store it securely, and analyze it with the engine of your choice, without any long-term commitment or vendor lock-in
      • Collect: Import/Export Snowball, Direct Connect, VM Import/Export
      • Store: Amazon S3, EMR, Amazon Glacier, Amazon Redshift, DynamoDB, Aurora
      • Analyze: Amazon Kinesis, Lambda, EMR, EC2
  • 7. What Can You Do With Big Data on AWS?
      • Big Data Repositories
      • Clickstream Analysis
      • ETL Offload
      • Machine Learning
      • Online Ad Serving
      • BI Applications
  • 8. “Teaching Old Data New Tricks™” with NorthBay
  • 9. “Teaching Old Data New Tricks™”
      • Untapped wealth: companies gain tremendous leverage when “Teaching Old Data New Tricks™”
      • So what does that mean?
      • You’ll hear 2 exciting customer examples/use cases presented today:
          • Building a HIPAA-compliant Data Lake
          • Re-tooling old on-premises technology on the fly
  • 10. Scholastic Preview of Coming Attractions
      • How did an old-school, $1.5B, 100-year-old company re-invent its old-school IBM- and Microsoft-based big data and analytics system on the fly?
      • What was their starting point?
      • What factors did they consider when making their decision?
      • What did they decide on for technology and partners and why?
      • How did they implement?
      • What were the results?
      • Lessons learned?
  • 11. AWS & NorthBay Background
      • Global Provider of Big Data Solutions
      • 250+ full-time professionals
      • 145+ clients
      • 200+ solutions launched
  • 12. Conceptual Data Lake Architecture
  • 13. Eliza Preview of Coming Attractions
      • How does a high-flying healthcare services company re-platform its Enterprise Data Platform while processing millions of 'interactions' every day?
      • Why the need to change?
      • What strategic goals had to be achieved?
      • What is so tough about "name-value pairs"?
      • Why a Data Lake and why NorthBay?
      • Which AWS services were chosen to leverage?
      • What did they decide on for technology and partners and why?
      • How did it turn out?
      • What did they learn?
  • 14. Eliza Corporation John Puopolo, SVP, Engineering, Eliza Corporation
  • 15. About Eliza Corporation
      • Founded 2000
      • Leader in Health Engagement Management (HEM) outreach services
      • Hundreds of millions of outreaches for intensive operation and analytics processing
      • High-volume semi-structured data, complex business flow of data
      • Variety of analytics/consumption needs, ranging from a portal for customers to ML workloads
  • 16. Challenges Eliza Faced
      • Eliza Corporation analyzes more than 300 million interactions per year
      • Outreach questions and responses form a decision tree, and each question and response is captured as a pair, e.g.: <question, response> = <“Did you visit your physician in the last 30 days?”, “Yes”>
      • Diverse downstream consumption requirements
      • Challenging to process and analyze data
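The <question, response> capture on slide 16 can be illustrated with a minimal Python sketch. It is not Eliza's implementation: the `Interaction` class, the `to_name_value_rows` helper, the row field names, and the second question are all hypothetical, chosen only to show how one path through an outreach decision tree flattens into name-value rows.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class Interaction:
    """One outreach: the ordered <question, response> pairs along one tree path."""
    interaction_id: str                      # hypothetical identifier
    answers: tuple[tuple[str, str], ...]     # (question, response) pairs in order


def to_name_value_rows(interaction: Interaction) -> Iterator[dict]:
    """Flatten an interaction into one name-value row per pair."""
    for step, (question, response) in enumerate(interaction.answers, start=1):
        yield {
            "interaction_id": interaction.interaction_id,
            "step": step,        # position of the pair in the decision path
            "name": question,    # the question acts as the name
            "value": response,   # the response acts as the value
        }


example = Interaction(
    interaction_id="demo-001",
    answers=(
        ("Did you visit your physician in the last 30 days?", "Yes"),
        # Hypothetical follow-up question, not taken from the slides.
        ("Did you refill your prescription?", "No"),
    ),
)
rows = list(to_name_value_rows(example))
```

Because every interaction can follow a different path, the rows share a fixed shape even though the set of questions does not, which is one reason name-value data is awkward to force into wide relational tables.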
  • 17. Strategic Goals
      • Create next generation data architecture
      • Decouple Storage and Compute
      • Ability to process old & new data streams
      • Achieve HIPAA compliance
      • Ingest & store original datasets
      • Allow both real-time & batch processing
      • Enable access through entitlements and governance
      • Increase self-service for end-users
  • 18. Conceptual Data Lake Architecture
      • Diagram columns: Data Sources & Ingestion; Processing & Storage; Consumption & Analytics
      • Diagram components: Streaming Data Sources; Batch Data Sources; Data Quality & Governance; Data Lake Storage; Data Lake Archive; Catalog & Search & Data Discovery; ETL; EDW; API & UI; Entitlements & Authorizations; Data System Analytics (Lineage, Profiling); Monitoring, auditing, management, and alerting; Real Time Analytics; BI tools; Hadoop (Shared services); Business Units BI UI; Hadoop, SAS (Business Unit Dedicated)
  • 19. Benefits of the New Enterprise Data Platform Architecture
      • Hub & spoke model for one original copy of all enterprise analytics data
      • Quality layer for consistent transformations and cleansing of data
      • Governance layer for entitlements and security management
      • Enable multiple consumption patterns called projections
      • A purpose-designed schema for an Enterprise Data Warehouse (Redshift) for efficient reporting of known queries
      • Streamlined and automated ingestion of source batch and streaming data, reducing human/manual touch points
  • 21. Major AWS Services Used
      • Aurora
      • Kinesis + Kinesis Streams
      • Amazon Redshift
      • DynamoDB
      • Hive, Presto, Spark on EMR
      • CloudSearch, EC2
  • 22. Benefits of a New Enterprise Data Platform
      • Streamlined data load process by enabling schema on read
      • Improved business agility by allowing schema on read
      • Improved ability to manage costs by allowing separation of costs
      • Provided ability to enable resources to scale on-demand
      • Reduced end-to-end client analytics time
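"Schema on read", as named on slide 22, can be sketched as follows. This is an illustration under assumptions, not the platform's code: the raw JSON records, the two schemas, and the `read_with_schema` helper are hypothetical. The point is that the raw data is stored once, unmodified, and each consumer applies its own schema (a "projection" in the vocabulary of slide 19) only when reading.

```python
import json

# Raw records as they might land in the lake; fields vary per record.
# All field names and values here are hypothetical.
RAW_LINES = [
    '{"member": "m1", "question": "q_visit", "response": "Yes", "ts": "2016-01-05"}',
    '{"member": "m2", "question": "q_visit", "response": "No"}',
]

# Two hypothetical read-time schemas over the same raw data.
REPORTING_SCHEMA = {"member": str, "response": str}
TIMELINE_SCHEMA = {"member": str, "ts": str}


def read_with_schema(lines, schema):
    """Parse each raw line and project it onto `schema` at read time.

    Fields missing from a record become None instead of failing the load.
    """
    for line in lines:
        record = json.loads(line)
        yield {
            field: (cast(record[field]) if field in record else None)
            for field, cast in schema.items()
        }


reporting = list(read_with_schema(RAW_LINES, REPORTING_SCHEMA))
timeline = list(read_with_schema(RAW_LINES, TIMELINE_SCHEMA))
```

Adding a new consumer then means adding a schema, not reloading or transforming the stored data, which is the agility benefit the slide claims.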
  • 23. Key Learnings
      • The nature of our data is name-value. We were doing too many transformations due to our original storage formats.
      • Using mini-PoCs to form hypotheses and prove/disprove them led to an emergent architecture, which pointed us towards a data lake
      • A data lake architecture fits our core business and growth plans extremely well
  • 24. Scholastic Ali Khan, Director, Business Intelligence and Analytics, Scholastic
  • 25. About Scholastic
      • 1.6B in annual revenue
      • The world’s largest publisher and distributor of children’s books
      • #1 website for U.S. elementary school teachers
      • 8,400+ employees globally
      • 165+ countries
      • 45+ languages
      • A leader in comprehensive educational solutions
  • 26. Existing Platform & Challenges
      • We taught old data new tricks
      • IBM AS/400 was primary data warehouse platform, supplemented by Microsoft SQL Server to enable business intelligence
      • 5,500+ AS/400 workloads, 350+ SQL Server workloads
      • Inflexible architecture, slow time to market
      • Unable to meet internal SLAs due to performance of daily ETL processes
      • Scalability limitations with SQL Server Analysis Services (SSAS) for dashboards/reports
      • Limited ability to perform self-service business intelligence
  • 27. Project Goals
      • Improve performance, scalability, availability, logging, security
      • Enable self-service business intelligence
      • Integrate with existing technology stack
      • Align with the tech strategy (DevOps model, Cloud First)
      • Leverage the skill set of current team (SQL/relational)
      • Team up with an experienced partner
  • 28. The Decision
      • AWS was chosen because of agility, scalability, elasticity, security and alignment with corporate strategy
      • Redshift was chosen to replace AS400 and SQL Server for its relational-style high performance data store
      • NorthBay was chosen for their expertise in Big Data and Amazon Redshift migrations
  • 29. Pilot Plans
      • Migrate a functional area in a key business unit during a 3-month pilot
      • Demonstrate immediate business value
      • Stand up the AWS environment to allow IT to gain competence with AWS
  • 30. Pilot Outcomes
      • Create core framework for migration
      • Implement ELT architecture and perform validation
      • Establish visualization/self-service capability through Tableau
  • 31. Technical Architecture
      • Diagram components: AS400 / DB2 (Source DB); EMR Cluster running Sqoop Script; Output Bucket; EC2 Instance running Copy Command; Redshift (Staging); Redshift (Data Warehouse); Tableau (Reporting Tool); Data Pipeline; SNS Topic; SNS Email Notification; Lambda; RDS MySQL Instance; DynamoDB
      • Diagram annotations: Pipeline Status; Pipeline Failure; Save Pipeline Stats; Pipeline Configurations
  • 32. Core Framework
      • Jobs and job groups are defined as metadata in DynamoDB
      • Control-M Scheduler, Custom Application and Data Pipeline for orchestration
      • ELT process: EMR/Sqoop for extraction, Redshift load, and transformation of the data through SQL scripts
      • Core Framework allows for:
          • Restart capability from point of failure
          • Capturing of operational statistics (# of rows updated)
          • Audit capability (which feed caused the fact to change)
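The restart-from-point-of-failure and operational-statistics capabilities on slide 32 can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the Scholastic framework: the slide says job metadata lives in DynamoDB, while here plain dicts stand in for it, and the job names, `run_job_group` function, and row counts are all hypothetical.

```python
from __future__ import annotations

from typing import Callable

# Hypothetical job-group metadata; a dict stands in for the DynamoDB item.
JOB_GROUP = {"name": "daily_sales", "jobs": ["extract", "load", "transform"]}


def run_job_group(group: dict, steps: dict[str, Callable[[], int]], state: dict) -> dict:
    """Run jobs in order, skipping any job already recorded as done.

    `state["done"]` maps job name -> rows processed (the operational stat),
    so a rerun after a failure resumes at the job that failed.
    """
    for job in group["jobs"]:
        if job in state["done"]:
            continue  # completed in an earlier attempt; do not rerun
        try:
            rows = steps[job]()
        except Exception as exc:
            state["failed_at"] = job
            state["error"] = str(exc)
            return state
        state["done"][job] = rows
    state["failed_at"] = None
    state.pop("error", None)
    return state


calls: list[str] = []
attempts = {"load": 0}


def extract() -> int:
    calls.append("extract")
    return 120


def flaky_load() -> int:
    # Fails on the first attempt only, to simulate a transient load error.
    attempts["load"] += 1
    if attempts["load"] == 1:
        raise RuntimeError("simulated load failure")
    return 120


def transform() -> int:
    calls.append("transform")
    return 118


steps = {"extract": extract, "load": flaky_load, "transform": transform}
state = {"done": {}, "failed_at": None}

run_job_group(JOB_GROUP, steps, state)   # first attempt stops at "load"
first_failure = state["failed_at"]
run_job_group(JOB_GROUP, steps, state)   # second attempt resumes at "load"
```

Keeping the progress record outside the job code is what makes the restart cheap: the extract step runs once even though the group is invoked twice.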
  • 33. Data Visualization Through Tableau
      • Business users have access to facts/dimensions for standard reports through Tableau
      • Power users have access to Staging tables for ad-hoc queries through Tableau
      • Data Scientists have access to files in S3 (from all extracts serving as Data Archive) using Hive and/or Presto
  • 34. Accelerating the Program Timeline
      • CTO moved budget forward to:
          • Reduce project timeline by 50%
          • Eliminate overhead of 2 platforms
      • Parallel work streams (swim lanes) utilized the same core framework for migrating data for other business units
      • NorthBay partners with each of those work streams to accelerate migration
      • Users wanted to be on the new platform sooner
  • 35. Lessons Learned - Technology
      • Isolate core framework with project-specific code repositories
      • Make appropriate schema changes when migrating to new platform
      • Customize Framework for gathering operational stats (e.g., # of rows loaded)
      • Start with test automation tools and Acceptance Test Driven Development (ATDD) earlier in the project
  • 36. Lessons Learned – Program Execution
      • Creating new data platforms and migrating data into them is easy, especially with AWS. Decommissioning existing data platforms is hard!
      • “Data Champion” / “Data Guide” partnership is absolutely critical for successful adoption of new platforms and working models
      • Importance of strong Agile coaches while scaling out Agile teams
  • 37. Questions & Answers
      • Brian Barker, CEO, NorthBay Solutions (brian.barker@northbaysolutions.com)
      • John Puopolo, SVP, Engineering, Eliza Corporation
      • Ali Khan, Director, Business Intelligence and Analytics, Scholastic
      • Sai Reddy Thangirala, Solutions Architect, Amazon Web Services
      • www.northbaysolutions.com, info@northbaysolutions.com

Editor's Notes

  1. Thanks. I am Brian Barker, CEO of NorthBay. Sometimes in the middle of a seemingly routine project something great happens. It happened to me when one of our customers, who you will in fact hear from today, said: “It was our same data – but we get so much more out of it.” In saying that, he exposed a universal truth: virtually every company on this webcast, and in fact every company in America, has a vast pool of untapped wealth in its old data. The key is unlocking, re-energizing, re-using, and re-combining it, and making it available to users who can in turn re-energize, re-use, and re-combine it. We realized that if we can help our customers “Teach Old Data New Tricks” (TODNT), they reap enormous rewards. When Amazon heard this they got excited too, and that is why we are here today. So what does that all mean? Two great NorthBay customers will share their stories.
  2. What we are really
  3. MODERN DATA LAKE (1. Catalog & metadata, 2. Storage & 3. Access control) The conceptual architecture for a Data Lake-centered platform has both [stream oriented] and [batch oriented] data sources. These are {ingested} through the [Quality & Governance Layer], which cleanses and standardizes the data. The data sets are stored in the [Data Lake] on S3, with the [Catalog and metadata] updated to allow for later search and discovery. The Data Lake has [Archival] available to manage the life cycle of the data inside the lake. {Processing} of the data from the data lake is done through [ETL] into [AWS Redshift]. The entire Data Lake and its services are encapsulated via an [API] that provides for [Entitlements & Authorization]. Multiple other consumption needs for the data residing in the data lake are met by various [BI Tools], [Hadoop clusters] and so on.
  4. Hello, I’m John Puopolo, Senior Vice President of Engineering at Eliza. Since 1998, Eliza Corporation has developed healthcare consumer engagement solutions to address some of the industry’s greatest challenges, from adherence, to prevention, to condition management, to brand loyalty and retention. "Pay-for-performance" in healthcare incentivizes payers and providers to keep a population under their care healthier. Pay-for-performance arrangements provide financial incentives to hospitals, physicians, and other healthcare providers to carry out such improvements and help achieve optimal outcomes for patients. This is a departure from fee-for-service, where payments are for each service used. Eliza focuses on Health Engagement Management, and acts on behalf of healthcare organizations (e.g., hospitals, clinics, pharmacies, insurance companies) in order to engage people at the right time, with the right message, and in the right channel to capture relevant metrics to analyze the overall value provided by healthcare. We process the healthcare data for over 55M Americans every year. This translates to 100s of millions of interactions per year. Outreach results yield 2M-5M data points per day. A handful of our key tables have approximately 1 trillion rows.
  5. Eliza Corporation analyzes more than 300 million outreaches per year, primarily through outbound phone calls with Interactive Voice Response (IVR) technology, but other channels such as SMS, email, and in-bound IVR are growing quite rapidly. For Eliza, interactions are healthcare questions. Each question results in an answer. The questions and answers are implemented as a decision tree, and we capture each unique question-answer pair as a tuple: <question, response> = <“Did you visit your physician in the last 30 days?” , “Yes”> Post-outreach, these question-answer pairs need to be analyzed, sorted, aggregated, etc., and very often we need to process them differently for different customers. This means that keeping the data in raw form is important, as it makes analysis and reporting as flexible as possible. Imposing a schema-on-write can limit our ability to do what-if analysis.
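To illustrate why keeping the raw <question, response> tuples matters, here is a minimal sketch in which the same stored pairs are shaped differently at read time for two customers. The record fields, member IDs, and both report functions are hypothetical illustrations, not Eliza's actual schema or code.

```python
from collections import Counter

# Raw outreach results kept as captured: one <question, response> pair per record.
# Field names and values are illustrative assumptions.
VISIT_Q = "Did you visit your physician in the last 30 days?"
REFILL_Q = "Did you refill your prescription?"

RAW_PAIRS = [
    {"member": "m1", "question": VISIT_Q, "response": "Yes"},
    {"member": "m2", "question": VISIT_Q, "response": "No"},
    {"member": "m3", "question": VISIT_Q, "response": "Yes"},
    {"member": "m1", "question": REFILL_Q, "response": "No"},
]


def response_counts(pairs):
    """Hypothetical customer A view: count each answer per question."""
    counts = Counter()
    for pair in pairs:
        counts[(pair["question"], pair["response"])] += 1
    return counts


def members_answering(pairs, question, response):
    """Hypothetical customer B view: which members gave a specific answer."""
    return sorted(
        pair["member"]
        for pair in pairs
        if pair["question"] == question and pair["response"] == response
    )


counts = response_counts(RAW_PAIRS)
follow_up = members_answering(RAW_PAIRS, REFILL_Q, "No")
```

Because no shape is imposed when the pairs are written, a third view can be added later without reloading or restructuring the stored data.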
  6. Our strategic goals were… To create a next generation data architecture to support continued growth and functionality, allowing for maximum flexibility in analysis and reporting Have the ability to accept, transform and process old & new data streams Allow for both real-time and batch processing with little human intervention Provide metadata, catalog and data discovery for content in the data lake Enable access to data through entitlements and governance Increase the level of self-service enablement for end-users
  7. MODERN DATA LAKE (1. Catalog & metadata, 2. Storage & 3. Access control) The conceptual architecture for a Data Lake-centered platform has 3 main layers: An ingress layer that accepts a variety of data formats along both batch and streaming pathways A conditioning layer where data quality and governance rules can be applied, keeping corrupted or inaccurate data from entering the lake. At this layer, we can also generate and add metadata to any and all data streams. This metadata supports later search and discovery scenarios. A consumption layer, where downstream clients can access and subsequently analyze the data In addition to these primary layers, this canonical architecture readily supports data access control and data life-cycle management. And…the entire Data Lake and its services are accessed through an API that provides for streamlined querying and reads.
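The conditioning layer described in note 7 can be sketched roughly as follows: quality rules keep bad records out of the lake, and metadata is generated for later search and discovery. The rule names, record fields, and metadata keys are assumptions made for illustration only.

```python
from datetime import datetime, timezone

# Hypothetical data quality rules: each maps a rule name to a predicate.
RULES = {
    "has_member_id": lambda rec: bool(rec.get("member_id")),
    "response_present": lambda rec: rec.get("response") not in (None, ""),
}


def condition(records, source, rules=RULES):
    """Apply quality rules; return (clean records, rejected records, metadata)."""
    clean, rejected = [], []
    for rec in records:
        failed = [name for name, rule in rules.items() if not rule(rec)]
        if failed:
            # Rejected records never enter the lake; keep the reason for auditing.
            rejected.append({"record": rec, "failed_rules": failed})
        else:
            clean.append(rec)
    # Metadata that a catalog could index for search and discovery.
    metadata = {
        "source": source,
        "fields": sorted({field for rec in clean for field in rec}),
        "accepted": len(clean),
        "rejected": len(rejected),
        "conditioned_at": datetime.now(timezone.utc).isoformat(),
    }
    return clean, rejected, metadata


batch = [
    {"member_id": "m1", "response": "Yes"},
    {"member_id": "", "response": "No"},
    {"member_id": "m3", "response": ""},
]
clean, rejected, metadata = condition(batch, source="ftp/member-profiles")
```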
  8. The high-level technical architecture for the platform is based on the conceptual one we discussed a few slides back. --- In our implementation, outreach results (e.g., call dispositions, SMS responses) flow into the system via AWS Kinesis Streams on a continuous, real-time basis. We perform near real-time analytics and queries on the outreach results using Kinesis Analytics, and we radiate activity and volume statistics to our Network Operations Center. Other data, such as customer member profiles, come in through FTP and land in a “raw” S3 bucket. The system takes the “raw” data and passes it through a conditioning layer. Here, we apply data quality rules, generate metadata for downstream data access and client consumption, and build a metadata index for rapid search and retrieval. We move the data through the system at scale using Spark jobs on EMR clusters, and store some of the by-products in DynamoDB. After passing through the conditioning layer (which includes a HIPAA Obfuscation module for Dev & Test, not shown in the diagram), the data is stored in S3 buckets that use randomized keys to achieve an operationally effective distribution of the data across partitions. Moving to the storage layer, where the data is now “cooked”, we use DynamoDB tables to store the catalog and metadata, which provides a rapidly accessible map of the data space. IAM policies control access to the data. Down the line, we will implement data life-cycle management policies to move data from S3 to Glacier according to customers’ and HIPAA requirements. On the data access and consumption side, the lake serves a variety of clients. Two examples: we make ad-hoc querying available to our analytics and data science teams using Hive and Presto, and we feed the Enterprise Data Warehouse hosted on [AWS Redshift]. Internal clients can also access data from the read-only API.
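One common way to get the "randomized keys" mentioned in note 8 is to prepend a short hash of the logical path to each S3 key, so that objects spread across the key space instead of clustering under one date-based prefix. The sketch below assumes that approach; the function name, prefix length, and example path are hypothetical, and the notes do not say which scheme Eliza actually used.

```python
import hashlib


def randomized_key(logical_path: str, prefix_len: int = 4) -> str:
    """Prepend a short hash-derived prefix to a logical object path.

    The prefix is deterministic, so a catalog can store either the logical
    path or the full key and always recompute the other.
    """
    digest = hashlib.sha256(logical_path.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}/{logical_path}"


# Hypothetical logical path for a day's outreach results.
key = randomized_key("outreach/results/part-0001.json")
```

Since the prefix hides the natural ordering of the paths, a catalog (DynamoDB in the notes) is what keeps the data discoverable.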
  9. One key consumer of the data sets was the Redshift-based Enterprise Data Warehouse (EDW) for Eliza, fed by ETL orchestrated through AWS Data Pipeline. • Orchestration tool to ETL data into Redshift (EDW) • Has proper DB schema, etc. • DERT (Data Extraction and Retrieval Tool) - legacy extracts (Tableau ad hoc questions…)... • Lambda – tells Data Pipeline that something is available • S3 Copy Command • Data Pipeline – orchestrates ETL and populating Redshift • 2-7 GB per day is processed • Eliza – GB per day added….
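As a rough sketch of the trigger step in note 9, the handler below reads an S3 event notification and builds the Redshift COPY statement that would load each newly arrived object. In the flow described in the notes, the Lambda function notifies Data Pipeline, which then runs the load; that call is omitted here. The table name, IAM role ARN, bucket name, and CSV format are all placeholders assumed for illustration.

```python
from urllib.parse import unquote_plus

# Placeholder values; not taken from the presentation.
TARGET_TABLE = "edw.outreach_results"
IAM_ROLE = "arn:aws:iam::123456789012:role/HypotheticalRedshiftCopy"


def s3_objects_from_event(event):
    """Yield (bucket, key) for each record in an S3 event notification."""
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 event notifications.
        yield s3["bucket"]["name"], unquote_plus(s3["object"]["key"])


def build_copy_statement(table, bucket, key, iam_role):
    """Build a Redshift COPY statement for a single S3 object."""
    return (
        f"COPY {table} FROM 's3://{bucket}/{key}' "
        f"IAM_ROLE '{iam_role}' FORMAT AS CSV;"
    )


def handler(event, context=None):
    """Lambda-style entry point: one COPY statement per new object."""
    return [
        build_copy_statement(TARGET_TABLE, bucket, key, IAM_ROLE)
        for bucket, key in s3_objects_from_event(event)
    ]


sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "raw-bucket"},
                "object": {"key": "extracts/call+results.csv"}}}
    ]
}
statements = handler(sample_event)
```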
  10. Robust architecture to support data at scale. We improved… • Business agility, by allowing schema-on-read vs. traditional shaping of data in the ETL process • Our ability to manage costs, by allowing separation of costs associated with compute and storage resources • Flexibility, by providing elasticity, enabling resources to dynamically scale up or down based on demand • Reduced data transformations and touchpoints, resulting in elimination of about 30% of operational labor costs • Cycle time for end-to-end Client Analytics reduced by 50% • Customer Portal gets near real-time updates • Ad-hoc reporting tools are self-service with Presto • In the future, we are considering applying Machine Learning to data sets to determine if and when a member falls into a given set of clinical categories, helping our customers segment their populations in new and meaningful ways.
  11. A Minimum Viable Platform (MVP) with one thread of processing end-to-end is very helpful in deriving business value soon
  12. About Challenges Goals
  13. Scholastic ran its business on old IBM AS/400 technologies (initially launched in 1988) and Microsoft SQL Server. Prior to their engagement with NorthBay, Scholastic had over 5,500 AS/400 workloads and more than 350 SQL Server workloads performing their Big Data and Analytics work. These systems grew over a 20-year period, and like most systems of that vintage, were rife with problems. They were expensive to run, unable to keep up with the business needs, and inflexible.   Scholastic knew they needed to evolve, and decided to evaluate how they could use Amazon Web Services (AWS) to evolve their Big Data and Analytics workloads. They evaluated multiple AWS Partner Network (APN) Consulting Partners, and chose to engage with NorthBay to help them meet their objectives.
  14. Create a next generation platform that could provide stability, performance & scalability Have the ability to accept, transform and process old and new data streams Be able to handle real time streaming and all of the old warehouse Raise the level of self-service enablement for end-users
  15. Redshift was chosen to replace the AS/400 for its relational-style, high-performance data store. It is also a managed service with cost-optimization models, and it is elastic and scalable. • Redshift as the Enterprise Data Warehouse platform • S3 as the location for data migration, archival and analysis • NorthBay was chosen as the implementation partner, after talking to other vendors recommended by Amazon, for its expertise in Big Data and in Redshift migrations from traditional warehouses • Data Pipeline was chosen as the orchestration service for its flexibility and scalability • EMR with Sqoop was chosen as the ingestion process for its scalability and its ability to parallelize the processes • DynamoDB NoSQL store was chosen to store metadata about various processes. This is cost-effective and offers flexibility with schema changes • Key Management Service is used to encrypt credentials for source and target data stores • SNS and Lambda are used to trigger success and failure notifications and handle them appropriately for their processes
  16. Not invested in CDC/ETL tool for migrating iSeries AS/400 data. After the success of a 3-month pilot: • Project timeline accelerated from 3 years to 18 months • Handful of workloads were transformed into the new platform • Parallel work streams started utilizing same core framework • AWS skills transferred to Scholastic’s technical team • IT team understood and gained comfort with AWS. Phase 1: 3-month pilot to: • Transform a handful of their workloads into the new platform (AWS/Redshift/S3) • Demonstrate to the business users funding the project that they will get better information/knowledge to run their business with the new platform • Stand up the AWS environment to allow IT to understand and gain comfort with AWS and the Cloud • Transfer AWS skills to the publisher’s technical team • Prove that a 1 Team approach will work (Client’s team and NorthBay). Due to the HUGE pilot success, the CEO moved budget forward to accelerate project timeline from 3 years to 18 months (saving 18 mos. of cost on AS/400 and SQL Server). Parallel work streams (swim lanes) were started utilizing the same core framework for migrating data for other business units. NorthBay partnered with each of those work streams to accelerate the migration. The team who developed the framework (NorthBay/Customer IT) is helping other initiatives at Customer by training and by offering best practices and lessons learnt around AWS, CI/CD and running projects in an Agile manner.
  17. After the success of a 3-month pilot: • Project timeline accelerated from 3 years to 18 months • Handful of workloads were transformed into the new platform • Parallel work streams started utilizing same core framework • AWS skills transferred to Scholastic’s technical team • IT team understood and gained comfort with AWS. Phase 1: 3-month pilot to: • Transform a handful of their workloads into the new platform (AWS/Redshift/S3) • Demonstrate to the business users funding the project that they will get better information/knowledge to run their business with the new platform • Stand up the AWS environment to allow IT to understand and gain comfort with AWS and the Cloud • Transfer AWS skills to the publisher’s technical team • Prove that a 1 Team approach will work (Client’s team and NorthBay). Due to the HUGE pilot success, the CEO moved budget forward to accelerate project timeline from 3 years to 18 months (saving 18 mos. of cost on AS/400 and SQL Server). Parallel work streams (swim lanes) were started utilizing the same core framework for migrating data for other business units. NorthBay partnered with each of those work streams to accelerate the migration. The team who developed the framework (NorthBay/Customer IT) is helping other initiatives at Customer by training and by offering best practices and lessons learnt around AWS, CI/CD and running projects in an Agile manner.
  18. + Operational systems (POS, customer system, etc.) + Control-M is their scheduler: on-prem jobs run... manages job dependencies... execute job x... migrate to Redshift + Custom Python framework: reads DynamoDB and creates a Data Pipeline for each table Core Framework • Jobs and Job Groups are defined as metadata in DynamoDB • Control-M scheduler, Custom Application and Data Pipeline for orchestration • ELT process with EMR/Sqoop for extraction, Redshift load, and transforming the data through SQL scripts • Core Framework allows for: restart capability from point of failure; capturing of operational statistics (# of rows updated etc.); audit capability (which feed caused the Fact to change etc.) + Data Pipeline is the job orchestration process + Each job creates its own pipeline... jobs are bundled into groups + DynamoDB stores the metadata (which schema, which job, job group, data source) + Sqoop: parallel reads from mainframe... to S3... limit of AS/400 is 16-20 parallel connections + Facts & Dimensions in Redshift are SQL scripts saved in S3
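The core framework ideas in note 18 (jobs and job groups defined as metadata, restart from the point of failure, and capture of operational statistics) can be sketched as below. A plain dictionary stands in for the DynamoDB metadata, and the job names, row counts, and the simulated failure are all hypothetical; the real framework created a Data Pipeline per job rather than calling Python functions.

```python
# Stand-in for job-group metadata that the notes say lives in DynamoDB.
JOB_GROUP = {
    "group": "sales_daily",
    "jobs": ["extract_orders", "load_orders", "transform_fact_sales"],
}


def run_group(group, actions, state):
    """Run jobs in order, skipping completed ones; stop at the first failure.

    `state` persists between runs, which is what lets a rerun resume from
    the point of failure instead of starting over.
    """
    for job in group["jobs"]:
        if state.get(job, {}).get("status") == "done":
            continue  # finished in an earlier run
        try:
            rows = actions[job]()
            # Operational statistic: number of rows the job handled.
            state[job] = {"status": "done", "rows": rows}
        except Exception as exc:
            state[job] = {"status": "failed", "error": str(exc)}
            return False
    return True


# Simulated jobs: the load step fails once, then succeeds on retry.
calls = {"extract_orders": 0, "load_orders": 0}


def extract_orders():
    calls["extract_orders"] += 1
    return 120


def flaky_load():
    calls["load_orders"] += 1
    if calls["load_orders"] == 1:
        raise RuntimeError("connection dropped")
    return 120


ACTIONS = {
    "extract_orders": extract_orders,
    "load_orders": flaky_load,
    "transform_fact_sales": lambda: 45,
}

state = {}
first_ok = run_group(JOB_GROUP, ACTIONS, state)   # stops at load_orders
second_ok = run_group(JOB_GROUP, ACTIONS, state)  # resumes at load_orders
```

On the second run the extract step is skipped because its status is already recorded as done, so only the failed step and the steps after it execute.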
  19. After the success of the 3 month pilot the timeline was re-visited Savings on AS/400 and SQL Server - Cost of being on 2 platforms eliminated
  20. Consolidating logging solution across S3, Redshift, DynamoDB etc. was a challenge
  21. Consolidating logging solution across S3, Redshift, DynamoDB etc. was a challenge
  22. Send 5 pre-scripted q&a questions to Lo, Sai & Angela for review