© 2020 Snowflake Inc. All Rights Reserved
Data Science and
AI/ML at Scale
Scott Hruby | Snowflake Solutions Engineer
Snowflake 101
“The rapid rise of gathered/analyzed digital data
is often core to the holistic success of the fastest growing &
most successful companies of our time around the world.”
– Mary Meeker, Bond Capital
DATA:
THE WORLD’S MOST VALUABLE RESOURCE
DATA:
THE NEW SUPERPOWER
NEW TECHNOLOGY TRENDS
CHANGE HOW WE USE DATA
Rise of the Cloud: Cloud gives us the ability to scale and centralize data
Explosion of Data: IoT, mobile, and social open up new opportunities for insight
Diversification of Analytics: Analytics is growing in importance, everywhere, and for everyone
On Premises
EDW
1st Gen Cloud
EDW
Data Lake,
Hadoop
Cloud Data
Platform
All Data
All Users
Fast Answers
SQL Database
JOURNEY TO A CLOUD DATA PLATFORM
TRADITIONAL DATA ARCHITECTURE
Complex, Costly & Constrained
OLTP
Databases
Enterprise
Applications
Third-Party
Web/Log
Data
IoT
Data
Integration
Data
Transformation
Data
Analytics
Normalization
& Aggregation
Ad Hoc
Analysis
Real-time
Analytics
Operational
Reporting
DATA
SOURCES
DATA
CONSUMERS
ELT
Streaming
ELT
Data Marts
Data Warehouses
Backups
File Sharing
Cubes
Data Lake
CDC
Data Science
SNOWFLAKE CLOUD DATA PLATFORM
ONE PLATFORM, ONE COPY OF DATA,
MANY WORKLOADS
DATA
SOURCES
OLTP DATABASES
ENTERPRISE
APPLICATIONS
THIRD-PARTY
WEB/LOG DATA
IoT
DATA
CONSUMERS
DATA MONETIZATION
OPERATIONAL
REPORTING
AD HOC ANALYSIS
REAL-TIME ANALYTICS
SNOWFLAKE ARCHITECTURE
Data Science & AI
DATA SCIENCE IS PREDICTIVE ANALYTICS
Descriptive Analytics: What happened?
Diagnostic Analytics: Why did it happen?
Predictive Analytics: What will happen?
Prescriptive Analytics: How can we make it happen?
Range of examples: reports to self-driving car
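The difference between the "what happened?" and "what will happen?" questions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the monthly sales figures and the helper name `fit_line` are hypothetical, and an ordinary least-squares line stands in for whatever model a real project would use.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return slope, y_bar - slope * x_bar

# Hypothetical monthly sales figures (not from the deck).
months = [1, 2, 3, 4]
sales = [10.0, 12.0, 14.0, 16.0]

# Descriptive: summarize what already happened.
average_sales = mean(sales)

# Predictive: extrapolate the fitted trend to month 5.
slope, intercept = fit_line(months, sales)
forecast_month_5 = slope * 5 + intercept

print(f"average so far: {average_sales}, forecast for month 5: {forecast_month_5}")
```

The descriptive number only summarizes the past, while the forecast relies on an assumed model of how the data behaves.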
85% of #bigdata projects fail to move past preliminary stages
Gartner - Nov. 2017
80% of analytics insights will not deliver business outcomes through 2022
Gartner - Jan 2019
87% of data science projects never make it into production
VentureBeat AI - July 2019
80% of AI projects will “remain alchemy, run by wizards” through 2020
Gartner - Jan 2019
KEY REQUIREMENTS FOR DATA SCIENCE
CONSOLIDATED SOURCE FOR ALL DATA
● Structured, semi-structured, and unstructured data
● 3rd-party data sharing
● Streaming & batch
EFFICIENT DATA PREPARATION
● Dedicated compute clusters for each team
● No resource contention
EXTENSIVE PARTNER ECOSYSTEM
● Integration with the latest ML tools and libraries
● Consistent experience for BI and ML users
DATA SCIENCE MATURITY
Value of Data
1. IDENTIFYING USE CASES
● Defining business use cases
● Focused on BI
● Exploring tools
● Experimenting with languages
● Training and hiring
● Testing data pipelines
DATA CHALLENGES
● 15% data utilization
● Understanding data
● Preparing data
OVERALL CHALLENGES
● Acquiring talent
● Designing & building consistent processes & pipelines
● Defining business value
2. LIMITED PRODUCTION
● Working on creating efficiencies
● Mapping ML to business value
● Planning growth & new use cases
● Testing various languages & tools
● Training and hiring
● Optimizing data pipelines
DATA CHALLENGES
● 70% data utilization + external
● Data wrangling, ETL + data validation
OVERALL CHALLENGES
● Training and hiring
● Scalability of tools & team
● Cost containment
● Creating ROI consistently
3. COMPETITIVE ADVANTAGE
● Large data science team
● ML built into product/service
● Standardized on tools/languages
● Advanced ML for video, audio, and images
● Automated data pipelines
DATA CHALLENGES
● 130% data utilization
● Obtaining more data
● Streaming data
OVERALL CHALLENGES
● Retaining talent
● Optimize, simulate, & test model efficacy
● Staying ahead of competitors
DATA SCIENCE WITH SNOWFLAKE
ML FRAMEWORKS & LIBRARIES
STATISTICAL ALGORITHMS
● Suitable for most data science problems involving structured and semi-structured data
● Good performance with small amounts of data
● Mature - have been around for a while
NEURAL NETWORK / DEEP LEARNING
● Mostly used with unstructured data like images, audio, and video
● Performance increases with the size of the training data
ML PARTNER ECOSYSTEM
NOTEBOOK-BASED: Powerful but Complex
AUTOMATION-BASED (AutoML): Fast and Easy but Less Customizable
MLlib
PROVEN BY OVER 5,400 CUSTOMERS
EVER EXPANDING ECOSYSTEM
SECURE AND
GOVERNED ACCESS
TO ALL DATA
SNOWFLAKE SECURITY AT A GLANCE
Snowflake Operational Controls
• NIST 800-53
• SOC2 Type 2
• HIPAA
• PCI
• FedRAMP
Access
• All communication secured & encrypted
• TLS 1.2 encryption in both trusted and untrusted networks
• IP Whitelisting
Authentication
• Password Policy enforcement
• Multifactor Authentication
• SAML 2.0 support for Federated Authentication
Application
• Flexible user management
• Role-based access control for granular control
• RBAC for data and actions
Data
• Encrypted at rest
• Hierarchical key model rooted in Cloud HSM
• Automatic key rotation
• Time Travel 1-90 days
• Tri-Secret Secure
• Query statement encryption
Infrastructure
• AWS, Azure Physical Security
• AWS, Azure Redundancy
• Regional Data Centers
▪ US
▪ EU
▪ AP
COMPREHENSIVE DATA PROTECTION
Protection against infrastructure failures
All data transparently & synchronously replicated
3+ ways across independent infrastructure
Protection against corruption & user errors
“Time travel” feature enables instant roll-back to
any point in time during chosen retention window
Long-term data protection
Zero-copy clones + optional export to cloud
object storage enable user-managed data copies
[Figure: timeline of table versions T0, T1, T2 marking new data and modified data, with the query SELECT * FROM T0…; further labels: Daily, Weekly]
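The ideas of rolling back to a point inside a retention window and of cloning without copying data can be illustrated with a toy Python model. This is a conceptual sketch only, not how Snowflake stores data: the class `VersionedTable`, its methods, and the integer timestamps are all hypothetical names chosen for the example.

```python
from typing import List, Tuple

class VersionedTable:
    """Toy table that keeps immutable snapshots, one per write."""

    def __init__(self, retention: int) -> None:
        self.retention = retention  # how far back (in time units) reads are allowed
        self.versions: List[Tuple[int, Tuple[str, ...]]] = []  # (timestamp, rows)

    def write(self, ts: int, rows: List[str]) -> None:
        """Record a new snapshot; timestamps are assumed to increase."""
        self.versions.append((ts, tuple(rows)))

    def as_of(self, ts: int) -> Tuple[str, ...]:
        """Return the rows as they were at time ts ("time travel")."""
        if not self.versions:
            raise LookupError("table has no data")
        latest_ts = self.versions[-1][0]
        if ts < latest_ts - self.retention:
            raise LookupError("timestamp is outside the retention window")
        rows = None
        for version_ts, snapshot in self.versions:
            if version_ts <= ts:
                rows = snapshot
        if rows is None:
            raise LookupError("table did not exist at that time")
        return rows

    def clone(self) -> "VersionedTable":
        """Zero-copy style clone: only the version list is copied, snapshots are shared."""
        twin = VersionedTable(self.retention)
        twin.versions = list(self.versions)
        return twin

table = VersionedTable(retention=90)
table.write(0, ["a"])         # T0
table.write(1, ["a", "b"])    # T1: new data
table.write(2, ["a2", "b"])   # T2: modified data

backup = table.clone()        # user-managed copy taken before a mistake
table.write(3, [])            # accidental delete

print(table.as_of(0))         # state at T0
print(backup.as_of(3))        # the clone is unaffected by the later delete
```

Because snapshots are immutable, the clone and the original can share them safely; only later writes diverge.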
Dynamically Mask Protected (PII, PHI)
Column Data at Query Time
• No change to the stored data
• Mask or partial mask using constant
value, hash, and custom functions
• Unmask for authorized users only
Policy Based Control
• Table/View owners and privileged users (such as accountadmin) are unauthorized by default
• Centralized policy management
Ease of Management
• Apply single policy to multiple columns
• Prevent secure view explosion
Alice (Unauthorized)
ID   Phone         SSN
101  ***-***-5534  *********
102  ***-***-3564  *********
103  ***-***-9787  *********
Bob (Authorized)
ID   Phone         SSN
101  408-123-5534  *********
102  510-335-3564  *********
103  214-553-9787  *********
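The two result sets above can be reproduced with a small Python sketch of masking at read time. It is an illustration under assumptions, not Snowflake's mechanism: the names `mask_phone`, `mask_ssn`, `POLICIES`, and `query` are hypothetical, and the stored SSN values are placeholders because the slide never shows them unmasked.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]

def mask_phone(value: str, authorized: bool) -> str:
    """Partial mask: unauthorized readers see only the last four digits."""
    return value if authorized else "***-***-" + value[-4:]

def mask_ssn(value: str, authorized: bool) -> str:
    """Full mask with a constant; on the slide the SSN is hidden from both users."""
    return "*********"

# One policy per protected column; unlisted columns pass through unchanged.
POLICIES: Dict[str, Callable[[str, bool], str]] = {"Phone": mask_phone, "SSN": mask_ssn}

# Stored rows are never modified. The SSN values are placeholders (assumption).
STORED: List[Row] = [
    {"ID": "101", "Phone": "408-123-5534", "SSN": "000-00-0001"},
    {"ID": "102", "Phone": "510-335-3564", "SSN": "000-00-0002"},
    {"ID": "103", "Phone": "214-553-9787", "SSN": "000-00-0003"},
]

def query(rows: List[Row], authorized: bool) -> List[Row]:
    """Apply the column policies while reading, leaving `rows` untouched."""
    result = []
    for row in rows:
        masked = {}
        for column, value in row.items():
            policy = POLICIES.get(column)
            masked[column] = policy(value, authorized) if policy else value
        result.append(masked)
    return result

for label, authorized in (("Alice", False), ("Bob", True)):
    print(label, query(STORED, authorized))
```

Keeping the policies in one mapping mirrors the "apply single policy to multiple columns" point: the stored data stays as is and only the read path changes.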
POLICIES
INGEST RAW DATA
GOVERNANCE AND SECURITY
Dynamic data masking
PRIVATE PUBLIC GA
DB 1
Table 1
Column 1
DB 1
View 1
Column 1
DB n
Table n
Column n
<policy condition>
<masking function>
Masking Policy
Resource(s)
Policy
Admin
Apply
Masking Policy Example
CASE
  WHEN invoker_role() IN ('pii_role') THEN val       -- Unmask
  WHEN invoker_role() IN ('support') THEN
    regexp_replace(val, '.+@', '*****@')             -- Partial mask
  ELSE '********'                                    -- Mask
END;
Masking Policy
• Policy contains condition(s) and masking
function to apply under those conditions
• Policy is applied to one or more table,
view, or external table columns in an
account
• Nested policy execution for views - policy
on table executed before policy on view(s)
Supports
• All data types
• Data sharing
• Streams
• Clone carries over policy associations
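The three branches of the policy example (unmask, partial mask, mask) can be mirrored in Python, which may be handy for reasoning about a rule before deploying it. This is a sketch under assumptions: the function name `apply_masking_policy` and the sample e-mail address are hypothetical, while the role names and the regular expression come from the slide.

```python
import re

def apply_masking_policy(val: str, invoker_role: str) -> str:
    """Mimic the slide's CASE expression for a single column value."""
    if invoker_role in ("pii_role",):
        return val  # unmask: the authorized role sees the stored value
    if invoker_role in ("support",):
        # partial mask: replace everything up to and including '@'
        return re.sub(r".+@", "*****@", val)
    return "********"  # mask: every other role gets a constant

for role in ("pii_role", "support", "analyst"):
    print(role, apply_masking_policy("jane.doe@example.com", role))
```

The stored value is never changed; the role of the caller alone decides which form is returned.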
GOVERNANCE AND SECURITY
Dynamic data masking policies
PRIVATE PUBLIC GA
ID Phone SSN
101 408-123-5534 387-78-3456
102 510-334-3564 226-44-8908
103 214-553-9787 359-9987-0098
Ingest tokenized data
Ingest Protected (PII/PHI) Data
as Externally Tokenized
• Using Protegrity agents on ETL
tools.
De-tokenize for Authorized
Users at Query Time
• Protegrity DSG called using external
functions to de-tokenize data.
• For unauthorized users, Protegrity
DSG is not called.
Policy Based Control
• Table/View owners and privileged users (such as accountadmin) are unauthorized by default
• Centralized policy management
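The flow described above (store tokens, call the external service only for authorized readers) can be sketched in Python. Everything here is a stand-in: `TokenVault` is a hypothetical in-memory substitute for an external tokenization service, `read_column` and the role name `pii_role` are assumptions, and the random hex tokens are not format-preserving like the values shown on the slide.

```python
import secrets
from typing import Dict, FrozenSet

class TokenVault:
    """Hypothetical stand-in for an external tokenization service."""

    def __init__(self) -> None:
        self._token_to_value: Dict[str, str] = {}
        self.calls = 0  # counts de-tokenization requests

    def tokenize(self, value: str) -> str:
        """Replace a sensitive value with a random token before ingestion."""
        token = "tok_" + secrets.token_hex(8)
        self._token_to_value[token] = value
        return token

    def detokenize(self, token: str) -> str:
        """Look up the original value; only reached for authorized readers."""
        self.calls += 1
        return self._token_to_value[token]

def read_column(
    stored_token: str,
    role: str,
    vault: TokenVault,
    authorized_roles: FrozenSet[str] = frozenset({"pii_role"}),
) -> str:
    """Return the clear value for authorized roles, the token otherwise."""
    if role in authorized_roles:
        return vault.detokenize(stored_token)
    return stored_token  # the external service is not called at all

vault = TokenVault()
stored = vault.tokenize("408-123-5534")  # what the warehouse would hold

print(read_column(stored, "analyst", vault))   # token only
print(read_column(stored, "pii_role", vault))  # original value
```

Since the warehouse only ever holds tokens, an unauthorized query cannot leak clear values even if the masking logic is bypassed.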
Customer VPC / VNet
Data Security
Gateway
(DSG)
POLICIES
Tokenized
De-tokenized
De-tokenized
Tokenized
ID Phone SSN
101 111-222-3333 000-78-9999
102 002-778-9904 779-66-8908
103 100-887-8888 111-00-8888
REST API
Alice
(Unauthorized)
Bob
(Authorized)
EXTERNAL
FUNCTION
GOVERNANCE AND SECURITY
External tokenization using third party
PRIVATE PUBLIC GA
“I can only show you the door. You’re the one
that has to walk through it”
THANK YOU

More Related Content

What's hot

Zero to Snowflake Presentation
Zero to Snowflake Presentation Zero to Snowflake Presentation
Zero to Snowflake Presentation Brett VanderPlaats
 
Data Warehouse - Incremental Migration to the Cloud
Data Warehouse - Incremental Migration to the CloudData Warehouse - Incremental Migration to the Cloud
Data Warehouse - Incremental Migration to the CloudMichael Rainey
 
Actionable Insights with AI - Snowflake for Data Science
Actionable Insights with AI - Snowflake for Data ScienceActionable Insights with AI - Snowflake for Data Science
Actionable Insights with AI - Snowflake for Data ScienceHarald Erb
 
Modernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureModernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureDatabricks
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 
Demystifying Data Warehouse as a Service
Demystifying Data Warehouse as a ServiceDemystifying Data Warehouse as a Service
Demystifying Data Warehouse as a ServiceSnowflake Computing
 
Snowflake Architecture.pptx
Snowflake Architecture.pptxSnowflake Architecture.pptx
Snowflake Architecture.pptxchennakesava44
 
Data Architecture Best Practices for Advanced Analytics
Data Architecture Best Practices for Advanced AnalyticsData Architecture Best Practices for Advanced Analytics
Data Architecture Best Practices for Advanced AnalyticsDATAVERSITY
 
Data Mesh for Dinner
Data Mesh for DinnerData Mesh for Dinner
Data Mesh for DinnerKent Graziano
 
Snowflake: The Good, the Bad, and the Ugly
Snowflake: The Good, the Bad, and the UglySnowflake: The Good, the Bad, and the Ugly
Snowflake: The Good, the Bad, and the UglyTyler Wishnoff
 
Building a modern data warehouse
Building a modern data warehouseBuilding a modern data warehouse
Building a modern data warehouseJames Serra
 
Snowflake for Data Engineering
Snowflake for Data EngineeringSnowflake for Data Engineering
Snowflake for Data EngineeringHarald Erb
 
Snowflake essentials
Snowflake essentialsSnowflake essentials
Snowflake essentialsqureshihamid
 
Demystifying Data Warehousing as a Service - DFW
Demystifying Data Warehousing as a Service - DFWDemystifying Data Warehousing as a Service - DFW
Demystifying Data Warehousing as a Service - DFWKent Graziano
 
AWS Summit Singapore 2019 | Snowflake: Your Data. No Limits
AWS Summit Singapore 2019 | Snowflake: Your Data. No LimitsAWS Summit Singapore 2019 | Snowflake: Your Data. No Limits
AWS Summit Singapore 2019 | Snowflake: Your Data. No LimitsAWS Summits
 
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data Pipelines
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data PipelinesPutting the Ops in DataOps: Orchestrate the Flow of Data Across Data Pipelines
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data PipelinesDATAVERSITY
 
Intro to Data Vault 2.0 on Snowflake
Intro to Data Vault 2.0 on SnowflakeIntro to Data Vault 2.0 on Snowflake
Intro to Data Vault 2.0 on SnowflakeKent Graziano
 
How to Build the Data Mesh Foundation: A Principled Approach | Zhamak Dehghan...
How to Build the Data Mesh Foundation: A Principled Approach | Zhamak Dehghan...How to Build the Data Mesh Foundation: A Principled Approach | Zhamak Dehghan...
How to Build the Data Mesh Foundation: A Principled Approach | Zhamak Dehghan...HostedbyConfluent
 
DataOps - The Foundation for Your Agile Data Architecture
DataOps - The Foundation for Your Agile Data ArchitectureDataOps - The Foundation for Your Agile Data Architecture
DataOps - The Foundation for Your Agile Data ArchitectureDATAVERSITY
 

What's hot (20)

Zero to Snowflake Presentation
Zero to Snowflake Presentation Zero to Snowflake Presentation
Zero to Snowflake Presentation
 
Data Warehouse - Incremental Migration to the Cloud
Data Warehouse - Incremental Migration to the CloudData Warehouse - Incremental Migration to the Cloud
Data Warehouse - Incremental Migration to the Cloud
 
Actionable Insights with AI - Snowflake for Data Science
Actionable Insights with AI - Snowflake for Data ScienceActionable Insights with AI - Snowflake for Data Science
Actionable Insights with AI - Snowflake for Data Science
 
Modernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureModernizing to a Cloud Data Architecture
Modernizing to a Cloud Data Architecture
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
Demystifying Data Warehouse as a Service
Demystifying Data Warehouse as a ServiceDemystifying Data Warehouse as a Service
Demystifying Data Warehouse as a Service
 
Snowflake Architecture.pptx
Snowflake Architecture.pptxSnowflake Architecture.pptx
Snowflake Architecture.pptx
 
Data Architecture Best Practices for Advanced Analytics
Data Architecture Best Practices for Advanced AnalyticsData Architecture Best Practices for Advanced Analytics
Data Architecture Best Practices for Advanced Analytics
 
Data Mesh for Dinner
Data Mesh for DinnerData Mesh for Dinner
Data Mesh for Dinner
 
Snowflake Architecture
Snowflake ArchitectureSnowflake Architecture
Snowflake Architecture
 
Snowflake: The Good, the Bad, and the Ugly
Snowflake: The Good, the Bad, and the UglySnowflake: The Good, the Bad, and the Ugly
Snowflake: The Good, the Bad, and the Ugly
 
Building a modern data warehouse
Building a modern data warehouseBuilding a modern data warehouse
Building a modern data warehouse
 
Snowflake for Data Engineering
Snowflake for Data EngineeringSnowflake for Data Engineering
Snowflake for Data Engineering
 
Snowflake essentials
Snowflake essentialsSnowflake essentials
Snowflake essentials
 
Demystifying Data Warehousing as a Service - DFW
Demystifying Data Warehousing as a Service - DFWDemystifying Data Warehousing as a Service - DFW
Demystifying Data Warehousing as a Service - DFW
 
AWS Summit Singapore 2019 | Snowflake: Your Data. No Limits
AWS Summit Singapore 2019 | Snowflake: Your Data. No LimitsAWS Summit Singapore 2019 | Snowflake: Your Data. No Limits
AWS Summit Singapore 2019 | Snowflake: Your Data. No Limits
 
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data Pipelines
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data PipelinesPutting the Ops in DataOps: Orchestrate the Flow of Data Across Data Pipelines
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data Pipelines
 
Intro to Data Vault 2.0 on Snowflake
Intro to Data Vault 2.0 on SnowflakeIntro to Data Vault 2.0 on Snowflake
Intro to Data Vault 2.0 on Snowflake
 
How to Build the Data Mesh Foundation: A Principled Approach | Zhamak Dehghan...
How to Build the Data Mesh Foundation: A Principled Approach | Zhamak Dehghan...How to Build the Data Mesh Foundation: A Principled Approach | Zhamak Dehghan...
How to Build the Data Mesh Foundation: A Principled Approach | Zhamak Dehghan...
 
DataOps - The Foundation for Your Agile Data Architecture
DataOps - The Foundation for Your Agile Data ArchitectureDataOps - The Foundation for Your Agile Data Architecture
DataOps - The Foundation for Your Agile Data Architecture
 

Similar to Snowflake Data Science and AI/ML at Scale

Delivering Data Democratization in the Cloud with Snowflake
Delivering Data Democratization in the Cloud with SnowflakeDelivering Data Democratization in the Cloud with Snowflake
Delivering Data Democratization in the Cloud with SnowflakeKent Graziano
 
Snowflake’s Cloud Data Platform and Modern Analytics
Snowflake’s Cloud Data Platform and Modern AnalyticsSnowflake’s Cloud Data Platform and Modern Analytics
Snowflake’s Cloud Data Platform and Modern AnalyticsSenturus
 
Successful AI/ML Projects with End-to-End Cloud Data Engineering
Successful AI/ML Projects with End-to-End Cloud Data EngineeringSuccessful AI/ML Projects with End-to-End Cloud Data Engineering
Successful AI/ML Projects with End-to-End Cloud Data EngineeringDatabricks
 
Balance agility and governance with #TrueDataOps and The Data Cloud
Balance agility and governance with #TrueDataOps and The Data CloudBalance agility and governance with #TrueDataOps and The Data Cloud
Balance agility and governance with #TrueDataOps and The Data CloudKent Graziano
 
There are 250 Database products, are you running the right one?
There are 250 Database products, are you running the right one?There are 250 Database products, are you running the right one?
There are 250 Database products, are you running the right one?Aerospike, Inc.
 
Idera live 2021: Keynote Presentation The Future of Data is The Data Cloud b...
Idera live 2021:  Keynote Presentation The Future of Data is The Data Cloud b...Idera live 2021:  Keynote Presentation The Future of Data is The Data Cloud b...
Idera live 2021: Keynote Presentation The Future of Data is The Data Cloud b...IDERA Software
 
Customer migration to Azure SQL database, December 2019
Customer migration to Azure SQL database, December 2019Customer migration to Azure SQL database, December 2019
Customer migration to Azure SQL database, December 2019George Walters
 
A Key to Real-time Insights in a Post-COVID World (ASEAN)
A Key to Real-time Insights in a Post-COVID World (ASEAN)A Key to Real-time Insights in a Post-COVID World (ASEAN)
A Key to Real-time Insights in a Post-COVID World (ASEAN)Denodo
 
How to Build Continuous Ingestion for the Internet of Things
How to Build Continuous Ingestion for the Internet of ThingsHow to Build Continuous Ingestion for the Internet of Things
How to Build Continuous Ingestion for the Internet of ThingsCloudera, Inc.
 
Datenvirtualisierung: Wie Sie Ihre Datenarchitektur agiler machen (German)
Datenvirtualisierung: Wie Sie Ihre Datenarchitektur agiler machen (German)Datenvirtualisierung: Wie Sie Ihre Datenarchitektur agiler machen (German)
Datenvirtualisierung: Wie Sie Ihre Datenarchitektur agiler machen (German)Denodo
 
A Connected Data Landscape: Virtualization and the Internet of Things
A Connected Data Landscape: Virtualization and the Internet of ThingsA Connected Data Landscape: Virtualization and the Internet of Things
A Connected Data Landscape: Virtualization and the Internet of ThingsInside Analysis
 
Slides-Discover-Power-of-Live-Data(2).pdf
Slides-Discover-Power-of-Live-Data(2).pdfSlides-Discover-Power-of-Live-Data(2).pdf
Slides-Discover-Power-of-Live-Data(2).pdfbutthead7
 
Building the Internet of Everything
Building the Internet of Everything Building the Internet of Everything
Building the Internet of Everything Cisco Canada
 
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...Denodo
 
Big Data for Product Managers
Big Data for Product ManagersBig Data for Product Managers
Big Data for Product ManagersPentaho
 
Cisco Connect Toronto 2018 an introduction to Cisco kinetic
Cisco Connect Toronto 2018   an introduction to Cisco kineticCisco Connect Toronto 2018   an introduction to Cisco kinetic
Cisco Connect Toronto 2018 an introduction to Cisco kineticCisco Canada
 
Cisco Connect Toronto 2018 an introduction to Cisco kinetic
Cisco Connect Toronto 2018   an introduction to Cisco kineticCisco Connect Toronto 2018   an introduction to Cisco kinetic
Cisco Connect Toronto 2018 an introduction to Cisco kineticCisco Canada
 
Implement a Universal Data Distribution Architecture to Manage All Streaming ...
Implement a Universal Data Distribution Architecture to Manage All Streaming ...Implement a Universal Data Distribution Architecture to Manage All Streaming ...
Implement a Universal Data Distribution Architecture to Manage All Streaming ...Timothy Spann
 

Similar to Snowflake Data Science and AI/ML at Scale (20)

Delivering Data Democratization in the Cloud with Snowflake
Delivering Data Democratization in the Cloud with SnowflakeDelivering Data Democratization in the Cloud with Snowflake
Delivering Data Democratization in the Cloud with Snowflake
 
Snowflake’s Cloud Data Platform and Modern Analytics
Snowflake’s Cloud Data Platform and Modern AnalyticsSnowflake’s Cloud Data Platform and Modern Analytics
Snowflake’s Cloud Data Platform and Modern Analytics
 
Successful AI/ML Projects with End-to-End Cloud Data Engineering
Successful AI/ML Projects with End-to-End Cloud Data EngineeringSuccessful AI/ML Projects with End-to-End Cloud Data Engineering
Successful AI/ML Projects with End-to-End Cloud Data Engineering
 
Balance agility and governance with #TrueDataOps and The Data Cloud
Balance agility and governance with #TrueDataOps and The Data CloudBalance agility and governance with #TrueDataOps and The Data Cloud
Balance agility and governance with #TrueDataOps and The Data Cloud
 
Big data for product managers
Big data for product managersBig data for product managers
Big data for product managers
 
There are 250 Database products, are you running the right one?
There are 250 Database products, are you running the right one?There are 250 Database products, are you running the right one?
There are 250 Database products, are you running the right one?
 
Idera live 2021: Keynote Presentation The Future of Data is The Data Cloud b...
Idera live 2021:  Keynote Presentation The Future of Data is The Data Cloud b...Idera live 2021:  Keynote Presentation The Future of Data is The Data Cloud b...
Idera live 2021: Keynote Presentation The Future of Data is The Data Cloud b...
 
Customer migration to Azure SQL database, December 2019
Customer migration to Azure SQL database, December 2019Customer migration to Azure SQL database, December 2019
Customer migration to Azure SQL database, December 2019
 
A Key to Real-time Insights in a Post-COVID World (ASEAN)
A Key to Real-time Insights in a Post-COVID World (ASEAN)A Key to Real-time Insights in a Post-COVID World (ASEAN)
A Key to Real-time Insights in a Post-COVID World (ASEAN)
 
How to Build Continuous Ingestion for the Internet of Things
How to Build Continuous Ingestion for the Internet of ThingsHow to Build Continuous Ingestion for the Internet of Things
How to Build Continuous Ingestion for the Internet of Things
 
Datenvirtualisierung: Wie Sie Ihre Datenarchitektur agiler machen (German)
Datenvirtualisierung: Wie Sie Ihre Datenarchitektur agiler machen (German)Datenvirtualisierung: Wie Sie Ihre Datenarchitektur agiler machen (German)
Datenvirtualisierung: Wie Sie Ihre Datenarchitektur agiler machen (German)
 
A Connected Data Landscape: Virtualization and the Internet of Things
A Connected Data Landscape: Virtualization and the Internet of ThingsA Connected Data Landscape: Virtualization and the Internet of Things
A Connected Data Landscape: Virtualization and the Internet of Things
 
Slides-Discover-Power-of-Live-Data(2).pdf
Slides-Discover-Power-of-Live-Data(2).pdfSlides-Discover-Power-of-Live-Data(2).pdf
Slides-Discover-Power-of-Live-Data(2).pdf
 
Building the Internet of Everything
Building the Internet of Everything Building the Internet of Everything
Building the Internet of Everything
 
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...
Self Service Analytics and a Modern Data Architecture with Data Virtualizatio...
 
Big Data for Product Managers
Big Data for Product ManagersBig Data for Product Managers
Big Data for Product Managers
 
Cisco Connect Toronto 2018 an introduction to Cisco kinetic
Cisco Connect Toronto 2018   an introduction to Cisco kineticCisco Connect Toronto 2018   an introduction to Cisco kinetic
Cisco Connect Toronto 2018 an introduction to Cisco kinetic
 
Cisco Connect Toronto 2018 an introduction to Cisco kinetic
Cisco Connect Toronto 2018   an introduction to Cisco kineticCisco Connect Toronto 2018   an introduction to Cisco kinetic
Cisco Connect Toronto 2018 an introduction to Cisco kinetic
 
Top 5 IoT Use Cases
Top 5 IoT Use CasesTop 5 IoT Use Cases
Top 5 IoT Use Cases
 
Implement a Universal Data Distribution Architecture to Manage All Streaming ...
Implement a Universal Data Distribution Architecture to Manage All Streaming ...Implement a Universal Data Distribution Architecture to Manage All Streaming ...
Implement a Universal Data Distribution Architecture to Manage All Streaming ...
 

More from Adam Doyle

Data Engineering Roles
Data Engineering RolesData Engineering Roles
Data Engineering RolesAdam Doyle
 
Managed Cluster Services
Managed Cluster ServicesManaged Cluster Services
Managed Cluster ServicesAdam Doyle
 
Delta lake and the delta architecture
Delta lake and the delta architectureDelta lake and the delta architecture
Delta lake and the delta architectureAdam Doyle
 
Great Expectations Presentation
Great Expectations PresentationGreat Expectations Presentation
Great Expectations PresentationAdam Doyle
 
May 2021 Spark Testing ... or how to farm reputation on StackOverflow
May 2021 Spark Testing ... or how to farm reputation on StackOverflowMay 2021 Spark Testing ... or how to farm reputation on StackOverflow
May 2021 Spark Testing ... or how to farm reputation on StackOverflowAdam Doyle
 
Automate your data flows with Apache NIFI
Automate your data flows with Apache NIFIAutomate your data flows with Apache NIFI
Automate your data flows with Apache NIFIAdam Doyle
 
Apache Iceberg Presentation for the St. Louis Big Data IDEA
Apache Iceberg Presentation for the St. Louis Big Data IDEAApache Iceberg Presentation for the St. Louis Big Data IDEA
Apache Iceberg Presentation for the St. Louis Big Data IDEAAdam Doyle
 
Localized Hadoop Development
Localized Hadoop DevelopmentLocalized Hadoop Development
Localized Hadoop DevelopmentAdam Doyle
 
The new big data
The new big dataThe new big data
The new big dataAdam Doyle
 
Feature store Overview St. Louis Big Data IDEA Meetup aug 2020
Feature store Overview   St. Louis Big Data IDEA Meetup aug 2020Feature store Overview   St. Louis Big Data IDEA Meetup aug 2020
Feature store Overview St. Louis Big Data IDEA Meetup aug 2020Adam Doyle
 
Operationalizing Data Science St. Louis Big Data IDEA
Operationalizing Data Science St. Louis Big Data IDEAOperationalizing Data Science St. Louis Big Data IDEA
Operationalizing Data Science St. Louis Big Data IDEAAdam Doyle
 
Retooling on the Modern Data and Analytics Tech Stack
Retooling on the Modern Data and Analytics Tech StackRetooling on the Modern Data and Analytics Tech Stack
Retooling on the Modern Data and Analytics Tech StackAdam Doyle
 
Stl meetup cloudera platform - january 2020
Stl meetup   cloudera platform  - january 2020Stl meetup   cloudera platform  - january 2020
Stl meetup cloudera platform - january 2020Adam Doyle
 
How stlrda does data
How stlrda does dataHow stlrda does data
How stlrda does dataAdam Doyle
 
Tailoring machine learning practices to support prescriptive analytics
Tailoring machine learning practices to support prescriptive analyticsTailoring machine learning practices to support prescriptive analytics
Tailoring machine learning practices to support prescriptive analyticsAdam Doyle
 
Synthesis of analytical methods data driven decision-making
Synthesis of analytical methods data driven decision-makingSynthesis of analytical methods data driven decision-making
Synthesis of analytical methods data driven decision-makingAdam Doyle
 
Big Data IDEA 101 2019
Big Data IDEA 101 2019Big Data IDEA 101 2019
Big Data IDEA 101 2019Adam Doyle
 
Data Engineering and the Data Science Lifecycle

Snowflake Data Science and AI/ML at Scale

  • 1. © 2020 Snowflake Inc. All Rights Reserved Data Science and AI/ML at Scale Scott Hruby | Snowflake Solutions Engineer
  • 2. Snowflake 101
  • 3. DATA: THE WORLD’S MOST VALUABLE RESOURCE. “The rapid rise of gathered/analyzed digital data is often core to the holistic success of the fastest growing & most successful companies of our time around the world.” – Mary Meeker, Bond Capital
  • 4. DATA: THE NEW SUPERPOWER
  • 5. NEW TECHNOLOGY TRENDS CHANGE HOW WE USE DATA
       Rise of the Cloud: Cloud gives us the ability to scale and centralize data.
       Explosion of Data: IoT, mobile, and social open up new opportunities for insight.
       Diversification of Analytics: Analytics is growing in importance, everywhere, and for everyone.
  • 6. JOURNEY TO A CLOUD DATA PLATFORM
       Platforms: On Premises EDW, 1st Gen Cloud EDW, Data Lake / Hadoop, Cloud Data Platform.
       Diagram labels: All Data, All Users, Fast Answers, SQL Database.
  • 7. TRADITIONAL DATA ARCHITECTURE: Complex, Costly & Constrained
       DATA SOURCES: OLTP Databases, Enterprise Applications, Third-Party, Web/Log Data, IoT.
       Stages: Data Integration, Data Transformation, Data Analytics.
       DATA CONSUMERS: Ad Hoc Analysis, Real-time Analytics, Operational Reporting.
       Other diagram labels: Normalization & Aggregation, ELT, Streaming, Data Marts, Data Warehouses, Backups, File Sharing, Cubes, Data Lake, CDC, Data Science.
  • 8. SNOWFLAKE CLOUD DATA PLATFORM: ONE PLATFORM, ONE COPY OF DATA, MANY WORKLOADS
       DATA SOURCES: OLTP Databases, Enterprise Applications, Third-Party, Web/Log Data, IoT.
       DATA CONSUMERS: Data Monetization, Operational Reporting, Ad Hoc Analysis, Real-time Analytics.
  • 9. SNOWFLAKE ARCHITECTURE
  • 10. Data Science & AI
  • 11. DATA SCIENCE IS PREDICTIVE ANALYTICS
       Descriptive Analytics: What happened?
       Diagnostic Analytics: Why did it happen?
       Predictive Analytics: What will happen?
       Prescriptive Analytics: How can we make it happen?
       Example labels on the slide: reports, self-driving car.
  • 12. 85% of #bigdata projects fail to move past preliminary stages (Gartner, Nov. 2017). 80% of analytics insights will not deliver business outcomes through 2022 (Gartner, Jan. 2019). 87% of data science projects never make it into production (VentureBeat AI, July 2019). 80% of AI projects will “remain alchemy, run by wizards” through 2020 (Gartner, Jan. 2019).
  • 13. KEY REQUIREMENTS FOR DATA SCIENCE
       CONSOLIDATED SOURCE FOR ALL DATA: Structured, semi-structured, and unstructured data. 3rd party data sharing. Streaming & batch.
       EFFICIENT DATA PREPARATION: Dedicated compute clusters for each team. No resource contention.
       EXTENSIVE PARTNER ECOSYSTEM: Integration with the latest ML tools and libraries. Consistent experience for BI and ML users.
  • 14. DATA SCIENCE MATURITY (axis label: Value of Data)
       1. IDENTIFYING USE CASES
       ● Defining business use cases ● Focused on BI ● Exploring tools ● Experimenting with languages ● Training and hiring ● Testing data pipelines
       DATA CHALLENGES ● 15% data utilization ● Understanding data ● Preparing data
       OVERALL CHALLENGES ● Acquiring talent ● Designing & building consistent processes & pipelines ● Defining business value
       2. LIMITED PRODUCTION
       ● Working on creating efficiencies ● Mapping ML to business value ● Planning growth & new use cases ● Testing various languages & tools ● Training and hiring ● Optimizing data pipelines
       DATA CHALLENGES ● 70% data utilization + external ● Data wrangling, ETL + data validation
       OVERALL CHALLENGES ● Training and hiring ● Scalability of tools & team ● Cost containment ● Creating ROI consistently
       3. COMPETITIVE ADVANTAGE
       ● Large data science team ● ML built into product/service ● Standardized on tools/languages ● Advanced ML for video, audio, and images ● Automated data pipelines
       DATA CHALLENGES ● 130% data utilization ● Obtaining more data ● Streaming data
       OVERALL CHALLENGES ● Retaining talent ● Optimize, simulate, & test model efficacy ● Staying ahead of competitors
  • 15. SNOWFLAKE CLOUD DATA PLATFORM
       DATA SOURCES: OLTP Databases, Enterprise Applications, Third-Party, Web/Log Data, IoT.
       DATA CONSUMERS: Data Monetization, Operational Reporting, Ad Hoc Analysis, Real-time Analytics.
  • 16. DATA SCIENCE WITH SNOWFLAKE
  • 17. DATA SCIENCE WITH SNOWFLAKE
  • 18. DATA SCIENCE WITH SNOWFLAKE
  • 19. DATA SCIENCE WITH SNOWFLAKE
  • 20. DATA SCIENCE WITH SNOWFLAKE
  • 21. DATA SCIENCE WITH SNOWFLAKE
  • 22. DATA SCIENCE WITH SNOWFLAKE
  • 23. ML FRAMEWORKS & LIBRARIES
       STATISTICAL ALGORITHMS ● Suitable for most data science problems involving structured and semi-structured data ● Good performance with small amounts of data ● Mature: have been around for a while
       NEURAL NETWORK / DEEP LEARNING ● Mostly used with unstructured data like images, audio, and video ● Performance increases with the size of the training data
  • 24. ML PARTNER ECOSYSTEM
       NOTEBOOK-BASED: Powerful but Complex.
       AUTOMATION-BASED (AutoML): Fast and Easy but Less Customizable.
       Logo label on the slide: MLlib.
  • 25. PROVEN BY OVER 5,400 CUSTOMERS
  • 26. EVER EXPANDING ECOSYSTEM
  • 27. SECURE AND GOVERNED ACCESS TO ALL DATA
  • 28. SNOWFLAKE SECURITY AT A GLANCE
       Snowflake Operational Controls • NIST 800-53 • SOC2 Type 2 • HIPAA • PCI • FedRAMP
       Access • All communication secured & encrypted • TLS 1.2 encryption in both trusted and untrusted networks • IP Whitelisting
       Authentication • Password Policy enforcement • Multifactor Authentication • SAML 2.0 support for Federated Authentication
       Application • Flexible user management • Role-based access control for granular control • RBAC for data and actions
       Data • Encrypted at rest • Hierarchical key model rooted in Cloud HSM • Automatic key rotation • Time Travel 1-90 days • Tri-Secret Secure • Query statement encryption
       Infrastructure • AWS, Azure Physical Security • AWS, Azure Redundancy • Regional Data Centers ▪ US ▪ EU ▪ AP
  • 29. COMPREHENSIVE DATA PROTECTION
       Protection against infrastructure failures: All data transparently & synchronously replicated 3+ ways across independent infrastructure.
       Protection against corruption & user errors: “Time travel” feature enables instant roll-back to any point in time during chosen retention window.
       Long-term data protection: Zero-copy clones + optional export to cloud object storage enable user-managed data copies.
       Diagram labels: SELECT * FROM T0…; T0, T1, T2; New data; Modified data; Daily; Weekly.
  • 30. GOVERNANCE AND SECURITY: Dynamic data masking (release stages shown: PRIVATE, PUBLIC, GA)
       Dynamically Mask Protected (PII, PHI) Column Data at Query Time • No change to the stored data • Mask or partial mask using constant value, hash, and custom functions • Unmask for authorized users only
       Policy Based Control • Table/View owners and privileged users (such as accountadmin) unauthorized by default • Centralized policy management
       Ease of Management • Apply single policy to multiple columns • Prevent secure view explosion
       Diagram labels: POLICIES, INGEST RAW DATA.
       Alice (Unauthorized):
       ID   Phone         SSN
       101  ***-***-5534  *********
       102  ***-***-3564  *********
       103  ***-***-9787  *********
       Bob (Authorized):
       ID   Phone         SSN
       101  408-123-5534  *********
       102  510-335-3564  *********
       103  214-553-9787  *********
  • 31. GOVERNANCE AND SECURITY: Dynamic data masking policies (release stages shown: PRIVATE, PUBLIC, GA)
       Masking Policy • Policy contains condition(s) and masking function to apply under those conditions • Policy is applied to one or more table, view, or external table columns in an account • Nested policy execution for views: policy on table executed before policy on view(s)
       Supports • All data types • Data sharing • Streams • Clone carries over policy associations
       Diagram: the Policy Admin applies a Masking Policy (<policy condition> plus <masking function>) to Resource(s): DB 1 / Table 1 / Column 1; DB 1 / View 1 / Column 1; DB n / Table n / Column n.
       Masking Policy Example (branches labeled Unmask, Partial mask, Mask):
       CASE
         WHEN invoker_role() IN ('pii_role') THEN val
         WHEN invoker_role() IN ('support') THEN regexp_replace(val,'.+@','*****@')
         ELSE '********'
       END;
  • 32. GOVERNANCE AND SECURITY: External tokenization using third party (release stages shown: PRIVATE, PUBLIC, GA)
       Ingest Protected (PII/PHI) Data as Externally Tokenized • Using Protegrity agents on ETL tools
       De-tokenize for Authorized Users at Query Time • Protegrity DSG called using external functions to de-tokenize data • For unauthorized users, Protegrity DSG is not called
       Policy Based Control • Table/View owners and privileged users (such as accountadmin) unauthorized by default • Centralized policy management
       Diagram labels: Ingest tokenized data; POLICIES; EXTERNAL FUNCTION; REST API; Customer VPC / VNet; Data Security Gateway (DSG); Tokenized; De-tokenized.
       Tokenized (Alice, Unauthorized):
       ID   Phone         SSN
       101  111-222-3333  000-78-9999
       102  002-778-9904  779-66-8908
       103  100-887-8888  111-00-8888
       De-tokenized (Bob, Authorized):
       ID   Phone         SSN
       101  408-123-5534  387-78-3456
       102  510-334-3564  226-44-8908
       103  214-553-9787  359-9987-0098
  • 33. “I can only show you the door. You’re the one that has to walk through it.”
  • 34. THANK YOU
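
The masking policy example on slide 31 branches on the invoking role: one role sees the stored value, one sees a partially masked value, and everyone else sees a constant. The following is a minimal Python sketch of that same three-branch logic, not Snowflake code. The role names `pii_role` and `support` and the regular expression come from the slide; the function names, the `analyst` role, and the sample e-mail addresses are hypothetical and exist only for illustration.

```python
import re

# Role names taken from the slide's example policy.
UNMASK_ROLES = ("pii_role",)
PARTIAL_MASK_ROLES = ("support",)


def mask_email(val: str, invoker_role: str) -> str:
    """Apply the slide's three-branch CASE logic to one column value."""
    if invoker_role in UNMASK_ROLES:
        # Unmask: the authorized role sees the stored value.
        return val
    if invoker_role in PARTIAL_MASK_ROLES:
        # Partial mask: replace everything up to and including the last "@",
        # mirroring regexp_replace(val, '.+@', '*****@') on the slide.
        return re.sub(r".+@", "*****@", val)
    # Mask: every other role gets a constant string.
    return "********"


def query_column(stored_values, invoker_role):
    """Mask values at read time; the stored values are never modified."""
    return [mask_email(v, invoker_role) for v in stored_values]


# Hypothetical sample data.
stored = ["alice@example.com", "bob@example.org"]
for role in ("pii_role", "support", "analyst"):
    print(role, query_column(stored, role))
```

Because masking happens when the column is read, the same stored list yields different results per role, which is the "no change to the stored data" property described on slide 30.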