Organizations are struggling to make sense of their data within antiquated data platforms. Snowflake, the data warehouse built for the cloud, can help.
8. Our Vision
Diverse data, analytics & apps
100% cloud
Scale of data, workload, and concurrency
Analytics & apps: data exploration, BI / reporting, predictive analytics, data-driven apps
Data sources: enterprise apps, corporate, web, mobile, Internet of Things, 3rd party
9. 5 Key Breakthroughs
Complete SQL database
Zero management
All of your data
All of your users
Pay only for what you use
10. Complete SQL Database
Query
Data definition
Role-based security
Updates and deletes
Multi-statement transactions
Complete database
Support for the SQL that business users already know
Broad ecosystem of technology partners
I want to talk about data struggles that we are seeing in the real world.
Don’t go into too much detail on this slide. Here’s more detail for context (don’t mention all of this, but go over briefly):
Preparing disparate data to load
The struggle to load data begins with the need to prepare disparate datasets to load. Many organizations are dealing with a host of new semi-structured data in formats like JSON and Avro that require flattening to load into a relational database. Or, they choose to store semi-structured data separate from relational data in a NoSQL store, creating silos.
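The flattening work described above is worth making concrete. As a minimal sketch (the record shape, field names, and values are invented for illustration), here is how a nested JSON document might be flattened into relational rows before loading into a traditional warehouse:

```python
import json

# A hypothetical semi-structured record, e.g. one event from a JSON feed.
raw = """{
  "order_id": 1001,
  "customer": {"name": "Acme", "region": "EU"},
  "items": [
    {"sku": "A-1", "qty": 2},
    {"sku": "B-7", "qty": 1}
  ]
}"""

def flatten_order(doc):
    """Flatten one nested order document into flat rows, one per line item."""
    rows = []
    for item in doc["items"]:
        rows.append({
            "order_id": doc["order_id"],
            "customer_name": doc["customer"]["name"],
            "customer_region": doc["customer"]["region"],
            "sku": item["sku"],
            "qty": item["qty"],
        })
    return rows

rows = flatten_order(json.loads(raw))
for r in rows:
    print(r)
```

Every new nesting level or schema change in the source feed forces this kind of code to be rewritten, which is exactly the preparation burden the slide describes.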
Capacity planning
Finding space for data can be another enormous challenge. Large numbers of complex datasets can quickly snowball into a storage capacity issue on fixed size on-premises or cloud data platforms.
Resource contention
Loading large datasets also requires significant compute capacity. Many data warehouses are already strained under normal business workloads, and the compute needed for loading forces those other processes to be pushed back in the priority queue.
Don’t go into too much detail on this slide. Here’s more detail for context (don’t mention all of this, but go over briefly):
Making sense of data in silos
With data scattered across NoSQL data lakes, cloud applications, and data warehouses (not to mention flat files and CSVs), organizations are struggling to combine and analyze their data in one cohesive picture.
Editing and transforming data
Every system that stores data has its challenges, but many organizations find it particularly hard to analyze and understand data in NoSQL systems like Hadoop. Semi-structured open source data stores require a large amount of custom configuration, uncommon skill sets, and transformation to successfully combine with other business data. They also rarely support the edit, update, and insert commands that are essential to data modeling and transformation.
Supporting evolving business logic and disparate use cases
It’s hard for the business to drive evolutions in business logic within the database when it takes an arduous manual process to test and update. Often, entire databases need to be physically copied just to test a simple change to a table or derived field, which can be extremely expensive and time consuming. And because different people within the organization have different data needs, a “single source of truth” is often too ungainly and impractical for most organizations to maintain and use.
Don’t go into too much detail on this slide. Here’s more detail for context (don’t mention all of this, but go over briefly):
Queues
Analytics users are always at the bottom of the resource priority queue. It’s not always designed to be that way, but if ETL, as a simplified example, needs to run for 45 minutes every hour, then there’s little time left over for the analytics team to access and iterate on the database.
Delays
Through the eyes of an analyst, nothing ever works fast enough. But disappointing performance often isn’t for lack of trying. Many data warehouses require hours and hours of painstaking optimization, tuning, indexing, sorting, and vacuuming from a dedicated data engineer. To add to the pain, an optimization in one area will often cause deoptimization in another.
Incessant fixing
If the organization spends all its time endlessly solving loading, integration, and analytics struggles, it’s impossible to break away and think at a higher level about what needs to be accomplished. Data is a constant flash point of disagreement, rather than a rallying point for collaboration.
Siloed teams
Historically, there’s been a dividing line between technical, IT implementers, and less-technical business side consumers. This was partly driven by technology, but reinforced by organizational structures that don’t favor cross team collaboration.
The struggle for data is ending now. At Snowflake, we set out to replace these four struggles with performance, concurrency, and simplicity. Here’s how we do it.
Snowflake takes a completely new approach to data warehousing that eliminates these struggles from the ground up. You can host all of your data, use it with all of your apps, consume the database as a 100% SaaS cloud solution, and scale to any size of workload or concurrency.
Our solution gives you all of the things that you need from a data warehouse in the cloud.
-Complete SQL database with ANSI-standard SQL, fully ACID compliant
-Zero management, with no tuning or optimization required
-All of your data, with support for both structured and semi-structured formats
-All of your users, with concurrency handled by virtual warehouses
-Pay only for what you use
Let’s take a deeper look.
Our own survey of over 300 respondents indicated 80% of database users preferred SQL to query their database.
Being a complete SQL database means Snowflake works with the tools business users already understand, such as ETL and analytics tools.
Snowflake is a fully ACID-compliant relational database that supports role-based security, multi-statement transactions, and a host of other features.
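The multi-statement transaction behavior mentioned above is standard ANSI SQL semantics. Snowflake itself runs as a cloud service, so as a generic, hedged illustration of what "multi-statement, ACID" means in practice, this sketch uses Python's built-in sqlite3 (the table and account names are invented):

```python
import sqlite3

# In-memory database standing in for any ACID-compliant SQL engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()

# A multi-statement transaction: both updates succeed together, or
# neither is applied (atomicity). `with conn:` commits on success and
# rolls back automatically if any statement raises an error.
with conn:
    conn.execute("UPDATE accounts SET balance = balance - 30 "
                 "WHERE name = 'alice'")
    conn.execute("UPDATE accounts SET balance = balance + 30 "
                 "WHERE name = 'bob'")

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(balances)  # {'alice': 70, 'bob': 80}
```

The point for the audience is that this is the transactional model business users and their tools already expect from a relational database, which many NoSQL stores do not provide.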
Now detailing the specifics of the value propositions - simplicity
Now detailing the specifics of the value propositions -- concurrency
Pay only for what you use
Pay independently for only the storage and compute used
Scale elastically to match demand
Cloud economies of scale
Leverage low-cost cloud storage
Eliminate need for complex storage tiering
Shared-disk (Oracle): Difficult and expensive to scale and upgrade
Traditional shared-nothing (Netezza and Postgres): Expensive to upgrade, non-native handling of machine data
Hadoop shared-nothing: Complex and requires specialized programming skills
We’ve separated the process of working with information into three distinct layers.
First is the storage layer, where all data is always stored encrypted and compressed in columnar format.
Second is the compute layer, comprised of "virtual warehouses": compute clusters that do all of the data processing.
You can have multiple virtual warehouses working on the same data at the same time.
Third is the services layer, the brains of Snowflake. All security information and metadata is stored here, and all query parsing and optimization happens here.
The services layer also includes transaction management, which coordinates across all of the virtual warehouses, allowing a consistent set of operations against the same data at the same time.
Snowflake is the only database that has implemented this architecture, which allows Snowflake to scale essentially without limits.
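The separation of storage and compute described above can be sketched conceptually. This is not Snowflake's API (all names here are invented for illustration): one shared storage layer holds the data, while several independent "virtual warehouse" workers each run their own query against that same data concurrently:

```python
from concurrent.futures import ThreadPoolExecutor

# Conceptual sketch only: a single shared storage layer holding
# column-oriented data that every compute cluster can read.
shared_storage = {
    "sales": {"amount": [120, 75, 300, 50],
              "region": ["EU", "US", "EU", "US"]},
}

def virtual_warehouse(query):
    """Each call stands in for an isolated compute cluster running one query
    against the shared storage layer."""
    table = shared_storage["sales"]
    return query(table)

# Three independent workloads hitting the same data at the same time.
queries = [
    lambda t: sum(t["amount"]),                    # total sales
    lambda t: max(t["amount"]),                    # largest single sale
    lambda t: sum(a for a, r in zip(t["amount"], t["region"]) if r == "EU"),
]

with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(virtual_warehouse, queries))

print(results)  # [545, 300, 420]
```

Because the workers share storage but not compute, adding another workload means adding another worker rather than queuing behind existing ones, which is the concurrency claim the slide makes.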
Summarizing Snowflake’s core value propositions
Snowflake is also available as a Virtual Private or Dedicated edition.