Building The Enterprise Data Lake
Important Considerations Before You Jump In
December 8, 2015
Today’s Presenters
Mark Madsen, Industry Analyst, Third Nature (@markmadsen)
Craig Stewart, Sr. Dir. Product Management, SnapLogic (@01Badger)
Erin Curtis, Sr. Dir. Product Marketing, SnapLogic (@erncrts)
Mark Madsen, www.ThirdNature.net, @markmadsen1
  
What This Session Isn’t
SQL... SQL! SQL? SQL
The craft model of information delivery does not scale
© Third Nature, Inc.
  
So we shifted to data publishing
Industrialized data delivery for self-service access.
  
Events and sensors are a relatively new data source
Sensor data doesn’t fit well with current methods of modeling, collection and storage, or with the technology to process and analyze it.
  
There’s lots of other new data involved
  
You can store this data in an RDBMS, but…
  
These sorts of things slow user requests down
Conclusion: any methodology built on the premise that you must know and model all the data first is untenable
  
Analytics embiggens data volume problems
Many of the processing problems are O(n²) or worse, so moderate data can be a problem for scale-up platforms
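To make the O(n²) point concrete, here is a small Python sketch. It assumes a naive all-pairs comparison (for example, matching every record against every other record); that workload is an illustrative assumption, not one named on the slide. A tenfold increase in records gives roughly a hundredfold increase in work.

```python
from itertools import combinations


def pairwise_comparisons(n: int) -> int:
    """Number of distinct record pairs an all-pairs comparison must examine."""
    return n * (n - 1) // 2


def count_by_enumeration(records: list) -> int:
    """Same count, obtained by actually enumerating the pairs (slow on purpose)."""
    return sum(1 for _ in combinations(records, 2))


if __name__ == "__main__":
    # Print how the number of pairs grows as the record count grows tenfold.
    for n in (1_000, 10_000, 100_000):
        print(f"{n:>7} records -> {pairwise_comparisons(n):>13,} pairs")
```

Data that looks moderate by row count can therefore still exceed what a single scale-up machine can process in a reasonable time.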
  
Old market says: There’s nothing wrong with what you have, just keep buying new products from us
  
The emerging big data market has an answer…
  
The data lake
  
Views of the lake
Is the business vs supports the business?
Application vs infrastructure?
  
The naïve idea of a data lake leads to predictable results
  
You can’t install Hadoop and hope it solves all the problems
The answer isn’t just technology, it’s architecture
  
Schema
In the DW world both data and processing are bounded
Diagram callouts:
▪  No consideration for feedback loops and change
▪  Processing only happens here
▪  Carefully controlled access here
▪  Nobody here creates new information
▪  Sources few and well understood
▪  Complex DI is controlled by IT
▪  Schemas are few and designed
▪  Tools are authorized, few in number and kind
▪  One way flow
This is a monolithic, layered architecture
  
In the big data world flow is unbounded and continuous
Diagram callouts:
▪  Feedback loops allowed
▪  End-of-analysis dataset may be start of a BI dataset
▪  Continuous data integration and delivery
▪  Files are back as both input and storage
▪  Minimal barrier of / control on collection
▪  Areas of provisioned data
▪  Any shape in, rectangles out
This needs a distributed service architecture
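One reading of the "any shape in, rectangles out" callout is that nested or irregular records are collected as they arrive and flattened into rows and columns only at delivery. A minimal Python sketch under that assumption follows; the function names and sample events are hypothetical and not taken from the presentation.

```python
from typing import Any, Dict, List, Tuple


def flatten(record: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Turn nested dictionaries into a single level with dotted column names."""
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def to_rectangle(records: List[Dict[str, Any]]) -> Tuple[List[str], List[List[Any]]]:
    """Return (columns, rows); fields missing from a record become None."""
    flat_records = [flatten(r) for r in records]
    columns = sorted({column for r in flat_records for column in r})
    rows = [[r.get(column) for column in columns] for r in flat_records]
    return columns, rows


# Two events with different shapes (illustrative data only).
events = [
    {"device": {"id": "d1", "type": "thermostat"}, "temp_c": 21.5},
    {"device": {"id": "d2"}, "humidity": 0.4},
]
columns, rows = to_rectangle(events)
```

The collection side accepts both events unchanged; only the delivery step decides which columns exist.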
  
Deconstructing data environments
There are three things happening in a data warehouse:
▪  Data acquisition
▪  Data management
▪  Data delivery
Isolate them from one another, allow read-write use, and you are on the path.
Diagram label: Data Warehouse
Data lake subsystems / components
The acquisition component allows any data to be collected at any latency. The management component allows some data to be standardized and integrated. The access component provides access at any latency and via any means an application chooses. Processing can be done to any data at any time from any area.
Data Lake Platform Services:
▪  Data Acquisition (Collect & Store): Incremental, Batch, One-time copy, Real time
▪  Data Management (Process & Integrate)
▪  Data Access (Deliver & Use)
▪  Data storage
In reality, you are building three systems, not one. Avoid the monolith.
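The separation into acquisition, management and access can be sketched in Python as three small classes that share only a storage object and never call one another. This is an illustrative toy under stated assumptions: the class names, the area names "raw" and "standardized", and the in-memory storage are all hypothetical, not part of the presentation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Storage:
    """Shared data storage, organized into named areas (in memory for the sketch)."""
    areas: Dict[str, List[dict]] = field(default_factory=dict)

    def append(self, area: str, record: dict) -> None:
        self.areas.setdefault(area, []).append(record)

    def read(self, area: str) -> List[dict]:
        return list(self.areas.get(area, []))  # return a copy of the list


class Acquisition:
    """Collect & store: accepts single records or batches."""
    def __init__(self, storage: Storage) -> None:
        self.storage = storage

    def collect(self, record: dict) -> None:
        self.storage.append("raw", record)

    def collect_batch(self, records: List[dict]) -> None:
        for record in records:
            self.collect(record)


class Management:
    """Process & integrate: standardizes some data and leaves the raw area alone."""
    def __init__(self, storage: Storage) -> None:
        self.storage = storage

    def standardize(self) -> None:
        self.storage.areas["standardized"] = [
            {key.lower(): value for key, value in record.items()}
            for record in self.storage.read("raw")
        ]


class Access:
    """Deliver & use: can read from any area, not only the standardized one."""
    def __init__(self, storage: Storage) -> None:
        self.storage = storage

    def query(self, area: str,
              predicate: Callable[[dict], bool] = lambda r: True) -> List[dict]:
        return [r for r in self.storage.read(area) if predicate(r)]


storage = Storage()
acquisition = Acquisition(storage)
acquisition.collect({"ID": 1})
acquisition.collect_batch([{"ID": 2}, {"Id": 3}])
Management(storage).standardize()
access = Access(storage)
recent = access.query("standardized", lambda r: r["id"] > 1)
untouched = access.query("raw")
```

Because each class depends only on the storage contract, any one of them could be replaced or scaled without touching the other two, which is the point of avoiding the monolith.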
  
Data lake functions depend on platform services
Functional areas: Data Acquisition (Collect & Store), Data Management (Process & Integrate), Data Access (Deliver & Use)
Platform services needed:
▪  Base Platform Services
▪  Data Movement
▪  Metadata
▪  Data Persistence
▪  Workflow Management
▪  Processing Engines
▪  Dataflow Services
▪  Data Curation
▪  Data Access Services
  
DATA ARCHITECTURE
We’re so focused on the light switch that we’re not talking about the light
  
Decouple the Data Architecture
The core of the data lake isn’t a database or HDFS, it’s the data architecture that the tools implement.
We need a data architecture that is not limiting:
▪  Deals with change easily and at scale
▪  Does not enforce requirements and models up front
▪  Does not limit the format or structure of data
▪  Assumes the range of data latencies in and out, from streaming to one-time bulk
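One common way to avoid enforcing models up front is schema-on-read: data is stored as collected, and each consumer applies its own schema when reading. The presentation does not name this technique, so treat the Python sketch below as one possible interpretation; the sample lines, field names and the `read_with_schema` helper are hypothetical.

```python
import json
from typing import Any, Callable, Dict, Iterable, Iterator

# Raw lines are stored exactly as they arrived; no model is imposed at write time.
RAW_LINES = [
    '{"user": "a", "amount": "12.50", "ts": "2015-12-08"}',
    '{"user": "b", "amount": "3", "coupon": "X1"}',
]


def read_with_schema(
    lines: Iterable[str],
    schema: Dict[str, Callable[[Any], Any]],
) -> Iterator[Dict[str, Any]]:
    """Apply a schema (field name -> converter) at read time.

    Fields absent from a record become None; fields outside the schema are ignored.
    """
    for line in lines:
        record = json.loads(line)
        yield {
            name: (convert(record[name]) if name in record else None)
            for name, convert in schema.items()
        }


# Two consumers read the same raw data through different schemas.
billing_view = list(read_with_schema(RAW_LINES, {"user": str, "amount": float}))
marketing_view = list(read_with_schema(RAW_LINES, {"user": str, "coupon": str}))
```

A new field such as `coupon` can appear in the raw data without any change to the billing consumer, which is one way a data architecture can deal with change easily.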
  
Food supply chain: an analogy for data
Multiple contexts of use, differing quality levels
You need to keep the original because just like baking, you can’t unmake dough once it’s mixed.
  
Data architecture is required by the services, and vice versa
Data areas:
▪  Raw data in an immutable storage area
▪  Standardized or enhanced data
▪  Common or usage-specific data
▪  Transient data
Services: Data Acquisition (Collect & Store), Platform Services, Data Management (Process & Integrate), Data Access (Deliver & Use)
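The idea of raw data in an immutable storage area can be illustrated with a write-once store: objects can be added and read but never changed, and standardized data is always derived from the raw copy rather than replacing it. The Python sketch below is a toy under stated assumptions; keying objects by SHA-256 content hash is one possible design choice, and the `RawArea` class and `standardize` function are hypothetical names.

```python
import hashlib
import json


class RawArea:
    """Write-once store: payloads can be added and read, never modified or deleted."""

    def __init__(self) -> None:
        self._objects = {}

    def put(self, payload: bytes) -> str:
        # Content-addressed key: identical payloads map to the same key,
        # and setdefault never overwrites an existing object.
        key = hashlib.sha256(payload).hexdigest()
        self._objects.setdefault(key, payload)
        return key

    def get(self, key: str) -> bytes:
        return self._objects[key]


def standardize(raw: bytes) -> dict:
    """Derive a cleaned record; the raw bytes themselves are left as they were."""
    record = json.loads(raw)
    return {key.strip().lower(): value for key, value in record.items()}


raw_area = RawArea()
original = b'{" Temp ": 21}'
key = raw_area.put(original)
cleaned = standardize(raw_area.get(key))
```

If the standardization rule later turns out to be wrong, it can be rerun against the unchanged original, which is what keeping the dough unmixed buys you.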
  
The data areas map (mostly) to functional areas of the lake
Collection can’t be limited by database scale and latency. Immutability, persistence and concurrency are required.
Collect: Incremental, Batch, One-time copy, Real time
Manage & Integrate
Process, Deliver, Use
  
Stages, not layers
Some tools require specific repositories or models. Others can reach in to get what they need. Do not enforce a single access point or model.
  
The geography has been redefined
The box IT created:
• not any data, rigidly typed data
• not any form, tabular rows and columns of typed data
• not any latency, persist what the DB can keep up with
• not any process, only queries
The digital world was diminished to only what’s inside the box until we forgot the box was there.
  
Layered data architecture
The DW assumed a single flat model of data, DB in the center.
The data lake enables new ways to organize data:
▪  Raw: straight from the source
▪  Enhanced: cleaned, standardized
▪  Integrated: modeled, augmented, ~semi-persistent
▪  Derived: analytic output, pattern based sets, ephemeral
Implies a new technology architecture and data modeling approaches.
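The four ways of organizing data listed above could be reflected directly in how storage is laid out. As a minimal Python sketch, assuming a directory-style layout of `root/zone/source/dataset` (the layout, the `/lake` root and the `zone_path` helper are hypothetical, not from the presentation):

```python
from enum import Enum
from pathlib import PurePosixPath


class Zone(Enum):
    """Data zones named on the slide, from least to most processed."""
    RAW = "raw"                # straight from the source
    ENHANCED = "enhanced"      # cleaned, standardized
    INTEGRATED = "integrated"  # modeled, augmented, semi-persistent
    DERIVED = "derived"        # analytic output, ephemeral


def zone_path(zone: Zone, source: str, dataset: str, root: str = "/lake") -> PurePosixPath:
    """Build the storage location for a dataset in a given zone."""
    return PurePosixPath(root) / zone.value / source / dataset


raw_accounts = zone_path(Zone.RAW, "crm", "accounts")
derived_scores = zone_path(Zone.DERIVED, "crm", "churn_scores")
```

Keeping the zone in the path lets retention and access rules differ per zone, for example never deleting under `raw` while expiring `derived` data freely.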
  
The data lake enables evolutionary design for data
Evolutionary design is required because data needs change. You need a system not for stability (we have that in the DW) but for evolution and change: the data lake.
You can’t build this all at once. You need to grow it over time.
  
Away from “one throat to choke”, back to best of breed
Tight coupling leads to efficient reuse and standardization, and to slow changes.
In a rapidly evolving market, componentized architectures, modularity and loose coupling are favorable over monolithic stacks, single-vendor architectures and tight coupling.
Architecture, not blueprints: there is no single answer. It depends on your goals and starting position.
  
	
  
Questions?
“When a new technology rolls over you, you're either part of the steamroller or part of the road.” (Stewart Brand)
  
CC Image Attributions
Thanks to the people who supplied the Creative Commons licensed images used in this presentation:
donuts_4_views.jpg - http://www.flickr.com/photos/le_hibou/76718773/
glass_buildings.jpg - http://www.flickr.com/photos/erikvanhannen/547701721
  
About the Presenter
Mark Madsen is president of Third Nature, a consulting and advisory firm focused on analytics, business intelligence and data management. Mark is an award-winning author, architect and CTO. Over the past ten years Mark received awards for his work from the American Productivity & Quality Center, TDWI, and the Smithsonian Institute. He is an international speaker, a contributor to Forbes, and a member of the O’Reilly Strata program committee. For more information or to contact Mark, follow @markmadsen on Twitter or visit http://ThirdNature.net
  	
  
About Third Nature
Third Nature is a consulting and advisory firm focused on new and emerging technology and practices in information strategy, analytics, business intelligence and data management. If your question is related to data, analytics, information strategy and technology infrastructure then you’re at the right place.
Our goal is to help organizations solve problems using data. We offer education, consulting and research services to support business and IT organizations as well as technology vendors.
We fill the gap between what the industry analyst firms cover and what IT needs. We specialize in strategy and architecture, so we look at emerging technologies and markets, evaluating how technologies are applied to solve problems rather than evaluating product features.
  
About SnapLogic
SnapLogic helps enterprises connect data and applications faster
Anything: apps | APIs | things | data
Anytime: batch | streaming | real-time
Anywhere: on premises | in the cloud
Modern Architecture: Hybrid and Elastic
Streams: No data is stored/cached
Secure: 100% standards-based
Elastic: Scales out & handles data and app integration use cases
Diagram labels: Metadata; Data; Databases; On Prem Apps; Big Data; Cloud Apps and Data; Cloud-Based Designer, Manager, Dashboard; Cloudplex; Groundplex; Hadooplex; Sparkplex; Firewall
Data Lake
Sources:
•  On Prem Apps and Data: ERP, CRM, RDBMS
•  Cloud Apps and Data: CRM, HCM, Social
•  IoT Data: Sensors, Wearables, Devices
Ingest (Data Acquisition): Collect and integrate data from multiple sources
•  HDFS, AWS S3, MS Azure Blob
•  Kafka, Sqoop, Flume
Prepare (Data Management): Add information and improve data
•  Spark, Python, Scala, Java, R, Pig
Deliver (Data Access): Organize and prepare data for visualization
•  HDFS, AWS S3, MS Azure Blob, Hive
•  Impala, HiveSQL, SparkSQL
Modes: Batch, Streaming, Real-time
Schedule and manage: Oozie, Ambari
Lakeshore Data Mart: MS Azure, AWS Redshift, …
BI / Analytics: Tableau, MS PowerBI / Azure, AWS QuickSight
The Modern Data Lake Powered by SnapLogic
Sources:
•  On Prem Apps and Data: ERP, CRM, RDBMS
•  Cloud Apps and Data: CRM, HCM, Social
•  IoT Data: Sensors, Wearables, Devices
Ingest (Data Acquisition): Collect and integrate data from multiple sources
•  SnapLogic pipelines with standard mode execution
Prepare (Data Management): Sort, Aggregate, Join, Merge, Transform
•  SnapLogic abstracts and operationalizes with SnapReduce or Spark pipelines
Deliver (Data Access): Organize and prepare data for visualization
•  SnapLogic pipelines with standard mode execution
Modes: Batch, Streaming, Real-time (SnapLogic Pipelines)
Schedule and manage: SnapLogic
Lakeshore Data Mart: MS Azure, AWS Redshift, …
BI / Analytics: Tableau, MS PowerBI / Azure, AWS QuickSight
Thank You
Watch SnapLogic in action: video/snaplogic.com
Contact us: info@snaplogic.com
Follow us on Twitter: @SnapLogic

Achieve data democracy in data lake with data integration Achieve data democracy in data lake with data integration
Achieve data democracy in data lake with data integration Saurabh K. Gupta
 

Building an Enterprise Data Lake: Considerations Before You Jump In

  • 1. Building The Enterprise Data Lake: Important Considerations Before You Jump In. December 8, 2015
  • 2. Building The Enterprise Data Lake. Today’s Presenters: Mark Madsen, Industry Analyst, Third Nature (@markmadsen); Craig Stewart, Sr. Dir. Product Management, SnapLogic (@01Badger); Erin Curtis, Sr. Dir. Product Marketing, SnapLogic (@erncrts)
  • 3. Building the Enterprise Data Lake: Considerations before you jump in. December, 2015. Mark Madsen, www.ThirdNature.net, @markmadsen1
  • 4. What This Session Isn’t: SQL... SQL! SQL? SQL
  • 5. The craft model of information delivery does not scale
  • 6. © Third Nature, Inc. So we shifted to data publishing: industrialized data delivery for self-service access.
  • 7. Events and sensors are a relatively new data source. Sensor data doesn’t fit well with current methods of modeling, collection and storage, or with the technology to process and analyze it.
  • 8. There’s lots of other new data involved
  • 9. You can store this data in an RDBMS, but…
  • 10. These sorts of things slow user requests down. Conclusion: any methodology built on the premise that you must know and model all the data first is untenable.
  • 11. Analytics embiggens data volume problems. Many of the processing problems are O(n²) or worse, so moderate data can be a problem for scale-up platforms.
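To make the O(n²) remark on slide 11 concrete, here is a minimal sketch. It assumes an all-pairs comparison workload (for example, matching every record against every other record); that workload and the function name `pairwise_comparisons` are illustrative assumptions, not something taken from the slides.

```python
def pairwise_comparisons(n: int) -> int:
    """Number of distinct record pairs among n records: n * (n - 1) / 2, i.e. O(n^2)."""
    return n * (n - 1) // 2


if __name__ == "__main__":
    # A 10x increase in rows gives roughly a 100x increase in work.
    for rows in (1_000, 10_000, 100_000):
        print(f"{rows:>7} rows -> {pairwise_comparisons(rows):,} comparisons")
```

Under this assumption, a table of only 100,000 rows already implies about five billion comparisons, which is why "moderate" data can strain a single scale-up machine.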
  • 12. Old market says: There’s nothing wrong with what you have, just keep buying new products from us
  • 13. The emerging big data market has an answer…
  • 14. The data lake
  • 15. Views of the lake: Is the business vs supports the business? Application vs infrastructure?
  • 16. The naïve idea of a data lake leads to predictable results
  • 17. You can’t install Hadoop and hope it solves all the problems. Big data no 2
  • 18. The answer isn’t just technology, it’s architecture
  • 19. In the DW world both data and processing are bounded
      ▪ Schema
      ▪ No consideration for feedback loops and change
      ▪ Processing only happens here
      ▪ Carefully controlled access here
      ▪ Nobody here creates new information
      ▪ Sources few and well understood
      ▪ Complex DI is controlled by IT
      ▪ Schemas are few and designed
      ▪ Tools are authorized, few in number and kind
      ▪ One way flow
      This is a monolithic, layered architecture.
  • 20. In the big data world flow is unbounded and continuous
      ▪ Feedback loops allowed
      ▪ End-of-analysis dataset may be start of a BI dataset
      ▪ Continuous data integration and delivery
      ▪ Files are back as both input and storage
      ▪ Minimal barrier of / control on collection
      ▪ Areas of provisioned data
      ▪ Any shape in, rectangles out
      This needs a distributed service architecture.
  • 21. Deconstructing data environments. There are three things happening in a data warehouse:
      ▪ Data acquisition
      ▪ Data management
      ▪ Data delivery
      Isolate them from one another, allow read-write use, and you are on the path.
      [Diagram: Data Warehouse]
  • 22. Data lake subsystems / components. The acquisition component allows any data to be collected at any latency. The management component allows some data to be standardized and integrated. The access component provides access at any latency and via any means an application chooses. Processing can be done to any data at any time from any area.
      [Diagram labels: Data Acquisition (Collect & Store: Incremental, Batch, One-time copy, Real time); Data Lake Platform Services; Data Management (Process & Integrate); Data Access (Deliver & Use); Data storage]
      In reality, you are building three systems, not one. Avoid the monolith.
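The separation described on slide 22 can be sketched roughly as three small components that only share storage. This is a minimal illustration under stated assumptions: the class names (`LakeStorage`, `Acquisition`, `Management`, `Access`), the area names `"raw"` and `"standardized"`, and the sample `to_reading` transform are all hypothetical and are not part of the presentation.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional

Record = Dict[str, Any]


@dataclass
class LakeStorage:
    """Shared data storage, split into named areas (hypothetical names)."""
    areas: Dict[str, List[Record]] = field(
        default_factory=lambda: {"raw": [], "standardized": []}
    )


class Acquisition:
    """Collect & store: accepts any record, no schema required up front."""
    def __init__(self, storage: LakeStorage) -> None:
        self.storage = storage

    def collect(self, record: Record) -> None:
        self.storage.areas["raw"].append(dict(record))  # keep a copy as received


class Management:
    """Process & integrate: standardizes the subset of raw data it understands."""
    def __init__(self, storage: LakeStorage) -> None:
        self.storage = storage

    def standardize(self, transform: Callable[[Record], Optional[Record]]) -> int:
        out = [t for r in self.storage.areas["raw"] if (t := transform(r)) is not None]
        self.storage.areas["standardized"] = out
        return len(out)


class Access:
    """Deliver & use: callers choose which area to read from."""
    def __init__(self, storage: LakeStorage) -> None:
        self.storage = storage

    def read(self, area: str) -> List[Record]:
        return list(self.storage.areas[area])


def to_reading(record: Record) -> Optional[Record]:
    """Sample transform: keep only records that look like sensor readings."""
    if "sensor" in record and "temp_c" in record:
        return {"sensor": record["sensor"], "temp_c": float(record["temp_c"])}
    return None


if __name__ == "__main__":
    storage = LakeStorage()
    acquisition, management, access = Acquisition(storage), Management(storage), Access(storage)
    acquisition.collect({"sensor": "t1", "temp_c": "21.5"})
    acquisition.collect({"note": "free text, no schema"})
    management.standardize(to_reading)
    print(access.read("raw"))
    print(access.read("standardized"))
```

The point of the sketch is only that collection does not wait on modeling: the second record is stored even though the management step has no use for it yet, and either area can be read independently.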
  • 23. Data lake functions depend on platform services
      [Diagram labels: Base Platform Services; Data Movement; Metadata; Data Persistence; Workflow Management; Processing Engines; Dataflow Services; Data Curation; Data Access Services; Data Acquisition (Collect & Store); Data Management (Process & Integrate); Data Access (Deliver & Use); Platform services needed]
  • 24. DATA ARCHITECTURE. We’re so focused on the light switch that we’re not talking about the light
  • 25. Decouple the Data Architecture. The core of the data lake isn’t a database or HDFS, it’s the data architecture that the tools implement. We need a data architecture that is not limiting:
      ▪ Deals with change easily and at scale
      ▪ Does not enforce requirements and models up front
      ▪ Does not limit the format or structure of data
      ▪ Assumes the range of data latencies in and out, from streaming to one-time bulk
  • 26. Food supply chain: an analogy for data. Multiple contexts of use, differing quality levels. You need to keep the original because just like baking, you can’t unmake dough once it’s mixed.
  • 27. Data architecture is required by the services, and vice versa
      [Diagram labels: Raw data in an immutable storage area; Standardized or enhanced data; Common or usage-specific data; Transient data; Data Acquisition (Collect & Store); Platform Services; Data Access (Deliver & Use); Data Management (Process & Integrate)]
  • 28. The data areas map (mostly) to functional areas of the lake. Collection can’t be limited by database scale and latency. Immutability, persistence and concurrency are required.
      [Diagram labels: Collect (Incremental, Batch, One-time copy, Real time); Manage & Integrate; Process, Deliver, Use]
  • 29. Stages, not layers. Some tools require specific repositories or models. Others can reach in to get what they need. Do not enforce a single access point or model.
  • 30. The geography has been redefined. The box IT created:
      ▪ not any data, rigidly typed data
      ▪ not any form, tabular rows and columns of typed data
      ▪ not any latency, persist what the DB can keep up with
      ▪ not any process, only queries
      The digital world was diminished to only what’s inside the box until we forgot the box was there.
  • 31. Layered data architecture. The DW assumed a single flat model of data, DB in the center. The data lake enables new ways to organize data:
      ▪ Raw – straight from the source
      ▪ Enhanced – cleaned, standardized
      ▪ Integrated – modeled, augmented, ~semi-persistent
      ▪ Derived – analytic output, pattern based sets, ephemeral
      Implies a new technology architecture and data modeling approaches.
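The four data areas on slide 31 can be sketched as a chain of small functions in which the raw area is read-only and every later area is rebuilt from the one before it. This is only an illustration under stated assumptions: the function names, the sample sales records, and the `store_regions` lookup are hypothetical and do not come from the presentation.

```python
from types import MappingProxyType


def ingest_raw(records):
    """Raw: straight from the source, kept as read-only copies."""
    return tuple(MappingProxyType(dict(r)) for r in records)


def enhance(raw):
    """Enhanced: cleaned and standardized copies; the raw area is left untouched."""
    return [
        {"store": r["store"].strip().upper(), "amount": float(r["amount"])}
        for r in raw
        if r.get("amount") not in (None, "")
    ]


def integrate(enhanced, store_regions):
    """Integrated: modeled and augmented with reference data."""
    return [{**r, "region": store_regions.get(r["store"], "UNKNOWN")} for r in enhanced]


def derive(integrated):
    """Derived: analytic output that can be recomputed, so it is treated as ephemeral."""
    totals = {}
    for r in integrated:
        totals[r["region"]] = totals.get(r["region"], 0.0) + r["amount"]
    return totals


if __name__ == "__main__":
    source = [
        {"store": " a1 ", "amount": "10.5"},
        {"store": "b2", "amount": ""},       # incomplete, still kept in raw
        {"store": "A1", "amount": "4.5"},
    ]
    raw = ingest_raw(source)
    totals = derive(integrate(enhance(raw), {"A1": "WEST"}))
    print(len(raw), totals)
```

Because the raw records are never modified, a different cleaning rule or a new reference table can be applied later by rerunning the downstream steps, which is the "you can't unmake dough" argument in code form.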
  • 32. The data lake enables evolutionary design for data. Evolutionary design is required because data needs change. You need a system not for stability (we have that in the DW) but for evolution and change: the data lake.
      [Diagram labels: Data Acquisition (Collect & Store: Incremental, Batch, One-time copy, Real time); Data Lake Platform Services; Data Management (Process & Integrate); Data Access (Deliver & Use); Data storage]
      You can’t build this all at once. You need to grow it over time.
  • 33. Away from “one throat to choke”, back to best of breed. Tight coupling leads to efficient reuse and standardization, and to slow changes. In a rapidly evolving market componentized architectures, modularity and loose coupling are favorable over monolithic stacks, single-vendor architectures and tight coupling. Architecture, not blueprints: there is no single answer. It depends on your goals and starting position.
  • 34. Questions? "When a new technology rolls over you, you're either part of the steamroller or part of the road." (Stewart Brand)
  • 35. CC Image Attributions
      Thanks to the people who supplied the Creative Commons licensed images used in this presentation:
      donuts_4_views.jpg - http://www.flickr.com/photos/le_hibou/76718773/
      glass_buildings.jpg - http://www.flickr.com/photos/erikvanhannen/547701721
  • 36. About the Presenter
      Mark Madsen is president of Third Nature, a consulting and advisory firm focused on analytics, business intelligence and data management. Mark is an award-winning author, architect and CTO. Over the past ten years Mark received awards for his work from the American Productivity & Quality Center, TDWI, and the Smithsonian Institute. He is an international speaker, a contributor to Forbes, and a member of the O'Reilly Strata program committee. For more information or to contact Mark, follow @markmadsen on Twitter or visit http://ThirdNature.net
  • 37. About Third Nature
      Third Nature is a consulting and advisory firm focused on new and emerging technology and practices in information strategy, analytics, business intelligence and data management. If your question is related to data, analytics, information strategy and technology infrastructure then you're at the right place.
      Our goal is to help organizations solve problems using data. We offer education, consulting and research services to support business and IT organizations as well as technology vendors.
      We fill the gap between what the industry analyst firms cover and what IT needs. We specialize in strategy and architecture, so we look at emerging technologies and markets, evaluating how technologies are applied to solve problems rather than evaluating product features.
  • 39. SnapLogic helps enterprises connect data and applications faster
      ▪ Anything: apps | APIs | things | data
      ▪ Anytime: batch | streaming | real-time
      ▪ Anywhere: on premises | in the cloud
  • 40. Modern Architecture: Hybrid and Elastic
      ▪ Streams: No data is stored/cached
      ▪ Secure: 100% standards-based
      ▪ Elastic: Scales out & handles data and app integration use cases
      [Diagram labels: Metadata; Data; Databases; On Prem Apps; Big Data; Cloud Apps and Data; Cloud-Based Designer, Manager, Dashboard; Cloudplex; Groundplex; Hadooplex; Sparkplex; Firewall]
  • 41. [Diagram: data lake reference architecture. Stages: Ingest, Prepare, Deliver]
      ▪ Sources: On Prem Apps and Data (ERP, CRM, RDBMS); Cloud Apps and Data (CRM, HCM, Social); IoT Data (Sensors, Wearables, Devices)
      ▪ Data Acquisition (Ingest): Collect and integrate data from multiple sources. Batch, Streaming, Real-time. Kafka, Sqoop, Flume. Storage: HDFS, AWS S3, MS Azure Blob
      ▪ Data Management (Prepare): Data Lake. Add information and improve data. Spark, Python, Scala, Java, R, Pig
      ▪ Data Access (Deliver): Organize and prepare data for visualization. Storage: HDFS, AWS S3, MS Azure Blob. Hive, Impala, HiveSQL, SparkSQL
      ▪ Schedule and manage: Oozie, Ambari
      ▪ Consumers: Lakeshore Data Mart (MS Azure, AWS Redshift, …); BI / Analytics (Tableau, MS PowerBI / Azure, AWS QuickSight)
  • 42. The Modern Data Lake Powered by SnapLogic
      [Diagram: the same Ingest, Prepare, Deliver architecture, implemented with SnapLogic pipelines]
      ▪ Sources: On Prem Apps and Data (ERP, CRM, RDBMS); Cloud Apps and Data (CRM, HCM, Social); IoT Data (Sensors, Wearables, Devices)
      ▪ Data Acquisition (Ingest): Collect and integrate data from multiple sources. Batch, Streaming, Real-time. SnapLogic pipelines with standard mode execution
      ▪ Data Management (Prepare): Sort, Aggregate, Join, Merge, Transform. SnapLogic abstracts and operationalizes with SnapReduce or Spark pipelines
      ▪ Data Access (Deliver): Organize and prepare data for visualization. SnapLogic pipelines with standard mode execution
      ▪ Schedule and manage: SnapLogic
      ▪ Consumers: Lakeshore Data Mart (MS Azure, AWS Redshift, …); BI / Analytics (Tableau, MS PowerBI / Azure, AWS QuickSight)
  • 43. Thank You
      Watch SnapLogic in action: video/snaplogic.com
      Contact us: info@snaplogic.com
      Follow us on Twitter: @SnapLogic