Big Data, Business Intelligence and
Data Visualisation
Contact Details: Jen Stirrup Jen.Stirrup@datarelish.com
@Jenstirrup
www.datarelish.com
Who Am I?
• Postgraduate degrees in Artificial
Intelligence and Cognitive Science
• But you don’t need any of these to do
Data Visualisation
Credit: Mico Yuk
Digital Pragmatism
is about
collecting, sharing, quality-checking,
streamlining, improving, visualizing
data.
CONTROL
FREEDOM
$97B spend on
Business Intelligence by 2017
(Forrester Research)
• Average adoption rate….
21%
Genius depends upon the
data within its reach.
Ernest Dimnet
You have to start with the truth. The
truth is the only way that we can get
anywhere. Because any decision-
making that is based upon lies or
ignorance can't lead to a good
conclusion.
Julian Assange, Wikileaks
Pie Charts
Key Trends
Decision Making with Data
Internet of things
Audio /
Video
Log
Files
Text/Image
Social
Sentiment
Data Market
Feeds
eGov Feeds
Weather
Wikis / Blogs
Click
Stream
Sensors / RFID /
Devices
Spatial & GPS Coordinates
WEB 2.0
Mobile
Advertising
Collaboration
eCommerce
Digital
Marketing
Search Marketing
Web Logs
Recommendations
ERP / CRM
Sales
Pipeline
Payables
Payroll
Inventory
Contacts
Deal
Tracking
Gigabytes (10^9 bytes)
Terabytes (10^12 bytes)
Petabytes (10^15 bytes)
Exabytes (10^18 bytes)
Velocity - Variety - variability
Volume
Storage cost per GB
1980: $190,000
1990: $9,000
2000: $15
2010: $0.07
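To put the storage prices quoted on this slide in proportion, here is a small Python sketch of the arithmetic. The prices are the slide's own figures; the function names and the 1 TB = 1,000 GB convention are assumptions made for illustration.

```python
# Cost per gigabyte of storage by year, as quoted on the slide (US dollars).
cost_per_gb = {1980: 190_000.0, 1990: 9_000.0, 2000: 15.0, 2010: 0.07}


def fold_decrease(start_year: int, end_year: int) -> float:
    """How many times cheaper a gigabyte became between two years."""
    return cost_per_gb[start_year] / cost_per_gb[end_year]


def cost_of(terabytes: float, year: int) -> float:
    """Cost of storing that many terabytes at the given year's price.

    Assumes 1 TB = 1,000 GB (decimal units).
    """
    return terabytes * 1_000 * cost_per_gb[year]


if __name__ == "__main__":
    print(f"1980 to 2010: about {fold_decrease(1980, 2010):,.0f} times cheaper")
    print(f"1 TB at 1980 prices: ${cost_of(1, 1980):,.0f}")
    print(f"1 TB at 2010 prices: ${cost_of(1, 2010):,.2f}")
```

On these figures a gigabyte became millions of times cheaper in thirty years, which is one reason keeping everything became affordable.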
ERP / CRM
WEB 2.0
Internet of things
What Is Big Data?
[Figure: The world's data, 1985 to 2020. Labels: analog, digital; CD, DVD / Blu-ray, digital tape, PC / device, datacenters (cloud).]
[Figure: Connected data, 1985 to 2020. Labels: analog, digital, connected; PC / mobile, cloud / IoT.]
Credit: 17:15-19:04 of Joseph Sirosh’s PASS Keynote:
https://www.youtube.com/watch?v=DZW1-euLaQ4&feature=youtu.be&t=17m10s
Embracing data transforms business
It is central to outperforming competitors
Agriculture, Education, Manufacturing, Aerospace, Financial, Automotive, Government, Retail, Healthcare
Credit: http://download.microsoft.com/documents/en-us/making_the_right_analytics_investments_whitepaper.pdf
Challenges of the modern data platform
Inefficiencies from fragmented architecture
Relational / Beyond relational
Cloud / On-premises
• Disparate systems and processes
• Multiple tools and skillsets
• Siloed insights on disconnected data
• High cost of ownership
Credit:
http://download.microsoft.com/documents/en-us/making_the_right_analytics_investments_whitepaper.pdf
Azure SQL DB
Azure SQL DW
Analytics Platform System
Azure Data Lake
SQL Server 2016
Analytics Platform System
SQL
Relational / Beyond relational
On-premises / Cloud
Data Management
Power BI
Cortana Analytics
Azure IoT
Business
Analytics
Business Analytics & Data Management Platform
Credit:
http://download.microsoft.com/documents/en-us/making_the_right_analytics_investments_whitepaper.pdf
So what IS Big Data, then?
Hadoop vs RDBMS
• Hadoop: unstructured / semi-structured data
• RDBMS: structured data
• Hadoop works together with an RDBMS
Apache Hadoop isn’t a substitute for a
database
• It is not relational
• Key-value pairs
• Big Data
How can we make Big Data ‘Human
Scale’ and comprehensible?
Microsoft Power
1 Billion Office Users
Analyze Visualize Share Find
Q&A
Mobile
Discover
Scalable | Manageable | Trusted
“Every American should have above
average income, and my
Administration is going to see they
get it.” (Bill Clinton on campaign
trail)
The ‘Golden Record’ problem
Bystander Effect
Effective
visualizations help
stakeholders use
that information for
decision-making.
In “about five to eight
seconds, someone’s
going to make the
decision of do they
devote any more time to
looking at what you’ve
got in front of them or
do they move on to the
next thing.”
Cole Nussbaumer
StorytellingwithData.com
From: http://cxcafe.maritzcx.com/storytelling-with-data-dashboarding-with-cole-nussbaumer/
London Cholera Map – John Snow
1854. London. Cholera strikes. In just
10 days, over 500 people have been
killed in one neighborhood. The
mysterious cluster of deaths is
especially terrifying because no one
understands the source.
No one besides John Snow, an
epidemiologist who realized the water
supply was spreading the disease.
He plotted every death on a map with
ingenious mapped bar charts (see left)
and was able to show that the closer
to the Broad Street water pump he
plotted, the greater the number of
deaths.
The information helped convince the
public a true sewage system was
needed and spurred the city to action.
Gapminder – Hans Rosling
The Swedish scientist Hans Rosling had
been working with developmental data
for over 30 years – but it took a great
visualization and a 2007 TED talk for him
to share his passion with the world.
His original viz (now one of many) shows
the relationship between income and life
expectancy. The data is simple but
Rosling’s visual storytelling has allowed
him to spread his passion for this
fascinating, overlooked data to millions.
War Mortality – Florence Nightingale
1855. The Crimea. Britain is fighting a
battle with both Russia and disease.
As a nurse, how do you convince an
army to invest in hospitals and
healthcare instead of guns and
ammunition?
Florence Nightingale told her story
with data by showing the staggering
number of deaths due to preventable
disease (shown in blue/grey). After
this viz, sanitation became a major
priority for the British Army.
Designing
visualizations that
communicate clearly
doesn’t have to be
complicated.
Consider the kind of data
story you have.
• Distribution
• Part to Whole
• Correlation
• Time Series
• Compare Categories
• Ranking
Image credit: Column Five Media’s Visage Data Visualization 101
What’s next?
More data!
Data Visualisation User Centred
So, I know what a database is, but
what’s Big Data?
Microsoft Hadoop Vision
Insights to all users by activating new types of data
RDBMS vs. Hadoop
Why Big Data, now?
1980s Architecture
Database
Application
1990s – database as an integration hub
Database
Application  Application  Application
1990s – Decoupled Services
Database  Database  Database
Application  Application  Application
Key NoSQL Concepts and Architectures
Relational
Analytical (OLAP)
Data Sources Prior to NoSQL
Tipping Point to NoSQL
New
Paradigm
Large Data
Sets
Scalability
Social Media
Structured /
Unstructured
Data
What is NoSQL?
• Any database that is not relational
• Non-relational SQL
• Not ‘No SQL’
• But ‘Not Only SQL’
Where is NoSQL used?
Cassandra is used by:
Digg, Facebook, Twitter, Reddit, Rackspace, Cloudkick, Cisco
Hadoop is used by:
Amazon Web Services, Pentaho, Yahoo!, The New York Times
CouchDB is used by:
CERN, BBC, Interactive Mediums
MongoDB is used by:
Foursquare, bit.ly, SourceForge, Fotopedia, Joomla Ads
Riak is used by:
Widescript, Western Communications, Ask Sponsored Listings
Different Types of Data need Different Solutions
Data
Structured, Semi-structured, Unstructured
Transactional, Documents, Video, Scientific Data
Relational, DW, XML, JSON
Data Sources Prior to NoSQL
Relational
OLAP
Data Sources including NoSQL
Key-Value
Column-Family
Graph
Document
Relational
• Tabular format
• SQL concepts
• Tables
• Joins
• Rows / Columns
• SQL Language
• Rigid Data Modelling
Analytical (OLAP)
• Built for the business
• Dimensions / Facts
• Fast reads
• Historical Data
Key-Value Stores
• Keys are used to access blobs of data, such as video and images.
• A key uniquely identifies each record.
• Dictionaries have records that are stored and
retrieved using a key.
• It is fast because the key uniquely identifies
each record.
• Data is a single opaque collection
Locker Analogy
• The Value is simply an object.
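The key-value idea can be sketched in a few lines of Python. This is a toy in-memory stand-in, not any particular product; the class name `KeyValueStore` and the example keys are invented for illustration.

```python
from typing import Dict, Optional


class KeyValueStore:
    """Toy in-memory key-value store: each key maps to one opaque blob."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        # The store never looks inside the value; it only files it under the key.
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        # Lookup goes straight to the key, like opening one numbered locker.
        return self._data.get(key)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


store = KeyValueStore()
store.put("video:42", b"...raw video bytes...")
store.put("image:7", b"\x89PNG...raw image bytes...")
```

The store cannot answer "find all images larger than 1 MB", because the value is opaque to it. That limitation is the price of the fast lookup by key.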
Graph Store
• Data is stored in nodes, which have properties
• They are connected by critical relationships
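A minimal Python sketch of the same idea: nodes carry properties, and named relationships connect them. The node names, relationship names and helper functions are hypothetical and do not reflect the API of any real graph database.

```python
from collections import defaultdict

# Nodes: an id mapped to a dictionary of properties.
nodes = {
    "alice": {"type": "person", "name": "Alice"},
    "bob": {"type": "person", "name": "Bob"},
    "datarelish": {"type": "company", "name": "Data Relish"},
}

# Relationships: source node id -> list of (relationship, target node id).
edges = defaultdict(list)


def relate(source: str, relationship: str, target: str) -> None:
    """Record a directed, named relationship between two nodes."""
    edges[source].append((relationship, target))


def neighbours(source: str, relationship: str) -> list:
    """Follow one kind of relationship outwards from a node."""
    return [target for rel, target in edges[source] if rel == relationship]


relate("alice", "FOLLOWS", "bob")
relate("alice", "WORKS_AT", "datarelish")
relate("bob", "FOLLOWS", "alice")
```

A question such as "who does Alice follow?" becomes a walk along edges rather than a join between tables.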
Documents
• Data stored in nested hierarchies
• Logical data remains stored together as a unit
• Any item in the document can be queried
• Pros: No object-relational mapping layer, ideal for search
• Cons: Complex to implement, incompatible with SQL
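As a rough illustration of a document, here is an order held as one nested Python structure that serialises to JSON. The field names and the `get_path` helper are assumptions made for this sketch; real document stores each have their own query language.

```python
import json

# One order kept together as a single nested document.
order_document = {
    "order_id": 1001,
    "customer": {"name": "A. Customer", "city": "London"},
    "lines": [
        {"sku": "A1", "qty": 2, "price": 9.5},
        {"sku": "B2", "qty": 1, "price": 20.0},
    ],
}


def get_path(document, path: str):
    """Follow a dotted path such as 'customer.city' or 'lines.1.sku'."""
    current = document
    for part in path.split("."):
        # List items are addressed by position, dictionary items by name.
        current = current[int(part)] if isinstance(current, list) else current[part]
    return current


# Any item in the document can be reached, however deeply it is nested.
city = get_path(order_document, "customer.city")
second_sku = get_path(order_document, "lines.1.sku")
order_total = sum(line["qty"] * line["price"] for line in order_document["lines"])

# The whole unit round-trips through JSON without an object-relational mapping layer.
as_json = json.dumps(order_document)
```

In a relational design the same order would usually be split across customer, order and order-line tables and reassembled with joins.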
Database Availability Online
Database Availability Means
CAP Theorem (BASE vs ACID)
Partitioning and Replication
Replication Diagram
“Ring” of Consistent Hashing
Next …. → Database Integrity
What is Database Availability?
● High Availability: the database and application are available during the scheduled period; the system is temporarily down only during maintenance periods.
● Continuous Operation: the system is available all the time, with no scheduled outages.
● Continuous Availability: a combination of HA and CO; data is always available, and maintenance is done without shutting down the system.
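Availability is often quoted as a percentage of the scheduled period. The Python sketch below shows the arithmetic; the function names and the 99.9% target are illustrative assumptions, not figures from the slides.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def availability(scheduled_hours: float, downtime_hours: float) -> float:
    """Fraction of the scheduled period during which the system was up."""
    return (scheduled_hours - downtime_hours) / scheduled_hours


def downtime_budget(target: float, scheduled_hours: float = HOURS_PER_YEAR) -> float:
    """Hours of downtime allowed while still meeting the availability target."""
    return (1.0 - target) * scheduled_hours


if __name__ == "__main__":
    # A 99.9% target over a full year leaves under nine hours for all outages.
    print(f"Downtime budget at 99.9%: {downtime_budget(0.999):.2f} hours per year")
```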
CAP Theorem
Consistency,
Availability and
Partition Tolerance.
A shared-data system
can have at most two of
those three.
ACID and BASE
ACID
Atomicity: All or nothing
Consistency: Any transaction should result in valid tables
Isolation: Concurrent transactions do not interfere with one another
Durability: Committed data will survive a system failure.
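Atomicity and consistency can be seen in a small example. The sketch uses Python's built-in `sqlite3` module as a convenient relational engine; the `accounts` table and the `transfer` function are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # consistency rule
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn: sqlite3.Connection, source: str, target: str, amount: int) -> bool:
    """Move money between two accounts as one all-or-nothing transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit is undone too.
        return False


def balances(conn: sqlite3.Connection) -> dict:
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))
```

A transfer that would overdraw the source account fails as a whole: the credit already applied to the target is rolled back, so the table never shows half a transfer.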
BASE
Basically Available - system seems to work all the time
Soft State - it doesn't have to be consistent all the time
Eventually Consistent - becomes consistent at some later time
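Eventual consistency is easier to picture with a toy simulation. In this Python sketch a write is accepted by one replica straight away and queued for the others; the `Replica` and `Cluster` classes are hypothetical and greatly simplified compared with a real replication protocol.

```python
from collections import deque


class Replica:
    def __init__(self, name: str) -> None:
        self.name = name
        self.data = {}
        self.pending = deque()  # updates received but not yet applied

    def apply_pending(self) -> None:
        while self.pending:
            key, value = self.pending.popleft()
            self.data[key] = value


class Cluster:
    def __init__(self, names) -> None:
        self.replicas = [Replica(name) for name in names]

    def write(self, key, value) -> None:
        """Accept the write on the first replica now; queue it for the rest."""
        first, *others = self.replicas
        first.data[key] = value
        for replica in others:
            replica.pending.append((key, value))

    def read(self, replica_index: int, key):
        """Read from one replica; it may not have caught up yet (soft state)."""
        return self.replicas[replica_index].data.get(key)

    def sync(self) -> None:
        """Let every replica apply its backlog, so all copies converge."""
        for replica in self.replicas:
            replica.apply_pending()
```

Between `write` and `sync`, a read from a lagging replica returns stale data. The system stays available throughout and becomes consistent at some later time.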
Scalability
Vertical scale
Improving the server specification by adding more processors, RAM, and storage devices. Limited and expensive.
Horizontal scale
Adding more cheap computers as servers. This requires sharding and partitioning, which are hard to implement and expensive with relational databases (RDBMS).
Partitioning
Sharing the data between different nodes
Each node placed on a ring
Advantage : ability to scale incrementally
Issues : non-uniform data distribution
(data host)
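The ring can be sketched with consistent hashing in Python. This is a minimal version under stated assumptions: MD5 is used only as a convenient hash, each node has a single position on the ring, and the `HashRing` class and node names are invented for illustration.

```python
import bisect
import hashlib


def ring_position(value: str) -> int:
    """Map a node name or a key to a position on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes=()) -> None:
        self._positions = []  # sorted positions of the nodes on the ring
        self._owners = {}     # position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        position = ring_position(node)
        bisect.insort(self._positions, position)
        self._owners[position] = node

    def node_for(self, key: str) -> str:
        """Walk clockwise from the key's position to the next node."""
        if not self._positions:
            raise LookupError("the ring has no nodes")
        index = bisect.bisect_right(self._positions, ring_position(key))
        index %= len(self._positions)  # wrap around past the last node
        return self._owners[self._positions[index]]


ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.node_for("customer:1001")
```

Adding a node only takes over the keys on the arc just before it, which is what makes incremental scaling possible. With one position per node the arcs can be very uneven, which is the non-uniform distribution issue noted above; placing each node at several positions on the ring is a common mitigation.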
Replication
Multiple nodes
Multiple datacenters
High availability and durability
•NoSQL solutions need to solve real-world
business problems
•Search
•High Availability
•Agility
• Big Data is not the same as NoSQL.
• NoSQL is more than dealing with big datasets.
• NoSQL includes concepts that can be managed by a single processor
• However, big data problems are a primary use case for NoSQL.
One or many databases?
One Database
• Easy to understand
• Easy to set up and configure
• Easy to administer
• Single source
• Limited scalability
[Figure: Linear Scaling. Axes: Performance vs. Number of Processors.]
[Figure: Expressivity vs. Scalability (degree of distribution). Labels: in-memory cache, key-value, column family, document stores (JSON, XML), graph stores, row store.]
Big Data Problems
Big Data
Read-mostly
Documents
Full Text
Event Log
Real Time Batch
Graph
Read-write
Transactions Transactions
Why do databases fail?
• Anything that can go wrong, will go wrong – Murphy’s
law.
• Human error
• Network failure
• Hardware failure
• Security
What can we do to support Hadoop?
• Hadoop helps manage and process large datasets
• Hadoop provides linear scalability
• Hadoop brings computing logic to the data rather than
bringing the data to computing logic.
Hadoop Clustering basics
•Hadoop uses a cluster for data storage and
computation purposes.
•It lets you write and run distributed applications that process huge
amounts of data
What is the purpose of Hive?
Hive is a data warehousing system for Hadoop
To meet the needs of businesses, data scientists, analysts and BI professionals
Data, Summarized
Fit a structure onto data
Data, Analyzed
Analysis of Large Datasets stored in Hadoop File Systems
SQL-Like language called HiveQL
Custom mappers and reducers when HiveQL isn’t enough
Hive History
What can Hive offer you?
Hive can help with a range of business problems:
• Log Processing
• Predictive Modelling
• Hypothesis testing
• And Business Intelligence
Hive is not a replacement for SQL
So don’t throw out your SQL Server instances!
• Hive is for processing large data sets that may span hundreds, or even thousands, of
machines
• Hive has a high overhead for starting a job. It translates queries to MapReduce (MR) jobs, so it takes time
• Unlike SQL Server, Hive does not cache data
• Hive performance tuning is mainly Hadoop performance tuning
• Similarity of the query engine, but different architectures for different purposes
HiveQL
Hive QL is a SQL-like language
It outputs naturally occurring groups for further analysis
Easy Data Summarization
Large Datasets, summarized
Fit a structure onto data
Analysis of Large Datasets stored in Hadoop file systems
SQL-Like language called HiveQL
Custom mappers and reducers when HiveQL isn’t enough
HiveQL Queries like SQL Queries?
Similarities in Syntax and Features
Similar features
SELECT
FROM
WHERE
GROUP BY / HAVING
Table Aliases
Computed Columns
Aggregate Functions
Nested Select
CASE
LIKE / RLIKE
JOIN
ORDER BY / SORT BY
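The shared features listed above can be seen together in one query. As a stand-in for a Hive cluster, this sketch runs the query with Python's built-in `sqlite3` module; the `page_views` table and its data are invented, and the query has not been run against Hive itself, so treat it as an illustration of the common syntax only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (country TEXT, page TEXT, seconds INTEGER)")
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?, ?)",
    [
        ("UK", "/home", 12), ("UK", "/pricing", 40), ("UK", "/home", 8),
        ("US", "/home", 5), ("US", "/pricing", 65), ("DE", "/home", 30),
    ],
)

# SELECT, FROM, WHERE, GROUP BY / HAVING, a table alias, computed columns,
# aggregate functions, CASE, LIKE and ORDER BY, all in one statement.
query = """
SELECT pv.country,
       COUNT(*) AS views,
       SUM(CASE WHEN pv.seconds >= 30 THEN 1 ELSE 0 END) AS long_views
FROM page_views pv
WHERE pv.page LIKE '/%'
GROUP BY pv.country
HAVING COUNT(*) >= 2
ORDER BY views DESC, pv.country
"""

rows = conn.execute(query).fetchall()
for country, views, long_views in rows:
    print(country, views, long_views)
```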
How does Hive work?
Hive as a Translation Tool
Compiles and executes queries
Hive translates the SQL query into one or more MapReduce jobs
These jobs are chained together
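To make the translation concrete, here is a hand-written Python sketch of the map, shuffle and reduce steps that a query like `SELECT country, COUNT(*) ... GROUP BY country` boils down to. It runs in a single process and is only a conceptual model; the function names and sample records are assumptions, and it is not the plan Hive actually generates.

```python
from collections import defaultdict
from itertools import chain

# Sample input records: (country, page).
records = [
    ("UK", "/home"), ("UK", "/pricing"), ("US", "/home"),
    ("UK", "/home"), ("DE", "/home"), ("US", "/pricing"),
]


def map_phase(record):
    """Emit one (key, 1) pair per record; the key is the GROUP BY column."""
    country, _page = record
    yield (country, 1)


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse the values for one key; summing the ones gives COUNT(*)."""
    return key, sum(values)


def run_job(records):
    mapped = chain.from_iterable(map_phase(record) for record in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in sorted(grouped.items()))


counts = run_job(records)
```

A more complex query becomes several such jobs, with the output of one feeding the next.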
Hive as a structuring Tool
Creates a schema around the data
Tables stored in Directories
Hive Tables
Rows and columns, like SQL tables
Hive Metastore
Namespace with a set of tables
Holds table definitions
Physical Layout
Column Types
Partition Information
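Fitting a structure onto data can be sketched as applying a table definition when a raw file is read. In this Python illustration the `schema` list plays the part of a metastore entry; the file contents, column names and `read_table` function are all hypothetical.

```python
import csv
import io

# Raw, tab-delimited text as it might sit in a directory on the file system.
raw_file = "2016-01-01\tUK\t12\n2016-01-02\tUS\t40\n"

# Table definition kept separately from the data: column names and types.
schema = [("view_date", str), ("country", str), ("seconds", int)]


def read_table(raw_text: str, schema):
    """Apply the schema to each line at read time, yielding rows as dicts."""
    reader = csv.reader(io.StringIO(raw_text), delimiter="\t")
    for fields in reader:
        yield {name: cast(value) for (name, cast), value in zip(schema, fields)}


rows = list(read_table(raw_file, schema))
total_seconds = sum(row["seconds"] for row in rows)
```

The raw file is left untouched: the table is a view over it, described by the definition and imposed when the data is read.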
Hive and SQL Data Types
Hive         SQL
TinyInt      TinyInt
SmallInt     SmallInt
Int          Int
BigInt       BigInt
Boolean      Bit (setting as NOT NULL)
Float        Real
Double       Float
BigDecimal   Decimal
String       Char, varchar, nvarchar, ntext, text, image
Binary       Binary
Timestamp    Timestamp (note that this is being deprecated), RowVersion
Hive Mathematical Operations
• Plus
• Negative
• Addition
• Subtraction
• Multiplication
• Division
• Modulus
• Primitive Types
• Complex Types: Arrays, Maps, Structs, Union
Power View
• Highly Visual Design Experience
• Power View is an interactive, ad hoc query and visualization experience.
• It is for business question ‘mystery’ solving
Power Map
• Power Map is a new 3D visualization add-in for Excel, helping you to analyse geographical and temporal data
• Mapping
• Exploring
• Interacting
Different Tools for Different Jobs
Hive and Pig: Similarities
Hive and Pig are great at crunching large amounts of
data from HDFS to database
Both compile to Map Reduce jobs
Pig is Procedural, Hive is Declarative
Hive is much closer to SQL in terms of querying – this can be a
good or a bad thing!
Hive and Pig: Differences
Pig
• Procedural
• Fits cleanly into pipeline paradigm; no need for temporary tables
• Greater control over dataflow: checkpoints; naturally handles splitting of data streams
• Optimizing done by developer
Hive
• Declarative
• Temporary tables are ubiquitous but can be disjointed; may involve clean up.
• SQL expects one result and works towards it. Handles trees but not splits
• Hive optimisation is passed to the Hive Query Optimizer
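The procedural versus declarative contrast can be shown without either tool. In this Python sketch the first half spells out each step in order, in the spirit of a Pig script, and the second half states the result in SQL and leaves the plan to the engine, in the spirit of Hive. Neither half is real Pig Latin or HiveQL; the `orders` data is invented and `sqlite3` is a stand-in engine.

```python
import sqlite3

orders = [("UK", 120), ("UK", 30), ("US", 75), ("US", 10), ("DE", 55)]

# Procedural style: an explicit sequence of named steps, in the order they run.
large_orders = [(country, amount) for country, amount in orders if amount >= 50]
totals = {}
for country, amount in large_orders:
    totals[country] = totals.get(country, 0) + amount
procedural_result = sorted(totals.items())

# Declarative style: describe the result and let the engine decide how.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (country TEXT, amount INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)
declarative_result = conn.execute(
    "SELECT country, SUM(amount) FROM orders "
    "WHERE amount >= 50 GROUP BY country ORDER BY country"
).fetchall()
```

Both routes give the same totals. In the first, the intermediate list `large_orders` can be inspected or reused at any step; in the second, how the work is done is left to the optimiser.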
Hive and Pig: When are they best used?
Different Tools with Different Jobs
Pig is akin to SSIS
Great for dataflows and automated batch jobs
Hive is akin to ad-hoc, analytics SQL Queries
Results that make sense of the data
Why, Who & How of Power BI
More Specialized
BI Pros
Power Users
Decision Makers
Business Analysts
Information Workers
Self-Service
• Power Pivot
• Power View
• Power Query
• Power Map
Clients
• Excel Services
• Office Professional
Easy Access to Data, Big and Small
Microsoft Power BI for Office 365
1 in 4 enterprise customers on Office 365
1 Billion Office Users
Analyze Visualize Share Find
Q&A
Mobile
Discover
Scalable | Manageable | Trusted
Power Query
Enable self-service data discovery, query, transformation and mashup experiences for Information Workers, via
Excel and PowerPivot
Discovery and connectivity to a wide range of data sources, spanning volume as well as variety of
data.
Highly interactive and intuitive experience for rapidly and iteratively building queries over any
data source, any size.
Consistency of experience, and parity of query capabilities over all data sources.
Joins across different data sources; ability to create custom views over data that can then be
shared with team/department.
Power Query
Discover, combine,
and refine Big Data,
small data, and any
data with Data
Explorer for Excel.
Power Query Data Sources
Windows Azure Marketplace
Windows Active Directory
Azure SQL Database
Azure HDInsight
Analyse and Model with Excel 2013
Power View
Powerful Self-Service BI with Excel 2013
Power View – Business Mysteries, Solved
Power View is an interactive data exploration,
visualization, and presentation experience
Highly visual design experience
Rich meta-driven interactivity
Presentation-ready at all times
It delivers intuitive ad-hoc reporting for business
users
Introducing Power View
It is now also available in Excel 2013, and with new features:
• Maps
• Pie charts
• Hierarchies
• KPIs
• Drill down/Drill up
• Report styles,
themes and text
resizing
• Backgrounds with
images
• Hyperlinks
• Printing
Power View in Excel
Excel Database
server
SQL AS
(Tabular)
Power View
SQL RS
ADOMD.NET
SQL AS
(PowerPivot)
Power View in SharePoint
Browser SharePoint
web server
Database
server
SharePoint
app server
SQL AS
(PowerPivot)
SQL AS
(Tabular)
SQL RS
Add-In
SQL RS
Power View
Powerful Self-Service BI with Excel 2013
Power Map for Microsoft Excel enables information workers to discover and share new insights
from geographical and temporal data through three-dimensional storytelling.
What Is Power Map?
Map Data
• Data in Excel
• Geo-Code
• 3D and 3 Visuals
Discover Insights
• Play over Time
• Annotate points
• Capture scenes
Share Stories
• Cinematic Effects
• Interactive Tours
• Share Workbook
Power Map: Steps to 3D insights
Map Data
Power Map
Excel Add-in to Enhance Data Visualization
Map data, discover insight, and share stories
More Related Content

What's hot

Documenting Data Transformations
Documenting Data TransformationsDocumenting Data Transformations
Documenting Data TransformationsARDC
 
Data Visualization and Analysis
Data Visualization and AnalysisData Visualization and Analysis
Data Visualization and AnalysisDaniel Rangel
 
From Volume to Value - A Guide to Data Engineering
From Volume to Value - A Guide to Data EngineeringFrom Volume to Value - A Guide to Data Engineering
From Volume to Value - A Guide to Data EngineeringRy Walker
 
Big Data processing with Apache Spark
Big Data processing with Apache SparkBig Data processing with Apache Spark
Big Data processing with Apache SparkLucian Neghina
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data ScienceCaserta
 
Data Discoverability at SpotHero
Data Discoverability at SpotHeroData Discoverability at SpotHero
Data Discoverability at SpotHeroMaggie Hays
 
Big data - A Really Big Enchilada?
Big data - A Really Big Enchilada?Big data - A Really Big Enchilada?
Big data - A Really Big Enchilada?Keshav Deshpande
 
Data Scientist vs Data Analyst vs Data Engineer - Role & Responsibility, Skil...
Data Scientist vs Data Analyst vs Data Engineer - Role & Responsibility, Skil...Data Scientist vs Data Analyst vs Data Engineer - Role & Responsibility, Skil...
Data Scientist vs Data Analyst vs Data Engineer - Role & Responsibility, Skil...Simplilearn
 
Big Data Science: Intro and Benefits
Big Data Science: Intro and BenefitsBig Data Science: Intro and Benefits
Big Data Science: Intro and BenefitsChandan Rajah
 
Big Data Presentation - Data Center Dynamics Sydney 2014 - Dez Blanchfield
Big Data Presentation - Data Center Dynamics Sydney 2014 - Dez BlanchfieldBig Data Presentation - Data Center Dynamics Sydney 2014 - Dez Blanchfield
Big Data Presentation - Data Center Dynamics Sydney 2014 - Dez BlanchfieldDez Blanchfield
 
Training in Analytics and Data Science
Training in Analytics and Data ScienceTraining in Analytics and Data Science
Training in Analytics and Data ScienceAjay Ohri
 
Data Modeling for Big Data & NoSQL Technologies with Karen Lopez
Data Modeling for Big Data & NoSQL Technologies with Karen LopezData Modeling for Big Data & NoSQL Technologies with Karen Lopez
Data Modeling for Big Data & NoSQL Technologies with Karen LopezEmbarcadero Technologies
 
The Data Lake and Getting Buisnesses the Big Data Insights They Need
The Data Lake and Getting Buisnesses the Big Data Insights They NeedThe Data Lake and Getting Buisnesses the Big Data Insights They Need
The Data Lake and Getting Buisnesses the Big Data Insights They NeedDunn Solutions Group
 
Crowdsourced Data Processing: Industry and Academic Perspectives
Crowdsourced Data Processing: Industry and Academic PerspectivesCrowdsourced Data Processing: Industry and Academic Perspectives
Crowdsourced Data Processing: Industry and Academic PerspectivesAditya Parameswaran
 
Data Modeling for Security, Privacy and Data Protection
Data Modeling for Security, Privacy and Data ProtectionData Modeling for Security, Privacy and Data Protection
Data Modeling for Security, Privacy and Data ProtectionKaren Lopez
 
Towards Visualization Recommendation Systems
Towards Visualization Recommendation SystemsTowards Visualization Recommendation Systems
Towards Visualization Recommendation SystemsAditya Parameswaran
 
WWV2015: Jibes Paul van der Hulst big data
WWV2015: Jibes Paul van der Hulst big dataWWV2015: Jibes Paul van der Hulst big data
WWV2015: Jibes Paul van der Hulst big datawebwinkelvakdag
 

What's hot (20)

Documenting Data Transformations
Documenting Data TransformationsDocumenting Data Transformations
Documenting Data Transformations
 
Data Visualization and Analysis
Data Visualization and AnalysisData Visualization and Analysis
Data Visualization and Analysis
 
From Volume to Value - A Guide to Data Engineering
From Volume to Value - A Guide to Data EngineeringFrom Volume to Value - A Guide to Data Engineering
From Volume to Value - A Guide to Data Engineering
 
Big Data processing with Apache Spark
Big Data processing with Apache SparkBig Data processing with Apache Spark
Big Data processing with Apache Spark
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Big Data Analytics
Big Data AnalyticsBig Data Analytics
Big Data Analytics
 
Data Discoverability at SpotHero
Data Discoverability at SpotHeroData Discoverability at SpotHero
Data Discoverability at SpotHero
 
Big data - A Really Big Enchilada?
Big data - A Really Big Enchilada?Big data - A Really Big Enchilada?
Big data - A Really Big Enchilada?
 
Data Scientist vs Data Analyst vs Data Engineer - Role & Responsibility, Skil...
Data Scientist vs Data Analyst vs Data Engineer - Role & Responsibility, Skil...Data Scientist vs Data Analyst vs Data Engineer - Role & Responsibility, Skil...
Data Scientist vs Data Analyst vs Data Engineer - Role & Responsibility, Skil...
 
Big Data Science: Intro and Benefits
Big Data Science: Intro and BenefitsBig Data Science: Intro and Benefits
Big Data Science: Intro and Benefits
 
Big Data Presentation - Data Center Dynamics Sydney 2014 - Dez Blanchfield
Big Data Presentation - Data Center Dynamics Sydney 2014 - Dez BlanchfieldBig Data Presentation - Data Center Dynamics Sydney 2014 - Dez Blanchfield
Big Data Presentation - Data Center Dynamics Sydney 2014 - Dez Blanchfield
 
Training in Analytics and Data Science
Training in Analytics and Data ScienceTraining in Analytics and Data Science
Training in Analytics and Data Science
 
Data Modeling for Big Data & NoSQL Technologies with Karen Lopez
Data Modeling for Big Data & NoSQL Technologies with Karen LopezData Modeling for Big Data & NoSQL Technologies with Karen Lopez
Data Modeling for Big Data & NoSQL Technologies with Karen Lopez
 
The Data Lake and Getting Buisnesses the Big Data Insights They Need
The Data Lake and Getting Buisnesses the Big Data Insights They NeedThe Data Lake and Getting Buisnesses the Big Data Insights They Need
The Data Lake and Getting Buisnesses the Big Data Insights They Need
 
Crowdsourced Data Processing: Industry and Academic Perspectives
Crowdsourced Data Processing: Industry and Academic PerspectivesCrowdsourced Data Processing: Industry and Academic Perspectives
Crowdsourced Data Processing: Industry and Academic Perspectives
 
Data Modeling for Security, Privacy and Data Protection
Data Modeling for Security, Privacy and Data ProtectionData Modeling for Security, Privacy and Data Protection
Data Modeling for Security, Privacy and Data Protection
 
BigData Analysis
BigData AnalysisBigData Analysis
BigData Analysis
 
Towards Visualization Recommendation Systems
Towards Visualization Recommendation SystemsTowards Visualization Recommendation Systems
Towards Visualization Recommendation Systems
 
WWV2015: Jibes Paul van der Hulst big data
WWV2015: Jibes Paul van der Hulst big dataWWV2015: Jibes Paul van der Hulst big data
WWV2015: Jibes Paul van der Hulst big data
 
Intro to Data Science by DatalentTeam at Data Science Clinic#11
Intro to Data Science by DatalentTeam at Data Science Clinic#11Intro to Data Science by DatalentTeam at Data Science Clinic#11
Intro to Data Science by DatalentTeam at Data Science Clinic#11
 

Viewers also liked

Telco Big Data Workshop Sample
Telco Big Data Workshop SampleTelco Big Data Workshop Sample
Telco Big Data Workshop SampleAlan Quayle
 
Big Data Step-by-Step: Infrastructure 3/3: Taking it to the cloud... easily.....
Big Data Step-by-Step: Infrastructure 3/3: Taking it to the cloud... easily.....Big Data Step-by-Step: Infrastructure 3/3: Taking it to the cloud... easily.....
Big Data Step-by-Step: Infrastructure 3/3: Taking it to the cloud... easily.....Jeffrey Breen
 
Testing Big Data: Automated Testing of Hadoop with QuerySurge
Testing Big Data: Automated  Testing of Hadoop with QuerySurgeTesting Big Data: Automated  Testing of Hadoop with QuerySurge
Testing Big Data: Automated Testing of Hadoop with QuerySurgeRTTS
 
Big Data visualization with Apache Spark and Zeppelin
Big Data visualization with Apache Spark and ZeppelinBig Data visualization with Apache Spark and Zeppelin
Big Data visualization with Apache Spark and Zeppelinprajods
 
Evolution Of The Computers
Evolution Of The ComputersEvolution Of The Computers
Evolution Of The Computerspanitiaict
 
A Brief History of Big Data
A Brief History of Big DataA Brief History of Big Data
A Brief History of Big DataBernard Marr
 
Software Architecture and Design - An Overview
Software Architecture and Design - An OverviewSoftware Architecture and Design - An Overview
Software Architecture and Design - An OverviewOliver Stadie
 
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)Helena Edelson
 
Big Data - The 5 Vs Everyone Must Know
Big Data - The 5 Vs Everyone Must KnowBig Data - The 5 Vs Everyone Must Know
Big Data - The 5 Vs Everyone Must KnowBernard Marr
 
Architectural Patterns and Software Architectures: Client-Server, Multi-Tier,...
Architectural Patterns and Software Architectures: Client-Server, Multi-Tier,...Architectural Patterns and Software Architectures: Client-Server, Multi-Tier,...
Architectural Patterns and Software Architectures: Client-Server, Multi-Tier,...Svetlin Nakov
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lakeJames Serra
 
HBaseCon 2012 | HBase Schema Design - Ian Varley, Salesforce
HBaseCon 2012 | HBase Schema Design - Ian Varley, SalesforceHBaseCon 2012 | HBase Schema Design - Ian Varley, Salesforce
HBaseCon 2012 | HBase Schema Design - Ian Varley, SalesforceCloudera, Inc.
 
Big Data in Healthcare Made Simple: Where It Stands Today and Where It’s Going
Big Data in Healthcare Made Simple: Where It Stands Today and Where It’s GoingBig Data in Healthcare Made Simple: Where It Stands Today and Where It’s Going
Big Data in Healthcare Made Simple: Where It Stands Today and Where It’s GoingHealth Catalyst
 
Introduction to Big Data/Machine Learning
Introduction to Big Data/Machine LearningIntroduction to Big Data/Machine Learning
Introduction to Big Data/Machine LearningLars Marius Garshol
 

Viewers also liked (20)

Big data and its impact on indian business
Big data and its impact on indian businessBig data and its impact on indian business
Big data and its impact on indian business
 
Telco Big Data Workshop Sample
Telco Big Data Workshop SampleTelco Big Data Workshop Sample
Telco Big Data Workshop Sample
 
3.1.2 classification of network
3.1.2 classification of network3.1.2 classification of network
3.1.2 classification of network
 
Big Data Step-by-Step: Infrastructure 3/3: Taking it to the cloud... easily.....
Digital Pragmatism with Business Intelligence, Big Data and Data Visualisation

  • 1. Big Data, Business Intelligence and Data Visualisation Contact Details: Jen Stirrup Jen.Stirrup@datarelish.com @Jenstirrup www.datarelish.com
  • 2. Who Am I? • Postgraduate degrees in Artificial Intelligence and Cognitive Science • But you don’t need any of these to do Data Visualisation
  • 4. Digital Pragmatism is about collecting, sharing, quality-checking, streamlining, improving, visualizing data.
  • 6. $97B spend on Business Intelligence by 2017 (Forrester Research) • Average adoption rate…. 21%
  • 7. Genius depends upon the data within its reach. Ernest Dimnet
  • 8.
  • 9. You have to start with the truth. The truth is the only way that we can get anywhere. Because any decision-making that is based upon lies or ignorance can't lead to a good conclusion. Julian Assange, Wikileaks
  • 10. You have to start with the truth. The truth is the only way that we can get anywhere. Because any decision-making that is based upon lies or ignorance can't lead to a good conclusion. Julian Assange, Wikileaks
  • 13.
  • 16. What Is Big Data? Data sources, grouped as they appear on the slide: ERP / CRM (Sales Pipeline, Payables, Payroll, Inventory, Contacts, Deal Tracking); Web 2.0 (Mobile, Advertising, Collaboration, eCommerce, Digital Marketing, Search Marketing, Web Logs, Recommendations); Internet of things (Audio / Video, Log Files, Text/Image, Social Sentiment, Data Market Feeds, eGov Feeds, Weather, Wikis / Blogs, Click Stream, Sensors / RFID / Devices, Spatial & GPS Coordinates). Volume axis: Gigabytes (10^9), Terabytes (10^12), Petabytes (10^15), Exabytes (10^18). Other axis: Velocity - Variety - Variability. Storage cost per GB: 1980 $190,000; 1990 $9,000; 2000 $15; 2010 $0.07.
  • 17. The world’s data (chart series: DIGITAL, ANALOG; time axis 1985 to 2020 in five-year steps). Credit: 17:15-19:04 of Joseph Sirosh’s PASS Keynote: https://www.youtube.com/watch?v=DZW1-euLaQ4&feature=youtu.be&t=17m10s
  • 18. The world’s data (chart series: DIGITAL, ANALOG; storage labels: DATACENTERS (CLOUD), PC / DEVICE, DIGITAL TAPE, DVD / BLU-RAY, CD; time axis 1985 to 2020 in five-year steps). Credit: 17:15-19:04 of Joseph Sirosh’s PASS Keynote: https://www.youtube.com/watch?v=DZW1-euLaQ4&feature=youtu.be&t=17m10s
  • 19. Connected data (chart series: CONNECTED, DIGITAL, ANALOG; storage labels: DATACENTERS (CLOUD), PC / DEVICE, DIGITAL TAPE, DVD / BLU-RAY, CD; time axis 1985 to 2020 in five-year steps). Credit: 17:15-19:04 of Joseph Sirosh’s PASS Keynote: https://www.youtube.com/watch?v=DZW1-euLaQ4&feature=youtu.be&t=17m10s
  • 20. Connected data (chart series: CONNECTED, DIGITAL, ANALOG; labels: CLOUD / IoT, PC / MOBILE; time axis 1985 to 2020 in five-year steps). Credit: 17:15-19:04 of Joseph Sirosh’s PASS Keynote: https://www.youtube.com/watch?v=DZW1-euLaQ4&feature=youtu.be&t=17m10s
  • 21. Connected data (chart series: CONNECTED, DIGITAL, ANALOG; labels: CLOUD / IoT, MOBILE; time axis 1985 to 2020 in five-year steps).
  • 22. Embracing data transforms business. It is central to outperforming competitors. Industries shown: Agriculture, Education, Manufacturing, Aerospace, Financial, Automotive, Government, Retail, Healthcare. Credit: http://download.microsoft.com/documents/en-us/making_the_right_analytics_investments_whitepaper.pdf
  • 23. Challenges of the modern data platform: Inefficiencies from fragmented architecture • Disparate systems and processes • Multiple tools and skillsets • Siloed insights on disconnected data • High cost of ownership. Diagram labels: Relational, Beyond relational, Cloud, On-premises. Credit: http://download.microsoft.com/documents/en-us/making_the_right_analytics_investments_whitepaper.pdf
  • 24. Business Analytics & Data Management Platform. Data Management: Azure SQL DB, Azure SQL DW, Analytics Platform System, Azure Data Lake, SQL Server 2016 (diagram labels: Relational, Beyond relational, On-premises, Cloud). Business Analytics: Power BI, Cortana Analytics, Azure IoT. Credit: http://download.microsoft.com/documents/en-us/making_the_right_analytics_investments_whitepaper.pdf
  • 26. So what IS Big Data, then?
  • 27. Hadoop vs RDBMS • Unstructured / Semi-structured • Structured • Works together with RDBMS
  • 28. Hadoop vs RDBMS: Apache Hadoop isn’t a substitute for a database • It is not relational • Key-value pairs • Big Data
  • 29. How can we make Big Data ‘Human Scale’ and comprehensible?
  • 30. Microsoft Power 1 Billion Office Users Analyze Visualize Share Find Q&A Mobile Discover Scalable | Manageable | Trusted
  • 31. “Every American should have above average income, and my Administration is going to see they get it.” (Bill Clinton on campaign trail)
  • 32.
  • 35. Effective visualizations help stakeholders use that information for decision-making.
  • 36. In “about five to eight seconds, someone’s going to make the decision of do they devote any more time to looking at what you’ve got in front of them or do they move on to the next thing.” Cole Nussbaumer, StorytellingwithData.com. From: http://cxcafe.maritzcx.com/storytelling-with-data-dashboarding-with-cole-nussbaumer/
  • 37. London Cholera Map – John Snow 1854. London. Cholera strikes. In just 10 days, over 500 people have been killed in one neighborhood. The mysterious cluster of deaths is especially terrifying because no one understands the source. No one besides John Snow, an epidemiologist who realized the water supply was spreading the disease.
  • 38. London Cholera Map – John Snow He plotted every death on a map with ingenious mapped bar charts (see left) and was able to show that the closer to the Broad Street water pump he plotted, the greater the number of deaths. The information helped convince the public a true sewage system was needed and spurred the city to action.
  • 39. Gapminder – Hans Rosling The Swedish scientist Hans Rosling had been working with developmental data for over 30 years – but it took a great visualization and a 2007 TED talk for him to share his passion with the world. His original viz (now one of many) shows the relationship between income and life expectancy. The data is simple but Rosling’s visual storytelling has allowed him to spread his passion for this fascinating, overlooked data to millions.
  • 40. War Mortality – Florence Nightingale
  • 41. War Mortality – Florence Nightingale 1855. The Crimea. Britain is fighting a battle with both Russia and disease. As a nurse, how do you convince an army to invest in hospitals and healthcare instead of guns and ammunition? Florence Nightingale told her story with data by showing the staggering number of deaths due to preventable disease (shown in blue/grey). After this viz, sanitation became a major priority for the British Army.
  • 43. Consider the kind of data story you have. Distribution Part to Whole Correlation Time Series Compare Categories Ranking Image credit: Column Five Media’s Visage Data Visualization 101
  • 44. What’s next? More data! Data Visualisation User Centred
  • 46. So, I know what a database is, but what’s Big Data?
  • 47. Microsoft Hadoop Vision Insights to all users by activating new types of data
  • 51. 1990s – database as an integration hub Database Application Application Application
  • 52. 1990s – Decoupled Services Database Database Database Application Application Application
  • 55. Tipping Point to NoSQL New Paradigm Large Data Sets Scalability Social Media Structured / Unstructured Data
  • 56. What is NoSQL? • Any database that is not relational • Non-relational • Not ‘No SQL’ but ‘Not Only SQL’
  • 57. Where is NOSQL used? Cassandra used on: Digg, Facebook, Twitter, Reddit, Rackspace, Cloudkick, Cisco Hadoop used on: Amazon Web Services, Pentaho, Yahoo!, The New York Times CouchDB used on: CERN, BBC, Interactive Mediums MongoDB used on: Foursquare, bit.ly, SourceForge, Fotopedia, Joomla Ads Riak used on: Widescript, Western Communications, Ask Sponsored Listings
  • 58. Data Structured Documents Transactional Relational DW XML JSON Semi-structured Unstructured Video Scientific Data Different Types of Data need Different Solutions
  • 59. Data Sources Prior to NoSQL: Relational, OLAP. Data Sources including NoSQL: Key-Value, Column-Family, Graph, Document
  • 60. Relational • Tabular format • SQL concepts • Tables • Joins • Rows / Columns • SQL Language • Rigid Data Modelling
  • 61. Relational • Built for the business • Dimensions / Facts • Fast reads • Historical Data
  • 62. Key-Value Stores • Keys are used to access blobs of data • Video • Images • A key uniquely identifies each record. • Dictionaries have records that are stored and retrieved using a key. • It is fast because the key uniquely identifies each record. • Data is a single opaque collection
  • 63. Locker Analogy • Keys are used to access blobs of data • Video • Images • A key uniquely identifies each record. • Dictionaries have records that are stored and retrieved using a key. • The Value is simply an object.
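The locker analogy can be sketched with an in-memory dictionary. This is a minimal illustration in Python rather than any particular product's API; the `KeyValueStore` class, its `put` and `get` methods, and the sample keys are hypothetical names chosen for the example.

```python
class KeyValueStore:
    """Toy key-value store: each key opens exactly one 'locker'."""

    def __init__(self):
        self._lockers = {}  # key -> opaque value (bytes)

    def put(self, key, value):
        # The store never inspects the value; it is an opaque blob.
        self._lockers[key] = value

    def get(self, key, default=None):
        # Lookup goes straight to the key; no scan over other records.
        return self._lockers.get(key, default)


store = KeyValueStore()
store.put("video:42", b"\x00\x01\x02")  # made-up video bytes
store.put("image:7", b"\x89PNG")        # made-up image bytes
print(store.get("video:42"))
```

The store can only answer "what is behind this key?"; it cannot search inside the values, which is the trade-off the slides describe as an opaque collection.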
  • 64. Graph Store • Data is stored in nodes, which have properties • They are connected by critical relationships
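As a rough sketch of the graph idea, nodes with properties and relationships between them can be modelled with plain Python structures. The node names, the `KNOWS` and `WORKS_AT` relationship types, and the helper functions are all invented for this example; real graph databases provide their own query languages.

```python
# Nodes carry properties; edges carry a relationship type.
nodes = {
    "alice": {"label": "Person", "age": 34},
    "bob": {"label": "Person", "age": 41},
    "acme": {"label": "Company", "sector": "Retail"},
}
edges = [
    ("alice", "KNOWS", "bob"),
    ("alice", "WORKS_AT", "acme"),
    ("bob", "WORKS_AT", "acme"),
]


def neighbours(node_id, relationship):
    """Nodes reachable from node_id over one relationship of the given type."""
    return [dst for src, rel, dst in edges if src == node_id and rel == relationship]


def colleagues(node_id):
    """Other nodes that share an employer with node_id (a two-hop traversal)."""
    employers = neighbours(node_id, "WORKS_AT")
    return sorted(
        src
        for src, rel, dst in edges
        if rel == "WORKS_AT" and dst in employers and src != node_id
    )


print(colleagues("alice"))
```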
  • 65. Documents • Data stored in nested hierarchies • Logical data remains stored together as a unit • Any item in the document can be queried • Pros: No object-relational mapping layer, ideal for search • Cons: Complex to implement, incompatible with SQL
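A small sketch may help make the document model concrete. It assumes a JSON document and a hypothetical `get_path` helper that follows a dotted path into the nested hierarchy; the order data is invented, and real document databases offer much richer query facilities.

```python
import json

# One order kept together as a single nested document.
order = {
    "order_id": 1001,
    "customer": {"name": "Ada", "address": {"city": "London"}},
    "lines": [
        {"sku": "A1", "qty": 2},
        {"sku": "B7", "qty": 1},
    ],
}


def get_path(document, path):
    """Follow a dotted path such as 'customer.address.city' into a nested document."""
    current = document
    for part in path.split("."):
        # List positions are addressed by integer index, dict fields by name.
        current = current[int(part)] if isinstance(current, list) else current[part]
    return current


serialized = json.dumps(order)    # the whole document is stored as one unit
restored = json.loads(serialized)
print(get_path(restored, "customer.address.city"))
print(get_path(restored, "lines.1.sku"))
```

Because the customer and the order lines travel with the order, no join or object-relational mapping is needed to rebuild it.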
  • 66. Database Availability Online Database Availability Means CAP Theorem (BASE vs ACID) Partitioning and Replication Replication Diagram “Ring” of Consistent Hashing Next …. → Database Integrity
  • 67. What is Database Availability? ● High Availability: the database and application are available during scheduled periods; the system is temporarily down during maintenance periods. ● Continuous Operation: the system is available all the time with no scheduled outages. ● Continuous Availability: a combination of HA and CO; data is always available, and maintenance is done without shutting down the system.
  • 68. CAP Theorem Consistency, Availability and Partition Tolerance. A shared-data system can have at most two of those three.
  • 69. ACID and BASE ACID • Atomicity: All or nothing • Consistency: Any transaction should result in valid tables • Isolation: Transactions are kept separate from one another • Durability: The database will survive a system failure.
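Atomicity and consistency can be illustrated with a short, hedged sketch using Python's built-in `sqlite3` module. The `accounts` table, the names and the `transfer` helper are invented for the example; the CHECK constraint stands in for a "valid tables" rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"  # consistency rule: no overdrafts
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money in one transaction: both updates apply, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "alice", "bob", 150)  # would overdraw alice
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, balances)
```

The credit to bob is applied first and then undone when the debit fails, so the table never shows a half-finished transfer.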
  • 70. BASE • Basically Available: the system seems to work all the time • Soft State: it doesn't have to be consistent all the time • Eventually Consistent: it becomes consistent at some later time
  • 71. Scalability • Vertical scale: improving the server specification by adding more processors, RAM, and storage devices. Limited and expensive. • Horizontal scale: adding more cheap computers as server expansion. Requires sharding and partitioning, which are hard to implement and expensive using relational databases (RDBMS).
  • 72. Partitioning • Sharing the data between different nodes • Each node is placed on a ring • Advantage: ability to scale incrementally • Issue: non-uniform data distribution (data host)
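The ring idea can be sketched as a small consistent-hashing example. This is an illustrative toy, not the implementation used by any specific database: the `HashRing` class, the node names and the choice of MD5 as the hash function are assumptions made for the example.

```python
import bisect
import hashlib


def ring_position(key):
    """Map any string to a position on the ring using a stable hash."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes):
        self._positions = []  # sorted ring positions
        self._owner = {}      # position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        pos = ring_position(node)
        bisect.insort(self._positions, pos)
        self._owner[pos] = node

    def node_for(self, key):
        # Walk clockwise to the first node at or after the key's position,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect_left(self._positions, ring_position(key))
        return self._owner[self._positions[idx % len(self._positions)]]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}

ring.add_node("node-d")  # scale out by one node
after = {k: ring.node_for(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(len(moved), "of", len(keys), "keys moved")
```

Only keys that now fall to the new node change owner, which is what allows incremental scaling. Counting keys per node in `before` also tends to show the uneven distribution noted as an issue on the slide.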
  • 74. • NoSQL solutions need to solve real-world business problems • Search • High Availability • Agility
  • 75. • Big Data is not the same as NoSQL. • NoSQL is more than dealing with big datasets. • NoSQL includes concepts that can be managed by a single processor • However, big data problems are a primary use case for NoSQL.
  • 76. One or many databases? One Database • Easy to understand • Easy to set up and configure • Easy to administer • Single source • Limited scalability
  • 79. Big Data Problems Big Data Read-mostly Documents Full Text Event Log Real Time Batch Graph Read-write Transactions Transactions
  • 80. Why do databases fail? • Anything that can go wrong, will go wrong – Murphy’s law. • Human error • Network failure • Hardware failure • Security
  • 81. What can we do to support Hadoop? • Hadoop helps manage and process large datasets • Hadoop provides linear scalability • Hadoop brings computing logic to the data rather than bringing the data to computing logic.
  • 82. Hadoop Clustering basics • Hadoop uses a cluster for data storage and computation purposes. • It is used to write and run distributed applications that process huge amounts of data
  • 83. What is the purpose of Hive? Hive is a data warehousing system for Hadoop To meet the needs of businesses, data scientists, analysts and BI professionals Data, Summarized Fit a structure onto data Data, Analyzed Analysis of Large Datasets stored in Hadoop File Systems SQL-Like language called HiveQL Custom mappers and reducers when HiveQL isn’t enough
  • 86. What can Hive offer you? Hive can help with a range of business problems: • Log Processing • Predictive Modelling • Hypothesis testing • And Business Intelligence
  • 87. Hive is not a replacement for SQL So don’t throw out your SQL Server instances! • Hive is for processing large data sets that may span hundreds, or even thousands, of machines • Hive has a high overhead for starting a job. It translates queries to MapReduce, so it takes time • Hive does not cache data, unlike SQL Server • Hive performance tuning is mainly Hadoop performance tuning • The query engines are similar, but the architectures are different and serve different purposes
  • 88. HiveQL HiveQL is a SQL-like language It outputs naturally occurring groups for further analysis Easy Data Summarization Large Datasets, summarized Fit a structure onto data Analysis of Large Datasets stored in Hadoop file systems SQL-Like language called HiveQL Custom mappers and reducers when HiveQL isn’t enough
  • 89. HiveQL Queries like SQL Queries? Similarities in Syntax and Features Similar features • SELECT • FROM • WHERE • GROUP BY / HAVING • Table Aliases • Computed Columns
  • 90. HiveQL Queries like SQL Queries? Similarities in Syntax and Features Similar features • Aggregate Functions • Nested Select • CASE • LIKE / RLIKE • JOIN • ORDER BY / SORT BY
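To show the shared clauses in one place, the sketch below runs a query from Python against an in-memory SQLite database. SQLite stands in for Hive here purely so the example is runnable; that substitution is an assumption of this sketch, and the `sales` and `regions` tables are invented. The SELECT statement uses only clauses named on the two slides above (RLIKE and SORT BY are Hive-specific and are not shown).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE regions (region_id INTEGER, region_name TEXT);
CREATE TABLE sales (region_id INTEGER, product TEXT, amount REAL);
INSERT INTO regions VALUES (1, 'North'), (2, 'South');
INSERT INTO sales VALUES
    (1, 'Widget A', 120.0), (1, 'Widget B', 80.0),
    (2, 'Widget A', 30.0),  (2, 'Gadget',   500.0);
""")

query = """
SELECT r.region_name,                          -- table alias
       SUM(s.amount) AS total,                 -- aggregate function
       CASE WHEN SUM(s.amount) >= 150
            THEN 'high' ELSE 'low' END AS band -- computed column with CASE
FROM sales s
JOIN regions r ON s.region_id = r.region_id
WHERE s.product LIKE 'Widget%'
GROUP BY r.region_name
HAVING SUM(s.amount) > 0
ORDER BY total DESC
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```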
  • 91. How does Hive work? Hive as a Translation Tool Compiles and executes queries Hive translates the SQL Query to a Map Reduce Job These are chained together Queries are compiled and executed
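As a conceptual sketch of what "translating a query to a Map Reduce job" means, a `SELECT status, COUNT(*) ... GROUP BY status` style query can be expressed as a map step, a shuffle and a reduce step. This is plain Python written for illustration, not Hive's actual compiler output; the log lines and function names are made up.

```python
from collections import defaultdict

log_lines = [
    "GET /home 200",
    "GET /cart 500",
    "POST /cart 200",
    "GET /home 200",
]


def map_phase(line):
    """Emit (key, 1) for the GROUP BY column, here the HTTP status."""
    status = line.split()[2]
    yield status, 1


def shuffle(pairs):
    """Group all values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Aggregate the values for one key (COUNT(*) becomes a sum of ones)."""
    return key, sum(values)


pairs = [pair for line in log_lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts)
```

A more complex query would need several such jobs, with the output of one feeding the next, which is the chaining the slide refers to.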
  • 92. How does Hive work? Hive as a structuring Tool Creates a schema around the data Tables stored in Directories Hive Tables Rows and columns, like SQL tables Hive Metastore Namespace with a set of tables Holds table definitions Physical Layout Column Types Partition Information
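The idea of creating a schema around existing data can be sketched as follows. The `table_definition` dictionary plays the role of a metastore entry and `read_table` applies it when the raw text is read; both names, the `page_views` table and the sample rows are hypothetical, and the real Hive Metastore stores considerably more detail.

```python
import csv
import io

# Hypothetical table definition, playing the role of a metastore entry.
table_definition = {
    "name": "page_views",
    "columns": [("user_id", int), ("url", str), ("duration_s", float)],
    "delimiter": "\t",
}

# Raw file contents as they might sit in a directory; no schema is embedded.
raw_data = "101\t/home\t3.5\n102\t/cart\t12.0\n"


def read_table(raw_text, definition):
    """Apply column names and types to raw delimited text at read time."""
    reader = csv.reader(io.StringIO(raw_text), delimiter=definition["delimiter"])
    for fields in reader:
        yield {
            name: cast(value)
            for (name, cast), value in zip(definition["columns"], fields)
        }


rows = list(read_table(raw_data, table_definition))
print(rows)
```

The raw data is never rewritten; changing the table definition changes how the same files are interpreted.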
  • 93. Hive and SQL Data Types (Hive → SQL) • Tinyint → Tinyint • SmallInt → Smallint • Int → Int • BigInt → BigInt • Boolean → Bit (setting as NOT NULL) • Float → Float • Double → Real • BigDecimal → Decimal
  • 94. Hive and SQL Data Types (Hive → SQL) • String → Char, varchar, nvarchar, ntext, text, image • Binary → binary • Timestamp → Timestamp (note that this is being deprecated), RowVersion
  • 95. Hive Mathematical Operations • Plus • Negative • Addition • Subtraction • Multiplication • Division • Modulus • Primitive Types • Complex Types • Arrays • Maps • Structs • Union
  • 96. Power View Power Map • Highly Visual Design Experience • Power View is an interactive, ad hoc, query and visualization experience. • It is for business question ‘mystery’ solving • Power Map is a new 3D visualization add-in for Excel helping you to analyse geographical and temporal data • Mapping • Exploring • Interacting Different Tools for Different Jobs
  • 97. Hive and Pig: Similarities Hive and Pig are great at crunching large amounts of data from HDFS to database Both compile to Map Reduce jobs Pig is Procedural, Hive is Declarative Hive is much closer to SQL in terms of querying – this can be a good or a bad thing!
  • 98. Hive and Pig: Differences • Pig: Procedural. Hive: Declarative. • Pig: Fits cleanly into pipeline paradigm; no need for temporary tables. Hive: Temporary tables are ubiquitous but can be disjointed; may involve clean up. • Pig: Greater control over dataflow (checkpoints; naturally handles splitting of data streams). Hive: SQL expects one result and works towards it; handles trees but not splits. • Pig: Optimizing done by developer. Hive: Optimisation is passed to the Hive Query Optimizer.
  • 99. Hive and Pig: When are they best used? Different Tools with Different Jobs • Pig is akin to SSIS: great for dataflows and automated batch jobs • Hive is akin to ad-hoc, analytical SQL queries: results that make sense of the data
  • 100. Why, Who & How of Power BI More Specialized BI Pros Power Users Decision Makers Business Analysts Information Workers Self-Service • Power Pivot • Power View • Power Query • Power Map Clients • Excel Services • Office Professional
  • 101. Easy Access to Data, Big and Small
  • 102. Easy Access to Data, Big and Small
  • 103. Microsoft Power BI for Office 365 1 in 4 enterprise customers on Office 365 1 Billion Office Users Analyze Visualize Share Find Q&A Mobile Discover Scalable | Manageable | Trusted
  • 104. Power Query • Enable self-service data discovery, query, transformation and mashup experiences for Information Workers, via Excel and PowerPivot • Discovery and connectivity to a wide range of data sources, spanning volume as well as variety of data. • Highly interactive and intuitive experience for rapidly and iteratively building queries over any data source, any size. • Consistency of experience, and parity of query capabilities over all data sources. • Joins across different data sources; ability to create custom views over data that can then be shared with team/department.
  • 105. Power Query Discover, combine, and refine Big Data, small data, and any data with Data Explorer for Excel.
  • 107. Power Query Data Sources Windows Azure Marketplace Windows Active Directory Azure SQL Database Azure HDInsight
  • 108. Analyse and Model with Excel 2013
  • 110. Powerful Self-Service BI with Excel 2013
  • 111. Power View – Business Mysteries, Solved Power View is an interactive data exploration, visualization, and presentation experience Highly visual design experience Rich meta-driven interactivity Presentation-ready at all times It delivers intuitive ad-hoc reporting for business users
  • 112. Introducing Power View It is now also available in Excel 2013, and with new features: • Maps • Pie charts • Hierarchies • KPIs • Drill down/Drill up • Report styles, themes and text resizing • Backgrounds with images • Hyperlinks • Printing
  • 113. Power View in Excel Excel Database server SQL AS (Tabular) Power View SQL RS ADOMD.NET SQL AS (PowerPivot)
  • 114. Power View in SharePoint Browser SharePoint web server Database server SharePoint app server SQL AS (PowerPivot) SQL AS (Tabular) SQL RS Add-In SQL RS Power View
  • 115. Powerful Self-Service BI with Excel 2013
  • 116. What Is Power Map? Power Map for Microsoft Excel enables information workers to discover and share new insights from geographical and temporal data through three-dimensional storytelling.
  • 117. Map Data • Data in Excel • Geo-Code • 3D and 3 Visuals Discover Insights • Play over Time • Annotate points • Capture scenes Share Stories • Cinematic Effects • Interactive Tours • Share Workbook Power Map: Steps to 3D insights
  • 119. Power Map Excel Add-in to Enhance Data Visualization Map data, discover insight, and share stories