1. Why do businesses need to keep a watch over their Big Data sources?
Vijay P Rao
2. Changing Paradigms over the decades
1950 to 1980
IBM ruled the roost and its business revolved around selling hardware
1980 to 2000
Microsoft dominated and the big bucks lay in the software that drove the hardware
2000 to 2010
Google changed the paradigm yet again by focusing on productivity-enhancing apps, data indexing and data fetch
2010 onwards
The likes of Facebook and Twitter are relying on enormous data and content to drive business value
3. The Data Explosion
From the dawn of civilization till 2003, we had created 5 exabytes of information.
Now we are creating that much information every 2 days.
Eric Schmidt, former CEO of Google
6. When do Businesses need Big Data Technologies?
Volume
Scale: GBs, TBs, PBs, EBs, ZBs
Need for Big Data technologies (along the scale): No, Maybe, Definitely
As we move across the volume spectrum, SQL-based data processing technologies start breaking down from a “Processing Time Taken v/s Value Delivered” perspective. This is where Big Data processing technologies enter.
Velocity
Scale: GB/s, TB/s, PB/s and above
Need for Big Data technologies (along the scale): No, Maybe, Definitely
As we move across the velocity spectrum, drawing insights from low gigabytes of data flowing in per second may still be in the realm of traditional analytic tools. The need for Big Data processing tools will be felt as we start moving to higher GB volumes per second.
Variety
Scale: 1, 2, 3, 4, more varieties of data
Need for Big Data technologies (along the scale): No, Maybe, Definitely
One or two varieties of data do not pose a major problem for traditional data processing technologies. As varieties grow beyond 2 or 3, the need for Big Data technologies is increasingly felt.
Combining two or more dimensions, or adding the 4th dimension of variability, forces enterprises to look beyond traditional data handling technologies and at Big Data processing technologies starting right at gigabyte volumes.
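The rules of thumb on this slide can be written down as a small decision function. What follows is only a sketch: the byte thresholds, the cutoff used for "low gigabytes per second", and the rule that two stressed dimensions escalate the answer to "Definitely" are assumptions picked for illustration, not figures from the slide. The function names are hypothetical.

```python
# Illustrative only: every threshold below is an assumption, not a value from the slide.
GB = 10**9
TB = 10**12
PB = 10**15

LEVELS = ("No", "Maybe", "Definitely")


def volume_level(total_bytes):
    """Map stored data volume to 0 (No), 1 (Maybe) or 2 (Definitely)."""
    if total_bytes < TB:
        return 0
    if total_bytes < PB:
        return 1
    return 2


def velocity_level(bytes_per_second):
    """Map incoming data rate to 0, 1 or 2."""
    if bytes_per_second < 10 * GB:  # assumed reading of "low gigabytes per second"
        return 0
    if bytes_per_second < TB:
        return 1
    return 2


def variety_level(varieties):
    """Map the number of distinct data varieties to 0, 1 or 2."""
    if varieties <= 2:
        return 0
    if varieties == 3:
        return 1
    return 2


def assess_need(total_bytes, bytes_per_second, varieties):
    """Combine the three dimensions into 'No', 'Maybe' or 'Definitely'."""
    levels = [
        volume_level(total_bytes),
        velocity_level(bytes_per_second),
        variety_level(varieties),
    ]
    result = max(levels)
    # Assumed encoding of "combining two or more dimensions": if at least
    # two dimensions are already at "Maybe" or above, escalate to "Definitely".
    if sum(level > 0 for level in levels) >= 2:
        result = 2
    return LEVELS[result]
```

For example, under these assumed thresholds `assess_need(5 * TB, 1 * GB, 2)` gives "Maybe", while raising the variety count to 3 pushes the same workload to "Definitely".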
8. How Big Data is re-defining Analytics
Basic Analytics
• Performance Management
• Question answered: What happened in the past?
Advanced Analytics
• Techniques: Complex Event Processing, Natural Language Processing, Multivariate Statistical Analysis, Text Mining, Time-series Analysis, Entity Extraction, Data Mining, Sentiment Analysis, Predictive Modeling, Semantic Analysis, Ensemble Modeling, Behavioral Analytics, Constraint-Based Optimization, Social Network Analysis, Social Media Analytics
• Questions answered: What is happening at this moment? What will happen? What is most likely to happen? What might happen if we give it a little nudge?
9. Big Data Business Use Cases - examples
Industry: Business use case examples
Media & Entertainment: Cross-channel marketing attribution, Ad revenue optimization, Consumer segmentation & micro-targeting, Predictive behavioral targeting, Web analytics, Campaign tracking and management
Retail: Web analytics, Personalization, Recommendation engines, Lifecycle marketing, Loyalty program, Forecasting, Inventory optimization
Energy & Utilities: Preventive Asset maintenance, Graph analysis, Recommendation engines, Smart Metering
Financial Services: Bottom-up risk analysis, Line of business link and fraud analysis, Cross-account referral analysis, Lifecycle marketing, Counterparty network risk analysis
Communications: Targeted marketing promotions, Fraud analysis and prevention, Customer churn prevention analysis, Network optimization, CDR (call detail record) analysis
Insurance: Policy pricing engine, Customer retention, Re-insurance risk assessment, Loyalty campaign targeting
12. Startups making a Mark – Modak Analytics
Modak Analytics, a three-year-old start-up, has collected and sifted through a whopping 18 terabytes of data, including 10 TB in .pdf format (one TB is 1,000 gigabytes, or GB).
• Of the 13.4 crore voters in Uttar Pradesh, the country’s biggest State by number of voters, at least 1.2 crore
people have Ram somewhere in their name.
• In Andhra Pradesh, the name Srinivas is spelt in 600 different ways.
• About three lakh women in Gujarat have Gita Ben as their first name.
• Bihar is home to 3.27 lakh women with Sita as their first name and an almost equal number of women
named Geeta.
• Ramesh seems to be the most common first name across the country.
• Other popular names include Lakshmi (19.28 lakh, Andhra Pradesh), Fernandes (81,000, Goa), Shankar (11.41 lakh) and Patil (24 lakh, Maharashtra).
• The two longest voter names are registered in Andhra Pradesh: E Janake Sathya Surya Vijaya Durga Maheshvari in Sangareddy constituency and Venkata Sathya Suriya Maitreyi Kumari Toleti in Narsapur constituency.
• There is a comedy of errors too. In Chhattisgarh, the age of a voter is marked as 19,545 years, while 64 voters in AP are ‘0’ years of age.
The 10-employee strong self-funded firm hopes to register a turnover of Rs 1 crore this year.
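Errors like the 19,545-year-old voter are the kind of thing a simple range check can surface. The sketch below is an illustration only, not Modak Analytics' method: the record layout, the field names and the upper age bound of 120 are assumptions, and the sample records are placeholders that reuse the two ages quoted above.

```python
# Illustrative sketch: field names and the upper age bound are assumptions.
MIN_VOTING_AGE = 18      # minimum voting age in India
MAX_PLAUSIBLE_AGE = 120  # assumed upper bound for a living voter


def flag_implausible_ages(records, min_age=MIN_VOTING_AGE, max_age=MAX_PLAUSIBLE_AGE):
    """Return records whose 'age' is missing, not an integer, or out of range."""
    flagged = []
    for record in records:
        age = record.get("age")
        if not isinstance(age, int) or not (min_age <= age <= max_age):
            flagged.append(record)
    return flagged


# Placeholder records; only the two odd ages come from the slide.
sample = [
    {"voter_id": "A1", "state": "Chhattisgarh", "age": 19545},
    {"voter_id": "A2", "state": "Andhra Pradesh", "age": 0},
    {"voter_id": "A3", "state": "Goa", "age": 42},
]

suspect = flag_implausible_ages(sample)
print([record["voter_id"] for record in suspect])  # prints ['A1', 'A2']
```

A check like this only flags records for review; deciding whether to correct or discard them is a separate step.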
13. How should businesses be responding?
BIG DATA, BEAUTIFUL DATA: Know more, make better decisions
Manage
• Preventing IT systems from being overrun by data
Unearth Patterns
• Unearth key patterns to help optimize business processes and reduce OpEx
Monetize
• Seeking revenue opportunities through new business models and offerings
Evolve Insights
• Data Overlay
• Patterns Visualization
Outcomes: Bottom Line Savings, Top Line Growth
14. “Organizations that leverage big data will financially outperform their peers by 20 percent or more”
IBM 2010 Global CFO Study
“It’s not just big data in the sense that we have lots of data. You can also think of it as ‘nano’ data, in the sense that we have very, very fine-grained data – an ability to measure things much more precisely than in the past”
Erik Brynjolfsson, MIT
15. Companies will spend BIG on Big Data Analytics
Sectors shown: Financial Services, Software / Internet, Government, Comms. & Media, Energy & Utilities
16. Big Data Analytics - Advantages
• Move beyond Linear Approximation
• Predictive analyses using data mining and statistical modeling
• Allows detection of “Black Swans”
• Extraction and Analysis of Low Incidence Populations
• Doing away with the need for Statistical Significance
17. Data Interpretation is no joke…
• The sheer volume and velocity of data involved call for newer
methods through which the human mind can interpret the data to
take decisions.
• HANS ROSLING – WORLD DATA
• Visually comprehend the relationship between actual efficacy and
popularity of ‘super foods’
• http://www.informationisbeautiful.net/visualizations/snake-oil-superfoods/
18. Careers in Big Data
• Data scientist
• Often at the top of the big data hierarchical chart
• Typically proven professionals who possess deep analytical talent
• Data architect
• Computer programmers who are skilled in working with undefined data and
disparate types of data
• Data visualizer
• Professionals who are able to translate data into information that people can
effectively use
• Data change agent
• Use data analytics to recommend and drive changes within an organization
• Data engineer and operator
• Designers, builders and managers of big data systems
19. Big Questions on Big Data Analytics
that you need to ask if you want your business to succeed in this NEW DATA DRIVEN WORLD
• What happens in a world of radical transparency, with data widely
available?
• If you could test all your decisions, how would that change the way you
compete?
• How would your business change if you used big data for widespread, real-time customization?
• How can big data augment or even replace Management?
• Could you create a new business model based on data?
• …
20. Big Data Driven Disruption
Enterprise Data Warehouses
• Relational Database
• Structured Data
• Analytics: Traditional BI, OLAP & Reporting, Relational Queries
Big Data Disruption
• Non-Relational Databases
• Un-Structured Data and Structured Data
• New Data Sources: Sensors, Logs, Apps, Bots, Crawlers, Streams
• Flat files, raw text, system logs, images, audio, video
• Analytics: Data Exploration, Multi-Dimensional Data Overlay, Pattern Recognition, Spatial Analysis, Predictive Modeling, Simulation, Unsupervised Learning, Natural Language Processing, Cluster Analysis, Visualization