BigData Use-cases
- Prepared by
- Vishal Shukla
- Pranav Shukla
- Krishna Meet
Brevitaz Overview
● Founded in 2014
● Small team of technocrats delivering Big Data Solutions
● Global client-base in Europe and Asia-Pacific region
● Expertise
○ Full-text search
○ Real-time analytics
○ Log analytics
○ BigData analytics
○ IoT based solutions
○ Machine learning
○ BigData warehousing
● Technologies
○ Spark, Hadoop, Kafka, Flume, Storm
○ Elasticsearch, Logstash, Kibana
○ MongoDB, Cassandra, HBase, Apache Titan
○ Impala, Spark SQL, HAWQ
○ Java & Spring stack, Typesafe stack (Scala, Akka, Spray, Slick)
○ AngularJS
Agenda
➔ Big Data & Analytics
➔ Full-text search
➔ Log analytics
➔ Big Data Analytics
➔ Real-time Analytics
➔ IoT Analytics
➔ Machine Learning on BigData
➔ Big Data Warehousing
➔ Big Data is for Everyone
Analytics
Data is growing!
Full-text
Search
Spot the right data
quickly
“It’s all about being able to spot the
right information at the right time”
◎ Relevance search in near real-time
○ Find results matching “iphone”. Please don’t show me
iPhone chargers on the first page.
◎ Fuzzy search and search suggestions
○ Find results matching “iphne”
◎ Faceted search
○ Filters on Amazon after searching for a keyword
◎ Complex search with multiple criteria
○ Find me products matching “iphone” within the price range
30000 INR to 50000 INR and with color “Space grey”
◎ Geo-spatial search
○ Find restaurants within 10 km radius from my current
location. And yes, I want to see closer ones on top.
Full-text search - What is it?
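To make the fuzzy-search bullet concrete, here is a minimal sketch that matches a misspelled query such as “iphne” against a small list of terms using Levenshtein edit distance. The term list, the function names and the distance threshold of 1 are illustrative assumptions; a search engine would normally do this against an index rather than a Python list.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character inserts, deletes or substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or keep)
        prev = curr
    return prev[-1]


def fuzzy_match(query, terms, max_distance=1):
    """Return the terms within max_distance edits of the query, closest first."""
    q = query.lower()
    scored = sorted((edit_distance(q, t.lower()), t) for t in terms)
    return [t for d, t in scored if d <= max_distance]


# Hypothetical search terms, for illustration only.
TERMS = ["iphone", "ipad", "charger", "phone"]

if __name__ == "__main__":
    print(fuzzy_match("iphne", TERMS))
```

With these sample terms, only “iphone” is one edit away from “iphne”, so it is the single suggestion returned.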
What are you talking about! I am here for
BigData!
Elasticsearch - does all of these for massive
volume, variety and velocity
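As a sketch of how two of the searches above might be expressed in Elasticsearch's query DSL, the helpers below build request bodies as plain Python dictionaries. The field names (`name`, `price`, `color`, `location`) are assumptions about a hypothetical index, with `color` mapped as a keyword field and `location` as a geo_point field. Nothing here contacts a cluster.

```python
import json


def product_query(text, min_price, max_price, color):
    """Full-text match on the name, filtered by price range and exact color."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {"name": text}}],
                "filter": [
                    {"range": {"price": {"gte": min_price, "lte": max_price}}},
                    {"term": {"color": color}},
                ],
            }
        }
    }


def nearby_query(lat, lon, radius_km=10):
    """Documents within radius_km of a point, with the closest ones first."""
    point = {"lat": lat, "lon": lon}
    return {
        "query": {
            "bool": {
                "filter": [
                    {"geo_distance": {"distance": f"{radius_km}km", "location": point}}
                ]
            }
        },
        "sort": [
            {"_geo_distance": {"location": point, "order": "asc", "unit": "km"}}
        ],
    }


if __name__ == "__main__":
    # Hypothetical inputs: the price range and color from the slide, and an arbitrary point.
    print(json.dumps(product_query("iphone", 30000, 50000, "Space grey"), indent=2))
    print(json.dumps(nearby_query(23.02, 72.57), indent=2))
```

Putting the price and color conditions under `filter` keeps them out of relevance scoring, so only the text match decides the ranking.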
◎ Crawl third-party websites
◎ Aggregate and classify the data
◎ Develop custom application on top of classified
data
Use-case - Information Aggregator
◎ Google’s “Did you mean?”
◎ Search suggestions as you type
◎ Text analytics
◎ McGraw-Hill - Transform text-books into digital
learning resources
◎ SoundCloud - Help users quickly find music that
interests them
Other use-cases
Log
Analytics
Collect, analyze and
improve
“Transform your dumb logs into
actionable insights”
● Use machine generated logs to get operational
insights
● Sensors, application servers, web servers or any
IoT device logs
To interactively answer questions like...
◎ How many users signed up this week?
◎ How are users using your website / mobile app?
◎ How successful is our advertising campaign?
◎ Why is the database slow?
◎ Which website categories is my team spending
the most time on?
◎ Which employees are likely to resign next?
Log Analytics - What is it?
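A question like “How many users signed up this week?” comes down to parsing log lines and counting events. The sketch below assumes a made-up log format (`<timestamp> <level> event=<name> user=<id>`) and made-up sample lines; real logs would need their own pattern and a log pipeline rather than an in-memory list.

```python
import re
from collections import Counter

# Assumed log format, for illustration only: "<timestamp> <level> event=<name> user=<id>"
LINE_RE = re.compile(
    r"^(?P<ts>\S+) (?P<level>\w+) event=(?P<event>\w+) user=(?P<user>\w+)$"
)


def count_events(lines):
    """Count occurrences of each event name, skipping lines that do not parse."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not in the assumed format
        counts[match.group("event")] += 1
    return counts


# Hypothetical sample data.
SAMPLE_LOGS = [
    "2016-03-01T10:15:00 INFO event=signup user=alice",
    "2016-03-01T10:16:12 INFO event=login user=bob",
    "2016-03-02T08:01:45 INFO event=signup user=carol",
    "malformed line",
    "2016-03-02T09:30:00 WARN event=login_failed user=bob",
]

if __name__ == "__main__":
    print(count_events(SAMPLE_LOGS))
```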
What’s the big deal!
Use-case - Network Logs Analysis
◎ High velocity
◎ High volume
◎ Collect, analyze and improve
◎ Analyze click stream data to provide
personalized offers and user experience
◎ Interactive drill-down analysis
◎ Compliance reporting through interactive
dashboard
◎ Real-time alerts on invalid login attempts
◎ Detect outages
◎ Multi-channel funnel reporting for your
Advertising campaigns to find out which
channels contribute the most for conversions
Other use-cases
BigData
Analytics
Make your data speak
“Combine all sources of data to
uncover hidden patterns and
unknown relations in your data”
● Take your transactional data from various
sources
● Take operational and user behaviour logs data
● Collect social data
● Combine data collected from various sources
To interactively answer questions like...
◎ What is the increase or decrease in sales over the
years?
◎ How many unique customers are acquired this
year?
◎ Which products are trending disproportionately
this year?
Big Data Analytics - What is it?
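The first two questions above can be sketched as a small aggregation over transaction records. The record layout `(year, customer_id, amount)` and the sample values are assumptions for illustration. At real scale the same grouping would run on an engine such as Spark rather than in one Python process.

```python
from collections import defaultdict


def yearly_summary(transactions):
    """Per year: total sales, change versus the previous year, and newly acquired customers."""
    sales = defaultdict(float)
    customers = defaultdict(set)
    for year, customer_id, amount in transactions:
        sales[year] += amount
        customers[year].add(customer_id)

    summary = {}
    seen = set()        # customers seen in any earlier year
    previous = None     # sales total of the previous year
    for year in sorted(sales):
        new_customers = customers[year] - seen
        summary[year] = {
            "sales": sales[year],
            "change": None if previous is None else sales[year] - previous,
            "new_customers": len(new_customers),
        }
        seen |= customers[year]
        previous = sales[year]
    return summary


# Hypothetical sample data: (year, customer_id, amount).
TRANSACTIONS = [
    (2014, "c1", 100.0),
    (2014, "c2", 50.0),
    (2015, "c1", 120.0),
    (2015, "c3", 80.0),
    (2015, "c4", 30.0),
]

if __name__ == "__main__":
    for year, row in yearly_summary(TRANSACTIONS).items():
        print(year, row)
```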
Use-case - Supply chain management
◎ RFID labels can indicate which product is where
at what time
◎ Get more accurate business insights
◎ Theft detection
◎ Social media sentiment analysis to get end-user
feedback on launched products
◎ Identify market trends
◎ Predict employee attrition
◎ Customer churn analysis
◎ Influencer analysis
◎ Lead generation
◎ Proactive issues monitoring
◎ For insurance companies, identify potential
customers by combining birth, marriage and
health data
Other use-cases
Real-time
Analytics
Analyse instantaneously
as you collect data
“
Lag of seconds can make a
fraudster and you
● Ingest streaming data, possibly at high velocity
● Analyse and react immediately
To solve problems like...
◎ Identify changing trends in real-time
◎ Detect fraud
◎ Analyse policy violations and react immediately
◎ Reduce downtimes
◎ Provide better and quicker business decisions
Real-time Analytics - What is it?
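One simple way to “analyse and react immediately” is a sliding-window counter over the incoming stream. The sketch below flags a key (for example a card or an account) once it produces more than a set number of events inside a time window. The class name, the thresholds and the assumption that timestamps arrive in order are all illustrative; a production system would run this logic inside a stream processor.

```python
from collections import defaultdict, deque


class BurstDetector:
    """Flags a key once it exceeds max_events within the last window_seconds."""

    def __init__(self, max_events, window_seconds):
        self.max_events = max_events
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # key -> timestamps inside the window

    def observe(self, key, timestamp):
        """Record one event; return True if the key is now over the limit.

        Assumes timestamps (in seconds) arrive in non-decreasing order per key.
        """
        window = self._events[key]
        window.append(timestamp)
        # Drop events that have fallen out of the window.
        while timestamp - window[0] > self.window_seconds:
            window.popleft()
        return len(window) > self.max_events


if __name__ == "__main__":
    detector = BurstDetector(max_events=3, window_seconds=60)
    # Hypothetical stream of (key, timestamp) events.
    for key, ts in [("card-1", 0), ("card-1", 10), ("card-1", 20), ("card-1", 30)]:
        print(key, ts, detector.observe(key, ts))
```

Because each key keeps only the timestamps still inside the window, memory stays proportional to recent activity rather than to the whole stream.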
Use-case - Enrich Customer Experience
◎ Get real-time feeds about customer location or
products being browsed
◎ Combine with historical user behaviours
◎ Roll out offers in real-time
◎ Hospitality Industry
○ Bad weather reduces travel, which then
reduces overnight lodging
○ Combine weather data with flight
cancellation to identify stranded travellers
○ Offer hotel coupons based on nearby
location.
Other use-cases
◎ Fraud detection
◎ Predict and enrich customer experience based on
location, lifestyle
◎ Real-time process visibility across an enterprise
◎ Suggest optimal routes based on current traffic
data
◎ Get player performance metrics in real-time to
substitute players at the right time
Other use-cases
IoT
Analytics
Let machines communicate
● Use sensors to detect low level data
● Report the captured data to server
● Analyse and get back to user
To provide smart alerts and suggestions like
◎ Schedule maintenance of machines
◎ Your pulse rate is disproportionately increasing
◎ Medicines manufactured in a batch are not
complying with standards
IoT Based Smart Solutions - What is it?
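The pulse-rate alert above can be sketched as a comparison of each new sensor reading with a rolling baseline. The baseline size, the 1.25x threshold and the sample readings are arbitrary illustrative choices, not medical guidance.

```python
from collections import deque
from statistics import mean


def pulse_alerts(readings, baseline_size=5, threshold=1.25):
    """Return (index, bpm) for readings above threshold x the mean of the previous readings."""
    recent = deque(maxlen=baseline_size)  # rolling baseline of the latest readings
    alerts = []
    for index, bpm in enumerate(readings):
        # Only compare once a full baseline has been collected.
        if len(recent) == baseline_size and bpm > threshold * mean(recent):
            alerts.append((index, bpm))
        recent.append(bpm)
    return alerts


# Hypothetical readings in beats per minute.
READINGS = [70, 72, 71, 69, 73, 72, 110, 74]

if __name__ == "__main__":
    print(pulse_alerts(READINGS))
```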
Use-case
◎ Performance measurement & maintenance
schedule
DIAGRAM
◎ In agriculture, sensors can detect crop health
along with geo data, and alerts based on that can
be sent to farmers telling them where to focus
◎ In retail, smart-shelves can detect and send
alerts on when to replenish
◎ Smart home can analyze the patterns of each
family member and optimize energy usage
Other use-cases
Machine
Learning
on
BigData
Make the machines
learn from data
What is machine learning?
◎ Machine learning is not programming a machine to
do stuff
◎ Machine learning is making the machine learn and
adapt based on the observed data
Where is machine learning used?
● Identify similarities between products, users
● Predict values from past data
● Classify items into categories, such as whether an
email is spam or not spam
in order to ...
◎ Predict expected outcome
◎ Categorize large amounts of data
◎ Optimize algorithms or paths
◎ Find similarities
◎ Improve quality of predictions continuously
“Recommending the right products
makes the difference between
selling or not selling a product”
Use-case - Recommending Products
◎ Compare thousands of
users/products with each other
to find similar “clusters”
◎ Content-based filtering -
Recommend similar products
to what customer has already
bought
◎ Find customers similar to the
current customer and
recommend what they
have bought
◎ Apply what are known as
clustering algorithms in
machine learning on Big Data
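The “find similar customers and recommend what they have bought” bullet can be sketched with a simple neighbourhood approach: score every other customer by Jaccard similarity of purchase sets, then rank the items the current customer has not bought yet. This is a stand-in for illustration, not the clustering algorithm the slide refers to, and the customer names and purchases are made up.

```python
def jaccard(a, b):
    """Similarity of two sets: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(customer, purchases, top_n=3):
    """Rank items bought by similar customers that this customer does not own yet."""
    own = purchases[customer]
    scores = {}
    for other, items in purchases.items():
        if other == customer:
            continue
        similarity = jaccard(own, items)
        if similarity == 0:
            continue  # nothing in common, ignore this customer
        for item in items - own:
            scores[item] = scores.get(item, 0.0) + similarity
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:top_n]


# Hypothetical purchase history.
PURCHASES = {
    "asha": {"phone", "case", "charger"},
    "ben": {"phone", "case", "headphones"},
    "chen": {"phone", "tablet"},
    "dev": {"novel"},
}

if __name__ == "__main__":
    print(recommend("asha", PURCHASES))
```

Comparing every pair of customers grows quadratically, which is one reason real systems group customers into clusters or use approximate neighbour search first.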
Use-case - Optimise team combination in Sports
◎ Choose best performing team with limited
budget
◎ It was first applied in baseball; now many
professional sports use these techniques
◎ Choose a team consisting of players who could
win at least enough games to make it to the play-offs
◎ Use data analysis techniques to find undervalued
players
Use-case - Sports
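Choosing the best performing team within a limited budget can be framed as a 0/1 knapsack problem. The sketch below assumes each player has an integer cost and a single projected "value" score. Both the framing and the sample figures are assumptions for illustration; the slides do not say which technique the teams actually used.

```python
def best_team(players, budget):
    """Pick players maximising total value with total cost <= budget.

    players: iterable of (name, cost, value) with integer costs.
    Returns (total_value, names) using 0/1 knapsack dynamic programming.
    """
    # best[b] holds the best (value, names) achievable with cost at most b.
    best = [(0, ())] * (budget + 1)
    for name, cost, value in players:
        # Walk budgets downwards so each player is used at most once.
        for b in range(budget, cost - 1, -1):
            candidate = best[b - cost][0] + value
            if candidate > best[b][0]:
                best[b] = (candidate, best[b - cost][1] + (name,))
    return best[budget]


# Hypothetical players: (name, cost in budget units, projected value).
PLAYERS = [("A", 10, 60), ("B", 4, 40), ("C", 3, 30), ("D", 5, 45)]

if __name__ == "__main__":
    print(best_team(PLAYERS, budget=12))
```

In this made-up data the expensive player A is worth less than the three cheaper players combined, which mirrors the idea of looking for undervalued players.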
What did they achieve?
◎ An average of 90 wins in each
season on less than $30M
◎ The same number of wins as
another team on 1/3rd of
the budget
◎ 20 more wins than
another team with a similar
budget
Other use-cases
◎ Fraud detection in banking and other sectors
◎ Fine grained customer segmentation for targeted
products
◎ Predicting next product failure and sending a
replacement part in advance
◎ Predict best candidates
Big Data
Warehousing
Catch all that you can, so you
can analyze it later
Why modernize Data Warehouse with Big Data?
A traditional Enterprise Data Warehouse (EDW)
◎ Can store only structured data
◎ Has an extremely expensive license cost per TB of storage
◎ Is capacity constrained by ETL and query workloads
Big data will help to...
◎ Store unstructured, semi-structured data
◎ Combine your structured data with other sources
◎ Run interactive SQL queries on big data
◎ Offload ETL workload from your EDW
◎ Offload less frequently used data from your EDW
◎ Save licensing costs
Use-case - Modernizing Data Warehouse
◎ Low cost storage for years of data
◎ Data lake for structured, unstructured and
semi-structured data
◎ Interactive queries on historic data
◎ Online archival with reporting
○ Make years of data available
◎ ETL off-loading
○ Spark jobs to reduce ETL job time from hours
to minutes
◎ Batch reports off-loading
○ Reduce load on your warehouse by
off-loading batch reports
◎ Big Data Discovery
○ Proactively find patterns guided by the
system
Other use-cases
But we are just a startup!
“Start small. Then scale.”
Next steps
Try, evaluate and adopt in a
risk-free manner
◎ Identify sources of your unused data
○ like server logs
○ social streams
◎ Collect and store on cloud to minimize initial
investment
◎ Many cloud options like Amazon EC2,
Databricks, Altiscale...
◎ Use open-source analytics engines like
Elasticsearch, Kibana. They are free to use.
◎ Experience the success
◎ Automate using sensors or IoT devices to add
more sources of useful data
Start small and then scale
◎ https://aws.amazon.com/public-data-sets/
◎ https://data.gov.in/
◎ https://open-data.europa.eu/en/data/
◎ https://www.data.gov/
◎ https://www.quora.com/Where-can-I-find-large-datasets-open-to-the-public
Some open datasets to play with
Woo-ha! I am feeling empowered!
Thanks!
Any questions?
Contact Us
@pranavshukla81
http://in.linkedin.com/in/pranavshukla81
pranav.shukla@brevitaz.com
@vishal1shukla2
https://in.linkedin.com/in/vishalshu
vishal.shukla@brevitaz.com

Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 

Big Data Usecases

  • 1. BigData Use-cases - Prepared by Vishal Shukla, Pranav Shukla, Krishna Meet
  • 2. Brevitaz Overview ● Founded in 2014 ● Small team of technocrats delivering Big Data Solutions ● Global client-base in Europe and Asia-pacific region ● Expertise ○ Full-text search ○ Real-time analytics ○ Log analytics ○ BigData analytics ○ IoT based solutions ○ Machine learning ○ BigData warehousing ● Technologies ○ Spark, Hadoop, Kafka, Flume, Storm ○ Elasticsearch, Logstash, Kibana ○ MongoDB, Cassandra, HBase, Apache Titan ○ Impala, Spark SQL, Hawk ○ Java & Spring stack, Typesafe stack (Scala, Akka, Spray, Slick) ○ AngularJS
  • 3.
  • 4. Agenda ➔ Big Data & Analytics ➔ Full-text search ➔ Log analytics ➔ Big Data Analytics ➔ Real-time Analytics ➔ IoT Analytics ➔ Machine Learning on BigData ➔ Big Data Warehousing ➔ Big Data is for Everyone
  • 8. “ It’s all about being able to spot Right Information at Right Time
  • 9. Full-text search - What is it? ◎ Relevance search in near real-time ○ Find results matching “iphone”. Please don’t show me iPhone chargers on the first page. ◎ Fuzzy search and search suggestions ○ Find results matching “iphne” ◎ Faceted search ○ Filters on Amazon after searching for a keyword ◎ Complex search with multiple criteria ○ Find me products matching “iphone” within the price range 30000 INR to 50000 INR and color “Space grey” ◎ Geo-spatial search ○ Find restaurants within a 10 km radius of my current location. And yes, I want to see closer ones on top.
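The search types on this slide can be sketched as one Elasticsearch-style request body. This is only an illustrative sketch: the helper `build_product_query` and the field names `name`, `price` and `color` are assumptions, not part of the deck, and `color` is assumed to be indexed as an exact-match (keyword) field. The body is a plain Python dict, so no Elasticsearch client or cluster is needed to inspect it. Geo-spatial search is left out to keep the example short.

```python
def build_product_query(text, min_price, max_price, color):
    """Build an Elasticsearch-style search body as a plain dict.

    The field names ("name", "price", "color") are hypothetical.
    """
    return {
        "query": {
            "bool": {
                # Fuzzy full-text match: tolerates typos such as "iphne".
                "must": [
                    {"match": {"name": {"query": text, "fuzziness": "AUTO"}}}
                ],
                # Filters narrow the result set without changing relevance scores.
                "filter": [
                    {"range": {"price": {"gte": min_price, "lte": max_price}}},
                    {"term": {"color": color}},
                ],
            }
        },
        # Facet counts per color, similar to the sidebar filters of a web shop.
        "aggs": {"by_color": {"terms": {"field": "color"}}},
    }


body = build_product_query("iphne", 30000, 50000, "Space grey")
```

Keeping the price and color criteria in `filter` rather than `must` means they restrict the results without influencing how the matches are ranked.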
  • 10. What are you talking about! I am here for BigData!
  • 11. Elasticsearch does all of this at massive volume, variety and velocity
  • 12. Use-case - Information Aggregator ◎ Crawl third-party websites ◎ Aggregate and classify the data ◎ Develop a custom application on top of the classified data
  • 13. Other use-cases ◎ Google’s “Did you mean?” ◎ Search suggestions as you type ◎ Text analytics ◎ McGraw-Hill - Transform textbooks into digital learning resources ◎ SoundCloud - Help users quickly find music that interests them
  • 15. “ Transform your dumb logs into actionable insights
  • 16. Log Analytics - What is it? ● Use machine-generated logs to get operational insights ● Sensors, application servers, web servers or any IoT device logs To interactively answer questions like... ◎ How many users signed up this week? ◎ How are users using your website / mobile app? ◎ How successful is our advertising campaign? ◎ Why is the database slow? ◎ Which website categories is my team spending the most time on? ◎ Which employees are likely to resign next?
  • 18. Use-case - Network Logs Analysis ◎ High velocity ◎ High volume ◎ Collect, analyze and improve
  • 19.
  • 20. Other use-cases ◎ Analyze clickstream data to provide personalized offers and user experience ◎ Interactive drill-down analysis ◎ Compliance reporting through interactive dashboards ◎ Real-time alerts on invalid login attempts ◎ Detect outages ◎ Multi-channel funnel reporting for your advertising campaigns to find out which channels contribute the most to conversions
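As a rough illustration of the "real-time alerts on invalid login attempts" item, the sketch below keeps a sliding time window of failed logins per user. The function name `failed_login_alerts`, the event tuple layout, and the default threshold and window size are all assumptions chosen for the example. In a real deployment the events would arrive from a log pipeline rather than an in-memory list.

```python
from collections import defaultdict, deque


def failed_login_alerts(events, threshold=3, window_seconds=60):
    """Return (timestamp, user) alerts for repeated failed logins.

    `events` is an iterable of (timestamp_seconds, user, success) tuples,
    assumed to be sorted by timestamp. An alert is raised whenever a user
    has at least `threshold` failures within `window_seconds`.
    """
    recent = defaultdict(deque)  # user -> timestamps of recent failures
    alerts = []
    for ts, user, success in events:
        if success:
            continue
        window = recent[user]
        window.append(ts)
        # Drop failures that have fallen out of the time window.
        while window and ts - window[0] > window_seconds:
            window.popleft()
        if len(window) >= threshold:
            alerts.append((ts, user))
    return alerts


events = [
    (0, "alice", False),
    (10, "alice", False),
    (20, "bob", False),
    (30, "alice", False),   # third failure within 60 s: alert
    (200, "alice", False),  # older failures have expired: no alert
]
alerts = failed_login_alerts(events)
```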
  • 22. “ Combine all sources of data to uncover hidden patterns and unknown relations in your data
  • 23. Big Data Analytics - What is it? ● Take your transactional data from various sources ● Take operational and user behaviour log data ● Collect social data ● Combine data collected from various sources To interactively answer questions like... ◎ What is the increase or decrease in sales over the years? ◎ How many unique customers were acquired this year? ◎ Which products are trending disproportionately this year?
  • 24. Use-case - Supply chain management ◎ RFID labels can indicate which product is where at what time ◎ Get more accurate business insights ◎ Theft detection
  • 25. Other use-cases ◎ Social media sentiment analysis to get end-user feedback on launched products ◎ Identify market trends ◎ Predict employee attrition ◎ Customer churn analysis ◎ Influencer analysis ◎ Lead generation ◎ Proactive issue monitoring ◎ For insurance companies, identify potential customers by combining birth, marriage and health data
  • 27. “ Lag of seconds can make a fraudster and you
  • 28. Real-time Analytics - What is it? ● Ingest streaming data, possibly at high velocity ● Analyse and react immediately To solve problems like... ◎ Identify changing trends in real-time ◎ Detect fraud ◎ Analyse policy violations and react immediately ◎ Reduce downtimes ◎ Provide better and quicker business decisions
  • 29. Use-case - Enrich Customer Experience ◎ Get real-time feeds about customer location or products being browsed ◎ Combine with historical user behaviours ◎ Roll out offers in real-time
  • 30. Other use-cases ◎ Hospitality Industry ○ Bad weather reduces travel, which then reduces overnight lodging ○ Combine weather data with flight cancellations to identify stranded travellers ○ Offer hotel coupons based on nearby location.
  • 31. Other use-cases ◎ Fraud detection ◎ Predict and enrich customer experience based on location, lifestyle ◎ Real-time process visibility across an enterprise ◎ Suggest optimal routes based on current traffic data ◎ Get player performance metrics in real-time to substitute players at the right time
  • 33. IoT Based Smart Solutions - What is it? ● Use sensors to detect low-level data ● Report the captured data to a server ● Analyse and get back to the user To provide smart alerts and suggestions like ◎ Schedule maintenance of machines ◎ Your pulse rate is increasing disproportionately ◎ Medicines manufactured in a batch are not complying with standards
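The "your pulse rate is increasing disproportionately" alert could, for instance, be approximated by comparing each sensor reading against a rolling average of the readings before it. The sketch below is one simple way to do that and is not taken from the deck: `pulse_alerts`, the window of 5 readings and the 20% threshold are all assumptions.

```python
from collections import deque
from statistics import mean


def pulse_alerts(readings, window=5, max_increase=0.2):
    """Flag readings that exceed the rolling mean of the previous `window`
    readings by more than `max_increase` (expressed as a fraction).

    Returns a list of (index, value) pairs for the flagged readings.
    """
    history = deque(maxlen=window)  # keeps only the last `window` readings
    alerts = []
    for i, value in enumerate(readings):
        # Only compare once a full window of earlier readings is available.
        if len(history) == window:
            baseline = mean(history)
            if value > baseline * (1 + max_increase):
                alerts.append((i, value))
        history.append(value)
    return alerts


readings = [70, 72, 71, 69, 70, 71, 95, 72]
alerts = pulse_alerts(readings)
```

A fixed percentage over a short rolling mean is a deliberately crude rule; a production system would more likely use per-user baselines or a trained model.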
  • 34. Use-case ◎ Performance measurement & maintenance schedule [diagram]
  • 35. Other use-cases ◎ In agriculture, sensors can detect crop health along with geo data, and alerts can then be sent to farmers about where they need to focus ◎ In retail, smart shelves can detect and send alerts on when to replenish ◎ Smart homes can analyze the patterns of each family member and optimize energy usage
  • 37. What is machine learning? ◎ Machine learning is not programming a machine to do stuff ◎ Machine learning is making the machine learn and adapt based on the observed data
  • 38. Where is machine learning used? ● Identify similarities between products, users ● Predict values from past data ● Classify items into categories, such as whether an email is spam or not In order to... ◎ Predict expected outcome ◎ Categorize large amounts of data ◎ Optimize algorithms or paths ◎ Find similarities ◎ Improve quality of predictions continuously
  • 39. “ Recommending the right products makes the difference between selling or not selling a product
  • 40. Use-case - Recommending Products ◎ Compare thousands of users/products with each other to find similar “clusters” ◎ Content-based filtering - Recommend products similar to what the customer has already bought ◎ Find customers similar to the current customer and recommend what they have bought ◎ Apply what are known as clustering algorithms in machine learning on Big Data
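The "find similar customers and recommend what they have bought" idea can be sketched on a toy data set. Note that this is a simple nearest-neighbour style of collaborative filtering based on Jaccard similarity, swapped in for illustration; it is not the clustering approach the slide mentions, and the names `jaccard`, `recommend` and the sample purchases are assumptions.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(purchases, customer, top_n=3):
    """Recommend products bought by customers similar to `customer`.

    `purchases` maps a customer name to the set of products they bought.
    Each product the customer does not own yet is scored by the summed
    similarity of the other customers who bought it.
    """
    own = purchases[customer]
    scores = {}
    for other, items in purchases.items():
        if other == customer:
            continue
        sim = jaccard(own, items)
        if sim == 0:
            continue  # nothing in common, so ignore this customer
        for item in items - own:
            scores[item] = scores.get(item, 0.0) + sim
    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


purchases = {
    "ann": {"phone", "case"},
    "bob": {"phone", "case", "charger"},
    "cat": {"phone", "headphones"},
    "dan": {"novel"},
}
suggestions = recommend(purchases, "ann")
```

Comparing every customer with every other customer does not scale to the "thousands of users/products" on the slide, which is one reason clustering or distributed algorithms are used on Big Data.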
  • 41. Use-case - Optimise team combination in Sports ◎ Choose the best performing team with a limited budget ◎ It was first applied in baseball; now many professional sports use these techniques ◎ Choose a team consisting of players who could win at least enough games to make it to the play-offs ◎ Use data analysis techniques to find undervalued players
  • 43. What did they achieve? ◎ An average of 90 wins per season on less than $30M ◎ The same number of wins as another team on one third of its budget ◎ 20 more wins than another team with a similar budget
  • 44. Other use-cases ◎ Fraud detection in banking and other sectors ◎ Fine grained customer segmentation for targeted products ◎ Predicting next product failure and sending a replacement part in advance ◎ Predict best candidates
  • 45. Big Data Warehousing - Catch all that you can, so you can analyze it later
  • 46. Why modernize Data Warehouse with Big Data? Traditional Enterprise Data Warehouse (EDW) ◎ Can store only structured data ◎ Has an extremely expensive license cost per TB of storage ◎ Is capacity-constrained by ETL and query workloads Big data will help to... ◎ Store unstructured, semi-structured data ◎ Combine your structured data with other sources ◎ Run interactive SQL queries on big data ◎ Offload ETL workload from your EDW ◎ Offload less frequently used data from your EDW ◎ Save licensing costs
  • 47. Use-case - Modernizing Data Warehouse ◎ Low cost storage for years of data ◎ Data lake for structured, unstructured and semi-structured data ◎ Interactive queries on historic data
  • 48. Other use-cases ◎ Online archival with reporting ○ Make years of data available ◎ ETL off-loading ○ Spark jobs to reduce ETL job time from hours to minutes ◎ Batch reports off-loading ○ Reduce load on your warehouse by off-loading batch reports ◎ Big Data Discovery ○ Proactively find patterns guided by the system
  • 49. But we are just a startup !
  • 51. Next steps - Try, evaluate and adopt in a risk-free manner
  • 52. Start small and then scale ◎ Identify sources of your unused data ○ like server logs ○ social streams ◎ Collect and store in the cloud to minimize initial investment ◎ Many cloud options like Amazon EC2, Databricks, Altiscale... ◎ Use open-source analytics engines like Elasticsearch, Kibana. They are free to use. ◎ Experience the success ◎ Automate using sensors or IoT devices to add more sources of useful data
  • 53. Some open datasets to play with ◎ https://aws.amazon.com/public-data-sets/ ◎ https://data.gov.in/ ◎ https://open-data.europa.eu/en/data/ ◎ https://www.data.gov/ ◎ https://www.quora.com/Where-can-I-find-large-datasets-open-to-the-public
  • 54. Woo-ha! I am feeling empowered!