Fundamentals of Big Data, Hadoop Project Design and a Case Study / Use Case
General planning considerations and essentials for the Hadoop ecosystem and Hadoop projects.
This will provide the basis for choosing the right Hadoop implementation, integrating Hadoop technologies, driving adoption, and creating an infrastructure.
Building applications with Apache Hadoop is illustrated through a real-life example: a use case of WiFi log analysis.
2. Agenda
Before Getting into Projects
General Planning Considerations
Essentials of the Hadoop ecosystem.
Data life cycle in Hadoop.
Retail WiFi Log-file Analysis with Hadoop – A Use Case
3. Before Getting into Projects
Understand the problem, the requirements, and the feasibility.
Produce documentation, a model, and a design.
Study existing models; modify and adopt them.
Resources, cost, and building expertise matter most.
Define stages and a development path.
Map the existing model to the new model; define the corresponding components.
Consider Hadoop design principles.
4. Before Getting into Projects continued…
This will provide the basis for choosing the right Hadoop implementation and creating
an infrastructure.
What data will be stored and analyzed? What questions do you want to answer?
What methods of analysis do you plan to use?
What data do you want to bring into Hadoop?
How will Hadoop fit into my environment?
Is Hadoop/NoSQL actually required?
Which of the various big data technologies resolves the problem most effectively?
Does adopting these technologies address your business needs?
How will your use of Hadoop expand across users, applications, and so on?
What is the scope of applications to be supported and what are their requirements
(disaster recovery, availability, and so on)?
What existing, new tools and infrastructure do you want to integrate with Hadoop?
Will your administrators need management tools or will administrators be hired or
trained for managing Hadoop?
Get your answers.
5. Hadoop: Things Get a Bit More Complicated
When it comes to Hadoop, however, things become a little
bit more complicated.
Hadoop encompasses a multiplicity of tools that are
designed and implemented to work together
Read notes
6. General Planning Considerations
To deploy, configure, manage and scale Hadoop clusters in a way
that optimizes performance and resource utilization there is a lot
to consider.
Operating system: Using a 64-bit operating system helps to avoid
constraining the amount of memory that can be used on worker
nodes.
Computation vs. data: Computational (or processing) capacity is
determined by the aggregate number of Map/Reduce slots
available across all nodes in a cluster. Map/Reduce slots are
configured on a per-server basis. I/O performance issues can arise
from sub-optimal disk-to-core ratios (too many slots and too few
disks). HyperThreading improves process scheduling, allowing
you to configure more Map/Reduce slots.
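Map/Reduce slots are a per-server setting; as a hedged illustration (MRv1 property names, with placeholder values that would be tuned to the disk-to-core ratio discussed above), the TaskTracker configuration might look like:
<!-- mapred-site.xml on each worker node (sketch) -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>8</value> <!-- map slots per TaskTracker (assumed value) -->
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>4</value> <!-- reduce slots per TaskTracker (assumed value) -->
</property>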
7. General Planning Considerations continued…
Memory: Depending on the application, your system’s
memory requirements will vary.
Storage: A Hadoop platform that’s designed to achieve
performance and scalability by moving the compute activity
to the data is preferable. Data storage requirements for the
worker nodes may be best met by direct attached storage
(DAS) in a Just a Bunch of Disks (JBOD) configuration and
not as DAS with RAID or Network Attached Storage (NAS).
8. General Planning Considerations continued…
Capacity: The number of disks and their corresponding
storage capacity determines the total amount of the
FileServer storage capacity for your cluster. The more disks
you have, the less likely it is that you will have multiple tasks
accessing a given disk at the same time.
Network:
Hadoop is very bandwidth-intensive; use dedicated switches
and rack awareness.
Beware of oversubscription in top-of-rack and core switches.
Consider bonded Ethernet to mitigate failures.
9. Essentials of the Hadoop Ecosystem
Automated deployment
Automated deployment of both operating systems and the Hadoop
software ensures consistent, streamlined deployments.
Documented configurations are simpler to deploy than
traditional manual IT deployment, testing, and validation strategies.
Configuration management
Use a configuration management tool for all Hadoop environments:
tracking configuration changes, managing the installation of the Hadoop
software, and providing a single interface to the Hadoop
environment for updates and configuration changes.
Monitoring and alerting
Hardware monitoring and alerting is an important part of all
dynamic IT environments. Successful monitoring and alerting
ensures that problems are caught as soon as possible and
administrators are alerted so the problems can be corrected before
users are impacted.
10. A Framework for Considering Hadoop
Distributions
Core distribution: All vendors use the Apache Hadoop core and
package it for enterprise use.
Management capabilities: Some vendors provide an additional layer
of management software that helps administrators configure,
monitor, and tune Hadoop.
Enterprise reliability and integration: A third group of vendors offers
a more robust package, including a management layer augmented
with connectors to existing enterprise systems
and engineered to provide the same high level
of availability, scalability, and reliability as
other enterprise systems.
And support.
11. Data life cycle In Hadoop.
Data input (collection and load)
Data storage
Data analysis and processing.
Data product
12. Data life cycle in Hadoop continued…
Data collection
Getting the data into a Hadoop cluster is the first step in any
Big Data deployment
This is the raw data, required by many models.
1. Flume: Apache Flume is a distributed, reliable, and available
service for efficiently collecting, aggregating, and moving
large amounts of log data.
2. Sqoop: Apache Sqoop(TM) is a tool designed for
efficiently transferring bulk data between Apache Hadoop
and structured datastores such as relational databases (see the sketch after this list).
3. Writing your own collector is also an option!
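A hedged sketch of the Sqoop option above: a bulk import from a relational database into HDFS. The connection string, credentials, table name, and target directory are hypothetical placeholders.
# import one relational table into HDFS with four parallel map tasks
sqoop import \
  --connect jdbc:mysql://dbhost/retail \
  --username analyst -P \
  --table store_visits \
  --target-dir /user/sqoop/store_visits \
  --num-mappers 4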
13. Data life cycle in Hadoop continued…
Data storage
HDFS,
HBase
Hive
Apply an ETL model before storing.
14. Data life cycle in Hadoop continued…
Data processing
1. MapReduce is the primary way
Streaming
2. MapReduce too difficult? Use HBase, Hive, or Pig.
15. Data life cycle in Hadoop continued…
Data product
1. Integration via a REST architecture.
2. Application-level computation
3. Integration with an application-level RDBMS
19. Tracking user visits
Bringing online-world user tracking to retail stores.
Data source:
logs from WiFi access points, generated when visitors' WiFi-enabled
mobile phones connect.
20. The collected data can answer
Following questions for a particular store:
How many people visited the store (unique visits)?
How many visits did we have in total?
What is the average visit duration?
How many people are new vs. returning?
In what situations are there more visitors?
Security-related questions
21. Business Architecture
[Diagram: three layers, from Data Sources through Logic & Rules to Applications]
Data: receive and process messages; store in flat files; store in operational & reporting databases.
Rules: apply business rules (event- and condition-specific); consistent rules and logic used by multiple applications and tools; store in operational & reporting databases.
Applications: General Analytics; Service and Offers subscription system; Security.
22. Set up
WiFi access points running OpenWrt, a Linux-based firmware for routers, to
simulate two different stores *
A virtual machine acting as central syslog daemon collecting all log
messages from the WiFi routers
Flume to move all log messages to HDFS, without any manual
intervention (no transformation, no filtering)
A CDH4 cluster running, installed and monitored with Cloudera
Manager.
Pentaho Data Integration‘s graphical designer for data transformation,
parsing, filtering and loading to the warehouse (Hive)
Hive as data warehouse system on top of Hadoop to project structure
onto data
Impala for querying data from Hive in real time
Microsoft Excel to visualize results **
23. Data collection
Configured the WiFi access points to send all their local syslog
messages to a central syslog server backed by shared storage,
using UDP or TCP.
This is done in OpenWrt's Unified Configuration Interface, simply called
UCI, which can forward logs to a UDP/TCP syslog service. Assuming your
syslog server listens on address 192.168.0.1 over UDP/TCP
and you want detailed log output, the configuration looks roughly like the sketch below:
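A minimal sketch, assuming the remote-syslog options in OpenWrt's /etc/config/system; exact option names and the verbosity setting may differ between OpenWrt versions.
# /etc/config/system on each access point (sketch)
config system
        option log_ip '192.168.0.1'   # remote syslog server
        option log_port '514'         # assumed default syslog port
        option log_proto 'udp'        # or 'tcp'
        option conloglevel '8'        # assumption: raise log verbosity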
24. Data collection
The logs are sent periodically, for example at 4-hour or
12-hour intervals, and some are sent in real time.
The shared storage server is configured to accept
messages from remote hosts over HTTP/FTP and is
instructed where to write those messages, using a defined
layout such as time-based or per-access-point files. They are written
as plain text files on disk.
The logs are stored in shared storage like:
/var/logs/wifilogs/20130612/bang-mainbranch.log
/var/logs/wifilogs/20130612/bang-branch1.log
/var/logs/wifilogs/20130612/bang-branch2.log
/var/logs/wifilogs/20130612/realtime.logs
25. Data collection
[Diagram: WiFi access points send syslog messages over TCP/UDP to data collection servers (local disk); HTTP/FTP clients on those servers push the files to an HTTP/FTP server on the shared storage, from which data collection continues.]
26. Data sample
The exported log file looks like:
2013-01-21T13:39:51+01:00 buffalo hostapd: wlan0: STA 10:68:3f:40:xx:yy WPA: start authentication
2013-01-21T13:39:51+01:00 buffalo hostapd: wlan0: STA 10:68:3f:40:xx:yy IEEE 802.1X: unauthorizing port
2013-01-21T13:39:51+01:00 buffalo hostapd: wlan0: STA 10:68:3f:40:xx:yy WPA: sending 1/4 msg of 4-Way Handshake
2013-01-21T13:39:51+01:00 buffalo hostapd: wlan0: STA 10:68:3f:40:xx:yy WPA: received EAPOL-Key frame (2/4 Pairwise)
2013-01-21T13:39:51+01:00 buffalo hostapd: wlan0: STA 10:68:3f:40:xx:yy WPA: sending 3/4 msg of 4-Way Handshake
2013-01-21T13:39:51+01:00 buffalo hostapd: wlan0: STA 10:68:3f:40:xx:yy WPA: received EAPOL-Key frame (4/4 Pairwise)
2013-01-21T13:39:51+01:00 buffalo hostapd: wlan0: STA 10:68:3f:40:xx:yy IEEE 802.1X: authorizing port
2013-01-21T13:39:51+01:00 buffalo hostapd: wlan0: STA 10:68:3f:40:xx:yy WPA: pairwise key handshake completed (RSN)
2013-01-21T13:41:25+01:00 fonera hostapd: wlan0: STA 24:ab:81:91:xx:yy IEEE 802.11: authentication OK (open system)
2013-01-21T13:41:25+01:00 fonera hostapd: wlan0: STA 24:ab:81:91:xx:yy MLME: MLME-AUTHENTICATE.
indication(24:ab:81:91:c8:62, OPEN_SYSTEM)
2013-01-21T13:41:25+01:00 fonera hostapd: wlan0: STA 24:ab:81:91:xx:yy MLME: MLME-DELETEKEYS.request(24:ab:81:91:c8:62)
2013-01-21T13:41:25+01:00 fonera hostapd: wlan0: STA 24:ab:81:91:xx:yy IEEE 802.11: authenticated
2013-01-21T13:41:25+01:00 fonera hostapd: wlan0: STA 24:ab:81:91:xx:yy IEEE 802.11: association OK (aid 1)
2013-01-21T13:41:25+01:00 fonera hostapd: wlan0: STA 24:ab:81:91:xx:yy IEEE 802.11: associated (aid 1)
28. Logical Architecture
Ingest: transportation and storage (HTTP, Flume)
Parse: sectioning and record formation (Flume, PDI)
Transform: object creation (PDI)
Publish: real time; batch mode; integration patterns (RDBMS)
View: reporting
29. Data ingestion to HDFS
Since we have a log file as the data source, we set up Flume to stream
the incoming content to HDFS. In Flume terminology we have the
following components (sketched in the configuration below):
the data from the log file as the source
the HDFS folder /user/flume/bang-mainbranch as the sink
(we preferred a flat directory layout to simplify
access to and processing of the files later on)
a channel, c1, to connect the source to the sink
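A minimal sketch of what the flume-datastream.conf used on a later slide might contain. The exec source tailing the collected log file is an assumption; the sink path and channel name follow the components listed above.
# flume-datastream.conf (sketch)
a1.sources = r1
a1.channels = c1
a1.sinks = k1
# assumed source: tail the collected syslog file
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/logs/wifilogs/20130612/bang-mainbranch.log
a1.sources.r1.channels = c1
# in-memory channel connecting source and sink
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
# HDFS sink writing plain text into a flat directory
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /user/flume/bang-mainbranch
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.channel = c1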
30. FLUME :
Flume is a distributed, reliable,
and available service for
efficiently collecting,
aggregating, and moving large
amounts of log data. It has a
simple and flexible architecture
based on streaming data flows.
It is robust and fault tolerant
with tunable reliability
mechanisms and many failover
and recovery mechanisms. It
uses a simple extensible data
model that allows for online
analytic application.
32. Data ingestion to HDFS continued…
Since Flume is able to collect data from various sources, it is also
possible to configure Flume itself as the server receiving data from
the different source points.
Having the WiFi access points send their log messages
directly to the Flume agent is possible as well.
flume-ng agent --conf-file ./flume-datastream.conf --name a1 -Dflume.root.logger=INFO,console
33. Sample data as it looks in HDFS
With the directory structure /user/flume/20120612/datastream:
2013-01-17T15:50:41+01:00 192.168.201.197 dropbear[1172]: Child connection from 192.168.201.99:55001
2013-01-17T15:50:46+01:00 192.168.201.197 dropbear[1172]: Password auth succeeded for 'root' from
192.168.201.99:55001
2013-01-17T15:50:52+01:00 192.168.201.197 dropbear[1172]: Exit (root): Disconnect received
2013-01-17T15:52:14+01:00 fonera hostapd: wlan0: STA 8c:64:22:3a:74:1f IEEE 802.11: disassociated due to
inactivity
2013-01-17T15:52:14+01:00 fonera hostapd: wlan0: STA 8c:64:22:3a:74:1f MLME: MLME-DISASSOCIATE.
indication(8c:64:22:3a:74:1f, 4)
2013-01-17T15:52:14+01:00 fonera hostapd: wlan0: STA 8c:64:22:3a:74:1f MLME: MLME-DELETEKEYS.
request(8c:64:22:3a:74:1f)
34. Parse and Transform
To load the raw data into the Hive data warehouse according to the
Hive schema, we need to parse the raw data into a comma-separated
format.
There are quite a few open-source BI/ETL tools on the market for
this: Palo, SpagoBI, Pentaho, Talend, and many more.
We used Pentaho Data Integration (PDI).
35. Pentaho:
Pentaho Data Integration is the
ETL Server technology that will
be used to facilitate movement
of data between the new back-end
Hadoop environment and
downstream RDBMS systems.
36. Parse and Transform continued…
Pentaho for ETL
The WiFi router logs collected with Flume are stored in HDFS; PDI is
then used for transformation, parsing, filtering, and finally
loading into Hive's data warehouse.
To import the raw data into the Hive data warehouse, we need to
parse it into a comma-separated format
matching the Hive schema.
PDI enabled us to design a MapReduce job for distributed
processing of this task across multiple nodes without any
programming.
38. Parse and Transform continued…
Map Reduce
The map phase will read all the raw log files collected by
Flume on HDFS.
The input is interpreted as TextInputFormat and therefore
every line will go through a regex evaluation during the map
phase.
Filtering and transformation use PDI's “Regex Evaluation” step.
In filtering, only lines that match the pattern are selected.
In transformation, only a few columns are taken from
each selected line.
39. Parse and Transform continued…
In the map phase:
Transformation
Each line is matched against a regular expression,
and a matched line is split into fields that will be used as
columns.
This is the regex for the transformation:
^((\d{4})-(\d{2})-(\d{2})\w(\d{2}):(\d{2}):(\d{2})([+-]\d{2}:\d{2})) ([.a-zA-Z_0-9]*?) (.*?): (.*?): \w*? ([\w+:]{0,18}) (.*?): (.*)$
40. Parse and Transform continued…
The matched lines from the regex are divided into different columns
like this:
year Integer
month Integer
day Integer
hour Integer
minute Integer
second Integer
timezone String
host String
facility_level String
service_level String
mac_address String
protocol String
message String
41. Parse and Transform continued…
Filtering
All lines that do not match the regular expression are filtered out, such as errors.
These lines are discarded because they carry information that is
useless for this case, e.g. “Disconnect received”.
With “Filter Rows” we remove empty lines, ensuring there are no
empty lines in the output after matching against the regular expression.
Example lines:
2013-01-17T15:50:41+01:00 192.168.201.197 dropbear[1172]: Child connection from 192.168.201.99:55001
2013-01-17T15:50:46+01:00 192.168.201.197 dropbear[1172]: Password auth succeeded for 'root' from
192.168.201.99:55001
2013-01-17T15:50:52+01:00 192.168.201.197 dropbear[1172]: Exit (root):
42. Parse and Transform continued…
To produce comma-separated lines
we used a “User Defined Java Expression” step and concatenated the emitted
fields, delimited by ‘,’.
At the beginning of each line we added a further transformation,
prepending a string: the ISO 8601 string converted to a Unix timestamp.
To answer time-related questions, e.g. average visit duration, we need
values to calculate with. The Unix timestamp is suitable for this.
The output is stored in /user/20130612/routerlogs/parsed/.
Here is the ‘User Defined Java Expression’:
(javax.xml.bind.DatatypeConverter.parseDateTime(iso_8601).getTimeInMillis()/1000) + "," + year
+ "," + month + "," + day + "," + hour + "," + minute + "," + second + "," + timezone + "," + host +
"," + facility_level + "," + service_level + "," + mac_address + "," + protocol + "," + message
43. Parse and Transform continued…
Each output file contains comma-separated fields.
We configured our Pentaho MapReduce job to clean the output path before
execution.
The output is stored in /user/20130612/routerlogs/parsed/:
$ hadoop fs -ls /user/20130612/routerlogs/parsed/
drwxrwxrwx - hadoopuser hadoopuser 0 2013-01-21 15:24 /user/20130612/routerlogs/parse/_logs
-rw-r--r-- 3 hadoopuser hadoopuser 118963 2013-01-21 15:25 /user/20130612/routerlogs/parse/part-00000
-rw-r--r-- 3 hadoopuser hadoopuser 100500 2013-01-21 15:25 /user/20130612/routerlogs/parse/part-00001
-rw-r--r-- 3 hadoopuser hadoopuser 11826 2013-01-21 15:25 /user/20130612/routerlogs/parse/part-00002
44. Load
At the very end, the transformed and parsed raw data lands
in HDFS once the MapReduce job has finished
1358765267,2013,1,21,11,47,47,+01:00,buffalo,hostapd,wlan0,10:68:3f:40:20:2d,IEEE 802.1X,authorizing port
1358765267,2013,1,21,11,47,47,+01:00,buffalo,hostapd,wlan0,10:68:3f:40:20:2d,WPA,pairwise key handshake
completed (RSN)
Now we have parsed and transformed log files on HDFS.
We used Pentaho Data Integration once again to import the
data into Hive's warehouse.
45. HIVE
Hive is a data warehouse
system for Hadoop that
facilitates easy data
summarization, ad-hoc
queries, and the analysis of
large datasets stored in
Hadoop compatible file
systems. Hive provides a
mechanism to project
structure onto this data and
query the data using a SQL-like
language called HiveQL.
46. Load…
We created a Hive table that matches the previously defined
schema, using any query editor (see the sketch below).
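A hedged sketch of such a table definition, assuming the column names and types from the parsing step; the table name routerlogs follows the warehouse paths shown on the following slides.
-- Hive table matching the parsed, comma-separated records
CREATE TABLE routerlogs (
  ts INT,
  year INT,
  month INT,
  day INT,
  hour INT,
  minute INT,
  second INT,
  timezone STRING,
  host STRING,
  facility_level STRING,
  service_level STRING,
  mac_address STRING,
  protocol STRING,
  message STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';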
47. Load…
Loading data into the Hive table is basically done by copying
files on HDFS from /user/20130612/routerlogs/parsed to
/user/hive/warehouse/routerlogs.20130612.
The MapReduce job can be automated on a schedule, e.g.
with Oozie.
Incremental updates to the Hive table are ensured by using a
partitioned table or unique, date-based output file naming
(see the sketch below).
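A hedged sketch of the load step, assuming the routerlogs table sketched earlier; LOAD DATA INPATH moves the files within HDFS, matching the copy described above. The partitioned variant assumes the table was declared with PARTITIONED BY (dt STRING).
-- move the parsed output into the table's warehouse directory
LOAD DATA INPATH '/user/20130612/routerlogs/parsed' INTO TABLE routerlogs;
-- assumed variant for incremental, date-partitioned loads:
-- ALTER TABLE routerlogs ADD PARTITION (dt='20130612')
--   LOCATION '/user/20130612/routerlogs/parsed';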
48. Load …
Querying the data with Impala:
querying the data from the Hive table. Sample data in the Hive
table:
[Sudhakara@localhost ~]# hadoop fs -cat /user/hive/warehouse/routerlogs
1358756939,2013,1,21,9,28,59,+01:00,buffalo,hostapd,wlan0,98:0c:82:dc:8b:15,MLME,MLME-AUTHENTICATE.
indication(98:0c:82:dc:8b:15, OPEN_SYSTEM)
1358756939,2013,1,21,9,28,59,+01:00,buffalo,hostapd,wlan0,98:0c:82:dc:8b:15,MLME,MLME-DELETEKEYS.
request(98:0c:82:dc:8b:15)
1358756939,2013,1,21,9,28,59,+01:00,buffalo,hostapd,wlan0,98:0c:82:dc:8b:15,IEEE 802.11,authenticated
1358756939,2013,1,21,9,28,59,+01:00,buffalo,hostapd,wlan0,98:0c:82:dc:8b:15,IEEE 802.11,association OK (aid 2)
1358756939,2013,1,21,9,28,59,+01:00,buffalo,hostapd,wlan0,98:0c:82:dc:8b:15,IEEE 802.11,associated (aid 2)
1358756939,2013,1,21,9,28,59,+01:00,buffalo,hostapd,wlan0,98:0c:82:dc:8b:15,MLME,MLME-ASSOCIATE.
indication(98:0c:82:dc:8b:15)
1358756939,2013,1,21,9,28,59,+01:00,buffalo,hostapd,wlan0,98:0c:82:dc:8b:15,MLME,MLME-DELETEKEYS.
request(98:0c:82:dc:8b:15)
1358757010,2013,1,21,9,30,10,+01:00,buffalo,hostapd,wlan0,98:0c:82:dc:8b:15,IEEE 802.11,deauthenticated
49. Analysis and Report
You can see the line “authentication OK”. It represents a user
entering the WiFi access area, i.e. a login.
You can see the line “deauthenticated”. It represents a user
exiting the WiFi access area, i.e. a logout.
After querying the data through Impala, the application
calculates the visit duration.
50. Impala
With Impala, you can query data,
whether stored in HDFS or Apache
HBase – including SELECT, JOIN, and
aggregate functions – in real time.
Furthermore, it uses the same metadata,
SQL syntax (Hive SQL), ODBC driver and
user interface (Hue Beeswax) as Apache
Hive, providing a familiar and unified
platform for batch-oriented or real-time
queries. (For that reason, Hive users can
utilize Impala with little setup overhead.)
The first beta drop includes support for
text files and SequenceFiles;
SequenceFiles can be compressed as
Snappy, GZIP, and BZIP (with Snappy
recommended for maximum
performance)
51. Analysis and Report….
To calculate the visit duration per visitor, the Hive query
looks like:
SELECT A.ts, MIN(B.ts - A.ts), A.host, A.mac_address FROM routerlogs A,
routerlogs B WHERE A.host = B.host AND A.mac_address = B.mac_address AND
A.ts <= B.ts AND A.message LIKE '%authentication OK%' AND B.message LIKE
'%deauthenticated%' GROUP BY A.host, A.mac_address, A.ts;
We created a new Hive table called ‘visit_duration’ and loaded
the CSV file into it (see the load sketch below):
create table visit_duration ( ts int,
duration_in_seconds int,
router string,
mac_address string) row format delimited
fields terminated by ',';
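A hedged sketch of the load, assuming the duration query results above were exported to a CSV file on HDFS; the file path is a hypothetical placeholder.
-- load the exported duration results into the new table
LOAD DATA INPATH '/user/20130612/routerlogs/visit_duration.csv'
INTO TABLE visit_duration;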
52. Analysis and Report…
Counting the visits for retail store number one:
(Build version: Impala v0.3 (3cb725b) built on Fri Nov 23 13:51:59 PST 2012)
[localhost:21000] > SELECT COUNT(*) FROM visit_duration WHERE router =
"buffalo";
135
Counting the number of unique visitors is even simpler, as
we have the visitors' MAC addresses, which make them
unique:
(Build version: Impala v0.3 (3cb725b) built on Fri Nov 23 13:51:59 PST 2012)
[localhost:21000] > SELECT COUNT(DISTINCT(mac_address)) FROM visit_duration
WHERE router = "buffalo";
53. Analysis and Report…
The plot (figure 1) indicates that about 85% of the visits were
detected in store number one and about 15% in store
number two. One might draw the conclusion that store
number one is in a much better location with more
occasional customers. But let’s gain more insights by
analyzing the number of unique visitors.
54. Analysis and Report…
The average visit duration in store number one:
[localhost:21000] > SELECT AVG(duration_in_seconds) FROM visit_duration
WHERE router = "buffalo";
976.6666666666666
Each user's visit duration in the store.
55. Many more…
How many people visited the store (unique visitors)?
Note: Unlike a traditional customer frequency counter at the
doors, we have MAC addresses in the log files that are unique to
mobile phones. Assuming people do not change their mobile
phones, we can recognize unique visitors and not just visits.
How many visits did we have?
What is the average visit duration?
What is the peak hour for visitors?
How many people are new vs. returning?
Which location gets the most visitors?
Which branch has the most regular customers?
What is the average length of time between two visits?
56. Functional Architecture: View and Summary
[Diagram: Ingest → Parse → Transform → Publish → View. Collectors and Flume ingest data from the Internet into the file system (system of record); business rules and PDI parse and transform it; the integrated data store (Hive) acts as the single source of truth, publishing in real time and batch mode; data services and business rules feed the consumers: analysis and reporting, General Analytics, the Service and Offers subscription system, and Security.]
57. Conclusion
Analysing WiFi router log files could be done with a
traditional RDBMS approach as well. But one of
the main benefits of this architecture is the ability to store a
variety of semi-structured files.
Existing BI/analysis and reporting tools are easy to adopt
through Big Data platform integration.
Easy modification and evaluation.
The data can be shared between many applications.
Etc…
Editor's Notes
When architects and developers discuss software, they typically immediately qualify a software
tool for its specific usage. For example, they may say that Apache Tomcat is a web server and that
MySQL is a database.
When it comes to Hadoop, however, things become a little bit more complicated. Hadoop
encompasses a multiplicity of tools that are designed and implemented to work together. As a result,
Hadoop can be used for many things, and, consequently, people often define it based on the way
they are using it.
Because Hadoop provides such a wide array of
capabilities that can be adapted to solve many problems, many consider it to be a basic framework.
Certainly, Hadoop provides all of these capabilities, but Hadoop should be classified as an
ecosystem comprised of many components that range from data storage, to data integration, to data
processing, to specialized tools for data analysts.
Memory: Depending on the application, your system’s memory requirements will vary. They differ between the management services and the worker services. For the worker services, sufficient memory is needed to manage the TaskTracker and FileServer services in addition to the sum of all the memory assigned to each of the Map/Reduce slots. If you have a memory-bound Map/Reduce Job, you may need to increase the amount of memory on all the nodes running worker services.
Storage: A Hadoop platform that’s designed to achieve performance and scalability by moving the compute activity to the data is preferable. Data storage requirements for the worker nodes may be best met by direct attached storage (DAS) in a Just a Bunch of Disks (JBOD) configuration and not as DAS with RAID or Network Attached Storage (NAS).