David Kroenke
Business Intelligence and Knowledge
Management
Chapter 9
© 2007 Prentice Hall, Inc. 1
 Understand the need for business
intelligence systems.
 Know the characteristics of reporting
systems.
 Know the purpose and role of data
warehouses and data marts.
 Understand fundamental data-mining
techniques.
 Know the purpose, features, and functions of
knowledge management systems.
 According to a study done at the University of
California at Berkeley, a total of 403
petabytes of new data were created.
 403 petabytes is roughly the amount of all
printed material ever written.
◦ The printed collection of the Library of Congress is
.01 petabytes.
◦ 400 petabytes equals 40,000 copies of the print
collection of the Library of Congress.
 The generation of all these data has much to
do with Moore’s Law.
 The capacity of storage devices increases as
their costs decrease.
 Today, storage capacity is nearly unlimited.
 We are drowning in data and starving for
information.
Source: Used with permission of Peter Lyman and Hal R. Varian, University of California at Berkeley.
 Tools for searching business data in an
attempt to find patterns are called business
intelligence (BI) tools.
 Reporting tools are programs that read data
from a variety of sources, process that data,
produce formatted reports, and deliver those
reports to the users who need them.
 The processing of data is simple:
◦ Data are sorted and grouped.
◦ Simple totals and averages are calculated.
 Reporting tools are used primarily for
assessment
◦ They are used to address questions like:
 What has happened in the past?
 What is the current situation?
 How does the current situation compare to the past?
 Data-mining tools process data using
statistical techniques, many of which are
sophisticated and mathematically complex.
 Data mining involves searching for patterns
and relationships among data.
 In most cases, data-mining tools are used to
make predictions.
 For example, we can use one form of analysis to compute
the probability that a customer will default on a loan.
 Another way to distinguish reporting tools from
data-mining tools is by the operations they use:
◦ Reporting tools use simple operations like sorting,
grouping, and summing.
◦ Data-mining tools use sophisticated techniques.
 An information system is a collection of
hardware, software, data, procedures, and
people.
 The purpose of a business intelligence (BI)
system is to provide the right information, to
the right user, at the right time.
 BI systems help users accomplish their goals
and objectives by producing insights that
lead to actions.
 A reporting tool can generate a report that
shows a customer has canceled an important
order.
 A reporting system, however, alerts that
customer’s salesperson with this unwanted
news, and does so in time for the salesperson
to try to alter the customer’s decision.
 A data-mining tool can create an equation
that computes the probability that a customer
will default on a loan.
 A data-mining system uses that equation to
enable banking personnel to assess new loan
applications.
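As a rough illustration of the kind of equation a data-mining tool might produce, the sketch below uses a logistic function. The two input attributes, the coefficients, and the cutoff are made up for this example; they do not come from the chapter or from any real lender.

```python
import math

# Hypothetical coefficients; a real data-mining tool would estimate them from past loans.
INTERCEPT = -2.0
COEF_DEBT_RATIO = 3.0       # weight on debt-to-income ratio (0.0 to 1.0)
COEF_LATE_PAYMENTS = 0.5    # weight on late payments in the past year


def default_probability(debt_ratio, late_payments):
    """The 'equation': an estimated probability of default between 0 and 1."""
    score = INTERCEPT + COEF_DEBT_RATIO * debt_ratio + COEF_LATE_PAYMENTS * late_payments
    return 1.0 / (1.0 + math.exp(-score))


def assess_application(debt_ratio, late_payments, cutoff=0.5):
    """The 'system': applies the equation to a new application and suggests an action."""
    p = default_probability(debt_ratio, late_payments)
    return "review manually" if p >= cutoff else "approve"


print(assess_application(0.2, 0))
print(assess_application(0.9, 3))
```

The split between the two functions mirrors the distinction in the slides: the tool yields the equation, and the system puts it in front of banking personnel at decision time.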
 The purpose of a reporting system is to
create meaningful information from disparate
data sources and to deliver that information
to the proper user on a timely basis.
 Reporting systems generate information from
data as a result of four operations:
◦ Filtering data
◦ Sorting data
◦ Grouping data
◦ Making simple calculations on the data
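The four operations can be sketched in a few lines. The order records and field names below are hypothetical; a real reporting system would read them from operational or warehouse data.

```python
from collections import defaultdict

# Hypothetical order records, invented for illustration.
orders = [
    {"region": "West", "customer": "Acme", "amount": 120.0},
    {"region": "East", "customer": "Baker", "amount": 75.0},
    {"region": "West", "customer": "Cole", "amount": 30.0},
    {"region": "East", "customer": "Dunn", "amount": 200.0},
    {"region": "West", "customer": "Eads", "amount": 90.0},
]

# 1. Filter: keep only orders of at least 50.
filtered = [o for o in orders if o["amount"] >= 50.0]

# 2. Sort: largest orders first.
ordered = sorted(filtered, key=lambda o: o["amount"], reverse=True)

# 3. Group: collect order amounts by region.
by_region = defaultdict(list)
for o in ordered:
    by_region[o["region"]].append(o["amount"])

# 4. Simple calculations: total and average per region.
report = {
    region: {"total": sum(amounts), "average": sum(amounts) / len(amounts)}
    for region, amounts in by_region.items()
}

for region in sorted(report):
    print(region, report[region])
```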
 A reporting system maintains a database of
reporting metadata.
 The metadata describes the reports, users,
groups, roles, events, and other entities
involved in the reporting activity.
 The reporting system uses the metadata to
prepare and deliver reports to the proper
users on a timely basis.
 In terms of a report type, reports can be
static or dynamic.
 Static reports are prepared once from the
underlying data, and they do not change.
◦ Example: a report of the past year’s sales
 Dynamic reports: the reporting system reads
the most current data and generates the
report using that fresh data.
◦ Examples are: a report on sales today and a report
on current stock prices
 Query reports are prepared in response to
data entered by users.
 Online analytical processing (OLAP) reports
allow the user to dynamically change the
report grouping structures.
 Reports are delivered via many different
report media or channels.
 Some reports are printed on paper, and
others are created in a format like PDF
whereby they can be printed or viewed
electronically.
 Other reports are delivered to computer
screens.
 Companies sometimes place reports on
internal corporate Web sites for employees to
access.
 Another report medium is a digital
dashboard, which is an electronic display
customized for a particular user.
◦ Vendors like Yahoo! and MSN provide common
examples.
◦ Users of these services can define the content they
want, say, a local weather forecast, a list of stock
prices, or a list of news sources.
◦ The vendor constructs the display customized for
each user.
 Other dashboards are particular to an
organization.
◦ The organization might have a dashboard that shows
up-to-the-minute production and sales activities.
 Alerts are another form of report.
◦ Users can declare that they wish to receive
notifications of events, say, via email or on their cell
phones.
 Reports can be published via a Web service.
◦ The Web service produces the report in response to
requests from the service-consuming application.
 The report mode can be either push report or
pull report.
 Organizations send a push report to users
according to a preset schedule.
◦ Users receive the report without any activity on
their part.
 Users must request a pull report.
◦ To obtain a pull report, a user goes to a Web portal
or digital dashboard and clicks a link or button to
cause the reporting system to produce and deliver
the report.
 Three functions of reporting systems are:
◦ Authoring
◦ Management
◦ Delivery
 Report authoring involves connecting to data
sources, creating the reporting structure, and
formatting the report.
Source: Microsoft product screen shot reprinted with permission from Microsoft Corporation.
 The purpose of report management is to
define who receives what reports, when, and
by what means.
 Most report-management systems allow the
report administrator to define user accounts
and user groups and to assign particular
users to particular groups.
 Reports that have been created using the
report-authoring system are assigned groups
and users.
 Assigning reports to groups saves the
administrator work.
◦ When a report is created, changed, or removed, the
administrator need only change the report
assignments to the group.
◦ All of the users in the group will inherit the
changes.
 Metadata also indicates what channel is to be
used and whether the report is to be pushed
or pulled.
◦ If the report is to be pushed, the administrator
declares whether the report is to be generated on a
regular schedule or as an alert.
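One way to picture report-management metadata is as a small set of records linking reports to groups, channels, and modes. The class names, fields, and sample values below are assumptions made for this sketch, not the design of any particular reporting product.

```python
from dataclasses import dataclass, field


@dataclass
class ReportAssignment:
    report: str
    group: str
    channel: str        # for example "email" or "web portal"
    mode: str           # "push" or "pull"
    trigger: str = ""   # for push reports: "schedule" or "alert"


@dataclass
class ReportMetadata:
    groups: dict = field(default_factory=dict)        # group name -> set of user names
    assignments: list = field(default_factory=list)   # ReportAssignment records

    def add_user(self, group, user):
        self.groups.setdefault(group, set()).add(user)

    def assign(self, assignment):
        self.assignments.append(assignment)

    def reports_for(self, user):
        """Users inherit every report assigned to a group they belong to."""
        return sorted(
            a.report for a in self.assignments
            if user in self.groups.get(a.group, set())
        )

    def recipients(self, report):
        """All users who should receive, or may pull, the given report."""
        users = set()
        for a in self.assignments:
            if a.report == report:
                users |= self.groups.get(a.group, set())
        return sorted(users)


meta = ReportMetadata()
meta.add_user("regional_sales", "ana")
meta.add_user("regional_sales", "raj")
meta.assign(ReportAssignment("Weekly Sales", "regional_sales", "email", "push", "schedule"))
print(meta.recipients("Weekly Sales"))
```

Because the report is assigned to the group rather than to individuals, adding a user to `regional_sales` is enough for that user to inherit the report.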
 The report-delivery function of a reporting
system pushes reports or allows them to be
pulled according to report-management
metadata.
 Reports can be delivered via an email server,
Web site, XML Web services, or by other
program-specific means.
 The report-delivery system uses the
operating system and other program security
components to ensure that only authorized
users receive authorized reports.
 The report-delivery system also ensures that
push reports are produced at appropriate
times.
 For query reports, the report-delivery system
serves as an intermediary between the user
and the report generator.
◦ It receives user query data, such as item numbers
in an inventory query, passes the query data to the
report generator, receives the resulting report, and
delivers the report to the user.
 RFM analysis is a way of analyzing and
ranking customers according to their
purchasing patterns.
 It is a simple technique that considers how
recently (R) a customer has ordered, how
frequently (F) a customer orders, and how
much money (M) the customer spends per
order.
 To produce an RFM score, the program first
sorts customer purchase records by the date
of their most recent (R) purchase.
 In a common form of this analysis, the
program then divides the customers into five
groups and gives customers in each group a
score of 1 to 5.
◦ The top 20% of the customers having the most
recent orders are given an R score of 1 (highest).
 The program then re-sorts the customers on
the basis of how frequently they order.
◦ The top 20% of the customers who order most
frequently are given an F score of 1 (highest).
 Finally, the program sorts the customers again
according to the amount spent on their
orders.
◦ The 20% who have ordered the most expensive
items are given an M score of 1 (highest).
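The three sort-and-score passes described above can be sketched as follows. The customer records and field names are invented for illustration, and ties are broken simply by sort order, which a production system might handle differently.

```python
from datetime import date


def quintile_scores(ids_best_first):
    """Score 1 for the best 20% through 5 for the worst 20%, by position in the list."""
    n = len(ids_best_first)
    return {cid: (i * 5) // n + 1 for i, cid in enumerate(ids_best_first)}


def rfm_scores(customers):
    """customers maps id -> {"last_order": date, "orders": int, "avg_order": float}."""
    ids = list(customers)
    r = quintile_scores(sorted(ids, key=lambda c: customers[c]["last_order"], reverse=True))
    f = quintile_scores(sorted(ids, key=lambda c: customers[c]["orders"], reverse=True))
    m = quintile_scores(sorted(ids, key=lambda c: customers[c]["avg_order"], reverse=True))
    return {c: (r[c], f[c], m[c]) for c in ids}


# Hypothetical customers.
customers = {
    "A": {"last_order": date(2007, 6, 1), "orders": 20, "avg_order": 500.0},
    "B": {"last_order": date(2007, 5, 1), "orders": 2, "avg_order": 900.0},
    "C": {"last_order": date(2006, 1, 15), "orders": 15, "avg_order": 50.0},
    "D": {"last_order": date(2007, 3, 10), "orders": 8, "avg_order": 120.0},
    "E": {"last_order": date(2005, 11, 30), "orders": 1, "avg_order": 20.0},
}

scores = rfm_scores(customers)
for cid in sorted(scores):
    print(cid, scores[cid])
```

With these sample records, customer A scores (1, 1, 2): a recent, frequent buyer whose orders are large but not the largest. Customer E scores (5, 5, 5).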
 A reporting system can generate the RFM
data and deliver it in many ways:
◦ A report of RFM scores for all customers can be
pushed to the vice president of sales.
◦ Reports with scores for particular regions can be
pushed to regional sales managers.
◦ Reports of scores for particular accounts can be
pushed to the account salespeople.
◦ All of this reporting can be automated.
 Online analytical processing (OLAP) provides
the ability to sum, count, average, and
perform other simple arithmetic operations
on groups of data.
 The remarkable characteristic of OLAP
reports is that they are dynamic.
 The viewer of the report can change the
report’s format; hence the term online.
 An OLAP report has measures and
dimensions.
 A measure is the data item of interest.
◦ It is the item that is to be summed or averaged or
otherwise processed in the OLAP report.
 A dimension is a characteristic of a measure.
◦ Purchase date, customer type, customer location, and
sales region are all examples of dimensions.
 With an OLAP report, it is possible to drill
down into the data.
◦ This term means to further divide the data into
more detail.
 Special-purpose products called OLAP servers
have been developed to perform OLAP
analysis.
 An OLAP server reads data from an
operational database, performs preliminary
calculations, and stores the results of those
operations in an OLAP database.
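A minimal sketch of measures, dimensions, and drill-down follows, using invented sales rows. Here `amount` plays the role of the measure, and `region`, `customer_type`, and `city` are the dimensions. A real OLAP server would precompute and store such totals rather than recalculate them for each request.

```python
from collections import defaultdict

# Hypothetical sales rows.
sales = [
    {"region": "West", "customer_type": "Retail", "city": "Seattle", "amount": 100},
    {"region": "West", "customer_type": "Retail", "city": "Portland", "amount": 40},
    {"region": "West", "customer_type": "Wholesale", "city": "Seattle", "amount": 300},
    {"region": "East", "customer_type": "Retail", "city": "Boston", "amount": 60},
]


def summarize(rows, dimensions, measure="amount"):
    """Sum the measure for every combination of the chosen dimensions."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)


by_region = summarize(sales, ["region"])                   # top-level view
drilled = summarize(sales, ["region", "city"])             # drill down: region, then city
regrouped = summarize(sales, ["customer_type", "region"])  # change the grouping structure

print(by_region)
print(drilled)
print(regrouped)
```

Drilling down is just adding a dimension to the grouping. Changing the order or choice of dimensions is what makes the report dynamic.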
 Basic reports and simple OLAP analyses can
be made directly from operational data.
 For the most part, such reports display the
current state of the business; and if there are
a few missing values or small inconsistencies
with the data, no one is too concerned.
 Operational data are unsuited to more
sophisticated analyses, particularly, data-
mining analyses that require high-quality
input for accurate and useful results.
 Many organizations choose to extract
operational data into facilities called data
warehouses and data marts, both of which
are facilities that prepare, store, and manage
data specifically for data mining and other
analyses.
 Programs read operational data and extract,
clean, and prepare that data for BI
processing.
 The prepared data are stored in a
data-warehouse database using a data-warehouse
DBMS, which can be different from the
organization’s operational DBMS.
 Data warehouses often include data that are
purchased from outside sources.
 Metadata concerning the data, its source, its
format, its assumptions and constraints, and
other facts about the data is kept in a data-
warehouse metadata database.
 The data-warehouse DBMS extracts and
provides data to business intelligence tools
such as data-mining programs.
 Most operational and purchased data have
problems that inhibit their usefulness for
business intelligence.
 Problematic data are termed dirty data.
◦ Examples are values of B for customer gender and
of 213 for customer age.
 Purchased data often contain missing elements.
◦ Most data vendors state the percentage of missing
values for each attribute in the data they sell.
◦ An organization buys such data because for some
uses, some data is better than no data at all.
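A cleaning program might flag dirty and missing values with checks like the ones sketched here. The valid gender codes and the age limit are assumptions chosen for the example; each organization would define its own rules.

```python
VALID_GENDERS = {"F", "M"}   # assumed coding scheme
MAX_PLAUSIBLE_AGE = 120      # assumed upper bound


def find_problems(record):
    """Return a list of data-quality problems found in one customer record."""
    problems = []
    gender = record.get("gender")
    age = record.get("age")
    if gender is None:
        problems.append("missing gender")
    elif gender not in VALID_GENDERS:
        problems.append(f"invalid gender: {gender!r}")
    if age is None:
        problems.append("missing age")
    elif not 0 <= age <= MAX_PLAUSIBLE_AGE:
        problems.append(f"implausible age: {age}")
    return problems


def missing_percentage(records, attribute):
    """Percentage of records with no value for the attribute, as a data vendor might state it."""
    missing = sum(1 for r in records if r.get(attribute) is None)
    return 100.0 * missing / len(records)


# Hypothetical records, including the two dirty values mentioned above.
records = [
    {"gender": "F", "age": 34},
    {"gender": "B", "age": 41},
    {"gender": "M", "age": 213},
    {"gender": None, "age": 29},
]

for r in records:
    print(r, find_problems(r))
print(missing_percentage(records, "gender"))
```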
 Inconsistent data are particularly common for
data that have been gathered over time.
◦ When an area code changes, for example, the
phone number for a given customer before the
change will not match the customer’s number after
the change.
 Some data inconsistencies occur from the
nature of the business activity.
 Nonintegrated data can cause problems when
data comes from different management
information systems.
 Data can be too fine or too coarse.
◦ It is possible to capture the customers’ clicking
behavior in what is termed clickstream data, which
includes everything a customer does at a Web site.
 If data are too fine or too coarse, that condition
is sometimes expressed by saying the data
have the wrong granularity.
 Because of a phenomenon called the curse of
dimensionality, the more attributes there are,
the easier it is to build a model that fits the
sample data but that is worthless as a
predictor.
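The effect can be demonstrated with a small simulation. In this sketch, both the outcome labels and the attributes are random coin flips, so no attribute has any real predictive value. The function and its parameters are hypothetical names chosen for the example.

```python
import random


def best_training_accuracy(n_customers, n_attributes, seed=0):
    """Best fit that any single random yes/no attribute achieves on random labels."""
    rng = random.Random(seed)
    labels = [rng.randint(0, 1) for _ in range(n_customers)]
    best = 0.0
    for _ in range(n_attributes):
        attribute = [rng.randint(0, 1) for _ in range(n_customers)]
        matches = sum(a == y for a, y in zip(attribute, labels))
        # An attribute that is usually wrong fits just as well once it is flipped.
        accuracy = max(matches, n_customers - matches) / n_customers
        best = max(best, accuracy)
    return best


few = best_training_accuracy(n_customers=20, n_attributes=5)
many = best_training_accuracy(n_customers=20, n_attributes=1000)
print(few, many)
```

Because the attributes are generated in the same order for a given seed, the best fit found among 1,000 attributes can never be worse than the best among the first 5. Any apparent fit is chance alone, and it would not carry over to new customers.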
 The data warehouse takes data from the data
manufacturers (operational systems and
purchased data), cleans and processes the
data, and locates the data on the shelves, so
to speak, of the data warehouse.
 A data mart is a data collection, smaller than
the data warehouse, that addresses a
particular component or functional area of
the business.
 The data warehouse is like the distributor in
the supply chain and the data mart is like the
retail store in the supply chain.
 Users in the data mart obtain data that
pertain to a particular business function from
the data warehouse.
 It is expensive to create, staff, and operate
data warehouses and data marts.
 Data mining is the application of statistical
techniques to find patterns and relationships
among data and to classify and predict.
 Data mining represents a convergence of
disciplines.
 Data-mining techniques emerged from
statistics and mathematics and from artificial
intelligence and machine-learning fields in
computer science.
 With unsupervised data mining, analysts do
not create a model or hypothesis before
running the analysis.
 Instead, they apply the data-mining
technique to the data and observe the results.
 Analysts create hypotheses after the analysis
to explain the patterns found.
 One common unsupervised technique is
cluster analysis.
◦ A common use for cluster analysis is to find groups
of similar customers from customer order and
demographic data.
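As a minimal sketch of cluster analysis, the code below runs k-means, one common clustering method, on a single attribute. The spending figures and starting centers are invented. Real analyses usually cluster on several order and demographic attributes at once.

```python
def k_means_1d(values, centers, iterations=10):
    """Group values around the nearest center, then move each center to its group's mean."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters


# Hypothetical annual spending per customer.
annual_spend = [100, 120, 110, 900, 950, 1000]
centers, clusters = k_means_1d(annual_spend, centers=[0, 500])
print(centers)
print(clusters)
```

The analyst supplies no hypothesis about which customers belong together. The two groups (low and high spenders) emerge from the data, and the analyst explains them afterward.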
 With supervised data mining, data miners
develop a model prior to the analysis and
apply statistical techniques to data to
estimate parameters of the model.
 One such analysis, which measures the
impact of a set of variables on another
variable, is called a regression analysis.
 Neural networks are another popular
supervised data-mining technique used to
predict values and make classifications such
as “good prospect” or “poor prospect”
customers.
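In the simplest case, a regression estimates the two parameters of a straight line. The sketch below uses ordinary least squares with a single input variable and made-up data. The slides describe a set of input variables, which needs the multivariable form of the same idea.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: years as a customer (x) and orders placed last year (y).
years = [1, 2, 3, 4]
orders_placed = [3, 5, 7, 9]

intercept, slope = fit_line(years, orders_placed)
print(intercept, slope)
```

The model (a straight line) is chosen before the analysis. The data are used only to estimate its parameters, which is what makes the technique supervised in the sense used here.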
 A market-basket analysis is a data-mining
technique for determining sales patterns.
 A market-basket analysis shows the products
that customers tend to buy together.
 In market-basket terminology, support is the
probability that two items will be purchased
together.
 You can expect market-basket analysis to
become a standard CRM analysis during your
career.
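Support can be computed directly from a list of transactions. The sketch below also computes confidence and lift, which appear in the key terms but are not defined in the slides; the definitions used are the conventional ones and are stated in the comments. The dive-shop transactions are invented.

```python
def support(transactions, *items):
    """Fraction of transactions that contain all of the given items."""
    wanted = set(items)
    return sum(1 for t in transactions if wanted <= t) / len(transactions)


def confidence(transactions, a, b):
    """Conventional definition: of the transactions containing a, the fraction that also contain b."""
    return support(transactions, a, b) / support(transactions, a)


def lift(transactions, a, b):
    """Conventional definition: confidence of a -> b divided by the base rate of b."""
    return confidence(transactions, a, b) / support(transactions, b)


# Hypothetical transactions.
transactions = [
    {"mask", "fins"},
    {"mask", "fins", "tank"},
    {"mask"},
    {"tank"},
    {"fins", "tank"},
]

print(support(transactions, "mask", "fins"))
print(confidence(transactions, "mask", "fins"))
print(lift(transactions, "mask", "fins"))
```

A lift above 1 suggests that buying one item makes the other more likely than its base rate, and a lift near 1 suggests no relationship.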
 A decision tree is a hierarchical arrangement
of criteria that predict a classification or a
value.
 Decision tree analyses are treated here as an
unsupervised data-mining technique, because the analyst
does not specify a model in advance.
 The analyst sets up the computer program
and provides the data to analyze, and the
decision tree program produces the tree.
 A common business application of decision
trees is to classify loans by likelihood of
default.
 Organizations analyze data from past loans
to produce a decision tree that can be
converted to loan-decision rules.
◦ A financial institution could use such a tree to
assess the default risk on a new loan.
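A decision tree program chooses its criteria by testing candidate splits against past outcomes. The sketch below finds a single split on one attribute, which is the smallest possible tree, and turns it into an if-then rule. The loan records and the attribute name are hypothetical, and the sketch assumes that higher values indicate higher risk.

```python
def best_split(loans, attribute):
    """Find the threshold on one numeric attribute that misclassifies the fewest past loans."""
    values = sorted({loan[attribute] for loan in loans})
    best = None
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        # Rule under test: predict default when the attribute exceeds the threshold.
        errors = sum((loan[attribute] > threshold) != loan["defaulted"] for loan in loans)
        if best is None or errors < best[1]:
            best = (threshold, errors)
    return best


def make_rule(attribute, threshold):
    """Convert the split into a loan-decision rule for new applications."""
    def rule(loan):
        return "high default risk" if loan[attribute] > threshold else "low default risk"
    return rule


# Hypothetical past loans; pct_past_due is the percentage of payments that were late.
past_loans = [
    {"pct_past_due": 10, "defaulted": False},
    {"pct_past_due": 20, "defaulted": False},
    {"pct_past_due": 40, "defaulted": False},
    {"pct_past_due": 60, "defaulted": True},
    {"pct_past_due": 80, "defaulted": True},
]

threshold, errors = best_split(past_loans, "pct_past_due")
rule = make_rule("pct_past_due", threshold)
print(threshold, errors)
print(rule({"pct_past_due": 70}))
```

A full decision tree program repeats this search on each resulting subgroup and over many attributes, usually with a purity measure rather than a raw error count.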
Source: Used with permission of Insightful Corporation. Copyright © 1999-2005 Insightful Corporation. All Rights Reserved.
 Knowledge management systems concern the
sharing of knowledge that is already known
to exist, either in libraries of documents, in
the heads of employees, or in other known
sources.
 Knowledge management (KM) is the process
of creating value from intellectual capital and
sharing that knowledge with employees,
managers, suppliers, customers, and others
who need that capital.
 Knowledge management is a process that is
supported by the five components of an
information system.
◦ Its emphasis is on people, their knowledge, and
effective means for sharing that knowledge with
others.
 The benefits of KM concern the application of
knowledge to enable employees and others to
leverage organizational knowledge to work
smarter.
 KM preserves organizational memory by
capturing and storing the lessons learned and
best practices of key employees.
 Content management systems are
information systems that track organizational
documents, Web pages, graphics, and related
materials.
 Such systems differ from operational
document systems in that they do not directly
support business operations.
 KM content management systems are
concerned with the creation, management,
and delivery of documents that exist for the
purpose of imparting knowledge.
 Typical users of content management
systems are companies that sell complicated
products and want to share their knowledge
of those products with employees and
customers.
 The basic functions of content management
systems are the same as for report
management systems: author, manage, and
deliver.
 The only requirement that content managers
place on document authoring is that the
document has been created in a standardized
format.
 Content management functions are, however,
exceedingly complicated.
 Most content databases are huge; some have
thousands of individual documents, pages, and
graphics.
 Documents may refer to one another or multiple
documents may refer to the same product or
procedure.
◦ When one of them changes, others must change as
well.
◦ Some content management systems keep semantic
linkages among documents so that content
dependencies can be known and used to maintain
document consistency.
 Document contents are perishable.
◦ Documents become obsolete and need to be altered,
removed, or replaced.
 Multinational companies have to ensure
document language translations.
Source: microsoft.com/backstage/inside.htm (accessed February 2004). © 2003 Microsoft Corporation. All rights reserved.
Source: Used with permission of Tom Rizzo of Microsoft Corporation.
 Almost all users of content management
systems pull the contents.
 Users cannot pull content if they do not know
it exists.
◦ The content must be arranged and indexed, and a
facility for searching the content devised.
 Documents that reside behind a corporate
firewall, however, are not publicly accessible
and will not be reachable by Google or other
search engines.
◦ Organizations must index their own proprietary
documents and provide their own search capability
for them.
 Web browsers and other programs can readily
format content expressed in HTML, PDF, or
another standard format.
 XML documents often contain their own
formatting rules that browsers can interpret.
◦ The content management system will have to
determine an appropriate format for content
expressed in other ways.
 Nothing is more frustrating for a manager to
contemplate than the situation in which one
employee struggles with a problem that another
employee knows how to solve easily.
 KM systems are concerned with the sharing not
only of content, but also with the sharing of
knowledge among humans.
◦ How can one person share her knowledge with another?
◦ How can one person learn of another person’s great
idea?
 Three forms of technology are used for
knowledge-sharing among humans:
◦ Portals, discussion groups, and email
◦ Collaboration systems
◦ Expert systems
Portals
◦ Employees can share ideas by posting knowledge on a
Web portal whereby managers and employees can pull
the knowledge from the portal.
Discussion Groups
◦ Discussion groups allow employees or customers to
post questions and queries seeking solutions to
problems they have.
◦ Oracle, IBM, PeopleSoft, and other vendors support
product discussion groups where users can post
questions and where employees, vendors, and other
users can answer them.
◦ Later, the organization can edit and summarize the
questions from such discussion groups into frequently
asked questions (FAQs).
Discussion groups (continued)
◦ Basic email can also be used for knowledge-sharing,
especially if email lists have been constructed with KM
in mind.
◦ Two human factors inhibit knowledge-sharing.
 Employees can be reluctant to exhibit their ignorance.
 Competition exists between employees.
◦ A KM application may be ill-suited to a competitive
group.
 The company may be able to restructure rewards and
incentives to foster sharing of ideas among employees.
Collaboration Systems
◦ Collaboration systems are information systems that
enable people to work together more effectively.
◦ The Internet can be used as a broadcast medium for
speeches, panel discussions, and other types of
meetings.
◦ Web broadcasts, because they are digital, can be readily
saved and replayed at the viewer’s convenience.
◦ Web broadcasts can also be made interactive by
combining them with discussion group bulletin boards
that are live during the broadcast.
◦ Video conferencing is another popular form of
IT-supported meeting.
 Video-conferencing equipment is expensive and normally is
located in selected sites in the organization.
Collaboration Systems (continued)
◦ Net meetings are a means by which individuals can
participate in remote meetings without leaving their
desk.
 With a speaker and a Web camera, virtual meetings can be
conducted among employees who sit in their own offices.
Expert Systems
◦ Expert systems are created by interviewing experts in a
given business domain and codifying the rules stated
by those experts.
◦ Many expert systems were created in the late 1980s
and 1990s, and some of them have been successful.
◦ Expert systems suffer from three major disadvantages.
 They are difficult and expensive to develop.
 They are difficult to maintain.
 They were unable to live up to the high expectations set by
their name.
 Enormous amounts of data are generated
each year.
 Business intelligence (BI) tools search these
increasing amounts of data for useful
information.
 Reporting tools, which tend to be used for
assessment, process data using simple
calculations such as sums and averages.
 Data-mining tools, which tend to be used for
prediction, process data using sophisticated
statistical and mathematical techniques.
 Reporting systems create meaningful
information from disparate data sources and
deliver that information to the proper user on
a timely basis.
 RFM and OLAP are two examples of report
applications.
 Data warehouses and data marts are facilities
that prepare, store, and manage data for data
mining and other analyses.
 Market-basket analysis determines
groups of products that customers tend to
purchase together.
 Decision trees are used to construct
“If…Then…” rules for predicting
classifications.
 Knowledge management is the process of
creating value from intellectual capital and
sharing that knowledge with employees,
managers, suppliers, customers, and others
who need that capital.
 Human knowledge-sharing systems use
portals, bulletin boards, and email to
facilitate knowledge interchange.
 Collaboration systems include net
conferencing and video conferencing; expert
systems codify rules stated by experts.
Business intelligence (BI)
systems
Business intelligence (BI)
tools
Clickstream data
Cluster analysis
Collaboration systems
Confidence
Content management
systems
Curse of dimensionality
Data mart
Data mining
Data-mining tools
Data warehouse
Decision trees
Digital dashboard
Dimension
Dirty data
Discussion groups
Drill down
Dynamic report
Exabyte
Expert Systems
Frequently asked
questions (FAQs)
Granularity
If…then…rules
Knowledge management
(KM)
Lift
Market-basket analysis
Measure
Neural networks
OLAP cube
OLAP server
Online analytical processing
(OLAP)
Petabyte
Portals
Pull report
Push report
Query report
Regression analysis
Report media
Report mode
Report type
Reporting systems
Reporting tools
RFM analysis
Semantic security
Static report
Supervised data mining
Support
Unsupervised data mining
Security is a very difficult problem, and it gets
worse every year.
Physical security is hard enough: How do we know
that the person (or program) that signs on as
Megan Cho is really Megan Cho?
 We use passwords, but files of passwords can be
stolen.
Suppose Megan works in the HR department, so
she has access to personal and private data of
other employees.
We need to design the reporting system so that
Megan can access all of the data she needs to do
her job, and no more.
A reporting server is an obvious and juicy target
for any would-be intruder.
 Someone can break in and change access permissions.
 Or, a hacker could pose as someone else to obtain
reports.
Semantic security concerns the unintended
release of a combination of reports or documents
that are independently not protected.
Megan was given just two reports to do her job.
 Yet by combining the information in those reports with
publicly available information, she was able to deduce
salaries for at least some employees.
 These salaries are much more than she is supposed to
know.
 This is a semantic security problem.
The product managers wanted the data miners to
analyze customer clicks on a Web page to
determine customer preferences for particular
product lines.
 The products were competing with one another for
resources.
 “Sampling?” asked the product managers in a chorus.
 “Sampling? No way. We want all the data. This is
important, and we don’t want a guess.”
There’s nothing wrong with sampling.
 Properly done, the results from a sample are just as
accurate as results from the complete data set.
 Studies done from samples are also cheaper and faster.
 Sampling is a great way to save time and money.
In truth, skill is required to develop a good sample.
 The product managers should have listened to the data
miners’ sampling plan and ensured that the sample
would be appropriate, given the goals of the study.
 Understanding this concept will save you and your
organization substantial money!
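A quick simulation shows why a properly drawn sample can stand in for the complete data set. The click data are synthetic: the 30% click rate, the population size, and the sample size are all assumptions chosen for the example.

```python
import random

rng = random.Random(42)

# Hypothetical clickstream: 1 means the visitor clicked on a given product line.
population = [1 if rng.random() < 0.3 else 0 for _ in range(100_000)]

# A simple random sample of 2% of the visits.
sample = rng.sample(population, 2_000)

full_rate = sum(population) / len(population)
sample_rate = sum(sample) / len(sample)

print(f"complete data: {full_rate:.3f}")
print(f"sample:        {sample_rate:.3f}")
```

The two rates should differ only slightly, while the sample needs one-fiftieth of the processing. The skill lies in drawing the sample so that it represents the population for the question being asked. A simple random sample is only one option.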
Classification is a useful human skill.
Sorting and classifying are necessary, important,
and essential activities.
 But those activities can also be dangerous.
Serious ethical issues arise when we classify
people.
 What makes someone a good or bad “prospect”?
 If we’re talking about classifying customers in order to
prioritize our sales calls, then the ethical issue may not
be too serious.
 What about classifying applicants for college?
I’m not really a contrarian about data mining.
 I believe in it.
 But data mining in the real world is a lot different from
the way it’s described in textbooks.
 One problem is that data are always dirty, with missing
values, values way out of the range of possibility, and
time values that make no sense.
“Another problem is that you know the least when
you start the study.”
 So you work for a few months and learn that if you had
another variable, say the customer’s zip code, or age, or
something else, you could do a much better analysis.
Overfitting is another problem, a huge one.
 With neural networks, you can create a model of any
level of complexity you want, except that none of those
equations will predict new cases with any accuracy at all.
 When using neural nets, you have to be very careful not
to overfit the data.
Another problem is seasonality:
 Say all your training data are from the summer. Will your
model be valid for the winter?
“When you start a data-mining project, you never
know how it will turn out.”
 Some projects were bad and a waste of time.
 Some were good: they found interesting and
important patterns and information and created very
accurate predictive models.
It’s not easy, though; you have to be very careful
and lucky.
Computer simulation of World War III, a project at
the Pentagon, 1971-1973
Analysis process
 Run the simulation and obtain a set of results.
 The military analysts and weapons experts would
examine the results, and if results weren’t quite what
was expected or wanted, the analysts would ask to
change some of the inputs or a portion of the model.
 Over time, an accumulated set of results was approved.
 The accumulated results were presented to the four-star
generals and other senior Pentagon managers.
 Sometimes these senior people would see problems
in the analyses and give instructions to discard
some of the results.
Observation
 I do not believe that anyone thought they were deceiving
anyone else.
 The top managers didn’t realize that the results they saw
left out a substantial portion of the unfavorable
simulations.
 They never knew about the other results.
 The analysts who were filtering the outcomes by
throwing out the numbers they didn’t like weren’t being dishonest.
 They simply thought that those results were wrong
or unrealistic.
 I do not think they realized they were using the
computer to promulgate their prior ideas about
military needs.
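The effect of this kind of filtering can be shown with a toy calculation. The sketch below is purely illustrative and assumes hypothetical numbers: each simulation run produces a single outcome score (higher is more favorable), and reviewers discard any run scoring below a threshold because it looks "wrong or unrealistic". None of these values come from the Pentagon project.

```python
import random

rng = random.Random(1)

# Hypothetical outcome scores for 1,000 simulation runs.
all_runs = [rng.gauss(0.0, 1.0) for _ in range(1000)]

# Reviewers keep only the runs that look plausible to them.
THRESHOLD = -0.5  # assumed cutoff, chosen only for illustration
approved = [score for score in all_runs if score > THRESHOLD]

mean_all = sum(all_runs) / len(all_runs)
mean_approved = sum(approved) / len(approved)

print(f"runs kept: {len(approved)} of {len(all_runs)}")
print(f"mean of all runs:      {mean_all:.2f}")
print(f"mean of approved runs: {mean_approved:.2f}")
```

Nobody in this sketch changes a single number, yet the approved set looks more favorable than the full set, which is the mechanism the observation describes.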
Questions to think about
 Why perform the analysis?
 What are you going to do with the results?
 What is it that you want to know or to decide?
Answer the questions above before you begin the
analysis.
 Then, pay attention to the results.
 Don’t argue with the data.
 If the results don’t conform to your expectations, think
long and hard before changing the model, adjusting the
data, or modifying the answers.

DevEX - reference for building teams, processes, and platforms
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 

Mis2013 chapter 12 business intelligence and knowledge management

  • 1. David Kroenke Business Intelligence and Knowledge Management Chapter 9 © 2007 Prentice Hall, Inc. 1
  • 2. Understand the need for business intelligence systems. Know the characteristics of reporting systems. Know the purpose and role of data warehouses and data marts. Understand fundamental data-mining techniques. Know the purpose, features, and functions of knowledge management systems.
  • 3. According to a study done at the University of California at Berkeley, a total of 403 petabytes of new data were created. 403 petabytes is roughly the amount of all printed material ever written. ◦ The printed collection of the Library of Congress is 0.01 petabytes. ◦ 400 petabytes equals 40,000 copies of the print collection of the Library of Congress.
  • 4. The generation of all these data has much to do with Moore’s Law. The capacity of storage devices increases as their costs decrease. Today, storage capacity is nearly unlimited. We are drowning in data and starving for information.
  • 5. Source: Used with permission of Peter Lyman and Hal R. Varian, University of California at Berkeley.
  • 6. Source: Used with permission of Peter Lyman and Hal R. Varian, University of California at Berkeley.
  • 7. Tools for searching business data in an attempt to find patterns are called business intelligence (BI) tools. Reporting tools are programs that read data from a variety of sources, process that data, produce formatted reports, and deliver those reports to the users who need them.
  • 8. The processing of data is simple: ◦ Data are sorted and grouped. ◦ Simple totals and averages are calculated. Reporting tools are used primarily for assessment. ◦ They are used to address questions like: What has happened in the past? What is the current situation? How does the current situation compare to the past?
  • 9. Data-mining tools process data using statistical techniques, many of which are sophisticated and mathematically complex. Data mining involves searching for patterns and relationships among data. In most cases, data-mining tools are used to make predictions. For example, we can use one form of analysis to compute the probability that a customer will default on a loan.
  • 10. Another way to distinguish reporting tools from data-mining tools: ◦ Reporting tools use simple operations like sorting, grouping, and summing. ◦ Data-mining tools use sophisticated techniques.
  • 11. An information system is a collection of hardware, software, data, procedures, and people. The purpose of a business intelligence (BI) system is to provide the right information, to the right user, at the right time. BI systems help users accomplish their goals and objectives by producing insights that lead to actions.
  • 12. A reporting tool can generate a report that shows a customer has canceled an important order. A reporting system, however, alerts that customer’s salesperson with this unwanted news, and does so in time for the salesperson to try to alter the customer’s decision. A data-mining tool can create an equation that computes the probability that a customer will default on a loan.
  • 13. A data-mining system uses that equation to enable banking personnel to assess new loan applications.
  • 14. The purpose of a reporting system is to create meaningful information from disparate data sources and to deliver that information to the proper user on a timely basis. Reporting systems generate information from data as a result of four operations: ◦ Filtering data ◦ Sorting data ◦ Grouping data ◦ Making simple calculations on the data
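The four reporting operations listed above can be sketched in a few lines of Python. This is only an illustration: the order records, the field names, and the `build_report` helper are all assumptions made for the example, not part of the original slides.

```python
from collections import defaultdict

# Hypothetical order records; the field names are assumptions for illustration.
orders = [
    {"customer": "Ajax", "region": "West", "amount": 120.0},
    {"customer": "Baker", "region": "East", "amount": 75.0},
    {"customer": "Ajax", "region": "West", "amount": 30.0},
    {"customer": "Chen", "region": "East", "amount": 210.0},
    {"customer": "Dole", "region": "West", "amount": 15.0},
]

def build_report(records, min_amount):
    # 1. Filter: keep only orders at or above the threshold.
    kept = [r for r in records if r["amount"] >= min_amount]
    # 2. Sort: order the remaining rows by region, then customer.
    kept.sort(key=lambda r: (r["region"], r["customer"]))
    # 3. Group: collect the order amounts for each region.
    groups = defaultdict(list)
    for r in kept:
        groups[r["region"]].append(r["amount"])
    # 4. Simple calculations: total and average per group.
    return {
        region: {"total": sum(amounts), "average": sum(amounts) / len(amounts)}
        for region, amounts in groups.items()
    }

report = build_report(orders, min_amount=20.0)
for region, stats in report.items():
    print(f"{region}: total={stats['total']:.2f} average={stats['average']:.2f}")
```

Nothing here goes beyond sorting, grouping, and summing, which is the point the slides make about reporting tools.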
  • 17. A reporting system maintains a database of reporting metadata. The metadata describes the reports, users, groups, roles, events, and other entities involved in the reporting activity. The reporting system uses the metadata to prepare and deliver reports to the proper users on a timely basis.
  • 20. In terms of report type, reports can be static or dynamic. Static reports are prepared once from the underlying data, and they do not change. ◦ Example: a report of the past year’s sales. Dynamic reports: the reporting system reads the most current data and generates the report using that fresh data. ◦ Examples: a report on sales today and a report on current stock prices.
  • 21. Query reports are prepared in response to data entered by users. Online analytical processing (OLAP) reports allow the user to dynamically change the report grouping structures.
  • 22. Reports are delivered via many different report media or channels. Some reports are printed on paper, and others are created in a format like PDF whereby they can be printed or viewed electronically. Other reports are delivered to computer screens. Companies sometimes place reports on internal corporate Web sites for employees to access.
  • 23. Another report medium is a digital dashboard, which is an electronic display customized for a particular user. ◦ Vendors like Yahoo! and MSN provide common examples. ◦ Users of these services can define the content they want, say, a local weather forecast, a list of stock prices, or a list of news sources. ◦ The vendor constructs the display customized for each user.
  • 24. Other dashboards are particular to an organization. ◦ The organization might have a dashboard that shows up-to-the-minute production and sales activities. Alerts are another form of report. ◦ Users can declare that they wish to receive notifications of events, say, via email or on their cell phones. Reports can be published via a Web service. ◦ The Web service produces the report in response to requests from the service-consuming application.
  • 26. The report mode can be either push report or pull report. Organizations send a push report to users according to a preset schedule. ◦ Users receive the report without any activity on their part. Users must request a pull report. ◦ To obtain a pull report, a user goes to a Web portal or digital dashboard and clicks a link or button to cause the reporting system to produce and deliver the report.
  • 27. Three functions of reporting systems are: ◦ Authoring ◦ Management ◦ Delivery. Report authoring involves connecting to data sources, creating the reporting structure, and formatting the report.
  • 28. Source: Microsoft product screen shot reprinted with permission from Microsoft Corporation.
  • 29. Source: Microsoft product screen shot reprinted with permission from Microsoft Corporation.
  • 30. The purpose of report management is to define who receives what reports, when, and by what means. Most report-management systems allow the report administrator to define user accounts and user groups and to assign particular users to particular groups. Reports that have been created using the report-authoring system are assigned to groups and users.
  • 31. Assigning reports to groups saves the administrator work. ◦ When a report is created, changed, or removed, the administrator need only change the report assignments to the group. ◦ All of the users in the group will inherit the changes. Metadata also indicates what channel is to be used and whether the report is to be pushed or pulled. ◦ If the report is to be pushed, the administrator declares whether the report is to be generated on a regular schedule or as an alert.
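A rough sketch of how group-based report assignment might look as metadata is shown below. The group names, user names, report names, and the `reports_for` helper are hypothetical; real report-management products store this metadata in their own formats.

```python
# Hypothetical reporting metadata: users belong to groups, and reports are
# assigned to groups together with a channel and a push/pull mode.
group_members = {
    "regional_sales": {"alice", "bob"},
    "executives": {"carol"},
}
report_assignments = {
    "regional_sales": [
        {"report": "RFM by region", "channel": "email", "mode": "push"},
    ],
    "executives": [
        {"report": "RFM all customers", "channel": "portal", "mode": "pull"},
    ],
}

def reports_for(user):
    """Return the names of the reports a user inherits through group membership."""
    found = []
    for group, members in group_members.items():
        if user in members:
            found.extend(a["report"] for a in report_assignments.get(group, []))
    return sorted(found)

# One change to the group's assignments reaches every member of the group.
report_assignments["regional_sales"].append(
    {"report": "Daily sales", "channel": "dashboard", "mode": "pull"}
)

print(reports_for("alice"))
print(reports_for("bob"))
```

Because users are looked up through their groups, the administrator edits one assignment list rather than one entry per user.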
  • 32. The report-delivery function of a reporting system pushes reports or allows them to be pulled according to report-management metadata. Reports can be delivered via an email server, Web site, XML Web services, or by other program-specific means. The report-delivery system uses the operating system and other program security components to ensure that only authorized users receive authorized reports.
  • 33. The report-delivery system also ensures that push reports are produced at appropriate times. For query reports, the report-delivery system serves as an intermediary between the user and the report generator. ◦ It receives user query data, such as item numbers in an inventory query, passes the query data to the report generator, receives the resulting report, and delivers the report to the user.
  • 34. RFM analysis is a way of analyzing and ranking customers according to their purchasing patterns. It is a simple technique that considers how recently (R) a customer has ordered, how frequently (F) a customer orders, and how much money (M) the customer spends per order. To produce an RFM score, the program first sorts customer purchase records by the date of their most recent (R) purchase.
  • 35. In a common form of this analysis, the program then divides the customers into five groups and gives customers in each group a score of 1 to 5. ◦ The top 20% of the customers having the most recent orders are given an R score of 1 (highest). The program then re-sorts the customers on the basis of how frequently they order. ◦ The top 20% of the customers who order most frequently are given an F score of 1 (highest). Finally, the program sorts the customers again according to the amount spent on their orders. ◦ The 20% who have ordered the most expensive items are given an M score of 1 (highest).
  • 36. A reporting system can generate the RFM data and deliver it in many ways: ◦ A report of RFM scores for all customers can be pushed to the vice president of sales. ◦ Reports with scores for particular regions can be pushed to regional sales managers. ◦ Reports of scores for particular accounts can be pushed to the account salespeople. ◦ All of this reporting can be automated.
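The RFM scoring procedure described above (sort, split into five groups, score 1 for the top 20%) could be sketched as follows. The customer records and the `quintile_scores` helper are invented for illustration, and the average spend per order stands in for the M measure.

```python
from datetime import date

# Hypothetical customer summaries; all names and numbers are made up.
customers = [
    {"name": "Ajax", "last_order": date(2006, 12, 20), "orders": 40, "avg_spend": 900.0},
    {"name": "Baker", "last_order": date(2006, 11, 2), "orders": 25, "avg_spend": 150.0},
    {"name": "Chen", "last_order": date(2006, 6, 15), "orders": 5, "avg_spend": 1200.0},
    {"name": "Dole", "last_order": date(2005, 9, 30), "orders": 12, "avg_spend": 60.0},
    {"name": "Evans", "last_order": date(2004, 1, 10), "orders": 2, "avg_spend": 300.0},
]

def quintile_scores(records, key):
    """Sort records from best to worst on `key`; score 1 for the top 20%, 5 for the bottom."""
    ranked = sorted(records, key=key, reverse=True)
    n = len(ranked)
    return {r["name"]: (i * 5) // n + 1 for i, r in enumerate(ranked)}

r_score = quintile_scores(customers, key=lambda c: c["last_order"])  # most recent first
f_score = quintile_scores(customers, key=lambda c: c["orders"])      # most frequent first
m_score = quintile_scores(customers, key=lambda c: c["avg_spend"])   # biggest spenders first

rfm = {c["name"]: (r_score[c["name"]], f_score[c["name"]], m_score[c["name"]])
       for c in customers}
for name, scores in rfm.items():
    print(name, scores)
```

With five customers each quintile holds exactly one customer; with a larger list the same integer arithmetic places roughly 20% of customers in each score band.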
  • 38. Online analytical processing (OLAP) provides the ability to sum, count, average, and perform other simple arithmetic operations on groups of data. The remarkable characteristic of OLAP reports is that they are dynamic. The viewer of the report can change the report’s format; hence the term online.
  • 39. An OLAP report has measures and dimensions. A measure is the data item of interest. ◦ It is the item that is to be summed or averaged or otherwise processed in the OLAP report. A dimension is a characteristic of a measure. ◦ Purchase date, customer type, customer location, and sales region are all examples of dimensions.
  • 40. With an OLAP report, it is possible to drill down into the data. ◦ This term means to further divide the data into more detail. Special-purpose products called OLAP servers have been developed to perform OLAP analysis. An OLAP server reads data from an operational database, performs preliminary calculations, and stores the results of those operations in an OLAP database.
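To make measures, dimensions, and drill-down concrete, here is a small sketch in plain Python. The sales rows and the `summarize` helper are assumptions for the example; an actual OLAP server would precompute and store such aggregates rather than recompute them on each request.

```python
from collections import defaultdict

# Hypothetical sales rows: "amount" is the measure; the other fields are dimensions.
sales = [
    {"region": "West", "customer_type": "Retail", "amount": 100.0},
    {"region": "West", "customer_type": "Wholesale", "amount": 250.0},
    {"region": "East", "customer_type": "Retail", "amount": 80.0},
    {"region": "East", "customer_type": "Retail", "amount": 40.0},
]

def summarize(rows, dimensions, measure):
    """Sum the measure for every combination of the chosen dimensions."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)

# Top-level view: total sales by region.
by_region = summarize(sales, ["region"], "amount")

# Drill down: divide each region further by customer type.
by_region_and_type = summarize(sales, ["region", "customer_type"], "amount")

print(by_region)
print(by_region_and_type)
```

Changing the list of dimensions is the code-level counterpart of a viewer rearranging the grouping structure of an OLAP report.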
  • 45. Basic reports and simple OLAP analyses can be made directly from operational data. For the most part, such reports display the current state of the business; and if there are a few missing values or small inconsistencies with the data, no one is too concerned. Operational data are unsuited to more sophisticated analyses, particularly data-mining analyses that require high-quality input for accurate and useful results.
  • 46. Many organizations choose to extract operational data into facilities called data warehouses and data marts, both of which are facilities that prepare, store, and manage data specifically for data mining and other analyses. Programs read operational data and extract, clean, and prepare that data for BI processing. The prepared data are stored in a data-warehouse database using a data-warehouse DBMS, which can be different from the organization’s operational DBMS.
  • 47. Data warehouses include data that are purchased from outside sources. Metadata concerning the data, its source, its format, its assumptions and constraints, and other facts about the data is kept in a data-warehouse metadata database. The data-warehouse DBMS extracts and provides data to business intelligence tools such as data-mining programs.
  • 50. Most operational and purchased data have problems that inhibit their usefulness for business intelligence. Problematic data are termed dirty data. ◦ Examples are values of B for customer gender and of 213 for customer age. Purchased data often contain missing elements. ◦ Most data vendors state the percentage of missing values for each attribute in the data they sell. ◦ An organization buys such data because for some uses, some data is better than no data at all.
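A data-cleaning step might flag dirty or missing values before they reach an analysis. The sketch below uses the two examples from the slide (a gender of B and an age of 213); the set of valid gender codes, the age range, and the `find_problems` helper are assumptions chosen for illustration.

```python
# Assumed validation rules; a real warehouse would take these from its metadata.
VALID_GENDERS = {"M", "F"}
MIN_AGE, MAX_AGE = 0, 120

def find_problems(record):
    """Return a list of human-readable problems found in one customer record."""
    problems = []
    # Missing elements: common in purchased data.
    for field in ("gender", "age"):
        if record.get(field) is None:
            problems.append(f"{field}: missing")
    # Dirty values: present but not believable.
    gender = record.get("gender")
    if gender is not None and gender not in VALID_GENDERS:
        problems.append(f"gender: unexpected value {gender!r}")
    age = record.get("age")
    if age is not None and not (MIN_AGE <= age <= MAX_AGE):
        problems.append(f"age: out of range {age}")
    return problems

print(find_problems({"gender": "B", "age": 213}))
print(find_problems({"gender": "F", "age": 34}))
print(find_problems({"gender": "M"}))
```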
  • 51. Inconsistent data are particularly common for data that have been gathered over time. ◦ When an area code changes, for example, the phone number for a given customer before the change will not match the customer’s number after the change. Some data inconsistencies occur from the nature of the business activity. Nonintegrated data can cause problems when data comes from different management information systems.
  • 52. Data can be too fine or too coarse. ◦ It is possible to capture the customers’ clicking behavior in what is termed clickstream data, which includes everything a customer does at a Web site. If data are at the wrong level of detail, that condition is sometimes expressed by saying the data have the wrong granularity. Because of a phenomenon called the curse of dimensionality, the more attributes there are, the easier it is to build a model that fits the sample data but that is worthless as a predictor.
  • 54. The data warehouse takes data from the data manufacturers (operational systems and purchased data), cleans and processes the data, and locates the data on the shelves, so to speak, of the data warehouse. A data mart is a data collection, smaller than the data warehouse, that addresses a particular component or functional area of the business.
  • 55. The data warehouse is like the distributor in the supply chain and the data mart is like the retail store in the supply chain. Users in the data mart obtain data that pertain to a particular business function from the data warehouse. It is expensive to create, staff, and operate data warehouses and data marts.
  • 57. Data mining is the application of statistical techniques to find patterns and relationships among data and to classify and predict. Data mining represents a convergence of disciplines. Data-mining techniques emerged from statistics and mathematics and from artificial intelligence and machine-learning fields in computer science.
  • 59. With unsupervised data mining, analysts do not create a model or hypothesis before running the analysis. Instead, they apply the data-mining technique to the data and observe the results. Analysts create hypotheses after the analysis to explain the patterns found.
  • 60. One common unsupervised technique is cluster analysis. ◦ A common use for cluster analysis is to find groups of similar customers from customer order and demographic data.
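As one possible illustration of cluster analysis, the sketch below runs a very small k-means procedure on a single attribute. The slides do not name a specific clustering algorithm, so k-means is a swapped-in choice; the spending figures, starting centers, and `kmeans_1d` helper are likewise hypothetical.

```python
def kmeans_1d(values, centers, iterations=10):
    """Tiny k-means for one-dimensional data, starting from the given centers."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster with the nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical annual spending per customer.
annual_spend = [100, 120, 110, 900, 950, 1000]
centers, clusters = kmeans_1d(annual_spend, centers=[100, 1000])
print(centers)
print(clusters)
```

No model of "low spenders" and "high spenders" is supplied in advance; the two groups emerge from the data, and the analyst interprets them afterward.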
  • 61. With supervised data mining, data miners develop a model prior to the analysis and apply statistical techniques to data to estimate parameters of the model. One such analysis, which measures the impact of a set of variables on another variable, is called a regression analysis. Neural networks are another popular supervised data-mining technique used to predict values and make classifications such as “good prospect” or “poor prospect” customers.
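A minimal sketch of regression analysis with a single explanatory variable is shown below, using ordinary least squares. The model (a straight line) is chosen before the analysis, and the data are used only to estimate its two parameters. The advertising and sales figures and the `fit_line` helper are made up for the example.

```python
def fit_line(xs, ys):
    """Estimate slope and intercept of y = slope * x + intercept by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: advertising spend (x) and resulting sales (y).
ad_spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(ad_spend, sales)
predicted = slope * 5.0 + intercept  # predicted sales at a spend of 5.0
print(slope, intercept, predicted)
```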
  • 62. A market-basket analysis is a data-mining technique for determining sales patterns. A market-basket analysis shows the products that customers tend to buy together. In market-basket terminology, support is the probability that two items will be purchased together. You can expect market-basket analysis to become a standard CRM analysis during your career.
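Support can be computed directly from a list of transactions, as in the sketch below. The transactions and item names are invented. The slide defines only support; confidence and lift appear in the chapter's key terms, and the formulas used for them here are the standard textbook definitions rather than something stated in the slides.

```python
# Hypothetical transactions from a dive shop; each set is one customer's basket.
transactions = [
    {"mask", "fins"},
    {"mask", "fins", "tank"},
    {"mask"},
    {"tank"},
]

def support(*items):
    """Fraction of transactions that contain all of the given items."""
    wanted = set(items)
    hits = sum(1 for basket in transactions if wanted <= basket)
    return hits / len(transactions)

def confidence(a, b):
    """Estimated probability of buying b given that a was bought."""
    return support(a, b) / support(a)

def lift(a, b):
    """Confidence of a -> b relative to the baseline probability of buying b."""
    return confidence(a, b) / support(b)

print(support("mask", "fins"))
print(confidence("mask", "fins"))
print(lift("mask", "fins"))
```

A lift above 1 suggests that buying the first item makes the second more likely than it is in general.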
  • 64. A decision tree is a hierarchical arrangement of criteria that predict a classification or a value. Decision tree analyses are an unsupervised data-mining technique. The analyst sets up the computer program and provides the data to analyze, and the decision tree program produces the tree.
  • 66. A common business application of decision trees is to classify loans by likelihood of default. Organizations analyze data from past loans to produce a decision tree that can be converted to loan-decision rules. ◦ A financial institution could use such a tree to assess the default risk on a new loan.
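Once a tree has been produced, its branches can be read off as "If…then…" rules. The sketch below shows what such loan-decision rules might look like in code. Every attribute name and threshold here is hypothetical; the tree from the original figure is not reproduced in this transcript, so these are not the author's rules.

```python
def classify_loan(percent_past_due, credit_score, loan_amount):
    """Apply hypothetical if-then rules of the kind a decision tree could yield."""
    # Rule 1 (assumed): loans that are mostly past due are likely to default.
    if percent_past_due >= 50:
        return "likely default"
    # Rule 2 (assumed): a weak credit score combined with a large loan is risky.
    if credit_score < 600 and loan_amount > 50_000:
        return "likely default"
    # Otherwise the loan falls in a leaf with a low historical default rate.
    return "unlikely default"

print(classify_loan(percent_past_due=60, credit_score=720, loan_amount=10_000))
print(classify_loan(percent_past_due=10, credit_score=550, loan_amount=80_000))
print(classify_loan(percent_past_due=10, credit_score=700, loan_amount=80_000))
```

Each `if` corresponds to one path from the root of the tree to a leaf, which is why a tree converts so naturally into rules that loan officers can apply.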
  • 67. Source: Used with permission of Insightful Corporation. Copyright © 1999-2005 Insightful Corporation. All Rights Reserved.
  • 68. Knowledge management systems concern the sharing of knowledge that is already known to exist, either in libraries of documents, in the heads of employees, or in other known sources. Knowledge management (KM) is the process of creating value from intellectual capital and sharing that knowledge with employees, managers, suppliers, customers, and others who need that capital.
  • 69. Knowledge management is a process that is supported by the five components of an information system. ◦ Its emphasis is on people, their knowledge, and effective means for sharing that knowledge with others. The benefits of KM concern the application of knowledge to enable employees and others to leverage organizational knowledge to work smarter. KM preserves organizational memory by capturing and storing the lessons learned and best practices of key employees.
  • 70. Content management systems are information systems that track organizational documents, Web pages, graphics, and related materials. Such systems differ from operational document systems in that they do not directly support business operations. KM content management systems are concerned with the creation, management, and delivery of documents that exist for the purpose of imparting knowledge.
  • 71. Typical users of content management systems are companies that sell complicated products and want to share their knowledge of those products with employees and customers. The basic functions of content management systems are the same as for report management systems: author, manage, and deliver. The only requirement that content managers place on document authoring is that the document has been created in a standardized
  • 72. Content management functions are, however, exceedingly complicated. Most content databases are huge; some have thousands of individual documents, pages, and graphics.
  • 73. Documents may refer to one another, or multiple documents may refer to the same product or procedure. ◦ When one of them changes, others must change as well. ◦ Some content management systems keep semantic linkages among documents so that content dependencies can be known and used to maintain document consistency.
  • 74. Document contents are perishable. ◦ Documents become obsolete and need to be altered, removed, or replaced. Multinational companies have to ensure document language translations.
  • 75. Source: microsoft.com/backstage/inside.htm (accessed February 2004). © 2003 Microsoft Corporation. All rights reserved.
  • 76. Source: Used with permission of Tom Rizzo of Microsoft Corporation.
  • 77. Source: Used with permission of Tom Rizzo of Microsoft Corporation.
  • 78. Almost all users of content management systems pull the contents. Users cannot pull content if they do not know it exists. ◦ The content must be arranged and indexed, and a facility for searching the content devised. Documents that reside behind a corporate firewall, however, are not publicly accessible and will not be reachable by Google or other search engines. ◦ Organizations must index their own proprietary documents and provide their own search capability for them.
  • 79. Web browsers and other programs can readily format content expressed in HTML, PDF, or another standard format. XML documents often contain their own formatting rules that browsers can interpret. ◦ The content management system will have to determine an appropriate format for content expressed in other ways.
  • 80. Nothing is more frustrating for a manager to contemplate than the situation in which one employee struggles with a problem that another employee knows how to solve easily. KM systems are concerned not only with the sharing of content but also with the sharing of knowledge among humans. ◦ How can one person share her knowledge with another? ◦ How can one person learn of another person’s great idea?
  • 81. Three forms of technology are used for knowledge-sharing among humans: ◦ Portals, discussion groups, and email ◦ Collaboration systems ◦ Expert systems. Portals ◦ Employees can share ideas by posting knowledge on a Web portal whereby managers and employees can pull the knowledge from the portal.
  • 83. Discussion Groups ◦ Discussion groups allow employees or customers to post questions and queries seeking solutions to problems they have. ◦ Oracle, IBM, PeopleSoft, and other vendors support product discussion groups where users can post questions and where employees, vendors, and other users can answer them. ◦ Later, the organization can edit and summarize the questions from such discussion groups into frequently asked questions (FAQs).
  • 84. Discussion groups (continued) ◦ Basic email can also be used for knowledge-sharing, especially if email lists have been constructed with KM in mind. ◦ Two human factors inhibit knowledge-sharing: employees can be reluctant to exhibit their ignorance, and competition exists between employees. ◦ A KM application may be ill-suited to a competitive group. The company may be able to restructure rewards and incentives to foster sharing of ideas among employees.
  • 85. Collaboration Systems ◦ Collaboration systems are information systems that enable people to work together more effectively. ◦ The Internet can be used as a broadcast medium for speeches, panel discussions, and other types of meetings. ◦ Web broadcasts, because they are digital, can be readily saved and replayed at the viewer’s convenience. ◦ Web broadcasts can also be made interactive by combining them with discussion group bulletin boards that are live during the broadcast. ◦ Video conferencing is another popular form of IT-supported meeting. Video-conferencing equipment is expensive and normally is located in selected sites in the organization.
  • 86. Collaboration Systems (continued) ◦ Net meetings are a means by which individuals can participate in remote meetings without leaving their desks. With a speaker and a Web camera, virtual meetings can be conducted among employees who sit in their own offices.
  • 88. Expert Systems ◦ Expert systems are created by interviewing experts in a given business domain and codifying the rules stated by those experts. ◦ Many expert systems were created in the late 1980s and 1990s, and some of them have been successful. ◦ Expert systems suffer from three major disadvantages: they are difficult and expensive to develop, they are difficult to maintain, and they were unable to live up to the high expectations set by their name.
  • 89. Enormous amounts of data are generated each year. Business intelligence (BI) tools search these increasing amounts of data for useful information. Reporting tools, which tend to be used for assessment, process data using simple calculations such as sums and averages.
  • 90. Data-mining tools, which tend to be used for prediction, process data using sophisticated statistical and mathematical techniques. Reporting systems create meaningful information from disparate data sources and deliver that information to the proper user on a timely basis. RFM and OLAP are two examples of report applications.
  • 91. Data warehouses and data marts are facilities that prepare, store, and manage data for data mining and other analyses. Market-basket analysis determines groups of products that customers tend to purchase together. Decision trees are used to construct “If…Then…” rules for predicting classifications.
  • 92. Knowledge management is the process of creating value from intellectual capital and sharing that knowledge with employees, managers, suppliers, customers, and others who need that capital. Human knowledge-sharing systems use portals, bulletin boards, and email to facilitate knowledge interchange. Collaboration systems include net conferencing, video conferencing, and expert systems.
  • 93. Business intelligence (BI) systems; Business intelligence (BI) tools; Clickstream data; Cluster analysis; Collaboration systems; Confidence; Content management systems; Curse of dimensionality; Data mart; Data mining; Data-mining tools; Data warehouse; Decision trees; Digital dashboard; Dimension; Dirty data; Discussion groups; Drill down; Dynamic report; Exabyte; Expert systems; Frequently asked questions (FAQs)
  • 94. Granularity; If…then… rules; Knowledge management (KM); Lift; Market-basket analysis; Measure; Neural networks; OLAP cube; OLAP server; Online analytical processing (OLAP); Petabyte; Portals; Pull report; Push report; Query report; Regression analysis; Report media; Report mode; Report type; Reporting systems; Reporting tools; RFM analysis; Semantic security
  • 95. Static report; Supervised data mining; Support; Unsupervised data mining
  • 96. © 2007 Prentice Hall, Inc. 96 Security is a very difficult problem, and it gets worse every year. Physical security is hard enough: How do we know that the person (or program) that signs on as Megan Cho is really Megan Cho?  We use passwords, but files of passwords can be stolen. Suppose Megan works in the HR department, so she has access to personal and private data of other employees.
  • 97. We need to design the reporting system so that Megan can access all of the data she needs to do her job, and no more. A reporting server is an obvious and juicy target for any would-be intruder.  Someone can break in and change access permissions.  Or, a hacker could pose as someone else to obtain reports.
  • 98. Semantic security concerns the unintended release of protected information through a combination of reports or documents that are independently not protected. Megan was given just two reports to do her job.  Yet she combined the information in those reports with publicly available information and was able to deduce salaries for at least some employees.  These salaries are much more than she is supposed to know.  This is a semantic security problem.
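To make the semantic security problem concrete, here is a minimal sketch of how two individually harmless reports might be combined with public information. Everything in it is an assumption made for illustration: the department payroll totals, the placeholder employee names, the function `deduce_salaries`, and the premise that payroll changed only because of new hires. It is not the specific scenario from the slides.

```python
# Report 1 (hypothetical): total annual payroll per department, before and after a hiring round.
# No individual salaries appear in it.
payroll_before = {"Marketing": 410_000, "Engineering": 920_000}
payroll_after = {"Marketing": 478_000, "Engineering": 1_105_000}

# Public information (hypothetical): a newsletter announcing new hires by department.
public_new_hires = {
    "Marketing": ["Employee A"],
    "Engineering": ["Employee B", "Employee C"],
}


def deduce_salaries(before, after, new_hires):
    """Infer a salary wherever a department gained exactly one person.

    Assumes the only payroll change was the new hire (no raises, no departures).
    """
    deduced = {}
    for department, names in new_hires.items():
        if len(names) == 1:
            deduced[names[0]] = after[department] - before[department]
    return deduced


deduced = deduce_salaries(payroll_before, payroll_after, public_new_hires)
print(deduced)
```

Neither the payroll totals nor the newsletter reveals a salary on its own, yet together they expose one for the single Marketing hire. The two Engineering hires remain hidden only because their salaries are summed.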
  • 99. The product managers wanted the data miners to analyze customer clicks on a Web page to determine customer preferences for particular product lines.  The products were competing with one another for resources.  “Sampling?” asked the product managers in a chorus.  “Sampling? No way. We want all the data. This is important, and we don’t want a guess.”
  • 100. There’s nothing wrong with sampling.  Properly done, the results from a sample are just as accurate as results from the complete data set.  Studies done from samples are also cheaper and faster.  Sampling is a great way to save time and money. In truth, skill is required to develop a good sample.  The product managers should have listened to the data miners’ sampling plan and ensured that the sample would be appropriate, given the goals of the study.  Understanding this concept will save you and your organization substantial money!
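The sampling claim can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the click data are synthetic, the three product lines and their weights are invented, and a simple random sample stands in for whatever sampling plan the data miners actually proposed.

```python
import random
from collections import Counter

rng = random.Random(42)  # fixed seed so the run is repeatable

# Synthetic "complete data set": 200,000 clicks spread over three hypothetical product lines.
product_lines = ["A", "B", "C"]
population = rng.choices(product_lines, weights=[0.5, 0.3, 0.2], k=200_000)


def shares(clicks):
    """Share of clicks received by each product line."""
    counts = Counter(clicks)
    return {line: counts[line] / len(clicks) for line in product_lines}


full_shares = shares(population)

# Simple random sample of 5 percent of the clicks, drawn without replacement.
sample = rng.sample(population, 10_000)
sample_shares = shares(sample)

# Largest disagreement between the sample estimate and the full-data figure.
max_gap = max(abs(full_shares[line] - sample_shares[line]) for line in product_lines)
print(full_shares, sample_shares, max_gap)
```

With a sample of this size, the estimated shares typically differ from the full-data shares by around a percentage point or less, while only one twentieth of the data has to be processed. A poorly drawn sample (for example, clicks from a single day) would not carry the same guarantee, which is why the sampling plan matters.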
  • 101. Classification is a useful human skill. Sorting and classifying are necessary, important, and essential activities.  But those activities can also be dangerous. Serious ethical issues arise when we classify people.  What makes someone a good or bad “prospect”?  If we’re talking about classifying customers in order to prioritize our sales calls, then the ethical issue may not be too serious.  What about classifying applicants for college?
  • 102. I’m not really a contrarian about data mining.  I believe in it.  But data mining in the real world is a lot different from the way it’s described in textbooks.  One problem is that data are always dirty, with missing values, values way out of the range of possibility, and time values that make no sense. Another problem is that you know the least when you start the study.  So you work for a few months and learn that if you had another variable, say the customer’s zip code, or age, or something else, you could do a much better analysis.
  • 103. Overfitting is another problem, a huge one.  With neural networks, you can create a model of any level of complexity you want, except that none of those equations will predict new cases with any accuracy at all.  When using neural nets, you have to be very careful not to overfit the data. Another problem is seasonality:  Say all your training data are from the summer; will your model be valid for the winter?
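Overfitting can be demonstrated without a neural network. The sketch below swaps in a much simpler stand-in: a "memorizer" that returns the outcome of the nearest training case, compared with a straight-line fit. The synthetic data, the noise level, and the function names `fit_line`, `fit_memorizer`, and `mse` are all assumptions made for this illustration.

```python
import random

rng = random.Random(7)  # fixed seed so the run is repeatable


def make_data(n):
    """Synthetic cases: the true relationship is y = 2x plus random noise."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0, 1.0) for x in xs]
    return xs, ys


train_x, train_y = make_data(200)
test_x, test_y = make_data(200)  # new cases the models have never seen


def fit_line(xs, ys):
    """Ordinary least-squares straight line (a deliberately simple model)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_memorizer(xs, ys):
    """Overly flexible model: repeat the outcome of the nearest training case."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda pair: abs(pair[0] - x))[1]


def mse(model, xs, ys):
    """Mean squared prediction error."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


line = fit_line(train_x, train_y)
memorizer = fit_memorizer(train_x, train_y)

train_error_memorizer = mse(memorizer, train_x, train_y)
test_error_memorizer = mse(memorizer, test_x, test_y)
test_error_line = mse(line, test_x, test_y)
print(train_error_memorizer, test_error_memorizer, test_error_line)
```

The memorizer reproduces the training data perfectly because it has fitted the noise as well as the pattern, and that is exactly why it does worse than the plain line on new cases. Holding back a test set, as done here, is one common way to catch the problem.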
  • 104. When you start a data-mining project, you never know how it will turn out.  Some projects were bad and a waste of time.  Some were good: they found interesting and important patterns and information and produced very accurate predictive models. It’s not easy, though; you have to be very careful and lucky.
  • 105. Computer simulation of World War III: a project at the Pentagon, 1971-1973. Analysis process:  Run the simulation and obtain a set of results.  The military analysts and weapons experts would examine the results, and if results weren’t quite what was expected or wanted, the analysts would ask to change some of the inputs or a portion of the model.  Over time, an accumulated set of results was approved.  The accumulated results were presented to the four-star generals and other senior Pentagon managers.  Sometimes these senior people would see problems in the analyses and give instructions to discard some of the results.
  • 106. Observation  I do not believe that anyone thought they were deceiving anyone else.  The top managers didn’t realize that the results they saw left out a substantial portion of the unfavorable simulations.  They never knew about the other results.  The analysts who were filtering the outcomes by throwing out the numbers didn’t like being dishonest.  They simply thought that those results were wrong or unrealistic.  I do not think they realized they were using the computer to promulgate their prior ideas about military needs.
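The effect of discarding "unrealistic" results can be shown with a toy simulation. This sketch is purely hypothetical: the outcome scores are random numbers, the approval threshold is invented, and nothing here models the actual Pentagon study.

```python
import random

rng = random.Random(1971)  # fixed seed so the run is repeatable

# Hypothetical outcome score for each simulation run (higher means more favorable).
all_runs = [rng.gauss(0.0, 1.0) for _ in range(1000)]

# Reviewers discard any run that looks "wrong or unrealistic" to them,
# modeled here as any score below an assumed threshold.
threshold = -0.5
approved_runs = [score for score in all_runs if score >= threshold]

mean_all = sum(all_runs) / len(all_runs)
mean_approved = sum(approved_runs) / len(approved_runs)
discarded_fraction = 1 - len(approved_runs) / len(all_runs)

print(mean_all, mean_approved, discarded_fraction)
```

The approved set always looks more favorable than the full set, even though no single decision to discard a run was meant to deceive. Anyone who sees only the approved results has no way to tell how much was filtered out.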
  • 107. Questions to think about  Why perform the analysis?  What are you going to do with the results?  What is it that you want to know or to decide? Answer the questions above before you begin the analysis.  Then, pay attention to the results.  Don’t argue with the data.  If the results don’t conform to your expectations, think long and hard before changing the model, adjusting the data, or modifying the answers.