SlideShare a Scribd company logo
1 of 27
Introduction to
Data Mining and Business Intelligence
                        Lecture 1/DMBI/IKI83403T/MTI/UI

          Yudho Giri Sucahyo, Ph.D, CISA (yudho@cs.ui.ac.id)
        Faculty of Computer Science, University of Indonesia
Objectives
       Motivation: Why data mining?
       What is data mining?
       Understand the drivers for BI initiatives in modern
        organizations
       Understand the structure, components, and process of BI




    2
Motivation: Why data mining?
       Data explosion problem:
           Automated data collection tools and mature database
            technology lead to tremendous amounts of data stored in
            databases, data warehouses and other information repositories.
       We are drowning in data, but starving for knowledge!
       Data Mining:
           Extraction of interesting knowledge (rules, regularities, patterns,
            constraints) from data in large databases [JH].
           Analysis of the large quantities of data that are stored in
            computers [DO],
       Alternative names
           KDD, knowledge extraction, data archeology, information
            harvesting, business intelligence, etc.

    3
Data Rich but Information Poor



        Databases are too big




                   Data Mining can help
                   discover knowledge




4
Evolution of Database Technology

       Data Collection, database creation, network DBMS

       Relational data model, relational DBMS implementation

       RDBMS, advanced data models (extended-relational, OO,
        etc.) and application-oriented DBMS (spatial, scientific,
        engineering, etc.)

       Data mining and data warehousing, multimedia databases,
        and Web technology

    5
Potential Applications
       See TSBD lecture notes – Data Mining
       See Chapter 1 of DO
           Retailing
           Banking
           Credit Card Management
           Insurance
           Telecommunications
           Telemarketing
           Human Resource Management




    6
Data Mining Should Not be Used Blindly
       Data mining find regularities from history, but history is
        not the same as the future.
       Association does not dictate trend nor causality.
       Some abnormal data could be caused by human.




    7
Another view of BI
       BI is a broad field and it is viewed differently by different
        people.
       Common agreement on major components:
           A centralized repository of data  data warehouse
           An end-user set of tools to create reports and queries from
            data and information and to analyze the data, information, and
            reports  business analytics
           To find non-obvious relationship among large amounts of data
             data mining, for text  text mining, for web  web mining
           Business Performance Management (BPM) to set goals as
            metrics and standards and monitoring and measuring
            performance by using the BI methodology.


    8
Drivers of BI
       Organizations are being compelled to capture,
        understand, and harness their data to support
        decision making in order to improve business
        operations
       Business cycle times are now extremely compressed;
        faster, more informed, and better decision making is
        therefore a competitive imperative
       Managers need the right information at the right time
        and in the right place
       Case Study 1: BI success story at Toyota Motor
        Company (Chapter 1 ET pg. 4-6).

    9
Business Value of BI




10
Data Mining Functionality
    Association
        From association, correlation, to causality
        Finding rules like ―A -> B‖
    Classification and Prediction
        Classify data based on the values ina classifying attribute
        Predict some unknown or missing attribute values based on other
         information
    Cluster analysis
        Group data to form new classes, e.g., cluster houses to find distribution
         patterns
    Outlier and exception data analysis
    Time series analysis (trend and deviation)
        Trend and deviation analysis: regression, sequential pattern, similiar
    11
         sequences e.g. Stock analysis
Sarbanes-Oxley Act of 2002
(extracted from Gartner, Inc., 2004)
   The Sarbanes-Oxley Act of 2002 mandates
    drove one firm to implement a new financial
    performance management system, capable of
    meeting the new requirements to:
       Perform flawless analysis and compilation of
        thousands of transactions and journal entries.
       Balance more access to data with the need to
        control access to sensitive insider information.
       Deliver reports to the SEC in less time.


12
Sarbanes-Oxley Act of 2002
(extracted from Gartner, Inc., 2004) ...                              continued

    Within the overarching goal of achieving financial-reporting
     compliance, these objectives included the following:
        Get more eyes on the data and KPI and build in strict security
         controls
        Provide live reports that allow people to drill down to the lowest
         level of transaction detail
        Proactively scour the financial databases for anomalies, using variance
         triggers
        Gather all financial data into a cohesive database
    Complement accounting and budgeting applications for flexible
     reporting, free-form investigation, and automated data analysis.
    BI can proactively alert specific individuals whenever an
     anomay is detected.
    13
Now let us see some screenshots.....




14
Dashboard




15
And another dashboard.....




16
And another dashboard....




17
Financial Reporting




18
Back to theory..... 




19
KDD Process
                                            Interpretation/
                                                         Knowledge
                                            Evaluation

                              Data Mining

                  Transformation


        Preprocessin
        g

Selection
        Target Data

Data

20
Steps of a KDD Process
    Learning the application domain:
        relevant prior knowledge and goals of application
    Creating a target data set: data selection
    Data cleaning and preprocessing: (may take 60% of effort!)
    Data reduction and projection:
        Find useful features, dimensionality/variable reduction, invariant
         representation.
    Choosing functions of data mining
        summarization, classification, regression, association, clustering.
    Choosing the mining algorithm(s)
    Data mining: search for patterns of interest
    Interpretation: analysis of results.
        visualization, transformation, removing redundant patterns, etc.
    Use of discovered knowledge
    21
Teradata Advanced Analytics Methodology
(similar to CRISP-DM)




22
Structure and Components of BI




23
Structure and Components of BI... continued
 Data Warehouse
       Data flows from operational systems (e.g., CRM,
        ERP) to a DW, which is a special database or
        repository of data that has been prepared to
        support decision-making applications ranging from
        those for simple reporting and querying to complex
        optimization
   Business Analytics/OLAP
       Software tools that allow users to create on-
        demand reports and queries and to conduct
        analysis of data

24
Structure and Components of BI... continued
   Data Mining
       Data mining is a class of database information
        analysis that looks for hidden patterns in a
        group of data that can be used to predict future
        behavior
       Used to replace or enhance human intelligence
        by scanning through massive storehouses of data
        to discover meaningful new correlations,
        patterns, and trends, by using pattern
        recognition technologies and advanced statistics
25
Structure and Components of BI... continued
 Business   Performance Management (BPM)
    Based on the balanced scorecard methodology—
     a framework for defining, implementing, and
     managing an enterprise’s business strategy by
     linking objectives with factual measures
    Dashboards
     A visual presentation of critical data for
     executives to view. It allows executives to see
     hot spots in seconds and explore the situation


26
BI: Today and Tomorrow
    Recent industry analyst reports show that in the coming
     years, millions of people will use BI visual tools and
     analytics every day
    BI takes advantage of already developed and installed
     components of IT technologies, helping companies
     leverage their current IT investments and use valuable
     data stored in legacy and transactional systems
    Some Issues:
        Mining information from heterogeneous databases and global
         information systems
        Handling relational and complex types of data
        Efficiency and scalability of data mining algorithms


    27

More Related Content

What's hot

Information Management Strategy
Information Management StrategyInformation Management Strategy
Information Management StrategyDoug Devitre
 
Major issues in data mining
Major issues in data miningMajor issues in data mining
Major issues in data miningYashwant Rautela
 
Don’t optimize my queries, optimize my data!
Don’t optimize my queries, optimize my data!Don’t optimize my queries, optimize my data!
Don’t optimize my queries, optimize my data!Julian Hyde
 
The evolution of the big data platform @ Netflix (OSCON 2015)
The evolution of the big data platform @ Netflix (OSCON 2015)The evolution of the big data platform @ Netflix (OSCON 2015)
The evolution of the big data platform @ Netflix (OSCON 2015)Eva Tse
 
An Introduction to MapReduce
An Introduction to MapReduceAn Introduction to MapReduce
An Introduction to MapReduceFrane Bandov
 
Big Data Analytics for Healthcare
Big Data Analytics for HealthcareBig Data Analytics for Healthcare
Big Data Analytics for HealthcareChandan Reddy
 
Big Data Evolution
Big Data EvolutionBig Data Evolution
Big Data Evolutionitnewsafrica
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache SparkRahul Jain
 
How to Digitize Industrial Manufacturing with Azure IoT Edge, InfluxDB, and M...
How to Digitize Industrial Manufacturing with Azure IoT Edge, InfluxDB, and M...How to Digitize Industrial Manufacturing with Azure IoT Edge, InfluxDB, and M...
How to Digitize Industrial Manufacturing with Azure IoT Edge, InfluxDB, and M...InfluxData
 
Efficient monitoring and alerting
Efficient monitoring and alertingEfficient monitoring and alerting
Efficient monitoring and alertingTobias Schmidt
 
Introduction To Data Mining
Introduction To Data Mining   Introduction To Data Mining
Introduction To Data Mining Phi Jack
 
Visualizing climate change through data
Visualizing climate change through dataVisualizing climate change through data
Visualizing climate change through dataZachary Labe
 
Timeline to internet
Timeline to internetTimeline to internet
Timeline to internetRomanbelic911
 
Major issues in data mining
Major issues in data miningMajor issues in data mining
Major issues in data miningSlideshare
 
1.4 data warehouse
1.4 data warehouse1.4 data warehouse
1.4 data warehouseKrish_ver2
 
From Idea to Model: Productionizing Data Pipelines with Apache Airflow
From Idea to Model: Productionizing Data Pipelines with Apache AirflowFrom Idea to Model: Productionizing Data Pipelines with Apache Airflow
From Idea to Model: Productionizing Data Pipelines with Apache AirflowDatabricks
 

What's hot (20)

Information Management Strategy
Information Management StrategyInformation Management Strategy
Information Management Strategy
 
Major issues in data mining
Major issues in data miningMajor issues in data mining
Major issues in data mining
 
3 Data Mining Tasks
3  Data Mining Tasks3  Data Mining Tasks
3 Data Mining Tasks
 
Apache Spark Overview
Apache Spark OverviewApache Spark Overview
Apache Spark Overview
 
Don’t optimize my queries, optimize my data!
Don’t optimize my queries, optimize my data!Don’t optimize my queries, optimize my data!
Don’t optimize my queries, optimize my data!
 
CS6303 - Computer Architecture
CS6303 - Computer ArchitectureCS6303 - Computer Architecture
CS6303 - Computer Architecture
 
The evolution of the big data platform @ Netflix (OSCON 2015)
The evolution of the big data platform @ Netflix (OSCON 2015)The evolution of the big data platform @ Netflix (OSCON 2015)
The evolution of the big data platform @ Netflix (OSCON 2015)
 
An Introduction to MapReduce
An Introduction to MapReduceAn Introduction to MapReduce
An Introduction to MapReduce
 
Big Data Analytics for Healthcare
Big Data Analytics for HealthcareBig Data Analytics for Healthcare
Big Data Analytics for Healthcare
 
Big Data Evolution
Big Data EvolutionBig Data Evolution
Big Data Evolution
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
 
How to Digitize Industrial Manufacturing with Azure IoT Edge, InfluxDB, and M...
How to Digitize Industrial Manufacturing with Azure IoT Edge, InfluxDB, and M...How to Digitize Industrial Manufacturing with Azure IoT Edge, InfluxDB, and M...
How to Digitize Industrial Manufacturing with Azure IoT Edge, InfluxDB, and M...
 
Data mining
Data mining Data mining
Data mining
 
Efficient monitoring and alerting
Efficient monitoring and alertingEfficient monitoring and alerting
Efficient monitoring and alerting
 
Introduction To Data Mining
Introduction To Data Mining   Introduction To Data Mining
Introduction To Data Mining
 
Visualizing climate change through data
Visualizing climate change through dataVisualizing climate change through data
Visualizing climate change through data
 
Timeline to internet
Timeline to internetTimeline to internet
Timeline to internet
 
Major issues in data mining
Major issues in data miningMajor issues in data mining
Major issues in data mining
 
1.4 data warehouse
1.4 data warehouse1.4 data warehouse
1.4 data warehouse
 
From Idea to Model: Productionizing Data Pipelines with Apache Airflow
From Idea to Model: Productionizing Data Pipelines with Apache AirflowFrom Idea to Model: Productionizing Data Pipelines with Apache Airflow
From Idea to Model: Productionizing Data Pipelines with Apache Airflow
 

Viewers also liked

Getting started with Data Warehousing and BI
Getting started with Data Warehousing and BIGetting started with Data Warehousing and BI
Getting started with Data Warehousing and BIEdureka!
 
Data mining (lecture 1 & 2) conecpts and techniques
Data mining (lecture 1 & 2) conecpts and techniquesData mining (lecture 1 & 2) conecpts and techniques
Data mining (lecture 1 & 2) conecpts and techniquesSaif Ullah
 
02. Data Warehouse and OLAP
02. Data Warehouse and OLAP02. Data Warehouse and OLAP
02. Data Warehouse and OLAPAchmad Solichin
 
Data Mining As Used In Employee Recruitment &
Data Mining As Used In Employee Recruitment &Data Mining As Used In Employee Recruitment &
Data Mining As Used In Employee Recruitment &melodysmithjones
 
Knowledge Extraction from Social Media
Knowledge Extraction from Social MediaKnowledge Extraction from Social Media
Knowledge Extraction from Social MediaSeth Grimes
 
Data Mining
Data MiningData Mining
Data Miningshrapb
 
08. Mining Type Of Complex Data
08. Mining Type Of Complex Data08. Mining Type Of Complex Data
08. Mining Type Of Complex DataAchmad Solichin
 
05 Classification And Prediction
05   Classification And Prediction05   Classification And Prediction
05 Classification And PredictionAchmad Solichin
 
Dashboarding with Microsoft: Datazen & Power BI
Dashboarding with Microsoft: Datazen & Power BIDashboarding with Microsoft: Datazen & Power BI
Dashboarding with Microsoft: Datazen & Power BIDavide Mauri
 
Multisensor Data Fusion : Techno Briefing
Multisensor Data Fusion : Techno BriefingMultisensor Data Fusion : Techno Briefing
Multisensor Data Fusion : Techno BriefingPaveen Juntama
 
Introduction to Data Mining for Newbies
Introduction to Data Mining for NewbiesIntroduction to Data Mining for Newbies
Introduction to Data Mining for NewbiesEunjeong (Lucy) Park
 

Viewers also liked (20)

Getting started with Data Warehousing and BI
Getting started with Data Warehousing and BIGetting started with Data Warehousing and BI
Getting started with Data Warehousing and BI
 
Data mining
Data miningData mining
Data mining
 
Data mining (lecture 1 & 2) conecpts and techniques
Data mining (lecture 1 & 2) conecpts and techniquesData mining (lecture 1 & 2) conecpts and techniques
Data mining (lecture 1 & 2) conecpts and techniques
 
02. Data Warehouse and OLAP
02. Data Warehouse and OLAP02. Data Warehouse and OLAP
02. Data Warehouse and OLAP
 
Kaizentric Presentation
Kaizentric PresentationKaizentric Presentation
Kaizentric Presentation
 
Knowledge Extraction
Knowledge ExtractionKnowledge Extraction
Knowledge Extraction
 
Data Mining As Used In Employee Recruitment &
Data Mining As Used In Employee Recruitment &Data Mining As Used In Employee Recruitment &
Data Mining As Used In Employee Recruitment &
 
Knowledge Extraction from Social Media
Knowledge Extraction from Social MediaKnowledge Extraction from Social Media
Knowledge Extraction from Social Media
 
Data Mining
Data MiningData Mining
Data Mining
 
08. Mining Type Of Complex Data
08. Mining Type Of Complex Data08. Mining Type Of Complex Data
08. Mining Type Of Complex Data
 
05 Classification And Prediction
05   Classification And Prediction05   Classification And Prediction
05 Classification And Prediction
 
Mapping rules
Mapping rulesMapping rules
Mapping rules
 
Dashboarding with Microsoft: Datazen & Power BI
Dashboarding with Microsoft: Datazen & Power BIDashboarding with Microsoft: Datazen & Power BI
Dashboarding with Microsoft: Datazen & Power BI
 
Lecture - Data Mining
Lecture - Data MiningLecture - Data Mining
Lecture - Data Mining
 
Learning-based Data Cleaning
Learning-based Data CleaningLearning-based Data Cleaning
Learning-based Data Cleaning
 
Data mapping tutorial
Data mapping tutorialData mapping tutorial
Data mapping tutorial
 
03. Data Preprocessing
03. Data Preprocessing03. Data Preprocessing
03. Data Preprocessing
 
Multisensor Data Fusion : Techno Briefing
Multisensor Data Fusion : Techno BriefingMultisensor Data Fusion : Techno Briefing
Multisensor Data Fusion : Techno Briefing
 
Modern PHP Developer
Modern PHP DeveloperModern PHP Developer
Modern PHP Developer
 
Introduction to Data Mining for Newbies
Introduction to Data Mining for NewbiesIntroduction to Data Mining for Newbies
Introduction to Data Mining for Newbies
 

Similar to 01. Introduction to Data Mining and BI

Knowledge management and business intelligence
Knowledge management and business intelligenceKnowledge management and business intelligence
Knowledge management and business intelligenceAzmi Taufik
 
Business Intelligence
Business IntelligenceBusiness Intelligence
Business IntelligenceAmulya Lohani
 
Ch1-Introduction to Business Intelligence.pptx
Ch1-Introduction to Business Intelligence.pptxCh1-Introduction to Business Intelligence.pptx
Ch1-Introduction to Business Intelligence.pptxsommaikhantong
 
Business Intelligence Solution on Windows Azure
Business Intelligence Solution on Windows AzureBusiness Intelligence Solution on Windows Azure
Business Intelligence Solution on Windows AzureInfosys
 
The Comparison of Big Data Strategies in Corporate Environment
The Comparison of Big Data Strategies in Corporate EnvironmentThe Comparison of Big Data Strategies in Corporate Environment
The Comparison of Big Data Strategies in Corporate EnvironmentIRJET Journal
 
CS8091_BDA_Unit_I_Analytical_Architecture
CS8091_BDA_Unit_I_Analytical_ArchitectureCS8091_BDA_Unit_I_Analytical_Architecture
CS8091_BDA_Unit_I_Analytical_ArchitecturePalani Kumar
 
Information & Data Architecture
Information & Data ArchitectureInformation & Data Architecture
Information & Data ArchitectureSammer Qader
 
Business Intelligence
Business IntelligenceBusiness Intelligence
Business Intelligencessnbarnett
 
Mi0036 business intelligence tools
Mi0036  business intelligence toolsMi0036  business intelligence tools
Mi0036 business intelligence toolssmumbahelp
 
Business Intelligence and Analytics .pptx
Business Intelligence and Analytics .pptxBusiness Intelligence and Analytics .pptx
Business Intelligence and Analytics .pptxRupaRani28
 
KASHTECH AND DENODO: ROI and Economic Value of Data Virtualization
KASHTECH AND DENODO: ROI and Economic Value of Data VirtualizationKASHTECH AND DENODO: ROI and Economic Value of Data Virtualization
KASHTECH AND DENODO: ROI and Economic Value of Data VirtualizationDenodo
 
DATASCIENCE vs BUSINESS INTELLIGENCE.pptx
DATASCIENCE vs BUSINESS INTELLIGENCE.pptxDATASCIENCE vs BUSINESS INTELLIGENCE.pptx
DATASCIENCE vs BUSINESS INTELLIGENCE.pptxOTA13NayabNakhwa
 

Similar to 01. Introduction to Data Mining and BI (20)

Knowledge management and business intelligence
Knowledge management and business intelligenceKnowledge management and business intelligence
Knowledge management and business intelligence
 
Ch03
Ch03Ch03
Ch03
 
Business Intelligence
Business IntelligenceBusiness Intelligence
Business Intelligence
 
Bi orientations
Bi orientationsBi orientations
Bi orientations
 
Ch1-Introduction to Business Intelligence.pptx
Ch1-Introduction to Business Intelligence.pptxCh1-Introduction to Business Intelligence.pptx
Ch1-Introduction to Business Intelligence.pptx
 
Business Analytics
Business AnalyticsBusiness Analytics
Business Analytics
 
Business Analytics
Business AnalyticsBusiness Analytics
Business Analytics
 
Business Intelligence Solution on Windows Azure
Business Intelligence Solution on Windows AzureBusiness Intelligence Solution on Windows Azure
Business Intelligence Solution on Windows Azure
 
The Comparison of Big Data Strategies in Corporate Environment
The Comparison of Big Data Strategies in Corporate EnvironmentThe Comparison of Big Data Strategies in Corporate Environment
The Comparison of Big Data Strategies in Corporate Environment
 
CS8091_BDA_Unit_I_Analytical_Architecture
CS8091_BDA_Unit_I_Analytical_ArchitectureCS8091_BDA_Unit_I_Analytical_Architecture
CS8091_BDA_Unit_I_Analytical_Architecture
 
Information & Data Architecture
Information & Data ArchitectureInformation & Data Architecture
Information & Data Architecture
 
Business Intelligence
Business IntelligenceBusiness Intelligence
Business Intelligence
 
Jn2516891694
Jn2516891694Jn2516891694
Jn2516891694
 
Jn2516891694
Jn2516891694Jn2516891694
Jn2516891694
 
Mi0036 business intelligence tools
Mi0036  business intelligence toolsMi0036  business intelligence tools
Mi0036 business intelligence tools
 
Intro bi
Intro biIntro bi
Intro bi
 
Business Intelligence and Analytics .pptx
Business Intelligence and Analytics .pptxBusiness Intelligence and Analytics .pptx
Business Intelligence and Analytics .pptx
 
Business Intelligence
Business Intelligence Business Intelligence
Business Intelligence
 
KASHTECH AND DENODO: ROI and Economic Value of Data Virtualization
KASHTECH AND DENODO: ROI and Economic Value of Data VirtualizationKASHTECH AND DENODO: ROI and Economic Value of Data Virtualization
KASHTECH AND DENODO: ROI and Economic Value of Data Virtualization
 
DATASCIENCE vs BUSINESS INTELLIGENCE.pptx
DATASCIENCE vs BUSINESS INTELLIGENCE.pptxDATASCIENCE vs BUSINESS INTELLIGENCE.pptx
DATASCIENCE vs BUSINESS INTELLIGENCE.pptx
 

More from Achmad Solichin

Kuliah Umum - Tips Publikasi Jurnal SINTA untuk Mahasiswa Galau (6 Agustus 2022)
Kuliah Umum - Tips Publikasi Jurnal SINTA untuk Mahasiswa Galau (6 Agustus 2022)Kuliah Umum - Tips Publikasi Jurnal SINTA untuk Mahasiswa Galau (6 Agustus 2022)
Kuliah Umum - Tips Publikasi Jurnal SINTA untuk Mahasiswa Galau (6 Agustus 2022)Achmad Solichin
 
Materi Webinar Web 3.0 (16 Juli 2022)
Materi Webinar Web 3.0 (16 Juli 2022)Materi Webinar Web 3.0 (16 Juli 2022)
Materi Webinar Web 3.0 (16 Juli 2022)Achmad Solichin
 
Webinar: Kesadaran Keamanan Informasi (3 Desember 2021)
Webinar: Kesadaran Keamanan Informasi (3 Desember 2021)Webinar: Kesadaran Keamanan Informasi (3 Desember 2021)
Webinar: Kesadaran Keamanan Informasi (3 Desember 2021)Achmad Solichin
 
Webinar PHP-ID: Mari Mengenal Logika Fuzzy (Fuzzy Logic)
Webinar PHP-ID: Mari Mengenal Logika Fuzzy (Fuzzy Logic)Webinar PHP-ID: Mari Mengenal Logika Fuzzy (Fuzzy Logic)
Webinar PHP-ID: Mari Mengenal Logika Fuzzy (Fuzzy Logic)Achmad Solichin
 
Webinar PHP-ID: Machine Learning dengan PHP
Webinar PHP-ID: Machine Learning dengan PHPWebinar PHP-ID: Machine Learning dengan PHP
Webinar PHP-ID: Machine Learning dengan PHPAchmad Solichin
 
Webinar Data Mining dengan Rapidminer | Universitas Budi Luhur
Webinar Data Mining dengan Rapidminer | Universitas Budi LuhurWebinar Data Mining dengan Rapidminer | Universitas Budi Luhur
Webinar Data Mining dengan Rapidminer | Universitas Budi LuhurAchmad Solichin
 
TREN DAN IDE RISET BIDANG DATA MINING TERBARU
TREN DAN IDE RISET BIDANG DATA MINING TERBARUTREN DAN IDE RISET BIDANG DATA MINING TERBARU
TREN DAN IDE RISET BIDANG DATA MINING TERBARUAchmad Solichin
 
Metodologi Riset: Literature Review
Metodologi Riset: Literature ReviewMetodologi Riset: Literature Review
Metodologi Riset: Literature ReviewAchmad Solichin
 
Materi Seminar: Artificial Intelligence dengan PHP
Materi Seminar: Artificial Intelligence dengan PHPMateri Seminar: Artificial Intelligence dengan PHP
Materi Seminar: Artificial Intelligence dengan PHPAchmad Solichin
 
Percobaan Perpindahan Kalor melalui Konduksi, Konveksi dan Radiasi
Percobaan Perpindahan Kalor melalui Konduksi, Konveksi dan RadiasiPercobaan Perpindahan Kalor melalui Konduksi, Konveksi dan Radiasi
Percobaan Perpindahan Kalor melalui Konduksi, Konveksi dan RadiasiAchmad Solichin
 
Metodologi Riset: Literature Review
Metodologi Riset: Literature ReviewMetodologi Riset: Literature Review
Metodologi Riset: Literature ReviewAchmad Solichin
 
Depth First Search (DFS) pada Graph
Depth First Search (DFS) pada GraphDepth First Search (DFS) pada Graph
Depth First Search (DFS) pada GraphAchmad Solichin
 
Breadth First Search (BFS) pada Graph
Breadth First Search (BFS) pada GraphBreadth First Search (BFS) pada Graph
Breadth First Search (BFS) pada GraphAchmad Solichin
 
Binary Search Tree (BST) - Algoritma dan Struktur Data
Binary Search Tree (BST) - Algoritma dan Struktur DataBinary Search Tree (BST) - Algoritma dan Struktur Data
Binary Search Tree (BST) - Algoritma dan Struktur DataAchmad Solichin
 
Computer Vision di Era Industri 4.0
Computer Vision di Era Industri 4.0Computer Vision di Era Industri 4.0
Computer Vision di Era Industri 4.0Achmad Solichin
 
Seminar: Become a Reliable Web Programmer
Seminar: Become a Reliable Web ProgrammerSeminar: Become a Reliable Web Programmer
Seminar: Become a Reliable Web ProgrammerAchmad Solichin
 
The Big 5: Future IT Trends
The Big 5: Future IT TrendsThe Big 5: Future IT Trends
The Big 5: Future IT TrendsAchmad Solichin
 
Seminar: PHP Developer for Dummies
Seminar: PHP Developer for DummiesSeminar: PHP Developer for Dummies
Seminar: PHP Developer for DummiesAchmad Solichin
 
Pertemuan 1 - Algoritma dan Struktur Data 1
Pertemuan 1 - Algoritma dan Struktur Data 1Pertemuan 1 - Algoritma dan Struktur Data 1
Pertemuan 1 - Algoritma dan Struktur Data 1Achmad Solichin
 
Sharing Penelitian S3 Lab Elins FMIPA UGM - 17 Februari 2016
Sharing Penelitian S3 Lab Elins FMIPA UGM - 17 Februari 2016Sharing Penelitian S3 Lab Elins FMIPA UGM - 17 Februari 2016
Sharing Penelitian S3 Lab Elins FMIPA UGM - 17 Februari 2016Achmad Solichin
 

More from Achmad Solichin (20)

Kuliah Umum - Tips Publikasi Jurnal SINTA untuk Mahasiswa Galau (6 Agustus 2022)
Kuliah Umum - Tips Publikasi Jurnal SINTA untuk Mahasiswa Galau (6 Agustus 2022)Kuliah Umum - Tips Publikasi Jurnal SINTA untuk Mahasiswa Galau (6 Agustus 2022)
Kuliah Umum - Tips Publikasi Jurnal SINTA untuk Mahasiswa Galau (6 Agustus 2022)
 
Materi Webinar Web 3.0 (16 Juli 2022)
Materi Webinar Web 3.0 (16 Juli 2022)Materi Webinar Web 3.0 (16 Juli 2022)
Materi Webinar Web 3.0 (16 Juli 2022)
 
Webinar: Kesadaran Keamanan Informasi (3 Desember 2021)
Webinar: Kesadaran Keamanan Informasi (3 Desember 2021)Webinar: Kesadaran Keamanan Informasi (3 Desember 2021)
Webinar: Kesadaran Keamanan Informasi (3 Desember 2021)
 
Webinar PHP-ID: Mari Mengenal Logika Fuzzy (Fuzzy Logic)
Webinar PHP-ID: Mari Mengenal Logika Fuzzy (Fuzzy Logic)Webinar PHP-ID: Mari Mengenal Logika Fuzzy (Fuzzy Logic)
Webinar PHP-ID: Mari Mengenal Logika Fuzzy (Fuzzy Logic)
 
Webinar PHP-ID: Machine Learning dengan PHP
Webinar PHP-ID: Machine Learning dengan PHPWebinar PHP-ID: Machine Learning dengan PHP
Webinar PHP-ID: Machine Learning dengan PHP
 
Webinar Data Mining dengan Rapidminer | Universitas Budi Luhur
Webinar Data Mining dengan Rapidminer | Universitas Budi LuhurWebinar Data Mining dengan Rapidminer | Universitas Budi Luhur
Webinar Data Mining dengan Rapidminer | Universitas Budi Luhur
 
TREN DAN IDE RISET BIDANG DATA MINING TERBARU
TREN DAN IDE RISET BIDANG DATA MINING TERBARUTREN DAN IDE RISET BIDANG DATA MINING TERBARU
TREN DAN IDE RISET BIDANG DATA MINING TERBARU
 
Metodologi Riset: Literature Review
Metodologi Riset: Literature ReviewMetodologi Riset: Literature Review
Metodologi Riset: Literature Review
 
Materi Seminar: Artificial Intelligence dengan PHP
Materi Seminar: Artificial Intelligence dengan PHPMateri Seminar: Artificial Intelligence dengan PHP
Materi Seminar: Artificial Intelligence dengan PHP
 
Percobaan Perpindahan Kalor melalui Konduksi, Konveksi dan Radiasi
Percobaan Perpindahan Kalor melalui Konduksi, Konveksi dan RadiasiPercobaan Perpindahan Kalor melalui Konduksi, Konveksi dan Radiasi
Percobaan Perpindahan Kalor melalui Konduksi, Konveksi dan Radiasi
 
Metodologi Riset: Literature Review
Metodologi Riset: Literature ReviewMetodologi Riset: Literature Review
Metodologi Riset: Literature Review
 
Depth First Search (DFS) pada Graph
Depth First Search (DFS) pada GraphDepth First Search (DFS) pada Graph
Depth First Search (DFS) pada Graph
 
Breadth First Search (BFS) pada Graph
Breadth First Search (BFS) pada GraphBreadth First Search (BFS) pada Graph
Breadth First Search (BFS) pada Graph
 
Binary Search Tree (BST) - Algoritma dan Struktur Data
Binary Search Tree (BST) - Algoritma dan Struktur DataBinary Search Tree (BST) - Algoritma dan Struktur Data
Binary Search Tree (BST) - Algoritma dan Struktur Data
 
Computer Vision di Era Industri 4.0
Computer Vision di Era Industri 4.0Computer Vision di Era Industri 4.0
Computer Vision di Era Industri 4.0
 
Seminar: Become a Reliable Web Programmer
Seminar: Become a Reliable Web ProgrammerSeminar: Become a Reliable Web Programmer
Seminar: Become a Reliable Web Programmer
 
The Big 5: Future IT Trends
The Big 5: Future IT TrendsThe Big 5: Future IT Trends
The Big 5: Future IT Trends
 
Seminar: PHP Developer for Dummies
Seminar: PHP Developer for DummiesSeminar: PHP Developer for Dummies
Seminar: PHP Developer for Dummies
 
Pertemuan 1 - Algoritma dan Struktur Data 1
Pertemuan 1 - Algoritma dan Struktur Data 1Pertemuan 1 - Algoritma dan Struktur Data 1
Pertemuan 1 - Algoritma dan Struktur Data 1
 
Sharing Penelitian S3 Lab Elins FMIPA UGM - 17 Februari 2016
Sharing Penelitian S3 Lab Elins FMIPA UGM - 17 Februari 2016Sharing Penelitian S3 Lab Elins FMIPA UGM - 17 Februari 2016
Sharing Penelitian S3 Lab Elins FMIPA UGM - 17 Februari 2016
 

Recently uploaded

mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docxPoojaSen20
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
URLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppURLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppCeline George
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphThiyagu K
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxGaneshChakor2
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingTechSoup
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactPECB
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104misteraugie
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3JemimahLaneBuaron
 

Recently uploaded (20)

mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docx
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
URLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppURLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website App
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Staff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSDStaff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSD
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 

01. Introduction to Data Mining and BI

  • 1. Introduction to Data Mining and Business Intelligence Lecture 1/DMBI/IKI83403T/MTI/UI Yudho Giri Sucahyo, Ph.D, CISA (yudho@cs.ui.ac.id) Faculty of Computer Science, University of Indonesia
  • 2. Objectives  Motivation: Why data mining?  What is data mining?  Understand the drivers for BI initiatives in modern organizations  Understand the structure, components, and process of BI 2
  • 3. Motivation: Why data mining?  Data explosion problem:  Automated data collection tools and mature database technology lead to tremendous amounts of data stored in databases, data warehouses and other information repositories.  We are drowning in data, but starving for knowledge!  Data Mining:  Extraction of interesting knowledge (rules, regularities, patterns, constraints) from data in large databases [JH].  Analysis of the large quantities of data that are stored in computers [DO],  Alternative names  KDD, knowledge extraction, data archeology, information harvesting, business intelligence, etc. 3
  • 4. Data Rich but Information Poor Databases are too big Data Mining can help discover knowledge 4
  • 5. Evolution of Database Technology  Data Collection, database creation, network DBMS  Relational data model, relational DBMS implementation  RDBMS, advanced data models (extended-relational, OO, etc.) and application-oriented DBMS (spatial, scientific, engineering, etc.)  Data mining and data warehousing, multimedia databases, and Web technology 5
  • 6. Potential Applications  See TSBD lecture notes – Data Mining  See Chapter 1 of DO  Retailing  Banking  Credit Card Management  Insurance  Telecommunications  Telemarketing  Human Resource Management 6
  • 7. Data Mining Should Not be Used Blindly  Data mining find regularities from history, but history is not the same as the future.  Association does not dictate trend nor causality.  Some abnormal data could be caused by human. 7
  • 8. Another view of BI  BI is a broad field and it is viewed differently by different people.  Common agreement on major components:  A centralized repository of data  data warehouse  An end-user set of tools to create reports and queries from data and information and to analyze the data, information, and reports  business analytics  To find non-obvious relationship among large amounts of data  data mining, for text  text mining, for web  web mining  Business Performance Management (BPM) to set goals as metrics and standards and monitoring and measuring performance by using the BI methodology. 8
  • 9. Drivers of BI  Organizations are being compelled to capture, understand, and harness their data to support decision making in order to improve business operations  Business cycle times are now extremely compressed; faster, more informed, and better decision making is therefore a competitive imperative  Managers need the right information at the right time and in the right place  Case Study 1: BI success story at Toyota Motor Company (Chapter 1 ET pg. 4-6). 9
  • 11. Data Mining Functionality  Association  From association, correlation, to causality  Finding rules like ―A -> B‖  Classification and Prediction  Classify data based on the values ina classifying attribute  Predict some unknown or missing attribute values based on other information  Cluster analysis  Group data to form new classes, e.g., cluster houses to find distribution patterns  Outlier and exception data analysis  Time series analysis (trend and deviation)  Trend and deviation analysis: regression, sequential pattern, similiar 11 sequences e.g. Stock analysis
  • 12. Sarbanes-Oxley Act of 2002 (extracted from Gartner, Inc., 2004)  The Sarbanes-Oxley Act of 2002 mandates drove one firm to implement a new financial performance management system, capable of meeting the new requirements to:  Perform flawless analysis and compilation of thousands of transactions and journal entries.  Balance more access to data with the need to control access to sensitive insider information.  Deliver reports to the SEC in less time. 12
  • 13. Sarbanes-Oxley Act of 2002 (extracted from Gartner, Inc., 2004) ... continued  Within the overarching goal of achieving financial-reporting compliance, these objectives included the following:  Get more eyes on the data and KPI and build in strict security controls  Provide live reports that allow people to drill down to the lowest level of transaction detail  Proactively scour the financial databases for anomalies, using variance triggers  Gather all financial data into a cohesive database  Complement accounting and budgeting applications for flexible reporting, free-form investigation, and automated data analysis.  BI can proactively alert specific individuals whenever an anomay is detected. 13
  • 14. Now let us see some screenshots..... 14
  • 20. KDD Process Interpretation/ Knowledge Evaluation Data Mining Transformation Preprocessin g Selection Target Data Data 20
  • 21. Steps of a KDD Process  Learning the application domain:  relevant prior knowledge and goals of application  Creating a target data set: data selection  Data cleaning and preprocessing: (may take 60% of effort!)  Data reduction and projection:  Find useful features, dimensionality/variable reduction, invariant representation.  Choosing functions of data mining  summarization, classification, regression, association, clustering.  Choosing the mining algorithm(s)  Data mining: search for patterns of interest  Interpretation: analysis of results.  visualization, transformation, removing redundant patterns, etc.  Use of discovered knowledge 21
  • 22. Teradata Advanced Analytics Methodology (similar to CRISP-DM) 22
  • 24. Structure and Components of BI... continued  Data Warehouse  Data flows from operational systems (e.g., CRM, ERP) to a DW, which is a special database or repository of data that has been prepared to support decision-making applications ranging from those for simple reporting and querying to complex optimization  Business Analytics/OLAP  Software tools that allow users to create on- demand reports and queries and to conduct analysis of data 24
  • 25. Structure and Components of BI... continued  Data Mining  Data mining is a class of database information analysis that looks for hidden patterns in a group of data that can be used to predict future behavior  Used to replace or enhance human intelligence by scanning through massive storehouses of data to discover meaningful new correlations, patterns, and trends, by using pattern recognition technologies and advanced statistics 25
  • 26. Structure and Components of BI... continued  Business Performance Management (BPM)  Based on the balanced scorecard methodology— a framework for defining, implementing, and managing an enterprise’s business strategy by linking objectives with factual measures  Dashboards A visual presentation of critical data for executives to view. It allows executives to see hot spots in seconds and explore the situation 26
  • 27. BI: Today and Tomorrow  Recent industry analyst reports show that in the coming years, millions of people will use BI visual tools and analytics every day  BI takes advantage of already developed and installed components of IT technologies, helping companies leverage their current IT investments and use valuable data stored in legacy and transactional systems  Some Issues:  Mining information from heterogeneous databases and global information systems  Handling relational and complex types of data  Efficiency and scalability of data mining algorithms 27