Advanced Statistical Analysis Tool
Statistical Package for the Social Sciences
What is SPSS
 IBM SPSS Modeler is a Data Mining and Text Analytics software application by IBM.
It is a comprehensive predictive analytics platform designed to bring predictive
intelligence to decisions made by individuals, groups, systems or by an enterprise as a
whole.
 It is used to build predictive models and conduct analytics tasks through its user-
friendly Visual Interface by leveraging statistical and data mining algorithms without
programming.
 Originally, SPSS stood for Statistical Package for the Social Sciences; it now
stands for Statistical Product and Service Solutions.
Advantages of SPSS
 Analyze and better understand your data and solve complex business problems.
 Understand large and complex data sets quickly with advanced statistical procedures that
help to ensure high accuracy and quality decision-making.
Disadvantages of SPSS
 It is not well suited to analyzing very large data sets.
 It is an expensive tool.
 The graph features are not as simple as those of Microsoft Excel.
Competitors
• R
• SAS
• Python
• Minitab
• Eviews
• Origin
Flow of Course
Module 1: Introduction to Data Mining
• Introduction to Data Mining
• SPSS Modeler Interface
Module 2: The Data Mining Process
• CRISP-DM Methodology
Module 3: Modeling Techniques
• Multiple Regression Analysis
• Factor Analysis
• Cluster Analysis
• Discriminant Analysis
• Multidimensional Scaling
• Conjoint Analysis
Module 1 Introduction to Data Mining
Introduction to Data Mining
Why Data Mining?
 How do we consume the available data, translate it into information and make it
usable?
What is Data Mining?
 Process of discovering insights, patterns and relationships from large amounts of data.
What knowledge can be extracted?
 Descriptive - What has happened, and why did it happen?
 Predictive - What is likely to happen next?
Why is Data Mining Important & its Applications
Data and Analytics
Create New Business Models
CMO - Attract, grow and retain customers
COO - Optimize Operations: Counter Frauds & Threats
CIO/CDO - Maximize Insights, Ensure Trust, Improve Economics
CRO - Manage Risks
CFO - Transform management and financial processes
Identify the Data Mining!
Dividing the customers of a company according to their gender
Computing the total sales of a company
Sorting a student database based on student identification numbers
Predicting the outcome of tossing a fair pair of dice
Predicting the future stock price of a company using historical records
SPSS Modeler Interface
[Figure: the SPSS Modeler window, with these areas labelled: Stream Canvas; Stream, Output, Model Manager; Data Mining Lifecycle; Palettes; Nodes]
Module 2 Data Mining Process
Data Mining Process: CRISP - DM
 CRISP-DM stands for Cross Industry Standard Process for Data Mining.
 Business Understanding - What should be accomplished from a business perspective?
 Data Understanding - Acquiring the data needed to accomplish the objective.
 Data Preparation - Selecting and cleaning the data. May transform/aggregate for analysis.
 Modeling - Selecting a technique, building and training the model, assessing accuracy.
 Evaluation - Does the model meet business objectives?
 Deployment - Strategy for deploying the model.
Module 3 Modeling Techniques
Regression Analysis
 Regression is used to assess the strength of the relationship between one
dependent variable and one or more independent variables. It helps in predicting the
value of a dependent variable from one or more independent variables.
 Two types of Regression
 Bivariate Regression: One Dependent Variable, One Independent Variable.
 Multiple Regression: One Dependent Variable, Two or More Independent Variables.
Regression Examples: Understand Variables
 Does a price reduction have any impact on increasing sales?
 Do advertising spend, the number of products introduced, and the number of
sales personnel have any effect on sales?
 Does female literacy have any impact on increasing the age at which girls
marry?
Bivariate Regression: Introduction
 The simplest form of regression analysis is called Bivariate Regression.
 It includes 2 variables.
 One dependent variable that needs to be predicted or explained.
 One independent variable that explains the variance in the dependent variable.
 Regression Analysis is used to predict the value of the dependent variable, given the value
of the independent variable, by estimating an equation.
 Example on the next slide
Regression Methods
 Enter: All independent variables are entered into the equation in one step; also
called "forced entry".
 Forward: A variable selection method that begins with a model containing no
variables (called the Null Model) and then adds the most significant variables
one after the other.
 Backward: A variable selection method that begins with a model containing all
variables under consideration (called the Full Model) and then removes the least
significant variables one after the other.
 Stepwise: A combination of Backward and Forward; it keeps adding and removing
predictors as it builds the model.
Bivariate Regression: Example
 A marketing manager wants to know whether the variation in sales can be explained in
terms of the variation in advertising spend.
 The equation can be written as follows:
Sales = (Sales with 0 advertising spend) + B1 × (Advertising Budget) + Error Term
B1 = Beta Coefficient (change in sales for each one-unit increase in advertising budget)
Error Term: Other factors that affect sales but are not included in the model
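The equation above can be estimated by ordinary least squares. The following is a minimal sketch in plain Python rather than SPSS; the function name `fit_bivariate` and the advertising and sales figures are hypothetical, chosen only so that the fitted line is easy to check by hand.

```python
from statistics import mean


def fit_bivariate(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    # Intercept: predicted sales when advertising spend is zero.
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical data, constructed so that sales = 10 + 2 * advertising exactly.
advertising = [1, 2, 3, 4, 5]
sales = [12, 14, 16, 18, 20]

b0, b1 = fit_bivariate(advertising, sales)
print(f"Sales = {b0:.1f} + {b1:.1f} * Advertising")
```

Here `b0` plays the role of "sales with 0 advertising spend" and `b1` the role of B1. With real data the points do not fall exactly on the line, and the leftover differences are the error term.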
Let’s Understand Practically - Bivariate Regression
 Data Description: A person with 8 years of education is earning $77 per week.
 Problem Statement:
 Does an increase in education level have any impact on weekly earnings?
 If education increases by one unit, by how much do weekly earnings
increase?
 Hypothesis: Yes, there is a significant impact.
Important Terms in Regression
Regression Coefficient
The regression coefficient is a measure of how strongly each IV (independent variable, also known as
a predictor variable) predicts the DV (dependent variable).
R Values
Regression analysis provides two different R values. The simple R value is the correlation
coefficient between the observed values and the predicted values (based on the regression
equation obtained) of the DV. The other R value is referred to as R Square. R Square shows
how much of the variance in the dependent variable is explained by the independent
variable(s). For example, an R Square value of 0.70 means that the IVs in the model explain
70% of the variance in the DV.
Important Terms in Regression
T Value
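As a rule of thumb, a t-value greater than +2 or less than -2 indicates that the coefficient is statistically
significant (at roughly the 5% level). The larger the absolute t-value, the greater the confidence we have in
the coefficient as a predictor. t-values close to zero indicate low reliability of the predictive power of that coefficient.
F Value
= Explained Variance / Unexplained Variance
A general rule of thumb that is often used in regression analysis is that if F > 2.5 then we can reject the
null hypothesis; the exact cutoff depends on the degrees of freedom, so confirm it with the reported
significance (p) value. We would conclude that there is at least one parameter value that is nonzero.
Df (Degrees of Freedom)
The regression degrees of freedom equal the number of independent variables in our regression model. The
residual degrees of freedom equal n - k - 1, where n is the number of observations and k is the number of
independent variables.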
Important Terms in Regression
Beta Coefficient (Standardized)
The standardized beta coefficient is the change in the outcome variable, in standard deviations, for every
one-standard-deviation change in the predictor variable. It ranges from -∞ to +∞.
R
R is the Correlation Coefficient that describes the relationship between two variables. It
ranges between -1 and +1, for completely negative and completely positive correlation respectively.
Beta Coefficient (Unstandardized)
Unstandardized coefficients are expressed in the raw units of the variables, whereas standardized
beta coefficients are expressed in standard deviations.
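The terms above can be tied together in a short calculation. This is a rough sketch for the bivariate case only, in plain Python; the function name `regression_summary` and the education and earnings figures are invented for illustration and are not the course data set.

```python
from math import sqrt
from statistics import mean


def regression_summary(x, y):
    """Key regression statistics for one predictor x and one outcome y."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b1 = sxy / sxx                      # unstandardized coefficient (raw units)
    b0 = y_bar - b1 * x_bar
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_reg = syy - ss_res               # explained variation
    df_reg, df_res = 1, n - 2           # k = 1 predictor, so n - k - 1 = n - 2
    mse = ss_res / df_res

    return {
        "b1": b1,
        "beta_std": b1 * sqrt(sxx / syy),   # standardized coefficient
        "r": sxy / sqrt(sxx * syy),         # correlation coefficient
        "r_square": ss_reg / syy,           # share of variance explained
        "t": b1 / sqrt(mse / sxx),          # coefficient / its standard error
        "f": (ss_reg / df_reg) / mse,       # explained / unexplained variance
    }


# Hypothetical data: years of education and weekly earnings.
education = [8, 10, 12, 14, 16, 18]
earnings = [70, 95, 90, 130, 125, 160]

summary = regression_summary(education, earnings)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

With a single predictor, the standardized beta equals R, R Square equals R squared, and F equals the square of t, which is a handy way to sanity-check an output table.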
Important Terms in Regression
Insights
 A 1-unit increase in education level is associated with a 0.423 increase
in weekly earnings.
 The R Square value indicates that education level explains 17.9% of the
variability in weekly earnings.
 Significance: ? (Tell me)
Multiple Regression: Introduction
 One dependent variable and more than one independent variable.
 Everything else is the same as in Bivariate Regression.
Let’s Understand Practically - Multiple Regression
 Data Description: Age, Weight & BP of different people have been given.
 Problem Statement:
 Do Age and Weight have any impact on the BP of a person?
 If Age or Weight increases by one unit, by how much is BP
affected?
 Hypothesis: Yes, there is a significant impact.
Insights
 Because Age and Weight are measured in different units, we compare their effects using the
Standardized Coefficients.
 A one-standard-deviation increase in age is associated with a 0.346 standard-deviation change in BP.
 A one-standard-deviation increase in weight is associated with a 0.481 standard-deviation change in BP.
 Both predictors are significant.
 The R Square value indicates that Age and Weight together explain 55.8% of the variability in BP.
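For readers who want to see where standardized coefficients come from, here is a small sketch for the two-predictor case. It uses the textbook closed form that derives the standardized betas from the three pairwise correlations; the helper names and the Age, Weight and BP values are hypothetical and will not reproduce the figures quoted above.

```python
from math import sqrt
from statistics import mean


def pearson_r(a, b):
    """Pearson correlation between two equally long lists of numbers."""
    a_bar, b_bar = mean(a), mean(b)
    cov = sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b))
    var_a = sum((x - a_bar) ** 2 for x in a)
    var_b = sum((y - b_bar) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def standardized_betas(x1, x2, y):
    """Standardized coefficients and R Square for y regressed on x1 and x2."""
    r_y1, r_y2, r_12 = pearson_r(y, x1), pearson_r(y, x2), pearson_r(x1, x2)
    denom = 1 - r_12 ** 2               # shrinks as the predictors overlap
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    r_square = beta1 * r_y1 + beta2 * r_y2
    return beta1, beta2, r_square


# Hypothetical data for six people.
age = [25, 35, 45, 55, 65, 40]
weight = [60, 72, 68, 85, 80, 90]
bp = [110, 120, 125, 140, 145, 135]

beta_age, beta_weight, r_square = standardized_betas(age, weight, bp)
print(f"beta(age)={beta_age:.3f}  beta(weight)={beta_weight:.3f}  R2={r_square:.3f}")
```

Because both betas are on the same standard-deviation scale, the larger absolute value points to the predictor with the stronger effect, regardless of the original units.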
Factor Analysis
 Factor analysis is a technique that is used to reduce a large number of variables
to a smaller number of factors.
 Factor analysis groups variables with similar characteristics together. Therefore,
with factor analysis, you can produce a small number of factors from a large
number of variables.
 One can use the reduced factors for further analysis.
Factor Analysis: How it Works
 When Factor Analysis is applied to the dataset, variables with high correlation
are grouped together.
Important Terms in Factor Analysis
• Communality: Communality is the amount of variance a variable shares with
all the other variables being considered. Small values indicate variables that do
not fit well with the factor solution and should possibly be dropped from the
analysis. Normally, variables with values below 0.50 are removed.
• Eigenvalue: The eigenvalue represents the total variance explained by each
factor. Factors having eigenvalues over one (1) are selected for further study.
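The eigenvalue rule can be illustrated outside SPSS. The sketch below finds the eigenvalues of a correlation matrix with the classical Jacobi rotation method and keeps the factors whose eigenvalue exceeds 1. The function name `jacobi_eigenvalues` and the 4 x 4 correlation matrix are hypothetical; communalities are not computed here because they also need the factor loadings.

```python
from math import atan2, cos, sin


def jacobi_eigenvalues(matrix, max_rotations=500, tol=1e-12):
    """Eigenvalues of a symmetric matrix (at least 2 x 2), largest first."""
    n = len(matrix)
    a = [row[:] for row in matrix]      # work on a copy
    for _ in range(max_rotations):
        # Pick the largest off-diagonal element and rotate it to zero.
        p, q = max(
            ((i, j) for i in range(n) for j in range(i + 1, n)),
            key=lambda ij: abs(a[ij[0]][ij[1]]),
        )
        if abs(a[p][q]) < tol:
            break
        theta = 0.5 * atan2(2 * a[p][q], a[q][q] - a[p][p])
        c, s = cos(theta), sin(theta)
        for k in range(n):              # rotate columns p and q
            akp, akq = a[k][p], a[k][q]
            a[k][p] = c * akp - s * akq
            a[k][q] = s * akp + c * akq
        for k in range(n):              # rotate rows p and q
            apk, aqk = a[p][k], a[q][k]
            a[p][k] = c * apk - s * aqk
            a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


# Hypothetical correlation matrix: variables 1-2 move together, as do 3-4.
correlations = [
    [1.0, 0.8, 0.1, 0.1],
    [0.8, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.7],
    [0.1, 0.1, 0.7, 1.0],
]

eigenvalues = jacobi_eigenvalues(correlations)
retained = [value for value in eigenvalues if value > 1]
print("Eigenvalues:", [round(v, 3) for v in eigenvalues])
print("Factors retained (eigenvalue > 1):", len(retained))
```

The eigenvalues of a correlation matrix always add up to the number of variables, so each eigenvalue divided by that number gives the share of total variance a factor explains.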
Let’s Understand Practically - Factor Analysis
 Data Description: Jet Airways Feedback Data
 Sample Size: 20
 Parameters: 10
 Scale: 1 to 7
o 1 Strongly Disagree
o 2 Relatively Disagree
o 3 Disagree
o 4 Neutral
o 5 Agree
o 6 Relatively Agree
o 7 Strongly Agree
Parameters
 JA is always on time.
 Seats are comfortable.
 Love the food they provide.
 Air Hostesses are beautiful.
 My boss/ friends also use the same.
 JA has younger aircraft.
 I benefit from a frequent flyer program.
 Flight timings suit my schedule.
 I feel safe.
 JA matches my lifestyle and standard.
Advantages
 It can be used to identify hidden dimensions or constructs which may or
may not be apparent from direct analysis.
 It is not extremely difficult to do and, at the same time, it is inexpensive and gives
accurate results.
Disadvantages
 The usefulness depends on the researcher's ability to develop a complete and accurate
set of product attributes. If important attributes are missed, the value of the procedure is
reduced accordingly.
 Naming the factors can be difficult, because multiple attributes can be highly correlated for no
apparent reason.
 If the observed variables are completely unrelated, factor analysis is unable to
produce a meaningful pattern.
Cluster Analysis
 Cluster analysis is a powerful data-mining tool that helps organizations to identify
discrete groups of customers, sales transactions, or other types of behaviours and things.
 For example, insurance providers use cluster analysis to detect fraudulent claims, and
banks use it for credit scoring.
 The most common use of cluster analysis is classification.
 Subjects are separated into groups so that each subject is more similar to other subjects in
its group (called a cluster) than to subjects outside the group.
 This technique is used for segmentation.
Application - Segmentation
 A company wants to launch a mobile for INR 100000.
 How to decide whom to target for high sales?
Similarity between Factor Analysis &
Cluster Analysis
 Cluster analysis and factor analysis are two common statistical
methods that data analysts use to explore and simplify complex data
sets.
 They both aim to group variables or observations based on some
measure of similarity or correlation, but they differ in their purposes
and assumptions.
Difference between Factor Analysis &
Cluster Analysis
 In Factor Analysis we look at Correlation but in Cluster Analysis, we
look at the distance.
 In Factor Analysis, we group the statements but in Cluster Analysis, we
group the respondents.
Most Important Types of Cluster Analysis
 TwoStep
TwoStep Cluster is a two-step clustering method. The first step makes a single pass through
the data, during which it compresses the raw input data into a manageable set of subclusters.
The second step uses a hierarchical clustering method to progressively merge the subclusters
into larger and larger clusters, without requiring another pass through the data.
 K – Means
K-means clustering is one of the most often used methods. It works in a space that has
as many dimensions as there are input variables and assigns each observation to the
nearest of K cluster centres. K stands for the number of clusters.
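A bare-bones version of the K-means loop looks like this. It is a teaching sketch, not the SPSS implementation: the function name `k_means` is made up, the first K points are used as starting centres (a simplifying assumption; real tools choose starting centres more carefully), and the six two-dimensional points are invented.

```python
from math import dist


def k_means(points, k, max_iterations=100):
    """Cluster points into k groups; returns (centroids, labels)."""
    centroids = [tuple(p) for p in points[:k]]   # naive start: first k points
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its members.
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:           # nothing moved: converged
            break
        centroids = new_centroids
    labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
    return centroids, labels


# Hypothetical respondents scored on two input variables.
points = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
centroids, labels = k_means(points, k=2)
print("Centroids:", centroids)
print("Cluster of each point:", labels)
```

Each point ends up with the label of its closest centre, which is the "group the respondents by distance" idea described earlier.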
Let’s Understand Practically - Cluster Analysis
 Let’s take the same example which we took in Factor Analysis
Important Terms in Cluster Analysis
 Silhouette Value
The silhouette value is a measure of how similar an object is to its own cluster (cohesion)
compared to other clusters (separation). The silhouette ranges from −1 to +1, where a
high value indicates that the object is well matched to its own cluster and poorly
matched to neighboring clusters.
 The Euclidean Distance
The Euclidean distance process determines the proximity between observations by
drawing a straight line between pairs of observations. Therefore, this process measures
the distance between observations by looking at the length of this line between
observations.
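Both terms can be computed by hand for a small example. The sketch below uses `math.dist` for the Euclidean distance and a hypothetical helper, `silhouette_values`, for the silhouette; it assumes at least two clusters and gives a point that is alone in its cluster a value of 0, which is one common convention. The points and labels are invented.

```python
from math import dist


def silhouette_values(points, labels):
    """One silhouette value per point, based on Euclidean distance."""
    cluster_ids = set(labels)
    values = []
    for i, p in enumerate(points):

        def mean_distance_to(cluster):
            # Average straight-line distance from p to the cluster's members.
            others = [dist(p, q) for j, q in enumerate(points)
                      if labels[j] == cluster and j != i]
            return sum(others) / len(others) if others else None

        a = mean_distance_to(labels[i])           # cohesion: own cluster
        if a is None:                             # point is alone in its cluster
            values.append(0.0)
            continue
        # Separation: the closest of the other clusters.
        b = min(mean_distance_to(c) for c in cluster_ids if c != labels[i])
        values.append((b - a) / max(a, b))
    return values


# Hypothetical points forming two tight, well-separated clusters.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
labels = [0, 0, 1, 1]

print("Distance between first two points:", dist(points[0], points[1]))
values = silhouette_values(points, labels)
print("Silhouette values:", [round(v, 3) for v in values])
print("Average silhouette:", round(sum(values) / len(values), 3))
```

Values close to +1, as in this example, mean that each point sits much nearer to its own cluster than to the neighbouring one.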
Discriminant Analysis
 The primary function of this technique is to assign each observation to a particular group
or category according to the data’s independent characteristics.
 This is similar to Regression Analysis and is used to assess the relationship between
dependent & independent variables.
 In Discriminant Analysis, the dependent variable is categorical (non-metric).
 The independent variables are called discriminating variables, as they are used to discriminate between the groups of respondents.
Let’s do it Practically - Discriminant Analysis
 There are 3 rounds conducted by a company to hire a candidate. The data for the same has
been taken to declare the result.
Important Terms in Discriminant Analysis
 Wilks' Lambda ranges from 0 to 1. A low value reflects high significance.
 The F Test should show a p value less than 0.05.
 The larger the absolute value of a standardized coefficient, the better the predictive power of
that variable.
 Canonical Correlation: Should be close to 1 for a strong correlation.
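To make Wilks' Lambda less abstract, here is a sketch for the simplest case of a single predictor, where it reduces to the within-group sum of squares divided by the total sum of squares. With several predictors the same ratio is taken between determinants of the within-group and total matrices, which is not shown here. The function name `wilks_lambda` and the scores are hypothetical.

```python
from statistics import mean


def wilks_lambda(groups):
    """Wilks' Lambda for one predictor; groups maps a label to its scores."""
    all_scores = [s for scores in groups.values() for s in scores]
    grand_mean = mean(all_scores)
    # Total variation around the grand mean.
    ss_total = sum((s - grand_mean) ** 2 for s in all_scores)
    # Variation left over inside each group, around that group's own mean.
    ss_within = 0.0
    for scores in groups.values():
        group_mean = mean(scores)
        ss_within += sum((s - group_mean) ** 2 for s in scores)
    return ss_within / ss_total


# Hypothetical test scores for candidates in two outcome groups.
test_scores = {
    "selected": [8, 9, 10],
    "rejected": [2, 3, 4],
}

lam = wilks_lambda(test_scores)
print(f"Wilks' Lambda = {lam:.3f}")
```

A value near 0 means almost all the variation lies between the groups, so the predictor separates them well; a value near 1 means the group means barely differ.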
Insights
 The eigenvalue is > 1, so it is a good model.
 The canonical correlation is near 1, so a strong correlation is present.
 Wilks' Lambda is towards 0 (0.454), which means high significance. High significance
means better discriminating power of the model.
 P is < 0.05, so the discrimination between the groups is highly significant.
 Test1 has the highest power of discrimination, followed by Interview and then Test2, going by the coefficients.
 Best predictors, in order: Test1, Interview, Test2, going by the Structure Matrix.
Multidimensional Scaling
 Multidimensional scaling is a visual representation of distances or dissimilarities between
sets of objects.
 Multidimensional Scaling is a family of statistical methods that focus on creating
mappings of items based on distance.
 The input to multidimensional scaling is a distance matrix. The output is typically a two-
dimensional scatterplot, where each of the objects is represented as a point.
 MDS is more impactful because pictures are easier to interpret than numbers & tables.
Applications - Multidimensional Scaling
 To identify the image/ position of a product in consumers’ mind.
 The number & nature of dimensions consumers use to perceive a brand.
 To understand market gap so that a company can fit a new product in the market.
 Also called Perceptual Mapping: it maps consumers' perceptions of the product, which
a marketer always needs.
 Market Segmentation
 Assessing advertising effectiveness
 Pricing Analysis
 Channel Decisions
Terms Associated - Multidimensional Scaling
 Stress: This is a lack-of-fit measure; higher values of stress indicate a poorer fit. It must be <
0.02 for a great fit.
 RSQ (Squared Correlation): It must be > 0.7 for a great fit.
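The two fit measures can be sketched for a map that has already been drawn. The code below computes Kruskal's Stress-1 and the squared correlation between the input dissimilarities and the distances on the map. It treats the input dissimilarities directly as the target distances (a metric simplification), so the numbers need not match what SPSS reports. The function name `mds_fit`, the brand labels and all values are hypothetical.

```python
from itertools import combinations
from math import dist, sqrt
from statistics import mean


def mds_fit(coords, dissimilarity):
    """Return (stress, rsq) for a map.

    coords maps each object to its (x, y) position on the map;
    dissimilarity maps each alphabetically ordered pair to its input value.
    """
    pairs = list(combinations(sorted(coords), 2))
    on_map = [dist(coords[a], coords[b]) for a, b in pairs]
    given = [dissimilarity[(a, b)] for a, b in pairs]

    # Stress-1: size of the mismatch relative to the size of the map distances.
    stress = sqrt(
        sum((m - g) ** 2 for m, g in zip(on_map, given))
        / sum(m ** 2 for m in on_map)
    )

    # RSQ: squared correlation between input values and map distances.
    m_bar, g_bar = mean(on_map), mean(given)
    cov = sum((m - m_bar) * (g - g_bar) for m, g in zip(on_map, given))
    var_m = sum((m - m_bar) ** 2 for m in on_map)
    var_g = sum((g - g_bar) ** 2 for g in given)
    rsq = cov ** 2 / (var_m * var_g)
    return stress, rsq


# Hypothetical two-dimensional map of three brands and their input dissimilarities.
coords = {"Brand A": (0, 0), "Brand B": (3, 0), "Brand C": (0, 4)}
dissimilarity = {
    ("Brand A", "Brand B"): 3,
    ("Brand A", "Brand C"): 4,
    ("Brand B", "Brand C"): 5,
}

stress, rsq = mds_fit(coords, dissimilarity)
print(f"Stress = {stress:.3f}, RSQ = {rsq:.3f}")
```

In this constructed example the map reproduces the input exactly, so stress is 0 and RSQ is 1; real survey data will always leave some mismatch.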
Steps to Conduct MDS
 Formulate the Problem: Specify the purpose of Analysis, Number of Brands (8 to 25) to be
included in the analysis.
 Obtain the Input Data: Refer to the next slide.
 Select the MDS Procedure: Perceptions: To Create Spatial Map, Preference: To Decide the
Dimensions.
 Decide on the number of Dimensions: Not more than 3 else it becomes complicated.
 Label the Dimensions and Interpret the Configuration: Will do practically.
 Assess Reliability & Validity: Will do practically.
Obtain The Input Data
MDS Input Data
• Perceptions
o Direct (Similarity Judgements)
o Derived (Attribute Ratings)
• Preferences
Example - Multidimensional Scaling
 5 Brands of Mobile Phones – Vivo, Samsung, Mi, Oppo, Huawei
 2 Dimensions - Economic, Features (Look wise)
 Scale: 0 to 10, 0 – Dissimilar, 10 – Similar
Conjoint Analysis
 Conjoint analysis is a form of statistical analysis that firms use in market research to
understand how customers value different components or features of their products or
services.
 Conjoint analysis is a statistical analysis and marketing research technique to measure
what consumers value most about your products and services.
 It is a survey-based statistical analysis method.
 For example, a TV manufacturer would want to know if customers value picture or sound
quality more, or if they value low price more than picture quality.
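The TV example can be turned into a small calculation. The sketch below assumes a balanced design in which a respondent rates every combination of levels; in that case the part-worth of a level is simply its average rating minus the overall average, and an attribute's importance is its share of the total part-worth range. The function names, attribute levels and ratings are all hypothetical.

```python
from statistics import mean


def part_worths(profiles, ratings):
    """Part-worth utility of each level of each attribute (balanced design)."""
    grand_mean = mean(ratings)
    worths = {}
    for attribute in profiles[0]:
        levels = {profile[attribute] for profile in profiles}
        worths[attribute] = {
            level: mean([r for p, r in zip(profiles, ratings)
                         if p[attribute] == level]) - grand_mean
            for level in levels
        }
    return worths


def importance(worths):
    """Relative importance of each attribute: its share of the total range."""
    ranges = {a: max(w.values()) - min(w.values()) for a, w in worths.items()}
    total = sum(ranges.values())
    return {a: r / total for a, r in ranges.items()}


# Hypothetical TV profiles and one respondent's ratings on a 1-10 scale.
profiles = [
    {"picture": "4K", "price": "low"},
    {"picture": "4K", "price": "high"},
    {"picture": "HD", "price": "low"},
    {"picture": "HD", "price": "high"},
]
ratings = [9, 6, 5, 2]

worths = part_worths(profiles, ratings)
weights = importance(worths)
print("Part-worths:", worths)
print("Importance:", {a: round(w, 3) for a, w in weights.items()})
```

For this made-up respondent, picture quality carries a larger share of the preference than price, which is the kind of trade-off conjoint analysis is meant to reveal.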
Conjoint Analysis – Use Cases
 Buyer decisions
 Customer preferences
 Market sales
 New product pricing
 Selection of the best service or product feature
Conjoint Analysis – Let’s Do It Practically
 Juices Example
Nutritional Needs Presentation - HLTH 104misteraugie
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesFatimaKhan178732
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
The byproduct of sericulture in different industries.pptx
The byproduct of sericulture in different industries.pptxThe byproduct of sericulture in different industries.pptx
The byproduct of sericulture in different industries.pptxShobhayan Kirtania
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Celine George
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxSayali Powar
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfchloefrazer622
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Disha Kariya
 

Recently uploaded (20)

Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and Actinides
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
The byproduct of sericulture in different industries.pptx
The byproduct of sericulture in different industries.pptxThe byproduct of sericulture in different industries.pptx
The byproduct of sericulture in different industries.pptx
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdf
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 

Market Research using SPSS _ Edu4Sure Sept 2023.ppt

  • 1. Advanced Statistical Analysis Tool: Statistical Package for the Social Sciences
  • 2. What is SPSS • IBM SPSS Modeler is a data mining and text analytics application from IBM. It is a comprehensive predictive analytics platform designed to bring predictive intelligence to decisions made by individuals, groups, systems, or an enterprise as a whole. • It is used to build predictive models and conduct analytics tasks through its user-friendly visual interface, which leverages statistical and data mining algorithms without programming. • SPSS originally stood for Statistical Package for the Social Sciences; it now stands for Statistical Product and Service Solutions.
  • 3. Advantages of SPSS • Analyze and better understand your data, and solve complex business problems. • Understand large and complex data sets quickly with advanced statistical procedures that help to ensure high accuracy and quality decision-making.
  • 4. Disadvantages of SPSS • It is not well suited to analyzing very large ("big data") data sets. • It is an expensive tool. • Its graph features are not as simple as those of Microsoft Excel.
  • 5. Competitors • R • SAS • Python • Minitab • Eviews • Origin
  • 6. Flow of Course • Module 1, Introduction to Data Mining: Introduction to Data Mining; SPSS Modeler Interface • Module 2, The Data Mining Process: CRISP-DM Methodology • Module 3, Modeling Techniques: Multiple Regression Analysis; Factor Analysis; Cluster Analysis; Discriminant Analysis; Multidimensional Scaling; Conjoint Analysis
  • 7. Module 1 Introduction to Data Mining
  • 8. Introduction to Data Mining • Why Data Mining? How do we consume the available data, translate it into information, and make it usable? • What is Data Mining? The process of discovering insights, patterns, and relationships in large amounts of data. • What knowledge can be extracted? Descriptive: What has happened, and why did it happen? Predictive: What is likely to happen next?
  • 9. Why is Data Mining Important & its Applications • Data and Analytics: Create New Business Models • CMO: Attract, grow and retain customers • COO: Optimize Operations; Counter Frauds & Threats • CIO/CDO: Maximize Insights, Ensure Trust, Improve Economics • CRO: Manage Risks • CFO: Transform management and financial processes
  • 10. Identify the Data Mining! • Dividing the customers of a company according to their gender • Computing the total sales of a company • Sorting a student database by student identification number • Predicting the outcomes of tossing a pair of fair dice • Predicting the future stock price of a company using historical records
  • 11. SPSS Modeler Interface (screenshot labels) • Stream Canvas • Stream, Output, Model Manager • Data Mining Lifecycle • Palettes • Nodes
  • 12. Module 2 Data Mining Process
  • 13. Data Mining Process: CRISP-DM • CRISP-DM stands for Cross-Industry Standard Process for Data Mining. • Business Understanding: What should be accomplished from a business perspective? • Data Understanding: Acquiring the data needed to accomplish the objective. • Data Preparation: Selecting and cleaning the data; it may be transformed or aggregated for analysis. • Modeling: Selecting a technique, building and training the model, and assessing its accuracy. • Evaluation: Does the model meet the business objectives? • Deployment: Strategy for deploying the model.
  • 14.
  • 15. Module 3 Modeling Techniques
  • 16. Regression Analysis • Regression is used to assess the strength of the relationship between one dependent variable and one or more independent variables. It helps in predicting the value of the dependent variable from the independent variable(s). • Two types of regression: • Bivariate Regression: one dependent variable, one independent variable. • Multiple Regression: one dependent variable, two or more independent variables.
  • 17. Regression Examples: Understand Variables • Does a price reduction have any impact on increasing sales? • Do advertising spend, the number of products introduced, and the number of sales personnel have any effect on sales? • Does female literacy have any impact on increasing the marriage age of the female child?
  • 18. Bivariate Regression: Introduction • The simplest form of regression analysis is called Bivariate Regression. • It includes 2 variables. • One dependent variable that needs to be predicted or explained. • One independent variable that explains the variance in the dependent variable. • Regression analysis predicts the value of the dependent variable from the values of the independent variables by estimating an equation. • Example on the next slide.
  • 19. Regression Methods • Enter: All independent variables are entered into the equation in one step, also called "forced entry". • Forward: A variable selection method that begins with a model containing no variables (the Null Model) and then adds the most significant variables one after the other. • Backward: A variable selection method that begins with a model containing all variables under consideration (the Full Model) and then removes the least significant variables one after the other. • Stepwise: A combination of Backward and Forward; it keeps adding and removing predictors as it builds the model.
  • 20. Bivariate Regression: Example • A marketing manager wants to know whether the variation in sales can be explained by the variation in advertising spend. • The equation can be written as: Sales = B0 + B1 (Advertising Budget) + Error Term • B0 = sales with zero advertising spend • B1 = beta coefficient (the change in sales for a one-unit change in the advertising budget) • Error Term: other factors that affect sales but are not included in the model
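The equation on this slide can be estimated by ordinary least squares. The following is a minimal sketch, not the SPSS procedure itself; the advertising and sales figures are made up for illustration, and `fit_bivariate` is a hypothetical helper name.

```python
# Hypothetical data (assumption): advertising budget and sales in arbitrary units.
advertising = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [12.0, 14.0, 16.0, 18.0, 20.0]


def fit_bivariate(x, y):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    # Intercept: the predicted value of y when x is zero.
    b0 = mean_y - b1 * mean_x
    return b0, b1


b0, b1 = fit_bivariate(advertising, sales)
print(f"Sales = {b0:.2f} + {b1:.2f} * Advertising Budget")
```

Here `b0` plays the role of "sales with zero advertising spend" and `b1` is the change in sales per one-unit change in the budget. Because the made-up data lie exactly on a line, the error term is zero in this example.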
  • 21. Let’s Understand Practically - Bivariate Regression • Data Description: Each record gives a person's years of education and weekly earnings (for example, a person with 8 years of education earns $77 per week). • Problem Statement: • Does an increase in education level have any impact on weekly earnings? • If education increases by one unit, by how much do weekly earnings increase? • Hypothesis: Yes, there is a significant impact.
  • 22. Important Terms in Regression • Regression Coefficient: A measure of how strongly each IV (independent variable, also known as a predictor variable) predicts the DV (dependent variable). • R Values: Regression output provides two R values. R is the correlation between the observed values of the DV and the values predicted by the regression equation. R Square shows how much of the variance in the DV is explained by the IV(s). For example, an R Square value of 0.70 means that the IVs in the model explain 70% of the variance in the DV.
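To make the two R values concrete, here is a small sketch that computes them by hand for a bivariate regression. The data are invented for illustration; SPSS reports the same quantities in its Model Summary table, but this code is only an assumption-laden stand-in for that output.

```python
import math

# Hypothetical data (assumption): one IV and one DV.
iv = [1.0, 2.0, 3.0, 4.0, 5.0]
dv = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(iv)
mean_iv = sum(iv) / n
mean_dv = sum(dv) / n

# Least-squares slope and intercept for DV = b0 + b1 * IV.
b1 = sum((x - mean_iv) * (y - mean_dv) for x, y in zip(iv, dv)) / sum(
    (x - mean_iv) ** 2 for x in iv
)
b0 = mean_dv - b1 * mean_iv
predicted = [b0 + b1 * x for x in iv]

# R Square = 1 - (unexplained variation / total variation).
ss_residual = sum((y - p) ** 2 for y, p in zip(dv, predicted))
ss_total = sum((y - mean_dv) ** 2 for y in dv)
r_square = 1 - ss_residual / ss_total

# R is the correlation between the observed and the predicted DV values.
r = math.sqrt(r_square)

print(f"R = {r:.3f}, R Square = {r_square:.3f}")
```

For these numbers R Square comes out at 0.60, so the IV explains 60% of the variance in the DV.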
  • 23. Important Terms in Regression • T Value: As a rule of thumb, a t-value greater than +2 or less than -2 is acceptable. The larger the absolute t-value, the greater the confidence we have in the coefficient as a predictor; t-values near zero indicate that the predictive power of that coefficient is unreliable. • F Value = Explained Variance / Unexplained Variance. A rule of thumb often used in regression analysis is that if F > 2.5 we can reject the null hypothesis and conclude that at least one parameter value is nonzero. • Df (Degrees of Freedom): For the regression (model) row, the number of independent variables k in the regression model; the residual row has n - k - 1 degrees of freedom for n observations.
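The t-value, F-value, and degrees of freedom can be traced through a small hand calculation. This is a sketch on made-up data for the bivariate case; the rules of thumb quoted on the slide are not a substitute for the p-values that SPSS prints, which this example does not compute.

```python
import math

# Hypothetical data (assumption): one IV and one DV.
iv = [1.0, 2.0, 3.0, 4.0, 5.0]
dv = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(iv)
k = 1  # number of independent variables

mean_iv = sum(iv) / n
mean_dv = sum(dv) / n
ss_iv = sum((x - mean_iv) ** 2 for x in iv)
b1 = sum((x - mean_iv) * (y - mean_dv) for x, y in zip(iv, dv)) / ss_iv
b0 = mean_dv - b1 * mean_iv

ss_residual = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(iv, dv))
ss_total = sum((y - mean_dv) ** 2 for y in dv)
ss_regression = ss_total - ss_residual

df_regression = k        # degrees of freedom for the model
df_residual = n - k - 1  # degrees of freedom for the residual

# F = explained variance / unexplained variance (mean squares).
mean_square_residual = ss_residual / df_residual
f_value = (ss_regression / df_regression) / mean_square_residual

# t = coefficient divided by its standard error.
standard_error_b1 = math.sqrt(mean_square_residual / ss_iv)
t_value = b1 / standard_error_b1

print(f"t = {t_value:.3f}, F = {f_value:.3f}, df = ({df_regression}, {df_residual})")
```

With a single predictor the F-value equals the square of the t-value, which is a quick consistency check on regression output.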
  • 24. Important Terms in Regression • Beta Coefficient (Unstandardized): The change in the outcome variable, in raw units, for every 1-unit change in the predictor variable. It ranges from -∞ to +∞. • Beta Coefficient (Standardized): The same change expressed in standard deviations, which makes predictors measured in different units comparable. • R: The correlation coefficient, which describes the relationship between two variables. It ranges from -1 (completely negative correlation) to +1 (completely positive correlation).
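The link between the two kinds of beta coefficient is a rescaling by the standard deviations of the predictor and the outcome. A minimal sketch, assuming invented data and the sample standard deviation from Python's `statistics` module:

```python
import statistics

# Hypothetical data (assumption): one predictor and one outcome.
predictor = [1.0, 2.0, 3.0, 4.0, 5.0]
outcome = [2.0, 4.0, 5.0, 4.0, 5.0]

mean_x = statistics.mean(predictor)
mean_y = statistics.mean(outcome)

# Unstandardized coefficient: change in the outcome (raw units) per 1-unit change in the predictor.
b_unstandardized = sum(
    (x - mean_x) * (y - mean_y) for x, y in zip(predictor, outcome)
) / sum((x - mean_x) ** 2 for x in predictor)

# Standardized coefficient: the same slope expressed in standard deviations.
beta_standardized = (
    b_unstandardized * statistics.stdev(predictor) / statistics.stdev(outcome)
)

print(f"unstandardized = {b_unstandardized:.3f}, standardized = {beta_standardized:.3f}")
```

In a bivariate regression the standardized coefficient equals the correlation R, so its square equals R Square.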
  • 25. Important Terms in Regression
  • 26. Insights • A one-standard-deviation increase in education level is associated with a 0.423 standard-deviation increase in weekly earnings (0.423 appears to be the standardized coefficient, since 0.423² ≈ 0.179 matches the R Square value). • The R Square value suggests that 17.9% of the variability in weekly earnings is explained by education level. • Significance: ? (Tell me)
  • 27. Multiple Regression: Introduction • One dependent variable and more than one independent variable. • Everything else is the same as in Bivariate Regression.
  • 28. Let’s Understand Practically - Multiple Regression • Data Description: The age, weight, and blood pressure (BP) of different people are given. • Problem Statement: • Do Age and Weight have any impact on a person's BP? • If Age or Weight increases by one unit, by how much does BP change? • Hypothesis: Yes, there is a significant impact.
  • 29. Insights • Because Age and Weight are measured in different units, their effects are compared using the standardized coefficients. • A one-standard-deviation increase in age is associated with a 0.346 standard-deviation increase in BP. • A one-standard-deviation increase in weight is associated with a 0.481 standard-deviation increase in BP. • Both predictors are significant. • The R Square value suggests that 55.8% of the variability in BP is explained by Age and Weight.
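A multiple regression with two predictors can be sketched by solving the normal equations directly. Everything below is an illustration under stated assumptions: the age, weight, and BP values are invented (BP was constructed as 50 + 0.5 × age + 0.8 × weight, so the fit is exact), and `solve` is a hypothetical helper, not part of any library. The slide's own coefficients (0.346 and 0.481) come from a data set that is not reproduced here.

```python
import statistics

# Hypothetical data (assumption): BP = 50 + 0.5 * age + 0.8 * weight, with no noise.
age = [30.0, 40.0, 50.0, 60.0, 45.0, 35.0]
weight = [60.0, 80.0, 70.0, 90.0, 65.0, 85.0]
bp = [113.0, 134.0, 131.0, 152.0, 124.5, 135.5]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


# Normal equations (X'X) b = X'y, with a leading column of ones for the intercept.
rows = [[1.0, a, w] for a, w in zip(age, weight)]
xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
xty = [sum(r[i] * y for r, y in zip(rows, bp)) for i in range(3)]
b0, b_age, b_weight = solve(xtx, xty)

# Standardized coefficients make predictors in different units comparable.
beta_age = b_age * statistics.stdev(age) / statistics.stdev(bp)
beta_weight = b_weight * statistics.stdev(weight) / statistics.stdev(bp)

print(f"BP = {b0:.2f} + {b_age:.2f} * Age + {b_weight:.2f} * Weight")
print(f"standardized: age = {beta_age:.3f}, weight = {beta_weight:.3f}")
```

The unstandardized coefficients answer "how much does BP change per year or per unit of weight", while the standardized ones answer "which predictor matters more".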
  • 30. Factor Analysis • Factor analysis is a technique used to reduce a large number of variables to a smaller number of factors. • Factor analysis groups variables with similar characteristics together. Therefore, with factor analysis, you can produce a small number of factors from a large number of variables. • The reduced factors can be used for further analysis.
  • 31. Factor Analysis: How it Works • When Factor Analysis is applied to the dataset, variables with high correlation are grouped together.
  • 32. Important Terms in Factor Analysis • Communality: The amount of variance a variable shares with all the other variables being considered. Small values indicate variables that do not fit well with the factor solution and should possibly be dropped from the analysis. Normally, variables with values less than 0.50 are removed. • Eigenvalue: The amount of variance explained by a factor. Factors with eigenvalues greater than 1 are selected for further study.
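As a rough illustration of eigenvalues and communalities, the sketch below extracts the first principal component of a small correlation matrix by power iteration. The matrix is hypothetical (three survey items that all correlate at 0.5), `largest_eigenpair` is an illustrative helper name, and principal-component extraction with a single factor is an assumption; SPSS offers several extraction methods and normally retains more than one factor.

```python
import math

# Hypothetical correlation matrix (assumption): three items, all pairwise r = 0.5.
corr = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]


def largest_eigenpair(matrix, iterations=200):
    """Return the largest eigenvalue and its unit eigenvector via power iteration."""
    n = len(matrix)
    vector = [1.0] + [0.0] * (n - 1)
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in product))
        vector = [v / norm for v in product]
    product = [sum(matrix[i][j] * vector[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(v * p for v, p in zip(vector, product))  # Rayleigh quotient
    return eigenvalue, vector


eigenvalue, vector = largest_eigenpair(corr)

# Loadings on the factor, and the communality of each item (squared loading).
loadings = [math.sqrt(eigenvalue) * v for v in vector]
communalities = [loading ** 2 for loading in loadings]

keep_factor = eigenvalue > 1.0  # the "eigenvalue greater than 1" rule from the slide
print(f"eigenvalue = {eigenvalue:.3f}, keep factor: {keep_factor}")
print("communalities:", [round(c, 3) for c in communalities])
```

With one retained factor, each communality is simply the squared loading; with several factors it would be the sum of squared loadings across them.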
  • 33. Let’s Understand Practically - Factor Analysis • Data Description: Jet Airways Feedback Data • Sample Size: 20 • Parameters: 10 • Scale: 1 to 7 (1 Strongly Disagree, 2 Relatively Disagree, 3 Disagree, 4 Neutral, 5 Agree, 6 Relatively Agree, 7 Strongly Agree)
  • 34. Parameters • JA is always on time. • Seats are comfortable. • Love the food they provide. • Air Hostesses are beautiful. • My boss/friends also use the same. • JA has newer aircraft. • I benefit from a frequent flyer program. • Flight timings suit my schedule. • I feel safe. • JA matches my lifestyle and standard.
  • 35. Advantages • It can be used to identify hidden dimensions or constructs that may or may not be apparent from direct analysis. • It is not extremely difficult to do, and at the same time it is inexpensive and gives accurate results.
  • 36. Disadvantages • Its usefulness depends on the researcher’s ability to develop a complete and accurate set of product attributes. If important attributes are missed, the value of the procedure is reduced accordingly. • Naming the factors can be difficult: multiple attributes can be highly correlated for no apparent reason. • If the observed variables are completely unrelated, factor analysis is unable to produce a meaningful pattern.
  • 37. Cluster Analysis • Cluster analysis is a powerful data-mining tool that helps organizations identify discrete groups of customers, sales transactions, or other types of behaviours and things. • For example, insurance providers use cluster analysis to detect fraudulent claims, and banks use it for credit scoring. • The most common use of cluster analysis is classification. • Subjects are separated into groups so that each subject is more similar to other subjects in its group (called a cluster) than to subjects outside the group. • This technique is used for segmentation.
  • 38. Application - Segmentation • A company wants to launch a mobile phone priced at INR 100,000. • How should it decide whom to target for high sales?
  • 39. Similarity between Factor Analysis & Cluster Analysis • Cluster analysis and factor analysis are two common statistical methods that data analysts use to explore and simplify complex data sets. • They both aim to group variables or observations based on some measure of similarity or correlation, but they differ in their purposes and assumptions.
  • 40. Difference between Factor Analysis & Cluster Analysis • In Factor Analysis we look at correlation, but in Cluster Analysis we look at distance. • In Factor Analysis we group the statements, but in Cluster Analysis we group the respondents.
  • 41. Most Important Types of Cluster Analysis • TwoStep: TwoStep Cluster is a two-step clustering method. The first step makes a single pass through the data, during which it compresses the raw input data into a manageable set of subclusters. The second step uses a hierarchical clustering method to progressively merge the subclusters into larger and larger clusters, without requiring another pass through the data. • K-Means: K-means clustering is one of the most often used methods. It works in a space that has as many dimensions as there are input variables. K stands for the number of clusters.
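The K-means idea can be sketched in a few lines: assign each observation to its nearest centroid, recompute the centroids, and repeat. This is a bare-bones illustration with invented two-dimensional points and hand-picked starting centroids, not the algorithm as implemented in SPSS Modeler's K-Means node; `k_means` is a hypothetical helper name.

```python
import math

# Hypothetical respondents (assumption) described by two input variables.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]


def k_means(points, centroids, iterations=10):
    """Alternate between assigning points to centroids and updating the centroids."""
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        centroids = new_centroids
    return labels, centroids


# K = 2, starting from the first and last points as initial centroids.
labels, centroids = k_means(points, [points[0], points[-1]])
print("labels:", labels)
print("centroids:", centroids)
```

Real implementations also choose starting centroids more carefully and stop when assignments no longer change; both are omitted here to keep the sketch short.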
  • 42. Let’s Understand Practically - Cluster Analysis • Let’s take the same example that we used in Factor Analysis.
  • 43. Important Terms in Cluster Analysis • Silhouette Value: A measure of how similar an object is to its own cluster (cohesion) compared to other clusters (separation). The silhouette ranges from −1 to +1, where a high value indicates that the object is well matched to its own cluster and poorly matched to neighboring clusters. • Euclidean Distance: Determines the proximity between observations as the length of the straight line drawn between each pair of observations.
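Both terms can be computed directly. The sketch below uses `math.dist` (Python 3.8+) for the Euclidean distance and the usual definition of the silhouette, (b − a) / max(a, b), where a is the mean distance to the other members of the object's own cluster and b is the lowest mean distance to any other cluster. The clusters are invented for illustration and `silhouette` is a hypothetical helper name.

```python
import math

# Hypothetical clusters (assumption): two groups of observations on a line.
clusters = {
    "A": [(0.0, 0.0), (1.0, 0.0)],
    "B": [(4.0, 0.0), (5.0, 0.0)],
}


def silhouette(point, own, clusters):
    """Silhouette value of one point, given the name of its own cluster."""
    others_in_own = [p for p in clusters[own] if p != point]
    # a: cohesion, the mean distance to the other members of the same cluster.
    a = sum(math.dist(point, p) for p in others_in_own) / len(others_in_own)
    # b: separation, the smallest mean distance to the members of another cluster.
    b = min(
        sum(math.dist(point, p) for p in members) / len(members)
        for name, members in clusters.items()
        if name != own
    )
    return (b - a) / max(a, b)


straight_line = math.dist((0.0, 0.0), (3.0, 4.0))  # Euclidean distance
score = silhouette((0.0, 0.0), "A", clusters)

print(f"Euclidean distance = {straight_line}")
print(f"silhouette = {score:.3f}")
```

A silhouette close to +1, as here, indicates that the observation sits well inside its own cluster.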
  • 44. Discriminant Analysis • The primary function of this technique is to assign each observation to a particular group or category according to the data’s independent characteristics. • It is similar to Regression Analysis and is used to assess the relationship between dependent and independent variables. • In Discriminant Analysis, the dependent variable is categorical (non-metric). • The dependent variable is called the discriminant variable because it discriminates between the respondents.
  • 45. Let’s do it Practically - Discriminant Analysis • A company conducts 3 rounds to hire a candidate. The data from these rounds are used to declare the result.
  • 46. Important Terms in Discriminant Analysis • Wilks’ Lambda: A low value reflects high significance. It ranges from 0 to 1. • The F test should show a p value of less than 0.05. • The larger the absolute value of a standardized coefficient, the better the predictive power of that variable. • Canonical Correlation: Should be close to 1 for a strong correlation.
  • 47. Insights • The eigenvalue is > 1, so it is a good model. • The canonical correlation is near 1, so a strong correlation is present. • Wilks’ Lambda is towards 0 (0.454), which means high significance. High significance means better discriminating power of the model. • P is < 0.05, so the discrimination between the groups is highly significant. • Going by the coefficients, Test1 has the highest power of discrimination, then Interview, then Test2. • Best predictors, going by the Structure Matrix: Test1, Interview, Test2.
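The quantities on this slide are tied together when there is only one discriminant function, as in a two-group problem. That single-function case is an assumption here, since the slides do not show the SPSS output. Under it, Wilks' Lambda = 1 / (1 + eigenvalue) and the squared canonical correlation = 1 − Wilks' Lambda, so the quoted value of 0.454 can be cross-checked:

```python
import math

wilks_lambda = 0.454  # value quoted on the Insights slide

# Assumption: a single discriminant function (two groups).
# Wilks' Lambda = 1 / (1 + eigenvalue)  =>  eigenvalue = (1 - Lambda) / Lambda
eigenvalue = (1 - wilks_lambda) / wilks_lambda

# Squared canonical correlation = eigenvalue / (1 + eigenvalue) = 1 - Lambda
canonical_correlation = math.sqrt(1 - wilks_lambda)

print(f"eigenvalue = {eigenvalue:.3f}")
print(f"canonical correlation = {canonical_correlation:.3f}")
```

Under that assumption the implied eigenvalue is above 1, which agrees with the first bullet of the slide.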
  • 48. Multidimensional Scaling • Multidimensional scaling is a visual representation of distances or dissimilarities between sets of objects. • Multidimensional Scaling (MDS) is a family of statistical methods that focus on creating mappings of items based on distance. • The input to multidimensional scaling is a distance matrix. The output is typically a two-dimensional scatterplot, where each of the objects is represented as a point. • MDS is more impactful because pictures are easier to interpret than numbers and tables.
  • 49. Applications - Multidimensional Scaling • To identify the image/position of a product in consumers’ minds. • To identify the number and nature of the dimensions consumers use to perceive a brand. • To understand market gaps so that a company can fit a new product into the market. • Also called Perceptual Mapping, it maps consumers' perceptions of the product, which a marketer always needs. • Market segmentation • Assessing advertising effectiveness • Pricing analysis • Channel decisions
  • 50. Terms Associated - Multidimensional Scaling • Stress: A lack-of-fit measure; higher values of stress indicate a poorer fit. It must be < 0.02 for a great fit. • RSQ (Squared Correlation): It must be > 0.7 for a great fit.
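Both fit measures compare the distances between points on the map with the input dissimilarities (after any transformation, often called disparities). The sketch below assumes Kruskal's Stress formula 1 and uses invented numbers; which stress variant a given SPSS procedure reports is not stated in the slides, and `pearson` is a hypothetical helper name.

```python
import math

# Hypothetical values (assumption) for four pairs of brands.
distances = [1.0, 2.0, 3.0, 4.0]     # distances between points on the MDS map
disparities = [1.1, 1.9, 3.2, 3.8]   # (transformed) input dissimilarities

# Kruskal's Stress-1: square root of the normalized sum of squared misfits.
stress = math.sqrt(
    sum((d - h) ** 2 for d, h in zip(distances, disparities))
    / sum(d ** 2 for d in distances)
)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# RSQ: squared correlation between the disparities and the map distances.
rsq = pearson(distances, disparities) ** 2

print(f"stress = {stress:.4f}, RSQ = {rsq:.4f}")
```

Lower stress and higher RSQ both point to a map that reproduces the input data more faithfully.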
  • 51. Steps to Conduct MDS • Formulate the Problem: Specify the purpose of the analysis and the number of brands (8 to 25) to be included in the analysis. • Obtain the Input Data: Refer to the next slide. • Select the MDS Procedure: Perceptions, to create the spatial map; Preferences, to decide the dimensions. • Decide on the Number of Dimensions: Not more than 3, else it becomes complicated. • Label the Dimensions and Interpret the Configuration: Will do practically. • Assess Reliability & Validity: Will do practically.
  • 52. Obtain The Input Data • MDS input data is of two kinds: Perceptions and Preferences. • Perceptions are either Direct (Similarity Judgements) or Derived (Attribute Ratings).
  • 53. Example - Multidimensional Scaling • 5 Brands of Mobile Phones: Vivo, Samsung, Mi, Oppo, Huawei • 2 Dimensions: Economic, Features (Look wise) • Scale: 0 to 10 (0 = Dissimilar, 10 = Similar)
  • 54. Conjoint Analysis • Conjoint analysis is a statistical technique that firms use in market research to understand how customers value the different components or features of their products or services, and so what consumers value most. • It is a survey-based statistical analysis method. • For example, a TV manufacturer would want to know whether customers value picture quality or sound quality more, or whether they value a low price more than picture quality.
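One simple way to see how conjoint analysis turns survey ratings into "what customers value" is to compute part-worth utilities as deviations of each level's mean rating from the grand mean. The sketch below assumes a balanced full-factorial design, where this matches main-effects estimates. The TV profiles and ratings are invented, and this is an illustration of the idea rather than the SPSS Conjoint procedure.

```python
# Hypothetical survey (assumption): one respondent rates four TV profiles from 1 to 10.
profiles = [
    ({"picture": "HD", "price": "low"}, 6.0),
    ({"picture": "HD", "price": "high"}, 3.0),
    ({"picture": "4K", "price": "low"}, 9.0),
    ({"picture": "4K", "price": "high"}, 7.0),
]

grand_mean = sum(rating for _, rating in profiles) / len(profiles)

# Part-worth of a level = mean rating of profiles with that level minus the grand mean.
part_worths = {}
for attribute in ("picture", "price"):
    levels = sorted({profile[attribute] for profile, _ in profiles})
    part_worths[attribute] = {}
    for level in levels:
        ratings = [r for profile, r in profiles if profile[attribute] == level]
        part_worths[attribute][level] = sum(ratings) / len(ratings) - grand_mean

# Relative importance of an attribute = its part-worth range / sum of all ranges.
ranges = {a: max(w.values()) - min(w.values()) for a, w in part_worths.items()}
total_range = sum(ranges.values())
importance = {a: r / total_range for a, r in ranges.items()}

for attribute, worths in part_worths.items():
    print(attribute, worths, f"importance = {importance[attribute]:.1%}")
```

In this made-up example picture quality carries more weight than price, which is the kind of trade-off the TV manufacturer on the slide wants to measure.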
  • 55. Conjoint Analysis – Use Cases • Buyer decisions • Customer preferences • Market sales • New product pricing • Selection of the best service or product feature
  • 56. Conjoint Analysis – Let’s Do It Practically • Juices Example
