SlideShare a Scribd company logo
1 of 17
Download to read offline
MODELING ADOPTIONS AND THE 
STAGES OF THE DIFFUSION OF 
INNOVATIONS 
Nicola 
Barbieri 
Francesco 
Bonchi 
Yahoo 
Labs 
Barcelona, 
Spain 
{barbieri,bonchi}@yahoo-­‐inc.com 
Yasir 
Mehmood 
Pompeu 
Fabra 
University 
yasir@yahoo-­‐inc.com
Background 
• The spread of new ideas in a society is a complex process 
that starting from a small fraction of the population, propagates 
over time through a diverse set of communication channels, 
potentially reaching a critical mass. 
• Rogers’ seminal work provides a unified tool for modeling 
diffusion processes. 
• His theory identifies five categories of people by considering 
the adoption time of each person with respect to the rest of the 
population.
Our contribution 
• Understanding the dynamics of such complex process is has 
potential implications in sociology, economics and marketing. 
• We study the data mining problem of modeling adoptions and 
the stages of the diffusion of an innovation: 
① Real-world items exhibit consistent differences in the way 
they diffuse; 
② The diffusion of different items may interest different 
segments of the market and the diffusion of a item can 
achieve different level of success; 
③ Different items may exhibit different temporal patterns of 
diffusion.
MASD: a stochastic framework for modeling 
adoptions and the stages of diffusions 
• The process of diffusion is decomposed in a finite and 
ordered sequence of stages of adoptions; 
• Early stages correspond to the introduction in the market of a item, 
while latter ones correspond to the maturity phase of its life cycle. 
§ In continuity with Rogers’ theory, users have different 
likelihood of being involved in each stage. 
§ Each stage is characterized by a rate which describes the 
relative speed of adoptions.
MASD: modeling procedure (1) 
• The density function for observing a specific adoption given 
the previous ones and the current stage is: 
• The density function for the n-th adoption depends on the 
given the current stage and it is given by 
• the probability of observing the activation of the adopter ui,n 
• this adoption to occur at time ti,n 
• We assume that each stage sj, the temporal gaps between 
consecutive adoptions are explained by an exponential 
density function with rate λj : 
where
MASD: modeling procedure (2) 
• Key idea: we consider a 1-to-1 association between the 
stages of adoptions, and the states of a Markov model. 
• To enforce a clear sequentiality in the evolution of the 
stages of adoption, we introduce the following constraints: 
• These structural constraints can be accommodated in a 
left-to-right hidden Markov model. 
A 4-states left-right HMM and its transition matrix
MASD: Generative model & learning 
To better explain diffusion 
mechanisms (e.g. localized trends), 
we devise a learning framework that 
alternates two phases: 
① Cluster the diffusion traces in different 
groups; 
② For each group fit the parameter of the 
MASD model by applying the 
Expectation Maximization algorithm. 
• The number of states for each cluster is automatically 
detected by relying on the Bayesian Information Criterion.
MASD: Learning 
• We employ a simple instance of iterative prototype-based 
clustering. 
• Given a cluster Ch (set of traces) and a set of candidate 
models with different degree of complexity, we select Θ∗ 
h as:
Evaluation 
• Assess the accuracy, convergence and stability of the learning 
framework on synthetic data. 
• We generate synthetic data with planted clustering structure in two 
steps: 
① Construct a set MASD models to generate clusters of data; 
② Sample lengths of adoption traces, pick a MASD model at random, and 
generate adoptions using the selected model. 
• Detecting and characterizing different patterns of adoption on 
real data (Movielens & Yahoo Meme). 
• Predictive tasks. Given a considered time window: 
• Which users are more likely to adopt the item? 
• How many users will adopt the item?
Evaluation on synthetic data. 
(top) Accuracy in the “clustering reconstruction” task on synthetic data with planted clusters, measured in terms of Rand index 
(bottom) Convergence rate of the clustering/learning process, measured as the percentage of swaps observed at each iteration
Evaluation on real-world data (1) 
Adoption patterns, in the 5 clusters, MovieLens (top) and Yahoo Meme (bottom) 
MASD model on the Yahoo MEME dataset. The thickness of the arc indicates the 
strength of transition probability between stages. Numbers inside each state 
represent: (i) the index of the adoption stage, (ii) the avg percentage of adoptions 
observed in the considered stage; (iii) the percentage of adoption traces that 
involve the stage.
Evaluation on real-world data (2) 
Each user is generally tied to few stages. 
Stages exhibit different patterns 
considering adoption rates. 
Different diffusion rates (log-scale) in the five clusters in MovieLens (top) and Yahoo Meme (bottom)
Experiments: real-world datasets (3) 
Top-5 traces that minimize the perplexity for the considered model: 
Adoption patterns, in the 5 clusters, MovieLens (top) and Yahoo Meme (bottom)
Predictive tasks: Prediction protocol 
• Learn parameters over the diffusion traces in the training-set. 
• For each trace in the test-set we evaluate prediction accuracy 
by varying the length of the observed data used for fitting. 
• Each partial-observed trace in the test-set is associated to 
the model M that maximizes it log likelihood. 
Fitting Evaluation 
• We find the the current diffusion state by applying Viterbi. 
• From the current state, we generate multiple samples 
according to the generative process. 
• The variable of interest (adoption of a user/overall size) is 
estimated on the simulated traces (1k).
Predictive tasks: Results 
Area under the curve (AUC) for predicting single user activations: 
60% partial observation 50% partial observation 40% partial observation 
Time Window MASD k-NN (60, 80, 100) MASD k-NN (60, 80, 100) MASD k-NN (60, 80, 100) 
30 days 0.70 0.54 0.55 0.55 0.69 0.55 0.55 0.55 0.69 0.54 0.54 0.55 
21 days 0.69 0.55 0.55 0.55 0.69 0.54 0.54 0.55 0.68 0.54 0.54 0.54 
14 days 0.69 0.54 0.54 0.54 0.69 0.54 0.54 0.55 0.69 0.53 0.54 0.54 
(a) Movielens 
60% partial observation 50% partial observation 40% partial observation 
Time Window MASD k-NN (60, 80, 100) MASD k-NN (60, 80, 100) MASD k-NN (60, 80, 100) 
60 min. 0.83 0.73 0.74 0.75 0.82 0.72 0.73 0.74 0.82 0.72 0.74 0.74 
30 min. 0.82 0.72 0.73 0.74 0.81 0.71 0.72 0.72 0.81 0.71 0.72 0.73 
15 min. 0.81 0.68 0.70 0.70 0.80 0.69 0.70 0.71 0.81 0.66 0.68 0.69 
(b) Yahoo Meme 
TABLE VI: Area under the curve (AUC) for predicting single user activations in different time windows. The baseline procedure is evaluated 
for three selections of k and three different splits of propagations. 
Mean absolute error (MAE) for predicting final size of the diffusion trace: 
60% partial observation 50% partial observation 40% partial observation 
Time Window MASD k-NN (60, 80, 100) MASD k-NN (60, 80, 100) MASD k-NN (60, 80, 100) 
30 days 3.42 3.90 3.90 3.92 3.71 3.90 3.91 3.92 4.61 5.14 5.17 5.17 
21 days 2.61 3.02 3.02 3.03 2.88 3.02 3.03 3.03 3.61 3.96 3.97 3.98 
14 days 1.93 2.22 2.23 2.23 2.16 2.23 2.23 2.23 2.69 2.90 2.91 2.91 
(a) Movielens 
60% partial observation 50% partial observation 40% partial observation 
Time Window MASD k-NN (60, 80, 100) MASD k-NN (60, 80, 100) MASD k-NN (60, 80, 100) 
60 min. 3.57 5.56 5.61 5.63 5.32 7.20 7.24 7.25 7.46 9.01 9.08 9.09 
30 min. 3.01 4.65 4.69 4.71 4.66 6.13 6.17 6.15 6.69 7.82 7.89 7.93 
15 min. 2.49 3.23 3.25 3.26 4.53 5.89 5.92 5.93 5.50 5.63 5.66 5.68 
(b) Yahoo Meme
Conclusion and future works 
• We introduce MASD, a stochastic framework for modeling 
users’ adoptions and the different stages of diffusion of of 
innovations. 
• Our model focuses on the two main dimensions, users and rate 
of adoption. Learning is accomplished by fitting a left-to-right 
hidden Markov model. 
• The experimental evaluation over real-world data confirms the 
accuracy in detecting interesting patterns of adoption and in 
prediction scenarios. 
• Future works: Account for social influence dynamics and stages 
of virality.
Thanks!

More Related Content

Similar to Modeling adoptions and the stages of the diffusion of innovations

Modeling & Simulation Lecture Notes
Modeling & Simulation Lecture NotesModeling & Simulation Lecture Notes
Modeling & Simulation Lecture NotesFellowBuddy.com
 
Crowd Density Estimation Using Base Line Filtering
Crowd Density Estimation Using Base Line FilteringCrowd Density Estimation Using Base Line Filtering
Crowd Density Estimation Using Base Line Filteringpaperpublications3
 
Md simulation and stochastic simulation
Md simulation and stochastic simulationMd simulation and stochastic simulation
Md simulation and stochastic simulationAbdulAhad358
 
Alex Smola, Director of Machine Learning, AWS/Amazon, at MLconf SF 2016
Alex Smola, Director of Machine Learning, AWS/Amazon, at MLconf SF 2016Alex Smola, Director of Machine Learning, AWS/Amazon, at MLconf SF 2016
Alex Smola, Director of Machine Learning, AWS/Amazon, at MLconf SF 2016MLconf
 
Matrioska tracking keypoints in real-time
Matrioska tracking keypoints in real-timeMatrioska tracking keypoints in real-time
Matrioska tracking keypoints in real-timepowerUserHallo
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentIJERD Editor
 
IRJET- Criminal Recognization in CCTV Surveillance Video
IRJET-  	  Criminal Recognization in CCTV Surveillance VideoIRJET-  	  Criminal Recognization in CCTV Surveillance Video
IRJET- Criminal Recognization in CCTV Surveillance VideoIRJET Journal
 
A Survey on: Hyper Spectral Image Segmentation and Classification Using FODPSO
A Survey on: Hyper Spectral Image Segmentation and Classification Using FODPSOA Survey on: Hyper Spectral Image Segmentation and Classification Using FODPSO
A Survey on: Hyper Spectral Image Segmentation and Classification Using FODPSOrahulmonikasharma
 
Using AI Planning to Automate the Performance Analysis of Simulators
Using AI Planning to Automate the Performance Analysis of SimulatorsUsing AI Planning to Automate the Performance Analysis of Simulators
Using AI Planning to Automate the Performance Analysis of SimulatorsRoland Ewald
 
Sequential Action Patterns in Collaborative Ontology Engineering Projects: A ...
Sequential Action Patterns in Collaborative Ontology Engineering Projects: A ...Sequential Action Patterns in Collaborative Ontology Engineering Projects: A ...
Sequential Action Patterns in Collaborative Ontology Engineering Projects: A ...Philipp Singer
 
|QAB> : Quantum Computing, AI and Blockchain
|QAB> : Quantum Computing, AI and Blockchain|QAB> : Quantum Computing, AI and Blockchain
|QAB> : Quantum Computing, AI and BlockchainKan Yuenyong
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentIJERD Editor
 
MediaEval 2015 - UNED-UV @ Retrieving Diverse Social Images Task - Poster
MediaEval 2015 - UNED-UV @ Retrieving Diverse Social Images Task - PosterMediaEval 2015 - UNED-UV @ Retrieving Diverse Social Images Task - Poster
MediaEval 2015 - UNED-UV @ Retrieving Diverse Social Images Task - Postermultimediaeval
 

Similar to Modeling adoptions and the stages of the diffusion of innovations (20)

Modeling & Simulation Lecture Notes
Modeling & Simulation Lecture NotesModeling & Simulation Lecture Notes
Modeling & Simulation Lecture Notes
 
Crowd Density Estimation Using Base Line Filtering
Crowd Density Estimation Using Base Line FilteringCrowd Density Estimation Using Base Line Filtering
Crowd Density Estimation Using Base Line Filtering
 
Simulation
SimulationSimulation
Simulation
 
Agener~1
Agener~1Agener~1
Agener~1
 
Md simulation and stochastic simulation
Md simulation and stochastic simulationMd simulation and stochastic simulation
Md simulation and stochastic simulation
 
Alex Smola, Director of Machine Learning, AWS/Amazon, at MLconf SF 2016
Alex Smola, Director of Machine Learning, AWS/Amazon, at MLconf SF 2016Alex Smola, Director of Machine Learning, AWS/Amazon, at MLconf SF 2016
Alex Smola, Director of Machine Learning, AWS/Amazon, at MLconf SF 2016
 
Matrioska tracking keypoints in real-time
Matrioska tracking keypoints in real-timeMatrioska tracking keypoints in real-time
Matrioska tracking keypoints in real-time
 
F1083644
F1083644F1083644
F1083644
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
 
C013141723
C013141723C013141723
C013141723
 
Rabiner
RabinerRabiner
Rabiner
 
IRJET- Criminal Recognization in CCTV Surveillance Video
IRJET-  	  Criminal Recognization in CCTV Surveillance VideoIRJET-  	  Criminal Recognization in CCTV Surveillance Video
IRJET- Criminal Recognization in CCTV Surveillance Video
 
A Survey on: Hyper Spectral Image Segmentation and Classification Using FODPSO
A Survey on: Hyper Spectral Image Segmentation and Classification Using FODPSOA Survey on: Hyper Spectral Image Segmentation and Classification Using FODPSO
A Survey on: Hyper Spectral Image Segmentation and Classification Using FODPSO
 
Using AI Planning to Automate the Performance Analysis of Simulators
Using AI Planning to Automate the Performance Analysis of SimulatorsUsing AI Planning to Automate the Performance Analysis of Simulators
Using AI Planning to Automate the Performance Analysis of Simulators
 
FUTURE TRENDS OF SEISMIC ANALYSIS
FUTURE TRENDS OF SEISMIC ANALYSISFUTURE TRENDS OF SEISMIC ANALYSIS
FUTURE TRENDS OF SEISMIC ANALYSIS
 
Sequential Action Patterns in Collaborative Ontology Engineering Projects: A ...
Sequential Action Patterns in Collaborative Ontology Engineering Projects: A ...Sequential Action Patterns in Collaborative Ontology Engineering Projects: A ...
Sequential Action Patterns in Collaborative Ontology Engineering Projects: A ...
 
|QAB> : Quantum Computing, AI and Blockchain
|QAB> : Quantum Computing, AI and Blockchain|QAB> : Quantum Computing, AI and Blockchain
|QAB> : Quantum Computing, AI and Blockchain
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
 
MediaEval 2015 - UNED-UV @ Retrieving Diverse Social Images Task - Poster
MediaEval 2015 - UNED-UV @ Retrieving Diverse Social Images Task - PosterMediaEval 2015 - UNED-UV @ Retrieving Diverse Social Images Task - Poster
MediaEval 2015 - UNED-UV @ Retrieving Diverse Social Images Task - Poster
 
Declarative data analysis
Declarative data analysisDeclarative data analysis
Declarative data analysis
 

Recently uploaded

6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...Dr Arash Najmaei ( Phd., MBA, BSc)
 
Role of Consumer Insights in business transformation
Role of Consumer Insights in business transformationRole of Consumer Insights in business transformation
Role of Consumer Insights in business transformationAnnie Melnic
 
Predictive Analysis - Using Insight-informed Data to Plan Inventory in Next 6...
Predictive Analysis - Using Insight-informed Data to Plan Inventory in Next 6...Predictive Analysis - Using Insight-informed Data to Plan Inventory in Next 6...
Predictive Analysis - Using Insight-informed Data to Plan Inventory in Next 6...ThinkInnovation
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...Jack Cole
 
IBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaIBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaManalVerma4
 
DATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etcDATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etclalithasri22
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Boston Institute of Analytics
 
Presentation of project of business person who are success
Presentation of project of business person who are successPresentation of project of business person who are success
Presentation of project of business person who are successPratikSingh115843
 
Digital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfDigital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfNicoChristianSunaryo
 
Statistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdfStatistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdfnikeshsingh56
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBoston Institute of Analytics
 
Decision Making Under Uncertainty - Is It Better Off Joining a Partnership or...
Decision Making Under Uncertainty - Is It Better Off Joining a Partnership or...Decision Making Under Uncertainty - Is It Better Off Joining a Partnership or...
Decision Making Under Uncertainty - Is It Better Off Joining a Partnership or...ThinkInnovation
 
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelDecoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelBoston Institute of Analytics
 

Recently uploaded (16)

6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
 
Role of Consumer Insights in business transformation
Role of Consumer Insights in business transformationRole of Consumer Insights in business transformation
Role of Consumer Insights in business transformation
 
Predictive Analysis - Using Insight-informed Data to Plan Inventory in Next 6...
Predictive Analysis - Using Insight-informed Data to Plan Inventory in Next 6...Predictive Analysis - Using Insight-informed Data to Plan Inventory in Next 6...
Predictive Analysis - Using Insight-informed Data to Plan Inventory in Next 6...
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
 
IBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaIBEF report on the Insurance market in India
IBEF report on the Insurance market in India
 
DATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etcDATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etc
 
Data Analysis Project: Stroke Prediction
Data Analysis Project: Stroke PredictionData Analysis Project: Stroke Prediction
Data Analysis Project: Stroke Prediction
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
 
Presentation of project of business person who are success
Presentation of project of business person who are successPresentation of project of business person who are success
Presentation of project of business person who are success
 
Digital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfDigital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdf
 
Statistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdfStatistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdf
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
 
2023 Survey Shows Dip in High School E-Cigarette Use
2023 Survey Shows Dip in High School E-Cigarette Use2023 Survey Shows Dip in High School E-Cigarette Use
2023 Survey Shows Dip in High School E-Cigarette Use
 
Decision Making Under Uncertainty - Is It Better Off Joining a Partnership or...
Decision Making Under Uncertainty - Is It Better Off Joining a Partnership or...Decision Making Under Uncertainty - Is It Better Off Joining a Partnership or...
Decision Making Under Uncertainty - Is It Better Off Joining a Partnership or...
 
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelDecoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
 
Insurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis ProjectInsurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis Project
 

Modeling adoptions and the stages of the diffusion of innovations

  • 1. MODELING ADOPTIONS AND THE STAGES OF THE DIFFUSION OF INNOVATIONS Nicola Barbieri Francesco Bonchi Yahoo Labs Barcelona, Spain {barbieri,bonchi}@yahoo-­‐inc.com Yasir Mehmood Pompeu Fabra University yasir@yahoo-­‐inc.com
  • 2. Background • The spread of new ideas in a society is a complex process that starting from a small fraction of the population, propagates over time through a diverse set of communication channels, potentially reaching a critical mass. • Rogers’ seminal work provides a unified tool for modeling diffusion processes. • His theory identifies five categories of people by considering the adoption time of each person with respect to the rest of the population.
  • 3. Our contribution • Understanding the dynamics of such complex process is has potential implications in sociology, economics and marketing. • We study the data mining problem of modeling adoptions and the stages of the diffusion of an innovation: ① Real-world items exhibit consistent differences in the way they diffuse; ② The diffusion of different items may interest different segments of the market and the diffusion of a item can achieve different level of success; ③ Different items may exhibit different temporal patterns of diffusion.
  • 4. MASD: a stochastic framework for modeling adoptions and the stages of diffusions • The process of diffusion is decomposed in a finite and ordered sequence of stages of adoptions; • Early stages correspond to the introduction in the market of a item, while latter ones correspond to the maturity phase of its life cycle. § In continuity with Rogers’ theory, users have different likelihood of being involved in each stage. § Each stage is characterized by a rate which describes the relative speed of adoptions.
  • 5. MASD: modeling procedure (1) • The density function for observing a specific adoption given the previous ones and the current stage is: • The density function for the n-th adoption depends on the given the current stage and it is given by • the probability of observing the activation of the adopter ui,n • this adoption to occur at time ti,n • We assume that each stage sj, the temporal gaps between consecutive adoptions are explained by an exponential density function with rate λj : where
  • 6. MASD: modeling procedure (2) • Key idea: we consider a 1-to-1 association between the stages of adoptions, and the states of a Markov model. • To enforce a clear sequentiality in the evolution of the stages of adoption, we introduce the following constraints: • These structural constraints can be accommodated in a left-to-right hidden Markov model. A 4-states left-right HMM and its transition matrix
  • 7. MASD: Generative model & learning To better explain diffusion mechanisms (e.g. localized trends), we devise a learning framework that alternates two phases: ① Cluster the diffusion traces in different groups; ② For each group fit the parameter of the MASD model by applying the Expectation Maximization algorithm. • The number of states for each cluster is automatically detected by relying on the Bayesian Information Criterion.
  • 8. MASD: Learning • We employ a simple instance of iterative prototype-based clustering. • Given a cluster Ch (set of traces) and a set of candidate models with different degree of complexity, we select Θ∗ h as:
  • 9. Evaluation • Assess the accuracy, convergence and stability of the learning framework on synthetic data. • We generate synthetic data with planted clustering structure in two steps: ① Construct a set MASD models to generate clusters of data; ② Sample lengths of adoption traces, pick a MASD model at random, and generate adoptions using the selected model. • Detecting and characterizing different patterns of adoption on real data (Movielens & Yahoo Meme). • Predictive tasks. Given a considered time window: • Which users are more likely to adopt the item? • How many users will adopt the item?
  • 10. Evaluation on synthetic data. (top) Accuracy in the “clustering reconstruction” task on synthetic data with planted clusters, measured in terms of Rand index (bottom) Convergence rate of the clustering/learning process, measured as the percentage of swaps observed at each iteration
  • 11. Evaluation on real-world data (1) Adoption patterns, in the 5 clusters, MovieLens (top) and Yahoo Meme (bottom) MASD model on the Yahoo MEME dataset. The thickness of the arc indicates the strength of transition probability between stages. Numbers inside each state represent: (i) the index of the adoption stage, (ii) the avg percentage of adoptions observed in the considered stage; (iii) the percentage of adoption traces that involve the stage.
  • 12. Evaluation on real-world data (2) Each user is generally tied to few stages. Stages exhibit different patterns considering adoption rates. Different diffusion rates (log-scale) in the five clusters in MovieLens (top) and Yahoo Meme (bottom)
  • 13. Experiments: real-world datasets (3) Top-5 traces that minimize the perplexity for the considered model: Adoption patterns, in the 5 clusters, MovieLens (top) and Yahoo Meme (bottom)
  • 14. Predictive tasks: Prediction protocol • Learn parameters over the diffusion traces in the training-set. • For each trace in the test-set we evaluate prediction accuracy by varying the length of the observed data used for fitting. • Each partial-observed trace in the test-set is associated to the model M that maximizes it log likelihood. Fitting Evaluation • We find the the current diffusion state by applying Viterbi. • From the current state, we generate multiple samples according to the generative process. • The variable of interest (adoption of a user/overall size) is estimated on the simulated traces (1k).
  • 15. Predictive tasks: Results Area under the curve (AUC) for predicting single user activations: 60% partial observation 50% partial observation 40% partial observation Time Window MASD k-NN (60, 80, 100) MASD k-NN (60, 80, 100) MASD k-NN (60, 80, 100) 30 days 0.70 0.54 0.55 0.55 0.69 0.55 0.55 0.55 0.69 0.54 0.54 0.55 21 days 0.69 0.55 0.55 0.55 0.69 0.54 0.54 0.55 0.68 0.54 0.54 0.54 14 days 0.69 0.54 0.54 0.54 0.69 0.54 0.54 0.55 0.69 0.53 0.54 0.54 (a) Movielens 60% partial observation 50% partial observation 40% partial observation Time Window MASD k-NN (60, 80, 100) MASD k-NN (60, 80, 100) MASD k-NN (60, 80, 100) 60 min. 0.83 0.73 0.74 0.75 0.82 0.72 0.73 0.74 0.82 0.72 0.74 0.74 30 min. 0.82 0.72 0.73 0.74 0.81 0.71 0.72 0.72 0.81 0.71 0.72 0.73 15 min. 0.81 0.68 0.70 0.70 0.80 0.69 0.70 0.71 0.81 0.66 0.68 0.69 (b) Yahoo Meme TABLE VI: Area under the curve (AUC) for predicting single user activations in different time windows. The baseline procedure is evaluated for three selections of k and three different splits of propagations. Mean absolute error (MAE) for predicting final size of the diffusion trace: 60% partial observation 50% partial observation 40% partial observation Time Window MASD k-NN (60, 80, 100) MASD k-NN (60, 80, 100) MASD k-NN (60, 80, 100) 30 days 3.42 3.90 3.90 3.92 3.71 3.90 3.91 3.92 4.61 5.14 5.17 5.17 21 days 2.61 3.02 3.02 3.03 2.88 3.02 3.03 3.03 3.61 3.96 3.97 3.98 14 days 1.93 2.22 2.23 2.23 2.16 2.23 2.23 2.23 2.69 2.90 2.91 2.91 (a) Movielens 60% partial observation 50% partial observation 40% partial observation Time Window MASD k-NN (60, 80, 100) MASD k-NN (60, 80, 100) MASD k-NN (60, 80, 100) 60 min. 3.57 5.56 5.61 5.63 5.32 7.20 7.24 7.25 7.46 9.01 9.08 9.09 30 min. 3.01 4.65 4.69 4.71 4.66 6.13 6.17 6.15 6.69 7.82 7.89 7.93 15 min. 2.49 3.23 3.25 3.26 4.53 5.89 5.92 5.93 5.50 5.63 5.66 5.68 (b) Yahoo Meme
  • 16. Conclusion and future works • We introduce MASD, a stochastic framework for modeling users’ adoptions and the different stages of diffusion of of innovations. • Our model focuses on the two main dimensions, users and rate of adoption. Learning is accomplished by fitting a left-to-right hidden Markov model. • The experimental evaluation over real-world data confirms the accuracy in detecting interesting patterns of adoption and in prediction scenarios. • Future works: Account for social influence dynamics and stages of virality.