SlideShare a Scribd company logo
1 of 19
Download to read offline
Session 6: Administrative and Alternative Data Sources
Big Data and Macroeconomic Nowcasting:
From Data Access to Modelling
15th Conference of International Association for Official Statistics
Abu Dhabi, 6–8 December 2016
Dario Buono*, European Commission, Eurostat, dario.buono@ec.europa.eu
Stephan Krische*, GOPA Consultants, stephan.krische@gopa.de
Massimiliano Marcellino, Bocconi University, massimiliano.marcellino@unibocconi.it
George Kapetanios, King’s College, george.kapetanios@kcl.ac.uk
Gian Luigi Mazzi, European Commission, Eurostat, gianluigi.mazzi@ec.europa.eu
Fotis Papailias, Queen's University Management School, f.papailias@qub.ac.uk
*The views expressed are the author’s alone and do not necessarily correspond to those of the corresponding organisations of affiliation
Eurostat, the Statistical Office of EU
• About 700 people with 28 different nationalities
• Statistical Office of European Union, part of EC
• Core business:
• Euro-zone (19) & EU (28) aggregates
• harmonization, best practices, guidelines, trainings &
international cooperation
• Methodology team: Time Series, Econometrics, SDC,
Research & EA
Why interested in Big Data for nowcasting?
• Big Data are complementary information to standard
data, being based on different information sets
• More granular perspective on the indicator of interest,
both in the temporal and cross-sectional dimensions
• It is timely available, generally not subject to revisions
European research project: Apr 15 to Jul 16
Research questions and findings
Can Big Data help for Macroeconomic Nowcasting?
What are the potential Big Data sources?
1. Literature review
2. Models/methods to be used for Big data
3. Recommendations on how to handle Big Data
4. Case study: IPI, Inflation, unemployment of some EU
countries
Big Data types & dimensionality
• When the dimensionality increases, the volume of the space
increases so fast that the available data become sparse.
• For statistically significant result, the amount of data needed
often grows exponentially with the dimensionality.
• Use of a typology based on Doornik and Hendry (2015):
• Tall data: many observation, few variables
• Fat data: many variables, few observations
• Huge data: many variables, many observations
Eurostat
Models race
• Dynamic Factor Analysis
• Partial Least Squares
• Bayesian Regression
• LASSO regression
• U-Midas models
• Model averaging
255 models tested, macro-financial & google trend data
Eurostat
Statistical Methods: findings
• Sparse regression (LASSO) works for fat, huge data
• Data reduction techniques (PLS) helpful for large variables
• (U)-MIDAS or bridge modelling for mixed frequency
• Dimensionality reduction improves nowcasting
• Forecast combination: Data-driven automated strategy with
model rotation based on forecasting performance in the past
works well
From Data Access to Modelling
Step-by-step approach, accompanied by specific
recommendations for the use of big data for
macroeconomic nowcasting, guiding to
• the identification and the choice of Big Data
• pre-treatment and econometric modelling
• the comparative evaluation of results to obtain a very useful
tool for decision about the use or not of Big Data
Step 1: Big Data usefulness within
a nowcasting exercise
Recommendations
1. Evaluate the quality of the existing nowcasts
and identify issue (bias or inefficiency or large
errors in specific periods), that can be fixed by
adding information in Big Data based indicators
2. Use of Big Data only when expecting to improve
the timeliness and/or the quality of nowcastings
3. Do not consider Big Data sources with spurious
correlations with the target variable
Step 2: Big Data search
Recommendations
1. Starting point for an assessment of the potential
benefits/costs of the use of Big Data for macroeconomic
nowcasting: identification of their source
• Social Networks (human-sourced information)
• Traditional Business Systems (process-mediated data)
• Internet of Things (machine-generated data)
2. Choice is heavily dependent on the target indicator of
the nowcasting exercise
Step 3: Assessment of big-data
accessibility and quality
Recommendations
1. Privilege data providers with guarantee of continuity and of the
availability of a good metadata associated to the Big Data
2. Privilege Big Data sources ensuring sufficient time and cross-
sectional coverage
3. If a bias is observed a bias correction can be included in the
nowcasting strategy.
4. To deal with possible instabilities of the relationships between the
Big Data and the target variables, nowcasting models should be re-
specified on a regular basis (e.g. yearly) and occasionally in the
presence of unexpected events.
Step 4: Big data preparation
Recommendations
1. Big data often unstructured: proper mapping
2. Pre-treatment to remove deterministic patterns
• Outliers, calendar effects, missing observations
• Seasonal and non-seasonal short-term movements should
be dealt accordingly to the characteristic of the target
variable
3. Create a specific IT environment where the original
data are collected and stored with associated routines
4. Ensure the availability of an exhaustive
documentation of the Big Data conversion process
Step 5: Big Data modelling strategy
Recommendations
1. Identification of appropriate econometric techniques
2. First dimension: choice between the use of methods suited for
large but not huge datasets, therefore applied to summaries of the
Big Data (Google Trends)
• nowcasting with large datasets can be based on factor models,
large BVARs, or shrinkage regressions
3. Huge datasets can be handled by sparse principal components,
linear models combined with heuristic optimization, or a variety of
machine learning methods such as LASSO & LARS regression
4. In case of mixed frequency data, methods such as UMIDAS and,
as a second best, Bridge, should be privileged.
Step 6: Results evaluation of Big Data
based nowcasting
Recommendations
1. Run a critical and comprehensive assessment of the
contribution of Big Data for nowcasting the indicator of interest
based, e.g., on standard criteria such as MSE or MAE.
2. In order to reduce the extent of data and model snooping, a cross-
validation approach should be followed:
• various models and indicators, with and without Big Data,
estimated over a first sample and selected and/or pooled
according to their performance
• then the performance of the preferred approaches re-evaluated
over a second sample
Case study
- Implementation of all these steps for nowcasting IP growth, inflation
and unemployment in several EU countries in a pseudo out of
sample context, using Google trends for specific and carefully
selected keywords for each country and variable
- Big Data specific features: transform unstructured into structured data,
time series decompositions, handling mixed frequency data
- Overall, the results are mixed but there are several cases where
Google trends, when combined with rather sophisticated econometric
techniques, yield forecasting gains, though generally small.
- Gains in term of timeliness or revisions have not been considered
Eurostat
Literature contribution
Eurostat Statistical Working Paper
"Big Data and Macroeconomic Nowcasting:
From data access to modelling"
 Methodological finding will be included in 2 chapter of the
Eurostat/UNECE Handbook on Rapid Estimates currently
under 2nd peer review, (forthcoming in 2017)
What's next? Big Data Econometrics
• 2017, a new project focusing on:
• Econometrics, Filtering issues, advanced Bayesian
estimation and forecasting methods
• Real time empirical evaluations (including a direct
comparison with Eurostat flash estimates),
• New ways and new metrics to present nowcasts
• Possible data timeliness/accuracy gains
• Big data handling tool developed as R package
• Scientific summary for Big Data Econometric strategy
Thank you for your attention!!
Some References:
- Eurostat, Big data and macroeconomic nowcasting, preliminary results presented at the
ESS methodological working group (7 April 2016, Luxembourg)
http://ec.europa.eu/eurostat/cros/content/item21bigdataandmacroeconomicnowcastingsl
ides_en
- Big data CROS portal, http://ec.europa.eu/eurostat/cros/content/big-data_en
- Marcellino, M. (2016), “Nowcasting with Big Data”, Keynote Speech at the 33rd CIRET
conference.
- Harford, T. (2014, April). Big data: Are we making a big mistake? Financial Times.
Available at http://www.ft.com/cms/s/2/21a6e7d8-b479-11e3-a09a-00144feabdc0.html
#ixzz2xcdlP1zZ
- Lazer, D., Kennedy, R., King, G., Vespignani, A. (2014). "The Parable of Google Flu:
Traps in Big Data Analysis", Science, 143, 1203-1205.
- Tibshirani, R. (1996). “Regression Shrinkage and Selection via the Lasso”, Journal of
the Royal Statistical Society B, 58, 267-288.

More Related Content

What's hot

hariri2019.pdf
hariri2019.pdfhariri2019.pdf
hariri2019.pdfAkuhuruf
 
Opportunities and methodological challenges of Big Data for official statist...
Opportunities and methodological challenges of  Big Data for official statist...Opportunities and methodological challenges of  Big Data for official statist...
Opportunities and methodological challenges of Big Data for official statist...Piet J.H. Daas
 
A forecasting of stock trading price using time series information based on b...
A forecasting of stock trading price using time series information based on b...A forecasting of stock trading price using time series information based on b...
A forecasting of stock trading price using time series information based on b...IJECEIAES
 
Sessione I - Big Data Li-Chun Zhang, Discussion: Test mining, machin learn...
Sessione I - Big Data    Li-Chun Zhang, Discussion: Test mining, machin learn...Sessione I - Big Data    Li-Chun Zhang, Discussion: Test mining, machin learn...
Sessione I - Big Data Li-Chun Zhang, Discussion: Test mining, machin learn...Istituto nazionale di statistica
 
BIG DATA IN SMART CITIES: A SYSTEMATIC MAPPING REVIEW
BIG DATA IN SMART CITIES: A SYSTEMATIC MAPPING REVIEWBIG DATA IN SMART CITIES: A SYSTEMATIC MAPPING REVIEW
BIG DATA IN SMART CITIES: A SYSTEMATIC MAPPING REVIEWsarfraznawaz
 
Arloesiadur: An analytics experiment in innovation policy
Arloesiadur: An analytics experiment in innovation policyArloesiadur: An analytics experiment in innovation policy
Arloesiadur: An analytics experiment in innovation policyJuan Mateos-Garcia
 
A statistical approach to big data, Gustav Haraldsen and Arild Langseth, Stat...
A statistical approach to big data, Gustav Haraldsen and Arild Langseth, Stat...A statistical approach to big data, Gustav Haraldsen and Arild Langseth, Stat...
A statistical approach to big data, Gustav Haraldsen and Arild Langseth, Stat...Tilastokeskus
 
University of Missouri-Columbia, Frequent Hierarchical Pattern (FHP) Tree
University of Missouri-Columbia, Frequent Hierarchical Pattern (FHP) TreeUniversity of Missouri-Columbia, Frequent Hierarchical Pattern (FHP) Tree
University of Missouri-Columbia, Frequent Hierarchical Pattern (FHP) Treekphodel
 
Ta4.04 mikkela.20170111 fin-data_advocacy_un2017_jakoon
Ta4.04 mikkela.20170111 fin-data_advocacy_un2017_jakoonTa4.04 mikkela.20170111 fin-data_advocacy_un2017_jakoon
Ta4.04 mikkela.20170111 fin-data_advocacy_un2017_jakoonStatistics South Africa
 
Query reverse engineering in the context of the semantic web
Query reverse engineering in the context of the semantic webQuery reverse engineering in the context of the semantic web
Query reverse engineering in the context of the semantic webLeandro Tabares-Martin
 

What's hot (20)

Apply (Big) Data Analytics & Predictive Analytics to Business Application
Apply (Big) Data Analytics & Predictive Analytics to Business ApplicationApply (Big) Data Analytics & Predictive Analytics to Business Application
Apply (Big) Data Analytics & Predictive Analytics to Business Application
 
hariri2019.pdf
hariri2019.pdfhariri2019.pdf
hariri2019.pdf
 
International Year of Statistics | 2013
International Year of Statistics | 2013International Year of Statistics | 2013
International Year of Statistics | 2013
 
Data mining intro-2009-v2
Data mining intro-2009-v2Data mining intro-2009-v2
Data mining intro-2009-v2
 
Opportunities and methodological challenges of Big Data for official statist...
Opportunities and methodological challenges of  Big Data for official statist...Opportunities and methodological challenges of  Big Data for official statist...
Opportunities and methodological challenges of Big Data for official statist...
 
A forecasting of stock trading price using time series information based on b...
A forecasting of stock trading price using time series information based on b...A forecasting of stock trading price using time series information based on b...
A forecasting of stock trading price using time series information based on b...
 
Sessione I - Big Data Li-Chun Zhang, Discussion: Test mining, machin learn...
Sessione I - Big Data    Li-Chun Zhang, Discussion: Test mining, machin learn...Sessione I - Big Data    Li-Chun Zhang, Discussion: Test mining, machin learn...
Sessione I - Big Data Li-Chun Zhang, Discussion: Test mining, machin learn...
 
Predictive analytics
Predictive analyticsPredictive analytics
Predictive analytics
 
BIG DATA IN SMART CITIES: A SYSTEMATIC MAPPING REVIEW
BIG DATA IN SMART CITIES: A SYSTEMATIC MAPPING REVIEWBIG DATA IN SMART CITIES: A SYSTEMATIC MAPPING REVIEW
BIG DATA IN SMART CITIES: A SYSTEMATIC MAPPING REVIEW
 
What is statistics
What is statisticsWhat is statistics
What is statistics
 
Arloesiadur: An analytics experiment in innovation policy
Arloesiadur: An analytics experiment in innovation policyArloesiadur: An analytics experiment in innovation policy
Arloesiadur: An analytics experiment in innovation policy
 
NIH BD2K DataMed model, DATS
NIH BD2K DataMed model, DATSNIH BD2K DataMed model, DATS
NIH BD2K DataMed model, DATS
 
A statistical approach to big data, Gustav Haraldsen and Arild Langseth, Stat...
A statistical approach to big data, Gustav Haraldsen and Arild Langseth, Stat...A statistical approach to big data, Gustav Haraldsen and Arild Langseth, Stat...
A statistical approach to big data, Gustav Haraldsen and Arild Langseth, Stat...
 
University of Missouri-Columbia, Frequent Hierarchical Pattern (FHP) Tree
University of Missouri-Columbia, Frequent Hierarchical Pattern (FHP) TreeUniversity of Missouri-Columbia, Frequent Hierarchical Pattern (FHP) Tree
University of Missouri-Columbia, Frequent Hierarchical Pattern (FHP) Tree
 
Ta4.04 mikkela.20170111 fin-data_advocacy_un2017_jakoon
Ta4.04 mikkela.20170111 fin-data_advocacy_un2017_jakoonTa4.04 mikkela.20170111 fin-data_advocacy_un2017_jakoon
Ta4.04 mikkela.20170111 fin-data_advocacy_un2017_jakoon
 
Query reverse engineering in the context of the semantic web
Query reverse engineering in the context of the semantic webQuery reverse engineering in the context of the semantic web
Query reverse engineering in the context of the semantic web
 
Development of a Collaborator Recommender System Based on Directed Graph Model
Development of a Collaborator Recommender System Based on Directed Graph ModelDevelopment of a Collaborator Recommender System Based on Directed Graph Model
Development of a Collaborator Recommender System Based on Directed Graph Model
 
Randy Goebel for the KIEF 2018. FROM DATA TO ECONOMIC VALUE
Randy Goebel for the KIEF 2018. FROM DATA TO ECONOMIC VALUERandy Goebel for the KIEF 2018. FROM DATA TO ECONOMIC VALUE
Randy Goebel for the KIEF 2018. FROM DATA TO ECONOMIC VALUE
 
Methods for making the best use of admin data
Methods for making the best use of admin dataMethods for making the best use of admin data
Methods for making the best use of admin data
 
Measuring the promise of Open Data: Development of the Impact Monitoring Fram...
Measuring the promise of Open Data: Development of the Impact Monitoring Fram...Measuring the promise of Open Data: Development of the Impact Monitoring Fram...
Measuring the promise of Open Data: Development of the Impact Monitoring Fram...
 

Viewers also liked

Nowcasting German GDP growth and the real time newsflow
Nowcasting German GDP growth and the real time newsflowNowcasting German GDP growth and the real time newsflow
Nowcasting German GDP growth and the real time newsflowNational Bank of Belgium
 
Le Big data à Bruxelles aujourd'hui. Et demain ?
Le Big data à Bruxelles aujourd'hui. Et demain ? Le Big data à Bruxelles aujourd'hui. Et demain ?
Le Big data à Bruxelles aujourd'hui. Et demain ? Christina Galouzis
 
Pecha Kucha Presentation
Pecha Kucha Presentation  Pecha Kucha Presentation
Pecha Kucha Presentation Laura Thorburn
 
KARAS-eBrochure-FINAL
KARAS-eBrochure-FINALKARAS-eBrochure-FINAL
KARAS-eBrochure-FINALTeresa Holden
 
Uid15 accessibility raike
Uid15 accessibility raikeUid15 accessibility raike
Uid15 accessibility raikeAntti Raike
 
Cafe Desire Presentation Latest
Cafe Desire Presentation LatestCafe Desire Presentation Latest
Cafe Desire Presentation LatestAbe Joseph
 
Michael Wilkinson & Richard Gunson, Learning Everywhere
Michael Wilkinson & Richard Gunson, Learning EverywhereMichael Wilkinson & Richard Gunson, Learning Everywhere
Michael Wilkinson & Richard Gunson, Learning EverywhereHandheldLearning
 
Cei_Willis_PortfolioILLUSTRATION
Cei_Willis_PortfolioILLUSTRATIONCei_Willis_PortfolioILLUSTRATION
Cei_Willis_PortfolioILLUSTRATIONCei Willis
 
проказы старухи зимы
проказы старухи зимыпроказы старухи зимы
проказы старухи зимыaviamed
 
мальчик
мальчикмальчик
мальчикaviamed
 
Teller training
Teller trainingTeller training
Teller trainingagrigo8148
 
Bagian 2 slide - pengantar teori pajak atas penghasilan (pph)-oke
Bagian 2  slide - pengantar teori pajak atas penghasilan (pph)-okeBagian 2  slide - pengantar teori pajak atas penghasilan (pph)-oke
Bagian 2 slide - pengantar teori pajak atas penghasilan (pph)-okeAsep suryadi
 
Le big data à l'épreuve des projets d'entreprise
Le big data à l'épreuve des projets d'entrepriseLe big data à l'épreuve des projets d'entreprise
Le big data à l'épreuve des projets d'entrepriseRubedo, a WebTales solution
 
Big Data & Real Time #JSS2014
Big Data & Real Time #JSS2014Big Data & Real Time #JSS2014
Big Data & Real Time #JSS2014Romain Casteres
 

Viewers also liked (20)

Nowcasting German GDP growth and the real time newsflow
Nowcasting German GDP growth and the real time newsflowNowcasting German GDP growth and the real time newsflow
Nowcasting German GDP growth and the real time newsflow
 
Le Big data à Bruxelles aujourd'hui. Et demain ?
Le Big data à Bruxelles aujourd'hui. Et demain ? Le Big data à Bruxelles aujourd'hui. Et demain ?
Le Big data à Bruxelles aujourd'hui. Et demain ?
 
Pecha Kucha Presentation
Pecha Kucha Presentation  Pecha Kucha Presentation
Pecha Kucha Presentation
 
KARAS-eBrochure-FINAL
KARAS-eBrochure-FINALKARAS-eBrochure-FINAL
KARAS-eBrochure-FINAL
 
Assignment akshat
Assignment akshatAssignment akshat
Assignment akshat
 
Uid15 accessibility raike
Uid15 accessibility raikeUid15 accessibility raike
Uid15 accessibility raike
 
Cafe Desire Presentation Latest
Cafe Desire Presentation LatestCafe Desire Presentation Latest
Cafe Desire Presentation Latest
 
Michael Wilkinson & Richard Gunson, Learning Everywhere
Michael Wilkinson & Richard Gunson, Learning EverywhereMichael Wilkinson & Richard Gunson, Learning Everywhere
Michael Wilkinson & Richard Gunson, Learning Everywhere
 
Infrastructure strategy
Infrastructure strategyInfrastructure strategy
Infrastructure strategy
 
My City Crawl
My City Crawl My City Crawl
My City Crawl
 
Doc1
Doc1Doc1
Doc1
 
Cei_Willis_PortfolioILLUSTRATION
Cei_Willis_PortfolioILLUSTRATIONCei_Willis_PortfolioILLUSTRATION
Cei_Willis_PortfolioILLUSTRATION
 
Inspection activities linked
Inspection activities linkedInspection activities linked
Inspection activities linked
 
проказы старухи зимы
проказы старухи зимыпроказы старухи зимы
проказы старухи зимы
 
мальчик
мальчикмальчик
мальчик
 
Teller training
Teller trainingTeller training
Teller training
 
Bagian 2 slide - pengantar teori pajak atas penghasilan (pph)-oke
Bagian 2  slide - pengantar teori pajak atas penghasilan (pph)-okeBagian 2  slide - pengantar teori pajak atas penghasilan (pph)-oke
Bagian 2 slide - pengantar teori pajak atas penghasilan (pph)-oke
 
Le big data à l'épreuve des projets d'entreprise
Le big data à l'épreuve des projets d'entrepriseLe big data à l'épreuve des projets d'entreprise
Le big data à l'épreuve des projets d'entreprise
 
The Technological Singularity
The Technological SingularityThe Technological Singularity
The Technological Singularity
 
Big Data & Real Time #JSS2014
Big Data & Real Time #JSS2014Big Data & Real Time #JSS2014
Big Data & Real Time #JSS2014
 

Similar to Big data and macroeconomic nowcasting from data access to modelling

Predictive Analytics: Context and Use Cases
Predictive Analytics: Context and Use CasesPredictive Analytics: Context and Use Cases
Predictive Analytics: Context and Use CasesKimberley Mitchell
 
BDVe Webinar Series - Big Data for Public Policy, the state of play - Roadmap...
BDVe Webinar Series - Big Data for Public Policy, the state of play - Roadmap...BDVe Webinar Series - Big Data for Public Policy, the state of play - Roadmap...
BDVe Webinar Series - Big Data for Public Policy, the state of play - Roadmap...Big Data Value Association
 
Construction of composite index: process & methods
Construction of composite index:  process & methodsConstruction of composite index:  process & methods
Construction of composite index: process & methodsgopichandbalusu
 
ESSnet Big Data WP8 Methodology (+ Quality, +IT)
ESSnet Big Data WP8 Methodology (+ Quality, +IT)ESSnet Big Data WP8 Methodology (+ Quality, +IT)
ESSnet Big Data WP8 Methodology (+ Quality, +IT)Piet J.H. Daas
 
New Data for Innovation Policy
New Data for Innovation PolicyNew Data for Innovation Policy
New Data for Innovation PolicyJuan Mateos-Garcia
 
Garcia - New data for innovation policy
Garcia - New data for innovation policyGarcia - New data for innovation policy
Garcia - New data for innovation policyinnovationoecd
 
Methodological network and strategy
Methodological network and strategy Methodological network and strategy
Methodological network and strategy Dario Buono
 
Fealing - Improving indicators to inform policy
Fealing - Improving indicators to inform policyFealing - Improving indicators to inform policy
Fealing - Improving indicators to inform policyinnovationoecd
 
KU_Big_Data_3_25_2015a
KU_Big_Data_3_25_2015aKU_Big_Data_3_25_2015a
KU_Big_Data_3_25_2015avonmcconnell
 
IAOS 2018 - Evaluation of nowcasting/flash-estimation based on a big set of ...
IAOS 2018 - Evaluation of nowcasting/flash-estimation based on a big set of ...IAOS 2018 - Evaluation of nowcasting/flash-estimation based on a big set of ...
IAOS 2018 - Evaluation of nowcasting/flash-estimation based on a big set of ...StatsCommunications
 
CV-Grace-DataAnalytics-UCL
CV-Grace-DataAnalytics-UCLCV-Grace-DataAnalytics-UCL
CV-Grace-DataAnalytics-UCLHan Yang
 
IRJET-Smart Tourism Recommender System
IRJET-Smart Tourism Recommender SystemIRJET-Smart Tourism Recommender System
IRJET-Smart Tourism Recommender SystemIRJET Journal
 
McKinsey Global Institute Big data The next frontier for innova.docx
McKinsey Global Institute Big data The next frontier for innova.docxMcKinsey Global Institute Big data The next frontier for innova.docx
McKinsey Global Institute Big data The next frontier for innova.docxandreecapon
 
Bigdata and Hadoop with applications
Bigdata and Hadoop with applicationsBigdata and Hadoop with applications
Bigdata and Hadoop with applicationsPadma Metta
 
Learning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docx
Learning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docxLearning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docx
Learning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docxjesssueann
 
Learning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docx
Learning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docxLearning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docx
Learning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docxgauthierleppington
 

Similar to Big data and macroeconomic nowcasting from data access to modelling (20)

Lesson1.2.pptx.pdf
Lesson1.2.pptx.pdfLesson1.2.pptx.pdf
Lesson1.2.pptx.pdf
 
Improving activity data for Tier 2 estimates of livestock emissions: End of W...
Improving activity data for Tier 2 estimates of livestock emissions: End of W...Improving activity data for Tier 2 estimates of livestock emissions: End of W...
Improving activity data for Tier 2 estimates of livestock emissions: End of W...
 
Predictive Analytics: Context and Use Cases
Predictive Analytics: Context and Use CasesPredictive Analytics: Context and Use Cases
Predictive Analytics: Context and Use Cases
 
BDVe Webinar Series - Big Data for Public Policy, the state of play - Roadmap...
BDVe Webinar Series - Big Data for Public Policy, the state of play - Roadmap...BDVe Webinar Series - Big Data for Public Policy, the state of play - Roadmap...
BDVe Webinar Series - Big Data for Public Policy, the state of play - Roadmap...
 
Construction of composite index: process & methods
Construction of composite index:  process & methodsConstruction of composite index:  process & methods
Construction of composite index: process & methods
 
ESSnet Big Data WP8 Methodology (+ Quality, +IT)
ESSnet Big Data WP8 Methodology (+ Quality, +IT)ESSnet Big Data WP8 Methodology (+ Quality, +IT)
ESSnet Big Data WP8 Methodology (+ Quality, +IT)
 
New Data for Innovation Policy
New Data for Innovation PolicyNew Data for Innovation Policy
New Data for Innovation Policy
 
Garcia - New data for innovation policy
Garcia - New data for innovation policyGarcia - New data for innovation policy
Garcia - New data for innovation policy
 
Hobbit project overview presented at EBDVF 2017
Hobbit project overview presented at EBDVF 2017Hobbit project overview presented at EBDVF 2017
Hobbit project overview presented at EBDVF 2017
 
Methodological network and strategy
Methodological network and strategy Methodological network and strategy
Methodological network and strategy
 
Fealing - Improving indicators to inform policy
Fealing - Improving indicators to inform policyFealing - Improving indicators to inform policy
Fealing - Improving indicators to inform policy
 
KU_Big_Data_3_25_2015a
KU_Big_Data_3_25_2015aKU_Big_Data_3_25_2015a
KU_Big_Data_3_25_2015a
 
Week_2_Lecture.pdf
Week_2_Lecture.pdfWeek_2_Lecture.pdf
Week_2_Lecture.pdf
 
IAOS 2018 - Evaluation of nowcasting/flash-estimation based on a big set of ...
IAOS 2018 - Evaluation of nowcasting/flash-estimation based on a big set of ...IAOS 2018 - Evaluation of nowcasting/flash-estimation based on a big set of ...
IAOS 2018 - Evaluation of nowcasting/flash-estimation based on a big set of ...
 
CV-Grace-DataAnalytics-UCL
CV-Grace-DataAnalytics-UCLCV-Grace-DataAnalytics-UCL
CV-Grace-DataAnalytics-UCL
 
IRJET-Smart Tourism Recommender System
IRJET-Smart Tourism Recommender SystemIRJET-Smart Tourism Recommender System
IRJET-Smart Tourism Recommender System
 
McKinsey Global Institute Big data The next frontier for innova.docx
McKinsey Global Institute Big data The next frontier for innova.docxMcKinsey Global Institute Big data The next frontier for innova.docx
McKinsey Global Institute Big data The next frontier for innova.docx
 
Bigdata and Hadoop with applications
Bigdata and Hadoop with applicationsBigdata and Hadoop with applications
Bigdata and Hadoop with applications
 
Learning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docx
Learning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docxLearning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docx
Learning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docx
 
Learning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docx
Learning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docxLearning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docx
Learning Resources Week 2 Frankfort-Nachmias, C., & Leon-Guerr.docx
 

More from Dario Buono

Reporting uncertainties - too much information?
Reporting uncertainties - too much information?Reporting uncertainties - too much information?
Reporting uncertainties - too much information?Dario Buono
 
Skills for the new generation of statisticians
Skills for the new generation of statisticians Skills for the new generation of statisticians
Skills for the new generation of statisticians Dario Buono
 
JDemetra+ Java Tool for Seasonal Adjustment
JDemetra+ Java Tool for Seasonal AdjustmentJDemetra+ Java Tool for Seasonal Adjustment
JDemetra+ Java Tool for Seasonal AdjustmentDario Buono
 
Big Data Analysis: The curse of dimensionality in official statistics
Big Data Analysis: The curse of dimensionality in official statisticsBig Data Analysis: The curse of dimensionality in official statistics
Big Data Analysis: The curse of dimensionality in official statisticsDario Buono
 
Physics4Stats & BMI vs. QoL
Physics4Stats & BMI vs. QoLPhysics4Stats & BMI vs. QoL
Physics4Stats & BMI vs. QoLDario Buono
 
Safebook quality grading
Safebook quality gradingSafebook quality grading
Safebook quality gradingDario Buono
 
MIP: Analysis of metadata and data revisions
MIP: Analysis of metadata and data revisionsMIP: Analysis of metadata and data revisions
MIP: Analysis of metadata and data revisionsDario Buono
 
New innovative 3 way anova a-priori test for direct vs. indirect approach in ...
New innovative 3 way anova a-priori test for direct vs. indirect approach in ...New innovative 3 way anova a-priori test for direct vs. indirect approach in ...
New innovative 3 way anova a-priori test for direct vs. indirect approach in ...Dario Buono
 
Eurostat tools for benchmarking and seasonal adjustment j_demetra+ and jecotr...
Eurostat tools for benchmarking and seasonal adjustment j_demetra+ and jecotr...Eurostat tools for benchmarking and seasonal adjustment j_demetra+ and jecotr...
Eurostat tools for benchmarking and seasonal adjustment j_demetra+ and jecotr...Dario Buono
 
Detecting outliers at the end of the series using forecast intervals
Detecting outliers at the end of the series using forecast intervals Detecting outliers at the end of the series using forecast intervals
Detecting outliers at the end of the series using forecast intervals Dario Buono
 
1 out of 20 scenarios
1 out of 20 scenarios1 out of 20 scenarios
1 out of 20 scenariosDario Buono
 
Eurostat methodological skills staff survey lesson learned final
Eurostat methodological skills staff survey lesson learned finalEurostat methodological skills staff survey lesson learned final
Eurostat methodological skills staff survey lesson learned finalDario Buono
 
Reliability of estimates in socio-demographic groups with small samples
Reliability of estimates in socio-demographic groups with small samplesReliability of estimates in socio-demographic groups with small samples
Reliability of estimates in socio-demographic groups with small samplesDario Buono
 

More from Dario Buono (13)

Reporting uncertainties - too much information?
Reporting uncertainties - too much information?Reporting uncertainties - too much information?
Reporting uncertainties - too much information?
 
Skills for the new generation of statisticians
Skills for the new generation of statisticians Skills for the new generation of statisticians
Skills for the new generation of statisticians
 
JDemetra+ Java Tool for Seasonal Adjustment
JDemetra+ Java Tool for Seasonal AdjustmentJDemetra+ Java Tool for Seasonal Adjustment
JDemetra+ Java Tool for Seasonal Adjustment
 
Big Data Analysis: The curse of dimensionality in official statistics
Big Data Analysis: The curse of dimensionality in official statisticsBig Data Analysis: The curse of dimensionality in official statistics
Big Data Analysis: The curse of dimensionality in official statistics
 
Physics4Stats & BMI vs. QoL
Physics4Stats & BMI vs. QoLPhysics4Stats & BMI vs. QoL
Physics4Stats & BMI vs. QoL
 
Safebook quality grading
Safebook quality gradingSafebook quality grading
Safebook quality grading
 
MIP: Analysis of metadata and data revisions
MIP: Analysis of metadata and data revisionsMIP: Analysis of metadata and data revisions
MIP: Analysis of metadata and data revisions
 
New innovative 3 way anova a-priori test for direct vs. indirect approach in ...
New innovative 3 way anova a-priori test for direct vs. indirect approach in ...New innovative 3 way anova a-priori test for direct vs. indirect approach in ...
New innovative 3 way anova a-priori test for direct vs. indirect approach in ...
 
Eurostat tools for benchmarking and seasonal adjustment j_demetra+ and jecotr...
Eurostat tools for benchmarking and seasonal adjustment j_demetra+ and jecotr...Eurostat tools for benchmarking and seasonal adjustment j_demetra+ and jecotr...
Eurostat tools for benchmarking and seasonal adjustment j_demetra+ and jecotr...
 
Detecting outliers at the end of the series using forecast intervals
Detecting outliers at the end of the series using forecast intervals Detecting outliers at the end of the series using forecast intervals
Detecting outliers at the end of the series using forecast intervals
 
1 out of 20 scenarios
1 out of 20 scenarios1 out of 20 scenarios
1 out of 20 scenarios
 
Eurostat methodological skills staff survey lesson learned final
Eurostat methodological skills staff survey lesson learned finalEurostat methodological skills staff survey lesson learned final
Eurostat methodological skills staff survey lesson learned final
 
Reliability of estimates in socio-demographic groups with small samples
Reliability of estimates in socio-demographic groups with small samplesReliability of estimates in socio-demographic groups with small samples
Reliability of estimates in socio-demographic groups with small samples
 

Recently uploaded

Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Thomas Poetter
 
INTRODUCTION TO Natural language processing
INTRODUCTION TO Natural language processingINTRODUCTION TO Natural language processing
INTRODUCTION TO Natural language processingsocarem879
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBoston Institute of Analytics
 
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...KarteekMane1
 
Decoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectDecoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectBoston Institute of Analytics
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfblazblazml
 
Cyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataCyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataTecnoIncentive
 
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxThe Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxTasha Penwell
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data VisualizationKianJazayeri1
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max PrincetonTimothy Spann
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxaleedritatuxx
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Boston Institute of Analytics
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksdeepakthakur548787
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Boston Institute of Analytics
 
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...Milind Agarwal
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Seán Kennedy
 

Recently uploaded (20)

Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 
INTRODUCTION TO Natural language processing
INTRODUCTION TO Natural language processingINTRODUCTION TO Natural language processing
INTRODUCTION TO Natural language processing
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
 
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
 
Decoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectDecoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis Project
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
 
Cyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataCyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded data
 
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxThe Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data Visualization
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max Princeton
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
Insurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis ProjectInsurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis Project
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing works
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
 
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 

Big data and macroeconomic nowcasting from data access to modelling

  • 1. Session 6: Administrative and Alternative Data Sources Big Data and Macroeconomic Nowcasting: From Data Access to Modelling 15th Conference of International Association for Official Statistics Abu Dhabi, 6–8 December 2016 Dario Buono*, European Commission, Eurostat, dario.buono@ec.europa.eu Stephan Krische*, GOPA Consultants, stephan.krische@gopa.de Massimiliano Marcellino, Bocconi University, massimiliano.marcellino@unibocconi.it George Kapetanios, King’s College, george.kapetanios@kcl.ac.uk Gian Luigi Mazzi, European Commission, Eurostat, gianluigi.mazzi@ec.europa.eu Fotis Papailias, Queen's University Management School, f.papailias@qub.ac.uk *The views expressed are the author’s alone and do not necessarily correspond to those of the corresponding organisations of affiliation
  • 2. Eurostat, the Statistical Office of EU • About 700 people with 28 different nationalities • Statistical Office of European Union, part of EC • Core business: • Euro-zone (19) & EU (28) aggregates • harmonization, best practices, guidelines, trainings & international cooperation • Methodology team: Time Series, Econometrics, SDC, Research & EA
  • 3. Why interested in Big Data for nowcasting? • Big Data are complementary information to standard data, being based on different information sets • More granular perspective on the indicator of interest, both in the temporal and cross-sectional dimensions • It is timely available, generally not subject to revisions
  • 4. European research project: Apr 15 to Jul 16
  • 5. Research questions and findings Can Big Data help for Macroeconomic Nowcasting? What are the potential Big Data sources? 1. Literature review 2. Models/methods to be used for Big data 3. Recommendations on how to handle Big Data 4. Case study: IPI, Inflation, unemployment of some EU countries
  • 6. Big Data types & dimensionality • When the dimensionality increases, the volume of the space increases so fast that the available data become sparse. • For statistically significant result, the amount of data needed often grows exponentially with the dimensionality. • Use of a typology based on Doornik and Hendry (2015): • Tall data: many observation, few variables • Fat data: many variables, few observations • Huge data: many variables, many observations
  • 7. Eurostat Models race • Dynamic Factor Analysis • Partial Least Squares • Bayesian Regression • LASSO regression • U-Midas models • Model averaging 255 models tested, macro-financial & google trend data
  • 8. Eurostat Statistical Methods: findings • Sparse regression (LASSO) works for fat, huge data • Data reduction techniques (PLS) helpful for large variables • (U)-MIDAS or bridge modelling for mixed frequency • Dimensionality reduction improves nowcasting • Forecast combination: Data-driven automated strategy with model rotation based on forecasting performance in the past works well
  • 9. From Data Access to Modelling Step-by-step approach, accompanied by specific recommendations for the use of big data for macroeconomic nowcasting, guiding to • the identification and the choice of Big Data • pre-treatment and econometric modelling • the comparative evaluation of results to obtain a very useful tool for decision about the use or not of Big Data
  • 10. Step 1: Big Data usefulness within a nowcasting exercise Recommendations 1. Evaluate the quality of the existing nowcasts and identify issue (bias or inefficiency or large errors in specific periods), that can be fixed by adding information in Big Data based indicators 2. Use of Big Data only when expecting to improve the timeliness and/or the quality of nowcastings 3. Do not consider Big Data sources with spurious correlations with the target variable
  • 11. Step 2: Big Data search Recommendations 1. Starting point for an assessment of the potential benefits/costs of the use of Big Data for macroeconomic nowcasting: identification of their source • Social Networks (human-sourced information) • Traditional Business Systems (process-mediated data) • Internet of Things (machine-generated data) 2. Choice is heavily dependent on the target indicator of the nowcasting exercise
  • 12. Step 3: Assessment of big-data accessibility and quality Recommendations 1. Privilege data providers with guarantee of continuity and of the availability of a good metadata associated to the Big Data 2. Privilege Big Data sources ensuring sufficient time and cross- sectional coverage 3. If a bias is observed a bias correction can be included in the nowcasting strategy. 4. To deal with possible instabilities of the relationships between the Big Data and the target variables, nowcasting models should be re- specified on a regular basis (e.g. yearly) and occasionally in the presence of unexpected events.
  • 13. Step 4: Big data preparation Recommendations 1. Big data often unstructured: proper mapping 2. Pre-treatment to remove deterministic patterns • Outliers, calendar effects, missing observations • Seasonal and non-seasonal short-term movements should be dealt accordingly to the characteristic of the target variable 3. Create a specific IT environment where the original data are collected and stored with associated routines 4. Ensure the availability of an exhaustive documentation of the Big Data conversion process
  • 14. Step 5: Big Data modelling strategy Recommendations 1. Identification of appropriate econometric techniques 2. First dimension: choice between the use of methods suited for large but not huge datasets, therefore applied to summaries of the Big Data (Google Trends) • nowcasting with large datasets can be based on factor models, large BVARs, or shrinkage regressions 3. Huge datasets can be handled by sparse principal components, linear models combined with heuristic optimization, or a variety of machine learning methods such as LASSO & LARS regression 4. In case of mixed frequency data, methods such as UMIDAS and, as a second best, Bridge, should be privileged.
  • 15. Step 6: Results evaluation of Big Data based nowcasting Recommendations 1. Run a critical and comprehensive assessment of the contribution of Big Data for nowcasting the indicator of interest based, e.g., on standard criteria such as MSE or MAE. 2. In order to reduce the extent of data and model snooping, a cross- validation approach should be followed: • various models and indicators, with and without Big Data, estimated over a first sample and selected and/or pooled according to their performance • then the performance of the preferred approaches re-evaluated over a second sample
  • 16. Case study - Implementation of all these steps for nowcasting IP growth, inflation and unemployment in several EU countries in a pseudo out of sample context, using Google trends for specific and carefully selected keywords for each country and variable - Big Data specific features: transform unstructured into structured data, time series decompositions, handling mixed frequency data - Overall, the results are mixed but there are several cases where Google trends, when combined with rather sophisticated econometric techniques, yield forecasting gains, though generally small. - Gains in term of timeliness or revisions have not been considered
  • 17. Eurostat Literature contribution Eurostat Statistical Working Paper "Big Data and Macroeconomic Nowcasting: From data access to modelling"  Methodological finding will be included in 2 chapter of the Eurostat/UNECE Handbook on Rapid Estimates currently under 2nd peer review, (forthcoming in 2017)
  • 18. What's next? Big Data Econometrics • 2017, a new project focusing on: • Econometrics, Filtering issues, advanced Bayesian estimation and forecasting methods • Real time empirical evaluations (including a direct comparison with Eurostat flash estimates), • New ways and new metrics to present nowcasts • Possible data timeliness/accuracy gains • Big data handling tool developed as R package • Scientific summary for Big Data Econometric strategy
  • 19. Thank you for your attention!! Some References: - Eurostat, Big data and macroeconomic nowcasting, preliminary results presented at the ESS methodological working group (7 April 2016, Luxembourg) http://ec.europa.eu/eurostat/cros/content/item21bigdataandmacroeconomicnowcastingsl ides_en - Big data CROS portal, http://ec.europa.eu/eurostat/cros/content/big-data_en - Marcellino, M. (2016), “Nowcasting with Big Data”, Keynote Speech at the 33rd CIRET conference. - Harford, T. (2014, April). Big data: Are we making a big mistake? Financial Times. Available at http://www.ft.com/cms/s/2/21a6e7d8-b479-11e3-a09a-00144feabdc0.html #ixzz2xcdlP1zZ - Lazer, D., Kennedy, R., King, G., Vespignani, A. (2014). "The Parable of Google Flu: Traps in Big Data Analysis", Science, 143, 1203-1205. - Tibshirani, R. (1996). “Regression Shrinkage and Selection via the Lasso”, Journal of the Royal Statistical Society B, 58, 267-288.