For more info about our Big Data courses, check out our website ➡️ https://www.betacowork.com/big-data/
---------
"Data is the new oil" - Many companies and professionals do not know how to use their data or are not aware of the added value they could gain from it.
It is in response to these problems that the project “Brussels: The Beating Heart of Big Data” was born.
This project, financed by the Region of Brussels Capital and organised by Betacowork, offers 3 training cycles of 10 courses on big data, at both beginner and advanced levels. These 3 cycles will be followed by a Hackathon weekend.
No prerequisites are required to start these courses. The aim of these courses is to familiarize participants with the principles of Big Data.
Course 2: Big Data & Business: Use Cases - by Peter A. Campbell
1. BIG DATA & BUSINESS
INCLUDING USE CASES
COURSE BY PETER A. CAMPBELL
2. CONTENT
1 INTRODUCTION
My early days . . . or . . . misspent youth?
2 DATA MANAGEMENT IN THE LAST CENTURY
3 TRYING TO KEEP UP!
Data Management in this century
4 SUMMARY
Parting thoughts, Questions & Answers
5 ANNEXES AND FUN STUFF
4. Course by Peter Campbell
PETER A. CAMPBELL
CAREER SUMMARY
➢Independent Business and
Information Management Consultant
➢Over 70% of career internationally;
working on large international
projects (primarily in Europe);
➢Founding Member & Director, DAMA
BeLux (Data Management
Association, Belgium & Luxembourg)
EXPERTISE SUMMARY
➢ Defining and delivering data-
centric solutions for business;
➢ Data Management & Data
Governance; Master Data;
➢ Data Architecture, Data Modeling,
Database Design, Database
Administration;
➢ Data Interoperability and
Integration;
➢ Data Warehousing & Business
Intelligence
➢ Big Data / Data Science
EDUCATION / CERTIFICATION
➢ Master of Business Administration,
Boston University (Boston, MA,
USA)
➢ Bachelor of Arts, Brown University
(Providence, RI, USA)
➢ Continuously enhancing my
knowledge via seminars, active
involvement in professional
associations, conferences,
webinars, courses (live or
MOOCs), meetups, et cetera.
My objectives – for this session:
▪ Provide some background history on data management over the years;
▪ “Demystify” Big Data and Data Science and reduce the hype !
▪ Talk about some business “Use Cases”
5. Course by Peter Campbell
INTRODUCTION: My early days . . . or . . . misspent youth?
As a teenager, wandering into MIT
(Massachusetts Institute of Technology) labs
and playing chess against the big computer…
… and I won!
6. Course by Peter Campbell
INTRODUCTION: My early days . . . or . . . misspent youth?
Suburbs of Boston, Massachusetts (USA), secondary school, early 1970s:
➢ DEC (Digital Equipment Corporation) PDP-8/I, with core memory
➢ ASR-33 Teletype (terminal): paper roll, paper tape (with punched holes)
… enough about me … back to the lecture!
7. DATA MANAGEMENT
IN THE LAST CENTURY
➢ “Big Iron”: the very early days
➢ Pre-relational → Relational databases
➢ The rise of Applications
➢ Data Warehousing and Business Analytics
➢ The beginnings of Data Interoperability (EDI)
➢ The early Internet Age
8. Course by Peter Campbell
Small Data, Enormous Machines!
ENIAC (Electronic Numerical
Integrator And Computer)
https://en.wikipedia.org/wiki/ENIAC
1945
Some “Apps”:
➢Calculate artillery firing tables for the U.S. Army
➢Study feasibility of the hydrogen bomb
➢Weather Forecasting
9. Course by Peter Campbell
The Rise of Application – Machines!
ERMA: Electronic Recording Machine, Accounting, Mark 1 (mid-1950s)
https://en.wikipedia.org/wiki/Electronic_Recording_Machine,_Accounting
http://www.sri.com/work/timeline-innovation
10. Course by Peter Campbell
First “big” disk drive:
IBM 350, in the IBM 305 RAMAC
RAMAC = Random Access Method of
Accounting and Control
11. Course by Peter Campbell
A bit of Data Management History …“Appointment in Philadelphia” ( more than 50 years ago ! )
https://www.youtube.com/watch?v=WRJYtbDHCto
Tribute (NY Times, 2017): https://www.nytimes.com/2017/07/16/technology/charles-w-bachman-dies.html
12. Course by Peter Campbell
“Pre-relational” DBMSes (1970s into 1980s)
IBM BOMP, DBOMP, IMS (Hierarchical)
CINCOM Total (Network DB)
CULLINANE / CULLINET IDMS (Network DB)
Applied Data Research (ADR) Datacom / DB (Inverted List)
Software AG Adabas (Inverted List)
Computer Corporation of America (CCA) Model 204 (Inverted List)
http://www.softwarememories.com/2006/02/09/prerelational-dbms-vendors-a-quick-overview/
13. Course by Peter Campbell
The rise of Applications (North America): the “Big Eight” apps:
General Ledger
Accounts Payable
Accounts Receivable
Purchasing
Inventory
Fixed Assets
Payroll
Human Resources
and others…
The rise of Applications (Europe)
http://www.softwarememories.com/2015/08/07/application-databases/
14. Course by Peter Campbell
Data Warehousing and Business Analytics: Early Pioneers
1970s, 1980s: Bill Inmon created the accepted definition of what a data
warehouse is: a subject-oriented, nonvolatile, integrated,
time-variant collection of data in support of management's
decisions.
Also known as the “Corporate Information Factory”.
1988: Dr. Barry Devlin coined the term “Information Warehouse”.
Shortly afterwards, IBM marketed this as the “Information Center”.
15. Course by Peter Campbell
http://globalscorecard.gs1.org/gsclive/guide_to_ECR/E02.asp
1974: A pack of Wrigley's gum
becomes the first product to be
scanned with a GS1 barcode in a
Marsh supermarket in Troy,
Ohio, United States.
1970s:
▪ UPC in U.S.
▪ EAN in Europe
Electronic Data Interchange (EDI): early days
16. Course by Peter Campbell
Data Interoperability (EDI) continues
EDI in various industries
Other Industries and EDI Standards:
➢ Automotive: STAR (Standards for Technology in Automotive Retail)
➢ Construction: EDICON
➢ Gas: Edig@s, EASEE-gas
➢ Transport: FORTRAS
➢ Textiles/Fashion: EDITEX
+ many others …
http://www.edibasics.com/edi-by-industry/the-high-tech-industry/
17. Course by Peter Campbell
Dawn of the Internet (Christmas 1990)
Sir Tim Berners-Lee
CERN httpd (later also known as W3C httpd) was a web server (HTTP) daemon originally developed at
CERN from 1990 onwards by Tim Berners-Lee, Ari Luotonen and Henrik Frystyk Nielsen.
➢ Implemented in C
➢ First ever web server software
➢ Went “live” on Christmas Day 1990
https://en.m.wikipedia.org/wiki/CERN_httpd
SOME OTHER KEY CONTRIBUTORS: Robert Cailliau, Vint Cerf
18. Course by Peter Campbell
Data Warehousing and Business Analytics: in the 1990s ( Methodologies / Techniques )
1996: Ralph Kimball publishes the book
"The Data Warehouse Toolkit"
2000: Daniel Linstedt releases
the Data Vault, enabling a real-time,
auditable Data Warehouse.
https://en.wikipedia.org/wiki/Data_Vault_Modeling
http://danlinstedt.com/
19. Course by Peter Campbell
Data Warehousing and Business Analytics
In the 1980s and 1990s, software mostly from the “big players”
First “BIG” Data Warehouse (Wal-Mart): first to reach one (1) terabyte, in 1992!
https://www.healthcatalyst.com/wal-mart-birth-of-data-warehouse/
20. Course by Peter Campbell
Other interesting (parallel) developments in the 1990s
NCSA HTTPd was a web server originally developed
at the NCSA (National Center for Supercomputing
Applications, University of Illinois at Urbana-Champaign)
https://en.m.wikipedia.org/wiki/NCSA_HTTPd
Mosaic (browser) was released by NCSA in 1993 and later
commercialized as “Netscape”
1995 / 1996: development at NCSA slowed down; an
independent effort, the Apache project, took over the codebase
and continued.
The Apache Software Foundation (ASF) was formed
from the Apache Group and incorporated in
Delaware (US) in June 1999
https://en.wikipedia.org/wiki/Apache_Software_Foundation
21. Course by Peter Campbell
The “Dot-Com” Bubble
Rising: 1995–2000 (Peak: 10 March 2000)
http://ipcarrier.blogspot.be/2013/11/tech-sector-is-in-bubble.html
http://www.businessinsider.com/where-are-they-now-the-kings-of-the-90s-dot-com-bubble-2013-10?op=1&IR=T
Here in Belgium / NL … (2000)
https://en.wikipedia.org/wiki/World_Online
22. Course by Peter Campbell
Established Players (started in the last century)
Year founded:
1911: IBM. Originally the Computing-Tabulating-Recording Company,
renamed to International Business Machines in 1924
1939: HP (Hewlett-Packard). Named after the two founders; started in 1939 in Packard's
garage with an initial capital investment of $538
1972: SAP. Systemanalyse und Programmentwicklung ("System
Analysis and Program Development"); dominant player in
applications, acquired Sybase (DBMS vendor) in 2010
1975: Microsoft. Founders: Bill Gates, Paul Allen; still dominant on the
desktop, now into Cloud Computing (Azure)
1976: Apple. Very high level of brand loyalty, one of the most valuable
companies in the world; iPhone / iPad / iPod / iMac / iTunes /
iStore / iWatch
Source: Wikipedia
23. Course by Peter Campbell
Established Players (started in the last century)
Year founded:
1976: SAS. Acronym for Statistical Analysis System, original usage in
Agriculture departments, still privately held (owners: James
Goodnight and John Sall)
1977: Oracle. Originally "Software Development Laboratories (SDL)", changed
to Oracle Corporation in 1995. Market leader in RDBMS and
big player in applications
1979: Teradata. Specialized data machines, to handle large relational databases
(for analytics). Teradata began to associate itself with the term
“Big Data” in 2010
1984: Dell. Founder: Michael Dell. Taken private (management buyout) in
2013, employs about 138 000 people worldwide (end 2017)
1984: Cisco. Designs, manufactures, and sells networking
equipment, IPO (Initial Public Offering) in 1990
Source: Wikipedia
24. Course by Peter Campbell
New Players on the scene (1990s) … who became very big very fast
Year founded:
1994: Amazon
➢ Originally named "Cadabra", changed to Amazon in 1995, went online
as amazon.com in 1995, IPO May 1997 (AMZN);
➢ 2015: Amazon surpassed Walmart as the most valuable retailer
in the United States by market capitalization
1995: eBay
➢ Original name "AuctionWeb", changed to eBay in 1997
1998: Google
➢ Founders: Larry Page and Sergey Brin, IPO in 2004;
➢ Many acquisitions of various companies, also investments;
➢ R&D work on advanced technologies (driverless cars, Google Glass, etc.)
1999: Alibaba
Alibaba Group Holding Limited is a Chinese e-commerce company that
provides consumer-to-consumer, business-to-consumer and
business-to-business sales services via web portals. Founder / CEO: Jack Ma
Newer, 21st century startups covered in a few minutes . . .
Source: Wikipedia
25. TRYING TO KEEP UP !
DATA MANAGEMENT IN THIS CENTURY
➢ Big Data
➢ The "Ecosystem", and some of the current key players
➢ NoSQL, NewSQL, other new database technologies
➢ Cloud Computing
+ Database Pioneer: Dr. Michael Stonebraker
➢ Linked Open Data / Semantic Data Interoperability
➢ “Internet of Things” (IoT)
➢ Analytics and Insights :
➢ AI / Machine Learning, Data Mining, Predictive Analytics
➢ Don’t forget Security ! ( Confidentiality, Data Privacy, Data Breaches )
THE “USE CASES”
26. Course by Peter Campbell
Recent “Digital Disruptors” (since 2000)
Founded:
2003: LinkedIn. Social network (professional networking), IPO
2011; acquired by Microsoft, 2016
2004: Facebook. "Facemash" (Zuckerberg: program, 2003), Thefacebook. IPO in 2012.
October 2012: Facebook passes the mark of one billion
monthly active users. Market valuation of $460 billion (February 2019)
2006: Amazon Web Services (covered in detail in the "Cloud Computing" section)
2006: Twitter. IPO in 2013.
End 2018: Twitter has more than 500
million users, of which more than
300 million are active users
2008: Uber. Uber Technologies Inc. is an American international transportation
network company. Venture capital funded (including Google Ventures)
2009: Airbnb. Airbnb is a website for people to list, find, and rent lodgings. It
has over 4,000,000 listings in 65,000 cities and 191 countries.
Source: Wikipedia
27. Course by Peter Campbell
… AND HERE WE HAVE
PETER, WHO HANDLES
ALL OF OUR BIG DATA
PROJECTS
Google search "BIG DATA” (22 February 2019): about 7.410.000.000 results (0,60 seconds)
Used to be called the “Information Explosion” Quiz: from what year ?
http://www.newworldencyclopedia.org/entry/Information_explosion
28. Course by Peter Campbell
http://www.forbes.com/sites/gilpress/2013/05/09/a-very-short-history-of-big-data/print/
Background and Context: The term “Big Data”
29. Course by Peter Campbell
Big Data, and the 3 “V”s
The original 3 “V”s: . . . but, there are more “V”s:
and even more…. https://www.linkedin.com/pulse/vs-ability-peter-campbell/
30. Course by Peter Campbell
Background and Context:
A Big Data “Word Cloud”
https://datascience.berkeley.edu/what-is-big-data/
3 September 2014
43 different definitions ! ! !
31. Course by Peter Campbell
Background and Context: Big Data “Drivers”
In general, Big Data are generated by several means:
Social Networking and Media
▪ Over 2.3 billion Facebook users
▪ Over 300 million Twitter users
▪ Over 500 million LinkedIn users
▪ Over 500 million public blogs
Mobile Devices
▪ Nearly 5 billion mobile phones are in use worldwide, often
collecting and transmitting (GPS) location data.
Internet Transactions
▪ Billions of transactions per day, with data points
collected by retailers, banks, credit agencies, others
Networked Devices and Sensors
▪ Electronic devices of all sorts create semi-structured log
data that record every action.
(statistics from various Internet websites)
What is Big Data? Big Data Explained http://youtu.be/c4BwefH5Ve8 (Patrick Schwerdtfeger)
32. Course by Peter Campbell
https://www.youtube.com/watch?v=N1ltwg2nTK4
33. Course by Peter Campbell
Background and Context: “Big Data” in the Media
https://www.economist.com/leaders/2010/02/25/the-data-deluge
25 February 2010 : The Economist
35. Course by Peter Campbell
https://wikibon.com/2016-2026-worldwide-big-data-market-forecast/
https://wikibon.com/research/big-data/
36. Course by Peter Campbell
The Elephant in the Room (Hadoop)
http://www.dbms2.com/2015/06/10/hadoop-generalities/#more-9664
Doug Cutting helped found the Hadoop project and named
it “Hadoop” after his son’s stuffed elephant.
➢ Currently: Chief Architect at Cloudera
(September 2009 to present)
➢ Previously: “Technical Yahoo” at Yahoo
(January 2006 – August 2009)
➢ The Apache Software Foundation
➢ Director, 2009 - 2015
➢ Committer (since 2001)
➢ Also: Apple & Xerox PARC
➢ Education: BA Linguistics (Stanford, 1985)
37. Course by Peter Campbell
Hadoop components & functions
http://blog.agro-know.com/?p=3810
And more recently:
https://en.wikipedia.org/wiki/Apache_Spark
+
https://flink.apache.org/
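As background for the Hadoop components on this slide: Hadoop's original processing engine, MapReduce, splits a job into a map step, a shuffle (group by key) step and a reduce step. The sketch below only imitates those three steps in plain Python on one machine. It does not use Hadoop, and the function names (map_phase, shuffle_phase, reduce_phase, word_count) are illustrative assumptions, not part of any Hadoop API.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    """Run map, shuffle and reduce over a list of documents (single machine)."""
    pairs = []
    for document in documents:
        pairs.extend(map_phase(document))
    return reduce_phase(shuffle_phase(pairs))


if __name__ == "__main__":
    print(word_count(["big data is big", "data is the new oil"]))
```

On a real cluster the map and reduce steps run in parallel on many machines and the framework handles the shuffle; the single-machine version above only shows the data flow.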
38. Course by Peter Campbell
The book “Seven Databases in Seven Weeks” covers:
https://pragprog.com/book/rwdata/seven-databases-in-seven-weeks
NoSQL, NewSQL, other new database technologies: NoSQL
39. Course by Peter Campbell
NoSQL, NewSQL, other Database technologies: NoSQL
Some of the most popular NoSQL databases are:
Memcached (Mem-Cache-D)
SEE ALSO:
https://en.wikipedia.org/wiki/NoSQL
https://www.infoworld.com/article/3260184/nosql/how-to-choose-the-right-nosql-database.html
https://www.improgrammer.net/most-popular-nosql-database/
40. Course by Peter Campbell
NoSQL, NewSQL, other Database technologies: NewSQL
NewSQL is an emerging category
that borrows some of the scalability
of NoSQL databases for the
relational database world
https://en.wikipedia.org/wiki/NewSQL
The Forrester Wave™: In-Memory
Database Platforms, Q3 2015
( + NuoDB )
41. Course by Peter Campbell
Cloud Computing : Origin of the Term
1996: George Favaloro poses with a 1996
Compaq business plan.
The document is the earliest known
use of the term “cloud computing”
Title: “Internet Solutions Division Strategy for
Cloud Computing” (14 November 1996)
http://www.technologyreview.com/news/425970/who-coined-cloud-computing/
2006: Large companies such as Google and Amazon started using the
term “cloud computing”; AWS (Amazon Web Services) created
42. Course by Peter Campbell
Cloud Computing : Market Leaders
Gartner “Magic Quadrant”
For Cloud Infrastructure as a
Service, April 2018
Courtesy of Gartner
https://en.wikipedia.org/wiki/Gartner
43. Course by Peter Campbell
Cloud Computing : October 2013
http://www.businessinsider.com/amazon-wins-court-battle-cia-contract-2013-10?IR=T
https://www.businessinsider.com/ibm-stops-fighting-amazons-cia-deal-2013-11?IR=T
44. Course by Peter Campbell
Catch me if
you can !
Cloud Computing : Market Leaders
https://www.forbes.com/sites/louiscolumbus/2018/09/23/roundup-of-cloud-computing-forecasts-and-market-estimates-2018/
45. Course by Peter Campbell
Cloud Computing : Service Offerings, Amazon Web Services (AWS)
https://aws.amazon.com/free/
46. Course by Peter Campbell
Cloud Computing : Service Offerings, Microsoft “Azure”
https://www.dotnettricks.com/learn/azure/getting-started-with-microsoft-azure-platform
47. Course by Peter Campbell
Database Pioneer: Dr. Michael Stonebraker
Active in Database inventions since the 1970s,
at the University of California at Berkeley, now
at Massachusetts Institute of Technology
2005: IEEE John von Neumann Medal
For “contributions to the design, implementation,
and commercialization of relational and object-
relational database systems.”
2014: A. M. Turing Award Winner
For fundamental contributions to the concepts and
practices underlying modern database systems.
https://en.wikipedia.org/wiki/Michael_Stonebraker
http://www.sitepronews.com/2015/07/03/trailblazer-pioneer-of-data-base-research-michael-stonebraker/
Michael Stonebraker: Big Data is (at least) Four Different Problems https://www.youtube.com/watch?v=KRcecxdGxvQ
48. TRYING TO KEEP UP !
DATA MANAGEMENT IN THIS CENTURY
➢ Big Data
➢ The "Ecosystem", and some of the current key players
➢ NoSQL, NewSQL, other new database technologies
➢ Cloud Computing
+ Database Pioneer: Dr. Michael Stonebraker
➢ Linked Open Data / Semantic Data Interoperability
➢ “Internet of Things” (IoT)
➢ Analytics and Insights :
➢ AI / Machine Learning, Data Mining, Predictive Analytics
➢ Don’t forget Security ! ( Confidentiality, Data Privacy, Data Breaches )
THE “USE CASES” ( CONTINUED )
49. Course by Peter Campbell
Linked Open Data / Semantic Data Interoperability
BLOOD ? ? ?
Big
Linked
Online
Open
Data(sets)
https://www.linkedin.com/pulse/blood-peter-campbell
50. Course by Peter Campbell
http://euclid-project.eu/
Linked Open Data / Semantic Data Interoperability
Latest (and much bigger) version:
https://lod-cloud.net/
51. Course by Peter Campbell
Linked Open Data / Semantic Data Interoperability
https://en.wikipedia.org/wiki/FOAF_(ontology)
http://euclid-project.eu/
EUCLID is a European project facilitating professional training for
data practitioners, who aim to use Linked Data in their daily work.
https://en.wikipedia.org/wiki/DBLP
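Linked Data describes things as subject / predicate / object triples, with URIs as names. The sketch below shows the idea in plain Python, using two real FOAF properties (foaf:name and foaf:knows). The two people under example.org and the helper functions are made up for illustration, and no RDF library is involved.

```python
# FOAF namespace; the name and knows properties are part of the FOAF vocabulary.
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical people, identified by URIs under the reserved example.org domain.
ALICE = "http://example.org/people/alice"
BOB = "http://example.org/people/bob"

# A tiny "graph": a list of (subject, predicate, object) triples.
TRIPLES = [
    (ALICE, FOAF + "name", "Alice"),
    (BOB, FOAF + "name", "Bob"),
    (ALICE, FOAF + "knows", BOB),
]


def objects(subject, predicate, graph):
    """Return every object linked to `subject` through `predicate`."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def names_of_acquaintances(person, graph):
    """Follow foaf:knows links, then look up each acquaintance's foaf:name."""
    names = []
    for friend in objects(person, FOAF + "knows", graph):
        names.extend(objects(friend, FOAF + "name", graph))
    return names


if __name__ == "__main__":
    print(names_of_acquaintances(ALICE, TRIPLES))
```

Because both datasets and vocabularies are identified by URIs, triples published by different organisations can be merged into one list and queried the same way, which is the interoperability point of this section.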
52. Course by Peter Campbell
Linked Open Data / Semantic Data Interoperability
Used in several European / EU PROJECTS:
➢ EuroVoc (multilingual, multidisciplinary thesaurus)
➢ EU Publications Office (Luxembourg):
► ► ► VocBench ( http://vocbench.uniroma2.it/ )
➢ LOD2 (Linked Open Data 2: "Creating Knowledge out of interlinked Data")
➢ European Union Open Data Portal
➢ Open Data Support
➢ SEMIC (Semantic Interoperability Community)
➢ + others
53. Course by Peter Campbell
http://www.efpia.eu/ http://www.imi.europa.eu/
Linked Open Data / Semantic Data Interoperability
http://www.openphacts.org/
http://www.openphactsfoundation.org/
http://connecteddiscovery.com/
Plus….. Dozens of Academic
Institutions and Institutes
“USE CASE”: Open PHACTS
( Pharma : Connected Discovery )
54. Course by Peter Campbell
https://www.wired.co.uk/article/internet-of-things-what-is-explained-iot
https://internetofthingsagenda.techtarget.com/feature/Explained-What-is-the-Internet-of-Things
https://research.populus.ai/reports/Populus_MicroMobility_2018_Jul.pdf
Internet of Things (IoT)
55. Course by Peter Campbell
Internet of Things (IoT) connected
devices installed base worldwide
from 2015 to 2025 (in billions)
SOURCE: https://www.statista.com/statistics/471264/iot-number-of-connected-devices-worldwide/
Internet of Things (IoT)
56. Course by Peter Campbell
Internet of Things (IoT)
Many different “Use Cases”
Morgan Stanley Blue Paper:
http://byinnovation.eu/wp-content/uploads/2014/11/MORGAN-STANLEY-BLUE-PAPER_Internet-of-Things.pdf
Some online courses, Curtin University (moderate fees charged):
https://www.edx.org/micromasters/curtinx-internet-of-things-iot
57. Course by Peter Campbell
Internet of Things (IoT): Key infrastructure
GS1: in Brussels,
and very much
involved in the
“Internet of Things”
https://en.wikipedia.org/wiki/GS1
GS1 is a neutral, not-for-profit, international organization that develops
and maintains standards for supply and demand chains across multiple sectors.
GS1 has over a million member companies across the world, executing more
than six billion transactions daily using GS1 standards.
IDENTIFY :
Standards for the identification
of items, locations, shipments,
assets, etc., and associated data
CAPTURE :
Standards for encoding
and capturing data in
physical data carriers such
as barcodes and RFID tags
SHARE :
Standards for
sharing data
between parties
( Global Headquarters in Brussels: Blue Tower, Avenue Louise )
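As a small illustration of the IDENTIFY layer: GS1 item numbers such as UPC-A and EAN-13 end in a check digit, computed by weighting the other digits alternately by 3 and 1, starting from the right. The sketch below assumes digit-only input, and the function names are illustrative rather than taken from any GS1 library.

```python
def gs1_check_digit(data_digits: str) -> int:
    """Compute the GS1 check digit for a number given WITHOUT its check digit.

    Working from the rightmost data digit, weights alternate 3, 1, 3, 1, ...
    The check digit brings the weighted sum up to a multiple of 10.
    """
    total = 0
    for position, char in enumerate(reversed(data_digits)):
        weight = 3 if position % 2 == 0 else 1
        total += int(char) * weight
    return (10 - total % 10) % 10


def is_valid_gtin(code: str) -> bool:
    """Check a full number (data digits plus trailing check digit)."""
    if not code.isdigit() or len(code) < 2:
        return False
    return gs1_check_digit(code[:-1]) == int(code[-1])


if __name__ == "__main__":
    print(is_valid_gtin("4006381333931"))  # an EAN-13 number
    print(is_valid_gtin("036000291452"))   # a UPC-A number
```

The same weighting works for both the 12-digit UPC and the 13-digit EAN because it is anchored on the right-hand end of the number, which is why one scanner can read both.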
58. Course by Peter Campbell
Internet of Things (IoT)
https://www.youtube.com/watch?v=K3pYZwol6Dc
Silicon Valley:
Gilfoyle Hacks
Jian Yang’s
Smart Fridge
59. Course by Peter Campbell
Analytics and Insights : WHY ?
https://www.businessnewsdaily.com/4522-big-data.html
https://www.businessnewsdaily.com/10625-businesses-collecting-data.html
60. Course by Peter Campbell
Analytics and Insights: Data is . . . . ? ! ?
Data is the new "_____" ? ! ? https://www.linkedin.com/pulse/data-new-peter-campbell/
https://www.wsj.com/articles/data-is-the-new-middle-manager-1429478017
Wall Street Journal,
20 April 2015
61. Course by Peter Campbell
Analytics and Insights
Magic Quadrant for Analytics and Business
Intelligence Platforms, January 2019
Courtesy of Gartner
NOTE: if you google on “GARTNER MQ BI ANALYTICS”,
you can download the entire report (about 60 pages) from
some vendors – after you supply your contact details !
62. Course by Peter Campbell
Analytics and Insights : Evolution
http://www.gartner.com/it-glossary/predictive-analytics
Predictive analytics describes any approach
to data mining with four attributes:
➢ An emphasis on prediction (rather than
description, classification or clustering)
➢ Rapid analysis measured in hours or days
(rather than the stereotypical months of
traditional data mining)
➢ An emphasis on the business relevance of
the resulting insights (no ivory tower
analyses)
➢ (Increasingly) an emphasis on ease of use,
thus making the tools accessible to
business users.
Analytics and Insights:
Gartner Group Chart
63. Course by Peter Campbell
Analytics and Insights: Data Science & Machine Learning Platforms
Gartner defines a data science platform as:
A cohesive software application that offers a mixture
of basic building blocks essential for creating all kinds
of data science solutions, and for incorporating those
solutions into business processes, surrounding
infrastructure and products (Report: January 2019)
Magic Quadrant for Data Science and Machine
Learning Platforms, November 2018 / January 2019
64. Course by Peter Campbell
Analytics and Insights: USE CASES
SLIDE SHOW: http://www.slideshare.net/Dell/big-data-use-cases-36019892
67. Course by Peter Campbell
Analytics and Insights : USE CASES
Source: InfoDiagram.com
1) 360° View of the Customer
2) Fraud Prevention
3) Security Intelligence
4) Data Warehouse Offload
5) Price Optimization
6) Operational Efficiency
7) Recommendation Engines
8) Social Media Analysis and Response
9) Preventive Maintenance and Support
10) Internet of Things
Big Data Use Cases (Datamation, 21st June 2017)
https://www.datamation.com/big-data/big-data-use-cases.html
68. Course by Peter Campbell
Analytics and Insights : USE CASES
Amsterdam Startup Bootcamp: FinTech / CyberSecurity Demo Day (at Rabobank Utrecht, 14th February 2019)
70. Course by Peter Campbell
Analytics and Insights
“We’ve got the Big Data report, we did the competitive
analysis, and nobody thought to include cats?!”
71. Course by Peter Campbell
Analytics and Insights: Big Data
(EN) “Our analysis of 5 petabytes of Facebook data and 800 million Tweets leads to one conclusion: Our clients are idiots!”
72. Course by Peter Campbell
Analytics and Insights: AI / Machine Learning
http://www.infoworld.com/article/2900036/machine-learning/not-all-machine-learning-is-created-equal.html
InfoWorld: Not all machine learning is created equal
73. Course by Peter Campbell
Analytics and Insights: Machine Learning (USE CASE)
http://www.infoworld.com/article/2907877/machine-learning/how-paypal-reduces-fraud-with-machine-learning.html
According to Wang, PayPal is a pioneer in risk management,
although some advanced efforts are just now emerging from the
lab. PayPal uses three types of machine learning algorithms for
risk management: linear, neural network, and deep learning.
Experience has shown PayPal that in many cases, the most
effective approach is to use all three at once.
“We take trust very seriously. It’s our brand. We have to decide in
a couple of hundred milliseconds whether this is a good person, [in
which case] we will give him or her the best and the fastest and the
most convenient experience.”
Wang emphasizes that you need large quantities of data to support
these complex neural network structures. PayPal itself collects
gargantuan amounts of data about buyers and sellers, including
their network information, machine information, and financial data.
The deep learning beast is well fed.
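The idea of using several model types at once can be sketched as a simple ensemble. The article excerpt does not say how PayPal combines its models, so the plain averaging, the threshold, the three toy scoring functions and the transaction fields below are all assumptions made for illustration only.

```python
from statistics import mean


# Hypothetical stand-ins for three model families. Each returns a risk
# score between 0.0 (looks safe) and 1.0 (looks risky). Real models would
# be trained on historical data; these rules are invented for the sketch.
def linear_model(tx):
    return min(1.0, tx["amount"] / 10_000)


def neural_network(tx):
    return 0.9 if tx["new_device"] else 0.1


def deep_model(tx):
    return 0.8 if tx["country_mismatch"] else 0.2


MODELS = [linear_model, neural_network, deep_model]


def ensemble_risk(tx, models=MODELS):
    """Average the scores of all models (assumed combination rule)."""
    return mean([model(tx) for model in models])


def decide(tx, threshold=0.5):
    """Approve low-risk transactions, send the rest for review."""
    return "review" if ensemble_risk(tx) >= threshold else "approve"


if __name__ == "__main__":
    print(decide({"amount": 100, "new_device": False, "country_mismatch": False}))
    print(decide({"amount": 9000, "new_device": True, "country_mismatch": True}))
```

Averaging is only one option; weighted votes or a further model trained on the three scores are common alternatives, and the quoted time budget of a few hundred milliseconds limits how many models can run per transaction.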
74. Course by Peter Campbell
Analytics and Insights: AI / Machine Learning USE CASES
https://www.forbes.com/sites/louiscolumbus/2018/08/26/25-machine-learning-startups-to-watch-in-2018/
75. Course by Peter Campbell
Analytics and Insights: AI / Machine Learning USE CASES
1. Data Security
2. Personal Security
3. Financial Trading
4. Healthcare
5. Marketing Personalization
6. Fraud Detection
7. Recommendations
8. Online Search
9. Natural Language Processing (NLP)
10. Smart Cars
https://www.forbes.com/sites/bernardmarr/2016/09/30/what-are-the-top-10-use-cases-for-machine-learning-and-ai/
https://www.bernardmarr.com/
(30 September 2016)
76. Course by Peter Campbell
Analytics and Insights: AI / Machine Learning USE CASES
26th December 2018
https://www.forbes.com/sites/davidteich/2018/12/26/machine-learning-and-artificial-intelligence-in-business-year-in-review-2018/
INCLUDES:
➢ Natural Language
➢ Machine Learning
➢ Robotics
➢ Robotic Process Automation
+ The Impact beyond 2018
77. Course by Peter Campbell
Analytics and Insights: AI / Machine Learning USE CASES
https://www.forbes.com/sites/nvidia/2019/02/27/reaping-success-with-enterprise-machine-learninginsights-from-capital-one/
78. Course by Peter Campbell
Analytics and Insights: AI / Machine Learning USE CASES
https://www.forbes.com/sites/nvidia/2019/02/28/4-industries-transformed-by-machine-learning-today/
RETAIL
CONSUMER INTERNET
FINANCIAL SERVICES
HEALTH CARE
79. Course by Peter Campbell
Analytics and Insights: AI / Machine Learning USE CASES
https://www.forbes.com/insights-intelai/ai-issue-1/
https://www.forbes.com/insights-intelai/ai-issue-2/
https://www.forbes.com/insights-intelai/ai-issue-3/
https://www.forbes.com/insights-intelai/ai-issue-4/
80. Course by Peter Campbell
Analytics and Insights: Data Science, Data Scientists
https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century/
81. Course by Peter Campbell
Analytics and Insights: Data Science, Data Scientists
What is a Data Scientist ?
❖ Project Manager
❖ Qualified statistician
❖ Domain Business Expert
❖ Experienced Data Architect
❖ Software Engineer
IT’S A TEAM !
82. Course by Peter Campbell
Analytics and Insights: Data Science, Data Scientists
Data Science Central is the industry's online resource for data
practitioners. From Statistics to Analytics to Machine Learning to AI,
Data Science Central provides a community experience that includes
a rich editorial platform, social interaction, forum-based support,
plus the latest information on technology, tools, trends, and careers.
https://www.datasciencecentral.com/
83. Course by Peter Campbell
Don’t forget Security ! ( Confidentiality, Data Privacy, Data Breaches )
84. Course by Peter Campbell
Don’t forget Security ! ( Confidentiality, Data Privacy, Data Breaches )
Target Stores,
Data Breach 2014
http://www.forbes.com/sites/anthonykosner/2014/01/17/actually-two-attacks-in-one-target-breach-affected-70-to-110-million-customers/
85. Course by Peter Campbell
Don’t forget Security ! ( Confidentiality, Data Privacy, Data Breaches )
07/03/2019
Sony Breach (Hack of the Century)
Full story:
http://fortune.com/sony-hack-part-1/
http://fortune.com/sony-hack-part-two/
http://fortune.com/sony-hack-final-part/
86. Course by Peter Campbell
DATA MANAGEMENT IN THIS CENTURY
Don’t forget Security ! ( Confidentiality, Data Privacy, Data Breaches )
https://www.forbes.com/sites/davidvolodzko/2018/12/04/marriott-breach-exposes-far-more-than-just-data/
https://www.l2inc.com/daily-insights/winners-and-losers/disrupting-the-disruptors
87. Course by Peter Campbell
DATA MANAGEMENT IN THIS CENTURY
Don’t forget Security ! Regulatory Aspects
https://en.wikipedia.org/wiki/General_Data_Protection_Regulation
https://eur-lex.europa.eu/eli/reg/2016/679/oj
88. Course by Peter Campbell
DATA MANAGEMENT IN THIS CENTURY
Don’t forget Security ! Regulatory Aspects
Who Will Get the First Big GDPR Fine and How to Avoid It?
FCA fines Tesco Bank £16.4m for failures in 2016 cyber attack
90. Course by Peter Campbell
SUMMARY, PARTING THOUGHTS,
QUESTIONS & ANSWERS
SUMMARY 1
20th Century: Data Management was relatively stable …
Then …. the Internet changed everything ! !
» On-line shopping
» Mobile devices (Smartphones, tablets, etc.)
» Social Networks
» IoT (Internet of Things)
» Big Data, Data Science
» etc . . . . .
In this century ....
➢ Companies formed in the 1990s (Amazon, eBay, Google, Alibaba) had huge growth;
➢ First decade of the 21st Century: New "disruptive" players arose: LinkedIn, Facebook,
Amazon Web Services, Twitter, Uber, Airbnb, others;
➢ IBM, HP, SAP, Microsoft, SAS, Oracle, Teradata, Dell, Apple, Cisco are still big players;
➢ Many of the big traditional players (HP, IBM, SAS . . . ) have had to transform / re-invent
themselves (often through acquisitions, mergers, shift to consultancy, etc.)
➢ Open Source software and solutions have become popular. Skills are lacking !
91. Course by Peter Campbell
SUMMARY, PARTING THOUGHTS,
QUESTIONS & ANSWERS
SUMMARY 2
Report: 50 Ways to Transform Business Processes With Big Data
92. Course by Peter Campbell
SUMMARY, PARTING THOUGHTS,
QUESTIONS & ANSWERS
Parting Thoughts 1
Unfortunately, many large companies are constrained by:
M & M’s: Maintenance & Migrations
93. Course by Peter Campbell
SUMMARY, PARTING THOUGHTS,
QUESTIONS & ANSWERS
Parting Thoughts 2
There may be a [image] in your future !
94. Course by Peter Campbell
SUMMARY, PARTING THOUGHTS,
QUESTIONS & ANSWERS
Parting Thoughts 3
Data Driven
THINK BIG, start small
Scale Up / Out (“Industrialize”)
Proof of Concept, Proof of Value
Enhance, revise, and continue to innovate
think small
95. Course by Peter Campbell
SUMMARY, PARTING THOUGHTS,
QUESTIONS & ANSWERS
Parting Thoughts 4:
Roles & Titles
19 April 2015
What are the new roles in this new era of Big Data and Data Science ?
Informationist
Agilist
Wicked Problem Slayer
Data Scientist
Professional Daydreamer
Chancellor of Intergalactic Market Development
Information Citizen (Gartner)
http://www.gartner.com/newsroom/id/3067117
Data Wrangler
96. Course by Peter Campbell
SUMMARY, PARTING THOUGHTS,
QUESTIONS & ANSWERS
Parting Thoughts 5:
Keep on learning !
Stay curious … get out of the
office … go to seminars,
conferences, meetups, follow
webinars and courses, etc.
You must be “data-driven”
. . . . or at least “data aware”
97. Course by Peter Campbell
SUMMARY, PARTING THOUGHTS,
QUESTIONS & ANSWERS
Professional Groups,
Associations, and
Communities 1
Data Management Association
Belgium & Luxembourg
http://www.dama-belux.org/
https://engage.isaca.org/belgiumchapter/aboutchapter/about
https://iapp.org/
http://www.baea.be/
+ many others …
98. Course by Peter Campbell
SUMMARY, PARTING THOUGHTS,
QUESTIONS & ANSWERS
Professional Groups,
Associations, and
Communities 2
+ others …
[ AI & Data Science Community of Belgium ]
99. Course by Peter Campbell
QUESTIONS & ANSWERS
What questions do you have ?
Peter Campbell
E-mail: peter_a_campbell@yahoo.com
GSM: +32 476 89 01 56
Thank you very much !
101. Course by Peter Campbell
ANNEXES
Made in Belgium !
https://en.wikipedia.org/wiki/Paul_Otlet
Paul Otlet
https://en.wikipedia.org/wiki/Henri_La_Fontaine
Henri La Fontaine
Nobel Peace Prize, 1913
https://en.wikipedia.org/wiki/Mundaneum
http://www.mundaneum.org/en
Rue de Nimy 76, 7000 Mons
102. Course by Peter Campbell
ANNEXES
Made in Belgium !
https://siliconcanals.nl/crowdfunding/collibra-becomes-the-first-belgian-tech-startup-unicorn-gets-100m-in-series-e-funding/
103. Course by Peter Campbell
ANNEXES
Made in Belgium !
Dries Buytaert (Original Author), Initial release January 2001
As of February 2014: more than 1,015,000 sites used Drupal
WIKIPEDIA:
https://en.wikipedia.org/wiki/Drupal
https://en.wikipedia.org/wiki/Dries_Buytaert
Business story, Forbes:
http://www.forbes.com/sites/brucerogers/2013/10/02/dries-buytaert-is-building-the-next-red-hat-like-open-source-success-story/
104. Course by Peter Campbell
ANNEXES
Made in Belgium !
https://startups.be/blog/post/colruyt-family-invests-43-million-ontoforce
105. Course by Peter Campbell
ANNEXES
Made in Belgium !
http://www.business-insight.com/
http://www.anatella.com/
106. Course by Peter Campbell
ANNEXES
Made in Belgium !
NoSQL Design Tools (for MongoDB, Cassandra, etc)
CDP: Customer Data Platform
107. Course by Peter Campbell
FUN STUFF (VIDEOS)
The Four Horsemen (Video)
https://www.youtube.com/watch?v=XCvwCcEP74Q
108. Course by Peter Campbell
FUN STUFF (VIDEOS)
(TED = Technology Entertainment Design)
Hans Rosling : No more boring Data
http://www.ted.com/talks/hans_rosling_shows_the_best_stats_you_ve_ever_seen?language=en
David McCandless : The beauty of data visualization
http://www.ted.com/talks/david_mccandless_the_beauty_of_data_visualization?language=en
Jake Porway (DataKind) : Data Science in the service of humanity
https://www.youtube.com/watch?v=fZ3xXXeVrIQ
Kevin Slavin : How Algorithms Shape our World
http://www.ted.com/talks/kevin_slavin_how_algorithms_shape_our_world?language=en
109. Course by Peter Campbell
FUN STUFF (VIDEOS)
Great "Panama Papers" presentation by Mar Cabra, at GraphConnect Europe (26 April 2016 in London)