The document discusses strategies for data markets and platforms. It notes that data has unique economic characteristics compared to other information goods, as data is heterogeneous, not always excludable, and implications of its use can be ongoing. Successful data platforms must address issues of market thickness, transaction costs, data provenance, and effective monitoring. Both centralized and decentralized market designs are discussed, with decentralized having potential advantages around trust and verification but also higher transaction costs. The conclusion emphasizes that careful attention to technical and institutional details is required for data markets due to data's unique attributes.
3. Which technology fields are the most influential?
Predicted control patent citations with co-listed technology fields
-101234
patentcitations(loggedcoefs)
1920 1930 1940 1950 1960 1970 1980 1990 2000 2010
Computer Control Digicomms Control
Data process Control Semiconductor Control
Thermal ap. Control Transport Control
Material Control Machine Control
All control
Patents listed in both
digital comm & control
Koutroumpis-Leiponen-Thomas:
“Invention Machines: How Instruments and InformationTechnologies Drive Global Technological Progress”
4. Why?
Invention machines:Applicable in many sectors; Facilitate invention in other sectors;A
broad and catalytic impact by enabling follow-on invention in many application sectors;
Generate massive knowledge spillovers over long periods of time
• Control instruments
• Digital communication
• Computer technologies
Instruments enable manipulation of material; computers enable manipulation of
information
Automation requires instrumentation
Internet ofThings
5. Web 3.0
Control instruments – sensors, indicators, logic devices,
actuators
+
Data – social, administrative, industrial, personal
+
Artificial Intelligence – algorithms, machine learning, prescriptive
analytics
=
“SecondWave of the Second Machine Age”
(Erik Brynjolfsson/MIT)
7. Printing press
§ Johannes Gutenberg (Germany 1452)
§ Reading became accessible to common people
§ Fiction, entertainment, propaganda
§ Mass education
§ Network externalities:
availability of books à
incentives to learn to read à
demand for books
8. Impact of the printing press
} 30 years later a printing shop in Florence run
by nuns charged 3 florins for 1000 copies of
Plato’s Dialogues, while a scribe would have
charged 1 florin for 1 copy
} Availability of paper from China à prices fell,
demand increased
} #books produced in 50 years following the
invention = #books produced by European
scribes in preceding 1000 years!
} Fust was suspected in Paris to be in league
with the devil – fear of novelty
è How did the printing press change
society, lifestyles, economy?
9. Expect societal changes due to Web 3.0
} Privacy needs to be defined
} Ownership of data
} Right to be forgotten – in/alienability
} Intellectual property for data
} Data security
} Legal framework
} Radical transparency
} Real-time visibility
} Data integration, inference, prediction
} New business models, new platforms, new winners
10. Information Economy vs. Data Economy
} Nonrival?
} Partially excludable?
} Experience good?
} High fixed cost/low or
constant marginal cost?
} Yes
} NOT excludable
} Yes if no metadata
No if proper metadata
} Varies: exhaust data vs.
data collected for a
purpose
Are data information goods?
11. PUZZLE: How can data be commercially
exploited?
• Data are not intellectual property
– Individual data points have no legal protection
• Essentially needs to be controlled contractually
(secrecy, organization forms, product design,
non-compete and confidentiality contracts),
– Not via intellectual property rights
• How can something so “leaky” be valuable,
commercialized?
12. Money vs. Data
} Data is viewed as the “new oil”, an asset class
} Digital currency is data on a fundamental level – streams of
bits
} Currencies rely on trust in the medium – data have
intrinsic value.
} Increasing subjectivity of data goods as we go from raw
to tagging/cleaning, aggregating, combining, processing
} Provenance is hard to prove for data, currencies are
verifiable
} Non-exchangeability of data – there is no quantum of
data with a minimum value
} Nevertheless CS researchers starting to consider “data as
money”; developing conceptual models of a “central bank
for data”
13. Content vs. Data
} Both have (some) intrinsic value
} Both governed by copyright
} On a fundamental level, content IS data and subject to
analytics (Natural Language Processing)
} But the value of record data largely comes from
combination with other data and algorithms (models,
statistics, prediction, deep learning…)
} And as a result, copyright is very weak on data
14. Economic features of digital goods –
all controversial and legally contested
Record Data Content Software Currency
Information
Type
Raw records or
structured
databases
Knowledge
(insights)
Knowledge
(instructions)
Pure value
Good Type Intermediate/
Final
Final Final Final
Alienability Variable Medium High High
Inferability High Low Low Zero
Excludability None Variable Variable High
Fungibility Variable Low Low High
Protection
Method
Secrecy Copyright Copyright or patents
in some cases
Blockchain or
other verification
technology
Protection
Aspect
Reuse Expression
(patterns)
Expression (patterns)
or insight (invention)
Transaction value
?
15. Characteristics of different data sources
Source of data Privacy
implications
Alienability Duration/
useful life
Sampling
frequency
Inferrability
Health care High Low (health, retail, social
network, locational)
>50 years Very low Low
Public sector
administration
Medium Medium (public sector) –
these usually have specific
data protection protocols
(confidential, etc)
>50 years Low Low
Manufacturing/
Operations
(sensor networks)
Medium Medium (manufacturing) -
these usually have specific
data protection protocols
(confidential, etc)
10-20 years Medium Low
Individual
behavior
High Low (health, retail, social
network)
1-5 years High High
Personal
Locational Data
Medium Medium 1-5 years Very high Medium
16. Summary I
} The economics of data goods depend on an analysis of
data characteristics
} Data are very heterogeneous
} Description, classification of data and its institutional
framework is necessary for understanding its
commercialization potential
} Overall, data goods substantially differ from other
information goods
} Excludability (protection)
} Transparency (metadata)
} Alienability (ongoing implications for individuals)
} Inferability (implications of data integration for individuals)
17. Emergence of data markets?
} Data markets will work differently in different
industries
} The legal framework is evolving à data attributes
} Competitive strategies & outcomes will depend
particularly on the fungibility, excludability,
alienability/inferability of the data in question
} Business model design with determine profit potential of
fungible, poorly excludable, alienable data
18. Types of market matching mechanisms
Matching Marketplace
design
Terms of
Exchange
Examples
One-to-one Bilateral Negotiated Data brokers
One-to-many Dispersal Standardized Twitter API
Many-to-one Harvest Implicit barter Google Services
Many-to-many Multilateral Standardized or
negotiated
InfoChimps,
Microsoft Azure
“The (unfullfilled) promise of Data Marketplaces”, P. Koutroumpis,A. Leiponen, L.Thomas
19. Bilateral: Proprietary data vs. other IP
licenses
Data Patents Trademarks Copyrights
License duration 1-2 years 10-20 years Up to 20 years 1-5 years
Exclusivity Rare Frequent Often regional Rare
Confidentiality Frequent Rare Rare Rare
Use restrictions Abundant Concise Specific Concise
Warranty ‘As is’ Frequent -- --
Obligation &
remedy
Correct/refund/replace/
update
-- -- --
Audit Frequent -- -- --
Modal fee schedule Annual subscription % of sales or
flat fee
NA Per device
“Data Contracts”, P. Koutroumpis,A. Leiponen, L .Thomas & J.Wu (2016)
20. 0%
10%
20%
30%
40%
50%
60%
70%
80%
90%
100%
Contract type
Proprietary
License
Open Database
Comons
GNU
FOI / Open
Government
0%
10%
20%
30%
40%
50%
60%
70%
80%
90%
100%
Commercial use
Not Noted
No Commercial
Use Permitted
Commercial Use
Permitted
0%
10%
20%
30%
40%
50%
60%
70%
80%
90%
100%
Data sharing
Sharing
Permitted
Share Alike
Not Noted
No Sharing
Academic
37 %
Commerci
al
19 %
Governme
nt
21 %
Non-Profit
17 %
Personal
4 %
Internatio
nal
2 %
Dispersal: 366 Open Data Contracts (T&C)
21. Multilateral: Centralized Data
Platform
• Selling data outside the
firm through the
platform
• Platform provider takes
the risk, provides
services, takes a cut
• Technical challenges in
standardization, rights
management,
• Strategic challenges in
revenue sharing, chicken
& egg etc
Data
Marketplace
Data
Providers
Algorithm
Providers
Expert
Advice
Customer Customer Customer
Complement Complement
Supply
Demand
22. Common Pool Resources (Ostrom 1990)
} Costly but not impossible to exclude
potential beneficiaries from obtaining
benefits from use
} CPR àTragedy of the Commons
} Collective action resolvesTOTC and
maintains resource if
} Clearly defined boundaries identify
legitimate users
} Rules define how CPR should be used;
metarules to change rules
} Effective monitoring to enforce rules,
boundaries
23. Decentralized Data Platform –
blockchain for data?
Aggregators
User content & sensor data
Tagging & Cleaning
Public
Ledger
…
transactionXX1
transactionXX2
transactionXX3
transactionXX4
transactionXX5
…
Trading
• “Bottom-up” approach in
information exchange
• Users and sensors collect
data
• Aggregators can buy/sell
data for profit; data
owners get paid and have
control over future uses
• Processing, analysis and
insights are separate
A
D
B
C
G
F
E
HI
“The (unfullfilled) promise of data
marketplaces”, P. Koutroumpis,
A. Leiponen & L .Thomas (2016)Processing
25. Marketplace and data typology
Matching Marketplace
design
Transaction
costs
Provenance Boundary
definition
Rules
definition
Effective
monitoring
Characteristics
of data
One-to-one Bilateral High High High High High High value, High
privacy
One-to-many Dispersal Low Low Low Low Minimal Low value, Low
privacy
Many-to-one Harvest Low Low Low Low Minimal Low value, Low
privacy
Many-to-many Multilateral
Centralized
Low Medium Medium Medium Low Medium value,
Medium privacy
Many-to-many Multilateral
Decentralized
Medium High Low High High High value,
Medium privacy
Data is no longer a
Common Pool Resource!
26. Performance of centralized and
decentralized market designs
Centralized Decentralized
Thickness Variable depending on
the rules and
membership/usage fees
Assumed to have full
participation
Congestion Assumed to have
minimal effect
Assumed to have minimal
effect
Transaction
costs
Very low Increased friction for each
transaction (can be limited
by using trusted third-party
licensing)
Decentralized:Trading off market thickness against increasing
(technical) transaction costs
http://hackingdistributed.com/2016/08/04/byzcoin/
https://www.technologyreview.com/s/600781/technical-roadblock-might-shatter-bitcoin-dreams/
27. Conclusions
} Platforms/multisided markets bring together multiple
different types of parties
} There are complementarities among the parties
} Need to engage the different sides
} Pricing and integration strategies may help in reaching critical
mass for the platform
} Successful platforms benefit from strong network effects and
scale economies and can become very profitable …and very
powerful
} Monopolization of communication and information platforms can be
societally harmful
} Algorithmic transparency/monitoring will be necessary
} How digital platforms are operationalized depends on the
nature of the service/good provided, institutional setting,
Digital Rights Management – IoT
28. Summary II
• Data really is a different kind of an intellectual asset
– Careful attention to technical, institutional detail is required!
• Trading regimes: secrecy & trust or verification technology
(blockchain?) – or ‘FREE’
– Bilateral trading sets up a complex relationship with remedies,
audits, subscriptions as contractual features
– Multilateral based on verification tech could be anonymous
and one-off – probably for more high-value data due to
computing cost
• Continuing evolution in control technologies and Artificial
Intelligence will be the “invention machines” of the 21st century
– data will be the lubricant