SlideShare a Scribd company logo
1 of 15
F# Data
Making structured data first-class citizens
Tomas Petricek, University of Cambridge
Project homepage: http://fsharp.github.io/FSharp.Data
Get in touch: @tomaspetricek | tomas@tomasp.net
F# Software Foundation
http://www.fsharp.org
software stacks
trainings teaching F# user groups snippets
mac and linux cross-platform tutorials
F# community open-source MonoDevelop
contributions research support
consultancy mailing list
F# Data type providers
First-class data
CSV, REST, WorldBank…
R Type provider
Statistics & visualization
5000 tested packages
www.fslab.org
Deedle data frame
Data exploration
Indexing and aggregation
F# Charting library
Simple & composable
Interactive style
www.fslab.org
What are type providers?
Integrating WorldBank and R
http://youtu.be/7r2-B-5H_io
The confusion of languages
What are type providers?
What are type providers?
Type provider research questions
Data vs.
Schema
Laziness
Mapping
to types
Schema
inference
Schema
inference
Schema inference
Loading Titanic data from CSV
http://youtu.be/yjBdZduc0ko
Inferring primitive types
null intbool
string
decimal
float
Structure inference
Working with XML and JSON data
http://youtu.be/_DjX0ybaXZY
http://youtu.be/SkZBzlREOMo
Inferring structured types
person { name : string } person { name : string, age : int }
person { name : string, age : int option }
[ { num : int } ] [ { str : string } ]
[ { num : int option, str : string option } ]
int { value : int }
int + { value : int }
Does it scale?
Query movies using Apiary provider
http://youtu.be/-Am2uRUv39c
Conclusions
Inference from small-scale samples works!
Schema is (very) often missing
But data is (very) often regular
Check out F# Data and contribute!
Project homepage: http://fsharp.github.io/FSharp.Data
Get in touch: @tomaspetricek | tomas@tomasp.net

More Related Content

What's hot

Oles Petriv “Creating one concept embedding space for persons, brands and new...
Oles Petriv “Creating one concept embedding space for persons, brands and new...Oles Petriv “Creating one concept embedding space for persons, brands and new...
Oles Petriv “Creating one concept embedding space for persons, brands and new...
Lviv Startup Club
 
AjayBhullar_Resume (5)
AjayBhullar_Resume (5)AjayBhullar_Resume (5)
AjayBhullar_Resume (5)
Ajay Bhullar
 

What's hot (20)

Benchmark BIB-R @ TPDL 2016
Benchmark BIB-R @ TPDL 2016Benchmark BIB-R @ TPDL 2016
Benchmark BIB-R @ TPDL 2016
 
Crowdsourcing the Quality of Knowledge Graphs: A DBpedia Study
Crowdsourcing the Quality of Knowledge Graphs:A DBpedia StudyCrowdsourcing the Quality of Knowledge Graphs:A DBpedia Study
Crowdsourcing the Quality of Knowledge Graphs: A DBpedia Study
 
Newspapers, IIIF, and ALTO
Newspapers, IIIF, and ALTONewspapers, IIIF, and ALTO
Newspapers, IIIF, and ALTO
 
A Closer Look at the Changing Dynamics of DBpedia Mappings
A Closer Look at the Changing Dynamics of DBpedia MappingsA Closer Look at the Changing Dynamics of DBpedia Mappings
A Closer Look at the Changing Dynamics of DBpedia Mappings
 
Milex 2010 final
Milex 2010 finalMilex 2010 final
Milex 2010 final
 
Topical_Facets
Topical_FacetsTopical_Facets
Topical_Facets
 
Creating a FRBRized Data Structure in XML
Creating a FRBRized Data Structure in XMLCreating a FRBRized Data Structure in XML
Creating a FRBRized Data Structure in XML
 
Data Quality Assessment in Europeana: Metrics for Multilinguality
Data Quality Assessment in Europeana:  Metrics for MultilingualityData Quality Assessment in Europeana:  Metrics for Multilinguality
Data Quality Assessment in Europeana: Metrics for Multilinguality
 
Search4similars
Search4similarsSearch4similars
Search4similars
 
Analyzing poetry databases to develop a metadata application profile. Why eac...
Analyzing poetry databases to develop a metadata application profile. Why eac...Analyzing poetry databases to develop a metadata application profile. Why eac...
Analyzing poetry databases to develop a metadata application profile. Why eac...
 
Sparql querying of-property-graphs-harsh thakkar-graph day 2017 sf
Sparql querying of-property-graphs-harsh thakkar-graph day 2017 sfSparql querying of-property-graphs-harsh thakkar-graph day 2017 sf
Sparql querying of-property-graphs-harsh thakkar-graph day 2017 sf
 
Oles Petriv “Creating one concept embedding space for persons, brands and new...
Oles Petriv “Creating one concept embedding space for persons, brands and new...Oles Petriv “Creating one concept embedding space for persons, brands and new...
Oles Petriv “Creating one concept embedding space for persons, brands and new...
 
Link Discovery Tutorial Introduction
Link Discovery Tutorial IntroductionLink Discovery Tutorial Introduction
Link Discovery Tutorial Introduction
 
Entity Retrieval (WSDM 2014 tutorial)
Entity Retrieval (WSDM 2014 tutorial)Entity Retrieval (WSDM 2014 tutorial)
Entity Retrieval (WSDM 2014 tutorial)
 
Entity Retrieval (SIGIR 2013 tutorial)
Entity Retrieval (SIGIR 2013 tutorial)Entity Retrieval (SIGIR 2013 tutorial)
Entity Retrieval (SIGIR 2013 tutorial)
 
AjayBhullar_Resume (5)
AjayBhullar_Resume (5)AjayBhullar_Resume (5)
AjayBhullar_Resume (5)
 
Healthcare Data Management using Domain Specific Languages for Metadata Manag...
Healthcare Data Management using Domain Specific Languages for Metadata Manag...Healthcare Data Management using Domain Specific Languages for Metadata Manag...
Healthcare Data Management using Domain Specific Languages for Metadata Manag...
 
Summary of GSCL 2013 international NLP conference in Germany
Summary of GSCL 2013 international NLP conference in GermanySummary of GSCL 2013 international NLP conference in Germany
Summary of GSCL 2013 international NLP conference in Germany
 
Linked Data APIs (Funding Circle May 2015)
Linked Data APIs (Funding Circle May 2015)Linked Data APIs (Funding Circle May 2015)
Linked Data APIs (Funding Circle May 2015)
 
Session 1.6 fostering interoperability of european qualifications: the qual...
Session 1.6   fostering interoperability of european qualifications: the qual...Session 1.6   fostering interoperability of european qualifications: the qual...
Session 1.6 fostering interoperability of european qualifications: the qual...
 

Similar to F# Data: Making structured data first class citizens

EDF2012 Mariana Damova - Factforge
EDF2012   Mariana Damova - FactforgeEDF2012   Mariana Damova - Factforge
EDF2012 Mariana Damova - Factforge
European Data Forum
 
Semantic Pipes and Semantic Mashups
Semantic Pipes and Semantic MashupsSemantic Pipes and Semantic Mashups
Semantic Pipes and Semantic Mashups
giurca
 
April 8 NISO Webinar: Experimenting with BIBFRAME: Reports from Early Adopters
April 8 NISO Webinar: Experimenting with BIBFRAME: Reports from Early AdoptersApril 8 NISO Webinar: Experimenting with BIBFRAME: Reports from Early Adopters
April 8 NISO Webinar: Experimenting with BIBFRAME: Reports from Early Adopters
National Information Standards Organization (NISO)
 
Semantic Search Summer School2009
Semantic Search Summer School2009Semantic Search Summer School2009
Semantic Search Summer School2009
Peter Mika
 
DistLODStats: Distributed Computation of RDF Dataset Statistics - ISWC 2018 talk
DistLODStats: Distributed Computation of RDF Dataset Statistics - ISWC 2018 talkDistLODStats: Distributed Computation of RDF Dataset Statistics - ISWC 2018 talk
DistLODStats: Distributed Computation of RDF Dataset Statistics - ISWC 2018 talk
Gezim Sejdiu
 

Similar to F# Data: Making structured data first class citizens (20)

F# and Financial Data Making Data Analysis Simple
F# and Financial Data Making Data Analysis SimpleF# and Financial Data Making Data Analysis Simple
F# and Financial Data Making Data Analysis Simple
 
IIIF for CNI Spring 2014 Membership Meeting
IIIF for CNI Spring 2014 Membership MeetingIIIF for CNI Spring 2014 Membership Meeting
IIIF for CNI Spring 2014 Membership Meeting
 
Wimmics Overview 2021
Wimmics Overview 2021Wimmics Overview 2021
Wimmics Overview 2021
 
Make our Scientific Datasets Accessible and Interoperable on the Web
Make our Scientific Datasets Accessible and Interoperable on the WebMake our Scientific Datasets Accessible and Interoperable on the Web
Make our Scientific Datasets Accessible and Interoperable on the Web
 
Data Portability with SIOC and FOAF
Data Portability with SIOC and FOAFData Portability with SIOC and FOAF
Data Portability with SIOC and FOAF
 
EDF2012 Mariana Damova - Factforge
EDF2012   Mariana Damova - FactforgeEDF2012   Mariana Damova - Factforge
EDF2012 Mariana Damova - Factforge
 
Linked Open Data
Linked Open DataLinked Open Data
Linked Open Data
 
Semantic Pipes and Semantic Mashups
Semantic Pipes and Semantic MashupsSemantic Pipes and Semantic Mashups
Semantic Pipes and Semantic Mashups
 
Publishing data on the Semantic Web
Publishing data on the Semantic WebPublishing data on the Semantic Web
Publishing data on the Semantic Web
 
Information Extraction and Linked Data Cloud
Information Extraction and Linked Data CloudInformation Extraction and Linked Data Cloud
Information Extraction and Linked Data Cloud
 
April 8 NISO Webinar: Experimenting with BIBFRAME: Reports from Early Adopters
April 8 NISO Webinar: Experimenting with BIBFRAME: Reports from Early AdoptersApril 8 NISO Webinar: Experimenting with BIBFRAME: Reports from Early Adopters
April 8 NISO Webinar: Experimenting with BIBFRAME: Reports from Early Adopters
 
How the Web can change social science research (including yours)
How the Web can change social science research (including yours)How the Web can change social science research (including yours)
How the Web can change social science research (including yours)
 
A Semantic Multimedia Web (Part 3)
A Semantic Multimedia Web (Part 3)A Semantic Multimedia Web (Part 3)
A Semantic Multimedia Web (Part 3)
 
Linking library data
Linking library dataLinking library data
Linking library data
 
Ranking Resources in Folksonomies by Exploiting Semantic and Context-specific...
Ranking Resources in Folksonomies by Exploiting Semantic and Context-specific...Ranking Resources in Folksonomies by Exploiting Semantic and Context-specific...
Ranking Resources in Folksonomies by Exploiting Semantic and Context-specific...
 
Open Data Mashups: linking fragments into mosaics
Open Data Mashups: linking fragments into mosaicsOpen Data Mashups: linking fragments into mosaics
Open Data Mashups: linking fragments into mosaics
 
Semantic Search Summer School2009
Semantic Search Summer School2009Semantic Search Summer School2009
Semantic Search Summer School2009
 
The Semantic Data Web, Sören Auer, University of Leipzig
The Semantic Data Web, Sören Auer, University of LeipzigThe Semantic Data Web, Sören Auer, University of Leipzig
The Semantic Data Web, Sören Auer, University of Leipzig
 
DistLODStats: Distributed Computation of RDF Dataset Statistics - ISWC 2018 talk
DistLODStats: Distributed Computation of RDF Dataset Statistics - ISWC 2018 talkDistLODStats: Distributed Computation of RDF Dataset Statistics - ISWC 2018 talk
DistLODStats: Distributed Computation of RDF Dataset Statistics - ISWC 2018 talk
 
Methodological Guidelines for Publishing Linked Data
Methodological Guidelines for Publishing Linked DataMethodological Guidelines for Publishing Linked Data
Methodological Guidelines for Publishing Linked Data
 

More from Tomas Petricek

Queries in general purpose languages
Queries in general purpose languagesQueries in general purpose languages
Queries in general purpose languages
Tomas Petricek
 

More from Tomas Petricek (13)

Coeffects: A Calculus of Context-Dependent Computation
Coeffects: A Calculus of Context-Dependent ComputationCoeffects: A Calculus of Context-Dependent Computation
Coeffects: A Calculus of Context-Dependent Computation
 
Creating Domain Specific Languages in F#
Creating Domain Specific Languages in F#Creating Domain Specific Languages in F#
Creating Domain Specific Languages in F#
 
F# Type Providers in Depth
F# Type Providers in DepthF# Type Providers in Depth
F# Type Providers in Depth
 
Asynchronous programming in F# (QCon 2012)
Asynchronous programming in F# (QCon 2012)Asynchronous programming in F# (QCon 2012)
Asynchronous programming in F# (QCon 2012)
 
Queries in general purpose languages
Queries in general purpose languagesQueries in general purpose languages
Queries in general purpose languages
 
Docase notation for Haskell
Docase notation for HaskellDocase notation for Haskell
Docase notation for Haskell
 
Accessing loosely structured data from F# and C#
Accessing loosely structured data from F# and C#Accessing loosely structured data from F# and C#
Accessing loosely structured data from F# and C#
 
F# on the Server-Side
F# on the Server-SideF# on the Server-Side
F# on the Server-Side
 
F# Tutorial @ QCon
F# Tutorial @ QConF# Tutorial @ QCon
F# Tutorial @ QCon
 
Teaching F#
Teaching F#Teaching F#
Teaching F#
 
F# in MonoDevelop
F# in MonoDevelopF# in MonoDevelop
F# in MonoDevelop
 
Academia
AcademiaAcademia
Academia
 
Concurrent programming with Agents
Concurrent programming with AgentsConcurrent programming with Agents
Concurrent programming with Agents
 

Recently uploaded

Recently uploaded (20)

Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 

F# Data: Making structured data first class citizens