Flink Case Study: OKKAM
1. A Semantic Big Data Companion
Stefano Bortoli
bortoli@okkam.it
Flavio Pompermaier
pompermaier@okkam.it
2. The company (briefly)
• Okkam is
– an SME based in Trento, Italy
– started as a spin-off of the
University of Trento and FBK (2010)
• Okkam core business is
– large-scale data integration using
semantic technologies and
an Entity Name System
• Okkam operative sectors
– Services for public administration
– Services for restaurants (and more)
– Research projects
• FP7, H2020, and Local agencies
3. Who we are
• Stefano Bortoli, PhD
– works as technical director and researcher at Okkam S.R.L.
(Trento, Italy). His research and development interests are in the
area of Information Integration, with a special focus on entity-centric
applications exploiting semantic technologies.
• Flavio Pompermaier, MSc.
– works as senior software engineer at Okkam S.R.L. (Trento, Italy).
Flavio is a passionate developer working with state-of-the-art
technologies, combining semantic and big data technologies.
5. Why we need Flink
• Entiton data model: database record + RDF statement
• Storage: triplestore, NoSQL store & index (vs. an expensive data warehouse)
• Each quad carries: subject local IRI, subject ENS IRI, predicate, object, object type, RDF type, provenance IRI
[diagram: Entiton data model]
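As a rough illustration, the Entiton quad sketched on this slide can be modeled as a plain record. This is only a sketch reconstructed from the slide's labels; every field name and IRI below is hypothetical, assuming a quad that pairs the subject's local IRI with its global ENS identifier and carries object type, RDF type, and provenance alongside the usual subject/predicate/object:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class EntitonQuad:
    """One statement in an Entiton: an RDF quad enriched with entity
    identifiers and type information (field names are illustrative)."""
    subject_local_iri: str    # subject IRI within the local dataset
    subject_ens_iri: str      # global subject IRI from the Entity Name System
    predicate: str            # property IRI
    obj: str                  # object value (IRI or literal)
    obj_type: Optional[str]   # datatype or class of the object
    rdf_type: Optional[str]   # RDF type of the subject
    provenance_iri: str       # named graph / source of the statement

# Hypothetical statement about a restaurant entity (all IRIs invented)
q = EntitonQuad(
    subject_local_iri="http://data.example.org/restaurant/42",
    subject_ens_iri="http://example.org/ens/entity-123",
    predicate="http://xmlns.com/foaf/0.1/name",
    obj="Trattoria Example",
    obj_type="http://www.w3.org/2001/XMLSchema#string",
    rdf_type="http://schema.org/Restaurant",
    provenance_iri="http://data.example.org/graph/source-a",
)
```

Because each quad keeps both the local IRI and the ENS IRI, statements from different sources about the same real-world entity can be grouped without a centralized warehouse schema.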
6. Why we are here
• We want to build and manage (very) large
entity-centric knowledge bases
• We have used Flink as our data
processing framework since its Stratosphere days (during the DOPA FP7 project)
• Our use cases for Apache Flink:
– Domain reasoning (Flink + Parquet + Thrift)
– RDF data lifecycle (Flink + Parquet + Jena/Sesame)
– RDF data intelligence (Flink + ELKiBi)
– Duplicate record detection (Flink + HBase + Solr)
– Entiton Record linkage (Flink + MongoDB + Kryo)
– Telemetry analysis (Flink + MongoDB + Weka)
7. Come to our session!
• We are the last to present, don’t leave us ALONE!
• We are hiring! (maybe ;-)