From Zero to Data Flow in Hours with Apache NiFi
DataWorks Summit/Hadoop Summit
1.
Copyright © 2016, Schlumberger, All rights reserved.
From Zero to Data Flow in Hours with Apache NiFi
Hadoop Summit – San Jose 2016
Chris Herrera, Schlumberger
2.
Agenda
• Why is composable data flow important to the drilling industry
• Current State of the System
• The Breaking Point to the new system
• An unexpected workflow in testing
• How are we using it today
• What's Next
3.
Legal Notices
This presentation is for informational purposes only. STATEMENTS AND OPINIONS EXPRESSED IN THIS PRESENTATION ARE THOSE OF THE PRESENTER AND DO NOT REFLECT THE OPINIONS OF SCHLUMBERGER. SCHLUMBERGER AND THE PRESENTER HEREBY DISCLAIM ANY REPRESENTATIONS AND/OR WARRANTIES EXPRESS OR IMPLIED. SCHLUMBERGER AND THE PRESENTER HEREBY DISCLAIM ANY RESPONSIBILITY FOR THE CONTENT, ACCURACY, AND/OR COMPLETENESS OF THE INFORMATION IN THIS PRESENTATION.
This presentation, and any recordings or reproductions in various media formats, including, without limitation, print, audio, and video, is the copyrighted work of Schlumberger, and Schlumberger hereby retains all intellectual property and/or proprietary rights related thereto. Schlumberger and the Schlumberger logo are trademarks of Schlumberger in the U.S. and/or other countries. Other names and brands referenced in this presentation are the trademarks of their respective owners, and any references thereto are not endorsements or approvals. Copyright © 2016, Schlumberger, All rights reserved.
4.
Introduction
Chris Herrera, Schlumberger
• 2 years managing product development and innovation teams working on real-time data ingestion and delivery
• 5 years of experience in the Hadoop ecosystem
• 11 years of experience with various aspects of the oilfield (operational and technical)
5.
The Major Components of a Drilling Project
[Diagram labels: Wireline, Measurement / Logging While Drilling, Mud logging, Fluids, Completions, Cementing, Rig]
• Several contractors are brought in to develop and complete the well
• These can come from one company or, most of the time, from many
• Each brings its own system, often without a central repository of data
• The site can be within decent cell connectivity, or deep in the middle of a jungle with only 128k of high-latency bandwidth
6.
Where Does This Data Need to Go?
[Diagram labels: RT Server, Operational Support, Client Monitoring, Processing and Print Centers]
7.
Workflow of Data During and Post Operations
[Diagram labels, in extraction order: Processing Center, Acquisition, Data Server, Classification & Labelling, Quality Control, Classification, Quality Control, Hosting, QC & Labelling, Conversion, Data Delivery, KPI & Reporting, Processing, Acq, Sales and Job Planning, Data Processor, Customer Manager, Client, Data Delivery, Sales, Field Engineer]
8.
What Does This Mean In A Data Sense
Input:
• DLIS
• LAS 1.2, 2.0, 3.0
• WITS Level 0, Level 1, Level 2
• CSV
• Profibus
• Modbus
RT Server
Output:
• CSV
• PDS
• LAS 1.2, 2.0, 3.0
• DLIS
9.
What Does This Mean in a Volume Sense
• ~9000 Users / Month
• ~10 Files / Minute
• ~480 Data Queries / sec
• ~3050 Wells / month
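As a rough back-of-the-envelope illustration, the per-minute and per-second rates on this slide can be scaled to daily totals. The sketch below assumes the quoted rates are sustained averages over a 24-hour day, which the slide does not state.

```python
# Approximate rates quoted on the slide.
FILES_PER_MINUTE = 10
QUERIES_PER_SECOND = 480

# Assumption (not from the slide): rates hold as averages over a full day.
MINUTES_PER_DAY = 24 * 60
SECONDS_PER_DAY = 24 * 60 * 60

files_per_day = FILES_PER_MINUTE * MINUTES_PER_DAY
queries_per_day = QUERIES_PER_SECOND * SECONDS_PER_DAY

print(f"~{files_per_day:,} files/day")      # ~14,400 files/day
print(f"~{queries_per_day:,} queries/day")  # ~41,472,000 queries/day
```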
10.
A Quick(ish) Note On The Importance of Data Provenance
[Diagram labels: Context, Fidelity, Time; Acquisition - Field; Interpretation - Office]
• Need to retain the fidelity throughout the flow.
11.
Typical Data Problems / Concerns
• What is the time zone of the data we are receiving – one day UTC...
• "Ahh, I see you did not implement that part of the standard..."
• Wait, why are you sending data at 5 times the sampling rate of the sensor...
• I did not get the memo that you were changing your data model today...
• Governmental / client data residency concerns
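Two of the problems above (ambiguous time zones and data arriving faster than the sensor samples) lend themselves to simple automated checks. The following is a minimal Python sketch, assuming the source's UTC offset and the sensor's nominal sampling period are known. The function names and the 0.5 tolerance are illustrative and do not come from the presentation.

```python
from datetime import datetime, timedelta, timezone


def to_utc(naive_timestamp: datetime, utc_offset_hours: float) -> datetime:
    """Attach a known source offset to a naive timestamp and convert it to UTC."""
    source_tz = timezone(timedelta(hours=utc_offset_hours))
    return naive_timestamp.replace(tzinfo=source_tz).astimezone(timezone.utc)


def is_oversampled(timestamps, sensor_period_s: float, tolerance: float = 0.5) -> bool:
    """Return True if the median gap between samples is well below the sensor period.

    `tolerance` is a hypothetical threshold: a median gap under
    tolerance * sensor_period_s is flagged as suspicious.
    """
    gaps = sorted((b - a).total_seconds() for a, b in zip(timestamps, timestamps[1:]))
    if not gaps:
        return False
    median_gap = gaps[len(gaps) // 2]
    return median_gap < sensor_period_s * tolerance


# Usage: a reading stamped 09:00 at UTC-5 is 14:00 UTC.
local_reading = datetime(2016, 6, 28, 9, 0, 0)
utc_reading = to_utc(local_reading, utc_offset_hours=-5)

# Samples every 0.2 s from a sensor that nominally samples once per second.
start = datetime(2016, 6, 28, 14, 0, 0, tzinfo=timezone.utc)
fast_stream = [start + timedelta(milliseconds=200 * i) for i in range(10)]
normal_stream = [start + timedelta(seconds=i) for i in range(10)]
```

Flagging rather than silently dropping the extra samples keeps the decision with whoever owns the data, which matters when fidelity has to be retained.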
12.
Current Solution…
• 100+ man years of effort over 14 years
• ~2,000,000+ lines of code
• Extreme barrier to entry for workflow changes
• Very little understanding of what happened to the data
[Diagram: Input (DLIS; LAS 1.2, 2.0, 3.0; WITS Level 0, Level 1, Level 2; CSV; Profibus; Modbus), RT Server, Output (CSV; PDS; LAS 1.2, 2.0, 3.0; DLIS)]
13.
We Needed A Simpler – Maintainable Solution…
14.
The Original Plan…
[Diagram labels: RabbitMQ, DLIS Parser, ETP Endpoint, LAS Parser, Data Writer, {} DB, Event Publisher, Node.js]
What About:
• Data cleansing
• Routing
• The ability to debug what has gone wrong
• TIME (estimated 6 man months)
15.
How does NiFi fit into the equation?
• Knowing where data came from is crucial to real-time decision making (and often missing)
• The ability to visualize the data flow at a granular level aids in troubleshooting and operational understanding
• With several processors already available, there is a low barrier to entry when it comes to data flow creation
16.
Enter NiFi…
[Diagram labels, in extraction order: Processor Creation; Data Flow Creation; Creation Play…; 10 Man Hours; ETP, WITSML 1.3.1.1 / 1.4.1.1, LAS 1.2 / 2.0; 1 Man Day]
17.
Prototype Setup
[Diagram labels: Data Source, Processor Input, Data Cleansing, Data Enrichment, { } Repo, Data Storage, Put Data, 2 Man Days]
• Append Well Name
• Append Client Name
• Append Run name
• Append Pass Name
Process Group: Get Update
Process Group: Fix Time Zone, Remove Absent indexes
[Further diagram labels: Data Cleansing, Routing]
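The enrichment and cleansing steps named on this slide can be sketched outside NiFi to show what each one does to a record. This is an illustrative Python sketch, not the processors built for the prototype: the record layout (a dict with an "index" timestamp and a "values" mapping), the function names, and the sample well and client names are all assumptions.

```python
from datetime import datetime, timedelta, timezone


def enrich(record: dict, well: str, client: str, run: str, pass_name: str) -> dict:
    """Data enrichment: append well, client, run, and pass names to a record."""
    return {**record, "well": well, "client": client, "run": run, "pass": pass_name}


def fix_time_zone(record: dict, utc_offset_hours: float) -> dict:
    """Data cleansing: convert a naive local index timestamp to UTC."""
    source_tz = timezone(timedelta(hours=utc_offset_hours))
    utc_index = record["index"].replace(tzinfo=source_tz).astimezone(timezone.utc)
    return {**record, "index": utc_index}


def remove_absent_indexes(records: list) -> list:
    """Data cleansing: drop records whose index value is missing."""
    return [r for r in records if r.get("index") is not None]


def run_flow(records: list, context: dict, utc_offset_hours: float) -> list:
    """Chain the steps: drop absent indexes, enrich, then normalize time."""
    cleaned = remove_absent_indexes(records)
    return [fix_time_zone(enrich(r, **context), utc_offset_hours) for r in cleaned]


# Hypothetical sample data: one good record and one with no index.
sample_records = [
    {"index": datetime(2016, 6, 28, 9, 0, 0), "values": {"GR": 85.2}},
    {"index": None, "values": {"GR": 90.0}},
]
sample_context = {"well": "WELL-A", "client": "CLIENT-A", "run": "RUN-1", "pass_name": "PASS-1"}
flow_output = run_flow(sample_records, sample_context, utc_offset_hours=-5)
```

Each function returns a new dict rather than mutating its input, so the original record stays available for comparison when debugging a flow.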
18.
What About Testing!
19.
Testing Landscape Today
2.2 TB Test Data
• 22 applications
• 14 different formats of data
• Data of questionable quality
• Stored on a file share
Effort
• .5 man effort / sprint on maintenance
• 2 weeks to perform a full test
20.
Step 1: Data Set Curation – Creating the Set of Reference
[Diagram: 2.2 TB Test Data (LAS 1.2, 2.0, 3.0; WITS Level 0, Level 1, Level 2; CSV) reduced to a Clean Test Data Set; 6 Hours]
21.
Step 2: Immediate Test Harness
[Diagram labels: Docker, Clean Test Data Set]
From: 2 weeks to set up a test
To:
• Step 1: Need data
• Step 2: docker pull xxx.xxx.xxx.xxx:xxxx/flowTest
• Step 3: Add put processor
• Step 4: Start dataflow
22.
Step 3: Immediate Live Data Testing
[Diagram labels: Docker, Production RT System, Processor Input, Testing Processor Group, Anonymize Data]
• Significantly cuts down time to test application against real data
• Especially in brownfield applications
• Brings a level of confidence to the project that otherwise would be missing.
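The slide does not say how the "Anonymize Data" step was implemented. One common approach is keyed hashing, which replaces identifying values with stable pseudonyms so that records from the same well still group together in tests. A minimal Python sketch under that assumption follows; the field names, key, and sample record are hypothetical.

```python
import hashlib
import hmac

# Assumption: these are the identifying fields; the slides do not list them.
SENSITIVE_FIELDS = ("well", "client")


def anonymize(record: dict, secret_key: bytes, fields=SENSITIVE_FIELDS) -> dict:
    """Return a copy of the record with sensitive fields replaced by pseudonyms.

    HMAC-SHA256 with a secret key gives the same pseudonym for the same input,
    but the original value cannot be looked up without the key.
    """
    anonymized = dict(record)
    for name in fields:
        value = anonymized.get(name)
        if value is not None:
            digest = hmac.new(secret_key, str(value).encode("utf-8"), hashlib.sha256)
            anonymized[name] = f"{name}-{digest.hexdigest()[:12]}"
    return anonymized


# Hypothetical live record leaving the production system for a test flow.
live_record = {"well": "WELL-A", "client": "CLIENT-A", "depth_m": 1523.4, "GR": 85.2}
test_record = anonymize(live_record, secret_key=b"example-key-not-for-production")
```

Measurement values pass through untouched, so the application under test still sees realistic data.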
23.
Next Steps
24.
Use Cases to be Explored for MiNiFi – Rig Data Ingestion with Provenance
[Diagram label: RT Server]
• Understanding the chain of custody from sensor to user
• Tracking the provenance of the data as it traverses through the system
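To make "chain of custody" concrete, the sketch below models data that carries a history of every step it passes through, with a content hash at each step so that changes are detectable. NiFi records provenance events itself; this Python model is only an illustration of the idea, not NiFi's or MiNiFi's API, and every class, step name, and payload here is hypothetical.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ProvenanceEvent:
    """One entry in the chain of custody."""
    step: str
    content_hash: str
    timestamp: datetime


@dataclass
class TrackedData:
    """A payload plus the ordered list of steps that have handled it."""
    payload: bytes
    history: list = field(default_factory=list)

    def record(self, step: str) -> None:
        """Append an event describing the payload as it is right now."""
        content_hash = hashlib.sha256(self.payload).hexdigest()
        self.history.append(ProvenanceEvent(step, content_hash, datetime.now(timezone.utc)))

    def apply(self, step: str, transform) -> None:
        """Transform the payload, then record the step that did it."""
        self.payload = transform(self.payload)
        self.record(step)


# Hypothetical journey from a rig sensor to a delivered file.
tracked = TrackedData(b"DEPT,GR\r\n100.0,85.2\r\n")
tracked.record("sensor-acquisition")
tracked.apply("normalize-line-endings", lambda b: b.replace(b"\r\n", b"\n"))
tracked.apply("deliver-to-user", lambda b: b)

for event in tracked.history:
    print(event.step, event.content_hash[:8])
```

Comparing consecutive hashes shows which steps altered the content and which only passed it along.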
25.
Thank You! Questions?
Editor's Notes
• Different arrival times
• Different data streams
• Exchanging data amongst themselves
• Unknown quality