Apache NiFi is an open-source dataflow platform that automates the flow of data between systems. It uses a flow-based programming model in which data is routed through configurable "processors". NiFi was donated to the Apache Software Foundation by the NSA in 2014 and ships with over 285 processors for interacting with data in various formats. It provides an easy-to-use UI that lets users string processors together to move and transform data, carried in "flowfiles", through the system securely while capturing detailed provenance data.
4. Mission Statement
“Put simply, NiFi was built to automate the flow of data between systems. While the
term dataflow is used in a variety of contexts, we use it here to mean the
automated and managed flow of information between systems. This problem
space has been around ever since enterprises had more than one system, where
some of the systems created data and some of the systems consumed data. The
problems and solution patterns that emerged have been discussed and articulated
extensively.” (Apache NiFi Overview)
5. Overview
· Short for “NiagaraFiles”; donated to the Apache Software Foundation by the NSA in 2014 as part of its TTP (Technology Transfer Program).
· Flow based programming model.
· Version 1.13.2 (current release) has over 285 “processors” (components that gather or manipulate data) to interact with the data.
· Data format agnostic.
7.
● Easy-to-navigate UI
● Can be standalone or in a cluster
● Allows users to string together “processors” or write their own code
● Moves data around inside a “flowfile”
● Built-in security
  ○ LDAP, other authentication
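The idea of stringing processors together around a flowfile can be sketched in a few lines of Python. This is only a conceptual toy, not NiFi's actual API: the `FlowFile` class, the `run_flow` helper, and the three processor functions are all hypothetical names invented for illustration. It assumes a flowfile is some opaque content plus key/value attributes, and that a processor either passes a flowfile on (possibly modified) or drops it.

```python
from dataclasses import dataclass, field, replace
from typing import Callable, Dict, Iterable, List, Optional


@dataclass(frozen=True)
class FlowFile:
    """Toy stand-in for a flowfile: opaque content plus key/value attributes."""
    content: bytes
    attributes: Dict[str, str] = field(default_factory=dict)


# In this sketch a "processor" is just a function:
# it returns a (possibly new) FlowFile, or None to drop it.
Processor = Callable[[FlowFile], Optional[FlowFile]]


def tag_source(ff: FlowFile) -> Optional[FlowFile]:
    """Add an attribute without touching the content."""
    return replace(ff, attributes={**ff.attributes, "source": "demo"})


def drop_empty(ff: FlowFile) -> Optional[FlowFile]:
    """Filter out flowfiles whose content is blank."""
    return ff if ff.content.strip() else None


def to_upper(ff: FlowFile) -> Optional[FlowFile]:
    """Transform the content, leaving attributes as they are."""
    return replace(ff, content=ff.content.upper())


def run_flow(flowfiles: Iterable[FlowFile], processors: List[Processor]) -> List[FlowFile]:
    """Pass each flowfile through the processors in order."""
    out: List[FlowFile] = []
    for ff in flowfiles:
        current: Optional[FlowFile] = ff
        for processor in processors:
            current = processor(current)
            if current is None:  # a processor dropped the flowfile
                break
        if current is not None:
            out.append(current)
    return out


results = run_flow(
    [FlowFile(b"hello"), FlowFile(b"   ")],
    [tag_source, drop_empty, to_upper],
)
```

In this sketch the blank flowfile is filtered out and the remaining one leaves the flow with upper-cased content and a `source` attribute. In NiFi itself the equivalent wiring is done by connecting processors in the UI rather than in code.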
9.
● Automatically captures all changes that happen to each “flowfile”
● Has a “replay” capability
● Provenance data can be moved outside of NiFi and into something like Elasticsearch and Kibana for real-time dashboarding.
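To make the provenance and replay ideas concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how NiFi stores provenance: `ProvenanceLog`, `ProvenanceEvent`, and the step names `"ReplaceCommas"` and `"Uppercase"` are hypothetical. It assumes each step records the content before and after it ran, so a step can later be re-run from its recorded input.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class ProvenanceEvent:
    """One recorded change to a flowfile (hypothetical structure)."""
    flowfile_id: str
    processor: str
    content_before: bytes
    content_after: bytes


class ProvenanceLog:
    """Records every step applied to a flowfile and supports replaying a step."""

    def __init__(self) -> None:
        self.events: List[ProvenanceEvent] = []

    def run(self, flowfile_id: str, content: bytes, name: str,
            fn: Callable[[bytes], bytes]) -> bytes:
        """Apply fn to the content and record what changed."""
        result = fn(content)
        self.events.append(ProvenanceEvent(flowfile_id, name, content, result))
        return result

    def history(self, flowfile_id: str) -> List[ProvenanceEvent]:
        """All recorded events for one flowfile, in order."""
        return [e for e in self.events if e.flowfile_id == flowfile_id]

    def replay(self, event_index: int, fn: Callable[[bytes], bytes]) -> bytes:
        """Re-run a step, starting from the content recorded before that step."""
        event = self.events[event_index]
        return self.run(event.flowfile_id, event.content_before,
                        event.processor + " (replay)", fn)


log = ProvenanceLog()
step1 = log.run("ff-1", b"a,b,c", "ReplaceCommas", lambda c: c.replace(b",", b" "))
step2 = log.run("ff-1", step1, "Uppercase", bytes.upper)

# Replay the second step with a corrected function, using its original input.
replayed = log.replay(1, lambda c: c.upper() + b"!")
```

Because the input to each step is kept, the second step can be replayed with different logic without re-ingesting the data. The list returned by `log.history("ff-1")` is the kind of record that could be shipped to an external store for dashboarding.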
11. Technical Details
· Written in Java and runnable on Java 8 and 11.
· Zero-Leader clustering paradigm (every node performs the same tasks, each on a different set of data)
· Built for high concurrency and data streaming, but also supports batch-style operations.
· Open source and easily extensible.
· Commercially supported by Cloudera; the project itself is governed by the Apache Software Foundation.
12. Use Cases
· Data ingestion
  ○ Streaming
  ○ Batch
· Data processing
  ○ Filtering
  ○ Enrichment
  ○ Transformations
· IoT (both NiFi and MiNiFi)
  ○ MiNiFi offers edge computing
· Application data processing
  ○ Offload data processing to the platform
· Data and process automation
· Data governance