Bay Area Apache Flink Meetup Community Update August 2015

Henry Saputra

Bay Area Apache Flink Meetup Community Update August 2015 at MapR

Bay Area Apache Flink Meetup #2
Distributed Stream and Graph Processing
Community Update
August 2015
Henry Saputra
Committer and PMC Member
hsaputra@apache.org
@Kingwulf

Apache Flink is an open source platform for
scalable batch and stream data processing.
Apache Flink is …
• The core of Apache Flink is a distributed streaming dataflow engine.
• Executing dataflows in parallel on clusters
• Providing a reliable foundation for various workloads
• DataSet and DataStream programming abstractions are the foundation for user programs and higher layers

One engine for many use cases
• Real-time streaming topologies
• Machine learning at scale
• Graph analysis
• Long batch pipelines

What happened? - 1
• New PMC member: Maximilian Michels
• New committer: Chesnay Schepler
• Discussions for a 0.9.1 release have started
• Apache Flink is becoming more popular:
– 1000+ Twitter followers
– 500+ GitHub stars
– Named one of five "open source Big Data projects to watch" by ZDNet
– Flink Forward schedule with great speakers announced

What happened? - 2
• Apache Flink on Wikipedia: https://en.wikipedia.org/wiki/Apache_Flink
• New JobManager Dashboard
• Apache SAMOA 0.3.0-incubating with Flink integration
• New “Features” page
• Contributors list (can you spot your name?): https://cwiki.apache.org/confluence/display/FLINK/List+of+contributors

New Job Manager Dashboard

New Website Redesign and New Features page

New architecture diagram in the 0.10 documentation

More content in the wiki on Flink internals

In master (0.10-SNAPSHOT) - 1
• Gelly Scala API
• More improvements and fixes for YARN
• Flink dropped Java 6 support
• Streaming connector for Elasticsearch
• Sampling operation on DataSet API
• A lot of bug fixes:
– Streaming: APIs, general stability, Kafka connector
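
As a rough illustration of what a fixed-size sampling operation over a data set involves, here is a reservoir sampling sketch in plain Python. It does not use Flink, and the slides do not say which sampling strategy the DataSet API applies. The name `reservoir_sample` and its parameters are hypothetical, chosen only for this example.

```python
import random


def reservoir_sample(items, k, seed=None):
    """Return up to k items drawn uniformly at random from an iterable.

    Hypothetical helper for illustration only; this is generic reservoir
    sampling, not Flink's implementation.
    """
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(items):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep the new item with probability k / (i + 1) by picking a
            # random slot in [0, i] and replacing only if it falls in [0, k).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    print(reservoir_sample(range(1000), 10, seed=42))
```

The input is read once and only `k` items are held in memory, which is why this style of sampling suits data sets whose size is not known in advance.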

In master (0.10-SNAPSHOT) - 2
• Low watermarks / Event time
• New JM Dashboard
• Akka messages are now aware of leader IDs (for HA)
• ZooKeeper integration (for HA)
• Live accumulators (runtime only)
• Stability improvements
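
The "Low watermarks / Event time" item can be illustrated with a small conceptual sketch in plain Python. This is not Flink's API. The function name, the tumbling-window layout, and the bounded out-of-orderness rule (watermark = largest event time seen so far minus a fixed allowance) are all assumptions made for the example.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_size, max_out_of_orderness):
    """Count events per event-time tumbling window.

    events: iterable of (event_time, payload) in arrival order, which may
    differ from event-time order. A window is emitted once the low
    watermark passes its end. Hypothetical helper for illustration only.
    """
    open_windows = defaultdict(int)  # window start -> running count
    emitted = []                     # (window start, count) in emission order
    late = []                        # events whose window was already emitted
    max_seen = float("-inf")
    watermark = float("-inf")

    for event_time, payload in events:
        start = (event_time // window_size) * window_size
        if start + window_size <= watermark:
            # The watermark already passed this window: the event is late.
            late.append((event_time, payload))
            continue
        open_windows[start] += 1
        max_seen = max(max_seen, event_time)
        watermark = max_seen - max_out_of_orderness
        # Emit every open window that ends at or before the watermark.
        for done in sorted(w for w in open_windows if w + window_size <= watermark):
            emitted.append((done, open_windows.pop(done)))

    # End of input: flush whatever is still open.
    for start in sorted(open_windows):
        emitted.append((start, open_windows.pop(start)))
    return emitted, late


if __name__ == "__main__":
    stream = [(1, "a"), (4, "b"), (2, "c"), (11, "d"),
              (3, "e"), (14, "f"), (5, "g"), (25, "h")]
    print(tumbling_window_counts(stream, window_size=10, max_out_of_orderness=3))
```

In this sketch the event with time 3 arrives after the event with time 11 and is still counted in the first window, because the watermark has not yet reached 10. The event with time 5 arrives after the watermark has reached 11 and is reported as late.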

Articles and Mentions
• High-throughput, low-latency, and exactly-once stream processing with Apache Flink [1]
• Introducing Gelly: Graph Processing with Apache Flink [2]
• Apache Flink and the case for stream processing [3]
• Crunching Parquet Files with Apache Flink [4]
• The morning paper: Asynchronous Distributed Snapshots for Distributed Dataflows [5]
• Five open source Big Data projects to watch [6]
• Big Data Performance Engineering: Examples from Hadoop, Pig, HBase, Flink and Spark [7]

[1] http://data-artisans.com/high-throughput-low-latency-and-exactly-once-stream-processing-with-apache-flink/
[2] http://flink.apache.org/news/2015/08/24/introducing-flink-gelly.html
[3] http://www.kdnuggets.com/2015/08/apache-flink-stream-processing.html
[4] https://medium.com/@istanbul_techie/crunching-parquet-files-with-apache-flink-200bec90d8a7
[5] http://blog.acolyer.org/2015/08/19/asynchronous-distributed-snapshots-for-distributed-dataflows/
[6] http://www.zdnet.com/article/five-open-source-big-data-projects-to-watch/
[7] http://www.bigsynapse.com/addressing-big-data-performance

New Meetups and Events
• Chicago: Flink Training @ Capital One
• Bay Area: Stream & Graph Processing @ MapR

GitHub stats

Upcoming
• Sept 15: Washington DC Area Apache Flink Meetup
• Sept 17: StreamProcessing.be meetup
• Sept 28-30: Flink talks at ApacheCon Big Data Budapest
New Meetup groups:
• New York
• Boston

Flink Forward schedule published
• http://flink-forward.org/?post_type=day
• Talks by Google, Data Artisans, Huawei, CapitalOne, Bouygues, Ericsson, Amadeus, ResearchGate, RedHat, and many more.
• 50% off for this meetup's guests: FlinkMeetupBayArea50

