At GO-JEK, we build products that help millions of Indonesians commute, shop, eat and pay, daily. Data at GO-JEK doesn't grow linearly with the business, but exponentially, as people build new products and log new activities on top of the growth of the business. We currently see 6 billion events daily, and rising.

GO-JEK currently has 18+ products. Every team publishes events as Protobuf messages to Kafka clusters in order to have a well-defined schema and to ensure backward compatibility. This makes data available to all teams for different use-cases. To make sense of this raw data, we needed a data aggregation pipeline, and we found Flink to be a good fit.

Our first use-case for real-time aggregation was Dynamic Surge Pricing. To implement it, we needed real-time counts of bookings being created and of drivers available to accept bookings, per minute per s2Id (http://s2geometry.io/). We created two Flink jobs to achieve this.

What are Daggers?

After the successful implementation of surge pricing, we realised that real-time data aggregation could solve a lot of problems. So instead of hand-coding a separate job for each use-case, we came up with a DIY solution for creating Flink jobs. We built a generic application known as DAGGERS on top of Flink that takes parameters such as the topic the user wants to read from, along with options including watermark intervals, delays and parallelism.

What is Datlantis?

To give users a DIY interface, we created a portal called Datlantis which allows them to create and deploy massive, production-grade real-time data aggregation pipelines within minutes. Datlantis uses Flink's Monitoring REST API to communicate with the Flink cluster, monitoring current jobs and deploying new ones. Users can select Kafka topics from any of our Kafka clusters and write a simple SQL query on the UI, which spawns a new Flink job. They can also select one more Kafka data-stream in order to write JOIN queries.

The Flink job pushes data to InfluxDB, which lets the user visualise it on Grafana dashboards. Once the logic of the SQL is verified using the dashboard, the Flink job is promoted to push the data to Kafka. Users manage their Flink jobs on Datlantis: they can edit, stop or restart jobs, change job configurations, and view their Flink job logs on Datlantis itself.

We push data back to Kafka so that the aggregated data is available to all the other teams. Our application FIREHOSE consumes this data from Kafka and pushes it to different sinks: relational DB, HTTP, gRPC, Influx, Redis and so on. The data is then pushed to our cold storage, which enables historical analysis.

Data Pipeline: Producer Apps → Kafka → Deserialization → DataStream → SQL → Result → Serialize
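To make the per-s2Id keying concrete, here is a minimal sketch of mapping a pickup coordinate to an S2 cell token with the Java S2 geometry library. The cell level (13) and the Jakarta coordinates are illustrative assumptions, not GO-JEK's actual configuration.

```java
import com.google.common.geometry.S2CellId;
import com.google.common.geometry.S2LatLng;

public class S2Keys {
    // Maps a pickup point to the S2 cell it falls in, so bookings and
    // available drivers can be counted per cell per minute.
    // Level 13 (cells roughly 1 km across) is an illustrative choice.
    static String cellToken(double lat, double lng, int level) {
        S2LatLng point = S2LatLng.fromDegrees(lat, lng);
        return S2CellId.fromLatLng(point).parent(level).toToken();
    }

    public static void main(String[] args) {
        // A point in Jakarta, purely for illustration.
        System.out.println(cellToken(-6.2088, 106.8456, 13));
    }
}
```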
5. Data Aggregation
● Started with one use-case: Dynamic Surge Pricing
● Hand-coded Flink jobs in 4 weeks
● 20 other use-cases in the pipeline
● Created a DIY platform
7. Daggers
● Generic Flink job
● Reads data from Kafka
● Deserializes Protobuf messages
● Can process up to 2 streams
● Aggregated data from the stream(s) is sent to a sink (a job skeleton is sketched below)
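A minimal sketch of what such a generic job looks like on Flink's DataStream API. The `booking-log` topic and broker address are hypothetical, and `SimpleStringSchema` stands in for the Protobuf deserializer a real Dagger would build from the topic's schema.

```java
import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class DaggerSkeleton {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "kafka:9092"); // placeholder
        props.setProperty("group.id", "dagger-demo");

        // A real Dagger deserializes Protobuf here; a string schema keeps
        // this sketch self-contained.
        DataStream<String> events = env.addSource(
                new FlinkKafkaConsumer<>("booking-log", new SimpleStringSchema(), props));

        events.print(); // a Dagger registers a table and runs SQL instead

        env.execute("dagger-sketch");
    }
}
```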
8. Dagger Insights
1. Kafka Connector: consume Protobuf-encoded data from Kafka
2. Streaming Table Source: decode the data and register a streaming table and UDFs
3. Flink SQL: apply SQL and generate the result (example below)
4. Data Stream Sink: sink the result to Influx or Kafka sinks
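A sketch of steps 2 and 3 using Flink's Table API (1.13+). The `bookings` schema is hypothetical and the JSON format stands in for Protobuf; the query counts bookings per s2Id per minute, the kind of SQL a user would submit.

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class DaggerSql {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // Hypothetical table; real Dagger schemas come from the topic's
        // Protobuf descriptor.
        tEnv.executeSql(
                "CREATE TABLE bookings (" +
                "  s2id STRING," +
                "  event_time TIMESTAMP(3)," +
                "  WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND" +
                ") WITH (" +
                "  'connector' = 'kafka'," +
                "  'topic' = 'booking-log'," +
                "  'properties.bootstrap.servers' = 'kafka:9092'," +
                "  'scan.startup.mode' = 'latest-offset'," +
                "  'format' = 'json'" +
                ")");

        // Bookings per s2Id per one-minute tumbling window.
        tEnv.executeSql(
                "SELECT s2id," +
                "       TUMBLE_END(event_time, INTERVAL '1' MINUTE) AS window_end," +
                "       COUNT(*) AS bookings" +
                " FROM bookings" +
                " GROUP BY s2id, TUMBLE(event_time, INTERVAL '1' MINUTE)").print();
    }
}
```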
10. Datlantis
● DIY platform
● User friendly interface to a fully automated system
● Creates and deploys DAGGERS in minutes using SQL-like syntax
11. What does it do?
● Uses Flink's Monitoring REST API (see the sketch below)
● Gets the status of currently running jobs
● Creates new Dagger jobs on the Flink cluster
● Stops any running job
● Edits any running job
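A sketch of the two most basic calls against Flink's Monitoring REST API (list jobs, cancel a job) using Java's built-in HTTP client; the JobManager host is a placeholder.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FlinkRestDemo {
    // REST endpoint of the Flink JobManager (placeholder host).
    static final String FLINK = "http://flink-jobmanager:8081";
    static final HttpClient HTTP = HttpClient.newHttpClient();

    // List jobs and their states, as Datlantis does for running Daggers.
    static String listJobs() throws Exception {
        HttpRequest req = HttpRequest.newBuilder(
                URI.create(FLINK + "/jobs/overview")).GET().build();
        return HTTP.send(req, HttpResponse.BodyHandlers.ofString()).body();
    }

    // Cancel a job by id, the call behind a "stop" button.
    static int cancelJob(String jobId) throws Exception {
        HttpRequest req = HttpRequest.newBuilder(
                URI.create(FLINK + "/jobs/" + jobId + "?mode=cancel"))
                .method("PATCH", HttpRequest.BodyPublishers.noBody())
                .build();
        return HTTP.send(req, HttpResponse.BodyHandlers.ofString()).statusCode();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(listJobs());
    }
}
```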
23. Time Series Sink
● Preview mode
● Default data sink
● Integrated with Grafana, used for monitoring & alerting (a minimal write sketch follows)
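A minimal sketch of writing one aggregated row to InfluxDB with the influxdb-java client; the connection details, database, measurement and field names are all illustrative.

```java
import java.util.concurrent.TimeUnit;

import org.influxdb.InfluxDB;
import org.influxdb.InfluxDBFactory;
import org.influxdb.dto.Point;

public class InfluxSinkSketch {
    public static void main(String[] args) {
        // Placeholder connection details.
        InfluxDB influx = InfluxDBFactory.connect(
                "http://influxdb:8086", "user", "password");
        influx.setDatabase("dagger_preview");

        // One aggregated row: bookings per s2Id for a one-minute window.
        Point point = Point.measurement("booking_counts")
                .time(System.currentTimeMillis(), TimeUnit.MILLISECONDS)
                .tag("s2id", "3344f09")
                .addField("bookings", 42L)
                .build();

        influx.write(point);
        influx.close();
    }
}
```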
25. Kafka Sink
● Publish to Kafka topic
● Another DIY tool, FIREHOSE, sinks data from Kafka to one of the following (sketched below):
○ Services - HTTP or GRPC
○ DB - relational OR time series
○ Analytics platforms - Clevertap or Mixpanel
○ Log - for debugging
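A sketch of the general shape of such a tool: a Kafka consumer that hands every record to a pluggable sink. The topic, broker address and `Sink` interface are hypothetical; FIREHOSE's real abstractions differ.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;

public class FirehoseSketch {
    // A sink only needs to know how to push one record.
    interface Sink {
        void push(byte[] payload) throws Exception;
    }

    public static void run(String topic, Sink sink) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka:9092"); // placeholder
        props.put("group.id", "firehose-demo");
        props.put("key.deserializer", ByteArrayDeserializer.class.getName());
        props.put("value.deserializer", ByteArrayDeserializer.class.getName());

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of(topic));
            while (true) {
                ConsumerRecords<byte[], byte[]> records =
                        consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<byte[], byte[]> record : records) {
                    sink.push(record.value()); // HTTP, gRPC, DB, Redis...
                }
                consumer.commitSync();
            }
        }
    }
}
```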
29. 1K+ REAL TIME DAGGERS
● Spanned over 6 Flink clusters
● Most of them created by analysts
● Actively used for monitoring
● Dashboards created are used by city heads
2 min TO PRODUCTION
● Single form to create a DAGGER
● The data can be sent to a sink
● Data ready to be consumed as soon as generated
1+ TB DATA PROCESSED EVERY DAY
● Real-time data analysis across all clusters
● Processed data is sent to one of the sinks
30. Deployment
● We have Flink clusters on Yarn and Kubernetes
● Checkpointing: HDFS and Google Cloud Storage (config sketched below)
● Dagger Kubernetes controller:
○ Makes the job JAR available on the Flink cluster
○ Scales the cluster when more slots are needed
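A sketch of the corresponding checkpoint configuration in a Flink job (1.13+ API); the interval and storage paths are illustrative.

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointConfigSketch {
    public static void main(String[] args) {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Snapshot state every minute; the interval is illustrative.
        env.enableCheckpointing(60_000);

        // Durable checkpoint storage: an HDFS path on Yarn clusters,
        // or a gs:// path on Kubernetes with the GCS filesystem plugin.
        env.getCheckpointConfig()
           .setCheckpointStorage("hdfs:///flink/checkpoints");
        // e.g. "gs://my-bucket/flink/checkpoints" on Google Cloud Storage
    }
}
```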
34. Alerting
● Automated alerts from Datlantis
● Users are provided with a Health dashboard
● Alerts are sent to specific teams via their Slack channels and PagerDuty (a webhook sketch follows)
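A sketch of the Slack side of such an alert: posting a message to an incoming webhook with Java's HTTP client. The webhook URL and alert text are made up.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SlackAlertSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical incoming-webhook URL; each team registers its own.
        String webhook = "https://hooks.slack.com/services/T000/B000/XXXX";
        String body = "{\"text\": \"Dagger bookings-per-s2id: no output for 5 min\"}";

        HttpRequest req = HttpRequest.newBuilder(URI.create(webhook))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> resp = HttpClient.newHttpClient()
                .send(req, HttpResponse.BodyHandlers.ofString());
        System.out.println(resp.statusCode());
    }
}
```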
35. Impact
01. 5+ billion messages/day
● For system uptime
● Across 500 microservices
02. 44,000 geolocations
● For dynamic surge pricing
● Demand & supply
03. 25+ metrics
● For allocation metrics
● Created & maintained by analysts
04. User segmentation & real-time triggers
● For growth campaigns
● 26% better conversion
36. Let's talk!
Prakhar Mathur
Medium: @prakharmathur_345
Rohil Surana
Medium: @rohilsurana