Assisting millions of active users in real-time - Alexey Brodovshuk, Kcell; Krzysztof Zarzycki, GetInData

Assisting Millions of User in Real-Time
Big Data Tech Warsaw 2018

2
The Speakers
Who are these guys?
Alexey Brodovshuk
@alexeybrod
Krzysztof Zarzycki
@k_zarzyk

3
About Kcell
Kcell JSC is a part of the largest Scandinavian telecommunications holding – TeliaCompany
Kcell has a strong software
development team and lots
of experience in building
services and products
We like innovations
> 10 000 000 subscribers
Largest GSM operator
in Kazakhstan
4G (40%), 3G (73%), 2G
(96%) population
Great network
coverage
There is the ongoing
process of company digital
transformation
Not only telco

4
Business needs
Assisting Millions of User in Real-Time
SMS events
Voice usage events
Data usage events
Roaming events
Location events
Input Process Actions

5
Use Cases
Use case scenarios. Just few of many.
Case
If subscriber top-ups her balance too often in
short period of time. We can offer her a less
expensive tariff or auto-payment services.
Balance Top Up Case
Trigger
UI

6
Roaming
Fraud
Trigger to Marketing Platform if subscriber
visited X country OR/AND registered in Y
visited mobile network and his device's type
is Z
Roaming case
Send an email to the anti-fraud unit if
subscriber registered in roaming but his
balance at the moment is equal to 0.
This situation is impossible in standard case.
Fraud case in roaming

7
Old System
Why did we start to look for the new solution?
External Vendor
Solution
Blackbox Solution
Scalability issues
Not reliable
1
2
3
Kcell Developers can’t fix, tweak or optimize it
Limited to ~2000 events / sec
Can’t support all needed data sources
Multiple accidents which took too much time to resolve

8
Scale
Required system throughput
160KEvents / second
10MSubscribers
22.15TB / month

9
About GetInData
Big Data. Passion. Experience.
Roots at
Spotify
Focus on
Big Data
from Day 1
Production
Experience
Contributions
to
Apache Flink

10
New Solution
Real-time Stream Processing
ingestion outgestion
events
hub
events
processing
HTTP
push/pull
FTP
NFS
MQ
HTTP
push/pull
FTP
MQ

11
New Solution
flink
events
hub
events
processing
HTTP
push/pull
FTP
NFS
MQ
HTTP
push/pull
FTP
MQ

12
Processing Flow
raw call events
data usage events
transform
transformed events
transform
transformed events
local state
RocksDB
control topic
Admin UI
HTTP
calls
notification
events
outgestion
ingestion
ingestion
submit/stop
triggers

13
New Solution (Operations)
Web UI, Monitoring, Security
flink
events
hub
events
processing
HTTP
push/pull
FTP
NFS
MQ
HTTP
push/pull
FTP
MQ
Admin UI
(Triggers workbench)
Monitoring
ELK stack - logs
InfluxDB/Grafana - metrics
Security
FreeIPA
Kerberos
LDAP/AD
API (kafka based)

14
New Solution (Data Lake)
Data Lake and Sub-second OLAP Analytics
flink
events
hub
events
processing
HTTP
push/pull
FTP
NFS
MQ
HTTP
push/pull
FTP
MQ
Data Lake
Historical Storage (HDFS)
Batch (Spark) SQL (Hive)
Keep history, Report, Explore
Column-oriented
Data store
OLAP (Druid)
online

15
Decisions made
Some decisions our team made before or during project implementation
Streaming-first approach
Apache Kafka for event hub
Apache Flink
Powerful Real-Time Analytics

16
Apache Avro
Keep state local to the process
Ingest reference data for local joins and
enrichment
● No need to query external systems
while processing
● Data time correlation correctness
Performance
transformed
events
transformed
events
Subscriber profile data
(events)
Local State
Not at >100K
events / sec

17
Nifi for data ingestion (no coding)
● but not for CEP
Web UI for configuring triggers
Ease of Use

18
Flink on YARN, with HDFS
HA for redundancy and running ~24/7
InfluxDb & Grafana for monitoring & alerting
ELK for logs collection and aggregation
Reliability and battle-tested techniques
Kerberos and AD thanks to FreeIPA
Apache Ranger for authorization
Security

19
One platform for the whole Enterprise
Batch (adhoc) queries too
● Spark, Hive/Presto
Online analytics
● OLAP
Extensiveness
HDP
Open-source technologies
HDP as a licence-free distribution
Just start with a bunch of servers
Cost-Efficiency

20
Before You Start
Words of wisdom
1
Simple sketchy trigger request quickly becomes a complex algorithm2
Know your data sources
● Do NOT assume that your data sources are ready for streaming
3
DO prepare yourself to use open-source
● NiFi is a great framework, but not a comprehend set of processors
● HDP is a great distribution, but versions in it are quickly outdated
4
Start small, Start fast

21
Our Collaboration
Two heads are better than one
Joint development team
Not a vendor solution
Development as one team
Code quality
Code review and
automated tools for
code quality control
Agile Practices
Distant geographic
locations, but
everyday standups
Go live quickly!
<4 months to first
production case
running 24/7!
Deliver
DevOps/Automation
Knowledge sharing
Constant knowledge
exchange in areas of
expertise
Testing
Separate testing
environment
Automated Unit/E2E tests

22
Make it a company-wide,
self-service go-to place for data
analysis
Future Work
We have already done a lot. But more great things are coming.
2018 Q2 2018 Q3 2018 Q4 Bright Future
More Data Sources
More Triggers
Geolocation data
CDRs
Equipment logs
Data Lake
Machine Learning
We plan to include machine
learning and other tools that
would enhance our platform even
more
Real-time BI
Intraday view on business and
operations
Call center, clickstream,
communication… all in one place
ready for behavioral analysis
Customer 360 view
Monetize valuable insights from
our combined rich data sources.
Data Monetization
Predictive maintenance
Network Optimization
To lower operational costs
And make better investments
And many more...

Questions?
Big Data Tech Warsaw 2018
zarz@getindata.com
alexey.brodovshuk@gmail.com
Contact us:

Assisting millions of active users in real-time - Alexey Brodovshuk, Kcell; Krzysztof Zarzycki, GetInData

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to Assisting millions of active users in real-time - Alexey Brodovshuk, Kcell; Krzysztof Zarzycki, GetInData

Similar to Assisting millions of active users in real-time - Alexey Brodovshuk, Kcell; Krzysztof Zarzycki, GetInData (20)

More from Evention

More from Evention (20)

Recently uploaded

Recently uploaded (20)

Assisting millions of active users in real-time - Alexey Brodovshuk, Kcell; Krzysztof Zarzycki, GetInData