3. Simply Business
• Largest UK business insurance provider
• More than 400,000 policyholders
• Using BML, tech and data to disrupt the
business insurance market
4. Data ’n’ Analytics
• 5 Data Engineers
• 3 Business Intelligence Developers
• 3 Data Analysts
• 1 Data Scientist
• 1 Director of Data Science
• And hiring! :-)
6. Snowplow Setup
Pipeline stages (diagram): Trackers, Collector, Enrichment, Modeling, Storage
• Trackers, collectors and storage are 100% upstream Snowplow
• Enrichment:
• Spark apps that use scala-common-enrich as a library
• We add our own enrichments after the default ones
• We perform near real-time (NRT) identity stitching and sessionization
• Modeling: mix of Spark and SQL jobs
• Storage: Spark apps that use scala-hadoop-shred as a library
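The sessionization step mentioned above can be illustrated with a minimal, single-machine sketch. It is not the Spark implementation from the talk: the 30-minute inactivity gap, the `sessionize` function and the `(user_id, timestamp)` event shape are all assumptions made for illustration, and identity stitching is left out.

```python
from datetime import datetime, timedelta

# Assumption: a session ends after 30 minutes of inactivity.
# The talk does not state the actual rule.
SESSION_GAP = timedelta(minutes=30)


def sessionize(events, gap=SESSION_GAP):
    """Assign a per-user session index to each (user_id, timestamp) event.

    Hypothetical helper: events are sorted by user and time, and a new
    session starts whenever the gap since the user's previous event
    exceeds `gap`.
    """
    last_seen = {}   # user_id -> timestamp of that user's previous event
    session_no = {}  # user_id -> current session index (starts at 1)
    out = []
    for user_id, ts in sorted(events, key=lambda e: (e[0], e[1])):
        prev = last_seen.get(user_id)
        if prev is None:
            session_no[user_id] = 1
        elif ts - prev > gap:
            session_no[user_id] += 1
        last_seen[user_id] = ts
        out.append((user_id, ts, session_no[user_id]))
    return out
```

In a streaming job the same rule would be applied with per-user state kept between micro-batches instead of sorting a complete list.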
7. Why Spark?
• We wanted a near real-time pipeline, but the Kinesis Client Library (KCL) was too rigid:
• We had to provision, set up and monitor the machines ourselves
• Configuration is difficult for complex DAGs
• In contrast, Spark:
• Once set up, the cluster is a PaaS
• Allows streaming, batch, ML and graph workloads
• Allows analysts and data scientists to use Python
9. The Radio Campaign
• We’re running a radio campaign in Birmingham, Manchester and
London
• People who start a quote from our radio landing pages get a
£25 discount
10. The Banner
• The questionnaire to get quotes can be quite long to complete
• We wanted to reassure our customers that they would get the
discount
• We wanted to display a banner at the top of every page of
the questionnaire
12. Our Infrastructure
[Diagram: Quoting App (HTTP), Scala Stream Collector, Kinesis, Spark Stream NRT Enrichment, MongoDB, Visitor API]
On average, it takes 2.5s for an event to be available in the Visitor API
13. Benefits of NRT Snowplow
• Our quoting app does not need to know about marketing, user
landing pages, etc.
• Our Mongo table with active sessions’ events becomes a view of our
event log
• Can be reused for many other use cases: analytics on read!
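The "view of our event log" and "analytics on read" ideas can be sketched as follows. This is an in-memory illustration under stated assumptions, not the actual Visitor API: the event fields (`session_id`, `type`, `timestamp`, `path`), the `/radio` path prefix and both function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical landing-page path prefix; the real URLs are not given in the talk.
RADIO_LANDING_PREFIX = "/radio"


def build_session_view(event_log):
    """Fold an append-only event log into a per-session list of events."""
    view = defaultdict(list)
    for event in event_log:
        view[event["session_id"]].append(event)
    return view


def should_show_radio_banner(session_events):
    """Decide at read time whether the session started on a radio landing page."""
    page_views = [e for e in session_events if e["type"] == "page_view"]
    if not page_views:
        return False
    # The earliest page view in the session is treated as the landing page.
    first = min(page_views, key=lambda e: e["timestamp"])
    return first["path"].startswith(RADIO_LANDING_PREFIX)
```

Because the banner decision is derived when the view is read, the quoting app only asks a yes/no question and never needs to know how landing pages map to campaigns.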
15. Telephony System
• We have a call center in Northampton with around 200 consultants
• We used an off-the-shelf telephony system
• It worked well for a long time, but:
• Was not very well integrated with our systems
• Quite rigid, we couldn’t adapt it to all our needs
• Reports were only daily and contained only aggregated data
16. Telephony System
• We decided to replace it with a home-grown, Twilio-based solution
• Components:
• Contact Strategy Manager
• Voice Channel Manager
• Communication is event-based
• We transform those events into Snowplow’s unstructured events
• A Spark Streaming app inserts the events into Redshift every 2 minutes
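Wrapping a telephony event as a Snowplow unstructured event might look like the sketch below. The outer `unstruct_event` schema URI is Snowplow's standard wrapper. The inner `com.example.telephony/call_started` schema, its fields and the `to_unstructured_event` helper are hypothetical, since the talk does not show the real event definitions.

```python
import json

# Snowplow's wrapper schema for unstructured (self-describing) events.
UNSTRUCT_EVENT_SCHEMA = (
    "iglu:com.snowplowanalytics.snowplow/unstruct_event/jsonschema/1-0-0"
)

# Hypothetical Iglu schema URI for a telephony event; not from the talk.
CALL_STARTED_SCHEMA = "iglu:com.example.telephony/call_started/jsonschema/1-0-0"


def to_unstructured_event(schema_uri, payload):
    """Wrap a domain event payload in the unstructured-event envelope."""
    return {
        "schema": UNSTRUCT_EVENT_SCHEMA,
        "data": {
            "schema": schema_uri,  # which schema the payload conforms to
            "data": payload,       # the domain event itself
        },
    }


if __name__ == "__main__":
    event = to_unstructured_event(
        CALL_STARTED_SCHEMA,
        {"call_id": "c-1", "consultant_id": "u-42"},  # made-up field names
    )
    print(json.dumps(event, indent=2))
```

Keeping the payload separate from the envelope means each component can define its own event schemas while the pipeline handles every event the same way.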
19. Benefits of NRT Snowplow
• Event Sourcing is great for reporting and analytics: ensures that
data quality remains high
• Team managers now have an NRT view of what teams are doing
• You can aggregate and drill down on the data as appropriate
• Leveraging our data platform: Snowplow pipeline, Redshift & Looker
• Leveraging our existing skills: everyone knows how to use Looker
22. NRT Benefits
• We can dynamically alter the website while the user is still using it
• We can provide insights on live processes
• Multiple uses to improve conversion:
• Instant inclusion/exclusion from remarketing lists
• Abandoned cart emails/calls
• Social proofing (3 more people are also watching…)
• …