Paul Dix, CTO and co-founder of InfluxData, discussed the future of InfluxDB and the release of InfluxDB 2.0 Open Source. He explained that the database core has been rebuilt from the ground up to address limitations of the original InfluxDB, such as the lack of distributed features in open source and poor performance on high-cardinality analytics data. The new database, called InfluxDB IOx, uses a columnar data store persisted as Parquet files and is designed to be distributed, federated, and able to run analytics at scale on high-cardinality data.
23. Requirements
• No limits on cardinality
• Analytics performance
• Separate compute from storage and tiered storage
• Operator defined Replication & Partitioning
• Able to run without locally attached storage
• Bulk data import and export
• Subscriptions
• Federated by design
• Embeddable scripting
• Greater compatibility
50. In-memory Perf Preview (tracing example)
• env - production or staging environment
• data_centre - the region within a cloud vendor
• cluster - a specific cluster, e.g., a k8s cluster
• user_id - an id associated with the user that issued a request that was traced
• request_id - an id associated with a single request that started a trace
• trace_id - a single id associated with all spans in the trace
• node_id - the id of the compute node that the trace execution ran across
• pod_id - the ids of the containers that the trace execution ran across
• span_id - a random id for every sample generated in the trace
51. Test data cardinalities
104,998,932 rows
• env - 2
• data_centre - 20
• cluster - 200
• user_id - 200,000
• request_id - 2,000,000
• trace_id - 10,000,000
• node_id - 2,000
• pod_id - 20,000
• span_id - ∞ (a new one for each sample row)
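The editor's notes later point out that for data like this, an inverted index keyed on individual series can grow larger than the time series data itself. A rough way to see why: under the old tag model, the worst-case series cardinality is the product of the tag cardinalities. A quick back-of-the-envelope in Python (illustrative arithmetic only; real series counts depend on which tag combinations actually occur, and span_id alone already forces one series per row):

```python
from math import prod

# Test-data tag cardinalities from the slide above (span_id excluded:
# it is unique per row, so it alone means one series per row).
cardinalities = {
    "env": 2,
    "data_centre": 20,
    "cluster": 200,
    "user_id": 200_000,
    "request_id": 2_000_000,
    "trace_id": 10_000_000,
    "node_id": 2_000,
    "pod_id": 20_000,
}

# Worst-case number of distinct series if every combination occurred.
worst_case_series = prod(cardinalities.values())
print(f"{worst_case_series:.2e}")  # vastly more than the ~105M actual rows
```

Even though real data only realizes a fraction of these combinations, identifier-like tags push the series count toward the row count, which is exactly the regime where per-series indexing stops paying for itself.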
53. Find spans for a trace
SELECT * FROM "traces"
WHERE "trace_id" = '0000MjNg' AND
      "time" >= '2020-10-30 15:12' AND
      "time" < '2020-10-30 16:12';
54. Find spans for a trace (result)
Returned in: 84.666665 ms, roughly 1.1B rows/sec
56. Flexible Replication Rules
• Synchronous & Asynchronous
• Push & Pull
• Request by request, batch, or bulk
• Partition to servers, groups of servers
• Total operator control via RESTful API
76. Get Involved
• Star & watch the repo at github.com/influxdata/influxdb_iox
• Find the InfluxDB IOx topic on community.influxdata.com
• Join the #influxdb_iox channel in our community Slack
• Join us on the 2nd Wednesday of every month at 8:30 AM Pacific Time for a tech talk on InfluxDB IOx: influxdata.com/community-showcase/influxdb-tech-talks/
• We’re hiring for Rust, distributed systems, and columnar databases expertise. Email recruiting@influxdata.com and CC me at paul@influxdata.com.
Editor's Notes
Today I want to talk to you about the future of InfluxDB. But before that, let’s talk about some of the big news!
InfluxDB 2.0 open source is now released! This represents a multi-year effort. With our cloud offering, our goal was to switch to a continuous, services-based, cloud-first delivery model that could be billed by usage, not by servers. This means that we ship production code every business day and make continuous incremental improvements, and our Cloud 2 customers only pay for what they use.
For our open source, we wanted to ship an all-in-one database, monitoring system, visualization engine, and scripting scheduler. Flux, our new scripting and query language was the center of this effort. With it, users can now do more than ever before within the database. They can even call out to third-party APIs to bring in more data, send data out, trigger action, or send alerts. This can happen at query time in ad-hoc queries, or scheduled through the Task scheduling system.
Our goal was to ship the same API in our cloud offering and in open source. We think you’ll love the open source InfluxDB 2.0 for local development and deployment at the edge or on single servers within your cloud or data center environment. Ryan Betts, our VP of Engineering will be covering more of the details in the talk right after mine.
For my talk, I wanted to tell you about what we’re thinking for The Future. I realize it may be early to start thinking about this with 2.0 open source just being released today, but I’m very excited about the work some of us have been doing and I want to share it publicly.
But before we get into the future, I need to talk about the past. Specifically, November 12th, 2013, which is the day I gave the first talk about InfluxDB and introduced it to the world. While the next 60 seconds will likely be review for all of you, I hope you’ll bear with me as I set the stage.
The talk was titled: Introducing InfluxDB, an open source distributed time series database.
In that talk I sought to define what I meant by time series data. I pointed to some specific examples.
Metrics were the first and most obvious example, as that was what most people thought of when I talked about time series.
I went further to give more examples of events. All things I thought could be analyzed, inspected, visualized and summarized as a time series.
I would later add sensor data to this list of time series examples.
And I talked about two different kinds of time series
More broadly, I claimed that all data you perform analytics on is time series data. Meaning, anytime you’re doing data analysis, you’re doing it either over time or as a snapshot in time.
I saw time series as a useful abstraction for solving problems and building applications in a number of different use cases.
The vision I laid out then is still one I have today, which is that InfluxDB should be useful for all kinds of time series data. It should also be the building block upon which future monitoring, analytics, sensor data and time series applications can be built.
So where are we today? Some of what I’ll say is generally about the platform and some of it will be specific to open source.
Easy to write data in with libraries in many languages. Easy to query using either InfluxQL or Flux.
With the addition of Flux, there are so many more things that InfluxDB can do beyond what a normal declarative query language can provide. It’s great for analytics. However, the caveat is that this is only true for lower-cardinality data. That is, you don’t have too many unique time series, and your tag values don’t have too many unique values.
InfluxDB lacking distributed features in open source means that it is frequently not chosen as a building block for time series applications. This limitation is unfortunate, but at the time it was a necessary choice that enabled us to build a business to support our open source efforts. However, it definitely gets in the way of our broader platform vision. InfluxDB should be a platform that is adopted by a very wide audience, well beyond our paying customer base.
We want to push what’s possible with InfluxDB forward. Ideally for both our open source users and our paying customers.
No limits on cardinality. Write any kind of event data and don’t worry about what a tag or field is.
Best-in-class performance on analytics queries in addition to our already well-served metrics queries.
Tiered data storage. The DB should use cheaper object storage as its long-term durable store.
Operator control over memory usage. The operator should be able to define how much memory is used for each of buffering, caching, and query processing.
Operator-controlled replication. The operator should be able to set fine-grained replication rules on each server.
Operator-controlled partitioning. The operator should be able to define how data is split up amongst many servers and on a per-server basis.
Operator control over topology including the ability to break up and decouple server tasks for write buffering and subscriptions, query processing, and sorting and indexing for long term storage.
Designed to run in an ephemeral containerized environment. That is, it should be able to run with no locally attached storage.
Bulk data export and import.
Fine-grained subscriptions for some or all of the data.
Broader ecosystem compatibility. Where possible, we should aim to use and embrace emerging standards in the data and analytics ecosystem.
Run at the edge and in the datacenter. Federated by design.
Embeddable scripting for in-process computation.
Not only does it expand the index, for cases like tracing where you have new values all the time, the index becomes larger than the time series data itself.
One way around this is to use fields rather than tags, but that is a limiting choice since you don’t have control over how data is organized in the DB, and thus how you might want to organize it outside of the tag system.
In order to support high cardinality use cases, we’d need to ditch the inverted index and also our indexing by individual time series. As our VP of Engineering, Ryan Betts, says: InfluxDB over indexes for these use cases.
InfluxDB uses memory mapped files for the inverted index and for the time series data storage. Many modern databases have been built using this because it gives you speed of development and offloads memory management to the OS.
The downside is that you lose fine-grained control over how memory is used and allocated. Mmap has also proven tricky in containerized environments.
Finally, we want to be able to run with or without locally attached storage. The way that TSM and TSI organize data doesn’t lend itself well to having some data in object storage, some in memory, and some cached on local SSD.
Once I realized that a gradual refactor wasn’t possible, I started thinking about what it would look like to start new in 2020 rather than 2013. What tools exist today that weren’t at my disposal seven years ago? What other open source could I bring to bear that would speed this effort up?
So we’re building a new core for InfluxDB. And here’s the first thing to know about it.
This project is written in Rust. I’ve written about my excitement for the language before. I think Rust is the future of systems software. It gives us the fine grained control over memory that we’re looking for, but with the safety of a higher level language.
Even better, its model for programming concurrent applications eliminates data races, and most server software, including this project, is heavily concurrent. Within our Go codebase, data races have been the source of a number of very hard-to-track-down bugs over the years. Rust’s error handling also helps developers write correct software and reduces the number of runtime bugs you might otherwise create.
Also, it’s embeddable into other languages and systems. This means we can embed it into InfluxDB or other parts of our stack or other analytics systems. We could even compile it down to web assembly and run it in the browser.
There’s so much to love about Rust, but this talk isn’t about that. Ultimately, I want this project to form the basis of future analytics systems for the next few decades and beyond. I remember a blog post Bryan Cantrill wrote about Rust in which he talked about software with longevity; he felt that Rust was a language that would ultimately help you build that kind of software. That’s the bet we’re making here.
The project is InfluxDB IOx, which is short for iron oxide so it’s pronounced InfluxDB eye-ox.
We’ll take a look at the high level architecture of it, but I just want to caveat this. This project is very early stage. We’ve largely been in research mode validating our assumptions on performance, compression and functionality. We’re not producing builds yet and we don’t have documentation up yet. But there’s a project README and you can build from source.
We wanted to open this up early so that our community of users could see what we’re doing.
The second thing to know is that this project is built around Apache Arrow. Arrow is an in-memory columnar data specification. But it’s also a persistence format via Apache Parquet, which is widely used both inside and outside the Arrow ecosystem. Most data warehouses and big data processing systems can read and write Parquet data.
Arrow also includes Arrow Flight, an RPC specification and high-performance client/server framework for transferring large datasets over the network.
Within the Rust part of Arrow is another project called DataFusion, which is a columnar SQL execution engine. We’re building on top of that and contributing to it.
We’re using all of these tools. The big headline with Arrow is that we’re no longer creating this database by ourselves. With Arrow as the core, we’re working with contributors around the world who are using these libraries in their own data systems.
This is the big architectural change. InfluxDB IOx is an in-memory columnar database that uses object storage for persistence with data stored in Parquet files.
We looked at the existing open source columnar databases when we were starting out. We wondered if they could form the basis of a future InfluxDB backend. What we found was that they weren’t optimized for time series. Specifically, they have varying degrees of dictionary support, which is critical for our use case, little support for querying directly on compressed in-memory data with late materialization, and they weren’t optimized for windowed aggregates and computation on time. They seem to be built around a pure analytics use case that asks questions about aggregations at a single point in time.
Further, they weren’t built with our core need of being able to run in an ephemeral environment with no locally attached storage, using object store for all persistence. Our evaluation pointed to a missing solution in the open source market.
It’s not a storage engine. We’re not building our own storage engine short of buffering data in memory and writing it out to Parquet files. The persistence formats we’re using under the hood are Flatbuffers for the write ahead log and Parquet files for immutable blocks of data.
With Parquet and object storage for persistence, this opens up how you can interact with your data. Backup and restore is outside the concerns of InfluxDB IOx. You can create any kind of backup & restore system you’d like. An IOx server can read some or all of its data from object storage on startup.
Bulk data transfers become trivial. Clients can get Parquet files directly from object storage and they can send Parquet files to InfluxDB IOx to organize in object storage for later query workloads. Thanks to Apache Arrow, there are libraries in many languages to work with Parquet and the support is getting better month over month. Notably, Python, C++ and Java are first class citizens in the Arrow ecosystem. They represent the gold standard of functionality. We’ll help bring Rust up to the same level of compatibility.
Training a machine learning model? Ask IOx where the Parquet files are that have the data you’re looking for, get them directly from object storage, and have the data in your Python library of choice, all with a few lines of code.
I should mention that I’m referring to object store, but there are other abstractions
I want to talk quickly about how data is organized in InfluxDB IOx. I think this is important because it shows the flexibility you have as an operator and a user and it lets you optimize for having large blocks of immutable unchanging data, which is really what time series is all about. If you’re updating your data, that means you’re literally rewriting history. Sometimes you might do this, but that’s not what we’re optimizing for. We’re optimizing for history being a fixed thing that you can work with easily and modify on the fly at query time.
That means that you have blocks of data that you can move around to other servers, send out to clients, and represent compactly in object storage.
First you have the partition key, which is generated for each line that comes in. It can use any of the metadata or actual data to generate a string that represents the partition key. You could have the measurement name, tag key information or field information or time/date formatting.
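As a sketch of that idea (the placeholder syntax and function here are hypothetical, not the actual IOx API), a partition key template might combine the measurement name, a tag value, and a time format:

```python
from datetime import datetime, timezone

def partition_key(measurement: str, tags: dict, time_ns: int, template: str) -> str:
    """Render a partition key from a hypothetical template: {measurement} and
    {tag:<key>} placeholders, plus strftime codes for the line's timestamp."""
    out = template.replace("{measurement}", measurement)
    for key, value in tags.items():
        out = out.replace("{tag:%s}" % key, value)
    ts = datetime.fromtimestamp(time_ns / 1e9, tz=timezone.utc)
    return ts.strftime(out)

# Lines written in the same hour for the same env share a partition key.
key = partition_key(
    "traces",
    {"env": "production", "data_centre": "us-east"},
    1_604_070_720_000_000_000,  # 2020-10-30T15:12:00Z as nanoseconds
    "{measurement}-{tag:env}-%Y-%m-%dT%H",
)
print(key)  # traces-production-2020-10-30T15
```

The point of making the key a function of metadata, data, and time is that the operator decides the grouping, rather than the database imposing one fixed layout.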
Partitions are logical groupings of data based on the same partition key. When a partition is snapshotted, you create an immutable block of data. A partition can have multiple blocks, but ideally you’re buffering up everything to snapshot once into a single block. You can always compact blocks later, but this can be a separate process completely outside of the DB.
Blocks have tables of data where a table is once again a logical concept. At the physical level, you have individual Parquet files, which have one table in each and you have in-memory compressed segments that are optimized for query speed with some compression via encoding schemes.
One table per measurement. Tags and fields become columns. One table per Parquet file.
This means that tag and field names must be unique within a measurement.
Schema gets defined and created on the fly as you write data in.
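A minimal sketch of what schema-on-write means here: each line-protocol line names its measurement (the table), and its tags and fields simply become columns (toy parser, no escaping or edge cases; not the real IOx code):

```python
def parse_line(line: str):
    """Map one line-protocol line to (table, columns) — schema on write.
    Simplified: assumes no escaped spaces/commas in tags or fields."""
    head, fields_part, ts = line.rsplit(" ", 2)
    measurement, *tag_pairs = head.split(",")
    columns = {"time": int(ts)}
    for pair in tag_pairs:                   # tags become columns
        key, value = pair.split("=", 1)
        columns[key] = value
    for pair in fields_part.split(","):      # fields become columns too
        key, value = pair.split("=", 1)
        if value.endswith("i"):
            columns[key] = int(value[:-1])   # integer field, e.g. 1500i
        elif value.startswith('"'):
            columns[key] = value.strip('"')  # string field
        else:
            columns[key] = float(value)      # float field
    return measurement, columns

table, cols = parse_line(
    'traces,env=production,trace_id=0000MjNg duration_ns=1500i 1604070720000000000'
)
print(table, cols)
```

The first line that arrives defines the columns; later lines can add new columns, so there is no upfront schema declaration and no distinction the writer has to agonize over.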
But it’s a start. And we know that we can switch to Parquet as our persistence format without any fear of some sort of data explosion.
We break data up into partitions. How data is partitioned can change over time, because each partition is self describing in terms of the summary metadata that specifies what tables it has, what columns each of those tables has, and what the summary information is for each of those columns like min, max, count, sum and potentially even bloom filters for identifiers.
This summary data is used by the planner at query time. Partition summaries are kept in memory and the query is analyzed to determine which partitions need to be queried to produce a result. Once in a partition, we brute force query against it, and if we have it in our segment store, that happens against compressed data without decompressing it. That is, we perform late materialization and only decompress the values we use.
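The pruning step can be sketched as a simple interval-overlap check against each partition's column summaries (illustrative names and layout, not the actual IOx structures):

```python
from dataclasses import dataclass

@dataclass
class ColumnSummary:
    min: int
    max: int

# In-memory partition summaries: partition key -> column -> [min, max].
# Hypothetical 2h time partitions for 2020-10-30 (values are epoch ns).
summaries = {
    "traces-2020-10-30T14": {"time": ColumnSummary(1604066400_000000000, 1604073599_999999999)},
    "traces-2020-10-30T16": {"time": ColumnSummary(1604073600_000000000, 1604080799_999999999)},
    "traces-2020-10-30T18": {"time": ColumnSummary(1604080800_000000000, 1604087999_999999999)},
}

def prune(summaries, column, lo, hi):
    """Keep partitions whose [min, max] summary overlaps the query range [lo, hi)."""
    return [
        key for key, cols in summaries.items()
        if column in cols and cols[column].min < hi and cols[column].max >= lo
    ]

# The 15:12-16:12 lookup from the earlier slide touches only two partitions.
lo = 1604070720_000000000  # 2020-10-30T15:12Z
hi = 1604074320_000000000  # 2020-10-30T16:12Z
print(prune(summaries, "time", lo, hi))
```

Real summaries would also carry count, sum, and possibly bloom filters per column, so the same check works for identifier columns like trace_id, not just time.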
This means that the partitioning scheme you choose has great impact on what your queries look like. This is why we let the users define it when they create a database/bucket. It can change on a per-database basis.
We can likely do better. We’re using RLE for the span IDs and trace IDs, and we’d be better off just going with dictionary encoding without the RLE.
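To see why, here's a toy illustration (not the actual IOx encoder): RLE stores one (count, value) pair per run, so a column of effectively unique IDs gains nothing and pays the per-run overhead, while a sorted low-cardinality column collapses dramatically:

```python
import random

def rle_encode(values):
    """Run-length encode a sequence into [count, value] pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][1] == v:
            runs[-1][0] += 1
        else:
            runs.append([1, v])
    return runs

random.seed(42)
# span_id-like column: effectively unique values, so no runs to exploit.
span_ids = [f"{random.getrandbits(64):016x}" for _ in range(10_000)]
# trace_id-like column sorted by the partition/sort key: long runs.
trace_ids = sorted(random.choice(["t1", "t2", "t3"]) for _ in range(10_000))

print(len(rle_encode(span_ids)))   # ~10,000 runs: pure overhead for unique IDs
print(len(rle_encode(trace_ids)))  # a handful of runs: RLE wins when values repeat
```

For unique identifiers, a plain dictionary (one integer code per row) avoids carrying a run length that is almost always 1.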
Notice that we have time in this example. If you’re looking up by some trace ID, where’d you get it? From a log line? You’ll have a timestamp associated with it. Use it.
If you’re partitioning your data by time, and in most cases this will be at least one of the criteria by which you partition your data, you can quickly narrow down the blocks of data to query against. If you have 2h partitions, then you’ll be able to find the spans you’re looking for by querying at most 2 partitions.
This returns the 10 rows in about 85 milliseconds. If you do the rough math on this it means it was able to brute force on about 1.1B rows/sec. Note that we didn’t actually process all those rows. It was operating on compressed data.
We can likely get this down by a bit more by removing the RLE compression for trace ID and span. Maybe another 2x improvement.
The specifics of the compressed in memory columnar store will definitely be the subject of some future tech talks.
Here’s what I think the real future is. The example I just showed takes a data center centric view. It assumes that all your data is getting pushed up to some central cluster. I think the future is federated. It operates at the edges as single nodes, it operates in factories in small clusters, and it operates in many data centers worldwide.
You’ll likely have high-precision data that doesn’t make sense to replicate up to a central place, or at least you’ll only replicate it in highly compressed form. The future distributed time series system isn’t a cluster that runs in a data center, even if it has rack-aware capabilities and multi-region routing.
There’s no limit to the scale of time series data that we’ll be collecting over the coming decades. We need flexibility in how it’s replicated, queried, and stored.
* Created InfluxDB because we saw so many people re-inventing the wheel and we wanted Influx to be the basis of it
* However, the lack of distributed features left a gap in the market
* Infrastructure projects that fall under source available or community licenses severely limit the audience and what you can build
* InfluxDB IOx is dual-licensed under MIT and Apache 2 as is common in the Rust community. No community license, no source available license, no restrictions. You can build new projects using this code, you can build new businesses using this code, you can do whatever you want with it.
Conway’s law says that you ship your org chart. That is, if you create two teams to build a system, you’ll get a system comprised of two parts.
I propose Dix’s maxim as it relates to open source and licensing generally, which is that your licensing strategy is your commercialization strategy, whether by accident or design.
The architecture approaches for IOx are deliberate choices because of not only the functionality and operational properties we wanted in the system, but also in how we plan to commercialize it.
InfluxDB IOx is designed to be a shared-nothing server that has an API giving the operator total control over how it behaves. However, the operator must make those changes as they are needed. Who does this operation and coordination?
In the simplest setup of a single server, you don’t worry about it. In a two-server setup you can likely get by with shell scripts and a cron job.
But the more complex your environment becomes, the more complicated this coordination becomes. It was a design goal for us to separate the core database work from the operational work across a fleet of servers. We will create this software for our own needs to operate our cloud environment. However, our cloud environment may be different from yours. This is why the operational coordination is kept separate: so there is maximum flexibility in topology and configuration.
We plan to run the InfluxDB IOx open source bits as is in our own cloud. We won’t be running a fork, we’ll be running right off the main branch.
At the beginning of this talk I mentioned my introduction of InfluxDB to the world. And I titled it this.
I’ll be giving more talks about InfluxDB IOx over the coming months. But here’s how I’m thinking about it. Yes, it’s a distributed time series database. But it’s a lot more than just that.
It’s federated and this is a core part of its design. With time series and analytics data, the future is federated. The scale is larger than you’ll want to manage and push up to a single cluster. You’ll have edge, multiple data centers, and many thousands of potential nodes all communicating with each other.