Paul Dix
CTO & co-founder, InfluxData
@pauldix
North America Virtual
Experience 2020-11-10
The future of InfluxDB
InfluxDB 2.0 Open Source GA!
November 12, 2013
Introducing InfluxDB,
an open source distributed
time series database
What is time series data?
Stock trades and quotes
Analytics
Log Events
More Events
• Measurements
• Exceptions
• Page Views
• User actions
• Commits
• Deploys
• Things happening in time
Sensor data
Two kinds of time series
data…
Regular time series
t0 t1 t2 t3 t4 t6 t7
Samples at regular intervals
Irregular time series
t0 t1 t2 t3 t4 t6 t7
Events whenever they come in
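The distinction between the two kinds of series can be sketched in a few lines of Python. This is only an illustration: the function name `is_regular` and the sample timestamps are made up for the example and do not come from the talk.

```python
def is_regular(timestamps):
    """Return True when consecutive timestamps are all the same distance apart."""
    if len(timestamps) < 3:
        return True  # too few points to observe a changing interval
    gaps = {later - earlier for earlier, later in zip(timestamps, timestamps[1:])}
    return len(gaps) == 1


# Samples taken every 10 seconds (a regular series).
regular = [0, 10, 20, 30, 40]
# Events recorded whenever they occur (an irregular series).
irregular = [0, 3, 4, 19, 40]

print(is_regular(regular))    # True
print(is_regular(irregular))  # False
```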
Things you want to ask questions about,
visualize, or summarize over time.
Where we are today
InfluxDB is great for metrics
InfluxDB is great for analytics*
*on lower cardinality data
InfluxDB open source lacks
distributed features
It’s time to advance…
Requirements
• What cardinality?
• Analytics performance
• Separate compute from storage and tiered storage
• Operator-defined Replication & Partitioning
• Able to run without locally attached storage
• Bulk data import and export
• Subscriptions
• Federated by design
• Embeddable scripting
• Greater compatibility
Iterate and Refactor or Rebuild the
Core?
How InfluxDB Organizes Data
Line Protocol
cpu,host=serverA,num=1,region=west idle=1.667,system=2342.2 1492214400000000000
• Measurement: cpu
• Tags: host=serverA,num=1,region=west
• Fields: idle=1.667,system=2342.2
• Timestamp (nanosecond epoch): 1492214400000000000
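As a rough illustration of how the four parts of a line fit together, here is a minimal Python sketch that splits the example line. It is deliberately simplified and is not InfluxDB's parser: the function name `parse_line` is hypothetical, and the code assumes no escaped commas or spaces, no quoted string fields, and float-only field values.

```python
def parse_line(line):
    """Split one line-protocol line into measurement, tags, fields, timestamp.

    Simplified sketch: assumes no escaping, no quoted strings, float fields only.
    """
    series_part, fields_part, timestamp = line.split(" ")
    measurement, *tag_pairs = series_part.split(",")
    tags = dict(pair.split("=", 1) for pair in tag_pairs)
    fields = {}
    for pair in fields_part.split(","):
        key, raw = pair.split("=", 1)
        fields[key] = float(raw)
    return measurement, tags, fields, int(timestamp)


line = "cpu,host=serverA,num=1,region=west idle=1.667,system=2342.2 1492214400000000000"
measurement, tags, fields, timestamp = parse_line(line)
print(measurement)  # cpu
print(tags)         # {'host': 'serverA', 'num': '1', 'region': 'west'}
print(fields)       # {'idle': 1.667, 'system': 2342.2}
print(timestamp)    # 1492214400000000000
```

Note that tag values stay strings (`num` is `'1'`, not `1`), while field values are typed.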
Line Protocol
cpu,host=serverA,num=1,region=west idle=1.667,system=2342.2 1492214400000000000
Series
cpu,host=serverA,num=1,region=west#idle (1.667, 1492214400000000000)
cpu,host=serverA,num=1,region=west#system (2342.2, 1492214400000000000)
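The split of one line into one series per field can be sketched as follows. This is an illustrative Python snippet, not InfluxDB code: `series_for_line` is a hypothetical name, and the same simplifying assumptions apply (no escaping, float fields only).

```python
def series_for_line(line):
    """Map each field of a line to its own series key and (value, time) point."""
    series_part, fields_part, timestamp = line.split(" ")
    ts = int(timestamp)
    points = {}
    for pair in fields_part.split(","):
        field, raw = pair.split("=", 1)
        # Series key = measurement + tag set + '#' + field name.
        points[f"{series_part}#{field}"] = (float(raw), ts)
    return points


line = "cpu,host=serverA,num=1,region=west idle=1.667,system=2342.2 1492214400000000000"
for key, point in series_for_line(line).items():
    print(key, point)
```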
Inverted Index
Series ID
1 - cpu,host=serverA,num=1,region=west#idle (1.667, 1492214400000000000)
2 - cpu,host=serverB,num=1,region=west#system (2342.2, 1492214400000000000)
Posting Lists
cpu - [1, 2]
host=serverA - [1]
host=serverB - [2]
num=1 - [1, 2]
region=west - [1, 2]
Every new tag value expands the index
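A small Python sketch can reproduce the posting lists above and show why new tag values grow the index. The names `build_index` and `lookup` are hypothetical, and the in-memory dictionary is an assumption for illustration, not how InfluxDB stores its index.

```python
from collections import defaultdict


def build_index(series_by_id):
    """Build posting lists: each measurement or tag pair maps to series IDs."""
    index = defaultdict(list)
    for series_id in sorted(series_by_id):
        series_part = series_by_id[series_id].split("#", 1)[0]
        measurement, *tag_pairs = series_part.split(",")
        for term in [measurement, *tag_pairs]:
            index[term].append(series_id)
    return dict(index)


def lookup(index, *terms):
    """Return the series IDs that match every term (posting-list intersection)."""
    postings = [set(index.get(term, [])) for term in terms]
    return sorted(set.intersection(*postings)) if postings else []


series = {
    1: "cpu,host=serverA,num=1,region=west#idle",
    2: "cpu,host=serverB,num=1,region=west#system",
}
index = build_index(series)
print(index["cpu"])                         # [1, 2]
print(lookup(index, "cpu", "host=serverB"))  # [2]

# A series with a previously unseen tag value adds a new entry to the index.
series[3] = "cpu,host=serverC,num=1,region=west#idle"
print(len(index), len(build_index(series)))  # 5 6
```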
Inverted Index & Time Series
mmap difficulties
Object Store Durability
Towards a new core
InfluxDB IOx: short for iron oxide, pronounced "eye-ox"
In-memory columnar database
No storage engine
Parquet + Object Store is huge
Not just object store
Object Store Abstraction
• Local Disk
• S3
• GCP Cloud Storage
• In Memory
• Azure Blob Storage
• Minio
• Ceph
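The idea of one interface over several backends can be sketched in Python. Everything here is hypothetical and for illustration only: the `ObjectStore` interface, its `put`/`get` methods, the two backends, and the object key are assumptions, not the IOx API (IOx itself is written in Rust).

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class ObjectStore(ABC):
    """Hypothetical minimal interface shared by every storage backend."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...


class InMemoryStore(ObjectStore):
    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = bytes(data)

    def get(self, key):
        return self._objects[key]


class LocalDiskStore(ObjectStore):
    def __init__(self, root):
        self._root = Path(root)

    def put(self, key, data):
        path = self._root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def get(self, key):
        return (self._root / key).read_bytes()


def round_trip(store: ObjectStore) -> bytes:
    """Calling code depends only on the interface, not on the backend."""
    key = "west/2020-11-10T11/block-1.parquet"  # made-up object key
    store.put(key, b"example bytes")
    return store.get(key)


with tempfile.TemporaryDirectory() as tmp:
    for store in (InMemoryStore(), LocalDiskStore(tmp)):
        print(type(store).__name__, round_trip(store))
```

A cloud backend such as S3 would be another subclass with the same two methods, which is what lets the database run with or without locally attached storage.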
How Data is Organized
• Partition Key: region, 1h bucket (e.g., west-2020-11-10-11:00)
• Partitions: west-2020-11-10-11:00, east-2020-11-10-11:00, west-2020-11-10-12:00
• Immutable Blocks: block 1, block 2
• Tables of data: table, table
• Physical Layout: Parquet file, Parquet file, in-memory compressed segment, in-memory compressed segment
• Mutable Write Buffer
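The example partition key (region plus a one-hour bucket) can be computed as in the sketch below. The function name `partition_key` is hypothetical, and the use of UTC for bucketing is an assumption; the slides show only the resulting key format.

```python
from datetime import datetime, timezone


def partition_key(region: str, timestamp_ns: int) -> str:
    """Build a 'region-YYYY-MM-DD-HH:00' key from a nanosecond epoch timestamp."""
    seconds = timestamp_ns // 1_000_000_000
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    # Truncate to the start of the hour to get the 1h bucket.
    bucket = moment.replace(minute=0, second=0, microsecond=0)
    return f"{region}-{bucket.strftime('%Y-%m-%d-%H:%M')}"


# A write at 11:30 UTC on 2020-11-10 lands in the 11:00 bucket.
ts = int(datetime(2020, 11, 10, 11, 30, tzinfo=timezone.utc).timestamp()) * 1_000_000_000
print(partition_key("west", ts))  # west-2020-11-10-11:00
print(partition_key("east", ts))  # east-2020-11-10-11:00
```

All rows that share a key belong to the same partition, so a query restricted to one region and hour only needs to touch that partition.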
Mapping InfluxDB into Tables
cpu,host=serverA,num=1,region=west idle=1.667,system=2342.2 1492214400000000000
Table: cpu
host     num  region  idle   system  time
serverA  1    west    1.667  2342.2  1492214400000000000
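The mapping from lines to table rows can be sketched in Python: the measurement becomes the table name, and tags, fields, and time become columns. `lines_to_tables` is a hypothetical helper with the same simplifications as before (no escaping, float fields only); it is not the IOx implementation.

```python
def lines_to_tables(lines):
    """Group line-protocol lines into tables keyed by measurement name."""
    tables = {}
    for line in lines:
        series_part, fields_part, timestamp = line.split(" ")
        measurement, *tag_pairs = series_part.split(",")
        # Tags become string columns.
        row = dict(pair.split("=", 1) for pair in tag_pairs)
        # Fields become typed (here: float) columns.
        for pair in fields_part.split(","):
            name, raw = pair.split("=", 1)
            row[name] = float(raw)
        row["time"] = int(timestamp)
        tables.setdefault(measurement, []).append(row)
    return tables


lines = [
    "cpu,host=serverA,num=1,region=west idle=1.667,system=2342.2 1492214400000000000",
]
tables = lines_to_tables(lines)
print(list(tables["cpu"][0]))  # ['host', 'num', 'region', 'idle', 'system', 'time']
print(tables["cpu"][0])
```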
Real-World Compression
• 591GB TSM across 483 files
• 97GB compressed TSM with gzip (likely due to index size)
• Naive Parquet test:
• 118GB
• 246,140 files
Partitioning is key to performance
In-memory Perf Preview (tracing example)
• env - production or staging environment
• data_centre - the region within a cloud vendor
• cluster - a specific cluster, e.g., a k8s cluster
• user_id - an id associated with the user that issued a request that was traced
• request_id - an id associated with a single request that started a trace
• trace_id - a single id associated with all spans in the trace
• node_id - the id of compute node that the trace execution ran across
• pod_id - the id of containers that the trace execution ran across
• span_id - a random id for every sample generated in the trace
Test data cardinalities
104,998,932 rows
• env - 2
• data_centre - 20
• cluster - 200
• user_id - 200,000
• request_id - 2,000,000
• trace_id - 10,000,000
• node_id - 2,000
• pod_id - 20,000
• span_id - ∞ (a new one for each sample row)
Test data sizes
104,998,932 rows ~ 12.5 GB RAM
• env column 301 B
• data_centre ~2.1 KB
• cluster ~19.7 KB
• user_id ~176 MB
• request_id ~816 MB
• trace_id ~1.6GB
• node_id ~204 KB
• pod_id ~2 MB
• span_id ~9.2GB
• duration ~840 MB
• time ~840 MB
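Dividing the column sizes by the row count gives a feel for the per-row cost of each column. The sketch below assumes decimal units (1 MB = 10^6 bytes), which the slides do not specify; the slides also do not say how each column is encoded, so the interpretation in the comments is an inference.

```python
ROWS = 104_998_932

# Approximate column sizes from the slide, assuming 1 MB = 10**6 bytes.
column_bytes = {
    "time": 840e6,
    "duration": 840e6,
    "trace_id": 1.6e9,
    "span_id": 9.2e9,
}


def bytes_per_row(total_bytes, rows=ROWS):
    return total_bytes / rows


for name, size in column_bytes.items():
    print(f"{name}: {bytes_per_row(size):.2f} bytes/row")

# time and duration come out at about 8 bytes per row, which is consistent
# with one 64-bit value per row. span_id, unique on every row, costs far more.
```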
Find spans for a trace
SELECT * FROM "traces"
WHERE "trace_id" = '0000MjNg' AND
"time" >= '2020-10-30 15:12' AND
"time" < '2020-10-30 16:12';
Returned in: 84.666665ms ~ 1.1B rows/sec
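The shape of this query can be tried out with Python's built-in `sqlite3` module. SQLite is only a stand-in here, not IOx, and says nothing about IOx performance; the table columns and the three sample rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "traces" ("trace_id" TEXT, "span_id" TEXT, "time" TEXT)')
conn.executemany(
    'INSERT INTO "traces" VALUES (?, ?, ?)',
    [
        ("0000MjNg", "span-a", "2020-10-30 15:20"),
        ("0000MjNg", "span-b", "2020-10-30 16:30"),  # outside the time window
        ("0000Zzzz", "span-c", "2020-10-30 15:45"),  # different trace
    ],
)

# Same predicate as the slide: one trace_id, half-open one-hour time range.
rows = conn.execute(
    """
    SELECT * FROM "traces"
    WHERE "trace_id" = ? AND "time" >= ? AND "time" < ?
    """,
    ("0000MjNg", "2020-10-30 15:12", "2020-10-30 16:12"),
).fetchall()
print(rows)  # [('0000MjNg', 'span-a', '2020-10-30 15:20')]
```

The time comparison works on text here because the timestamps share one fixed, zero-padded format, so string order matches chronological order.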
How is InfluxDB IOx distributed?
Flexible Replication Rules
• Synchronous & Asynchronous
• Push & Pull
• Request by request, batch, or bulk
• Partition to servers, groups of servers
• Total operator control via RESTful API
One Possible Configuration
Federated, not fully connected cluster
Dix’s maxim
“Your licensing strategy is your
commercialization strategy, whether by
accident or design”
Who coordinates this?
InfluxDB 2.x OSS Journey
InfluxDB Cloud Journey
InfluxDB Enterprise Journey
Introducing InfluxDB,
an open source distributed
time series database
Introducing InfluxDB IOx,
an open source distributed
time series database
Introducing InfluxDB IOx,
an open source federated
time series database
Introducing InfluxDB IOx,
an open source distributed
time series database
analytics database
Introducing InfluxDB IOx,
an open source distributed
time series database
columnar database
Introducing InfluxDB IOx,
an open source distributed
time series database
replication system
Introducing InfluxDB IOx,
an open source distributed
time series database
events processor
Introducing InfluxDB IOx,
an open source distributed
time series database
data lifecycle manager
Introducing InfluxDB IOx,
an open source distributed
time series database
edge processor and data store
Get Involved
• Star & watch the repo at github.com/influxdata/influxdb_iox
• Find the InfluxDB IOx topic on community.influxdata.com
• Join the #influxdb_iox channel in our community Slack
• Join us on the 2nd Wednesday of every month at 8:30 AM Pacific Time for a
tech talk on InfluxDB IOx - influxdata.com/community-showcase/influxdb-tech-talks/
• We’re hiring for Rust, distributed systems, and columnar database expertise.
Email recruiting@influxdata.com and CC me at paul@influxdata.com.

Steinkamp, Clifford [InfluxData] | Closing Thoughts Day 1 | InfluxDays 2022
 

Recently uploaded

Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 

Recently uploaded (20)

Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 

Paul Dix [InfluxData] | InfluxDays Opening Keynote | InfluxDays Virtual Experience NA 2020

Editor's Notes

  1. Today I want to talk to you about the future of InfluxDB. But before that, let’s talk about some of the big news!
  2. InfluxDB 2.0 open source is now released! This represents a multi-year effort. With our cloud offering, our goal was to switch to a continuous, services-based, cloud-first delivery model that could be billed by usage, not by servers. This means that we ship production code every business day and make continuous incremental improvements, and our Cloud 2 customers only pay for what they use. For our open source, we wanted to ship an all-in-one database, monitoring system, visualization engine, and scripting scheduler. Flux, our new scripting and query language, was the center of this effort. With it, users can now do more than ever before within the database. They can even call out to third-party APIs to bring in more data, send data out, trigger actions, or send alerts. This can happen at query time in ad-hoc queries, or scheduled through the Task scheduling system. Our goal was to ship the same API in our cloud offering and in open source. We think you’ll love the open source InfluxDB 2.0 for local development and deployment at the edge or on single servers within your cloud or data center environment. Ryan Betts, our VP of Engineering, will be covering more of the details in the talk right after mine.
  3. For my talk, I wanted to tell you about what we’re thinking for The Future. I realize it may be early to start thinking about this with 2.0 open source just being released today, but I’m very excited about the work some of us have been doing and I want to share it publicly.
  4. But before we get into the future, I need to talk about the past. Specifically, November 12th, 2013, which is the day I gave the first talk about InfluxDB and introduced it to the world. While the next 60 seconds will likely be review for all of you, I hope you’ll bear with me as I set the stage.
  5. The talk was titled: Introducing InfluxDB, an open source distributed time series database.
  6. In that talk I sought to define what I meant by time series data. I pointed to some specific examples.
  7. Metrics were the first and most obvious example of time series, since that was what most people thought of when I talked about it.
  8. I went further to give more examples of events. All things I thought could be analyzed, inspected, visualized and summarized as a time series.
  9. I would later add sensor data to this list of time series examples.
  10. And I talked about two different kinds of time series
  11. More broadly, I claimed that all data you perform analytics on is time series data. Meaning, anytime you’re doing data analysis, you’re doing it either over time or as a snapshot in time. I saw time series as a useful abstraction for solving problems and building applications in a number of different use cases. The vision I laid out then is still one I have today, which is that InfluxDB should be useful for all kinds of time series data. It should also be the building block upon which future monitoring, analytics, sensor data and time series applications can be built.
  12. So where are we today? Some of what I’ll say is generally about the platform and some of it will be specific to open source.
  13. Easy to write data in with libraries in many languages. Easy to query using either InfluxQL or Flux.
  14. With the addition of Flux, there are so many more things that InfluxDB can do outside of what a normal declarative query language can provide. It’s great for analytics. However, the caveat exists that this is only true for lower cardinality data. That is you don’t have too many unique time series and your tag values don’t have too many unique values.
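To make the cardinality caveat concrete, here is a minimal Python sketch that counts unique series in a batch of line protocol. It uses the series definition from the earlier slide (measurement, tag set, and field name). The parser is a simplification that assumes no escaped spaces or commas, and `series_keys` and `series_cardinality` are hypothetical helpers, not part of any InfluxDB client.

```python
def series_keys(line):
    """Return the series keys for one line of line protocol.

    Simplified: assumes no escaped commas or spaces anywhere in the line.
    """
    head, fields, _timestamp = line.split(" ")
    field_names = [pair.split("=", 1)[0] for pair in fields.split(",")]
    # One series per (measurement + tag set, field name) combination.
    return [f"{head}#{name}" for name in field_names]


def series_cardinality(lines):
    """Count distinct series across a batch of lines."""
    unique = set()
    for line in lines:
        unique.update(series_keys(line))
    return len(unique)


lines = [
    "cpu,host=serverA,num=1,region=west idle=1.667,system=2342.2 1492214400000000000",
    "cpu,host=serverA,num=1,region=west idle=1.702,system=2351.0 1492214410000000000",
    "cpu,host=serverB,num=1,region=west idle=9.1,system=12.5 1492214400000000000",
]
print(series_cardinality(lines))  # 4: two hosts x two fields
```

Every new tag value multiplies the number of series, which is why identifiers such as trace IDs push cardinality up so quickly when they are stored as tags.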
  15. InfluxDB lacking distributed features in open source means that it is frequently not chosen as a building block for time series applications. This limitation is unfortunate, but at the time it was a necessary choice that enabled us to build a business to support our open source efforts. However, it definitely gets in the way of our broader platform vision. InfluxDB should be a platform that is adopted by a very wide audience, well outside the audience of our paying customer base.
  16. We want to push what’s possible with InfluxDB forward. Ideally for both our open source users and our paying customers.
  17. No limits on cardinality. Write any kind of event data and don’t worry about what a tag or field is. Best-in-class performance on analytics queries in addition to our already well-served metrics queries. Tiered data storage. The DB should use cheaper object storage as its long-term durable store. Operator control over memory usage. The operator should be able to define how much memory is used for each of buffering, caching, and query processing. Operator-controlled replication. The operator should be able to set fine-grained replication rules on each server. Operator-controlled partitioning. The operator should be able to define how data is split up amongst many servers and on a per-server basis. Operator control over topology including the ability to break up and decouple server tasks for write buffering and subscriptions, query processing, and sorting and indexing for long term storage. Designed to run in an ephemeral containerized environment. That is, it should be able to run with no locally attached storage. Bulk data export and import. Fine-grained subscriptions for some or all of the data. Broader ecosystem compatibility. Where possible, we should aim to use and embrace emerging standards in the data and analytics ecosystem. Run at the edge and in the datacenter. Federated by design. Embeddable scripting for in-process computation.
  18. Not only does it expand the index, for cases like tracing where you have new values all the time, the index becomes larger than the time series data itself. One way around this is to use fields rather than tags, but that is a limiting choice since you don’t have control over how data is organized in the DB, and thus how you might want to organize it outside of the tag system.
  19. In order to support high cardinality use cases, we’d need to ditch the inverted index and also our indexing by individual time series. As our VP of Engineering, Ryan Betts, says: InfluxDB over-indexes for these use cases.
  20. InfluxDB uses memory mapped files for the inverted index and for the time series data storage. Many modern databases have been built using this because it gives you speed of development and offloads memory management to the OS. The downside is that you lose fine-grained control over how memory is used and allocated. Mmap has also proven tricky in containerized environments.
  21. Finally, we want to be able to run with or without locally attached storage. The way that TSM and TSI organize data doesn’t lend itself well to having some data in object storage, some in memory, and some cached on local SSD.
  22. Once I realized that a gradual refactor wasn’t possible, I started thinking about what it would look like to start new in 2020 rather than 2013. What tools exist today that weren’t at my disposal seven years ago? What other open source could I bring to bear that would speed this effort up? So we’re building a new core for InfluxDB. And here’s the first thing to know about it.
  23. This project is written in Rust. I’ve written about my excitement for the language before. I think Rust is the future of systems software. It gives us the fine-grained control over memory that we’re looking for, but with the safety of a higher level language. Even better, its model for programming concurrent applications, which most server software is, this project included, eliminates data races. Within our Go codebase these have been a source of a number of very hard to track down bugs over the years. Its error handling also helps developers write correct software and reduces the number of runtime bugs you might otherwise create. Also, it’s embeddable into other languages and systems. This means we can embed it into InfluxDB or other parts of our stack or other analytics systems. We could even compile it down to WebAssembly and run it in the browser. There’s so much to love about Rust, but this talk isn’t about that. Ultimately, I want this project to form the basis of future analytics systems for the next few decades and beyond. I remember a blog post that Bryan Cantrill wrote about Rust where he talked about software with longevity, and he felt that Rust was a language that would ultimately help you build that kind of software. That’s the bet we’re making here.
  24. The project is InfluxDB IOx, which is short for iron oxide so it’s pronounced InfluxDB eye-ox. We’ll take a look at the high level architecture of it, but I just want to caveat this. This project is very early stage. We’ve largely been in research mode validating our assumptions on performance, compression and functionality. We’re not producing builds yet and we don’t have documentation up yet. But there’s a project README and you can build from source. We wanted to open this up early so that our community of users could see what we’re doing.
  25. The second thing to know is that this project is built around Apache Arrow. Arrow is an in-memory columnar data specification. But it’s also a persistence format via Apache Parquet, which is widely used both inside and outside the Arrow ecosystem. Most data warehouses and big data processing systems can read and write Parquet data. Arrow is also Arrow Flight, an RPC specification and high performance client/server framework for transferring large datasets over the network. Within the Rust part of Arrow is another project called DataFusion, which is a columnar SQL execution engine. We’re building on top of that and contributing to it. We’re using all of these tools. That makes the big headline with Arrow the fact that we’re no longer creating this database by ourselves. With Arrow as the core, we’re working with contributors around the world that are using these libraries in their own data systems.
  26. This is the big architectural change. InfluxDB IOx is an in-memory columnar database that uses object storage for persistence with data stored in Parquet files. We looked at the existing open source columnar databases when we were starting out. We wondered if they could form the basis of a future InfluxDB backend. What we found was that they weren’t optimized for time series. Specifically, they have varying degrees of dictionary support, which is critical for our use case, little support for querying directly on compressed in-memory data with late materialization, and they weren’t optimized for windowed aggregates and computation on time. They seem to be built around a pure analytics use case that asks a question about aggregations to a single point in time. Further, they weren’t built with our core need of being able to run within an ephemeral environment with no locally attached storage, using object store for all persistence. Our evaluation pointed to a missing solution in the open source market.
  27. It’s not a storage engine. We’re not building our own storage engine short of buffering data in memory and writing it out to Parquet files. The persistence formats we’re using under the hood are Flatbuffers for the write ahead log and Parquet files for immutable blocks of data.
  28. With Parquet and object storage for persistence, this opens up how you can interact with your data. Backup and restore is outside the concerns of InfluxDB IOx. You can create any kind of backup & restore system you’d like. An IOx server can read some or all of its data from object storage on startup. Bulk data transfers become trivial. Clients can get Parquet files directly from object storage and they can send Parquet files to InfluxDB IOx to organize in object storage for later query workloads. Thanks to Apache Arrow, there are libraries in many languages to work with Parquet and the support is getting better month over month. Notably, Python, C++ and Java are first class citizens in the Arrow ecosystem. They represent the gold standard of functionality. We’ll help bring Rust up to the same level of compatibility. Training a machine learning model? Ask IOx where the Parquet files are that have the data you’re looking for, get them directly from object storage and load them into your Python library of choice, all with a few lines of code.
  29. I should mention that I’m referring to object store, but there are other abstractions
  30. I want to talk quickly about how data is organized in InfluxDB IOx. I think this is important because it shows the flexibility you have as an operator and a user and it lets you optimize for having large blocks of immutable unchanging data, which is really what time series is all about. If you’re updating your data, that means you’re literally rewriting history. Sometimes you might do this, but that’s not what we’re optimizing for. We’re optimizing for history being a fixed thing that you can work with easily and modify on the fly at query time. That means that you have blocks of data that you can move around to other servers, send out to clients, and represent compactly in object storage.
  31. First you have the partition key, which is generated for each line that comes in. It can use any of the metadata or actual data to generate a string that represents the partition key. You could have the measurement name, tag key information or field information or time/date formatting. Partitions are logical groupings of data based on the same partition key. When a partition is snapshotted, you create an immutable block of data. A partition can have multiple blocks, but ideally you’re buffering up everything to snapshot once into a single block. You can always compact blocks later, but this can be a separate process completely outside of the DB. Blocks have tables of data where a table is once again a logical concept. At the physical level, you have individual Parquet files, which have one table in each and you have in-memory compressed segments that are optimized for query speed with some compression via encoding schemes.
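As an illustration of the partition key idea in the note above, here is a minimal Python sketch that derives a key string from each incoming line and groups lines into logical partitions. The key scheme (measurement name plus an hourly time format) and the function names are assumptions made for this example; the notes do not specify how IOx expresses its partition templates.

```python
from datetime import datetime, timezone


def partition_key(line, time_format="%Y-%m-%dT%H"):
    """Build a partition key from the measurement name and the timestamp.

    Hypothetical scheme: "<measurement>-<formatted time>". Parsing is
    simplified and assumes no escaped spaces or commas in the line.
    """
    head, _fields, timestamp_ns = line.split(" ")
    measurement = head.split(",", 1)[0]
    seconds = int(timestamp_ns) // 1_000_000_000
    when = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return f"{measurement}-{when.strftime(time_format)}"


def group_by_partition(lines):
    """Group lines into logical partitions that share a partition key."""
    partitions = {}
    for line in lines:
        partitions.setdefault(partition_key(line), []).append(line)
    return partitions


lines = [
    "cpu,host=serverA idle=1.667 1492214400000000000",
    "cpu,host=serverB idle=9.1 1492214400000000000",
    "cpu,host=serverA idle=1.702 1492218000000000000",  # one hour later
]
for key, members in group_by_partition(lines).items():
    print(key, len(members))
# cpu-2017-04-15T00 2
# cpu-2017-04-15T01 1
```

Changing `time_format`, or adding a tag value to the key, changes how data is grouped without touching the data itself.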
  32. One table per measurement. Tags and fields become columns. One table per Parquet file. This means that tag and field names must be unique within a measurement. Schema gets defined and created on the fly as you write data in.
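The table model in the note above can be sketched in a few lines of Python. This is an illustration only, assuming a plain dict of column lists as the table; `append_point` is a hypothetical helper and not IOx code. It shows columns being created on the fly and why a tag and a field cannot share a name within one measurement.

```python
def append_point(table, tags, fields, timestamp):
    """Append one point to a columnar table, adding columns as new names appear."""
    clash = set(tags) & set(fields)
    if clash:
        # Tags and fields share one column namespace per measurement.
        raise ValueError(f"name used as both tag and field: {sorted(clash)}")
    row = {**tags, **fields, "time": timestamp}
    n_rows = len(next(iter(table.values()), []))
    for name in row:
        if name not in table:
            # New column: backfill earlier rows with None (null).
            table[name] = [None] * n_rows
    for name, column in table.items():
        column.append(row.get(name))


tables = {}  # one table per measurement
cpu = tables.setdefault("cpu", {})
append_point(cpu, {"host": "serverA"}, {"idle": 1.667}, 1)
append_point(cpu, {"host": "serverB", "region": "west"}, {"idle": 9.1, "system": 12.5}, 2)

for name, column in cpu.items():
    print(name, column)
# host ['serverA', 'serverB']
# idle [1.667, 9.1]
# time [1, 2]
# region [None, 'west']
# system [None, 12.5]
```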
  33. But it’s a start. And we know that we can switch to Parquet as our persistence format without any fear of some sort of data explosion.
  34. We break data up into partitions. How data is partitioned can change over time, because each partition is self describing in terms of the summary metadata that specifies what tables it has, what columns each of those tables has, and what the summary information is for each of those columns like min, max, count, sum and potentially even bloom filters for identifiers. This summary data is used by the planner at query time. Partition summaries are kept in memory and the query is analyzed to determine which partitions need to be queried to produce a result. Once in a partition, we brute force query against it, and if we have it in our segment store, that happens against compressed data without decompressing it. That is, we perform late materialization and only decompress the values we use. This means that the partitioning scheme you choose has great impact on what your queries look like. This is why we let the users define it when they create a database/bucket. It can change on a per-database basis.
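A minimal Python sketch of the pruning step described in the note above: each partition carries per-column summaries, and the planner keeps only partitions whose min/max range could satisfy the predicate. The data structures and names (`ColumnSummary`, `prune`) are assumptions for illustration and do not reflect the actual IOx planner.

```python
from dataclasses import dataclass


@dataclass
class ColumnSummary:
    min_value: float
    max_value: float
    count: int


def summarize(values):
    """Compute the summary kept in memory for one column of one partition."""
    return ColumnSummary(min(values), max(values), len(values))


def may_contain(summary, low, high):
    """True if the range [low, high] overlaps the column's [min, max] range."""
    return summary.min_value <= high and summary.max_value >= low


def prune(partitions, column, low, high):
    """Return keys of partitions that could hold rows with low <= column <= high."""
    return [
        key
        for key, summaries in partitions.items()
        if column in summaries and may_contain(summaries[column], low, high)
    ]


partitions = {
    "p0": {"time": summarize([0, 50, 99]), "duration_ms": summarize([3, 8])},
    "p1": {"time": summarize([100, 150, 199])},
}
print(prune(partitions, "time", 120, 130))      # ['p1']
print(prune(partitions, "time", 90, 120))       # ['p0', 'p1']
print(prune(partitions, "duration_ms", 0, 10))  # ['p0']
```

Only the surviving partitions are then scanned by brute force, so a partitioning scheme that lines up with common predicates keeps that scan small.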
  35. We can likely do better. We’re using RLE for the span IDs and trace IDs and we’d be better just going with dictionary without the RLE.
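The sketch below illustrates in Python why run-length encoding (RLE) adds little on top of dictionary encoding for identifier columns such as trace IDs. It is a toy model with hypothetical helper names, not the encoding used in IOx: RLE pays off when equal values sit next to each other, which is common for a tag like region but rare for IDs.

```python
def dictionary_encode(values):
    """Map each distinct value to a small integer code."""
    codes, dictionary = [], {}
    for value in values:
        codes.append(dictionary.setdefault(value, len(dictionary)))
    return list(dictionary), codes


def run_length_encode(codes):
    """Collapse consecutive repeats into [code, run_length] pairs."""
    runs = []
    for code in codes:
        if runs and runs[-1][0] == code:
            runs[-1][1] += 1
        else:
            runs.append([code, 1])
    return runs


regions = ["west"] * 4 + ["east"] * 4
trace_ids = ["a1", "b2", "a1", "c3", "b2", "c3"]

for name, column in (("region", regions), ("trace_id", trace_ids)):
    dictionary, codes = dictionary_encode(column)
    runs = run_length_encode(codes)
    print(name, "codes:", len(codes), "runs:", len(runs))
# region codes: 8 runs: 2
# trace_id codes: 6 runs: 6
```

For the trace ID column every run has length one, so the RLE form stores two integers per row instead of one and is larger than the plain dictionary codes.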
  36. Notice that we have time in this example. If you’re looking up by some trace ID, where’d you get it? From a log line? You’ll have a timestamp associated with it. Use it. If you’re partitioning your data by time, and in most cases this will be at least one of the criteria by which you partition your data, you can quickly narrow down the blocks of data to query against. If you have 2h partitions, then you’ll be able to find the spans you’re looking for by querying at most 2 partitions.
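The "at most 2 partitions" point can be sketched as follows, assuming fixed two-hour partitions aligned to the Unix epoch and a search window around the log line's timestamp that is shorter than the partition width. The function names and the one-minute margin are assumptions for this example.

```python
TWO_HOURS_NS = 2 * 60 * 60 * 1_000_000_000
SECOND_NS = 1_000_000_000


def partition_start(timestamp_ns, width_ns=TWO_HOURS_NS):
    """Start time of the fixed-width partition containing the timestamp."""
    return timestamp_ns - timestamp_ns % width_ns


def partitions_for_window(start_ns, end_ns, width_ns=TWO_HOURS_NS):
    """Start times of every partition overlapping [start_ns, end_ns]."""
    first = partition_start(start_ns, width_ns)
    last = partition_start(end_ns, width_ns)
    return list(range(first, last + 1, width_ns))


base = 1492214400 * SECOND_NS          # falls exactly on a 2h boundary
log_time = base + 7190 * SECOND_NS     # 10 seconds before the next boundary
margin = 60 * SECOND_NS                # search one minute either side

hits = partitions_for_window(log_time - margin, log_time + margin)
print(len(hits))  # 2: the window straddles one partition boundary
```

Any window shorter than the partition width can cross at most one boundary, so it never touches more than two partitions.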
  37. This returns the 10 rows in about 85 milliseconds. If you do the rough math on this it means it was able to brute force on about 1.1B rows/sec. Note that we didn’t actually process all those rows. It was operating on compressed data. We can likely get this down by a bit more by removing the RLE compression for trace ID and span. Maybe another 2x improvement. The specifics of the compressed in memory columnar store will definitely be the subject of some future tech talks.
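The rough math in the note above can be reproduced directly. The row count below is implied by the two quoted figures (85 milliseconds and about 1.1B rows/sec); it is not stated in the notes themselves.

```python
ROWS_PER_SECOND = 1_100_000_000  # scan rate quoted in the note
ELAPSED_MS = 85                  # query latency quoted in the note

# rows = rate * time, with the time converted from milliseconds to seconds
rows_scanned = ROWS_PER_SECOND * ELAPSED_MS // 1000
print(f"{rows_scanned:,} rows")  # 93,500,000 rows
```

The two figures are therefore consistent with a scan over a table of roughly 90 to 95 million rows.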
  38. Here’s what I think the real future is. The example I just showed takes a data center centric view. It assumes that all your data is getting pushed up to some central cluster. I think the future is federated. It operates at the edges as single nodes, it operates in factories in small clusters, and it operates in many data centers worldwide. You’ll likely have high precision data that doesn’t make sense to replicate in full up to a central place. Or at least you’ll only replicate it in highly compressed form. The future distributed time series system isn’t a cluster that runs in a data center, even if it has rack aware capabilities and multi-region routing. There’s no limit to the scale of time series data that we’ll be collecting over the coming decades. We need flexibility in how it’s replicated, queried, and stored.
  39. * Created InfluxDB because we saw so many people re-inventing the wheel and we wanted Influx to be the basis of it * However, the lack of distributed features left a gap in the market * Infrastructure projects that fall under source available or community licenses severely limit the audience and what you can build * InfluxDB IOx is dual-licensed under MIT and Apache 2 as is common in the Rust community. No community license, no source available license, no restrictions. You can build new projects using this code, you can build new businesses using this code, you can do whatever you want with it.
  40. Conway’s law says that you ship your org chart. That is if you create two teams to build a system, you’ll get a system comprised of two parts. I propose Dix’s maxim as it relates to open source and licensing generally, which is that your licensing strategy is your commercialization strategy, whether by accident or design. The architecture approaches for IOx are deliberate choices because of not only the functionality and operational properties we wanted in the system, but also in how we plan to commercialize it.
  41. InfluxDB IOx is designed to be a shared-nothing server that has an API giving the operator total control over how it behaves. However, the operator must make those changes as they are needed. Who does this operation and coordination? In the most simple setups of a single server, you don’t worry about it. In two server setups you can likely get by with shell scripts and a cron job. But the more complex your environment becomes, the more complicated this coordination becomes. It was a design goal for us to separate out the core database work from the operational work across a fleet of servers. We will create this software for our own needs to operate our cloud environment. However, our cloud may be different than yours. Your environment may be different. This is why the operational coordination is kept separate. So there is maximum flexibility in topology and configuration. We plan to run the InfluxDB IOx open source bits as is in our own cloud. We won’t be running a fork, we’ll be running right off the main branch.
  42. At the beginning of this talk I mentioned my introduction of InfluxDB to the world. And I titled it this.
  43. I’ll be giving more talks about InfluxDB IOx over the coming months. But here’s how I’m thinking about it. Yes, it’s a distributed time series database. But it’s a lot more than just that.
  44. It’s federated and this is a core part of its design. With time series and analytics data, the future is federated. The scale is larger than you’ll want to manage and push up to a single cluster. You’ll have edge, multiple data centers, and many thousands of potential nodes all communicating with each other.