The Myth of Cassandra I’ve had it with these crazed oracles NoSQL Series Cameron Kilgore | @thrillgore
Cas·san·dra [kəˈsændrə], noun. [Classical Greek Mythology.] A daughter of Priam and Hecuba, a prophet cursed by Apollo so that her prophecies, though true, were fated never to be believed. [fml. “Apache Cassandra”] An open-source, distributed, non-relational (NoSQL) database developed at Facebook, written in Java, and maintained as an Apache Software Foundation project.
What Cassandra Does: Nonrelational associative array (key-value) data storage. Distributed. One-hop DHT (akin to Amazon Dynamo). Eventually consistent. Column-based storage. Queries faster than MySQL (based on white papers and real-world use cases). Fault tolerant, with no single point of failure. Load balancing.
What Cassandra Does Not Do: Keep revision history. Store relational data (there’s this thing called “MySQL” that might be just up your alley). Provide an admin app (Chiton is an in-development desktop app: http://github.com/driftx/chiton). Store individual data fields greater than 2^31 - 1 (2,147,483,647) bytes. Provide any interfaces outside of Thrift or high-level interfaces.
She who entangles companies: Already in use at Facebook. Also being used at Digg, Reddit, Twitter, Rackspace, Cisco, IBM, Cloudkick, OpenX, and more…
Introducing Cassandra: Understanding the concepts of data in Cassandra, scalability
Columns and Data: Data is stored in columns, each organized by keyspaces. Each column stores data and can be looked up by its name value, akin to an associative array. Fields: +name: byte[], +value: byte[], +timestamp: long.
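A minimal Python sketch of the name/value/timestamp triple on this slide. The `Column` class, the `get_value` helper, and the sample HVAC data are hypothetical and exist only for illustration; they are not part of Cassandra or of any client library. The only point is that a row behaves like an associative array keyed by column name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Illustrative stand-in for the name/value/timestamp triple."""
    name: bytes
    value: bytes
    timestamp: int  # a long in Java; a plain int here


# A row modelled as an associative array: column name -> column.
# The sample names and values are made up for this sketch.
row = {
    b"model": Column(b"model", b"XR-200", 1),
    b"voltage": Column(b"voltage", b"240", 1),
}


def get_value(row, name):
    """Return the value stored under a column name, or None if absent."""
    column = row.get(name)
    return column.value if column is not None else None
```

Calling `get_value(row, b"model")` returns the stored bytes, and a name that was never written simply yields `None`.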
Supercolumns: What happens when Xzibit uses Cassandra. Supercolumns allow you to nest n columns in another column, and in turn you can nest n supercolumns in a key. (Not shown here due to Office fail.)
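Since the diagram did not survive, the nesting can be sketched with plain Python dictionaries. This is only an illustration of the shape (key, then supercolumn, then column); the names and values are invented and no Cassandra API is involved.

```python
# Hypothetical data for one key: supercolumn name -> column name -> value.
row = {
    "compressor": {"model": "XR-200", "voltage": "240"},
    "thermostat": {"model": "T-10"},
}


def get_subcolumn(row, supercolumn, column):
    """Look up one nested column; return None when either level is missing."""
    return row.get(supercolumn, {}).get(column)
```

A lookup walks two levels: first the supercolumn name, then the column name inside it.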
Anatomy of a Column: Cassandra is written in Java, so we abide by the rules of its variables. Most of them will be bytestrings (byte[]), set in Unicode. +timestamp is the only value not stored as a bytestring; it is a long instead. Cassandra compares the +timestamp across nodes to reconcile data. It is NOT used for revision history. Each column is represented by an unseen UUID.
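The timestamp comparison can be sketched as "highest timestamp wins". This is a simplified illustration, assuming each node reports a (value, timestamp) pair for the same column; the `reconcile` function is hypothetical, and tie-breaking and the rest of Cassandra's actual reconciliation logic are not modelled.

```python
def reconcile(replicas):
    """Pick the (value, timestamp) pair with the highest timestamp."""
    return max(replicas, key=lambda pair: pair[1])


# Hypothetical copies of one column as seen on three nodes.
replicas = [(b"old", 100), (b"new", 250), (b"older", 50)]
winner = reconcile(replicas)
```

Only the winning pair is kept, which is also why the timestamp cannot double as a revision history: the losing values are discarded.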
Anatomy of a Column (cont.): Columns are found by their +name value, not their UUID. You cannot have multiple columns of the same name (assigning one with the same name overwrites the existing one in that keyspace).
Accessing the Data: Data is accessed through the Apache Incubator™ Thrift API. Thrift can be used from many programming languages and applications, and high-level implementations exist for a number of languages. For our demos we’re going to use the cassandra-cli client, which gives us the ability to insert/remove/edit.
<INSERT CALL TO DEMO HERE> OH GOD HOW DID I GET HERE I AM NOT GOOD WITH COMPUTER
Security in Cassandra: Cassandra does have user authentication through a SimpleAuthenticator module that is configured in conf files. Very rudimentary. Ran out of time and suitable documentation to demonstrate it. Cassandra is not ACID-compliant.
Load Balancing: Cassandra 0.6 has load balancing capabilities. Not automatic; must be configured per node. Load is shared in a token-ring fashion across the nodes in a multi-node configuration. Covered in the documentation for Cassandra.
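A rough sketch of the token-ring idea, under stated assumptions: each node is given a token, a key is hashed to a position on the ring, and the key belongs to the first node whose token is at or after that position, wrapping around at the end. The ring size, the MD5-based `token_for` helper, and the node names are assumptions made for the illustration; this is not Cassandra's partitioner code.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 127  # assumed ring size for this sketch


def token_for(key: bytes) -> int:
    """Hash a key to a position on the ring (MD5 is an assumption here)."""
    return int.from_bytes(hashlib.md5(key).digest(), "big") % RING_SIZE


def owner(ring, token):
    """Return the node owning a token; ring is a list of (token, node) sorted by token."""
    tokens = [t for t, _ in ring]
    # First node whose token is >= the key's token; wrap to the first node.
    index = bisect.bisect_left(tokens, token) % len(ring)
    return ring[index][1]


# Three hypothetical nodes with evenly spaced, manually assigned tokens.
ring = sorted([
    (0, "node-a"),
    (RING_SIZE // 3, "node-b"),
    (2 * RING_SIZE // 3, "node-c"),
])
```

Evenly spaced tokens give each node roughly a third of the ring, which is why the tokens have to be chosen per node rather than left to chance.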
Monitoring Cassandra: Cassandra exposes metrics as JMX data, so any JMX monitoring app should be sufficient: Nagios, Munin, OpenNMS, or any official Oracle™ Java monitoring and administration software. What? I can’t be bothered to search for the name of the software? Cassandra also has software for monitoring node activity; check the docs.
Use Case Example: And a very simple one at that
Product Ordering Application: An ordering application implemented using a SQL database could span hundreds of tables and require constant iteration over its lifespan. What if the attributes of these products (in this case, HVAC components) were stored in Cassandra, and we kept pricing, user, and session data in an RDBMS?
Benefits to Cassandra: Product data that might need to be added won’t require new RDBMS fields; we can just add it in new columns and write our code to ignore those columns if they aren’t there. We aren’t limited by bottlenecks in the RDBMS if we choose to go multinode in our Cassandra setup. No single point of failure if we choose to go multinode. If we get a lot of users (unlikely), the nodes will distribute the load equally. Less time spent on queries, depending on how effectively our data is stored and on the performance of our application.
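The "ignore them if they aren’t there" point can be sketched in application code. This is an illustration only: product records are modelled as plain dictionaries, and the `describe` function, the `refrigerant` attribute, and the sample products are hypothetical.

```python
def describe(product):
    """Build a display string, skipping optional columns that are absent."""
    parts = [product["name"]]
    refrigerant = product.get("refrigerant")  # newer, optional column
    if refrigerant is not None:
        parts.append("refrigerant " + refrigerant)
    return ", ".join(parts)


# An older record written before the new column existed, and a newer one.
old_record = {"name": "Condenser A"}
new_record = {"name": "Condenser B", "refrigerant": "R-410A"}
```

Both records pass through the same code path, so adding the attribute needs no migration of the older data.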
Downsides to Cassandra: We may not have the funding needed to procure a multinode configuration. No guarantee that existing data can be reconfigured over time to meet the demands of sales, engineering, executives, etc. Data is collected and given some form of relation inside the application itself, with no schema. Cassandra lacks a vetted security framework, which could put us at risk. Cassandra also lacks a complete administration application; Chiton is barely functional as-is. Might not make sense when some RDBMSs can scale across machines.
A (crude) data map showing our data in practice
Cassandra and PHP: This is a PHP user group, after all.
Talking to Cassandra: The low-level framework, Thrift, is the actual client API for Cassandra. In PHP we have two frameworks that work through Thrift: phpcassa and Pandra. Ran out of time to prepare a demo. There’s always another time for a demo. Stay tuned.
Any Questions? You will be baked, and there will be cake

Moving Beyond Passwords: FIDO Paris Seminar.pdf
 

The Myth of Cassandra: Understanding Cassandra's Data Model and Use Cases

  • 1. The Myth of Cassandra I’ve had it with these crazed oracles NoSQL Series Cameron Kilgore | @thrillgore
  • 2. Cas·san·dra[kəˈsændrə], noun [Classical Greek Mythology.] A daughter of Priam and Hecuba, a prophet cursed by Apollo so that her prophecies, though true, were fated never to be believed. [fml. “Apache Cassandra”] An open-source distributed, non-relational (NoSQL) database developed at Facebook, written in Java, and maintained as an Apache Software Foundation product
  • 3. What Cassandra does Nonrelational associative array (key-value) data storage Distributed One-hop DHT (akin to Amazon Dynamo) Eventually Consistent Column-based storage Queries faster than MySQL Based on white papers and real-world use cases Fault tolerant Provides no single point of failure Load balancing
  • 4. What Cassandra Does Not Revision History Relational Data There’s this thing called “MySQL” that might be just up your alley Provide an admin app Chiton is an in-development desktop app http://github.com/driftx/chiton Store individual data fields greater than 2^31-1 (2,147,483,647) bytes Provide any interfaces outside of Thrift or high-level interfaces
  • 5. She who entangles companies Already in use at Facebook Also being used at: Digg Reddit Twitter Rackspace Cisco IBM Cloudkick OpenX And more…
  • 6. Introducing Cassandra Understanding the concepts of data in Cassandra, scalability
  • 7. Columns and Data Data is stored in columns, each organized by keyspaces Each column stores data and can be culled based on its name value, akin to an associative array +name: byte[] +value: byte[] +timestamp: long
  • 8. Supercolumns What happens when Xzibit uses Cassandra Supercolumns allow you to nest n number of columns in another column And in return in a key you can nest n number of supercolumns. (not shown here due to Office fail)
  • 9. Anatomy of a Column Cassandra is written in Java, so we abide by the rules of its variables Most of them will be bytestrings (byte[]), set in Unicode +time being the only value not stored as a bytestring, instead as a long Java compares the +time across other Cassandra nodes to reconcile data across nodes Is NOT used for revision history Each column represented by an unseen UUID
  • 10. Anatomy of a Column (cont.) Columns are found by their +name value, not their UUID You cannot have multiple columns of the same name (assigning one with the same name rewrites an existing one in that given keyspace)
  • 11. Accessing the Data Data accessed through the Apache Incubator™ Thrift API Thrift can be accessed with any programming language or application High-level implementations for languages exist For our demos we’re going to use the cassandra-cli client, which gives us the ability to insert/remove/edit
  • 12. <INSERT CALL TO DEMO HERE> OH GOD HOW DID I GET HERE I AM NOT GOOD WITH COMPUTER
  • 13. Security in Cassandra Cassandra does have user authentication through a SimpleAuthenticator module that is configured in conf files Very rudimentary Ran out of time and suitable documentation to demonstrate it Cassandra is not ACID-compliant
  • 14. Load Balancing Cassandra 0.6 has load balancing capabilities Not automatic, must be configured per node Load is shared in a token-ring fashion across the nodes in a multi-node configuration Covered in the documentation for Cassandra
  • 15. Monitoring Cassandra Cassandra exposes metrics as JMX data, so any JMX monitoring app should be sufficient. Nagios Munin OpenNMS Any official Oracle™ Java monitoring and administration software What? I can’t be bothered to not search for the name of the software? Cassandra also has software for monitoring node activity, check the docs
  • 16. Use Case Example And a very simple one at that
  • 17. Product Ordering Application An ordering application implemented using a SQL database could span hundreds of tables and require constant iterations over its lifespan What if the attributes of these products (in this case, HVAC components) were stored in Cassandra, and we kept pricing, users, and sessions data in a RDBMS?
  • 18. Benefits to Cassandra The data for these products that might need to be added won’t require new RDBMS fields – we can just add them in new columns and write our code statements to ignore them if they aren’t there We aren’t limited to bottlenecks in the RDBMS if we choose to go multinode in our Cassandra setup No single point of failure if we choose to go multinode If we get a lot of users (unlikely), the nodes will equally distribute the load Less time spent on queries Depends on how effective our data is stored and the performance of our application
  • 19. Downsides to Cassandra We may not have the funding needed to procure a multinode configuration No guarantee that existing data will not need to be reconfigured over time to meet the demands of sales, engineering, executives, etc. Data is collected and given some form of relation inside the application itself, with no schema Cassandra lacks a vetted security framework, which could put us at risk Cassandra also lacks a complete administration application Chiton is barely functional as-is Might not make sense when some RDBMS can scale across machines
  • 20. A (crude) data map showing our data in practice
  • 21. Cassandra and PHP This is a PHP User group after all.
  • 22. Talking to Cassandra Low-level framework, Thrift, is the actual client API for Cassandra In PHP we have two such frameworks that work through Thrift phpcassa Pandra Ran out of time to prepare a demo There’s always another time for a demo. Stay tuned.
  • 23. Any Questions? You will be baked, and there will be cake
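
Slides 7, 9, and 10 describe a column as a name, a value, and a timestamp, with the timestamp used only to reconcile copies across nodes and the name used to find the column. A minimal Python sketch of that idea follows. The `Column` class and the `reconcile` and `insert` functions are hypothetical names made up for illustration; they are not part of Cassandra or Thrift, and the sketch assumes a simple "newest timestamp wins" rule without modeling deletions or any other detail of the real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Illustrative stand-in for a Cassandra column (not the real API)."""
    name: bytes
    value: bytes
    timestamp: int


def reconcile(a: Column, b: Column) -> Column:
    """Pick the copy with the newer timestamp (ties keep the first argument)."""
    if a.name != b.name:
        raise ValueError("can only reconcile copies of the same column")
    return a if a.timestamp >= b.timestamp else b


def insert(row: dict, column: Column) -> None:
    """Columns are found by name, so writing an existing name replaces it."""
    row[column.name] = column


# Two nodes hold different copies of the same column.
node_a = Column(b"finish", b"anodized", timestamp=100)
node_b = Column(b"finish", b"painted", timestamp=250)
winner = reconcile(node_a, node_b)

# Inserting the same name twice leaves a single column in the row.
row = {}
insert(row, node_a)
insert(row, winner)
```

Because the older copy is simply discarded, nothing here keeps a revision history, which matches the point made on slide 9.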

Editor's Notes

  1. © 2010 Cameron Kilgore. Distribution permitted under CC-BY-ND 3.0
  2. “Eventually Consistent” means that updates propagate to all Cassandra nodes over time, so every node eventually holds the same data.
  3. Chiton is a PyGTK app and I’ve only managed to run it in an Arch install, so it’s pretty useless if you work in Mac OS X or Windows. I remember another app being discussed in the #cassandra IRC channel, but I can’t recall its name or if it’s web-based.
  4. Reddit has a pretty interesting white paper on their usage of Cassandra in place of memcacheDB. I’ll post a link on my twitter account, @thrillgore later.
  5. The timestamp is used by Cassandra to reconcile differences across nodes, not to store revisions. The timestamp itself is autogenerated from Unix time(). Due to Cassandra’s Java heritage, the name and value are stored as byte arrays.
  6. Essentially, you enable the module and then you create conf files for each user you want accessing the database. By default it uses AllowAllAuthenticator, which means there isn’t any authentication and anyone can connect to it.
  7. Just a quick description of what’s going on here: we’re storing a kind of HVAC component, in this case a Louver, in a keyspace specifically for Louvers. Inside, for each individual Louver kind, we have a supercolumn that stores columns representing the attributes of that louver. For each additional louver, another supercolumn can be created. This approach gives us more control over the (dare I say it) relation of data, using a spatial methodology rather than a tabular one.
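
The layout in note 7 can be pictured as nested maps: a keyspace for louvers, one supercolumn per louver kind, and attribute columns inside each supercolumn. The sketch below uses plain Python dictionaries to show that shape, along with the "ignore attributes that aren't there" approach from slide 18. Every name and value in it (`louvers`, `model_a`, `width_mm`, `get_attribute`, `describe`) is invented for illustration, and it does not call Cassandra or any client library.

```python
from typing import Optional

# Keyspace for louvers: one supercolumn per louver kind, each holding
# attribute columns. All names and values are made up for illustration.
louvers = {
    "model_a": {"width_mm": b"600", "material": b"aluminum"},
    "model_b": {"width_mm": b"900", "material": b"steel", "finish": b"painted"},
}


def get_attribute(keyspace: dict, kind: str, name: str) -> Optional[bytes]:
    """Return an attribute value, or None when the column is absent."""
    return keyspace.get(kind, {}).get(name)


def describe(keyspace: dict, kind: str) -> str:
    """Build a description that skips attributes a louver does not have."""
    parts = []
    for name in ("width_mm", "material", "finish"):
        value = get_attribute(keyspace, kind, name)
        if value is not None:
            parts.append(f"{name}={value.decode('utf-8')}")
    return f"{kind}: " + ", ".join(parts)


summary_a = describe(louvers, "model_a")
summary_b = describe(louvers, "model_b")
```

Adding a new attribute to one louver kind only means adding a column to that supercolumn; the other kinds are untouched, and the reading code skips whatever is missing.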