Cassandra+Hadoop

A presentation on recent changes to Cassandra that allow it to be used with Hadoop's MapReduce and Pig.

  1. CASSANDRA + HADOOP
  2. Two Aspects: MapReduce; Pig
  3-6. MR + Cassandra - History: Writing to Cassandra has always been possible; Cassandra 0.6.x enables reading data; uses its own InputSplit, InputFormat, RecordReader
  7. Why MR + Cassandra? Cassandra is a great data store, but what about analytics? MapReduce! Arguable win over MapReduce + HBase: no SPOF
  8-15. Setup and Configuration: Job/Task Trackers (on already established cluster; overlays Cassandra cluster; hybrid); Locality (gives data’s host information to job tracker; configure both topologies - Cassandra + Hadoop)
  16. A Separate Cluster
  17-18. A Complete Overlay: Separate Job Tracker; Task Trackers collocated with Cassandra nodes - Bonus - Data locality!
  19-20. A Hybrid Cluster: Task Trackers on Cassandra nodes; integrate w/Cluster - Bonus - Data locality
  21. Tutorial: contrib/word_count example
  22. Pig + Cassandra: contrib/pig - a Cassandra-specific storage backing; requires latest Pig - 0.7
  23-27. Future Work: Simple output to Cassandra - Cassandra-1101 (OutputFormat, OutputReducer, OutputWriter); Hive support - Cassandra-913; Optimizations for start/end row - Cassandra-1125; Other refinements based on feedback
  28. Questions... jeromatron on twitter; jeromatron on #cassandra channel on freenode irc; jeremy (dot) hanna (at) rackspace (dot) com
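
The locality point in the slides (the input format gives each split's host information to the job tracker, and task trackers collocated with Cassandra nodes can then read local data) can be illustrated with a small sketch. This is not Hadoop's scheduler or Cassandra's input format: `assign_splits`, the host names, and the least-loaded tie-breaking rule are all hypothetical, chosen only to show why collocating task trackers with Cassandra nodes gives data locality.

```python
from collections import defaultdict


def assign_splits(splits, task_trackers):
    """Assign each input split to a task tracker, preferring data-local hosts.

    Hypothetical illustration only (not a Hadoop or Cassandra API).

    splits: dict mapping a split id to the list of hosts holding its data.
    task_trackers: list of hosts that run a task tracker.
    Returns a dict mapping split id -> (chosen host, is_local).
    """
    load = defaultdict(int)          # number of splits already given to each host
    trackers = set(task_trackers)
    assignment = {}
    for split_id, replica_hosts in sorted(splits.items()):
        # Hosts that both hold the data and run a task tracker.
        local = [h for h in replica_hosts if h in trackers]
        # Fall back to any task tracker when no data-local one exists.
        candidates = local if local else task_trackers
        # Assumed tie-break: least-loaded host, then host name.
        host = min(candidates, key=lambda h: (load[h], h))
        load[host] += 1
        assignment[split_id] = (host, bool(local))
    return assignment


if __name__ == "__main__":
    splits = {
        "s1": ["cass1", "cass2"],
        "s2": ["cass2", "cass3"],
        "s3": ["cass4"],             # no task tracker runs on cass4
    }
    print(assign_splits(splits, ["cass1", "cass2", "cass3"]))
```

In this sketch the first two splits land on hosts that already hold their data, while the third has to be read remotely because no task tracker runs on its host. This is the situation a separate Hadoop cluster would be in for every split.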
