HBase @ Salesforce
Lars Hofhansl
Architect, Father, Meditator, Aikido Black Belt
http://hadoop-hbase.blogspot.com
Safe harbor statement under the Private Securities Litigation Reform Act of 1995:
This presentation may contain forward-looking statements that involve risks, uncertainties, and assumptions. If any such uncertainties
materialize or if any of the assumptions proves incorrect, the results of salesforce.com, inc. could differ materially from the results
expressed or implied by the forward-looking statements we make. All statements other than statements of historical fact could be
deemed forward-looking, including any projections of product or service availability, subscriber growth, earnings, revenues, or other
financial items and any statements regarding strategies or plans of management for future operations, statements of belief, any
statements concerning new, planned, or upgraded services or technology developments and customer contracts or use of our services.
The risks and uncertainties referred to above include – but are not limited to – risks associated with developing and delivering new
functionality for our service, new products and services, our new business model, our past operating losses, possible fluctuations in our
operating results and rate of growth, interruptions or delays in our Web hosting, breach of our security measures, the outcome of any
litigation, risks associated with completed and any possible mergers and acquisitions, the immature market in which we operate, our
relatively limited operating history, our ability to expand, retain, and motivate our employees and manage our growth, new releases of our
service and successful customer deployment, our limited history reselling non-salesforce.com products, and utilization and selling to
larger enterprise customers. Further information on potential factors that could affect the financial results of salesforce.com, inc. is
included in our annual report on Form 10-K for the most recent fiscal year and in our quarterly report on Form 10-Q for the most recent
fiscal quarter. These documents and others containing important disclosures are available on the SEC Filings section of the Investor
Information section of our Web site.
Any unreleased services or features referenced in this or other presentations, press releases or public statements are not currently
available and may not be delivered on time or at all. Customers who purchase our services should make the purchase decisions
based upon features that are currently available. Salesforce.com, inc. assumes no obligation and does not intend to update these
forward-looking statements.
Safe Harbor
Why HBase?
• SAN
• RDBMS
• Transactions
Zookeeper?
Commodity Hardware?
HBase?
HDFS?
Unstructured Data?
A. Why HBase?
B. Interacting with the open source community
C. HBase at Salesforce
Size Matters*
New Salesforce customer:
•“How many rows do you have?”
•We will turn folks away if they have too many!
Data Storage is expensive:
•SAN storage
•Relational Database
•Too many rows → Too expensive
* In a relational world
What if in the future we:
… and have cheaper storage?
… and never need to ask again about the number of rows?
… grow with the data by just adding more machines?
(Disclaimer: no transactions, no joins, no secondary indexes, …)
(A quick note about) Relational Databases
• We love them. They are core to our infrastructure.
• SQL and NoSQL (NoACID) are complementary.
• (Almost) everything we do is SQL based (see Phoenix – the SQL layer for HBase.)
The Search - Requirements
• Consistent
– “Eventually consistent stores are 100% consistent 99% of the time” – Ian Varley
• Scalable
– No “features” impeding horizontal scaling
• Persistent
– Duh...?
• Key lookups
• Range lookups
• Open source (ASL great, GPLv2 OK, GPLv3/AGPL not acceptable)
Enter HBase
“A Sparse, Consistent, Distributed,
Multidimensional, Persistent, Sorted Map”
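The "sorted map" part of that definition is what makes both key lookups and range lookups cheap. A minimal in-memory sketch, assuming nothing about HBase internals (the SortedMap class and the tenant-prefixed row keys are hypothetical, for illustration only):

```python
import bisect
from typing import Iterator, List, Optional, Tuple


class SortedMap:
    """Toy sorted map: keys are kept in byte order, as in a sorted store."""

    def __init__(self) -> None:
        self._keys: List[bytes] = []
        self._values: List[bytes] = []

    def put(self, key: bytes, value: bytes) -> None:
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            self._values[i] = value  # overwrite an existing key
        else:
            self._keys.insert(i, key)
            self._values.insert(i, value)

    def get(self, key: bytes) -> Optional[bytes]:
        """Key lookup: binary search for an exact match."""
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._values[i]
        return None

    def scan(self, start: bytes, stop: bytes) -> Iterator[Tuple[bytes, bytes]]:
        """Range lookup: yield pairs with start <= key < stop, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        for i in range(lo, hi):
            yield self._keys[i], self._values[i]


table = SortedMap()
table.put(b"tenantB/row1", b"x")
table.put(b"tenantA/row2", b"y")
table.put(b"tenantA/row1", b"z")

# b"0" is the byte right after b"/", so this range covers every key
# that starts with b"tenantA/".
tenant_a_rows = list(table.scan(b"tenantA/", b"tenantA0"))
```

Because keys sort by prefix, all rows that share a prefix sit next to each other, so one contiguous scan returns them.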
Salesforce and the HBase Community
To Fork or not to Fork – that is the question
Fork - pros
• Agility. No waiting for community review. Just get stuff done
• Freedom. Patches that might not be acceptable to the community
Fork - cons
• Lose out on community work
• Patches not useful to other parties
There is no right or wrong. It’s a matter of choice, taste, and requirements.
HBase Development @ Salesforce
• No fork of HBase.
• Internal HBase/HDFS branch for possible emergency fixes
• All fixes are cleaned and contributed back
• We switch to the next open source point release periodically
PMC member, 2 committers, release manager, contributors
HBASE-11042 HBASE-11040 HBASE-11037 HBASE-11030 HBASE-11029 HBASE-11024 HBASE-11022
HBASE-11010 HBASE-10996 HBASE-10989 HBASE-10988 HBASE-10987 HBASE-10982 HBASE-10969 HBASE-10847
HBASE-10805 HBASE-10722 HBASE-10706 HBASE-10642 HBASE-10594 HBASE-10562 HBASE-10551
HBASE-10546 HBASE-10505 HBASE-10501 HBASE-10489 HBASE-10470 HBASE-10420 HBASE-10416
HBASE-10383 HBASE-10363 HBASE-10320 HBASE-10317 HBASE-10286 HBASE-10284 HBASE-10281
HBASE-10279 HBASE-10259 HBASE-10257 HBASE-10250 HBASE-10181 HBASE-10117 HBASE-10076
HBASE-10058 HBASE-10057 HBASE-10015 HBASE-9993 HBASE-9971 HBASE-9956 HBASE-9915
HBASE-9865 HBASE-9834 HBASE-9807 HBASE-9799 HBASE-9789 HBASE-9778 HBASE-9751 HBASE-9749
HBASE-9732 HBASE-9731 HBASE-9711 HBASE-9658 HBASE-9584 HBASE-9566 HBASE-9534 HBASE-9429
HBASE-9428 HBASE-9377 HBASE-9356 HBASE-9344 HBASE-9301 HBASE-9266 HBASE-9231 HBASE-9221
HBASE-9186 HBASE-9158 HBASE-9103 HBASE-9097 HBASE-9049 HBASE-8971 HBASE-8945 HBASE-8930
HBASE-8912 HBASE-8858 HBASE-8809 HBASE-8767 HBASE-8702 HBASE-8698 HBASE-8684 HBASE-8671
HBASE-8636 HBASE-8525 HBASE-8503 HBASE-8355 HBASE-8316 HBASE-8229 HBASE-8188 HBASE-8166
HBASE-8151 HBASE-8110 HBASE-8108 HBASE-8055 HBASE-8008 HBASE-7999 HBASE-7947 HBASE-7945
HBASE-7817 HBASE-7801 HBASE-7729 HBASE-7725 HBASE-7717 HBASE-7709 HBASE-7702 HBASE-7681
HBASE-7617 HBASE-7602 HBASE-7578 HBASE-7550 HBASE-7499 HBASE-7497 HBASE-7483 HBASE-7466
HBASE-7465 HBASE-7455 HBASE-7438 HBASE-7435 HBASE-7432 HBASE-7431 HBASE-7417 HBASE-7415
HBASE-7371 HBASE-7336 HBASE-7293 HBASE-7279 HBASE-7270 HBASE-7252 HBASE-7240 HBASE-7215
HBASE-7214 HBASE-7180 HBASE-7177 HBASE-7166 HBASE-7165 HBASE-7091 HBASE-7069 HBASE-7051
HBASE-7047 HBASE-7021 HBASE-7010 HBASE-6996 HBASE-6974
HBASE-6949 HBASE-6946 HBASE-6912 HBASE-6889 HBASE-6879 HBASE-6868 HBASE-6865 HBASE-6863
HBASE-6797 HBASE-6796 HBASE-6784 HBASE-6765 HBASE-6757 HBASE-6755 HBASE-6711 HBASE-6707
HBASE-6690 HBASE-6667 HBASE-6638 HBASE-6637 HBASE-6621 HBASE-6582 HBASE-6580 HBASE-6579
HBASE-6573 HBASE-6571 HBASE-6570 HBASE-6569 HBASE-6568 HBASE-6561 HBASE-6523 HBASE-6522
HBASE-6505 HBASE-6504 HBASE-6496 HBASE-6495 HBASE-6441 HBASE-6439 HBASE-6427 HBASE-6426
HBASE-6421 HBASE-6406 HBASE-6355 HBASE-6347 HBASE-6326 HBASE-6296 HBASE-6293 HBASE-6291
HBASE-6178 HBASE-6138 HBASE-6113 HBASE-6112 HBASE-6110 HBASE-6087 HBASE-5961 HBASE-5955
HBASE-5909 HBASE-5884 HBASE-5871 HBASE-5865 HBASE-5782 HBASE-5775 HBASE-5774 HBASE-5682
HBASE-5670 HBASE-5659 HBASE-5641 HBASE-5609 HBASE-5604 HBASE-5574 HBASE-5569 HBASE-5548
HBASE-5547 HBASE-5541 HBASE-5526 HBASE-5523 HBASE-5509 HBASE-5497 HBASE-5460 HBASE-5455
HBASE-5440 HBASE-5431 HBASE-5368 HBASE-5350 HBASE-5348 HBASE-5318 HBASE-5304 HBASE-5266
HBASE-5229 HBASE-5203 HBASE-5118 HBASE-5096 HBASE-5088 HBASE-5084 HBASE-5070 HBASE-5058
HBASE-5005 HBASE-5001 HBASE-4998 HBASE-4981 HBASE-4979 HBASE-4945 HBASE-4886 HBASE-4874
HBASE-4870 HBASE-4838 HBASE-4805 HBASE-4800 HBASE-4691 HBASE-4682 HBASE-4673 HBASE-4657
HBASE-4626 HBASE-4605 HBASE-4583 HBASE-4561 HBASE-4559 HBASE-4556 HBASE-4536 HBASE-4517
HBASE-4488 HBASE-4454 HBASE-4439 HBASE-4404 HBASE-4387 HBASE-4347 HBASE-4336 HBASE-4335
HBASE-4334 HBASE-4331 HBASE-4296 HBASE-4283 HBASE-4263 HBASE-4242 HBASE-4241 HBASE-4197
HBASE-4178 HBASE-4171 HBASE-4102 HBASE-4071 HBASE-3661 HBASE-3645 HBASE-3584 HBASE-3443
HBASE-3433 HBASE-3387 HBASE-2947 HBASE-2196 HBASE-2195 HDFS-3979 HDFS-744
Managing HBase 0.94
Established monthly release train for 0.94
Contributed >300 features, bug fixes, and perf improvements
Reviewed 1000s of open source patches
Committed 100s of patches
Open Sourced Apache Phoenix – SQL skin on HBase
Salesforce High-level Architecture
Salesforce *is* a database
Salesforce is a Database
[Diagram: Query (SQL) → Query Parser → Parsed Query → Query Optimizer (Plan Generator, Plan Cost Estimator) → Evaluation Plan → Query Plan Evaluator. Other labels: System Catalog, Database, Stats, Tables, Columns, Indexes.]
Salesforce is a Database
[Diagram: Query (SOQL) → Query Parser → Parsed Query → Query Optimizer (Plan Generator, Plan Cost Estimator) → Hinted Oracle SQL → Oracle. Other labels: System Catalog, Database, Stats, Objects, Fields, Indexes.]
Salesforce is multi tenant
[Diagram: pod (Tenant A-D), pod (Tenant E-H), pod (Tenant I-O), …]
pod = a database instance
•Oracle RAC
•AppServers
•Blob store servers
•Search servers
•Shared SAN storage
•SAN replication for DR
[Diagram: App Servers connect via SQL/JDBC to the Oracle RAC cluster (Oracle Nodes) on a SAN at the Primary Site; SAN replication copies it to a SAN at the Secondary Site.]
Finally: HBase @ Salesforce
[Diagram: Query (SOQL) → Query Parser → Parsed Query → Query Optimizer (Plan Generator, Plan Cost Estimator) → Hinted Oracle SQL → Oracle, with HBase reached through 1. External Objects and 2. Phoenix SQL. Other labels: System Catalog, Database, Stats, Objects, Fields, Indexes.]
Where does HBase Fit?
•Separate HBase per pod (close to 50 clusters)
•Logically co-located with Oracle
•Small clusters striped across five racks
•Each cluster’s master service on a different rack
•Identical cluster for DR
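One way to picture "striped across five racks" is round-robin placement, which bounds how many nodes a single rack failure can take out. This is only a sketch: the node names, rack names, and round-robin policy are assumptions, not details from the slides.

```python
from collections import Counter
from typing import Dict, List


def stripe_across_racks(nodes: List[str], racks: List[str]) -> Dict[str, str]:
    """Assign nodes to racks round-robin (hypothetical placement policy)."""
    return {node: racks[i % len(racks)] for i, node in enumerate(nodes)}


def worst_case_loss(placement: Dict[str, str]) -> int:
    """Largest number of nodes lost if any single rack fails."""
    return max(Counter(placement.values()).values())


racks = [f"rack{i}" for i in range(1, 6)]    # five racks, as on the slide
nodes = [f"node{i}" for i in range(1, 11)]   # cluster size is made up
placement = stripe_across_racks(nodes, racks)
lost = worst_case_loss(placement)
```

With ten nodes on five racks, no rack holds more than two nodes, so a rack failure removes at most a fifth of the cluster.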
[Diagram: at the Primary Site, App Servers reach the Oracle Cluster (Oracle Nodes, SAN) and the HBase Cluster (HBase Nodes), the latter over SQL/JDBC via Phoenix. Decentralized HBase replication feeds the DR HBase Cluster (HBase Nodes) at the Secondary Site, which also has a SAN.]
Use Cases
1. Audit Trails (Entity History)
• Identity managed in RDBMS
• Indexed in HBase (Phoenix indexes)
• Historical, immutable data only
• No need to reason about updates, split identities, and transactions
2. Archiving (Data Lifecycle Management)
• Objects (rows) moved to HBase
• Identity managed in HBase after move
• Data immutable in HBase
• No Transactions
3. Live data in HBase (BigObjects)
• Mutable data (possibly)
• Everything managed in HBase
• Still no Transactions, yet
• Platform for other team to use
Merrill Lynch Rationalization Data Governance, Audit & Archive
• First Salesforce Enterprise Customer
• On-platform archival compelling versus on-premise solution from Informatica
• Retention Requirements for 7 Years
Merrill Lynch
“Data Audit, Governance & Lifecycle management is
critical for Merrill; for the entire banking & financial
industry it has become a benchmark requirement”
Heating, ventilation, and air-conditioning in the EU
• Top 10 Platform Users
• Subject to highly variable data governance and
retention requirements
• Significant SAP footprint driving business rules –
need to connect that to Salesforce data for archival
and data retention needs
• Massive service workforce generates significant data
processing challenges
“The Salesforce.com Platform roadmap for Data Archive is
critical for future data management needs”
Michael Roehr, CTO Vaillant
BMW Enriches Their Customer Perspective
• Sales Cloud available across all German Dealership
Franchises
• All customer data subject to stringent, government-mandated
protection, audit & retention
• Correlations with Car Builder App data enables more
contextual customer interactions
• Car telemetry, used correctly, helps refine product
evolution and customer needs alignment
“Data driven customer engagement is a
key driver for our enhanced customer
experience”
System Of Record (SOR)
SOR = HA + DR + Backup + M&M + Security
Highly Available, Disaster Recovery
• Five-peer ZooKeeper quorum
• Five Quorum Journals (for fs edits)
• Five HMasters
• Three NameNodes (yes, three; we made a patch to run more than one standby)
• HBase Replication to identical hot standby pod in a different data center
– In the event of a disaster we fail a complete pod to the secondary site
• Weekly automated, unattended rolling restarts
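The choice of five peers for ZooKeeper and the quorum journals can be explained with majority arithmetic: both stay available only while a majority of peers is up. A small sketch of that calculation (the function names are hypothetical; HMasters and NameNodes use active/standby failover, so this does not apply to them):

```python
def majority(peers: int) -> int:
    """Smallest number of peers that forms a majority."""
    return peers // 2 + 1


def tolerated_failures(peers: int) -> int:
    """Peers that can fail while a majority is still reachable."""
    return peers - majority(peers)


for n in (3, 5):
    print(f"{n} peers: majority={majority(n)}, tolerated={tolerated_failures(n)}")
```

Five peers tolerate two simultaneous failures, so one peer can be down for a rolling restart while another fails unexpectedly. Three peers tolerate only one.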
Replication
Backup High-level Architecture
[Diagram: Primary pod (HBase 48h, HDFS, Backup per tenant) and DR pod (HBase 48h, HDFS, Backup per tenant), plus Merkle Tree Verification.]
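The slide names Merkle tree verification without saying how it is implemented. As a general illustration of the technique, two copies of a data set can be compared by hashing blocks pairwise up to a single root and comparing the roots. In the sketch below, the block contents, the use of SHA-256, and the odd-level padding rule are all assumptions:

```python
import hashlib
from typing import List


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(blocks: List[bytes]) -> bytes:
    """Hash blocks pairwise, level by level, until one root hash remains."""
    if not blocks:
        return _h(b"")
    level = [_h(b) for b in blocks]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last hash on odd levels
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


primary = [b"row1", b"row2", b"row3"]
backup_ok = [b"row1", b"row2", b"row3"]
backup_bad = [b"row1", b"rowX", b"row3"]

matches = merkle_root(primary) == merkle_root(backup_ok)
differs = merkle_root(primary) != merkle_root(backup_bad)
```

Equal roots mean the copies agree. When the roots differ, comparing subtree hashes narrows the mismatch to a specific block without shipping the full data between sites.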
Monitoring & Management (M&M)
• Nagios alerts
• Trending via OpenTSDB.
Custom UI on top of the time series data.
• Rolling upgrades
– Eventually scheduled and unattended
• Absolutely no unscheduled downtime.
Not even during a rack failure.
A. Why HBase?
B. Interacting with the open source community
C. HBase at Salesforce
Lars Hofhansl
http://hadoop-hbase.blogspot.com

Df14 Building Machine Learning Systems with Apex
 
Data Democracy: Use Lightning Connect & Heroku to Visualize any Data, Anywhere
Data Democracy: Use Lightning Connect & Heroku to Visualize any Data, AnywhereData Democracy: Use Lightning Connect & Heroku to Visualize any Data, Anywhere
Data Democracy: Use Lightning Connect & Heroku to Visualize any Data, Anywhere
 
Forcelandia 2016 Wave App Development
Forcelandia 2016   Wave App DevelopmentForcelandia 2016   Wave App Development
Forcelandia 2016 Wave App Development
 
Docker on Heroku のはじめ方
Docker on Heroku のはじめ方Docker on Heroku のはじめ方
Docker on Heroku のはじめ方
 
Finding relevant results faster with Elasticsearch
Finding relevant results faster with ElasticsearchFinding relevant results faster with Elasticsearch
Finding relevant results faster with Elasticsearch
 
Doc is Dead! How Walkthroughs Changed Salesforce's Content Strategy
Doc is Dead! How Walkthroughs Changed Salesforce's Content StrategyDoc is Dead! How Walkthroughs Changed Salesforce's Content Strategy
Doc is Dead! How Walkthroughs Changed Salesforce's Content Strategy
 
Loading Data into the Analytics Cloud with Apex
Loading Data into the Analytics Cloud with ApexLoading Data into the Analytics Cloud with Apex
Loading Data into the Analytics Cloud with Apex
 

More from Salesforce Engineering

Locker Service Ready Lightning Components With Webpack
Locker Service Ready Lightning Components With WebpackLocker Service Ready Lightning Components With Webpack
Locker Service Ready Lightning Components With WebpackSalesforce Engineering
 
Techniques to Effectively Monitor the Performance of Customers in the Cloud
Techniques to Effectively Monitor the Performance of Customers in the CloudTechniques to Effectively Monitor the Performance of Customers in the Cloud
Techniques to Effectively Monitor the Performance of Customers in the CloudSalesforce Engineering
 
Predictive System Performance Data Analysis
Predictive System Performance Data AnalysisPredictive System Performance Data Analysis
Predictive System Performance Data AnalysisSalesforce Engineering
 
Aspect Oriented Programming: Hidden Toolkit That You Already Have
Aspect Oriented Programming: Hidden Toolkit That You Already HaveAspect Oriented Programming: Hidden Toolkit That You Already Have
Aspect Oriented Programming: Hidden Toolkit That You Already HaveSalesforce Engineering
 
A Smarter Pig: Building a SQL interface to Pig using Apache Calcite
A Smarter Pig: Building a SQL interface to Pig using Apache CalciteA Smarter Pig: Building a SQL interface to Pig using Apache Calcite
A Smarter Pig: Building a SQL interface to Pig using Apache CalciteSalesforce Engineering
 
Implementing a Content Strategy Is Like Running 100 Miles
Implementing a Content Strategy Is Like Running 100 MilesImplementing a Content Strategy Is Like Running 100 Miles
Implementing a Content Strategy Is Like Running 100 MilesSalesforce Engineering
 
Salesforce Cloud Infrastructure and Challenges - A Brief Overview
Salesforce Cloud Infrastructure and Challenges - A Brief OverviewSalesforce Cloud Infrastructure and Challenges - A Brief Overview
Salesforce Cloud Infrastructure and Challenges - A Brief OverviewSalesforce Engineering
 
Global State Management of Micro Services
Global State Management of Micro ServicesGlobal State Management of Micro Services
Global State Management of Micro ServicesSalesforce Engineering
 
Apache BookKeeper Distributed Store- a Salesforce use case
Apache BookKeeper Distributed Store- a Salesforce use caseApache BookKeeper Distributed Store- a Salesforce use case
Apache BookKeeper Distributed Store- a Salesforce use caseSalesforce Engineering
 

More from Salesforce Engineering (20)

Locker Service Ready Lightning Components With Webpack
Locker Service Ready Lightning Components With WebpackLocker Service Ready Lightning Components With Webpack
Locker Service Ready Lightning Components With Webpack
 
Techniques to Effectively Monitor the Performance of Customers in the Cloud
Techniques to Effectively Monitor the Performance of Customers in the CloudTechniques to Effectively Monitor the Performance of Customers in the Cloud
Techniques to Effectively Monitor the Performance of Customers in the Cloud
 
Predictive System Performance Data Analysis
Predictive System Performance Data AnalysisPredictive System Performance Data Analysis
Predictive System Performance Data Analysis
 
Apache HBase State of the Project
Apache HBase State of the ProjectApache HBase State of the Project
Apache HBase State of the Project
 
Hit the Trail with Trailhead
Hit the Trail with TrailheadHit the Trail with Trailhead
Hit the Trail with Trailhead
 
HBase/PHOENIX @ Scale
HBase/PHOENIX @ ScaleHBase/PHOENIX @ Scale
HBase/PHOENIX @ Scale
 
Scaling up data science applications
Scaling up data science applicationsScaling up data science applications
Scaling up data science applications
 
Containers and Security for DevOps
Containers and Security for DevOpsContainers and Security for DevOps
Containers and Security for DevOps
 
Aspect Oriented Programming: Hidden Toolkit That You Already Have
Aspect Oriented Programming: Hidden Toolkit That You Already HaveAspect Oriented Programming: Hidden Toolkit That You Already Have
Aspect Oriented Programming: Hidden Toolkit That You Already Have
 
Monitoring @ Scale in Salesforce
Monitoring @ Scale in SalesforceMonitoring @ Scale in Salesforce
Monitoring @ Scale in Salesforce
 
Performance Tuning with XHProf
Performance Tuning with XHProfPerformance Tuning with XHProf
Performance Tuning with XHProf
 
A Smarter Pig: Building a SQL interface to Pig using Apache Calcite
A Smarter Pig: Building a SQL interface to Pig using Apache CalciteA Smarter Pig: Building a SQL interface to Pig using Apache Calcite
A Smarter Pig: Building a SQL interface to Pig using Apache Calcite
 
Implementing a Content Strategy Is Like Running 100 Miles
Implementing a Content Strategy Is Like Running 100 MilesImplementing a Content Strategy Is Like Running 100 Miles
Implementing a Content Strategy Is Like Running 100 Miles
 
Salesforce Cloud Infrastructure and Challenges - A Brief Overview
Salesforce Cloud Infrastructure and Challenges - A Brief OverviewSalesforce Cloud Infrastructure and Challenges - A Brief Overview
Salesforce Cloud Infrastructure and Challenges - A Brief Overview
 
Koober Preduction IO Presentation
Koober Preduction IO PresentationKoober Preduction IO Presentation
Koober Preduction IO Presentation
 
Finding Security Issues Fast!
Finding Security Issues Fast!Finding Security Issues Fast!
Finding Security Issues Fast!
 
Microservices
MicroservicesMicroservices
Microservices
 
Global State Management of Micro Services
Global State Management of Micro ServicesGlobal State Management of Micro Services
Global State Management of Micro Services
 
The Future of Hbase
The Future of HbaseThe Future of Hbase
The Future of Hbase
 
Apache BookKeeper Distributed Store- a Salesforce use case
Apache BookKeeper Distributed Store- a Salesforce use caseApache BookKeeper Distributed Store- a Salesforce use case
Apache BookKeeper Distributed Store- a Salesforce use case
 

Recently uploaded

Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 

Recently uploaded (20)

Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 

Hbase at Salesforce.com

  • 1. HBase @ Salesforce Lars Hofhansl Architect, Father, Meditator, Aikido Blackbelt http://hadoop-hbase.blogspot.com
  • 2. Safe harbor statement under the Private Securities Litigation Reform Act of 1995: This presentation may contain forward-looking statements that involve risks, uncertainties, and assumptions. If any such uncertainties materialize or if any of the assumptions proves incorrect, the results of salesforce.com, inc. could differ materially from the results expressed or implied by the forward-looking statements we make. All statements other than statements of historical fact could be deemed forward-looking, including any projections of product or service availability, subscriber growth, earnings, revenues, or other financial items and any statements regarding strategies or plans of management for future operations, statements of belief, any statements concerning new, planned, or upgraded services or technology developments and customer contracts or use of our services. The risks and uncertainties referred to above include – but are not limited to – risks associated with developing and delivering new functionality for our service, new products and services, our new business model, our past operating losses, possible fluctuations in our operating results and rate of growth, interruptions or delays in our Web hosting, breach of our security measures, the outcome of any litigation, risks associated with completed and any possible mergers and acquisitions, the immature market in which we operate, our relatively limited operating history, our ability to expand, retain, and motivate our employees and manage our growth, new releases of our service and successful customer deployment, our limited history reselling non-salesforce.com products, and utilization and selling to larger enterprise customers. Further information on potential factors that could affect the financial results of salesforce.com, inc. is included in our annual report on Form 10-K for the most recent fiscal year and in our quarterly report on Form 10-Q for the most recent fiscal quarter. 
These documents and others containing important disclosures are available on the SEC Filings section of the Investor Information section of our Web site. Any unreleased services or features referenced in this or other presentations, press releases or public statements are not currently available and may not be delivered on time or at all. Customers who purchase our services should make the purchase decisions based upon features that are currently available. Salesforce.com, inc. assumes no obligation and does not intend to update these forward-looking statements. Safe Harbor
  • 3. Why HBase? • SAN • RDBMS • Transactions
  • 5. A. Why HBase? B. Interacting with the open source community C. HBase at Salesforce
  • 6. Size Matters* New Salesforce customer: • “How many rows do you have?” • We will turn folks away if they have too many! Data storage is expensive: • SAN storage • Relational database • Too many rows → Too expensive * In a relational world
  • 7. What if in the future we: … have cheaper storage? … never need to ask again about the number of rows? … grow with the data by just adding more machines? (Disclaimer: no transactions, no joins, no secondary indexes, …)
  • 8. (A quick note about) Relational Databases • We love them. They are core to our infrastructure. • SQL and NoSQL NoACID are complementary. • (Almost) everything we do is SQL based (see Phoenix – the SQL layer for HBase.)
  • 9. The Search - Requirements • Consistent – “Eventually consistent stores are 100% consistent 99% of the time” – Ian Varley • Scalable – No “features” impeding horizontal scaling • Persistent – Duh...? • Key lookups • Range lookups • Open source (ASL great, GPLv2 OK, GPLv3/AGPL not acceptable)
  • 10. Enter HBase “A Sparse, Consistent, Distributed, Multidimensional, Persistent, Sorted Map”
  • 11. Salesforce and the HBase Community
  • 12. To Fork or not to Fork – that is the question Fork - pros • Agility. No waiting for community review. Just get stuff done • Freedom. Patches that might not be acceptable to the community Fork - cons • Lose out on community work • Patches not useful to other parties There is no right or wrong. It’s a matter of choice, taste, and requirements.
  • 13. HBase Development @ Salesforce • No fork of HBase. • No fork of HBase. • Internal HBase/HDFS branch for possible emergency fixes • All fixes are cleaned and contributed back • We switch to the next open source point release periodically
  • 14. PMC member, 2 committers, release manager, contributors HBASE-11042 HBASE-11040 HBASE-11037 HBASE-11030 HBASE-11029 HBASE-11024 HBASE-11022 HBASE-11010 HBASE-10996 HBASE-10989 HBASE-10988 HBASE-10987 HBASE-10982 HBASE-10969 HBASE-10847 HBASE-10805 HBASE-10722 HBASE-10706 HBASE-10642 HBASE-10594 HBASE-10562 HBASE-10551 HBASE-10546 HBASE-10505 HBASE-10501 HBASE-10489 HBASE-10470 HBASE-10420 HBASE-10416 HBASE-10383 HBASE-10363 HBASE-10320 HBASE-10317 HBASE-10286 HBASE-10284 HBASE-10281 HBASE-10279 HBASE-10259 HBASE-10257 HBASE-10250 HBASE-10181 HBASE-10117 HBASE-10076 HBASE-10058 HBASE-10057 HBASE-10015 HBASE-9993 HBASE-9971 HBASE-9956 HBASE-9915 HBASE-9865 HBASE-9834 HBASE-9807 HBASE-9799 HBASE-9789 HBASE-9778 HBASE-9751 HBASE-9749 HBASE-9732 HBASE-9731 HBASE-9711 HBASE-9658 HBASE-9584 HBASE-9566 HBASE-9534 HBASE-9429 HBASE-9428 HBASE-9377 HBASE-9356 HBASE-9344 HBASE-9301 HBASE-9266 HBASE-9231 HBASE-9221 HBASE-9186 HBASE-9158 HBASE-9103 HBASE-9097 HBASE-9049 HBASE-8971 HBASE-8945 HBASE-8930 HBASE-8912 HBASE-8858 HBASE-8809 HBASE-8767 HBASE-8702 HBASE-8698 HBASE-8684 HBASE-8671 HBASE-8636 HBASE-8525 HBASE-8503 HBASE-8355 HBASE-8316 HBASE-8229 HBASE-8188 HBASE-8166 HBASE-8151 HBASE-8110 HBASE-8108 HBASE-8055 HBASE-8008 HBASE-7999 HBASE-7947 HBASE-7945 HBASE-7817 HBASE-7801 HBASE-7729 HBASE-7725 HBASE-7717 HBASE-7709 HBASE-7702 HBASE-7681 HBASE-7617 HBASE-7602 HBASE-7578 HBASE-7550 HBASE-7499 HBASE-7497 HBASE-7483 HBASE-7466 HBASE-7465 HBASE-7455 HBASE-7438 HBASE-7435 HBASE-7432 HBASE-7431 HBASE-7417 HBASE-7415 HBASE-7371 HBASE-7336 HBASE-7293 HBASE-7279 HBASE-7270 HBASE-7252 HBASE-7240 HBASE-7215 HBASE-7214 HBASE-7180 HBASE-7177 HBASE-7166 HBASE-7165 HBASE-7091 HBASE-7069 HBASE-7051 HBASE-7047 HBASE-7021 HBASE-7010 HBASE-6996 HBASE-6974
  • 15. PMC member, 2 committers, release manager, contributors HBASE-6949 HBASE-6946 HBASE-6912 HBASE-6889 HBASE-6879 HBASE-6868 HBASE-6865 HBASE-6863 HBASE-6797 HBASE-6796 HBASE-6784 HBASE-6765 HBASE-6757 HBASE-6755 HBASE-6711 HBASE-6707 HBASE-6690 HBASE-6667 HBASE-6638 HBASE-6637 HBASE-6621 HBASE-6582 HBASE-6580 HBASE-6579 HBASE-6573 HBASE-6571 HBASE-6570 HBASE-6569 HBASE-6568 HBASE-6561 HBASE-6523 HBASE-6522 HBASE-6505 HBASE-6504 HBASE-6496 HBASE-6495 HBASE-6441 HBASE-6439 HBASE-6427 HBASE-6426 HBASE-6421 HBASE-6406 HBASE-6355 HBASE-6347 HBASE-6326 HBASE-6296 HBASE-6293 HBASE-6291 HBASE-6178 HBASE-6138 HBASE-6113 HBASE-6112 HBASE-6110 HBASE-6087 HBASE-5961 HBASE-5955 HBASE-5909 HBASE-5884 HBASE-5871 HBASE-5865 HBASE-5782 HBASE-5775 HBASE-5774 HBASE-5682 HBASE-5670 HBASE-5659 HBASE-5641 HBASE-5609 HBASE-5604 HBASE-5574 HBASE-5569 HBASE-5548 HBASE-5547 HBASE-5541 HBASE-5526 HBASE-5523 HBASE-5509 HBASE-5497 HBASE-5460 HBASE-5455 HBASE-5440 HBASE-5431 HBASE-5368 HBASE-5350 HBASE-5348 HBASE-5318 HBASE-5304 HBASE-5266 HBASE-5229 HBASE-5203 HBASE-5118 HBASE-5096 HBASE-5088 HBASE-5084 HBASE-5070 HBASE-5058 HBASE-5005 HBASE-5001 HBASE-4998 HBASE-4981 HBASE-4979 HBASE-4945 HBASE-4886 HBASE-4874 HBASE-4870 HBASE-4838 HBASE-4805 HBASE-4800 HBASE-4691 HBASE-4682 HBASE-4673 HBASE-4657 HBASE-4626 HBASE-4605 HBASE-4583 HBASE-4561 HBASE-4559 HBASE-4556 HBASE-4536 HBASE-4517 HBASE-4488 HBASE-4454 HBASE-4439 HBASE-4404 HBASE-4387 HBASE-4347 HBASE-4336 HBASE-4335 HBASE-4334 HBASE-4331 HBASE-4296 HBASE-4283 HBASE-4263 HBASE-4242 HBASE-4241 HBASE-4197 HBASE-4178 HBASE-4171 HBASE-4102 HBASE-4071 HBASE-3661 HBASE-3645 HBASE-3584 HBASE-3443 HBASE-3433 HBASE-3387 HBASE-2947 HBASE-2196 HBASE-2195 HDFS-3979 HDFS-744
  • 17. Established monthly release train for 0.94
  • 18. Contributed >300 features, bug fixes, and perf improvements
  • 19. Reviewed 1000s of open source patches
  • 21. Open Sourced Apache Phoenix – SQL skin on HBase
  • 23. Salesforce *is* a database
  • 24. Salesforce is a Database Query Parser Query (SQL) Parsed Query Query Optimizer Plan Generator Plan Cost Estimator Evaluation Plan Query Plan Evaluator System Catalog Database Stats Tables Columns Indexes
  • 25. Salesforce is a Database Query Parser Query (SOQL) Parsed Query Query Optimizer Plan Generator Plan Cost Estimator System Catalog Oracle Hinted Oracle SQL Database Stats Objects Fields Indexes
  • 28. pod = a database instance •Oracle RAC •AppServers •Blob store servers •Search servers •Shared SAN storage •SAN replication for DR App Server App Server App Server App Server … Oracle Node Oracle Node Oracle Node Oracle Node… Oracle RAC cluster Primary Site Secondary Site SAN replication SAN SAN SQL/JDBC
  • 29. Finally: HBase @ Salesforce
  • 30. Oracle Hinted Oracle SQL Query Parser Query (SOQL) Parsed Query Query Optimizer Plan Generator Plan Cost Estimator System Catalog Database Stats Objects Fields Indexes 1. External Objects 2. Phoenix SQL HBase HBase HBase HBase Where does HBase Fit?
  • 31. Where does HBase Fit? •Separate HBase per pod (close to 50 clusters) •Logically co-located with Oracle •Small clusters striped across five racks •Each cluster’s master service on a different rack •Identical cluster for DR App Server App Server App Server App Server … Oracle Node Oracle Node HBase Node HBase Node… Oracle Cluster HBase Node HBase Node HBase Node … Primary Site Secondary Site DR HBase Cluster Decentralized HBase Replication SQL/JDBC via Phoenix HBase Cluster … SAN SAN
  • 33. 1. Audit Trails (Entity History) • Identity managed in RDBMS • Indexed in HBase (Phoenix indexes) • Historical, immutable data only • No need to reason about updates, split identities, and transactions
  • 34. 2. Archiving (Data Lifecycle Management) • Objects (rows) moved to HBase • Identity managed in HBase after move • Data immutable in HBase • No Transactions
  • 35. 3. Live data in HBase (BigObjects) • Mutable data (possibly) • Everything managed in HBase • Still no Transactions, yet • Platform for other team to use
  • 36. Merrill Lynch Rationalization Data Governance, Audit & Archive • First Salesforce Enterprise Customer • On-platform archival compelling versus on-premise solution from Informatica • Retention requirements for 7 years Merrill Lynch “Data Audit, Governance & Lifecycle management is critical for Merrill; for the entire banking & financial industry it has become a benchmark requirement”
  • 37. Heating, ventilation, and air-conditioning in the EU • Top 10 Platform Users • Subject to highly variable data governance and retention requirements • Significant SAP footprint driving business rules – need to connect that to Salesforce data for archival and data retention needs • Massive service workforce generates significant data processing challenges “The Salesforce.com Platform roadmap for Data Archive is critical for future data management needs” Michael Roehr, CTO Vailliant
  • 38. BMW Enriches Their Customer Perspective • Sales Cloud available across all German dealership franchises • All customer data subject to stringent, government-mandated protection, audit & retention • Correlations with Car Builder App data enable more contextual customer interactions • Car telemetry, used correctly, helps refine product evolution and customer needs alignment “Data-driven customer engagement is a key driver for our enhanced customer experience”
  • 39. System Of Record (SOR) SOR = HA + DR + Backup + M&M + Security
  • 41. Highly Available, Disaster Recovery • Five-peer ZooKeeper quorum • Five Quorum Journals (for fs edits) • Five HMasters • Three NameNodes (yes, three, we made a patch to run more than one standby) • HBase Replication to identical hot standby pod in a different data center – In the event of a disaster we fail a complete pod over to the secondary site • Weekly automated, unattended rolling restarts
  • 42. Backup High-level Architecture [Diagram labels: Replication; Primary pod: HBase, 48h, HDFS, Backup per tenant; DR pod: HBase, 48h, HDFS, Backup per tenant; Merkle Tree Verification]
  • 43. Monitoring & Management (M&M) • Nagios alerts • Trending via OpenTSDB. Custom UI on top of the time series data. • Rolling upgrades – Eventually scheduled and unattended • Absolutely no unscheduled downtime. Not even during a rack failure.
  • 44. A. Why HBase? B. Interacting with the open source community C. HBase at Salesforce
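
The "sorted map" description on slide 10, together with the key-lookup and range-lookup requirements on slide 9, can be illustrated with a toy in-memory model. This is a minimal sketch only, not HBase's API or storage format: the class name `SortedMapTable` and its methods are hypothetical, and it assumes cells are keyed by (row, column, timestamp) and kept in sorted order so that a range lookup is a single ordered walk.

```python
import bisect


class SortedMapTable:
    """Toy model of a sparse, sorted, multidimensional map (hypothetical, not HBase)."""

    def __init__(self):
        self._keys = []   # sorted list of (row, column, -timestamp)
        self._cells = {}  # key -> value; only cells that were written exist (sparse)

    def put(self, row, column, timestamp, value):
        # Negate the timestamp so that newer versions sort before older ones.
        key = (row, column, -timestamp)
        if key not in self._cells:
            bisect.insort(self._keys, key)
        self._cells[key] = value

    def get(self, row, column):
        """Key lookup: newest value stored for (row, column), or None."""
        i = bisect.bisect_left(self._keys, (row, column, float("-inf")))
        if i < len(self._keys) and self._keys[i][:2] == (row, column):
            return self._cells[self._keys[i]]
        return None

    def scan(self, start_row, stop_row):
        """Range lookup: yield cells with start_row <= row < stop_row, in key order."""
        i = bisect.bisect_left(self._keys, (start_row,))
        while i < len(self._keys) and self._keys[i][0] < stop_row:
            row, column, neg_ts = self._keys[i]
            yield row, column, -neg_ts, self._cells[self._keys[i]]
            i += 1


table = SortedMapTable()
table.put("acct1", "name", 1, "old name")
table.put("acct1", "name", 2, "new name")
table.put("acct3", "name", 1, "other")
table.put("acct2", "email", 1, "someone@example.com")

latest = table.get("acct1", "name")
rows_in_range = [cell[0] for cell in table.scan("acct1", "acct3")]
```

Because the keys stay sorted, both lookups are a binary search followed by a sequential read, which is the property that makes range lookups cheap in a store of this shape.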

Editor's Notes

  1. Spent time with StumbleUpon, Facebook, many others. This is a great community.
  2. Salesforce is seeing an increasing shift in the center of gravity of customer data. Driving this forward across verticals such as banking and financial services requires data audit, driven by post-2008 regulatory requirements and Sarbanes-Oxley requirements. As this data is generated in a transactional environment, we use HBase as our historical and immutable storage.
  3. Their use of the Salesforce.com platform to drive their entire business helps keep their dynamic and highly mobile workforce in touch with their data. Given their operating environment in Germany, they are required to deliver a complete data audit and use Field History for this. They are also required to keep all customer data for at least 15 years, which is why Archive is so key for them.
  4. Across Germany we've had a successful deployment in each franchise to establish new baselines in customer interactions with BMW customers, leases, and service interactions. Looking beyond this use case, the capability of marrying together the customer data generated for the BMW Car Builder application with cleansed and anonymized telemetry data is pushing Salesforce to deliver the concepts and tools that allow BMW to absorb the full spectrum of their customer event data stream and take business actions on it. Imagine how I would feel as a prospective customer if I walked into a dealership and they had a more informed knowledge of who I am and my likely preferences. We are using the notion of BigObjects to absorb, store, and act on the data that is behind the Internet of Customers.
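
Slide 42 names "Merkle Tree Verification" between the primary and DR pods without describing how it is done. The following is a generic sketch of the technique under stated assumptions, not Salesforce's implementation: both sides hold the same number of serialized rows in the same order, SHA-256 is the hash, and the function names are hypothetical. Comparing the two roots answers "are the copies identical?" with one hash comparison, and descending only into mismatching subtrees locates the rows that differ.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_levels(leaves):
    """Return hash levels: leaf hashes first, single-root level last."""
    level = [_h(leaf) for leaf in leaves] or [_h(b"")]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]  # pair an odd last node with itself
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_root(leaves):
    return merkle_levels(leaves)[-1][0]


def differing_leaves(primary, replica):
    """Indices of leaves that differ, found by descending mismatching subtrees."""
    if len(primary) != len(replica):
        raise ValueError("this sketch assumes equal leaf counts")
    lp, lr = merkle_levels(primary), merkle_levels(replica)
    suspects = [0]  # start at the root
    for depth in range(len(lp) - 1, 0, -1):
        children = []
        for idx in suspects:
            if lp[depth][idx] != lr[depth][idx]:
                for child in (2 * idx, 2 * idx + 1):
                    if child < len(lp[depth - 1]):
                        children.append(child)
        suspects = children
    return [i for i in suspects if lp[0][i] != lr[0][i]]


primary_rows = [b"row1=a", b"row2=b", b"row3=c", b"row4=d", b"row5=e"]
replica_rows = [b"row1=a", b"row2=b", b"row3=c", b"row4=X", b"row5=e"]

roots_match = merkle_root(primary_rows) == merkle_root(replica_rows)
mismatches = differing_leaves(primary_rows, replica_rows)
```

In a real deployment the leaves would more likely be hashes of row ranges or files than of single rows, so that only a small tree needs to be exchanged between sites; that choice is outside what the slides state.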