June 2012

HiveServer2 Project (WIP)
Carl Steinbach | Platform Engineering
Hive Background: What is it?

    An ETL/Data Warehouse system for Hadoop:

    •  SQL->MR Compiler and Execution Engine

    •  SerDes: Pluggable Data Format Handlers

    •  MetaStore: Persistent Metadata Storage


                    ©2012 Cloudera, Inc. All Rights Reserved.
Hive Evolution

    •  Original Vision:
      –  Let users express their queries in a high-level
         language without having to write MR
         programs


    •  Now more and more:
      –  A parallel SQL DBMS that happens to use
         Hadoop for its storage and execution layer.



What do users expect from a DBMS?

    •  Sessions/Concurrency
      –  Persistent client state on the server-side
      –  Ability to run multiple clients concurrently
    •  ODBC/JDBC
      –  SQL IDEs, BI, ETL, …
    •  Authentication/Authorization
    •  Auditing/Logging


What’s Missing?

    •  Sessions/Concurrency
      –  Current Thrift API can’t support concurrency
    •  ODBC/JDBC
      –  Thrift API doesn’t support common ODBC/JDBC calls
    •  Authentication/Authorization
      –  Incomplete implementations
    •  Auditing/Logging
      –  Multiple plugin interfaces in need of consolidation



What’s Missing

    Concurrency/Sessions

    •  Current Thrift API can’t support multiple
       connections or client sessions.
    •  User/Global Configuration and Session
       Info
    •  Query compiler memory leaks
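
The session requirement above can be illustrated with a minimal Python sketch. It is not HiveServer2 code: `SessionManager` and its methods are hypothetical names, and the configuration key is only an example value. It shows the idea the slide points at: each client gets an opaque handle, per-session settings overlay the global defaults, and one client's change does not leak into another client's session.

```python
import threading
import uuid


class SessionManager:
    """Toy model of per-client session state on a shared server (hypothetical API)."""

    def __init__(self, global_conf):
        self._global_conf = dict(global_conf)   # server-wide defaults
        self._sessions = {}                     # handle -> per-session state
        self._lock = threading.Lock()           # protects the session table

    def open_session(self, user):
        handle = uuid.uuid4().hex               # opaque handle returned to the client
        with self._lock:
            self._sessions[handle] = {"user": user, "conf": {}}
        return handle

    def set_conf(self, handle, key, value):
        with self._lock:
            self._sessions[handle]["conf"][key] = value

    def get_conf(self, handle, key):
        with self._lock:
            overrides = self._sessions[handle]["conf"]
            # A session override wins; otherwise fall back to the global default.
            return overrides.get(key, self._global_conf.get(key))

    def close_session(self, handle):
        with self._lock:
            self._sessions.pop(handle, None)


# Example values only: two clients share one server but see different settings.
manager = SessionManager({"mapred.reduce.tasks": "-1"})
alice = manager.open_session("alice")
bob = manager.open_session("bob")
manager.set_conf(alice, "mapred.reduce.tasks", "8")
```

Keeping overrides separate from the defaults means closing a session discards its settings without touching the global configuration.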


What’s Missing

    ODBC/JDBC
    •  Thrift API can’t support common ODBC/JDBC calls:
      –  SQLGetInfo
      –  SQLGetTypeInfo
      –  SQLCancel
      –  SQLGetFunctions
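
Of the calls listed, SQLCancel shows most directly why the server needs a notion of a running statement that a later request can refer to. The Python sketch below works under that assumption. `Operation`, its state names, and its methods are hypothetical and chosen for illustration; they are not the HiveServer2 Thrift API.

```python
import threading


class Operation:
    """Toy stand-in for a running statement that can be cancelled (hypothetical)."""

    def __init__(self, total_steps):
        self.total_steps = total_steps
        self.completed_steps = 0
        self.state = "RUNNING"
        self._cancel_requested = threading.Event()

    def cancel(self):
        # Safe to call from another thread while run() is in progress.
        self._cancel_requested.set()

    def run(self):
        for _ in range(self.total_steps):
            # Check for a cancel request before each unit of work.
            if self._cancel_requested.is_set():
                self.state = "CANCELED"
                return self.state
            self.completed_steps += 1
        self.state = "FINISHED"
        return self.state


finished = Operation(total_steps=3)
finished.run()

cancelled = Operation(total_steps=3)
cancelled.cancel()   # the cancel request arrives before any work is done
cancelled.run()
```

The client only needs to hold a reference to the operation; the server decides at which points the work can safely stop.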



What’s Missing

    Authentication/Authorization
    •  SASL Authentication for HiveServer
    •  Hive supports GRANT/ROLE-based
       authorization, but the implementation is
       incomplete.
    •  Code injection vectors: ADD JAR,
       TRANSFORM, SET x, …
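
To make the injection vectors concrete, here is a deliberately naive Python sketch that flags statements of the kinds named above so an authorization check could be applied to them. The function name and the pattern list are hypothetical and not exhaustive. It uses plain string matching, so it can be fooled by keywords inside string literals; a real check would presumably sit behind the query parser.

```python
# Hypothetical, non-exhaustive patterns taken from the vectors named on the slide.
RISKY_PREFIXES = ("ADD JAR", "SET ")


def needs_authorization(statement):
    """Return True if the statement looks like one of the risky kinds above."""
    # Upper-case and collapse whitespace so "add   jar" and "ADD JAR" match alike.
    normalized = " ".join(statement.upper().split())
    if normalized.startswith(RISKY_PREFIXES):
        return True
    # TRANSFORM appears inside a query, so search for the keyword anywhere.
    return "TRANSFORM(" in normalized.replace(" (", "(")


examples = [
    "add   jar /tmp/udfs.jar",
    "SET x=1",
    "SELECT TRANSFORM (a) USING 'cat' FROM t",
    "SELECT a FROM t",
]
flags = [needs_authorization(s) for s in examples]
```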



Project Milestones

    •  HiveServer2 Thrift API Spec
    •  JDBC/ODBC HiveServer2 Drivers
    •  Concurrent Thrift clients
      –  Fix query compiler memory leaks
      –  User/Global session/configuration information
    •  Authentication (Kerberos)
    •  Authorization
      –  Extend to configuration, ADD x,
         TRANSFORM, …

Who’s working on it?

 •  Carl Steinbach
     –  carl@cloudera
 •  Prasad Mujumdar
     –  prasadm@cloudera




Resources

 •  HIVE-2935: Implement HiveServer2

 •  HiveServer API Proposal:
     –  https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API




Questions?

 •    Questions?




