© 2016 IBM Corporation
Introducing Big SQL Federation
Created by C. M. Saracco, IBM Silicon Valley Lab
June 2016
Executive summary
§ What’s Big SQL federation?
− Integration technology for Hadoop and remote data sources
− Transparently query Big SQL (Hadoop) and RDBMS tables with standard SQL
− Query optimization, security mapping, other critical features built in
§ Why federate?
− Not always practical to move / replicate data from one source to another
− Hadoop programmers need access to corporate RDBMS data to enhance analytics,
integrate public and proprietary data, etc.
§ What’s supported?
− Big SQL tables (and views) in DFS, HBase, or Hive warehouse
− RDBMS tables (and views) in Oracle, Teradata, MS SQL Server, DB2, Informix,
Netezza, . . .
− Query data across all sources (project, restrict, join, union, wide range of sub-queries, wide range of built-in functions)
− INSERT INTO … SELECT FROM …
− Issue data-source specific SQL
− Collect statistics and inspect detailed data access plan
− . . . .
Agenda
§Overview
− Key features
− When to federate
§Technology
− Architecture
− Set up, usage examples
− Supported data sources
§Summary
Big SQL query federation = virtualized data access
Transparent
§ Appears to be one source
§ Programmers don’t need to know how /
where data is stored
Heterogeneous
§ Accesses data from diverse sources
High Function
§ Full query support against all data
§ Capabilities of sources as well
Autonomous
§ Non-disruptive to data sources, existing
applications, systems.
High Performance
§ Optimization of distributed queries
[Diagram labels: SQL tools, applications; Virtualized data; Data sources]
When to federate…
§ Budget
§ Resources
§ Time
§ Ownership
§ Too ad hoc, temporary
§ Too proprietary
§ Too recent
§ Too big
Physical integration not always a requirement/option
Barriers
Federation architecture and components
[Diagram labels: Federation server (Big SQL); Wrapper; Server (×2); Nickname (×3)]
Federated server: Big SQL database enabled for federation.
Wrapper: library allowing access to a particular class of data sources or protocols (Net8, DRDA, etc.). Contains information about data source characteristics.
Server: represents a specific data source.
Nickname: a local alias to data on a remote server (e.g., a specific table or view).
Federation catalog
§ Stores information about:
− Wrappers, servers, nicknames
− Server attributes
− Nickname attributes
− Remote functions
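As a rough illustration of what the federation catalog holds, the queries below list registered wrappers, servers, and nicknames. This is a sketch that assumes Big SQL exposes the DB2-style catalog views SYSCAT.WRAPPERS, SYSCAT.SERVERS, and SYSCAT.NICKNAMES with the column names shown. The slides do not name these views, so verify them against your release.

```sql
-- Assumption: DB2-style federation catalog views are available in Big SQL.

-- Registered wrappers and the client library each one loads
SELECT WRAPNAME, LIBRARY
FROM SYSCAT.WRAPPERS;

-- Server definitions and the wrapper each one uses
SELECT SERVERNAME, SERVERTYPE, SERVERVERSION, WRAPNAME
FROM SYSCAT.SERVERS;

-- Nicknames and the remote objects they point to
SELECT TABSCHEMA, TABNAME, SERVERNAME, REMOTE_SCHEMA, REMOTE_TABLE
FROM SYSCAT.NICKNAMES;
```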
Federation in practice
§ Admin enables federation
§ Apps connect to Big SQL database
§ Nicknames look like tables to the app
§ Big SQL optimizer creates global data access plan with cost analysis, query pushdown
§ Query fragments executed remotely
[Diagram labels: Connect to bigsql; Federation server (Big SQL) with nicknames, a table, cost-based optimizer, and local + remote execution plans; wrappers with client libraries; native dialect; remote sources]
Creating and using federated objects (example)
-- Create wrapper to identify client library (Oracle Net8)
CREATE WRAPPER ORA LIBRARY 'libdb2net8.so';
-- Create server for Oracle data source
CREATE SERVER ORASERV TYPE ORACLE VERSION 11 WRAPPER ORA
  AUTHORIZATION "orauser" PASSWORD "orauser"
  OPTIONS (NODE 'TNSNODENAME', PUSHDOWN 'Y', COLLATING_SEQUENCE 'N');
-- Map the local user 'orauser' to the Oracle user 'orauser' / password 'orauser'
CREATE USER MAPPING FOR orauser SERVER ORASERV
  OPTIONS (REMOTE_PASSWORD 'orauser');
-- Create nickname for Oracle table / view
CREATE NICKNAME NICK1 FOR ORASERV.ORAUSER.TABLE1;
-- Query the nickname
SELECT * FROM NICK1 WHERE COL1 < 10;
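Building on the objects created above, the executive summary's "INSERT INTO … SELECT FROM …" and "data-source specific SQL" features might look roughly like the sketch below. The table ORA_COPY is hypothetical, and COL1 is assumed to be an integer column. The pass-through statements use DB2 federation syntax (SET PASSTHRU), which is assumed here to be available in Big SQL as well.

```sql
-- Hypothetical local Hadoop table that receives a copy of the remote rows
-- (assumes COL1 is an integer column).
CREATE HADOOP TABLE ORA_COPY (COL1 INT);

-- Copy rows from the Oracle nickname into the Hadoop table
INSERT INTO ORA_COPY
  SELECT COL1 FROM NICK1 WHERE COL1 < 10;

-- Pass-through session: statements are sent to Oracle in its native dialect
-- (ROWNUM is Oracle-specific). Assumes SET PASSTHRU is supported.
SET PASSTHRU ORASERV;
SELECT COL1 FROM ORAUSER.TABLE1 WHERE ROWNUM <= 5;
SET PASSTHRU RESET;
```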
Joining data across sources
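A minimal sketch of such a cross-source join, assuming the nickname NICK1 from the previous example and a hypothetical Hadoop table WEB_CLICKS. The table, its columns, and the join key are illustrative only and do not come from the slides.

```sql
-- Hypothetical local Big SQL (Hadoop) table; names and columns are illustrative.
CREATE HADOOP TABLE WEB_CLICKS (
  CUSTOMER_ID INT,
  PAGE_URL    VARCHAR(200),
  CLICK_TS    TIMESTAMP
);

-- Join the Hadoop table with the Oracle nickname as if both were local tables.
-- Assumes NICK1.COL1 is a numeric key comparable to CUSTOMER_ID.
SELECT c.CUSTOMER_ID, COUNT(*) AS CLICKS
FROM WEB_CLICKS c
JOIN NICK1 n ON n.COL1 = c.CUSTOMER_ID
WHERE n.COL1 < 10
GROUP BY c.CUSTOMER_ID;
```

Because the application sees NICK1 as an ordinary table, the query needs no Oracle-specific syntax. The optimizer decides which fragments, such as the COL1 filter, to push down to the remote source.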
Data sources supported by Big SQL Federation Server
§ Current list of supported data sources available at
https://www-304.ibm.com/support/entdocview.wss?uid=swg27044495
Data Source            Supported Versions                                       Notes
DB2®                   DB2 for Linux, UNIX, and Windows 9.7, 9.8, 10.1, 10.5
                       DB2 for z/OS 8.x, 9.x, and 10.x
Oracle                 11g, 11g R1, 11g R2, 12c
Teradata               12, 13, 14                                               Not supported on POWER systems.
Netezza                4.6, 5.0, 6.0, 7.2                                       Not supported on POWER systems.
Informix               11.5
Microsoft SQL Server   2012, 2014
Big SQL federation
– Easily access information on demand
– Combine Big Data in Hadoop with RDBMS data
– Quickly extend your data warehouse
Benefits
– Cost-effective
– Fast time to value
– Agile and flexible
– Versatile
– Low risk, seamless, and transparent