IBM PureData System for Analytics 
10/12/2014 1
 At the end of this session, participants will understand the basic concepts of IBM PureData System for Analytics (Netezza).
 IBM PureData System models and their components.
 IBM PureData System architecture.
 How the system works.
“If you'd like to take us on, make our day.” 
- Larry Ellison, Oct 2009 
“Our goal is to become number one in the high-end server business for both 
Online Transaction Processing and Data Warehousing, both of those 
segments.” 
- Larry Ellison, Dec 2010 
One of “The five most important M&A Deals of 2010” 
- Wall Street Journal 
 Dedicated device 
 Optimized for purpose 
 Complete solution 
 Fast installation 
 Very easy operation 
 Standard interfaces 
 Low cost 
PureData for Analytics – Where Big Data Meets Deep 
Analytics 
Analytics without constraint 
Why Netezza?
Seamless integration with Informatica, Business Objects, SAS and SQL Server (SSIS packages)
Very little DDL & SQL conversion
• Used same table structures
• Converted Primary index to Distribution column
10 to 200X performance improvements in BO reporting
Fast to deploy
Price to performance very appealing
Ease of use
• Administrative
• DBA tasks
• Supports all DB structures (3NF, Star, De-Normalized table)
 The IBM PureData System for Analytics N1001 models 
include single-rack and multi-rack configurations. 
 The N1001 model family is an update to the IBM Netezza 
1000 model family, with the same architectural and 
interface specifications. 
 Each N1001 storage array contains either two or four disk 
enclosures, depending upon the model. 
 Each disk enclosure has 12 disks. 
 For example, an N1001-005 system has one storage array 
with 48 disks. 
 The IBM® PureData™ System for Analytics N2001 family 
is the latest generation of data warehouse appliances. 
 It increases the capacity and performance of the N1001 
models. 
 Within each rack are numerous components that work 
together to provide the asymmetric massively parallel 
processing of the Netezza® architecture. 
 The key hardware components include: 
 Snippet blades (S-Blades)
 Hosts
 Storage arrays
The following figure summarizes the IBM PureData System for Analytics N2001 half-rack, full-rack, and two-rack models.
 The snippet processing functions are the responsibility of 
the S-Blade. 
 The S-Blade is a specialized processing board which 
combines the CPU processing power of a blade server 
with the query analysis intelligence of the Netezza 
Database Accelerator card. 
 The dual-board component occupies two slots of the S-Blade chassis.
 Each chassis can contain up to 7 S-Blades. 
 The Netezza Database Accelerator card contains the FPGA query engines, memory, 
and I/O for processing the data from the disks where user data is stored. 
 The host server is a Linux server that runs the Netezza software 
and utilities. 
 The host controls and coordinates the activity of the appliance. 
 It performs query optimization; controls table and database 
operations; consolidates and returns query results; and monitors 
the Netezza system components to detect and report problems. 
 The host is a highly redundant, highly available server.
 The Netezza 1000 systems have two hosts in a highly available (HA) 
configuration. 
 The storage arrays contain the disks that store the user data 
and related processing files to support the query activity on 
the system. 
 In the N2001 model family, each disk enclosure has 24 
disks. 
 There are 12 disk enclosures in each full rack, or 6 
enclosures in a half-rack model. 
 In the N2001 family, each rack is one storage array. 
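The disk counts implied by these figures can be checked with a few lines of arithmetic. This is an illustrative sketch only: the function name `disks_per_array` is hypothetical, and the inputs are the enclosure and disk counts quoted in this deck.

```python
def disks_per_array(enclosures: int, disks_per_enclosure: int) -> int:
    """Total number of disks in one storage array."""
    return enclosures * disks_per_enclosure

# N1001: two or four enclosures per storage array, 12 disks per enclosure.
n1001_small = disks_per_array(2, 12)
n1001_large = disks_per_array(4, 12)  # the N1001-005 example: 48 disks

# N2001: 24 disks per enclosure; 6 enclosures (half rack) or 12 (full rack).
n2001_half_rack = disks_per_array(6, 24)
n2001_full_rack = disks_per_array(12, 24)

print(n1001_small, n1001_large, n2001_half_rack, n2001_full_rack)
# 24 48 144 288
```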
 Netezza's appliances use a proprietary Asymmetric Massively Parallel 
Processing (AMPP) architecture that combines open, blade-based 
servers and disk storage with a proprietary data filtering process using 
field-programmable gate arrays (FPGAs). 
 Netezza’s proprietary AMPP architecture is a two-tiered system 
designed to quickly handle very large queries from multiple users. 
 The first tier is a high-performance Linux SMP host that compiles data 
query tasks received from business intelligence applications, and 
generates query execution plans. 
 It then divides a query into a sequence of sub-tasks, or snippets, that can be executed in parallel, and distributes the snippets to the second tier for execution.
 The second tier consists of multiple snippet processing blades, or S-Blades, where all the primary processing work of the appliance is executed.
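The two-tier flow described above can be sketched in a few lines. This is a toy model, not Netezza code: threads stand in for S-Blades, a Python function stands in for a compiled snippet, and the names `run_snippet`, `host_execute`, and the sample rows are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def run_snippet(data_slice, predicate):
    """Second tier: one worker scans its own slice and returns a partial sum."""
    return sum(row["amount"] for row in data_slice if predicate(row))

def host_execute(data_slices, predicate):
    """First tier: fan the same snippet out to every slice, then merge the results."""
    with ThreadPoolExecutor(max_workers=len(data_slices)) as pool:
        partials = list(pool.map(lambda s: run_snippet(s, predicate), data_slices))
    return sum(partials)

# Hypothetical table, already split into three data slices.
slices = [
    [{"region": "EU", "amount": 10}, {"region": "US", "amount": 5}],
    [{"region": "EU", "amount": 7}],
    [{"region": "US", "amount": 3}, {"region": "EU", "amount": 1}],
]
total = host_execute(slices, lambda row: row["region"] == "EU")
print(total)  # 18
```

Each worker touches only its own slice, so the host's remaining job is to combine the partial results, which mirrors the "consolidates and returns query results" role described for the host.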
IBM PureData System for Analytics 
The Simple Appliance for Serious Analytics 
Built-in Expertise 
 No indexes or tuning 
 Data model agnostic 
 Fully parallel, optimized in-database analytics
Integration by Design 
 Server, Storage, Database in one easy to use package 
 Automatic parallelization and resource optimization to scale 
economically 
 Enterprise-class security and platform management 
Simplified Experience 
 Up and running in hours 
 Minimal ongoing administration 
 Standard interfaces to best of breed Analytics, BI, and data integration 
tools 
 Built-in analytics capabilities allow users to derive insight from data 
quickly 
 Easy connectivity to other Big Data Platform components 
Optimized exclusively for analytic data workloads 
System for Analytics 
Delivering data services 
for analytics 
Speed 
 10-100x faster than traditional custom systems* 
 Patented MPP hardware acceleration 
(Massively Parallel Processing) 
Simplicity 
 Data load ready in hours 
 No database indexes 
 No tuning 
 No storage administration 
Scalability 
 Peta-scale data capacity 
Smart 
 Designed to run complex analytics in minutes, not hours
 Richest set of in-database analytics 
* Based on IBM customers' reported results. "Traditional custom systems" refers to systems that are not professionally pre-built, 
pre-tested and optimized. Individual results may vary. 
Move analytics into the Data Warehouse 
 Integrate the server, storage and database into one optimized package
 Move complex analytics into the database
 Integrated, high performance analytics within the data warehouse
Analytics 
Database 
Storage 
Server 
Simplicity and Ease of Administration
 No dbspace/tablespace sizing and configuration
 No redo/physical/logical log sizing and configuration
 No page/block sizing and configuration for tables
 No extent sizing and configuration for tables
 No temp space allocation and monitoring
 No creation of logical volumes for files
 No integration of OS kernel recommendations
 No maintenance of OS recommended patch levels
Data Experts, not Database Experts
 Easy Administration Portal 
 No software installation 
 No indexes and tuning 
 No storage administration 
0. CREATE DATABASE TEST LOGFILE 'E:OraDataTESTLOG1TEST.ORA' SIZE 2M, 'E:OraDataTESTLOG2TEST.ORA' SIZE 2M, 'E:OraDataTESTLOG3TEST.ORA' SIZE 2M, 
'E:OraDataTESTLOG4TEST.ORA' SIZE 2M, 'E:OraDataTESTLOG5TEST.ORA' SIZE 2M EXTENT MANAGEMENT LOCAL MAXDATAFILES 100 DATAFILE 
'E:OraDataTESTSYS1TEST.ORA' SIZE 50 M DEFAULT TEMPORARY TABLESPACE temp TEMPFILE 'E:OraDataTESTTEMP.ORA' SIZE 50 M 
UNDO TABLESPACE undo DATAFILE 'E:OraDataTESTUNDO.ORA' SIZE 50 M NOARCHIVELOG CHARACTER SET WE8ISO8859P1; 
1. Oracle* table and indexes 
2. Oracle tablespace 
3. Oracle datafile 
4. Veritas file 
5. Veritas file system 
6. Veritas striped logical volume 
7. Veritas mirror/plex 
8. Veritas sub-disk 
9. SunOS raw device 
10. Brocade SAN switch
11. EMC Symmetrix volume
12. EMC Symmetrix striped meta-volume
13. EMC Symmetrix hyper-volume
14. EMC Symmetrix remote volume (replication)
15. Days/weeks of planning meetings
Netezza: Low (ZERO) Touch:
CREATE DATABASE my_db;
ORACLE
CREATE TABLE "MRDWDDM"."RDWF_DDM_ROOMS_SOLD" ("ID_PROPERTY" NUMBER(5,
0) NOT NULL ENABLE, "ID_DATE_STAY" NUMBER(5, 0) NOT NULL ENABLE,
"CD_ROOM_POOL" CHAR(4) NOT NULL ENABLE, "CD_RATE_PGM" CHAR(4) NOT
NULL ENABLE, "CD_RATE_TYPE" CHAR(1) NOT NULL ENABLE,
"CD_MARKET_SEGMENT" CHAR(2) NOT NULL ENABLE, "ID_CONFO_NUM_ORIG"
NUMBER(9, 0) NOT NULL ENABLE, "ID_CONFO_NUM_CUR" NUMBER(9, 0) NOT
NULL ENABLE, "ID_DATE_CREATE" NUMBER(5, 0) NOT NULL ENABLE,
"ID_DATE_ARRIVAL" NUMBER(5, 0) NOT NULL ENABLE, "ID_DATE_DEPART"
NUMBER(5, 0) NOT NULL ENABLE, "QY_ROOMS" NUMBER(5, 0) NOT NULL
ENABLE, "CU_REV_PROJ_NET_LOCAL" NUMBER(21, 3) NOT NULL ENABLE,
"CU_REV_PROJ_NET_USD" NUMBER(21, 3) NOT NULL ENABLE,
"QY_DAYS_STAY_CUR" NUMBER(3, 0) NOT NULL ENABLE, "CD_BOOK_SOURCE"
CHAR(1) NOT NULL ENABLE) PCTFREE 5 PCTUSED 95 INITRANS 4 MAXTRANS 255
STORAGE( FREELISTS 6) TABLESPACE "DDM_ROOMS_SOLD_DATA" NOLOGGING
PARTITION BY RANGE ("ID_PROPERTY" ) (PARTITION "PART1" VALUES LESS
THAN (600) PCTFREE 5 PCTUSED 95 INITRANS 4 MAXTRANS 255
STORAGE(INITIAL 16777216 FREELISTS 6 FREELIST GROUPS 1) TABLESPACE
"DDM_ROOMS_SOLD_DATA" NOLOGGING NOCOMPRESS, PARTITION "PART2" VALUES
LESS THAN (1200) PCTFREE 5 PCTUSED 95 INITRANS 4 MAXTRANS 255
STORAGE(INITIAL 16777216 FREELISTS 6 FREELIST GROUPS 1) TABLESPACE
"DDM_ROOMS_SOLD_DATA" NOLOGGING NOCOMPRESS, PARTITION "PART3" VALUES
LESS THAN (1800) PCTFREE 5 PCTUSED 95 INITRANS 4 MAXTRANS 255
STORAGE(INITIAL 16777216 FREELISTS 6 FREELIST GROUPS 1) TABLESPACE
"DDM_ROOMS_SOLD_DATA" NOLOGGING NOCOMPRESS, PARTITION "PART4" VALUES
ORACLE Indexes
CREATE INDEX "MRDWDDM"."RDWF_DDM_ROOMS_SOLD_IDX1" ON "RDWF_DDM_ROOMS_SOLD"
("ID_PROPERTY" , "ID_DATE_STAY" , "CD_ROOM_POOL" , "CD_RATE_PGM" ,
"CD_RATE_TYPE" , "CD_MARKET_SEGMENT" ) PCTFREE 10 INITRANS 6 MAXTRANS 255
STORAGE( FREELISTS 10) TABLESPACE "DDM_DATAMART_INDEX_L" NOLOGGING
PARALLEL ( DEGREE 4 INSTANCES 1) LOCAL(PARTITION "PART1" PCTFREE 10
INITRANS 6 MAXTRANS 255 STORAGE(INITIAL 4194304 NEXT 4259840 MINEXTENTS 1
MAXEXTENTS 100000 PCTINCREASE 0 FREELISTS 10 FREELIST GROUPS 1 BUFFER_POOL
DEFAULT) TABLESPACE "DDM_DATAMART_INDEX_L" NOLOGGING, PARTITION "PART2"
PCTFREE 10 INITRANS 6 MAXTRANS 255 STORAGE(INITIAL 4194304 NEXT 4259840
MINEXTENTS 1 MAXEXTENTS 100000 PCTINCREASE 0 FREELISTS 10 FREELIST GROUPS
1 BUFFER_POOL DEFAULT) TABLESPACE "DDM_DATAMART_INDEX_L" NOLOGGING,
PARTITION "PART3" PCTFREE 10 INITRANS 6 MAXTRANS 255 STORAGE(INITIAL
4194304 NEXT 4259840 MINEXTENTS 1 MAXEXTENTS 100000 PCTINCREASE 0
FREELISTS 10 FREELIST GROUPS 1 BUFFER_POOL DEFAULT) TABLESPACE
"DDM_DATAMART_INDEX_L" NOLOGGING, PARTITION "PART4" PCTFREE 10 INITRANS 6
MAXTRANS 255 STORAGE(INITIAL 4194304 NEXT 4259840 MINEXTENTS 1 MAXEXTENTS
100000 PCTINCREASE 0 FREELISTS 10 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "DDM_DATAMART_INDEX_L" NOLOGGING, PARTITION "PART5" PCTFREE 10
INITRANS 6 MAXTRANS 255 STORAGE(INITIAL 4194304 NEXT 4259840 MINEXTENTS 1
MAXEXTENTS 100000 PCTINCREASE 0 FREELISTS 10 FREELIST GROUPS 1 BUFFER_POOL
DEFAULT) TABLESPACE "DDM_DATAMART_INDEX_L" NOLOGGING, PARTITION "PART6"
PCTFREE 10 INITRANS 6 MAXTRANS 255 STORAGE(INITIAL 4194304 NEXT 4259840
MINEXTENTS 1 MAXEXTENTS 100000 PCTINCREASE 0 FREELISTS 10 FREELIST GROUPS
1 BUFFER_POOL DEFAULT) TABLESPACE "DDM_DATAMART_INDEX_L" NOLOGGING ) ;
ORACLE Bitmap index
CREATE BITMAP INDEX "CRDBO"."SNAPSHOT_MONTH_IDX13" ON
"SNAPSHOT_OPPTY_MONTH_HIST" ("SNAPSHOT_YEAR" ) PCTFREE 10 INITRANS 2
MAXTRANS 255 STORAGE(INITIAL 4194304 NEXT 4194304 MINEXTENTS 2 MAXEXTENTS
2147483645 PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL
DEFAULT) TABLESPACE "SFA_DATAMART_INDEX" NOLOGGING ;
ORACLE Table Clusters
CREATE CLUSTER "MRDW"."CT_INTRMDRY_CAL" ("ID_YEAR_CAL" NUMBER(4, 0),
"ID_MONTH_CAL" NUMBER(2, 0), "ID_PROPERTY" NUMBER(5, 0)) SIZE 16384
PCTFREE 10 PCTUSED 90 INITRANS 3 MAXTRANS 255 STORAGE(INITIAL
83886080 NEXT 41943040 MINEXTENTS 1 MAXEXTENTS 1017 PCTINCREASE 0
FREELISTS 4 FREELIST GROUPS 1 BUFFER_POOL RECYCLE) TABLESPACE
"TSS_FACT" ;
Netezza 
CREATE TABLE MRDWDDM.RDWF_DDM_ROOMS_SOLD ( 
ID_PROPERTY numeric(5, 0) NOT NULL , 
ID_DATE_STAY integer NOT NULL , 
CD_ROOM_POOL CHAR(4) NOT NULL , 
CD_RATE_PGM CHAR(4) NOT NULL , 
CD_RATE_TYPE CHAR(1) NOT NULL , 
CD_MARKET_SEGMENT CHAR(2) NOT NULL , 
ID_CONFO_NUM_ORIG integer NOT NULL , 
ID_CONFO_NUM_CUR integer NOT NULL , 
ID_DATE_CREATE integer NOT NULL , 
ID_DATE_ARRIVAL integer NOT NULL , 
ID_DATE_DEPART integer NOT NULL , 
QY_ROOMS integer NOT NULL , 
CU_REV_PROJ_NET_LOCAL numeric(21, 3) NOT NULL , 
CU_REV_PROJ_NET_USD numeric(21, 3) NOT NULL , 
QY_DAYS_STAY_CUR smallint NOT NULL , 
CD_BOOK_SOURCE CHAR(1) NOT NULL) 
distribute on random; 
• No indexes
• No physical tuning/admin
• Stripe data randomly or by columns
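The two striping options can be illustrated with a small sketch. Everything here is an assumption made for illustration: CRC32 stands in for whatever hash the appliance actually uses, `distribute on random` is modelled as simple round-robin, and the slice count, function names, and sample rows are hypothetical.

```python
import itertools
import zlib

NUM_SLICES = 4  # hypothetical number of data slices

def slice_for_key(value, num_slices=NUM_SLICES):
    """Column distribution: equal key values always land on the same slice."""
    return zlib.crc32(str(value).encode("utf-8")) % num_slices

def round_robin(num_slices=NUM_SLICES):
    """Stand-in for 'distribute on random': spread rows evenly, ignoring values."""
    return itertools.cycle(range(num_slices))

rows = [{"ID_PROPERTY": p, "QY_ROOMS": q}
        for p, q in [(101, 2), (205, 1), (101, 4), (333, 3)]]

by_column = [slice_for_key(r["ID_PROPERTY"]) for r in rows]
rr = round_robin()
by_random = [next(rr) for _ in rows]

print(by_column)  # rows 0 and 2 share ID_PROPERTY, so they share a slice
print(by_random)  # [0, 1, 2, 3]
```

Distributing on a column keeps rows with the same key together, which helps joins on that key; random distribution gives an even spread but no co-location.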
Data In 
Data Integration 
 Ab Initio 
 Cloudera 
 Composite Software 
 IBM Big Insights 
 IBM Information Server 
 IBM InfoSphere Streams 
 Informatica 
 Oracle Data Integrator 
 Oracle GoldenGate 
 SAP Business Objects 
SQL ODBC JDBC OLE-DB 
Reporting and Analysis 
 IBM Cognos 
 IBM SPSS 
 IBM Unica 
 Information Builders 
 Kalido 
 KXEN 
 Microsoft Excel 
 MicroStrategy 
 Oracle OBIEE 
 SAP Business Objects 
 SAS 
 Actuate 
Data Out 
SQL ODBC JDBC OLE-DB 
Figure: Netezza Appliance. Labels: Advanced Analytics, Loader, ETL, BI, Applications; Hosts; Network Fabric; S-Blades™ (each with FPGA, CPU, Memory); Disk Enclosures.
select DISTRICT,
PRODUCTGRP,
sum(NRX)
from MTHLY_RX_TERR_DATA
where MONTH = '20091201'
and MARKET = 509123
and SPECIALTY = 'GASTRO'
Slice of table MTHLY_RX_TERR_DATA (compressed)
FPGA Core
• Uncompress
• Project: select DISTRICT, PRODUCTGRP, sum(NRX)
• Restrict, Visibility: where MONTH = '20091201' and MARKET = 509123 and SPECIALTY = 'GASTRO'
CPU Core
• Complex Σ Joins, Aggs, etc.: sum(NRX)
 Essentially: A big, fast SQL database
Commodity CPU, NIC, disk 
FPGA 
 Can do basic filtering in hardware, 
i.e., stream processing before data hits main memory 
 The four key components that make up TwinFin are SMP hosts, snippet blades (called S-Blades), disk enclosures, and a network fabric.
 The disk enclosures contain high-density, high-performance 
disks. 
 Each disk contains a slice of the data in the database table, 
along with a mirror of the data on another disk. 
 The storage arrays are connected to the S-Blades via high-speed 
interconnects that allow all the disks to 
simultaneously stream data to the S-Blades at the fastest 
rate possible. 
 The SMP hosts are high-performance Linux servers that are set up in an active-passive configuration for high availability.
 The active host presents a standardized interface to external tools and 
applications, such as BI and ETL tools and load utilities. 
 It compiles SQL queries into executable code segments called snippets, 
creates optimized query plans and distributes the snippets to the S-Blades 
for execution. 
 S-Blades are intelligent processing nodes that make up the turbocharged 
MPP engine of the appliance. 
 Each S-Blade is an independent server that contains powerful multi-core 
CPUs, Netezza's unique multi-engine FPGAs and gigabytes of RAM--all 
balanced and working concurrently to deliver peak performance. 
 FPGAs are commodity chips that are designed to process data streams at extremely fast rates.
Figure: S-Blade, made up of an IBM BladeCenter Server and a Netezza DB Accelerator. Labels: Intel Quad-Core, DRAM, Dual-Core FPGA, SAS Expander Module.
 Netezza uses FPGAs for front-line processing: they filter data coming from disk and apply additional logic before passing it to memory on the SPU. The main advantages for data processing are:
 Parallelism and processing power are shifted away from the CPU.
 An FPGA has similar dimensions to a CPU, consumes 5 times less power, and has a clock speed about 5 times lower.
 Unnecessary data is filtered out.
 Low latency, high throughput.
 More caching capability.
 Netezza is the first company to leverage the power 
of FPGA to process streaming data in a data 
warehouse appliance. 
 In traditional systems, all the data for a query is 
moved and then the “where” clause is processed. 
With Netezza, instead of moving a huge set of data, 
the FPGA processes the “where” clause as data 
streams off of the disk, so only the data needed for 
processing is moved to the next step. 
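The streaming idea can be sketched in software, using the MTHLY_RX_TERR_DATA query shown earlier in this deck. This is only an analogy for what the hardware does: the function names `fpga_stage` and `cpu_stage`, the JSON-lines storage format, and the sample rows are hypothetical, and grouping by DISTRICT and PRODUCTGRP is assumed because the slide's query shows no GROUP BY clause.

```python
import json
import zlib

def fpga_stage(compressed_slice, wanted_columns, predicate):
    """Streaming stage: uncompress, restrict (where), and project (select list)."""
    for line in zlib.decompress(compressed_slice).decode("utf-8").splitlines():
        row = json.loads(line)
        if predicate(row):                                   # restrict
            yield {col: row[col] for col in wanted_columns}  # project

def cpu_stage(rows):
    """Downstream stage: aggregate only the rows that survived filtering."""
    totals = {}
    for row in rows:
        key = (row["DISTRICT"], row["PRODUCTGRP"])
        totals[key] = totals.get(key, 0) + row["NRX"]
    return totals

# Hypothetical sample rows for one slice of MTHLY_RX_TERR_DATA.
sample = [
    {"DISTRICT": "D1", "PRODUCTGRP": "P1", "NRX": 4,
     "MONTH": "20091201", "MARKET": 509123, "SPECIALTY": "GASTRO"},
    {"DISTRICT": "D1", "PRODUCTGRP": "P1", "NRX": 6,
     "MONTH": "20091201", "MARKET": 509123, "SPECIALTY": "GASTRO"},
    {"DISTRICT": "D2", "PRODUCTGRP": "P1", "NRX": 9,
     "MONTH": "20091101", "MARKET": 509123, "SPECIALTY": "GASTRO"},
]
compressed = zlib.compress("\n".join(json.dumps(r) for r in sample).encode("utf-8"))

def where(r):
    return (r["MONTH"] == "20091201" and r["MARKET"] == 509123
            and r["SPECIALTY"] == "GASTRO")

result = cpu_stage(fpga_stage(compressed, ["DISTRICT", "PRODUCTGRP", "NRX"], where))
print(result)  # {('D1', 'P1'): 10}
```

Because `fpga_stage` is a generator, rows that fail the predicate are dropped as they stream past and never reach the aggregation step, which is the point the slide makes about moving only the data that is needed.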
 As discussed earlier, each disk in the appliance is divided into primary, mirror, and temp (swap) partitions.
 The primary partition stores user data such as database tables. The mirror partition stores a copy of the primary partition of another disk, so that it can be used if that disk fails. The temp/swap partition holds data temporarily, for example when the appliance redistributes data while processing queries.
 The logical representation of the data saved in the primary partition of each disk is called the data slice.
 When users create database tables and load data into them, the data is distributed across the available data slices.
 The logical representation of a data slice is called the data partition.
 In TwinFin systems, each S-Blade (SPU) is connected to 8 data partitions, although some are connected to only 6 (since some disks are reserved for failover).
 In some situations, such as an SPU failure, an SPU can have more than 8 partitions attached to it because it has been assigned some of the data partitions from the failed SPU.
 The SPU 1001 is connected to 8 data partitions numbered 0 to 7.
 Each data partition is connected to one data slice, and the data slices are stored on different disks.
 For example, data partition 0 points to data slice 17, stored on the disk with ID 1063.
 Disk 1063 also stores the mirror of data slice 18, which is stored on disk 1064.
 The following diagram illustrates what happens when the disk 1070 fails. 
 Immediately after disk 1070 stops responding, the system uses disk 1069 to satisfy queries that require data from data slices 23 and 24.
 Disk 1069 serves the requests using the data in both its primary and mirror partitions.
 In the meantime, the contents of disk 1070 are regenerated on one of the spare disks in the disk array (in this case disk 1100), using the data on disk 1069.
 Once the regeneration is complete, SPU data partition 7 is updated to point to data slice 24 on disk 1100.
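The failover and regeneration steps above can be modelled with two small lookup tables. This is a hypothetical in-memory sketch, not appliance code: the dictionaries and function names are invented, and it assumes that disks 1069 and 1070 mirror each other's data slices (23 and 24).

```python
# Hypothetical model; the disk and slice IDs are the ones used in the example above.
primary_disk = {23: 1069, 24: 1070}  # data slice -> disk holding its primary copy
mirror_disk = {23: 1070, 24: 1069}   # data slice -> disk holding its mirror copy
spares = [1100]

def disk_for_read(data_slice, failed):
    """Serve a slice from its primary disk, or from the mirror if the primary failed."""
    disk = primary_disk[data_slice]
    return mirror_disk[data_slice] if disk in failed else disk

def regenerate(failed_disk):
    """Rebuild everything the failed disk held onto a spare and repoint the maps."""
    spare = spares.pop(0)
    for mapping in (primary_disk, mirror_disk):
        for data_slice, disk in list(mapping.items()):
            if disk == failed_disk:
                mapping[data_slice] = spare
    return spare

failed = {1070}
during_failure = [disk_for_read(s, failed) for s in (23, 24)]
new_disk = regenerate(1070)
after_regen = [disk_for_read(s, failed) for s in (23, 24)]

print(during_failure)         # [1069, 1069]
print(new_disk, after_regen)  # 1100 [1069, 1100]
```

While disk 1070 is down, both slices are read from disk 1069; after regeneration, data slice 24 is served from the spare disk 1100.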
 When an SPU fails, the appliance assigns all of its data partitions to other SPUs in the system.
 A pair of disks that contain the mirror copies of each other's data slices is assigned to the same SPU, which gives the target SPU two additional data partitions to manage.
 For example, if an SPU currently manages data partitions 0 to 7 and the appliance reassigns two data partitions from a failed SPU, the SPU will have 10 data partitions to manage, numbered 0 to 9.
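The reassignment can be sketched as follows. The SPU IDs other than 1001, the slice numbers above 24, the round-robin choice of target SPU, and the assumption that consecutive slices form a mirrored pair are all hypothetical; the sketch only shows how a surviving SPU ends up with partitions 0 to 9.

```python
def reassign_after_spu_failure(assignments, failed_spu):
    """Move the failed SPU's slices to surviving SPUs, one mirrored pair at a time."""
    orphaned = assignments.pop(failed_spu)
    survivors = sorted(assignments)
    # Slices are taken two at a time so each mirrored pair stays on one SPU.
    pairs = [orphaned[i:i + 2] for i in range(0, len(orphaned), 2)]
    for i, pair in enumerate(pairs):
        target = survivors[i % len(survivors)]
        assignments[target].extend(pair)
    return assignments

# Hypothetical appliance with five SPUs, eight data slices each.
spus = {
    1001: [17, 18, 19, 20, 21, 22, 23, 24],
    1002: [25, 26, 27, 28, 29, 30, 31, 32],
    1003: [33, 34, 35, 36, 37, 38, 39, 40],
    1004: [41, 42, 43, 44, 45, 46, 47, 48],
    1005: [49, 50, 51, 52, 53, 54, 55, 56],
}
after = reassign_after_spu_failure(spus, failed_spu=1005)
partitions_on_1001 = dict(enumerate(after[1001]))  # partition number -> data slice

print(len(after[1001]))            # 10
print(sorted(partitions_on_1001))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```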
Speed
• Hardware-based data streaming
Scalability
• True MPP offers enterprise scale-out
Simple
• Black-box appliance with no tuning or storage administration
Smart
• Built-in advanced analytics pushed deep into database
NO NO NO NO 
NO YES NO LIMITED 
Teradata Results In IBM Netezza Client Advantage
Costs
• Teradata: High initial cost; lots of professional services; lots of administration
• Results in: High cost of ownership
• IBM Netezza, client advantage: Low initial cost; little administration; low total cost of ownership
Smart
• Teradata: Limited analytics pushdown; analytics causes resource contention
• Results in: Poor analytic performance
• IBM Netezza, client advantage: Minimal contention due to analytics; more customers benefit from faster analytics
Simplicity
• Teradata: Constant tuning for performance; needs much administration
• Results in: Difficult and slow to provide business value
• IBM Netezza, client advantage: True appliance; no tuning; faster time to value
Speed
• Teradata: Old inefficient legacy code; complex workload partitions
• Results in: Data warehouse performance doesn't scale consistently
• IBM Netezza, client advantage: Designed for balance; highest / most consistent data warehouse and advanced analytics performance
Architecture
• Teradata: Proprietary interconnect; virtualized MPP nodes (vAMPs); separating compute and storage
• Results in: Unpredictable performance
• IBM Netezza, client advantage: True MPP; FPGA acceleration; best architecture for data warehouse and advanced analytics
Oracle Exadata Results In IBM Netezza Client Advantage
Costs
• Oracle Exadata: High initial cost; lots of administration
• Results in: High total cost of ownership
• IBM Netezza, client advantage: Low initial cost; little administration; low total cost of ownership
Smart
• Oracle Exadata: Limited analytics pushdown; inefficiency of Oracle Real Application Clusters (RAC)
• Results in: Poor analytic performance
• IBM Netezza, client advantage: Extensive analytics pushdown capabilities; fast time to insight; more users benefit from faster analytics
Simplicity
• Oracle Exadata: Complexity of Oracle RAC; constant tuning for performance; complex patch process
• Results in: Complex administration
• IBM Netezza, client advantage: True appliance; no tuning; faster time to value
Scalability
• Oracle Exadata: No proof points on scaling; RAC scalability bottleneck
• Results in: Business growth risk
• IBM Netezza, client advantage: Proven scalability; business growth with confidence
Speed
• Oracle Exadata: Designed for OLTP; RAC is inefficient for data warehouse workloads
• Results in: Poor data warehouse performance
• IBM Netezza, client advantage: Designed for data warehousing; highest data warehouse performance
Architecture
• Oracle Exadata: Clustered SMP database layer + shared disk MPP storage layer
• Results in: Compromised performance
• IBM Netezza, client advantage: True MPP; FPGA acceleration; best architecture for data warehousing and advanced analytics
49 
10/12/2014 49
50 10/12/2014

More Related Content

What's hot

Enabling a Data Mesh Architecture with Data Virtualization
Enabling a Data Mesh Architecture with Data VirtualizationEnabling a Data Mesh Architecture with Data Virtualization
Enabling a Data Mesh Architecture with Data VirtualizationDenodo
 
introduction to NOSQL Database
introduction to NOSQL Databaseintroduction to NOSQL Database
introduction to NOSQL Databasenehabsairam
 
Introduction to column oriented databases
Introduction to column oriented databasesIntroduction to column oriented databases
Introduction to column oriented databasesArangoDB Database
 
Building an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureBuilding an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureJames Serra
 
Data Virtualization: An Introduction
Data Virtualization: An IntroductionData Virtualization: An Introduction
Data Virtualization: An IntroductionDenodo
 
Date warehousing concepts
Date warehousing conceptsDate warehousing concepts
Date warehousing conceptspcherukumalla
 
Azure data platform overview
Azure data platform overviewAzure data platform overview
Azure data platform overviewJames Serra
 
Teoria de Base de datos
Teoria de Base de datosTeoria de Base de datos
Teoria de Base de datosUniandes
 
Fundamentals of Database ppt ch02
Fundamentals of Database ppt ch02Fundamentals of Database ppt ch02
Fundamentals of Database ppt ch02Jotham Gadot
 
Databricks on AWS.pptx
Databricks on AWS.pptxDatabricks on AWS.pptx
Databricks on AWS.pptxWasm1953
 
Database systems - Chapter 2
Database systems - Chapter 2Database systems - Chapter 2
Database systems - Chapter 2shahab3
 
Database performance tuning and query optimization
Database performance tuning and query optimizationDatabase performance tuning and query optimization
Database performance tuning and query optimizationUsman Tariq
 
Tokenization on the Node - Data Protection for Security and Compliance
Tokenization on the Node - Data Protection for Security and ComplianceTokenization on the Node - Data Protection for Security and Compliance
Tokenization on the Node - Data Protection for Security and ComplianceUlf Mattsson
 
Difference between fact tables and dimension tables
Difference between fact tables and dimension tablesDifference between fact tables and dimension tables
Difference between fact tables and dimension tablesKamran Haider
 

What's hot (20)

Enabling a Data Mesh Architecture with Data Virtualization
Enabling a Data Mesh Architecture with Data VirtualizationEnabling a Data Mesh Architecture with Data Virtualization
Enabling a Data Mesh Architecture with Data Virtualization
 
introduction to NOSQL Database
introduction to NOSQL Databaseintroduction to NOSQL Database
introduction to NOSQL Database
 
U7 postgre sql
U7 postgre sqlU7 postgre sql
U7 postgre sql
 
Introduction to column oriented databases
Introduction to column oriented databasesIntroduction to column oriented databases
Introduction to column oriented databases
 
Building an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureBuilding an Effective Data Warehouse Architecture
Building an Effective Data Warehouse Architecture
 
Data Virtualization: An Introduction
Data Virtualization: An IntroductionData Virtualization: An Introduction
Data Virtualization: An Introduction
 
Date warehousing concepts
Date warehousing conceptsDate warehousing concepts
Date warehousing concepts
 
Raid
RaidRaid
Raid
 
Azure data platform overview
Azure data platform overviewAzure data platform overview
Azure data platform overview
 
Teoria de Base de datos
Teoria de Base de datosTeoria de Base de datos
Teoria de Base de datos
 
Fundamentals of Database ppt ch02
Fundamentals of Database ppt ch02Fundamentals of Database ppt ch02
Fundamentals of Database ppt ch02
 
Databricks on AWS.pptx
Databricks on AWS.pptxDatabricks on AWS.pptx
Databricks on AWS.pptx
 
Database systems - Chapter 2
Database systems - Chapter 2Database systems - Chapter 2
Database systems - Chapter 2
 
Database performance tuning and query optimization
Database performance tuning and query optimizationDatabase performance tuning and query optimization
Database performance tuning and query optimization
 
IBM Cloud pak for data brochure
IBM Cloud pak for data   brochureIBM Cloud pak for data   brochure
IBM Cloud pak for data brochure
 
Introduction to Amazon Redshift
Introduction to Amazon RedshiftIntroduction to Amazon Redshift
Introduction to Amazon Redshift
 
Dfd
DfdDfd
Dfd
 
Tokenization on the Node - Data Protection for Security and Compliance
Tokenization on the Node - Data Protection for Security and ComplianceTokenization on the Node - Data Protection for Security and Compliance
Tokenization on the Node - Data Protection for Security and Compliance
 
Dimensional Modelling
Dimensional ModellingDimensional Modelling
Dimensional Modelling
 
Difference between fact tables and dimension tables
Difference between fact tables and dimension tablesDifference between fact tables and dimension tables
Difference between fact tables and dimension tables
 

Viewers also liked

An Introduction to Netezza
An Introduction to NetezzaAn Introduction to Netezza
An Introduction to NetezzaVijaya Chandrika
 
The IBM Netezza datawarehouse appliance
The IBM Netezza datawarehouse applianceThe IBM Netezza datawarehouse appliance
The IBM Netezza datawarehouse applianceIBM Danmark
 
Netezza Deep Dives
Netezza Deep DivesNetezza Deep Dives
Netezza Deep DivesRush Shah
 
Ibm pure data system for analytics n200x
Ibm pure data system for analytics n200xIbm pure data system for analytics n200x
Ibm pure data system for analytics n200xIBM Sverige
 
The IBM Netezza Data Warehouse Appliance
The IBM Netezza Data Warehouse ApplianceThe IBM Netezza Data Warehouse Appliance
The IBM Netezza Data Warehouse ApplianceIBM Sverige
 
Netezza fundamentals for developers
Netezza fundamentals for developersNetezza fundamentals for developers
Netezza fundamentals for developersBiju Nair
 
Using Netezza Query Plan to Improve Performace
Using Netezza Query Plan to Improve PerformaceUsing Netezza Query Plan to Improve Performace
Using Netezza Query Plan to Improve PerformaceBiju Nair
 
Data-Ed Webinar: A Framework for Implementing NoSQL, Hadoop
Data-Ed Webinar: A Framework for Implementing NoSQL, HadoopData-Ed Webinar: A Framework for Implementing NoSQL, Hadoop
Data-Ed Webinar: A Framework for Implementing NoSQL, HadoopDATAVERSITY
 
Postgre sql statements 03
Postgre sql statements 03Postgre sql statements 03
Postgre sql statements 03huynhle1990
 
Managing user Online Training in IBM Netezza DBA Development by www.etraining...
Managing user Online Training in IBM Netezza DBA Development by www.etraining...Managing user Online Training in IBM Netezza DBA Development by www.etraining...
Managing user Online Training in IBM Netezza DBA Development by www.etraining...Ravikumar Nandigam
 
Netezza Online Training by www.etraining.guru in India
Netezza Online Training by www.etraining.guru in IndiaNetezza Online Training by www.etraining.guru in India
Netezza Online Training by www.etraining.guru in IndiaRavikumar Nandigam
 
JPSPSLT-「WindowsAzure 最新事情」2014年2月版
JPSPSLT-「WindowsAzure 最新事情」2014年2月版JPSPSLT-「WindowsAzure 最新事情」2014年2月版
JPSPSLT-「WindowsAzure 最新事情」2014年2月版幸智 Yukinori 黒田 Kuroda
 
NENUG Apr14 Talk - data modeling for netezza
NENUG Apr14 Talk - data modeling for netezzaNENUG Apr14 Talk - data modeling for netezza
NENUG Apr14 Talk - data modeling for netezzaBiju Nair
 
Netezza workload management
Netezza workload managementNetezza workload management
Netezza workload managementBiju Nair
 
End-to-end solution demonstration: From concept to delivery-Intel/IBM
End-to-end solution demonstration: From concept to delivery-Intel/IBMEnd-to-end solution demonstration: From concept to delivery-Intel/IBM
End-to-end solution demonstration: From concept to delivery-Intel/IBMIBM_Info_Management
 
Row or Columnar Database
Row or Columnar DatabaseRow or Columnar Database
Row or Columnar DatabaseBiju Nair
 
Security best practices for informix
Security best practices for informixSecurity best practices for informix
Security best practices for informixIBM_Info_Management
 
Column-Stores vs. Row-Stores: How Different are they Really?
Column-Stores vs. Row-Stores: How Different are they Really?Column-Stores vs. Row-Stores: How Different are they Really?
Column-Stores vs. Row-Stores: How Different are they Really?Daniel Abadi
 

IBM Pure Data System for Analytics (Netezza)

  • 1. IBM PureData System for Analytics 10/12/2014 1
  • 2. At the end of this session, participants will understand the basic concepts of IBM PureData System for Analytics (Netezza):
      • IBM PureData System models and their components
      • IBM PureData System architecture
      • How it works
  • 3. “If you'd like to take us on, make our day.” - Larry Ellison, Oct 2009
      “Our goal is to become number one in the high-end server business for both Online Transaction Processing and Data Warehousing, both of those segments.” - Larry Ellison, Dec 2010
  • 4. One of “The five most important M&A Deals of 2010” - Wall Street Journal
  • 6. Characteristics of an appliance:
      • Dedicated device
      • Optimized for purpose
      • Complete solution
      • Fast installation
      • Very easy operation
      • Standard interfaces
      • Low cost
  • 7. PureData for Analytics: Where Big Data Meets Deep Analytics. Analytics without constraint.
  • 9. Why Netezza?
      • Seamless integration with Informatica, Business Objects, SAS and SQL Server (SSIS packages)
      • Very little DDL & SQL conversion
          • Used the same table structures
          • Converted the primary index to a distribution column
      • 10 to 200X performance improvements in BO reporting
      • Fast to deploy
      • Very appealing price-to-performance ratio
      • Ease of use
          • Administrative
          • DBA tasks
          • Supports all DB structures (3NF, star, denormalized tables)
  • 10. IBM PureData System for Analytics N1001:
      • The N1001 models include single-rack and multi-rack configurations.
      • The N1001 model family is an update to the IBM Netezza 1000 model family, with the same architectural and interface specifications.
      • Each N1001 storage array contains either two or four disk enclosures, depending upon the model.
      • Each disk enclosure has 12 disks.
      • For example, an N1001-005 system has one storage array with 48 disks.
  • 12. IBM PureData System for Analytics N2001:
      • The IBM® PureData™ System for Analytics N2001 family is the latest generation of data warehouse appliances.
      • It increases the capacity and performance of the N1001 models.
      • Within each rack are numerous components that work together to provide the asymmetric massively parallel processing of the Netezza® architecture.
      • The key hardware components include:
          • Snippet blades (S-Blades)
          • Hosts
          • Storage arrays
  • 13. Figure: summary of the IBM PureData System for Analytics N2001 half-rack, full-rack, and two-rack models.
  • 14. S-Blades:
      • The snippet processing functions are the responsibility of the S-Blade.
      • The S-Blade is a specialized processing board that combines the CPU processing power of a blade server with the query analysis intelligence of the Netezza Database Accelerator card.
      • The dual-board component occupies two slots of the S-Blade chassis.
      • Each chassis can contain up to 7 S-Blades.
  • 15. The Netezza Database Accelerator card contains the FPGA query engines, memory, and I/O for processing the data from the disks where user data is stored.
  • 16. Hosts:
      • The host server is a Linux server that runs the Netezza software and utilities.
      • The host controls and coordinates the activity of the appliance.
      • It performs query optimization, controls table and database operations, consolidates and returns query results, and monitors the Netezza system components to detect and report problems.
      • The host is a highly redundant, highly available server.
      • The Netezza 1000 systems have two hosts in a highly available (HA) configuration.
  • 17. Storage arrays:
      • The storage arrays contain the disks that store the user data and related processing files to support the query activity on the system.
      • In the N2001 model family, each disk enclosure has 24 disks.
      • There are 12 disk enclosures in each full rack, or 6 enclosures in a half-rack model.
      • In the N2001 family, each rack is one storage array.
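The enclosure figures quoted for the N1001 and N2001 families can be turned into disk counts with a little arithmetic. The sketch below is only a worked check of the numbers on the slides; the function and variable names are illustrative, and the counts cover every disk slot in the enclosures without distinguishing data disks from spares, which the slides do not break down.

```python
# Disks per enclosure, as stated on the N1001 and N2001 slides.
DISKS_PER_ENCLOSURE = {"N1001": 12, "N2001": 24}


def disks_in(family: str, enclosures: int) -> int:
    """Total disks for a given number of enclosures in a model family."""
    return DISKS_PER_ENCLOSURE[family] * enclosures


# N2001: 12 enclosures per full rack, 6 per half rack.
n2001_full_rack = disks_in("N2001", 12)
n2001_half_rack = disks_in("N2001", 6)

# N1001-005: one storage array with 48 disks, which implies four enclosures.
n1001_005 = disks_in("N1001", 4)

print(n2001_full_rack, n2001_half_rack, n1001_005)  # 288 144 48
```

The N1001-005 line doubles as a consistency check: 48 disks at 12 per enclosure matches the "two or four enclosures per array" statement only for the four-enclosure case.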
  • 18. AMPP architecture:
      • Netezza's appliances use a proprietary Asymmetric Massively Parallel Processing (AMPP) architecture that combines open, blade-based servers and disk storage with a proprietary data filtering process using field-programmable gate arrays (FPGAs).
      • The AMPP architecture is a two-tiered system designed to quickly handle very large queries from multiple users.
      • The first tier is a high-performance Linux SMP host that compiles query tasks received from business intelligence applications and generates query execution plans.
      • The host then divides a query into a sequence of sub-tasks, or snippets, that can be executed in parallel, and distributes the snippets to the second tier for execution.
      • The second tier consists of multiple snippet processing blades, or S-Blades, where all the primary processing work of the appliance is executed.
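The two-tier flow described above (the host plans a query as a sequence of snippets, the S-Blades run those snippets in parallel on their own data, the host consolidates the results) can be illustrated with a toy scatter/gather sketch. This is not Netezza code: the blade names, the data, and the `plan`, `run_on_blade`, and `execute` functions are all hypothetical, and threads stand in for independent blades.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data slices, one per S-Blade (names and values are illustrative).
SLICES = {
    "sblade-1": [3, 8, 15],
    "sblade-2": [4, 16],
    "sblade-3": [23, 42, 1, 7],
}


def plan(threshold):
    """Tier 1 (host): turn a request into an ordered sequence of snippets."""
    return [
        lambda rows: [r for r in rows if r > threshold],  # snippet 1: filter
        lambda rows: sum(rows),                            # snippet 2: partial sum
    ]


def run_on_blade(rows, snippets):
    """Tier 2 (S-Blade): run the snippet sequence against the local slice."""
    result = rows
    for snippet in snippets:
        result = snippet(result)
    return result


def execute(threshold):
    """Host: distribute the snippets, then consolidate the partial results."""
    snippets = plan(threshold)
    with ThreadPoolExecutor(max_workers=len(SLICES)) as pool:
        partials = list(
            pool.map(lambda rows: run_on_blade(rows, snippets), SLICES.values())
        )
    return sum(partials)


total = execute(5)
print(total)  # 111
```

Each blade only ever touches its own slice, so adding blades adds both storage and processing capacity; the host's share of the work is limited to planning and the final consolidation.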
  • 20. IBM PureData System for Analytics: The Simple Appliance for Serious Analytics
      • Built-in Expertise
          • No indexes or tuning
          • Data model agnostic
          • Fully parallel, optimized in-database analytics
      • Integration by Design
          • Server, storage, and database in one easy-to-use package
          • Automatic parallelization and resource optimization to scale economically
          • Enterprise-class security and platform management
      • Simplified Experience
          • Up and running in hours
          • Minimal ongoing administration
          • Standard interfaces to best-of-breed analytics, BI, and data integration tools
          • Built-in analytics capabilities allow users to derive insight from data quickly
          • Easy connectivity to other Big Data Platform components
  • 21. System for Analytics: delivering data services for analytics. Optimized exclusively for analytic data workloads.
      • Speed
          • 10-100x faster than traditional custom systems*
          • Patented MPP (Massively Parallel Processing) hardware acceleration
      • Simplicity
          • Data load ready in hours
          • No database indexes
          • No tuning
          • No storage administration
      • Scalability
          • Peta-scale data capacity
      • Smart
          • Designed to run complex analytics in minutes, not hours
          • Richest set of in-database analytics
      * Based on IBM customers' reported results. "Traditional custom systems" refers to systems that are not professionally pre-built, pre-tested and optimized. Individual results may vary.
  • 22. Move analytics into the Data Warehouse
      • Integrate the server, storage and database into one optimized package
      • Move complex analytics into the database
      • Integrated, high-performance analytics within the data warehouse
      Diagram labels: Analytics, Database, Storage, Server
  • 25. Simplicity and Ease of Administration
      • No dbspace/tablespace sizing and configuration
      • No redo/physical/logical log sizing and configuration
      • No page/block sizing and configuration for tables
      • No extent sizing and configuration for tables
      • No temp space allocation and monitoring
      • No logical volume creation for files
      • No integration of OS kernel recommendations
      • No maintenance of OS recommended patch levels
    Data Experts, not Database Experts
      • Easy administration portal
      • No software installation
      • No indexes and tuning
      • No storage administration
  • 26. Layers to plan and configure in a traditional stack:
      0. CREATE DATABASE TEST
           LOGFILE 'E:OraDataTESTLOG1TEST.ORA' SIZE 2M,
                   'E:OraDataTESTLOG2TEST.ORA' SIZE 2M,
                   'E:OraDataTESTLOG3TEST.ORA' SIZE 2M,
                   'E:OraDataTESTLOG4TEST.ORA' SIZE 2M,
                   'E:OraDataTESTLOG5TEST.ORA' SIZE 2M
           EXTENT MANAGEMENT LOCAL
           MAXDATAFILES 100
           DATAFILE 'E:OraDataTESTSYS1TEST.ORA' SIZE 50 M
           DEFAULT TEMPORARY TABLESPACE temp TEMPFILE 'E:OraDataTESTTEMP.ORA' SIZE 50 M
           UNDO TABLESPACE undo DATAFILE 'E:OraDataTESTUNDO.ORA' SIZE 50 M
           NOARCHIVELOG
           CHARACTER SET WE8ISO8859P1;
      1. Oracle* table and indexes
      2. Oracle tablespace
      3. Oracle datafile
      4. Veritas file
      5. Veritas file system
      6. Veritas striped logical volume
      7. Veritas mirror/plex
      8. Veritas sub-disk
      9. SunOS raw device
      10. Brocade SAN switch
      11. EMC Symmetrix volume
      12. EMC Symmetrix striped meta-volume
      13. EMC Symmetrix hyper-volume
      14. EMC Symmetrix remote volume (replication)
      15. Days/weeks of planning meetings
    Netezza: Low (ZERO) Touch:
      CREATE DATABASE my_db;
  • 27. ORACLE table:
      CREATE TABLE "MRDWDDM"."RDWF_DDM_ROOMS_SOLD" (
        "ID_PROPERTY" NUMBER(5, 0) NOT NULL ENABLE,
        "ID_DATE_STAY" NUMBER(5, 0) NOT NULL ENABLE,
        "CD_ROOM_POOL" CHAR(4) NOT NULL ENABLE,
        "CD_RATE_PGM" CHAR(4) NOT NULL ENABLE,
        "CD_RATE_TYPE" CHAR(1) NOT NULL ENABLE,
        "CD_MARKET_SEGMENT" CHAR(2) NOT NULL ENABLE,
        "ID_CONFO_NUM_ORIG" NUMBER(9, 0) NOT NULL ENABLE,
        "ID_CONFO_NUM_CUR" NUMBER(9, 0) NOT NULL ENABLE,
        "ID_DATE_CREATE" NUMBER(5, 0) NOT NULL ENABLE,
        "ID_DATE_ARRIVAL" NUMBER(5, 0) NOT NULL ENABLE,
        "ID_DATE_DEPART" NUMBER(5, 0) NOT NULL ENABLE,
        "QY_ROOMS" NUMBER(5, 0) NOT NULL ENABLE,
        "CU_REV_PROJ_NET_LOCAL" NUMBER(21, 3) NOT NULL ENABLE,
        "CU_REV_PROJ_NET_USD" NUMBER(21, 3) NOT NULL ENABLE,
        "QY_DAYS_STAY_CUR" NUMBER(3, 0) NOT NULL ENABLE,
        "CD_BOOK_SOURCE" CHAR(1) NOT NULL ENABLE)
      PCTFREE 5 PCTUSED 95 INITRANS 4 MAXTRANS 255 STORAGE( FREELISTS 6) TABLESPACE "DDM_ROOMS_SOLD_DATA" NOLOGGING
      PARTITION BY RANGE ("ID_PROPERTY" ) (
        PARTITION "PART1" VALUES LESS THAN (600) PCTFREE 5 PCTUSED 95 INITRANS 4 MAXTRANS 255 STORAGE(INITIAL 16777216 FREELISTS 6 FREELIST GROUPS 1) TABLESPACE "DDM_ROOMS_SOLD_DATA" NOLOGGING NOCOMPRESS,
        PARTITION "PART2" VALUES LESS THAN (1200) PCTFREE 5 PCTUSED 95 INITRANS 4 MAXTRANS 255 STORAGE(INITIAL 16777216 FREELISTS 6 FREELIST GROUPS 1) TABLESPACE "DDM_ROOMS_SOLD_DATA" NOLOGGING NOCOMPRESS,
        PARTITION "PART3" VALUES LESS THAN (1800) PCTFREE 5 PCTUSED 95 INITRANS 4 MAXTRANS 255 STORAGE(INITIAL 16777216 FREELISTS 6 FREELIST GROUPS 1) TABLESPACE "DDM_ROOMS_SOLD_DATA" NOLOGGING NOCOMPRESS,
        PARTITION "PART4" VALUES [remainder cut off on the slide]

    ORACLE indexes:
      CREATE INDEX "MRDWDDM"."RDWF_DDM_ROOMS_SOLD_IDX1" ON "RDWF_DDM_ROOMS_SOLD"
        ("ID_PROPERTY" , "ID_DATE_STAY" , "CD_ROOM_POOL" , "CD_RATE_PGM" , "CD_RATE_TYPE" , "CD_MARKET_SEGMENT" )
      PCTFREE 10 INITRANS 6 MAXTRANS 255 STORAGE( FREELISTS 10) TABLESPACE "DDM_DATAMART_INDEX_L" NOLOGGING
      PARALLEL ( DEGREE 4 INSTANCES 1) LOCAL(
        PARTITION "PART1" PCTFREE 10 INITRANS 6 MAXTRANS 255 STORAGE(INITIAL 4194304 NEXT 4259840 MINEXTENTS 1 MAXEXTENTS 100000 PCTINCREASE 0 FREELISTS 10 FREELIST GROUPS 1 BUFFER_POOL DEFAULT) TABLESPACE "DDM_DATAMART_INDEX_L" NOLOGGING,
        PARTITION "PART2" PCTFREE 10 INITRANS 6 MAXTRANS 255 STORAGE(INITIAL 4194304 NEXT 4259840 MINEXTENTS 1 MAXEXTENTS 100000 PCTINCREASE 0 FREELISTS 10 FREELIST GROUPS 1 BUFFER_POOL DEFAULT) TABLESPACE "DDM_DATAMART_INDEX_L" NOLOGGING,
        PARTITION "PART3" PCTFREE 10 INITRANS 6 MAXTRANS 255 STORAGE(INITIAL 4194304 NEXT 4259840 MINEXTENTS 1 MAXEXTENTS 100000 PCTINCREASE 0 FREELISTS 10 FREELIST GROUPS 1 BUFFER_POOL DEFAULT) TABLESPACE "DDM_DATAMART_INDEX_L" NOLOGGING,
        PARTITION "PART4" PCTFREE 10 INITRANS 6 MAXTRANS 255 STORAGE(INITIAL 4194304 NEXT 4259840 MINEXTENTS 1 MAXEXTENTS 100000 PCTINCREASE 0 FREELISTS 10 FREELIST GROUPS 1 BUFFER_POOL DEFAULT) TABLESPACE "DDM_DATAMART_INDEX_L" NOLOGGING,
        PARTITION "PART5" PCTFREE 10 INITRANS 6 MAXTRANS 255 STORAGE(INITIAL 4194304 NEXT 4259840 MINEXTENTS 1 MAXEXTENTS 100000 PCTINCREASE 0 FREELISTS 10 FREELIST GROUPS 1 BUFFER_POOL DEFAULT) TABLESPACE "DDM_DATAMART_INDEX_L" NOLOGGING,
        PARTITION "PART6" PCTFREE 10 INITRANS 6 MAXTRANS 255 STORAGE(INITIAL 4194304 NEXT 4259840 MINEXTENTS 1 MAXEXTENTS 100000 PCTINCREASE 0 FREELISTS 10 FREELIST GROUPS 1 BUFFER_POOL DEFAULT) TABLESPACE "DDM_DATAMART_INDEX_L" NOLOGGING ) ;

    ORACLE bitmap index:
      CREATE BITMAP INDEX "CRDBO"."SNAPSHOT_MONTH_IDX13" ON "SNAPSHOT_OPPTY_MONTH_HIST" ("SNAPSHOT_YEAR" )
      PCTFREE 10 INITRANS 2 MAXTRANS 255 STORAGE(INITIAL 4194304 NEXT 4194304 MINEXTENTS 2 MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT) TABLESPACE "SFA_DATAMART_INDEX" NOLOGGING ;

    ORACLE table clusters:
      CREATE CLUSTER "MRDW"."CT_INTRMDRY_CAL" ("ID_YEAR_CAL" NUMBER(4, 0), "ID_MONTH_CAL" NUMBER(2, 0), "ID_PROPERTY" NUMBER(5, 0)) SIZE 16384
      PCTFREE 10 PCTUSED 90 INITRANS 3 MAXTRANS 255 STORAGE(INITIAL 83886080 NEXT 41943040 MINEXTENTS 1 MAXEXTENTS 1017 PCTINCREASE 0 FREELISTS 4 FREELIST GROUPS 1 BUFFER_POOL RECYCLE) TABLESPACE "TSS_FACT" ;

    Netezza:
      CREATE TABLE MRDWDDM.RDWF_DDM_ROOMS_SOLD (
        ID_PROPERTY numeric(5, 0) NOT NULL,
        ID_DATE_STAY integer NOT NULL,
        CD_ROOM_POOL CHAR(4) NOT NULL,
        CD_RATE_PGM CHAR(4) NOT NULL,
        CD_RATE_TYPE CHAR(1) NOT NULL,
        CD_MARKET_SEGMENT CHAR(2) NOT NULL,
        ID_CONFO_NUM_ORIG integer NOT NULL,
        ID_CONFO_NUM_CUR integer NOT NULL,
        ID_DATE_CREATE integer NOT NULL,
        ID_DATE_ARRIVAL integer NOT NULL,
        ID_DATE_DEPART integer NOT NULL,
        QY_ROOMS integer NOT NULL,
        CU_REV_PROJ_NET_LOCAL numeric(21, 3) NOT NULL,
        CU_REV_PROJ_NET_USD numeric(21, 3) NOT NULL,
        QY_DAYS_STAY_CUR smallint NOT NULL,
        CD_BOOK_SOURCE CHAR(1) NOT NULL)
      distribute on random;
      • No indexes
      • No physical tuning/admin
      • Stripe data randomly, or by columns
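The choice between striping data randomly and striping it by columns can be pictured with a small placement sketch. Everything here is an assumption made for illustration: the number of data slices, the use of CRC32 as the hash, and round-robin as a stand-in for random placement are not taken from the slides, and the sample rows are invented.

```python
import zlib
from collections import Counter
from itertools import cycle

NUM_SLICES = 4  # hypothetical number of data slices


def distribute_on_column(rows, key):
    """Hash distribution: rows with equal key values land on the same slice."""
    placement = Counter()
    for row in rows:
        slice_id = zlib.crc32(str(row[key]).encode()) % NUM_SLICES
        placement[slice_id] += 1
    return placement


def distribute_on_random(rows):
    """Round-robin stand-in for random distribution: an even spread of rows."""
    placement = Counter()
    for _row, slice_id in zip(rows, cycle(range(NUM_SLICES))):
        placement[slice_id] += 1
    return placement


# Invented sample: six of the eight rows share the same ID_PROPERTY value.
rows = [{"ID_PROPERTY": p} for p in (100, 100, 100, 100, 100, 100, 200, 300)]

by_column = distribute_on_column(rows, "ID_PROPERTY")
at_random = distribute_on_random(rows)

print("rows per slice, by column:", dict(by_column))
print("rows per slice, random:   ", dict(at_random))
```

With a skewed column, hashing puts at least six of the eight rows on one slice, while the random-style placement gives every slice two rows. The trade-off is that only the column-based placement keeps rows with the same key together.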
  • 28. Data In (via SQL, ODBC, JDBC, OLE-DB): Data Integration
      • Ab Initio
      • Cloudera
      • Composite Software
      • IBM Big Insights
      • IBM Information Server
      • IBM InfoSphere Streams
      • Informatica
      • Oracle Data Integrator
      • Oracle GoldenGate
      • SAP Business Objects
  • 29. Reporting and Analysis  IBM Cognos  IBM SPSS  IBM Unica  Information Builders  Kalido  KXEN  Microsoft Excel  MicroStrategy  Oracle OBIEE  SAP Business Objects  SAS  Actuate Data Out SQL ODBC JDBC OLE-DB 10/12/2014 29
  • 31. [Diagram: Netezza Appliance. External applications: Advanced Analytics, Loader, ETL, BI. Appliance components: Hosts, Network Fabric, S-Blades™ (each with FPGA, CPU and Memory), Disk Enclosures.]
  • 32. [Diagram: query processing on a slice of table MTHLY_RX_TERR_DATA (compressed).] FPGA Core: Uncompress; Project (select DISTRICT, PRODUCTGRP, sum(NRX)); Restrict, Visibility (where MONTH = '20091201' and MARKET = 509123 and SPECIALTY = 'GASTRO'). CPU Core: Complex Σ, Joins, Aggs, etc. (sum(NRX)). Query: select DISTRICT, PRODUCTGRP, sum(NRX) from MTHLY_RX_TERR_DATA where MONTH = '20091201' and MARKET = 509123 and SPECIALTY = 'GASTRO'
  • 33.  Essentially: A big, fast SQL database 10/12/2014 33
  • 34. Commodity CPU, NIC, disk FPGA  Can do basic filtering in hardware, i.e., stream processing before data hits main memory 10/12/2014 34
  • 35.  The four key components that make up TwinFin are SMP hosts, snippet blades (called S-Blades), disk enclosures and a network fabric.  The disk enclosures contain high-density, high-performance disks.  Each disk contains a slice of the data in the database table, along with a mirror of the data on another disk.  The storage arrays are connected to the S-Blades via high-speed interconnects that allow all the disks to simultaneously stream data to the S-Blades at the fastest rate possible.
  • 36.  The SMP hosts are high-performance Linux servers that are set up in an active-passive configuration for high-availability.  The active host presents a standardized interface to external tools and applications, such as BI and ETL tools and load utilities.  It compiles SQL queries into executable code segments called snippets, creates optimized query plans and distributes the snippets to the S-Blades for execution. 10/12/2014 36
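The host's role of sending snippets to the S-Blades and combining their results can be sketched as a scatter-gather loop. This is only an illustration: a snippet is modelled as a plain Python function, the blades run one after another rather than in parallel, and the sample rows are invented. Only the column names DISTRICT and NRX come from the query on slide 32.

```python
from collections import defaultdict


def snippet_partial_sum(rows):
    """Snippet run on one S-Blade: partial sum of NRX per DISTRICT for its own rows."""
    partial = defaultdict(int)
    for row in rows:
        partial[row["DISTRICT"]] += row["NRX"]
    return dict(partial)


def host_run(snippet, blades):
    """Host: hand the same snippet to every blade, then merge the partial results."""
    merged = defaultdict(int)
    for blade_rows in blades:  # sequential here; the appliance runs blades in parallel
        for key, value in snippet(blade_rows).items():
            merged[key] += value
    return dict(merged)


# Three hypothetical S-Blades, each holding its own portion of the table.
blades = [
    [{"DISTRICT": "A", "NRX": 3}, {"DISTRICT": "B", "NRX": 1}],
    [{"DISTRICT": "A", "NRX": 2}],
    [{"DISTRICT": "B", "NRX": 4}, {"DISTRICT": "C", "NRX": 5}],
]

result = host_run(snippet_partial_sum, blades)
```

Each blade only ever touches its own rows, so the host merges a few small partial results instead of the raw data.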
  • 37.  S-Blades are intelligent processing nodes that make up the turbocharged MPP engine of the appliance.  Each S-Blade is an independent server that contains powerful multi-core CPUs, Netezza's unique multi-engine FPGAs and gigabytes of RAM, all balanced and working concurrently to deliver peak performance.  FPGAs are commodity chips that are designed to process data streams at extremely fast rates.
  • 39. [Diagram: S-Blade components. Labels: IBM BladeCenter Server, Intel Quad-Core, DRAM, Netezza DB Accelerator, Dual-Core FPGA, SAS Expander Module (two).]
  • 40. Netezza uses FPGAs for front-line processing: they filter data coming off disk and apply additional logic before passing it to memory on the SPU. Main advantages for data processing:  Parallelism and processing power are shifted away from the CPU.  An FPGA has similar dimensions to a CPU, consumes about one-fifth of the power and runs at about one-fifth of the clock speed.  Unnecessary data is filtered out.  Low latency, high throughput.  More caching capability.
  • 41.  Netezza is the first company to leverage the power of FPGAs to process streaming data in a data warehouse appliance.  In traditional systems, all the data for a query is moved first and the “where” clause is processed afterwards. With Netezza, instead of moving a huge set of data, the FPGA processes the “where” clause as data streams off the disk, so only the data needed for processing is moved to the next step.
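The idea of applying the “where” clause while data streams off the disk can be sketched with Python generators. This is a software analogy, not the FPGA logic: zlib and JSON stand in for the on-disk compression format, which the slides do not describe, and the sample rows are invented. The column names and filter values come from the query on slide 32.

```python
import json
import zlib


def make_block(rows):
    """Compress a list of rows into one 'disk block' (zlib + JSON are stand-ins)."""
    return zlib.compress(json.dumps(rows).encode("utf-8"))


def disk_stream(compressed_blocks):
    """Yield rows one at a time as blocks stream off disk, uncompressing on the fly."""
    for block in compressed_blocks:
        for row in json.loads(zlib.decompress(block)):
            yield row


def fpga_stage(rows, predicate, columns):
    """Restrict (where clause) and project (select list) before rows reach memory."""
    for row in rows:
        if predicate(row):
            yield {c: row[c] for c in columns}


def where(row):
    return (row["MONTH"] == "20091201"
            and row["MARKET"] == 509123
            and row["SPECIALTY"] == "GASTRO")


def row(month, market, specialty, district, productgrp, nrx):
    return {"MONTH": month, "MARKET": market, "SPECIALTY": specialty,
            "DISTRICT": district, "PRODUCTGRP": productgrp, "NRX": nrx}


blocks = [
    make_block([row("20091201", 509123, "GASTRO", "D1", "P1", 10),
                row("20091101", 509123, "GASTRO", "D1", "P1", 99)]),
    make_block([row("20091201", 509123, "GASTRO", "D2", "P1", 5),
                row("20091201", 509123, "CARDIO", "D2", "P2", 7)]),
]

# Only rows that pass the filter, and only the needed columns, reach the "CPU" step.
passed = list(fpga_stage(disk_stream(blocks), where, ["DISTRICT", "PRODUCTGRP", "NRX"]))
total_nrx = sum(r["NRX"] for r in passed)
```

Because every stage is a generator, no stage materialises the full table. Rows that fail the filter are dropped as soon as they are uncompressed.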
  • 42.  As discussed earlier, each disk in the appliance is divided into primary, mirror and temp (swap) partitions.  The primary partition stores user data such as database tables.  The mirror partition stores a copy of the primary partition of another disk so that it can be used if that disk fails.  The temp/swap partition stores data temporarily, for example when the appliance redistributes data while processing queries.
  • 43.  The logical representation of the data saved in the primary partition of each disk is called a data slice.  When users create database tables and load data into them, the rows are distributed across the available data slices.  The logical representation of a data slice is called a data partition.  On TwinFin systems most S-Blades (SPUs) are connected to 8 data partitions, and some to only 6 (since some disks are reserved for failover).  In situations such as an SPU failure, an SPU can have more than 8 data partitions attached to it, because it is assigned some of the data partitions from the failed SPU.
  • 44.  The SPU 1001 is connected to 8 data partitions numbered 0 to 7.  Each data partition is connected to one data slice, and the data slices are stored on different disks.  For example, data partition 0 points to data slice 17, stored on the disk with id 1063.  Disk 1063 also stores the mirror of data slice 18, whose primary copy is stored on disk 1064.  The following diagram illustrates what happens when the disk 1070 fails.
  • 45.  Immediately after disk 1070 stops responding, the system uses disk 1069 to satisfy queries that need data from data slices 23 and 24.  Disk 1069 serves the requests using the data in both its primary and mirror partitions.  In the meantime, the contents of disk 1070 are regenerated on one of the spare disks in the disk array, in this case disk 1100, using the data on disk 1069.  Once the regeneration is complete, SPU data partition 7 is updated to point to data slice 24 on disk 1100.
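The failover and regeneration sequence for disk 1070 can be traced with a small model. The disk ids, slice numbers and data partition 7 come from the slides. The dictionary layout, the assumption that disk 1069 holds slice 23 as primary and slice 24 as mirror, and the function names are illustrative only.

```python
# Each disk holds one primary data slice and the mirror of its partner's slice.
disks = {
    1069: {"primary": 23, "mirror": 24, "online": True},
    1070: {"primary": 24, "mirror": 23, "online": True},
    1100: {"primary": None, "mirror": None, "online": True},  # spare disk
}

# SPU data partition -> (data slice, disk holding the primary copy)
partition_map = {7: (24, 1070)}


def disk_serving(slice_id):
    """Return (disk id, role) of an online disk holding the slice, preferring primary."""
    for role in ("primary", "mirror"):
        for disk_id, disk in disks.items():
            if disk["online"] and disk[role] == slice_id:
                return disk_id, role
    raise RuntimeError(f"no online copy of data slice {slice_id}")


def fail_and_regen(failed_id, spare_id):
    """Mark a disk as failed, record who serves its slices, then regenerate onto a spare."""
    failed = disks[failed_id]
    failed["online"] = False
    # While the spare is being rebuilt, the partner disk serves both slices.
    during_regen = {s: disk_serving(s) for s in (failed["primary"], failed["mirror"])}
    # Regeneration: the spare takes over the failed disk's primary and mirror roles.
    disks[spare_id].update(primary=failed["primary"], mirror=failed["mirror"])
    for partition, (slice_id, disk_id) in list(partition_map.items()):
        if disk_id == failed_id:
            partition_map[partition] = (slice_id, spare_id)
    return during_regen


during_regen = fail_and_regen(failed_id=1070, spare_id=1100)
after_regen = disk_serving(24)
```

During regeneration both slices are read from disk 1069, one from its primary partition and one from its mirror partition. Afterwards data partition 7 points at disk 1100.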
  • 46.  When an SPU fails, the appliance assigns all of its data partitions to other SPUs in the system.  A pair of disks that hold the mirror copies of each other's data slices is assigned together to another SPU, so the target SPU gains two additional data partitions to manage.  For example, if an SPU currently manages data partitions 0 to 7 and the appliance reassigns two data partitions from a failed SPU, the SPU will have 10 data partitions to manage, numbered 0 to 9.
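The reassignment of a failed SPU's data partitions in mirrored pairs can be sketched as follows. SPU 1001 and data slices 17 to 24 come from the slides. The other SPU ids, their slice numbers, the assumption that consecutive slices form a mirrored pair, and the round-robin choice of target are all hypothetical.

```python
def reassign_partitions(spus, failed_id):
    """Move a failed SPU's data slices to the surviving SPUs, two (one mirrored pair) at a time."""
    orphaned = spus.pop(failed_id)
    # Assumption: consecutive slices (17/18, 19/20, ...) live on a mirrored pair of disks.
    pairs = [orphaned[i:i + 2] for i in range(0, len(orphaned), 2)]
    targets = sorted(spus)
    for i, pair in enumerate(pairs):
        spus[targets[i % len(targets)]].extend(pair)  # round-robin over survivors
    return spus


# SPU id -> data slices it manages; the list index is the data partition number.
spus = {
    1001: list(range(17, 25)),  # from the slides: partitions 0 to 7, slices 17 to 24
    1002: list(range(1, 9)),    # hypothetical
    1003: list(range(9, 17)),   # hypothetical
    1004: list(range(25, 33)),  # hypothetical
    1005: list(range(33, 41)),  # hypothetical
}

spus = reassign_partitions(spus, failed_id=1001)
partition_numbers = list(range(len(spus[1002])))
```

With four surviving SPUs and four orphaned pairs, each survivor ends up with 10 data partitions numbered 0 to 9, matching the example on the slide.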
  • 47. Speed • Hardware-based data streaming. Scalability • True MPP offers enterprise scale-out. Simple • Black-box appliance with no tuning or storage administration. Smart • Built-in advanced analytics pushed deep into database. Comparison cells as shown on the slide: NO NO NO NO NO YES NO LIMITED
  • 48. Teradata compared with IBM Netezza
Costs
  Teradata: High initial cost; lots of professional services; lots of administration
  Results in: High cost of ownership
  IBM Netezza: Low initial cost; little administration
  Client advantage: Low total cost of ownership
Smart
  Teradata: Limited analytics pushdown; analytics causes resource contention
  Results in: Poor analytic performance
  IBM Netezza: Minimal contention due to analytics
  Client advantage: More customers benefit from faster analytics
Simplicity
  Teradata: Constant tuning for performance; needs much administration
  Results in: Difficult and slow to provide business value
  IBM Netezza: True appliance; no tuning
  Client advantage: Faster time to value
Speed
  Teradata: Old inefficient legacy code; complex workload partitions
  Results in: Data warehouse performance doesn’t scale consistently
  IBM Netezza: Designed for balance
  Client advantage: Highest / most consistent data warehouse and advanced analytics performance
Architecture
  Teradata: Proprietary interconnect; virtualized MPP nodes (vAMPs); separating compute and storage
  Results in: Unpredictable performance
  IBM Netezza: True MPP; FPGA acceleration
  Client advantage: Best architecture for data warehouse and advanced analytics
  • 49. Oracle Exadata compared with IBM Netezza
Costs
  Oracle Exadata: High initial cost; lots of administration
  Results in: High total cost of ownership
  IBM Netezza: Low initial cost; little administration
  Client advantage: Low total cost of ownership
Smart
  Oracle Exadata: Limited analytics pushdown; inefficiency of Oracle Real Application Clusters (RAC)
  Results in: Poor analytic performance
  IBM Netezza: Extensive analytics pushdown capabilities
  Client advantage: Fast time to insight; more users benefit from faster analytics
Simplicity
  Oracle Exadata: Complexity of Oracle RAC; constant tuning for performance; complex patch process
  Results in: Complex administration
  IBM Netezza: True appliance; no tuning
  Client advantage: Faster time to value
Scalability
  Oracle Exadata: No proof points on scaling; RAC scalability bottleneck
  Results in: Business growth risk
  IBM Netezza: Proven scalability
  Client advantage: Business growth with confidence
Speed
  Oracle Exadata: Designed for OLTP; RAC is inefficient for data warehouse workloads
  Results in: Poor data warehouse performance
  IBM Netezza: Designed for data warehousing
  Client advantage: Highest data warehouse performance
Architecture
  Oracle Exadata: Clustered SMP database layer + shared disk MPP storage layer
  Results in: Compromised performance
  IBM Netezza: True MPP; FPGA acceleration
  Client advantage: Best architecture for data warehousing and advanced analytics