SlideShare a Scribd company logo
1 of 42
built by
QuerySurge™
Automated
Big Data Testing
without Writing Code
Testing of Hadoop and Data Warehouses Visually
Bill Hayduk
CEO/President
RTTS
Jeff Bocarsly, PhD
Chief Architect
QuerySurge /RTTS
PresentationTopics
built by
QuerySurge™
• Testing a Data Warehouse
• Testing Big Data
• Current Data Testing Strategies
• About QuerySurge
• Demo
built by
QuerySurge™
About
FACTS
Founded:
1996
Headquarters:
New York
Customer profile:
• Fortune 1000
• 600+ customers
Strategic Partners:
IBM, Microsoft, HP,
Oracle, Teradata,
HortonWorks, Cloudera,
Amazon Web Services
Software:
QuerySurge
RTTS is the leading provider of software & data quality
for critical business systems
“70% of enterprises have either deployed or are planning to
deploy big data projects and programs this year”
– analyst firm IDG
“46% of companies cite data quality as a barrier for adopting
Business Intelligence products.”
- InformationWeek
“Poor data quality is a primary reason for 40% of all business
initiatives failing to achieve their targeted benefits.”
- analyst firm Gartner
Data Quality Issues
built by
QuerySurge™
Business Intelligence (BI) software
CxOs are using Business Intelligence & Analytics to make critical business decisions
– with the assumption that the underlying data is fine.
“The average organization loses
$14.2 million annually through
poor Data Quality.”
- Gartner
The Executive Office & Critical Data
potential
problem areas
ETL
Data Architecture
Flat
Files
Data Warehouse Testing
built by
Data Warehouse: the Marketplace
“The data warehousing market will see a compound annual growth rate of
11.5% …to reach a total of $13.2 billion in revenue.”
- consulting specialist The 451 Group
Data Warehouse software vendors
- Analyst firm Gartner’s Magic Quadrant for Data Warehouse Database Management Systems
Leaders
Challengers
built by
QuerySurge™
Extract
built by
QuerySurge™
Legacy DB
CRM/ERP
DB
Finance DB
Testing the Data Warehouse: the ETL process
Source
Data
ETL Process Target
Data Warehouse
Transform
Load
Testing the Data Warehouse: Test Entry Points
Recommended functional test strategy: Test every entry point in the
system (feeds, databases, internal messaging, front-end transactions).
The goal: provide rapid localization of data issues between points
test entry point test entry point test entry points
built by
QuerySurge™
Legacy DB
CRM/ERP
DB
Finance DB
ETL ETL
Source Data ETL Process Target DW ETL Process Data Mart
Business
Intelligence
software
Big Data Testing
built by
Big Data Vendors
built by
QuerySurge™
Big Data technology & services market will grow at a 26.4% CAGR to $41.5 billion
through 2018, or about 6x the growth rate of the overall IT market.
- Analyst firm IDC
Basic Hadoop Architecture
MapReduce
(Task Tracker)
HDFS
(Data
Node)
MapReduce – processing part that manages the
programming jobs. (a.k.a. Task Tracker)
HDFS (Hadoop Distributed File System) – stores
data on the machines. (a.k.a. Data Node)
machine
Cluster Add more machines for scaling, from 1 to 100 to 1,000
Task
Tracker
Data
Node
Task
Tracker
Data
Node
Task
Tracker
Data
Node
Task
Tracker
Data
Node
Task
Tracker
Data
Node
Task
Tracker
Data
Node
Task
Tracker
Data
Node
Task
Tracker
Data
Node
Task
Tracker
Data
Node
Task
Tracker
Data
Node
Task
Tracker
Data
Node
Task
Tracker
Data
Node
Name Node Coordination for HDFS. Inserts and extraction are communicated through the Name Node.
accepts jobs, assigns tasks, identifies failed machines
MapReduce
(Task Tracker)
HDFS
(Data
Node)HiveQLHiveQL
HiveQLHiveQL
HiveQL
Hive - a data warehouse infrastructure built on top of Hadoop for
providing data summarization, query, and analysis.
Hive provides a mechanism to query the data using a SQL-like language
called HiveQL that interacts with the HDFS files
• create
• insert
• update
• delete
• select
Hive
2 Use Cases:
Hadoop
Data
Warehouse
NoSQL
Hadoop
Data
Warehouse
Recommended functional test strategy: Test every entry point in the system
(feeds, databases, internal messaging, front-end transactions).
The goal: provide rapid localization of data issues between points
test entry point
built by
Business
Intelligence
software
ETL
Source Data
Source Hadoop ETL Process Target DWH
built by
QuerySurge™
Use Case #1:
Data Warehouse & Hadoop
test entry point test entry points
Use Case #2:
MongoDB, Hadoop, Data Warehouse
Relational DB & Data
WarehousingSource Data
@
BI, Analytics &
ReportingIngestion
built by
™
test entry point
test entry point
test entry point
test entry point test entry point
2 Prevalent DataTesting Strategies
built by
1) Stare & Compare
(also known as sampling)
2) Minus Queries
Strategy #1: Stare & Compare
built by
QuerySurge™
• Review Mapping Document (business rules, data flow mapping, data movement requirements)
• Write Tests in SQL editor
• Execute 2 Tests: 1 at Source & 1 at Target
• Dump results to 2 Excel files
• Compare results by eye (‘Stare & Compare’ or ‘sampling’)
Issue with Stare & Compare:
Impossible to visually compare billions of data sets.
Result: usually less than 1% of data is compared
Example:
Current QuerySurge customer has:
• a single test with 100 million rows & 200 columns
• = 20 billion data sets
• the client has > 7,000 total tests
built by
QuerySurge™
MINUS QUERIES subtract one result set from another result set to show difference
Comment: MINUS QUERIES need to be executed 2x (Source MINUS Target; Target MINUS Source)
Result sets may not be accurate when dealing with duplicate rows of data
No historical data from past testing – audit and regulatory issues
Processing of minus queries puts pressure on the servers
Double execution means 2x testing time and resource utilization
Potential for false positives (bad data could exist on both sides of an ETL leg)
DataTesting Strategy #2: Minus Queries
Minus Query #1: Table_1 MINUS Table_2
Minus Query #2: Table_2 MINUS Table_1
Result Set #1
Result Set #2
ISSUES with MINUS QUERIES
Write 2 MINUS queries
in SQL editor
Execute
MINUS queries 2x
DataTesting Strategies
built by
QuerySurge™
a fundamental issue with both current strategies:
Assumption that all team members understand
and can write SQL or HQL code
About QuerySurge™
built by
What is QuerySurge™?
the collaborative
Big Data Testing solution
that finds bad data &
provides a holistic view
of your data’s health
built by
the QuerySurge advantage
built by
QuerySurge™
Automate the entire testing cycle
 Automate kickoff, tests, comparison, auto-emailed results
Create Tests easily with no programming
 ensures minimal time & effort to create tests / obtain results
Test across different platforms
 data warehouse, Hadoop, NoSQL, database, flat file, XML
Collaborate with team
 Data Health dashboard, shared tests & auto-emailed reports
Verify more data & do it quickly
 verifies up to 100% of all data up to 1,000 x faster
Integrate for Continuous Delivery
 Integrates with most Build, ETL & QA management software
Collaboration
Testers
- functional testing
- regression testing
- result analysis
Developers / DBAs
- unit testing
- result analysis
Data Analysts
- review, analyze data
- verify mapping failures
Operations teams
- monitoring
- result analysis
Managers
- oversight
- result analysis
Share information on the
built by
QuerySurge™
QuerySurge™ Architecture
Web-based…
Installs on...
Linux
Connects to…
…or any other JDBC compliant data source
built by
QuerySurge™
QuerySurge
Controller
QuerySurge
Server
QuerySurge
Agents
Flat Files
SQL
HQL
SQL
HQL
SQL
SQL
 QS pulls data from data sources
 QS pulls data from target data store
 QS compares data quickly
 QS generates reports, audit trails
How QuerySurge Works
Reports, Data Health Dashboard, auto emails
built by
QuerySurge™
Source Data Target Data
Data Stores
• Databases
• Data Warehouses
• Data Marts
Flat Files
• Fixed Width
• Delimited
• Excel
Big Data stores
• Hadoop
• NoSQL
Data
Warehouses
XML
built by
QuerySurge™
all QuerySurge™ Modules
Design Library
SchedulingDeep-Dive Reporting
Run Dashboard
Query Wizards
Data Health Dashboard
Design Library
• Create Query Pairs (source & target SQLs)
• Great for team members skilled with SQL
QuerySurge™ Modules
Scheduling
 Build groups of Query Pairs
 Schedule Test Runs
built by
QuerySurge™
Deep-Dive Reporting
 Examine and automatically
email test results
Run Dashboard
 View real-time execution
 Analyze real-time results
QuerySurge™ Modules
built by
QuerySurge™
QuerySurge Test Management Connectors
built by
QuerySurge™
 Drive QuerySurge execution from your Test Management Solution
 Outcome results (Pass/Fail/etc.) are returned from QuerySurge to your Test Management Solution
 Results are linked in your Test Management Solution so that you can click directly into detailed QuerySurge
results
• HP ALM (Quality Center)
• Microsoft Team Foundation Server
• IBM Rational Quality Manager
Integration with leading
Test Management Solutions
QuerySurge & DevOps: Continuous Delivery & Integration
built by
QuerySurge™
Automated
Testing
Automated
Reporting
Automated
Launch
Data Integration/ETL
solutions
QuerySurge™
and many others…
email
report
Test Management
solutions
QuerySurge™
email
report
and many others…
QuerySurge™
Automated Build
solutions
email
report
built by
Introducing the new
We just made data testing
REALLY EASY!
No programming needed
Testing Big Data Visually
built by
From a recent poll1 of:
• Big Data Experts
• Data Warehouse Architects
• Solution Architects
• ETL Architects
Recent Survey: Data Experts
Consensus Answer:
80% of data columns have no transformation at all
Our Question: What % of columns in your projects have no
transformations at all?
1Poll conducted by RTTS on targeted LinkedIn groups
Why is this important?
Fast and Easy.
No programming needed.
built by
QuerySurge™
QuerySurge™ Modules
Compare by Table, Column & Row
• Perform 80% of all data tests
•Automatically generates SQL & HQL code
• Opens up testing to novice & non-
technical team members
• Speeds up testing for skilled SQL coders
• provides a huge Return-On-Investment
built by
QuerySurge™
QuerySurge™ Modules
3 Types of Data Comparison Wizards:
The also provide you with automated features for:
o filtering (‘Where’ clause) and
o sorting (‘Order By’ clause)
Column-Level Comparison:
This is great for Big Data stores and Data Warehouses
Table-Level Comparison:
This comparator is great for Data Migrations and Database Upgrades.
Row Count Comparison:
Great for all - Big Data stores, Data Warehouses, Data Migrations and Database Upgrades.
Uses:
Tests the columns that have no
transformations, which means it tests
approximately 80% of your data store without
you writing any SQL code
Tests:
Big Data, Data Warehouses
Value added:
novice or non-technical: no coding needed,
productive immediately
experienced user: saves time
built by
QuerySurge™
built by
QuerySurge™
(we picked Column-Level Comparison)
Uses:
Verifies data loads when no
transformation occurs
Tests:
data migrations, upgrades
Value added:
novice or non-technical:
no coding needed
experienced user:
saves time
built by
QuerySurge™
Use:
Verify that the amount of rows from the
source match the amount from the target
Tests:
Big data, data warehouse, data
migration, database upgrades, data
interfaces
Value added:
novice: no coding needed
experienced user: saves time
built by
QuerySurge™
_________
Total
10/15/2015 40
built by
QuerySurge™
Training Courses
Data Warehouse Testing
• Data Warehouse & ETL Testing Fundamentals (1 day)
• Fundamentals of QuerySurge (1 day)
• Introduction to SQL for QuerySurge (1 day)
• Advanced SQL techniques for QuerySurge (1 day)
Big Data Testing
• Big Data And ETL Testing Fundamentals
• Introduction To Big Data Testing Using Hive And HQL
Consulting
RTTS, the software quality experts (and developer of QuerySurge), provides consulting
solutions to the challenges of Big Data & Data Warehouse / ETL Testing
• Jumpstart 2-week program – combines training courses, mentoring, consulting
• Staff Augmentation – add additional RTTS resources to your team
• Outsourcing - RTTS can perform all testing, including planning, design, execution
(1) Trial in the Cloud of QuerySurgeTM, including self-learning
tutorial that works with sample data for 3 days
(2) Downloaded Trial of QuerySurgeTM, including self-learning
tutorial with sample data or your data for 15 days
(3) Proof of Concept of QuerySurgeTM includes our team of experts
assisting you for 30 days
for more information on (1), (2) and (3),
Go to http://www.querysurge.com/compare-trial-options
TRIAL
IN THE CLOUD
built by
QuerySurge™
Free TrialsQuerySurge™
Proof
of
Concept
built by
QuerySurge™
QuerySurge Demo

More Related Content

What's hot

Oracle Management Cloud
Oracle Management CloudOracle Management Cloud
Oracle Management CloudFabio Batista
 
Introduction to Spark with Python
Introduction to Spark with PythonIntroduction to Spark with Python
Introduction to Spark with PythonGokhan Atil
 
Azure Data Factory Data Flow
Azure Data Factory Data FlowAzure Data Factory Data Flow
Azure Data Factory Data FlowMark Kromer
 
Growing the Delta Ecosystem to Rust and Python with Delta-RS
Growing the Delta Ecosystem to Rust and Python with Delta-RSGrowing the Delta Ecosystem to Rust and Python with Delta-RS
Growing the Delta Ecosystem to Rust and Python with Delta-RSDatabricks
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceDatabricks
 
Introduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureIntroduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureDatabricks
 
Azure data factory
Azure data factoryAzure data factory
Azure data factoryDavid Giard
 
Testing Big Data: Automated Testing of Hadoop with QuerySurge
Testing Big Data: Automated  Testing of Hadoop with QuerySurgeTesting Big Data: Automated  Testing of Hadoop with QuerySurge
Testing Big Data: Automated Testing of Hadoop with QuerySurgeRTTS
 
DevOps for Databricks
DevOps for DatabricksDevOps for Databricks
DevOps for DatabricksDatabricks
 
Using Redash for SQL Analytics on Databricks
Using Redash for SQL Analytics on DatabricksUsing Redash for SQL Analytics on Databricks
Using Redash for SQL Analytics on DatabricksDatabricks
 
Data visualization with sql analytics
Data visualization with sql analyticsData visualization with sql analytics
Data visualization with sql analyticsDatabricks
 
Change data capture with MongoDB and Kafka.
Change data capture with MongoDB and Kafka.Change data capture with MongoDB and Kafka.
Change data capture with MongoDB and Kafka.Dan Harvey
 
Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)James Serra
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 
Managing 2000 Node Cluster with Ambari
Managing 2000 Node Cluster with AmbariManaging 2000 Node Cluster with Ambari
Managing 2000 Node Cluster with AmbariDataWorks Summit
 
Making Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeMaking Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeDatabricks
 
Real-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiReal-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiManish Gupta
 
Moving to Databricks & Delta
Moving to Databricks & DeltaMoving to Databricks & Delta
Moving to Databricks & DeltaDatabricks
 
The Roadmap for SQL Server 2019
The Roadmap for SQL Server 2019The Roadmap for SQL Server 2019
The Roadmap for SQL Server 2019Amit Banerjee
 
Building Lakehouses on Delta Lake with SQL Analytics Primer
Building Lakehouses on Delta Lake with SQL Analytics PrimerBuilding Lakehouses on Delta Lake with SQL Analytics Primer
Building Lakehouses on Delta Lake with SQL Analytics PrimerDatabricks
 

What's hot (20)

Oracle Management Cloud
Oracle Management CloudOracle Management Cloud
Oracle Management Cloud
 
Introduction to Spark with Python
Introduction to Spark with PythonIntroduction to Spark with Python
Introduction to Spark with Python
 
Azure Data Factory Data Flow
Azure Data Factory Data FlowAzure Data Factory Data Flow
Azure Data Factory Data Flow
 
Growing the Delta Ecosystem to Rust and Python with Delta-RS
Growing the Delta Ecosystem to Rust and Python with Delta-RSGrowing the Delta Ecosystem to Rust and Python with Delta-RS
Growing the Delta Ecosystem to Rust and Python with Delta-RS
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Introduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureIntroduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse Architecture
 
Azure data factory
Azure data factoryAzure data factory
Azure data factory
 
Testing Big Data: Automated Testing of Hadoop with QuerySurge
Testing Big Data: Automated  Testing of Hadoop with QuerySurgeTesting Big Data: Automated  Testing of Hadoop with QuerySurge
Testing Big Data: Automated Testing of Hadoop with QuerySurge
 
DevOps for Databricks
DevOps for DatabricksDevOps for Databricks
DevOps for Databricks
 
Using Redash for SQL Analytics on Databricks
Using Redash for SQL Analytics on DatabricksUsing Redash for SQL Analytics on Databricks
Using Redash for SQL Analytics on Databricks
 
Data visualization with sql analytics
Data visualization with sql analyticsData visualization with sql analytics
Data visualization with sql analytics
 
Change data capture with MongoDB and Kafka.
Change data capture with MongoDB and Kafka.Change data capture with MongoDB and Kafka.
Change data capture with MongoDB and Kafka.
 
Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
Managing 2000 Node Cluster with Ambari
Managing 2000 Node Cluster with AmbariManaging 2000 Node Cluster with Ambari
Managing 2000 Node Cluster with Ambari
 
Making Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeMaking Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta Lake
 
Real-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiReal-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFi
 
Moving to Databricks & Delta
Moving to Databricks & DeltaMoving to Databricks & Delta
Moving to Databricks & Delta
 
The Roadmap for SQL Server 2019
The Roadmap for SQL Server 2019The Roadmap for SQL Server 2019
The Roadmap for SQL Server 2019
 
Building Lakehouses on Delta Lake with SQL Analytics Primer
Building Lakehouses on Delta Lake with SQL Analytics PrimerBuilding Lakehouses on Delta Lake with SQL Analytics Primer
Building Lakehouses on Delta Lake with SQL Analytics Primer
 

Viewers also liked

Applying Testing Techniques for Big Data and Hadoop
Applying Testing Techniques for Big Data and HadoopApplying Testing Techniques for Big Data and Hadoop
Applying Testing Techniques for Big Data and HadoopMark Johnson
 
Big Data, Big Trouble: Getting into the Flow of Hadoop Testing
Big Data, Big Trouble: Getting into the Flow of Hadoop TestingBig Data, Big Trouble: Getting into the Flow of Hadoop Testing
Big Data, Big Trouble: Getting into the Flow of Hadoop TestingTechWell
 
Hadoop: Big Data Stacks validation w/ iTest How to tame the elephant?
Hadoop:  Big Data Stacks validation w/ iTest  How to tame the elephant?Hadoop:  Big Data Stacks validation w/ iTest  How to tame the elephant?
Hadoop: Big Data Stacks validation w/ iTest How to tame the elephant?Dmitri Shiryaev
 
2014 International Software Testing Conference in Seoul
2014 International Software Testing Conference in Seoul2014 International Software Testing Conference in Seoul
2014 International Software Testing Conference in SeoulJongwook Woo
 
Big data testing (1)
Big data testing (1)Big data testing (1)
Big data testing (1)vodqancr
 
Big Data Analytics with Hadoop
Big Data Analytics with HadoopBig Data Analytics with Hadoop
Big Data Analytics with HadoopPhilippe Julio
 

Viewers also liked (7)

Applying Testing Techniques for Big Data and Hadoop
Applying Testing Techniques for Big Data and HadoopApplying Testing Techniques for Big Data and Hadoop
Applying Testing Techniques for Big Data and Hadoop
 
Big Data, Big Trouble: Getting into the Flow of Hadoop Testing
Big Data, Big Trouble: Getting into the Flow of Hadoop TestingBig Data, Big Trouble: Getting into the Flow of Hadoop Testing
Big Data, Big Trouble: Getting into the Flow of Hadoop Testing
 
Hadoop: Big Data Stacks validation w/ iTest How to tame the elephant?
Hadoop:  Big Data Stacks validation w/ iTest  How to tame the elephant?Hadoop:  Big Data Stacks validation w/ iTest  How to tame the elephant?
Hadoop: Big Data Stacks validation w/ iTest How to tame the elephant?
 
2014 International Software Testing Conference in Seoul
2014 International Software Testing Conference in Seoul2014 International Software Testing Conference in Seoul
2014 International Software Testing Conference in Seoul
 
Big data testing (1)
Big data testing (1)Big data testing (1)
Big data testing (1)
 
How to be an awesome test automation professional
How to be an awesome test automation professionalHow to be an awesome test automation professional
How to be an awesome test automation professional
 
Big Data Analytics with Hadoop
Big Data Analytics with HadoopBig Data Analytics with Hadoop
Big Data Analytics with Hadoop
 

Similar to Big Data Testing : Automate theTesting of Hadoop, NoSQL & DWH without Writing Code

Query Wizards - data testing made easy - no programming
Query Wizards - data testing made easy - no programmingQuery Wizards - data testing made easy - no programming
Query Wizards - data testing made easy - no programmingRTTS
 
Big Data Testing: Ensuring MongoDB Data Quality
Big Data Testing: Ensuring MongoDB Data QualityBig Data Testing: Ensuring MongoDB Data Quality
Big Data Testing: Ensuring MongoDB Data QualityRTTS
 
Data Warehouse Testing in the Pharmaceutical Industry
Data Warehouse Testing in the Pharmaceutical IndustryData Warehouse Testing in the Pharmaceutical Industry
Data Warehouse Testing in the Pharmaceutical IndustryRTTS
 
How to Automate your Enterprise Application / ERP Testing
How to Automate your  Enterprise Application / ERP TestingHow to Automate your  Enterprise Application / ERP Testing
How to Automate your Enterprise Application / ERP TestingRTTS
 
Leveraging HPE ALM & QuerySurge to test HPE Vertica
Leveraging HPE ALM & QuerySurge to test HPE VerticaLeveraging HPE ALM & QuerySurge to test HPE Vertica
Leveraging HPE ALM & QuerySurge to test HPE VerticaRTTS
 
Improve the Health of Your Data
Improve the Health of Your DataImprove the Health of Your Data
Improve the Health of Your DataRTTS
 
QuerySurge AI webinar
QuerySurge AI webinarQuerySurge AI webinar
QuerySurge AI webinarRTTS
 
QuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing WebinarQuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing WebinarRTTS
 
Automated Testing of Microsoft Power BI Reports
Automated Testing of Microsoft Power BI ReportsAutomated Testing of Microsoft Power BI Reports
Automated Testing of Microsoft Power BI ReportsRTTS
 
Testing Big Data: Automated ETL Testing of Hadoop
Testing Big Data: Automated ETL Testing of HadoopTesting Big Data: Automated ETL Testing of Hadoop
Testing Big Data: Automated ETL Testing of HadoopRTTS
 
Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Req...
Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Req...Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Req...
Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Req...RTTS
 
Webinar - QuerySurge and Azure DevOps in the Azure Cloud
 Webinar - QuerySurge and Azure DevOps in the Azure Cloud Webinar - QuerySurge and Azure DevOps in the Azure Cloud
Webinar - QuerySurge and Azure DevOps in the Azure CloudRTTS
 
TestGuild and QuerySurge Presentation -DevOps for Data Testing
TestGuild and QuerySurge Presentation -DevOps for Data TestingTestGuild and QuerySurge Presentation -DevOps for Data Testing
TestGuild and QuerySurge Presentation -DevOps for Data TestingRTTS
 
Empowering Customers with Personalized Insights
Empowering Customers with Personalized InsightsEmpowering Customers with Personalized Insights
Empowering Customers with Personalized InsightsCloudera, Inc.
 
DataOps , cbuswaw April '23
DataOps , cbuswaw April '23DataOps , cbuswaw April '23
DataOps , cbuswaw April '23Jason Packer
 
AnalysisServices
AnalysisServicesAnalysisServices
AnalysisServiceswebuploader
 
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...Dataconomy Media
 
Analyti x mapping manager product overview presentation
Analyti x mapping manager product overview presentationAnalyti x mapping manager product overview presentation
Analyti x mapping manager product overview presentationAnalytixDataServices
 

Similar to Big Data Testing : Automate theTesting of Hadoop, NoSQL & DWH without Writing Code (20)

Query Wizards - data testing made easy - no programming
Query Wizards - data testing made easy - no programmingQuery Wizards - data testing made easy - no programming
Query Wizards - data testing made easy - no programming
 
Big Data Testing: Ensuring MongoDB Data Quality
Big Data Testing: Ensuring MongoDB Data QualityBig Data Testing: Ensuring MongoDB Data Quality
Big Data Testing: Ensuring MongoDB Data Quality
 
Data Warehouse Testing in the Pharmaceutical Industry
Data Warehouse Testing in the Pharmaceutical IndustryData Warehouse Testing in the Pharmaceutical Industry
Data Warehouse Testing in the Pharmaceutical Industry
 
How to Automate your Enterprise Application / ERP Testing
How to Automate your  Enterprise Application / ERP TestingHow to Automate your  Enterprise Application / ERP Testing
How to Automate your Enterprise Application / ERP Testing
 
Leveraging HPE ALM & QuerySurge to test HPE Vertica
Leveraging HPE ALM & QuerySurge to test HPE VerticaLeveraging HPE ALM & QuerySurge to test HPE Vertica
Leveraging HPE ALM & QuerySurge to test HPE Vertica
 
Improve the Health of Your Data
Improve the Health of Your DataImprove the Health of Your Data
Improve the Health of Your Data
 
QuerySurge AI webinar
QuerySurge AI webinarQuerySurge AI webinar
QuerySurge AI webinar
 
QuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing WebinarQuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing Webinar
 
Automated Testing of Microsoft Power BI Reports
Automated Testing of Microsoft Power BI ReportsAutomated Testing of Microsoft Power BI Reports
Automated Testing of Microsoft Power BI Reports
 
Test Automation for Data Warehouses
Test Automation for Data Warehouses Test Automation for Data Warehouses
Test Automation for Data Warehouses
 
Testing Big Data: Automated ETL Testing of Hadoop
Testing Big Data: Automated ETL Testing of HadoopTesting Big Data: Automated ETL Testing of Hadoop
Testing Big Data: Automated ETL Testing of Hadoop
 
Taming the shrew Power BI
Taming the shrew Power BITaming the shrew Power BI
Taming the shrew Power BI
 
Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Req...
Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Req...Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Req...
Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Req...
 
Webinar - QuerySurge and Azure DevOps in the Azure Cloud
 Webinar - QuerySurge and Azure DevOps in the Azure Cloud Webinar - QuerySurge and Azure DevOps in the Azure Cloud
Webinar - QuerySurge and Azure DevOps in the Azure Cloud
 
TestGuild and QuerySurge Presentation -DevOps for Data Testing
TestGuild and QuerySurge Presentation -DevOps for Data TestingTestGuild and QuerySurge Presentation -DevOps for Data Testing
TestGuild and QuerySurge Presentation -DevOps for Data Testing
 
Empowering Customers with Personalized Insights
Empowering Customers with Personalized InsightsEmpowering Customers with Personalized Insights
Empowering Customers with Personalized Insights
 
DataOps , cbuswaw April '23
DataOps , cbuswaw April '23DataOps , cbuswaw April '23
DataOps , cbuswaw April '23
 
AnalysisServices
AnalysisServicesAnalysisServices
AnalysisServices
 
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...
 
Analyti x mapping manager product overview presentation
Analyti x mapping manager product overview presentationAnalyti x mapping manager product overview presentation
Analyti x mapping manager product overview presentation
 

More from RTTS

State of the Market - Data Quality in 2023
State of the Market - Data Quality in 2023State of the Market - Data Quality in 2023
State of the Market - Data Quality in 2023RTTS
 
Creating a Project Plan for a Data Warehouse Testing Assignment
Creating a Project Plan for a Data Warehouse Testing AssignmentCreating a Project Plan for a Data Warehouse Testing Assignment
Creating a Project Plan for a Data Warehouse Testing AssignmentRTTS
 
RTTS Postman and API Testing Webinar Slides.pdf
RTTS Postman and API Testing Webinar  Slides.pdfRTTS Postman and API Testing Webinar  Slides.pdf
RTTS Postman and API Testing Webinar Slides.pdfRTTS
 
Creating a Data validation and Testing Strategy
Creating a Data validation and Testing StrategyCreating a Data validation and Testing Strategy
Creating a Data validation and Testing StrategyRTTS
 
Implementing Azure DevOps with your Testing Project
Implementing Azure DevOps with your Testing ProjectImplementing Azure DevOps with your Testing Project
Implementing Azure DevOps with your Testing ProjectRTTS
 
Completing the Data Equation: Test Data + Data Validation = Success
Completing the Data Equation: Test Data + Data Validation = SuccessCompleting the Data Equation: Test Data + Data Validation = Success
Completing the Data Equation: Test Data + Data Validation = SuccessRTTS
 
the Data World Distilled
the Data World Distilledthe Data World Distilled
the Data World DistilledRTTS
 
QuerySurge for DevOps
QuerySurge for DevOpsQuerySurge for DevOps
QuerySurge for DevOpsRTTS
 
Whitepaper: Volume Testing Thick Clients and Databases
Whitepaper:  Volume Testing Thick Clients and DatabasesWhitepaper:  Volume Testing Thick Clients and Databases
Whitepaper: Volume Testing Thick Clients and DatabasesRTTS
 
Case study: Open Source Automation Framework using Selenium WebDriver
Case study: Open Source Automation Framework using Selenium WebDriverCase study: Open Source Automation Framework using Selenium WebDriver
Case study: Open Source Automation Framework using Selenium WebDriverRTTS
 
Enterprise Business Intelligence & Data Warehousing: The Data Quality Conundrum
Enterprise Business Intelligence & Data Warehousing: The Data Quality ConundrumEnterprise Business Intelligence & Data Warehousing: The Data Quality Conundrum
Enterprise Business Intelligence & Data Warehousing: The Data Quality ConundrumRTTS
 
RTTS - the Software Quality Experts
RTTS - the Software Quality ExpertsRTTS - the Software Quality Experts
RTTS - the Software Quality ExpertsRTTS
 
What is a Data Warehouse and How Do I Test It?
What is a Data Warehouse and How Do I Test It?What is a Data Warehouse and How Do I Test It?
What is a Data Warehouse and How Do I Test It?RTTS
 

More from RTTS (13)

State of the Market - Data Quality in 2023
State of the Market - Data Quality in 2023State of the Market - Data Quality in 2023
State of the Market - Data Quality in 2023
 
Creating a Project Plan for a Data Warehouse Testing Assignment
Creating a Project Plan for a Data Warehouse Testing AssignmentCreating a Project Plan for a Data Warehouse Testing Assignment
Creating a Project Plan for a Data Warehouse Testing Assignment
 
RTTS Postman and API Testing Webinar Slides.pdf
RTTS Postman and API Testing Webinar  Slides.pdfRTTS Postman and API Testing Webinar  Slides.pdf
RTTS Postman and API Testing Webinar Slides.pdf
 
Creating a Data validation and Testing Strategy
Creating a Data validation and Testing StrategyCreating a Data validation and Testing Strategy
Creating a Data validation and Testing Strategy
 
Implementing Azure DevOps with your Testing Project
Implementing Azure DevOps with your Testing ProjectImplementing Azure DevOps with your Testing Project
Implementing Azure DevOps with your Testing Project
 
Completing the Data Equation: Test Data + Data Validation = Success
Completing the Data Equation: Test Data + Data Validation = SuccessCompleting the Data Equation: Test Data + Data Validation = Success
Completing the Data Equation: Test Data + Data Validation = Success
 
the Data World Distilled
the Data World Distilledthe Data World Distilled
the Data World Distilled
 
QuerySurge for DevOps
QuerySurge for DevOpsQuerySurge for DevOps
QuerySurge for DevOps
 
Whitepaper: Volume Testing Thick Clients and Databases
Whitepaper:  Volume Testing Thick Clients and DatabasesWhitepaper:  Volume Testing Thick Clients and Databases
Whitepaper: Volume Testing Thick Clients and Databases
 
Case study: Open Source Automation Framework using Selenium WebDriver
Case study: Open Source Automation Framework using Selenium WebDriverCase study: Open Source Automation Framework using Selenium WebDriver
Case study: Open Source Automation Framework using Selenium WebDriver
 
Enterprise Business Intelligence & Data Warehousing: The Data Quality Conundrum
Enterprise Business Intelligence & Data Warehousing: The Data Quality ConundrumEnterprise Business Intelligence & Data Warehousing: The Data Quality Conundrum
Enterprise Business Intelligence & Data Warehousing: The Data Quality Conundrum
 
RTTS - the Software Quality Experts
RTTS - the Software Quality ExpertsRTTS - the Software Quality Experts
RTTS - the Software Quality Experts
 
What is a Data Warehouse and How Do I Test It?
What is a Data Warehouse and How Do I Test It?What is a Data Warehouse and How Do I Test It?
What is a Data Warehouse and How Do I Test It?
 

Recently uploaded

CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceanilsa9823
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...Health
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsAndolasoft Inc
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 

Recently uploaded (20)

CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 

Big Data Testing : Automate theTesting of Hadoop, NoSQL & DWH without Writing Code

  • 1. built by QuerySurge™ Automated Big Data Testing without Writing Code Testing of Hadoop and Data Warehouses Visually Bill Hayduk CEO/President RTTS Jeff Bocarsly, PhD Chief Architect QuerySurge /RTTS
  • 2. PresentationTopics built by QuerySurge™ • Testing a Data Warehouse • Testing Big Data • Current Data Testing Strategies • About QuerySurge • Demo
  • 3. built by QuerySurge™ About FACTS Founded: 1996 Headquarters: New York Customer profile: • Fortune 1000 • 600+ customers Strategic Partners: IBM, Microsoft, HP, Oracle, Teradata, HortonWorks, Cloudera, Amazon Web Services Software: QuerySurge RTTS is the leading provider of software & data quality for critical business systems
  • 4. “70% of enterprises have either deployed or are planning to deploy big data projects and programs this year” – analyst firm IDG “46% of companies cite data quality as a barrier for adopting Business Intelligence products.” - InformationWeek “Poor data quality is a primary reason for 40% of all business initiatives failing to achieve their targeted benefits.” - analyst firm Gartner Data Quality Issues built by QuerySurge™
  • 5. Business Intelligence (BI) software CxOs are using Business Intelligence & Analytics to make critical business decisions – with the assumption that the underlying data is fine. “The average organization loses $14.2 million annually through poor Data Quality.” - Gartner The Executive Office & Critical Data potential problem areas ETL Data Architecture Flat Files
  • 7. Data Warehouse: the Marketplace “The data warehousing market will see a compound annual growth rate of 11.5% …to reach a total of $13.2 billion in revenue.” - consulting specialist The 451 Group Data Warehouse software vendors - Analyst firm Gartner’s Magic Quadrant for Data Warehouse Database Management Systems Leaders Challengers built by QuerySurge™
  • 8. Extract built by QuerySurge™ Legacy DB CRM/ERP DB Finance DB Testing the Data Warehouse: the ETL process Source Data ETL Process Target Data Warehouse Transform Load
  • 9. Testing the Data Warehouse: Test Entry Points Recommended functional test strategy: Test every entry point in the system (feeds, databases, internal messaging, front-end transactions). The goal: provide rapid localization of data issues between points test entry point test entry point test entry points built by QuerySurge™ Legacy DB CRM/ERP DB Finance DB ETL ETL Source Data ETL Process Target DW ETL Process Data Mart Business Intelligence software
  • 11. Big Data Vendors built by QuerySurge™ Big Data technology & services market will grow at a 26.4% CAGR to $41.5 billion through 2018, or about 6x the growth rate of the overall IT market. - Analyst firm IDC
  • 12. Basic Hadoop Architecture MapReduce (Task Tracker) HDFS (Data Node) MapReduce – processing part that manages the programming jobs. (a.k.a. Task Tracker) HDFS (Hadoop Distributed File System) – stores data on the machines. (a.k.a. Data Node) machine Cluster Add more machines for scaling, from 1 to 100 to 1,000 Task Tracker Data Node Task Tracker Data Node Task Tracker Data Node Task Tracker Data Node Task Tracker Data Node Task Tracker Data Node Task Tracker Data Node Task Tracker Data Node Task Tracker Data Node Task Tracker Data Node Task Tracker Data Node Task Tracker Data Node Name Node Coordination for HDFS. Inserts and extraction are communicated through the Name Node. accepts jobs, assigns tasks, identifies failed machines
  • 13. MapReduce (Task Tracker) HDFS (Data Node)HiveQLHiveQL HiveQLHiveQL HiveQL Hive - a data warehouse infrastructure built on top of Hadoop for providing data summarization, query, and analysis. Hive provides a mechanism to query the data using a SQL-like language called HiveQL that interacts with the HDFS files • create • insert • update • delete • select Hive
  • 15. Recommended functional test strategy: Test every entry point in the system (feeds, databases, internal messaging, front-end transactions). The goal: provide rapid localization of data issues between points test entry point built by Business Intelligence software ETL Source Data Source Hadoop ETL Process Target DWH built by QuerySurge™ Use Case #1: Data Warehouse & Hadoop test entry point test entry points
  • 16. Use Case #2: MongoDB, Hadoop, Data Warehouse Relational DB & Data WarehousingSource Data @ BI, Analytics & ReportingIngestion built by ™ test entry point test entry point test entry point test entry point test entry point
  • 17. 2 Prevalent DataTesting Strategies built by 1) Stare & Compare (also known as sampling) 2) Minus Queries
  • 18. Strategy #1: Stare & Compare built by QuerySurge™ • Review Mapping Document (business rules, data flow mapping, data movement requirements) • Write Tests in SQL editor • Execute 2 Tests: 1 at Source & 1 at Target • Dump results to 2 Excel files • Compare results by eye (‘Stare & Compare’ or ‘sampling’) Issue with Stare & Compare: Impossible to visually compare billions of data sets. Result: usually less than 1% of data is compared Example: Current QuerySurge customer has: • a single test with 100 million rows & 200 columns • = 20 billion data sets • the client has > 7,000 total tests
  • 19. built by QuerySurge™ MINUS QUERIES subtract one result set from another result set to show difference Comment: MINUS QUERIES need to be executed 2x (Source MINUS Target; Target MINUS Source) Result sets may not be accurate when dealing with duplicate rows of data No historical data from past testing – audit and regulatory issues Processing of minus queries puts pressure on the servers Double execution means 2x testing time and resource utilization Potential for false positives (bad data could exist on both sides of an ETL leg) DataTesting Strategy #2: Minus Queries Minus Query #1: Table_1 MINUS Table_2 Minus Query #2: Table_2 MINUS Table_1 Result Set #1 Result Set #2 ISSUES with MINUS QUERIES Write 2 MINUS queries in SQL editor Execute MINUS queries 2x
  • 20. DataTesting Strategies built by QuerySurge™ a fundamental issue with both current strategies: Assumption that all team members understand and can write SQL or HQL code
  • 22. What is QuerySurge™? the collaborative Big Data Testing solution that finds bad data & provides a holistic view of your data’s health built by
  • 23. the QuerySurge advantage built by QuerySurge™ Automate the entire testing cycle  Automate kickoff, tests, comparison, auto-emailed results Create Tests easily with no programming  ensures minimal time & effort to create tests / obtain results Test across different platforms  data warehouse, Hadoop, NoSQL, database, flat file, XML Collaborate with team  Data Health dashboard, shared tests & auto-emailed reports Verify more data & do it quickly  verifies up to 100% of all data up to 1,000 x faster Integrate for Continuous Delivery  Integrates with most Build, ETL & QA management software
  • 24. Collaboration Testers - functional testing - regression testing - result analysis Developers / DBAs - unit testing - result analysis Data Analysts - review, analyze data - verify mapping failures Operations teams - monitoring - result analysis Managers - oversight - result analysis Share information on the built by QuerySurge™
  • 25. QuerySurge™ Architecture Web-based… Installs on... Linux Connects to… …or any other JDBC compliant data source built by QuerySurge™ QuerySurge Controller QuerySurge Server QuerySurge Agents Flat Files
  • 26. SQL HQL SQL HQL SQL SQL  QS pulls data from data sources  QS pulls data from target data store  QS compares data quickly  QS generates reports, audit trails How QuerySurge Works Reports, Data Health Dashboard, auto emails built by QuerySurge™ Source Data Target Data Data Stores • Databases • Data Warehouses • Data Marts Flat Files • Fixed Width • Delimited • Excel Big Data stores • Hadoop • NoSQL Data Warehouses XML
  • 27. built by QuerySurge™ all QuerySurge™ Modules Design Library SchedulingDeep-Dive Reporting Run Dashboard Query Wizards Data Health Dashboard
  • 28. Design Library • Create Query Pairs (source & target SQLs) • Great for team members skilled with SQL QuerySurge™ Modules Scheduling  Build groups of Query Pairs  Schedule Test Runs built by QuerySurge™
  • 29. Deep-Dive Reporting  Examine and automatically email test results Run Dashboard  View real-time execution  Analyze real-time results QuerySurge™ Modules built by QuerySurge™
  • 30. QuerySurge Test Management Connectors built by QuerySurge™  Drive QuerySurge execution from your Test Management Solution  Outcome results (Pass/Fail/etc.) are returned from QuerySurge to your Test Management Solution  Results are linked in your Test Management Solution so that you can click directly into detailed QuerySurge results • HP ALM (Quality Center) • Microsoft Team Foundation Server • IBM Rational Quality Manager Integration with leading Test Management Solutions
  • 31. QuerySurge & DevOps: Continuous Delivery & Integration built by QuerySurge™ Automated Testing Automated Reporting Automated Launch Data Integration/ETL solutions QuerySurge™ and many others… email report Test Management solutions QuerySurge™ email report and many others… QuerySurge™ Automated Build solutions email report
  • 32. built by Introducing the new We just made data testing REALLY EASY! No programming needed Testing Big Data Visually
  • 33. built by From a recent poll1 of: • Big Data Experts • Data Warehouse Architects • Solution Architects • ETL Architects Recent Survey: Data Experts Consensus Answer: 80% of data columns have no transformation at all Our Question: What % of columns in your projects have no transformations at all? 1Poll conducted by RTTS on targeted LinkedIn groups Why is this important?
  • 34. Fast and Easy. No programming needed. built by QuerySurge™ QuerySurge™ Modules Compare by Table, Column & Row • Perform 80% of all data tests •Automatically generates SQL & HQL code • Opens up testing to novice & non- technical team members • Speeds up testing for skilled SQL coders • provides a huge Return-On-Investment
  • 35. built by QuerySurge™ QuerySurge™ Modules 3 Types of Data Comparison Wizards: The also provide you with automated features for: o filtering (‘Where’ clause) and o sorting (‘Order By’ clause) Column-Level Comparison: This is great for Big Data stores and Data Warehouses Table-Level Comparison: This comparator is great for Data Migrations and Database Upgrades. Row Count Comparison: Great for all - Big Data stores, Data Warehouses, Data Migrations and Database Upgrades.
  • 36. Uses: Tests the columns that have no transformations, which means it tests approximately 80% of your data store without you writing any SQL code Tests: Big Data, Data Warehouses Value added: novice or non-technical: no coding needed, productive immediately experienced user: saves time built by QuerySurge™
  • 37. built by QuerySurge™ (we picked Column-Level Comparison)
  • 38. Uses: Verifies data loads when no transformation occurs Tests: data migrations, upgrades Value added: novice or non-technical: no coding needed experienced user: saves time built by QuerySurge™
  • 39. Use: Verify that the amount of rows from the source match the amount from the target Tests: Big data, data warehouse, data migration, database upgrades, data interfaces Value added: novice: no coding needed experienced user: saves time built by QuerySurge™ _________ Total
  • 40. 10/15/2015 40 built by QuerySurge™ Training Courses Data Warehouse Testing • Data Warehouse & ETL Testing Fundamentals (1 day) • Fundamentals of QuerySurge (1 day) • Introduction to SQL for QuerySurge (1 day) • Advanced SQL techniques for QuerySurge (1 day) Big Data Testing • Big Data And ETL Testing Fundamentals • Introduction To Big Data Testing Using Hive And HQL Consulting RTTS, the software quality experts (and developer of QuerySurge), provides consulting solutions to the challenges of Big Data & Data Warehouse / ETL Testing • Jumpstart 2-week program – combines training courses, mentoring, consulting • Staff Augmentation – add additional RTTS resources to your team • Outsourcing - RTTS can perform all testing, including planning, design, execution
  • 41. (1) Trial in the Cloud of QuerySurgeTM, including self-learning tutorial that works with sample data for 3 days (2) Downloaded Trial of QuerySurgeTM, including self-learning tutorial with sample data or your data for 15 days (3) Proof of Concept of QuerySurgeTM includes our team of experts assisting you for 30 days for more information on (1), (2) and (3), Go to http://www.querysurge.com/compare-trial-options TRIAL IN THE CLOUD built by QuerySurge™ Free TrialsQuerySurge™ Proof of Concept

Editor's Notes

  1. I typically read the quotes
  2. Informatica’s software is the premier used for ETL, but was not mentioned in Gartner’s report because they don’t have DW software.
  3. QuerySurge provides insight into the health of your data throughout your organization through BI dashboards and reporting at your fingertips. It is a collaborative tool that allows for distributed use of the tool throughout your organization and provides for a sharable, holistic view of your data’s health and your organization’s level of maturity of your data management.
  4. QuerySurge can utilized by active practitioners such as testers & developers to create and launch tests, or by managers, analysts and operations to view data test results and the overall health of the data. QuerySurge facilitates this by providing 2 types of licenses: (1) full user & (2) participant user. (1) Full User – This type of user has unlimited access to create QueryPairs, Suites, and Scenarios. This user can also schedule and run tests, see results, run and export reports, and export data. Perfect for anyone creating and/or running data tests while performing analysis of results. (2) Participant User – This user cannot create or run tests, but has access to all other information - including viewing all query pairs, results, and reports, receiving email notifications, and exporting test results and reports. Perfect for managers, analysts, architects, DBAs, developers, and operations users who need to know the health of their data.
  5. Your distributed team from around the world can use any of these web browsers: Internet Explorer, Chrome, Firefox and Safari. Installs on operating systems: Windows & Linux. QS connects to any JDBC-compliant data source. Even if it is not listed here.
  6. QuerySurge finds bad data by natively connecting to: any data source, whether it is any type of database, flat file or xml and can connect to any data target, whether it is a db, file, xml, data warehouse or hadoop implementation. QuerySurge pulls data from the source and the target and compares them very quickly (typically in a few minutes) and then produces reports that show every data difference, even if there are millions of rows and hundreds of columns in the test. These reports can be automatically emailed to your team. You can pick from a multitude of reports or export the results so that you can build your own reports.