1. EVOLVING ON-DEMAND INFRASTRUCTURE
FOR HADOOP 2.0 AND YARN
Prime Dimensions' Hadoop 2.0 Big Data infrastructure includes YARN
and Tez, a distributed data operating system and development
platform that extends the batch processing functionality of
MapReduce by allowing multiple types of applications to be deployed
directly across Hadoop clusters. Together, YARN and Tez represent a
paradigm shift in processing, managing and analyzing Big Data.
The real benefit of YARN is that it allows Hadoop clusters to execute
workloads beyond MapReduce. With YARN, Hadoop now has a
generic resource-management and distributed application
framework, in which multiple data processing applications can run
natively in Hadoop. YARN provides extensibility and scalability in
Hadoop by splitting the roles of the Hadoop JobTracker into two
processes: (1) the ResourceManager, which controls access to the
cluster's resources (memory, CPU, etc.), and (2) a per-application
ApplicationMaster, which controls task execution.
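The split of responsibilities described above can be illustrated with a small conceptual sketch. This is a toy model in plain Python, not YARN's actual API: the class names, method names, and resource figures below are all hypothetical and chosen only to show one component gating access to shared cluster resources while a separate per-application component drives task execution.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional


@dataclass
class ResourceManager:
    """Toy stand-in for role (1): gates access to shared cluster resources."""
    total_memory_mb: int
    total_vcores: int
    used_memory_mb: int = 0
    used_vcores: int = 0

    def allocate(self, memory_mb: int, vcores: int) -> bool:
        """Grant a request only if the cluster still has enough capacity."""
        if (self.used_memory_mb + memory_mb > self.total_memory_mb
                or self.used_vcores + vcores > self.total_vcores):
            return False
        self.used_memory_mb += memory_mb
        self.used_vcores += vcores
        return True

    def release(self, memory_mb: int, vcores: int) -> None:
        """Return resources to the shared pool once a task finishes."""
        self.used_memory_mb -= memory_mb
        self.used_vcores -= vcores


@dataclass
class ApplicationManager:
    """Toy stand-in for role (2): one per application, runs that app's tasks."""
    name: str
    rm: ResourceManager
    completed: List[Any] = field(default_factory=list)

    def run_task(self, task: Callable[[], Any],
                 memory_mb: int, vcores: int) -> Optional[Any]:
        """Ask the resource manager for capacity, then execute the task."""
        if not self.rm.allocate(memory_mb, vcores):
            return None  # request denied: not enough free capacity
        try:
            result = task()
            self.completed.append(result)
            return result
        finally:
            self.rm.release(memory_mb, vcores)


def demo():
    # Hypothetical cluster: 4 GB of memory and 4 virtual cores.
    rm = ResourceManager(total_memory_mb=4096, total_vcores=4)
    # Two different kinds of application share the same cluster.
    batch = ApplicationManager("batch-job", rm)
    interactive = ApplicationManager("interactive-query", rm)
    results = [
        batch.run_task(lambda: sum(range(10)), memory_mb=2048, vcores=2),
        interactive.run_task(lambda: "ok", memory_mb=1024, vcores=1),
        batch.run_task(lambda: "too big", memory_mb=8192, vcores=1),
    ]
    return results, rm.used_memory_mb


if __name__ == "__main__":
    print(demo())
```

In this sketch the last request is refused because it asks for more memory than the hypothetical cluster has, while the two smaller requests from different applications both run against the same shared pool.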
In conjunction with YARN, Prime Dimensions is also offering
integration support for other Hadoop projects, such as Tez, a
dataflow graph tool, and Spark, an open source, in-memory data
analytics platform. Together, these projects make it possible to
establish domain-specific enclaves over multi-tenant compute
clusters, creating a virtualized data environment and unified analytics
platform, as enterprises evolve from “systems of record” to
“systems of engagement.” This often requires deploying in-memory,
high-performance NoSQL technologies,
but YARN, Tez and Spark offer new
options for organizations seeking these
analytic capabilities in Hadoop.
As Hadoop gains widespread adoption
not only as a Big Data technology but
also as a data warehouse augmentation
strategy, its basic functionality is
evolving to meet the demands of
increased performance and high
scalability. YARN and Tez are not simply
new releases; they represent a
revolutionary advancement of Hadoop.
We see tremendous opportunity for the
adoption of YARN, Tez and Spark as
enterprise solutions for generating
advanced analytics with reduced time-to-value. There will be significant
demand to upgrade early adopters to
Hadoop 2.0. Moreover, with the
advanced features and capabilities of
YARN and Tez, the use cases that arise
from this new paradigm span across
industries, with seemingly endless
possibilities.
There are advantages to bringing together NoSQL,
relational and/or in-memory solutions,
both open source and proprietary. Such
analytic offload supports the
establishment of a unified analytics
environment.
To Learn More…
Contact:
Michael Joseph, Managing Partner
703.861.9897
mjoseph@primedimensions.com
www.primedimensions.com
@PrimeDimensions