TECHNOLOGY IN BRIEF




                                      THE OBJECT EVOLUTION
           EMC OBJECT-BASED STORAGE FOR ACTIVE ARCHIVING AND
                       APPLICATION DEVELOPMENT
                                               NOVEMBER 2012

 A few years ago, object-based storage made a huge splash on-premise with the promise of meaningful
 data relationships, information accessibility and strong compliance. It remains an important
 component for information management based on compliance and single-tenant architectures. However,
 the evolution of object-based storage has big implications for the cloud and unstructured data: new
 approaches to active archiving, web/mobile application development and a changing model for cloud
 storage service providers.
 Object storage is optimal for the web. It has a very different architecture from file systems, which
 are frankly overkill for most cloud storage. On-premise can be a different story; having data close to
 hand under single-tenant access control is right for some data storage. But on-premise stored data
 requires that the enterprise maintain a primary data center, a cold data center for DR, replication,
 continuous data protection, and so on. Given the right set of needs this is a fine trade-off, of
 course, and we certainly do not counsel people to get rid of their internal data centers and
 redundant systems.
 However, cloud-based object architecture offers big benefits for storing unstructured data for
 active archiving, global access to data, fast application development and much lower cost compared
 to the high computing and data protection costs of on-premise NAS. EMC has engineered Atmos to
 provide these capabilities and many more as a massively scalable, distributed cloud-based system.
 In this Technology in Brief we will examine the fast-changing world of archiving and development
 on the web, and how object-based storage is the best way to go for these monumental tasks.

 When Object Trumps File
 The go-to architecture for unstructured data has traditionally been an application-centric system
 containing the operating system, the application, and a NAS filer using hierarchical file architecture.
 This infrastructure works acceptably well in a slow-growth, consistent workload setting; although
 even then it is far too easy to add complexity along with additional systems and filers.
 However, business needs have evolved far beyond this sleepy storage model. Unstructured data
 now comprises a massive portion of large data growth, and hierarchical file systems are difficult to
 optimize and scale. For example, file system-based storage requires near-constant provisioning. As
 storage requests grow (which they inevitably do), IT administrators must manually provision
 storage to meet the expanded requirements. Meanwhile, large volume and spiky workloads make
 provisioning both “up” and “down” an expensive and time-consuming proposition.
 And difficult provisioning is hardly the only problem: siloed data protection with individual backup,
 replication and archiving applications steadily raises OPEX. Scaling is an issue as well. Large critical
 big data applications may warrant scale-out or scale-up file systems (which are challenges in and of


Copyright The TANEJA Group, Inc. 2012. All Rights Reserved.                                      1 of 12
87 Elm Street, Suite 900  Hopkinton, MA 01748  T: 508.435.2556  F: 508.435.2557   www.tanejagroup.com

themselves). Most applications do not rate this architecture, and instead reside on poorly scalable
systems. The number of these systems grows as applications come online, making it even harder for IT
and application owners to administer them and for users to get the value they need from the
application. This already difficult scenario gets even worse when NAS storage is used for what is
essentially a cloud use case, such as extending existing assets over the cloud.




                                      Figure: Traditional NAS infrastructure

In contrast to hierarchical file system-based storage silos, object-based storage opens up a whole
new range of dynamic functionality. Object-based storage assigns unique object IDs to access data
across all federated locations. This goes a long way towards eliminating traditional, time-
consuming storage management tasks like LUN creation and RAID groups. Active archives and
applications needing fast global access particularly benefit from global namespaces and location
transparency. The flat, universal namespace allows global access to stored content from anywhere
the distributed application runs. Applications can also efficiently associate metadata with stored
objects without using a dedicated database. Sharing vast storage resources means application
administrators do not need to modify application files. Object-based storage usually has elements of
file systems in order to handle processes like file archiving, but it is not founded on that
architecture and its drawbacks.
Object-based storage originally developed as a type of specialized NAS storage where the
hierarchical system was replaced with an object-oriented system that made file storage far more
secure and scalable. One of its most popular incarnations is still going strong today: Content-
Addressable Storage (CAS). A subset of object-oriented storage, CAS derives each object’s ID from a
hash of its content, which ensures there is only one ID for any object. When the CAS object is
retrieved, it can be hashed again and checked against its ID to verify identity. CAS de-dupes at
the object level for copy control.
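The CAS behavior described above can be illustrated with a minimal sketch. It is a toy in-memory store, not Centera's implementation: the class name, the method names and the choice of SHA-256 as the hash are all assumptions made for illustration.

```python
import hashlib


class ContentAddressedStore:
    """Toy in-memory content-addressable store (illustrative only)."""

    def __init__(self):
        self._objects = {}  # content address -> stored bytes

    def put(self, data: bytes) -> str:
        # The address is derived from the content itself, so identical
        # content always maps to the same ID and is stored only once.
        address = hashlib.sha256(data).hexdigest()
        self._objects.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        data = self._objects[address]
        # Re-hash on retrieval and compare against the ID to verify identity.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored content does not match its address")
        return data

    def __len__(self) -> int:
        return len(self._objects)


store = ContentAddressedStore()
first = store.put(b"quarterly report")
second = store.put(b"quarterly report")  # duplicate content, same address
```

Storing the same bytes twice returns the same address and leaves a single stored copy, which is the object-level de-duplication the text refers to.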




Copyright The TANEJA Group, Inc. 2012. All Rights Reserved.                                       2 of 12
87 Elm Street, Suite 900  Hopkinton, MA 01748  T: 508.435.2556  F: 508.435.2557    www.tanejagroup.com
Technology in Brief



TABLE: CONTRASTING FILE SYSTEMS WITH OBJECT STORAGE

Characteristic: Metadata
    File:   File systems implement a centralized file layer metadata service that tracks directory
            structures, permissions, and on-disk locations of files. All file requests must access
            metadata first for permission and file information.
    Object: Object metadata is stored along with the object data to avoid metadata service
            bottlenecks. The object ID may also be used to uniquely verify and validate the data
            being stored.

Characteristic: Namespace
    File:   File systems have built-in namespace constraints for files and directories they can
            store and manage. Hierarchical directory structures can become unwieldy, performing
            poorly at navigating large numbers of users or files.
    Object: Object storage provides a single flat namespace for objects. Replacing path and
            filenames with object identifiers makes the address space practically infinite with
            very fast performance for users and applications.

Characteristic: Interaction
    File:   File systems are designed to offer in-place editing and updating of files using
            sophisticated, yet highly complex, locking and synchronization mechanisms. These
            methods make it difficult to distribute or extend file systems across multiple
            locations.
    Object: Objects are inherently immutable once stored under a unique ID, and can be easily
            replicated and accessed globally. Programming for object storage leads to simpler,
            supportable, and more reliable programs.

Characteristic: Cloud Applications
    File:   File systems present a real challenge for cloud-based archival management and mobile
            application delivery. Poor scalability, lagging performance, and complex application
            development make traditional file systems a poor choice for compelling new cloud
            usages.
    Object: Object stores are simple, clean and quick to access. Since objects are easily
            distributed, replicated, and globally accessible in the cloud, they are ideal for
            active global archives and distributed mobile applications.


Object-based storage both on-premise and in the cloud requires certain key capabilities. On-premise
object storage has great benefits for local file storage, including multiple application access,
massive scaling and high availability and, in some architectures, information governance as well.
   Multiple application access. Applications simultaneously leverage the same centralized
    object-based storage infrastructure. This enables local object-based storage to execute
    application-specific archiving management attributes for a complete chain of information
    custody.
   Massive scaling. Massive scaling is problematic with file-based archive solutions. As the file
    system reaches its maximum capacity, administrators must expand the entire system’s
    operating system, file system and application in order to scale the archive. By contrast, object-
    based storage can expand in an open fashion into multiple petabytes due to its flat address
    space.
   High availability. Object storage often archives data that has heavy retention and government
    requirements. In this environment, 5 9’s or higher availability (99.999%) is a necessity.
    Mirroring and parity help to protect availability; other beneficial features include self-healing,
    detecting and fixing soft corruptions in the background, and addressing hardware failures
    before they impact data availability.
   Information governance. A subset of object-based storage, Content-Addressable Storage
    (CAS) is purpose-built for long-term defensible retention of fixed files and data. As opposed to
    other archival storage methods like tape or monolithic “tar” files that bundle data up and/or
    move it offline, CAS stores data as objects that can be strictly and individually managed for
    governance and compliance and yet remain actively accessible on-line.
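To put the 5 9’s figure from the high availability item above in concrete terms, the following short calculation converts an availability percentage into a yearly downtime budget. The function name is ours, and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget implied by an availability target."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return MINUTES_PER_YEAR * unavailable_fraction


five_nines = allowed_downtime_minutes(99.999)   # roughly 5.3 minutes per year
three_nines = allowed_downtime_minutes(99.9)    # roughly 8.8 hours per year
```

The gap between the two results shows why mirroring, parity and background self-healing matter at this service level: a five-nines target leaves only a few minutes per year for any unplanned outage.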

Best Practices: Object and the Cloud
We strongly support on-premise object storage such as CAS for local space savings, performance
and information governance. However, we find that object storage is roaring to life in the cloud,
where cloud-based active archiving and application development require highly distributed and
single namespace storage for unstructured content. These critical usage cases benefit far more from
object-based storage than they do from traditional file systems. Let’s look at best-practice
architectural features for object-based storage in the cloud.

DATA AND METADATA
When data is stored as an object, a unique object identifier is created out of a single universal global
namespace. The object ID is retained by the client application and used to subsequently retrieve
that object. Objects can effectively live anywhere in the cloud-wide system without the storage
client needing to know about actual data locations, file system structures or LUN details. This
provides complete location transparency that serves to reduce manual storage management
and inherently supports globally distributed access by web and mobile applications.
Because of the location transparency provided by the object storage layer, objects can be
automatically load-balanced across nodes, and replicated within and across sites without
disrupting applications or users. Wide data distribution and federation can be managed through
systematic policies to meet various service level goals for access, high availability, protection, cost
and performance.
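One way to picture location transparency is a placement function that lives entirely inside the storage layer, so the application holds nothing but the object ID. The sketch below uses rendezvous (highest-random-weight) hashing as a stand-in technique. It is not a description of how Atmos places data, and the node names and function name are hypothetical.

```python
import hashlib


def place_replicas(object_id: str, nodes: list[str], replicas: int = 2) -> list[str]:
    """Pick replica nodes for an object by rendezvous hashing."""

    def weight(node: str) -> int:
        # Each (node, object) pair gets a stable pseudo-random score.
        digest = hashlib.sha256(f"{node}:{object_id}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    ranked = sorted(nodes, key=weight, reverse=True)
    return ranked[:replicas]


nodes = ["boston-1", "boston-2", "london-1", "tokyo-1"]
before = place_replicas("obj-42", nodes)
after = place_replicas("obj-42", nodes + ["sydney-1"])  # cluster grows by one node
```

With this scheme, adding a node either leaves an object's placement unchanged or moves one replica to the new node. The application code that stores and retrieves "obj-42" does not change either way, which is the point of the abstraction.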
The object layer abstraction also provides a great benefit to applications that previously might have
had to be intimately storage aware to avoid running out of space or had to otherwise actively
manage data locations. Because applications written to leverage object storage don’t have to embed
rules or code specific knowledge of storage infrastructure details, they avoid having to be re-
written or re-architected for “changing” storage assignments as users spread, features expand, and
data sets grow.

MULTI-TENANCY
Secure multi-tenancy is a key requirement of cloud object storage, which should support two levels
of multi-tenancy: tenants and sub-tenants. Tenants are top-level entities that each have their own
access points, security controls and master storage policies. Tenants share nothing with other
tenants and are fully isolated. Every node gets assigned to a specific tenant; tenants do not share
nodes and therefore each tenant has its own dedicated access points and storage. Within a large
company, a tenant could be set up for independently managed divisions or subsidiaries. In a service
provider implementation, the tenant might be mapped to a broad storage service offering.


Copyright The TANEJA Group, Inc. 2012. All Rights Reserved.                                       4 of 12
87 Elm Street, Suite 900  Hopkinton, MA 01748  T: 508.435.2556  F: 508.435.2557    www.tanejagroup.com
Technology in Brief



Sub-tenants are then created within each tenant with security controls and defined management
policies assigned by the tenant. Each sub-tenancy defines a distinct storage environment with
isolated management for its own users, object namespace, and defined shares. A sub-tenant within
a company might correspond to a department, while a storage provider's sub-tenant might track to
a specific client account.
This highly functional multi-tenancy capability makes it easy to create private sandboxes or
implement a global content delivery scheme. With some planning, this scheme could enable large
corporations to facilitate aggregating “big data” distributed across the enterprise.
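The two-level model can be sketched as a small data structure. This is an illustrative model only: the class names, method names and node labels are hypothetical and do not correspond to any product API.

```python
class SubTenant:
    """Isolated object namespace with its own users (illustrative)."""

    def __init__(self, name: str, users: set[str]):
        self.name = name
        self.users = set(users)
        self._objects: dict[str, bytes] = {}

    def put(self, user: str, object_id: str, data: bytes) -> None:
        self._check(user)
        self._objects[object_id] = data

    def get(self, user: str, object_id: str) -> bytes:
        self._check(user)
        return self._objects[object_id]

    def _check(self, user: str) -> None:
        if user not in self.users:
            raise PermissionError(f"{user} is not a user of sub-tenant {self.name}")


class Tenant:
    """Top-level entity that owns its nodes and creates sub-tenants."""

    def __init__(self, name: str, nodes: set[str]):
        self.name = name
        self.nodes = set(nodes)
        self.subtenants: dict[str, SubTenant] = {}

    def create_subtenant(self, name: str, users: set[str]) -> SubTenant:
        sub = SubTenant(name, users)
        self.subtenants[name] = sub
        return sub


def assert_no_shared_nodes(tenants: list[Tenant]) -> None:
    """Tenants share nothing: each node belongs to exactly one tenant."""
    seen: set[str] = set()
    for tenant in tenants:
        overlap = seen & tenant.nodes
        if overlap:
            raise ValueError(f"nodes shared between tenants: {sorted(overlap)}")
        seen |= tenant.nodes


corp = Tenant("corp", {"node-1", "node-2"})
finance = corp.create_subtenant("finance", {"alice"})
research = corp.create_subtenant("research", {"bob"})
finance.put("alice", "report", b"Q3")
```

Here the "report" object exists only in the finance namespace: a research user cannot read it, and the same object ID could be reused in the research sub-tenant without conflict.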

ACCESS FROM ANYWHERE
In a cloud object storage service with a flat global namespace, an object can be accessed through
any site (although for performance, policies might strive to replicate objects to sites closer to where
they will be read). In addition, object storage for the cloud must present a broad range of access
methods including both web services and traditional file services.
REST (and SOAP) web services are key APIs. REST is the most common cloud storage access
method for browser and custom mobile applications. REST as a protocol over HTTP was designed
to optimize web-style remote access to “resources”, and is an ideal match to object storage where
each object can be easily treated as a REST resource.
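To show how naturally an object maps to a REST resource, the sketch below builds (but does not send) HTTP requests for creating, reading and deleting an object. The endpoint URL, the path layout and the "X-Meta-" header prefix are hypothetical and are not taken from any vendor's API; only Python's standard urllib.request.Request is used.

```python
from urllib.request import Request

BASE_URL = "https://storage.example.com/objects"  # hypothetical endpoint


def build_create(data: bytes, metadata: dict[str, str]) -> Request:
    """Create-object request: POST the bytes, pass metadata as headers."""
    headers = {f"X-Meta-{key}": value for key, value in metadata.items()}
    headers["Content-Type"] = "application/octet-stream"
    return Request(BASE_URL, data=data, headers=headers, method="POST")


def build_read(object_id: str) -> Request:
    """Read-object request: the object ID is the whole address."""
    return Request(f"{BASE_URL}/{object_id}", method="GET")


def build_delete(object_id: str) -> Request:
    """Delete-object request for the same resource URL."""
    return Request(f"{BASE_URL}/{object_id}", method="DELETE")


create = build_create(b"hello", {"department": "radiology"})
read = build_read("4ef1a2")
remove = build_delete("4ef1a2")
```

Each object is one URL, and the standard HTTP verbs cover its whole lifecycle. No mount point, path hierarchy or LUN appears anywhere in the client code.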




                             Figure: Typical cloud-based object storage deployment

POLICY DRIVEN MANAGEMENT
A key benefit of object storage is the ability to use metadata to drive automatic data management
policies. Policies should support service levels, and should be triggered when data objects are
created, objects hit certain ages, or upon metadata updates. Policies can control data protection
operations including the number, type and target locations for replicas, inherent storage features
for striping, compression and de-duplication, retention locks and automatic deletion, and shifting
objects into different policies over time.


Copyright The TANEJA Group, Inc. 2012. All Rights Reserved.                                       5 of 12
87 Elm Street, Suite 900  Hopkinton, MA 01748  T: 508.435.2556  F: 508.435.2557    www.tanejagroup.com
Technology in Brief



The policy mechanism should be highly flexible, targeting policies to any group of objects based on
both system and user defined metadata. Policies can be used to build service levels by defining the
amount of replication, implement archive rules for compliance, and optimize capacity and
performance as items age.
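A metadata-driven policy engine of the kind described here can be reduced to a matching rule plus a set of service-level settings. The following is a minimal sketch under our own assumptions: the field names, the first-match-wins ordering and the sample values are illustrative, not a documented policy format.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StoredObject:
    object_id: str
    age_days: int
    metadata: dict[str, str] = field(default_factory=dict)


@dataclass
class Policy:
    name: str
    match: dict[str, str]   # metadata key/values an object must carry
    min_age_days: int = 0   # policy applies once the object is this old
    replicas: int = 1
    retention_days: int = 0

    def applies_to(self, obj: StoredObject) -> bool:
        if obj.age_days < self.min_age_days:
            return False
        return all(obj.metadata.get(k) == v for k, v in self.match.items())


def select_policy(obj: StoredObject, policies: list[Policy]) -> Optional[Policy]:
    """Return the first matching policy; list order expresses priority."""
    for policy in policies:
        if policy.applies_to(obj):
            return policy
    return None


policies = [
    Policy("aged-records", {"type": "medical-image"}, min_age_days=365,
           replicas=2, retention_days=2555),
    Policy("active-records", {"type": "medical-image"}, replicas=3),
]
new_scan = StoredObject("a1", 10, {"type": "medical-image"})
old_scan = StoredObject("b2", 400, {"type": "medical-image"})
```

Re-running select_policy when an object is created, when it ages or when its metadata changes gives the three trigger points mentioned above, and lets an object shift from one policy to another over time.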

Primary Object Use Cases in the Cloud
Cloud-based archiving, particularly medical and file archiving, forms the primary use case for
object-based storage. Web application development is surging forward, and Archive-as-a-Service
and its providers round out the fastest-growing use cases.

PRIMARY USE CASE: ACTIVE ARCHIVES
Archived information is playing a more strategic role in workflows and business processes. On-
premise archiving is essentially static and used to reduce storage costs, improve operational
efficiency, retention and compliance, and enable the business to use archived data to make better
business decisions. Cloud-based archiving retains elements of these features but adds new dynamic
ones: instant access from any device, archive as a service and federating to private or public cloud.
Atmos provides both the static and dynamic features that massive active archives require.
   Federate to public or private clouds. Federation enables companies to treat on-premise and
    cloud object storage as a single efficient infrastructure. Companies may pool distributed storage
    assets including data, applications and policies to take full advantage of the cloud’s massive
    scalability and global access features. Federation also lowers cost and risk: application
    workloads run on cloud resources with a low execution cost, and if a cloud-based storage
    system goes down the distributed workload remains protected. Federation extends internal
    policies to cloud-based storage environments by applying existing policies and settings to
    cloud-based storage.
   Use metadata to drive business and storage decisions. We expect the use of metadata to
    expand quickly to directly feed business exploitation processes, as well as support more
    automatic and intelligent storage management decisions. A singly managed distributed system
    that maintains directly accessible object metadata yields rich support for business decisions.
    Object-based storage also enables IT to automate information lifecycle management across the
    entire distributed data store, not just by storage silo. Policies should be flexible enough to be set
    at the object, tenant or system levels, to automate archive decisions, set and manage retention,
    expiration, and disposition.
   Multi-tenancy for secure shared storage. Multiple applications can safely co-exist as separate
    tenants. Isolation by tenant protects security while enabling the sharing of system-wide
    resources and capacity. Multi-tenancy is also efficient since it is subscribed to a highly scalable
    pool of storage, which can flexibly up-scale and down-scale on demand.
   Massive scalability. Unstructured data storage is growing so fast that traditional storage
    systems are straining purchase, maintenance and management resources to the brink.
    Distributed object-based architecture yields near-limitless scale. Object storage also allows for
    automatic load balancing whenever new objects are stored, which protects high performance
    across the entire distributed system.
   Multi-site active/active. Multi-site active/active architecture is an important component of
    object-based storage, especially in the cloud. Cloud object storage systems span multiple sites
    and provide for multi-site direct access to objects through both synchronous and asynchronous
    replication. This model replicates between multiple storage nodes and sites, which not only
    increases distributed availability and content distribution, but also supports disaster recovery.



Copyright The TANEJA Group, Inc. 2012. All Rights Reserved.                                       6 of 12
87 Elm Street, Suite 900  Hopkinton, MA 01748  T: 508.435.2556  F: 508.435.2557    www.tanejagroup.com
Technology in Brief



   Archive-as-a-service. The most agile and flexible way for IT to deliver archive services is with
    the cloud model of self-service portals. This model manages and meters utilization and
    bandwidth and supports third-party chargeback. Within an enterprise this flexibility and
    instant storage relieves users of the temptation of using commercial cloud services simply
    because they can get the storage they need fast – even though security might not be in place.
    This approach also enables ISVs and MSPs to extend archive requirements and offerings.
   Reduce manual tasks and provisioning across multiple archives. Cloud-based archives
    must be easy to set up and, for reliability and consistency, must not require long or deep manual
    configuration. They should also automate underlying complexities including security, audit,
    retention, performance, and capacity growth. Atmos provides these features and more,
    relieving the cloud administrator of enormous burdens. Distributed systems may be managed
    as a single entity with policies to automate hundreds of management and data protection tasks.
    Perhaps most important of all, object-based systems like Atmos offer massive
    scalability of capacity and performance thanks to their unique architecture.

FAST-GROWING USE CASE: WEB AND MOBILE APPLICATION DEVELOPMENT
Web and mobile application development using unstructured data also has driving needs that
object-based cloud storage meets. Web application development requires quick access to storage
resources, test/dev environments capable of storing multiple copies of large data sets, and the
ability to test web applications in real-time online environments. These requirements are
understandably hard to achieve using traditional file-based storage systems.
Applications written to leverage object storage won’t need to be rewritten or even taken offline as
the object storage seamlessly (or elastically) expands over time. Atmos provides the key
capabilities that web application development requires, including location transparency, self-
managing storage and REST APIs.
   Enable instant access to data from any device. Web and mobile applications are inherently
    geographically distributed, yet file systems are usually limited in both effective access points
    (location) and number of files that they can manage. Object-based storage abstracts its storage
    from physical locations, providing a secure access point in place of device-specific mount points.
    Web services APIs and file-based access allow approved users to easily access their archives
    from computers and a broad array of mobile devices. Integrated web services over REST and
    SOAP are key to this instant access. Other support components are file-based access (CIFS / NFS
    / IFS / CAS), and expanded access via ISV applications.
   Self-managing storage. In traditional development, applications have often been hard-coded
    to specific data stores through pointers to identified LUNs or file system navigation paths. In
    contrast, object storage provides a clean mapping from application to data through a simple
    REST API with an immutable unique object ID to the stored object. This goes a long way
    towards eliminating traditional, time-consuming storage management tasks like LUN creation
    and RAID groups. Cloud owners may choose to extend self-management options to customers,
    making it simple for users to grow storage capacity on demand.
   Broad API support. Cloud object storage is basically shared storage accessed through web-
    based services. Atmos’ architecture supports rapid web application development with a broad
    API set including REST and S3. The REST API leverages HTTP operations on objects that are directly
    addressed, which reduces code complexity and provides the kind of easy, automatically
    distributed, protected, persistent storage the developer needs. In addition to the REST API,
    EMC Atmos also natively supports the Amazon S3 API. This provides customers with the ability
    to simply point S3 applications to Atmos and seamlessly migrate their applications to any of the
    more than 40 Atmos powered public clouds around the globe.


Copyright The TANEJA Group, Inc. 2012. All Rights Reserved.                                       7 of 12
87 Elm Street, Suite 900  Hopkinton, MA 01748  T: 508.435.2556  F: 508.435.2557    www.tanejagroup.com
Technology in Brief




EMC and Object-Based Storage
EMC first introduced Centera CAS for archiving in 2002. Centera offers 5 9’s data availability with
its redundant array of independent nodes (RAIN), which is interconnected via cube switches and
protects data across independent nodes in a cube. Mirroring and parity provide additional
protection and availability.
Centera’s CAS architecture keeps the retained data from being compromised or deleted before the
end of its retention period. Centera assigns unique hash-code identifiers specific to each unique
object including content elements, metadata, and data/metadata relationships. This inextricably
links content elements with their metadata, which are stored within a flat address space – no need
for a separate database. This architecture ensures authenticity of the archived objects. Centera
abstracts the unique objects from their generating applications and operating systems, which
enables Centera to flexibly act as the single, highly optimized data store for previously siloed
archives.
Centera retains single instances of archived objects. In the case of multiple users of the same file –
such as a PowerPoint file sent over a distribution list – Centera retains metadata with information
about each user’s interaction with the file, but points to the single instance of the object. By cutting
down on data copies, this results in dramatic reductions in the quantity of archive storage.
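The storage effect of single-instance retention is easy to quantify with a small model. In this sketch, which is our own illustration rather than Centera's data layout, each archive request adds a per-user reference record while identical content is stored once. The class name, the record fields and the use of SHA-256 are assumptions.

```python
import hashlib


class SingleInstanceArchive:
    """Keeps one copy per unique content plus per-user reference records."""

    def __init__(self):
        self.objects: dict[str, bytes] = {}          # object ID -> content
        self.references: list[dict[str, str]] = []   # one record per user action

    def archive(self, user: str, filename: str, content: bytes) -> str:
        object_id = hashlib.sha256(content).hexdigest()
        self.objects.setdefault(object_id, content)  # stored only once
        self.references.append(
            {"user": user, "filename": filename, "object_id": object_id}
        )
        return object_id

    def stored_bytes(self) -> int:
        """Bytes actually held by the archive."""
        return sum(len(content) for content in self.objects.values())

    def logical_bytes(self) -> int:
        """Bytes that would be held if every reference kept its own copy."""
        return sum(len(self.objects[r["object_id"]]) for r in self.references)


archive = SingleInstanceArchive()
deck = b"x" * 1000  # stand-in for a presentation sent to a distribution list
for user in ("ann", "raj", "li"):
    archive.archive(user, "kickoff.ppt", deck)
```

Three recipients produce three reference records but only 1,000 stored bytes instead of 3,000. The ratio improves as the distribution list grows.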
Centera searches using metadata, rather than opening up the content objects on application-specific
storage. This results in much faster and more efficient searches without using application cycles.
This is possible because content and metadata stored on Centera are application, file and operating
system independent, and Centera offers a search engine right in its repository.
Centera’s content-based addressing integrates directly with application environments via APIs,
with no need for kernel level dependencies. This means that multiple applications can
simultaneously use Centera, and that specific archiving management attributes – such as data aging
and data protection – can be executed per application. These capabilities create a complete chain of
custody once the data leaves the primary application to be archived on Centera. Media
independence also leverages Centera’s application support. Centera objects are independent of
specific storage media and protocols, which means that the storage system can migrate to new
storage media over time without disturbing the integrity of the archived objects. For long term
disk-based archiving, this represents significant risk mitigation and investment protection.
Centera architecture is highly scalable and self-managing. Traditional file systems scale based on
the amount of stored data versus remaining available address space – which may not be much. As
the file system reaches its maximum capacity, administrators must expand the entire file system
including operating system, file system, and application in order to scale the archive. In contrast,
Centera expands to petabyte-scale capacities due to its flat address space. It also leverages its
architecture to distribute management controls across the entire archive infrastructure. For
example, if a Centera disk or node fails, the archive cluster knows how to self-heal without manual
intervention. This distributed management structure extends to cover the deployment, scaling,
recovery and protection of all the archival objects being stored by Centera.
Centera optimizes archiving, information governance and compliance. Users may choose from 300
native, integrated archiving applications to manage archival needs for email, files, medical imaging,
content management, video, voice, and more on the single Centera archiving platform. In addition,
Centera offers Compliance Edition Plus for compliance and eDiscovery, and Governance Edition for
data retention management.



Copyright The TANEJA Group, Inc. 2012. All Rights Reserved.                                       8 of 12
87 Elm Street, Suite 900  Hopkinton, MA 01748  T: 508.435.2556  F: 508.435.2557    www.tanejagroup.com
Technology in Brief



Centera Compliance Edition Plus captures and preserves original content, protecting data and
proving chain of custody for legal eDiscovery and litigation. Retention classes assign a logical
reference to each electronic record object; policies enforce data retention and safe disposition.
Centera Governance Edition enforces internal policies for data retention and disposition. Policies
may be organizational or application-specific, which improves corporate accountability, reduces the
cost of eDiscovery and compliance, and proves the integrity of governance controls.

To the Cloud: Atmos Architecture
EMC’s Atmos supports the same CAS API as Centera for seamless migration, and brings object
storage into the cloud with massive scalability and geographic federation supported with multi-
tenancy, cloud provisioning and global access features. While Atmos is readily leveraged to extend
active global archives, it also offers an exceptional platform for web and mobile application
development. Atmos even enables new opportunities for global “big” data aggregation and
distribution.
Atmos is at heart a software storage system for building private and public cloud storage. Atmos
implementations are available from EMC either already integrated into pre-packaged physical
building blocks or as a virtual machine solution for VMware vSphere that can leverage other EMC or
3rd party storage resources. Additionally, there is a rich ecosystem of service providers providing
Atmos as cloud Storage-as-a-Service directly. Any and all of these options can be federated together
as needed within and across a given organization.
EMC uses REST and SOAP web services, and has also implemented file services on top of Atmos to
serve underlying objects through the lens of either an NFS or CIFS file server. When NFS or CIFS
shares are defined, they are assigned to specific Atmos nodes (or dedicated pairs for HA) and utilize
the Atmos node’s inherent Linux capabilities (leveraging an Installable File System with the FUSE
extension). Layering a file system over Atmos imposes some constraints regarding universal access,
but also enables both traditional and transitional applications and file system type usage.
Windows and Linux users of Atmos can also use the EMC GeoDrive add-on, which installs on a
single-user workstation or server and provides remote, virtual NFS/CIFS-style access (over REST) to
Atmos object storage. GeoDrive supports local caching of files for offline use and eventual
synchronization on reconnection. One of the major benefits of GeoDrive is enabling a user to access
large amounts of protected storage from anywhere. It can also be used for the disaster recovery of
files pushed or mirrored into Atmos.
Atmos technically maintains a given piece of data as an object with associated metadata that
includes the object ID, system and user-defined metadata fields and the internal object layout
information (and parent/child information for objects saved through a file system “namespace”
interface). Applications and users can store arbitrary metadata with each object that can be
leveraged by group management policies. Policies can be created at the tenant level as a design
scheme to provide various service levels of performance, access, and data protection based on some
awareness of the multi-site architecture of the cloud implementation. They are then assigned to
subtenants, who need not be aware of the underlying implementation, to apply as target service
levels to their objects. For example, the power to explicitly enforce compression of image files (e.g.
jpegs) after a number of days would present a significant capacity optimization for a web-based
application dealing with millions of images.
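The image-compression example can be sketched in a few lines of Python. This is only an illustration of metadata-driven policy matching, not the Atmos policy engine: the class names, the "content-type" metadata key and the 30-day threshold are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class StoredObject:
    """Hypothetical stand-in for an object plus its user-defined metadata."""
    object_id: str
    created: datetime
    user_metadata: dict = field(default_factory=dict)
    compressed: bool = False


@dataclass
class CompressionPolicy:
    """Hypothetical policy: compress objects of one content type after N days."""
    content_type: str
    after_days: int

    def applies_to(self, obj: StoredObject, now: datetime) -> bool:
        old_enough = now - obj.created >= timedelta(days=self.after_days)
        type_matches = obj.user_metadata.get("content-type") == self.content_type
        return old_enough and type_matches and not obj.compressed


def apply_policy(policy, objects, now):
    """Mark every matching object as compressed and return the affected IDs."""
    touched = []
    for obj in objects:
        if policy.applies_to(obj, now):
            obj.compressed = True
            touched.append(obj.object_id)
    return touched


now = datetime(2012, 11, 30, tzinfo=timezone.utc)
objects = [
    StoredObject("img-1", now - timedelta(days=45), {"content-type": "image/jpeg"}),
    StoredObject("img-2", now - timedelta(days=5), {"content-type": "image/jpeg"}),
    StoredObject("doc-1", now - timedelta(days=90), {"content-type": "application/pdf"}),
]
policy = CompressionPolicy(content_type="image/jpeg", after_days=30)
print(apply_policy(policy, objects, now))  # ['img-1']
```

Only the 45-day-old JPEG is selected: the newer JPEG is too young and the PDF has the wrong type. The subtenant storing the objects supplies nothing but the metadata.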
In addition to supporting compliance and retention policies, metadata can be used to drive
automated file distribution, access control and data protection activities, optimizing for the
appropriate level of data resiliency, performance and availability. For most applications, thoughtful
use of user metadata can remove any need to implement a separate management tracking database
for stored objects.


Copyright The TANEJA Group, Inc. 2012. All Rights Reserved.                                       9 of 12
87 Elm Street, Suite 900  Hopkinton, MA 01748  T: 508.435.2556  F: 508.435.2557    www.tanejagroup.com
Replication is controlled by automated policies which can mirror data objects at many points in an
object’s lifecycle both within and across multiple sites. Within a data center site, replication might
for example be set to happen synchronously upon ingestion, while replication between
sites might be set asynchronously and launched with an arbitrary delay to allow for data settling.
Replications can be targeted to specific locations, or abstractly sent to “other” sites as the system
decides.
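A replication policy of this kind can be pictured as a list of replica rules. The sketch below is an assumption-laden illustration, not Atmos policy syntax: the `ReplicaRule` fields, the "local" keyword, the site names and the first-available choice for "other" sites are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReplicaRule:
    """Hypothetical replica rule within a replication policy."""
    location: Optional[str]  # "local", a named site, or None for "other"
    synchronous: bool
    delay_seconds: int = 0


def plan_replication(ingest_site, rules, all_sites):
    """Return one (site, mode, delay_seconds) tuple per replica rule."""
    plan = []
    used = set()
    for rule in rules:
        if rule.location == "local":
            site = ingest_site
        elif rule.location is not None:
            site = rule.location
        else:
            # "other": pick the first remote site that holds no replica yet
            candidates = [s for s in all_sites if s != ingest_site and s not in used]
            if not candidates:
                raise ValueError("no remaining site for an 'other' replica")
            site = candidates[0]
        used.add(site)
        plan.append((site, "sync" if rule.synchronous else "async", rule.delay_seconds))
    return plan


rules = [
    ReplicaRule("local", synchronous=True),                    # mirror on ingestion
    ReplicaRule("london", synchronous=False, delay_seconds=300),
    ReplicaRule(None, synchronous=False, delay_seconds=3600),  # system decides
]
for step in plan_replication("boston", rules, ["boston", "london", "tokyo"]):
    print(step)
```

With these inputs the third replica lands in "tokyo", because "london" already holds one.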
For performance and availability, replicas are all active for read access (objects are inherently
immutable so there is no issue with having to manage distributed locking mechanisms). Because
Atmos is “multi-site active/active”, any site can fulfill new object write requests when the local primary
site is unavailable.
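The active/active behavior can be sketched as two small selection functions. The function names, the site names and the "first reachable site" fallback are assumptions for illustration; the actual Atmos site selection logic is not described here.

```python
def pick_read_site(replica_sites, available_sites, client_site):
    """Any reachable replica can serve a read; prefer the client's own site."""
    candidates = [s for s in replica_sites if s in available_sites]
    if not candidates:
        raise RuntimeError("no replica is reachable")
    return client_site if client_site in candidates else candidates[0]


def pick_write_site(primary_site, all_sites, available_sites):
    """New objects go to the primary site, or to another live site if it is down."""
    if primary_site in available_sites:
        return primary_site
    for site in all_sites:
        if site in available_sites:
            return site
    raise RuntimeError("no site is reachable")


sites = ["boston", "london", "tokyo"]
available = {"london", "tokyo"}  # the local primary, boston, is down

print(pick_read_site(["boston", "tokyo"], available, "boston"))  # tokyo
print(pick_write_site("boston", sites, available))               # london
```

Because objects are immutable, neither function needs any locking or version check before choosing a site.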
In addition to full replication, EMC also provides an erasure coding option called GeoParity. Instead
of keeping two or more full 100% copies, “9/12” erasure coding enables storing an “expanded”
object containing only 33% additional encoded “redundant” data broken up into 12 segments. By
using erasure coding, the original data can be reconstructed dynamically from any 9 of the
segments. These segments are cleverly distributed so that the object can survive (and even be
accessed during) multiple failures. For greater protection there is also a “10/16” coding with a 60%
capacity overhead. Erasure coding does impact access performance, especially at ingestion, but
provides great fault tolerance with much lower capacity overhead than full replication. Of course,
policies can be written to convert replicated objects to erasure-coded schemes as they age.
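The overhead figures follow directly from the segment counts, as this short calculation shows. It is plain arithmetic on the numbers quoted above, not GeoParity code, and the function name is ours.

```python
from fractions import Fraction


def erasure_profile(data_segments, total_segments):
    """Capacity overhead and loss tolerance of a data/total erasure coding scheme."""
    overhead = Fraction(total_segments - data_segments, data_segments)
    return {
        "overhead_pct": round(float(overhead * 100), 1),
        # any data_segments of the total suffice, so the rest may be lost
        "segment_losses_tolerated": total_segments - data_segments,
    }


print(erasure_profile(9, 12))   # {'overhead_pct': 33.3, 'segment_losses_tolerated': 3}
print(erasure_profile(10, 16))  # {'overhead_pct': 60.0, 'segment_losses_tolerated': 6}

# For comparison, keeping two full copies costs 100% overhead
# and survives the loss of only one copy.
```

So "9/12" survives the loss of any 3 segments for a third of the extra capacity that a single additional full copy would need.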
With object stores there is generally no need for low-level RAID or disk level protection and Atmos
is no exception. Upon hardware failures, replications and/or GeoParity across nodes (RAIN)
combined with built-in node auto-healing features suffice to provide the full data protection as
determined by the service level “policies” implemented for each type of data object. Atmos can
withstand the loss of any disk, node, rack, or even site.

Atmos Pre-built Hardware Configurations
EMC Atmos pre-configured hardware “appliances” consist of a rack/cabinet containing from 4 to
16 Atmos nodes in various configurations and disk capacities. Flexible configurations enable
smooth scalability, and allow for mixes of capacity and performance in and across Atmos sites. An
Atmos storage node consists of a 1GbE server front-end running the Atmos storage services
connected to one or more SAS-attached disk array enclosures (DAEs), each containing 15 7200RPM
disks of 1-3TB each. Every node runs all object storage services (the first two nodes in each site also
run the site metadata locator service that indexes which node contains which objects) supporting
tremendous horizontal system scalability.
EMC has also introduced its new Atmos G3 series for new levels of density and energy efficiency.
G3-Dense-480 is the first in the Atmos G3 series and consists of 4, 6, or 8 nodes with 480 3TB disks
in 40U.
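As a rough sizing illustration, the raw capacity of one such cabinet can be set against the protection schemes described earlier. This is back-of-the-envelope arithmetic only: it assumes every object uses the same scheme and ignores metadata, formatting and spare capacity, so it is not an EMC sizing figure.

```python
def usable_tb(raw_tb, data_parts, total_parts):
    """Usable capacity when each object occupies total_parts/data_parts of its size."""
    return raw_tb * data_parts / total_parts


raw_tb = 480 * 3  # 480 disks x 3TB each, as quoted for the G3-Dense-480

schemes = {
    "two full copies": (1, 2),
    "9/12 erasure coding": (9, 12),
    "10/16 erasure coding": (10, 16),
}
for name, (data_parts, total_parts) in schemes.items():
    print(f"{name}: {usable_tb(raw_tb, data_parts, total_parts):.0f} TB usable of {raw_tb} TB raw")
```

Under these assumptions the same 1440 TB of raw disk holds 720 TB with two full copies, 900 TB with 10/16 coding, and 1080 TB with 9/12 coding.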

TABLE: ALIGNING TOP CLOUD USE CASES WITH EMC ATMOS
     Use case                          Challenge                                        Benefits

 Medical               Over 800 million medical imaging                Vendor Neutral Archive (VNA) on Atmos:
 Archiving             procedures a year require huge                  integrates with EMR/EHR and improves
                       storage scalability; collaboration and          PACS for better patient care and
                       compliance increase complexity.                 collaboration, improves data lifecycle
                                                                       management, reduces IT costs, and
                                                                       preserves HIPAA compliance.
 File Archiving        Corporate file sharing is popular with          With EMC Sync & Share, users can securely
                       employees but syncing and sharing               share Atmos files across mobile devices,
                       are hard to manage. Employees will              Linux and Windows. GeoDrive creates a
                       frequently share files anyway over              Dropbox-like service that is secure and
                       mobile devices, leaving corporations            manageable, powered by Atmos’ fast
                       accountable for risky behavior.                 performance. Atmos policies monitor
                                                                       changes to data and provide access control,
                                                                       benefitting regulated verticals like finance.
 Archive as a          Both the enterprise and storage                 The Atmos Cloud Delivery Platform enables
 Service               service providers struggle to provide           corporations and service providers to meter
                       IT services to their respective                 capacity, bandwidth, and usage across
                       customers. Provisioning,                        tenants. Provisioning is automated by
                       maintenance, and security are all               tenant, and Atmos allows tenants to safely
                       difficult issues in traditional storage         self-manage and access their own storage.
                       offerings.
 Managed               Many MSPs suffer from narrow profit             Atmos lets MSPs efficiently offer storage as
 Service               margins because of the expense of               a service and better monetize new service
 Providers             delivering storage to customers.                offerings. MSPs can monitor capacity and
                       Managing multiple tenants, manual               usage for chargeback, reduce provisioning
                       provisioning and maintaining service            costs, and replace multiple tenant manage-
                       level agreements all cut into revenue           ment systems with a single system. Dynamic
                       and make it too expensive to add                scaling, high availability and security cost-
                       new storage services.                           effectively meet service level requirements.
 Content-Rich          Traditional storage is a poor                   Atmos provides location transparency for
 Web                   environment for Web application                 global applications and a highly mobile user
 Applications          development, which needs highly                 base. The single namespace means that
                       scalable capacity for multiple large            application developers never need to recode
                       data sets, a secure environment for             pathnames and locations, and do not need
                       test/dev and application testing in             to code for limited storage environments.
                       real-time environments.                         Self-management options make it easy for
                                                                       customers to provision their own storage,
                                                                       and REST APIs reduce application
                                                                       complexity.


Taneja Group Opinion
When on-premise archive solutions smoothly integrate with federated storage, then public and
private clouds provide extensive scalability and global availability. Yet we see too many end-users
treating the cloud as just another storage tier for low value retained data. This is a huge waste of
cloud possibilities but we understand why it happens: cloud platforms with poor performance and
delivery mechanisms can make cloud-based storage more trouble than it’s worth.
But when we talk about EMC Atmos we are not talking about a low-cost storage tier, far from it. We
are describing the heart of business innovation based on highly secure and highly accessible global
data stores. EMC’s long expertise with object-based storage has kept Centera relevant and has
extended dynamic data management to the cloud with Atmos. The Atmos-fueled cloud replaces
hierarchical file storage while allowing the secure flow of information between the data center, the
distributed cloud, and global access points. Customers profit from greatly improved application and
data delivery, and from the deep business value inherent in their data.


Copyright The TANEJA Group, Inc. 2012. All Rights Reserved.                                                  11 of 12
87 Elm Street, Suite 900  Hopkinton, MA 01748  T: 508.435.2556  F: 508.435.2557               www.tanejagroup.com
Technology in Brief



When a company is dealing with geographic reach and large, growing volumes of rich content, then
it should look to object-based storage in the cloud. We fully support EMC in its push to scale
capacity, performance, availability and management far beyond what traditional file systems are
capable of, and more massively than ever before.


NOTICE: The information and product recommendations made by Taneja Group are based upon public
information and sources and may also include personal opinions both of Taneja Group and others, all of which we
believe to be accurate and reliable. However, as market conditions change and are not within our control, the
information and recommendations are made without warranty of any kind. All product names used and
mentioned herein are the trademarks of their respective owners. Taneja Group, Inc. assumes no responsibility or
liability for any damages whatsoever (including incidental, consequential or otherwise), caused by your use of, or
reliance upon, the information and recommendations presented herein, nor for any inadvertent errors that may
appear in this document.




Copyright The TANEJA Group, Inc. 2012. All Rights Reserved.                                              12 of 12
87 Elm Street, Suite 900  Hopkinton, MA 01748  T: 508.435.2556  F: 508.435.2557           www.tanejagroup.com

Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIO
 
Citrix ready-webinar-xtremio
Citrix ready-webinar-xtremioCitrix ready-webinar-xtremio
Citrix ready-webinar-xtremio
 
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES
 
EMC with Mirantis Openstack
EMC with Mirantis OpenstackEMC with Mirantis Openstack
EMC with Mirantis Openstack
 
Modern infrastructure for business data lake
Modern infrastructure for business data lakeModern infrastructure for business data lake
Modern infrastructure for business data lake
 
Force Cyber Criminals to Shop Elsewhere
Force Cyber Criminals to Shop ElsewhereForce Cyber Criminals to Shop Elsewhere
Force Cyber Criminals to Shop Elsewhere
 
Pivotal : Moments in Container History
Pivotal : Moments in Container History Pivotal : Moments in Container History
Pivotal : Moments in Container History
 
Data Lake Protection - A Technical Review
Data Lake Protection - A Technical ReviewData Lake Protection - A Technical Review
Data Lake Protection - A Technical Review
 
Mobile E-commerce: Friend or Foe
Mobile E-commerce: Friend or FoeMobile E-commerce: Friend or Foe
Mobile E-commerce: Friend or Foe
 
Virtualization Myths Infographic
Virtualization Myths Infographic Virtualization Myths Infographic
Virtualization Myths Infographic
 
Intelligence-Driven GRC for Security
Intelligence-Driven GRC for SecurityIntelligence-Driven GRC for Security
Intelligence-Driven GRC for Security
 
The Trust Paradox: Access Management and Trust in an Insecure Age
The Trust Paradox: Access Management and Trust in an Insecure AgeThe Trust Paradox: Access Management and Trust in an Insecure Age
The Trust Paradox: Access Management and Trust in an Insecure Age
 
EMC Technology Day - SRM University 2015
EMC Technology Day - SRM University 2015EMC Technology Day - SRM University 2015
EMC Technology Day - SRM University 2015
 
EMC Academic Summit 2015
EMC Academic Summit 2015EMC Academic Summit 2015
EMC Academic Summit 2015
 
Data Science and Big Data Analytics Book from EMC Education Services
Data Science and Big Data Analytics Book from EMC Education ServicesData Science and Big Data Analytics Book from EMC Education Services
Data Science and Big Data Analytics Book from EMC Education Services
 
Using EMC Symmetrix Storage in VMware vSphere Environments
Using EMC Symmetrix Storage in VMware vSphere EnvironmentsUsing EMC Symmetrix Storage in VMware vSphere Environments
Using EMC Symmetrix Storage in VMware vSphere Environments
 
Using EMC VNX storage with VMware vSphereTechBook
Using EMC VNX storage with VMware vSphereTechBookUsing EMC VNX storage with VMware vSphereTechBook
Using EMC VNX storage with VMware vSphereTechBook
 

Recently uploaded

Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKJago de Vreede
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 

Recently uploaded (20)

Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 

The Object Evolution - EMC Object-Based Storage for Active Archiving and Application Development

  • 1. TECHNOLOGY IN BRIEF

THE OBJECT EVOLUTION
EMC OBJECT-BASED STORAGE FOR ACTIVE ARCHIVING AND APPLICATION DEVELOPMENT
NOVEMBER 2012

A few years ago, object-based storage made a huge splash on-premise with the promise of meaningful data relationships, information accessibility and strong compliance. It remains an important component for information management based on compliance and single-tenant architectures. However, the evolution of object-based storage has big implications for the cloud and unstructured data: new approaches to active archiving, web/mobile application development and a changing model for cloud storage service providers.

Object storage is optimal for the web. It has a very different architecture from file systems, which are frankly overkill for most cloud storage. On-premise can be a different story; having data close to hand under single-tenant access control is right for some data storage. But on-premise stored data requires that the enterprise maintain a primary data center, a cold data center for DR, replication, continuous data protection, and so on. Given the right set of needs this is a fine trade-off of course and we certainly do not counsel people to get rid of their internal data centers and redundant systems.

However, cloud-based object architecture offers big benefits for storing unstructured data for active archiving, global access to data, fast application development and much lower cost compared to the high computing and data protection costs of on-premise NAS. EMC has engineered Atmos to provide these capabilities and many more as a massively scalable, distributed cloud-based system.

In this Technology in Brief we will examine the fast-changing world of archiving and development on the web, and how object-based storage is the best way to go for these monumental tasks.
When Object Trumps File

The go-to architecture for unstructured data has traditionally been an application-centric system containing the operating system, the application, and a NAS filer using hierarchical file architecture. This infrastructure works acceptably well in a slow-growth, consistent workload setting, although even then it is far too easy to add complexity along with additional systems and filers.

However, business needs have evolved far beyond this sleepy storage model. Unstructured data now comprises a massive portion of large data growth, and hierarchical file systems are difficult to optimize and scale. For example, file system-based storage requires near-constant provisioning. As storage requests grow (which they inevitably do), IT administrators must manually provision storage to meet the expanded requirements. Meanwhile, large volume and spiky workloads make provisioning both “up” and “down” an expensive and time-consuming proposition. And difficult provisioning is hardly the only problem: siloed data protection with individual backup, replication and archiving applications steadily raises OPEX.

Scaling is an issue as well. Large critical big data applications may warrant scale-out or scale-up file systems (which are challenges in and of themselves).

Copyright The TANEJA Group, Inc. 2012. All Rights Reserved. 1 of 12
87 Elm Street, Suite 900 • Hopkinton, MA 01748 • T: 508.435.2556 • F: 508.435.2557 www.tanejagroup.com

  • 2. Most applications do not warrant this architecture, and instead reside on poorly scalable systems. The number of these systems grows as applications come online, making it even harder for IT and application owners to administer them and for users to get the value they need from the application. This already difficult scenario gets even worse when NAS storage is used for what is essentially a cloud use case, such as extending existing assets over the cloud.

Figure: Traditional NAS infrastructure

In contrast to hierarchical file system-based storage silos, object-based storage opens up a whole new range of dynamic functionality. Object-based storage assigns unique object IDs to access data across all federated locations. This goes a long way towards eliminating traditional, time-consuming storage management tasks like LUN creation and RAID groups.

Active archives and applications needing fast global access particularly benefit from global namespaces and location transparency. The flat, universal namespace allows global access to stored content from anywhere the distributed application runs. Applications can also efficiently associate metadata with stored objects without using a dedicated database. Sharing vast storage resources means application administrators do not need to modify application files. Object-based storage usually has elements of file systems in order to handle processes like file archiving, but it is not founded on that architecture and its drawbacks.

Object-based storage originally developed as a type of specialized NAS storage where the hierarchical system was replaced with an object-oriented system that made file storage far more secure and scalable. One of its most popular incarnations is still going strong today: Content-Addressable Storage (CAS). A subset of object-oriented storage, CAS derives each object's ID from a hash of its content, which ensures there is only one ID for any object. When the CAS object is retrieved, it can be hashed again and checked against its ID to verify identity. CAS de-dupes at the object level for copy control.
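The hash-and-verify behavior of CAS described above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not how any particular CAS product is implemented: the class name `ContentAddressableStore`, its methods, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib


class ContentAddressableStore:
    """Toy in-memory CAS (hypothetical): the object ID is the SHA-256 of the content."""

    def __init__(self):
        self._objects = {}  # object ID (hex digest) -> content bytes

    def put(self, data: bytes) -> str:
        object_id = hashlib.sha256(data).hexdigest()
        # Identical content always hashes to the same ID, so a duplicate
        # write does not create a second copy (object-level de-duplication).
        self._objects.setdefault(object_id, data)
        return object_id

    def get(self, object_id: str) -> bytes:
        data = self._objects[object_id]
        # Hash the content again on retrieval and compare it with the ID.
        if hashlib.sha256(data).hexdigest() != object_id:
            raise ValueError(f"object {object_id} failed verification")
        return data

    def __len__(self) -> int:
        return len(self._objects)


store = ContentAddressableStore()
first = store.put(b"scan-0001")
second = store.put(b"scan-0001")  # same content written a second time
```

Here `first` and `second` are the same ID and the store holds a single copy, which is the copy-control property the text attributes to CAS.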
  • 3. TABLE: CONTRASTING FILE SYSTEMS WITH OBJECT STORAGE

Characteristic | File | Object
Metadata | File systems implement a centralized file layer metadata service that tracks directory structures, permissions, and on-disk locations of files. All file requests must access metadata first for permission and file information. | Object metadata is stored along with the object data to avoid metadata service bottlenecks. This ID may be used to also uniquely verify and validate the data being stored.
Namespace | File systems have built-in namespace constraints for files and directories they can store and manage. Hierarchical directory structures can become unwieldy, performing poorly at navigating large numbers of users or files. | Object storage provides a single flat namespace for objects. Replacing path and filenames with object identifiers makes the address space practically infinite with very fast performance for users and applications.
Interaction | File systems are designed to offer in-place editing and updating of files using sophisticated, yet highly complex, locking and synchronization mechanisms. These methods make it difficult to distribute or extend file systems across multiple locations. | Objects are inherently immutable once stored under a unique ID, and can be easily replicated and accessed globally. Programming for object storage leads to simpler, supportable, and more reliable programs.
Cloud Applications | File systems present a real challenge for cloud-based archival management and mobile application delivery. Poor scalability, lagging performance, and complex application development make traditional file systems a poor choice for compelling new cloud usages. | Object stores are simple, clean and quick to access. Since objects are easily distributed, replicated, and globally accessible in the cloud, they are ideal for active global archives and distributed mobile applications.
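To make the "Object" column of the table more concrete, here is a small Python sketch of a flat namespace in which metadata is kept alongside each object. It is illustrative only: `FlatObjectStore`, its methods, and the use of random UUIDs as object IDs are assumptions for the example, not a description of any vendor's implementation.

```python
import uuid


class FlatObjectStore:
    """Toy flat-namespace store (hypothetical): one dictionary, no directories."""

    def __init__(self):
        self._objects = {}  # object ID -> (data, metadata)

    def put(self, data: bytes, **metadata) -> str:
        # Every write creates a new object under a fresh ID;
        # stored content is never edited in place.
        object_id = uuid.uuid4().hex
        self._objects[object_id] = (bytes(data), dict(metadata))
        return object_id

    def get(self, object_id: str):
        return self._objects[object_id]

    def find(self, **criteria):
        # Metadata travels with each object, so a lookup by attribute
        # needs no separate metadata service or database.
        return [
            oid
            for oid, (_, meta) in self._objects.items()
            if all(meta.get(key) == value for key, value in criteria.items())
        ]


store = FlatObjectStore()
xray = store.put(b"...image bytes...", kind="xray", patient="p-17")
memo = store.put(b"...text...", kind="memo")
```

A call such as `store.find(kind="xray")` returns the ID of the first object only. The client never supplies a path or a storage location, just an ID or a metadata query.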
Object-based storage, both on-premise and in the cloud, requires certain key capabilities. On-premise object storage has great benefits for local file storage, including multiple application access, massive scaling, high availability and, in some architectures, information governance as well.

      • Multiple application access. Applications simultaneously leverage the same centralized object-based storage infrastructure. This enables local object-based storage to execute application-specific archiving management attributes for a complete chain of information custody.
      • Massive scaling. Massive scaling is problematic with file-based archive solutions. As the file system reaches its maximum capacity, administrators must expand the entire system’s operating system, file system and application in order to scale the archive. By contrast, object-based storage can expand in an open fashion into multiple petabytes thanks to its flat address space.

  • 4.
      • High availability. Object storage often archives data that has heavy retention and government requirements. In this environment, availability of five nines (99.999%) or higher is a necessity. Mirroring and parity help to protect availability; other beneficial features include self-healing, detecting and fixing soft corruptions in the background, and addressing hardware failures before they impact data availability.
      • Information governance. A subset of object-based storage, Content-Addressable Storage (CAS) is purpose-built for long-term defensible retention of fixed files and data. As opposed to other archival storage methods like tape or monolithic “tar” files that bundle data up and/or move it offline, CAS stores data as objects that can be strictly and individually managed for governance and compliance and yet remain actively accessible on-line.

Best Practices: Object and the Cloud

We strongly support on-premise object storage such as CAS for local space savings, performance and information governance. However, we find that object storage is roaring to life in the cloud, where cloud-based active archiving and application development require highly distributed and single namespace storage for unstructured content. These critical use cases benefit far more from object-based storage than they do from traditional file systems. Let’s look at best-practice architectural features for object-based storage in the cloud.

DATA AND METADATA

When data is stored as an object, a unique object identifier is created within a single, universal global namespace. The object ID is retained by the client application and used to subsequently retrieve that object. Objects can effectively live anywhere in the cloud-wide system without the storage client needing to know about actual data locations, file system structures or LUN details. This provides complete location transparency, which reduces manual storage management and inherently supports globally distributed access by web and mobile applications.

Because of the location transparency provided by the object storage layer, objects can be automatically load-balanced across nodes, and replicated within and across sites without disrupting applications or users. Wide data distribution and federation can be managed through systematic policies to meet various service level goals for access, high availability, protection, cost and performance.

The object layer abstraction also provides a great benefit to applications that previously might have had to be intimately storage aware to avoid running out of space, or had to otherwise actively manage data locations. Because applications written to leverage object storage don’t have to embed rules or code specific knowledge of storage infrastructure details, they avoid having to be re-written or re-architected for “changing” storage assignments as users spread, features expand, and data sets grow.

MULTI-TENANCY

Secure multi-tenancy is a key requirement of cloud object storage, which should support two levels of multi-tenancy: tenants and sub-tenants. Tenants are top-level entities, each with its own access points, security controls and master storage policies. Tenants share nothing with other tenants and are fully isolated. Every node is assigned to a specific tenant; tenants do not share nodes, and therefore each tenant has its own dedicated access points and storage. Within a large company, a tenant could be set up for independently managed divisions or subsidiaries. In a service provider implementation, the tenant might be mapped to a broad storage service offering.
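The two-level scheme named above (fully isolated tenants that own dedicated nodes, each containing sub-tenants with their own object namespaces) can be sketched as a simple data model. All class, method, tenant and node names below are hypothetical and chosen only for illustration; a real system would add authentication, policies and persistence.

```python
class SubTenant:
    """A distinct storage environment inside a tenant (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.objects = {}  # object ID -> data, private to this sub-tenant


class Tenant:
    """Top-level entity that owns a dedicated set of nodes."""

    def __init__(self, name, nodes):
        self.name = name
        self.nodes = frozenset(nodes)
        self.subtenants = {}

    def add_subtenant(self, name):
        if name in self.subtenants:
            raise ValueError(f"sub-tenant {name} already exists")
        self.subtenants[name] = SubTenant(name)
        return self.subtenants[name]


class ObjectCloud:
    """Registry that enforces the rule that tenants never share nodes."""

    def __init__(self):
        self.tenants = {}
        self._node_owner = {}  # node name -> owning tenant name

    def add_tenant(self, name, nodes):
        nodes = frozenset(nodes)
        for node in nodes:
            if node in self._node_owner:
                raise ValueError(
                    f"node {node} already belongs to {self._node_owner[node]}"
                )
        tenant = Tenant(name, nodes)
        for node in nodes:
            self._node_owner[node] = name
        self.tenants[name] = tenant
        return tenant


cloud = ObjectCloud()
emea = cloud.add_tenant("emea-division", ["node-1", "node-2"])
finance = emea.add_subtenant("finance")
legal = emea.add_subtenant("legal")
finance.objects["obj-1"] = b"ledger"
```

In this sketch an attempt to register a second tenant on `node-2` is rejected, and an object stored by `finance` is not visible in the namespace of `legal`.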
  • 5. Sub-tenants are then created within each tenant with security controls and defined management policies assigned by the tenant. Each sub-tenancy defines a distinct storage environment with isolated management for its own users, object namespace, and defined shares. A sub-tenant within a company might correspond to a department, while a storage provider's sub-tenant might track to a specific client account.

This highly functional multi-tenancy capability makes it easy to create private sandboxes or implement a global content delivery scheme. With some planning, this scheme could enable large corporations to aggregate “big data” distributed across the enterprise.

ACCESS FROM ANYWHERE

Because a cloud object storage service presents a flat global namespace, an object can be accessed through any site (although for performance, policies might strive to replicate objects to sites closer to where they will be read). In addition, object storage for the cloud must present a broad range of access methods including both web services and traditional file services.

REST (and SOAP) web services are key APIs. REST is the most common cloud storage access method for browser and custom mobile applications. REST, an architectural style normally used over HTTP, was designed for web-style remote access to “resources”, and is an ideal match to object storage, where each object can be easily treated as a REST resource.

Figure: Typical cloud-based object storage deployment

POLICY DRIVEN MANAGEMENT

A key benefit of object storage is the ability to use metadata to drive automatic data management policies. Policies should support service levels, and should be triggered when data objects are created, when objects reach certain ages, or upon metadata updates. Policies can control data protection operations including the number, type and target locations for replicas; inherent storage features for striping, compression and de-duplication; retention locks and automatic deletion; and shifting objects into different policies over time.
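As a rough illustration of metadata-driven policies, the sketch below evaluates a list of rules against an object's metadata and age. Everything in it is an assumption made for the example: the `StoredObject` and `Policy` types, the policy names, the 90-day threshold and the replica count are hypothetical and do not come from any product.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StoredObject:
    object_id: str
    age_days: int
    metadata: dict
    replicas: int = 1
    compressed: bool = False


@dataclass
class Policy:
    name: str
    matches: Callable[[StoredObject], bool]  # when the policy applies
    apply: Callable[[StoredObject], None]    # what the policy changes


def keep_three_replicas(obj: StoredObject) -> None:
    obj.replicas = max(obj.replicas, 3)


def compress(obj: StoredObject) -> None:
    obj.compressed = True


# Hypothetical rules: one keyed on user metadata, one keyed on object age.
POLICIES = [
    Policy(
        "compliance-replication",
        lambda o: o.metadata.get("class") == "compliance",
        keep_three_replicas,
    ),
    Policy("compress-after-90-days", lambda o: o.age_days > 90, compress),
]


def evaluate(obj: StoredObject, policies=POLICIES) -> list:
    """Apply every matching policy and return the names of those applied."""
    applied = []
    for policy in policies:
        if policy.matches(obj):
            policy.apply(obj)
            applied.append(policy.name)
    return applied


record = StoredObject("obj-42", age_days=120, metadata={"class": "compliance"})
applied = evaluate(record)
```

In a real system `evaluate` would be run at the trigger points the text lists (object creation, age thresholds, metadata updates); here it is simply called once by hand.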
  • 6. Technology in Brief The policy mechanism should be highly flexible, targeting policies to any group of objects based on both system and user defined metadata. Policies can be used to build service levels by defining the amount of replication, implement archive rules for compliance, and optimize capacity and performance as items age. Primary Object Use Cases in the Cloud Cloud-based archiving, particularly medical and file archiving, forms the primary use case for object-based storage. Web application development is surging forward, and Archive-as-a-Service and its providers round out the fastest-growing use cases. PRIMARY USE CASE: ACTIVE ARCHIVES Archived information is playing a more strategic role in workflows and business processes. On- premise archiving is essentially static and used to reduce storage costs, improve operational efficiency, retention and compliance, and enable the business to use archived data to make better business decisions. Cloud-based archiving retains elements of these features but adds new dynamic ones: instant access from any device, archive as a service and federating to private or public cloud. Atmos provides both the static and dynamic features that massive active archives require.  Federate to public or private clouds. Federation enables companies to treat on-premise and cloud object storage as a single efficient infrastructure. Companies may pool distributed storage assets including data, applications and policies to take full advantage of the cloud’s massive scalability and global access features. Federation also lowers cost and risk: application workloads run on cloud resources with a low execution cost, and if a cloud-based storage system goes down the distributed workload remains protected. Federation extends internal policies to cloud-based storage environments by applying existing policies and settings to cloud-based storage.  Use metadata to drive business and storage decisions. 
We expect the use of metadata to expand quickly to directly feed business exploitation processes, as well as support more automatic and intelligent storage management decisions. A singly managed distributed system that maintains directly accessible object metadata yields rich support for business decisions. Object-based storage also enables IT to automate information lifecycle management across the entire distributed data store, not just by storage silo. Policies should be flexible enough to be set at the object, tenant or system levels, to automate archive decisions, and to set and manage retention, expiration, and disposition.
• Multi-tenancy for secure shared storage. Multiple applications can safely co-exist as separate tenants. Isolation by tenant protects security while enabling the sharing of system-wide resources and capacity. Multi-tenancy is also efficient because tenants subscribe to a highly scalable pool of storage, which can flexibly scale up and down on demand.
• Massive scalability. Unstructured data storage is growing so fast that traditional storage systems are straining purchase, maintenance and management resources to the brink. Distributed object-based architecture yields near-limitless scale. Object storage also allows for automatic load balancing whenever new objects are stored, which protects high performance across the entire distributed system.
• Multi-site active/active. Multi-site active/active architecture is an important component of object-based storage, especially in the cloud. Cloud object storage systems span multiple sites and provide for multi-site direct access to objects through both synchronous and asynchronous replication. This model replicates between multiple storage nodes and sites, which not only increases distributed availability and content distribution, but also supports disaster recovery.
• Archive-as-a-service. The most agile and flexible way for IT to deliver archive services is with the cloud model of self-service portals. This model manages and meters utilization and bandwidth and supports third-party chargeback. Within an enterprise this flexibility and instant storage relieves users of the temptation of using commercial cloud services simply because they can get the storage they need fast – even though security might not be in place. This approach also enables ISVs and MSPs to extend archive requirements and offerings.
• Reduce manual tasks and provisioning across multiple archives. Cloud-based archives must be easy to set up and, for reliability and consistency, must not require long or deep manual configuration. They should also automate underlying complexities including security, audit, retention, performance, and capacity growth.

Atmos provides these features and more, relieving the cloud administrator of enormous burdens. Distributed systems may be managed as a single entity with policies to automate hundreds of management and data protection tasks. And perhaps most important of all, object-based systems like Atmos offer massive scalability of capacity and performance thanks to their unique architecture.

FAST-GROWING USE CASE: WEB AND MOBILE APPLICATION DEVELOPMENT

Web and mobile application development using unstructured data also has driving needs that object-based cloud storage meets. Web application development requires quick access to storage resources, test/dev environments capable of storing multiple copies of large data sets, and the ability to test web applications in real-time online environments. These requirements are understandably hard to achieve using traditional file-based storage systems. Applications written to leverage object storage won’t need to be rewritten or even taken offline as the object storage seamlessly (or elastically) expands over time.
Atmos provides the key capabilities that web application development requires, including location transparency, self-managing storage and REST APIs.
• Enable instant access to data from any device. Web and mobile applications are inherently geographically distributed, yet file systems are usually limited in both effective access points (location) and the number of files that they can manage. Object-based storage abstracts its storage from physical locations, providing a secure access point in place of device-specific mount points. Web services APIs and file-based access allow approved users to easily access their archives from computers and a broad array of mobile devices. Integrated web services over REST and SOAP are key to this instant access. Other support components are file-based access (CIFS / NFS / IFS / CAS), and expanded access via ISV applications.
• Self-managing storage. In traditional development, applications have often been hard-coded to specific data stores through pointers to identified LUNs or file system navigation paths. In contrast, object storage provides a clean mapping from application to data through a simple REST API, with an immutable unique object ID pointing to the stored object. This goes a long way towards eliminating traditional, time-consuming storage management tasks like creating LUNs and RAID groups. Cloud owners may choose to extend self-management options to customers, making it simple for users to grow storage capacity on demand.
• Broad API support. Cloud object storage is basically shared storage accessed through web-based services. Atmos’ architecture supports rapid web application development with a broad API set including REST and S3. The REST API leverages HTTP operations on objects that are directly addressed, which reduces code complexity and provides the kind of easy, automatically distributed, protected, persistent storage the developer needs. In addition to the REST API, EMC Atmos also natively supports the Amazon S3 API.
This provides customers with the ability to simply point S3 applications to Atmos and seamlessly migrate their applications to any of the more than 40 Atmos-powered public clouds around the globe.
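To make the "HTTP operations on directly addressed objects" idea above concrete, here is a minimal in-memory sketch. It is not the Atmos REST API: the `/objects` path, the status codes chosen, and the `ObjectStore` class are hypothetical names used only for illustration. The point it tries to show is that the application holds an opaque object ID instead of a LUN or a file path.

```python
import uuid


class ObjectStore:
    """Toy in-memory store; objects are addressed by an opaque ID, not a path.

    Hypothetical illustration only; this is not the Atmos API.
    """

    def __init__(self):
        self._objects = {}

    def handle(self, method, path, body=None):
        """Dispatch an HTTP-like request and return (status, payload)."""
        if method == "POST" and path == "/objects":
            # Create: the store, not the caller, chooses the object ID.
            object_id = uuid.uuid4().hex
            self._objects[object_id] = bytes(body or b"")
            return 201, "/objects/" + object_id
        if path.startswith("/objects/"):
            object_id = path[len("/objects/"):]
            if object_id not in self._objects:
                return 404, None
            if method == "GET":
                return 200, self._objects[object_id]
            if method == "DELETE":
                del self._objects[object_id]
                return 204, None
        return 405, None


store = ObjectStore()
created_status, location = store.handle("POST", "/objects", b"scan-0001")
read_status, data = store.handle("GET", location)
deleted_status, _ = store.handle("DELETE", location)
missing_status, _ = store.handle("GET", location)
```

Because the caller only ever keeps the returned location, the store is free to place, move or replicate the bytes without the application being recoded.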
EMC and Object-Based Storage

EMC first introduced Centera CAS for archiving in 2002. Centera offers five-nines data availability: its redundant array of independent nodes (RAIN) is interconnected via cube switches, protecting data across independent nodes in a cube. Mirroring and parity provide additional protection and availability. Centera’s CAS architecture keeps the retained data from being compromised or deleted before the end of its retention period.

Centera assigns unique hash-code identifiers specific to each unique object including content elements, metadata, and data/metadata relationships. This inextricably links content elements with their metadata, which are stored within a flat address space, with no need for a separate database. This architecture ensures authenticity of the archived objects. Centera abstracts the unique objects from their generating applications and operating systems, which enables Centera to flexibly act as the single, highly optimized data store for previously siloed archives.

Centera retains single instances of archived objects. In the case of multiple users of the same file, such as a PowerPoint file sent over a distribution list, Centera retains metadata with information about each user’s interaction with the file, but points to the single instance of the object. By cutting down on data copies, this results in dramatic reductions in the quantity of archive storage.

Centera searches using metadata, rather than opening up the content objects on application-specific storage. This results in much faster and more efficient searches without using application cycles. This is possible because content and metadata stored on Centera are application, file and operating system independent, and Centera offers a search engine right in its repository.

Centera’s content-based addressing integrates directly with application environments via APIs, with no need for kernel level dependencies.
This means that multiple applications can simultaneously use Centera, and that specific archiving management attributes, such as data aging and data protection, can be executed per application. These capabilities create a complete chain of custody once the data leaves the primary application to be archived on Centera.

Media independence also leverages Centera’s application support. Centera objects are independent of specific storage media and protocols, which means that the storage system can migrate to new storage media over time without disturbing the integrity of the archived objects. For long term disk-based archiving, this represents significant risk mitigation and investment protection.

Centera architecture is highly scalable and self-managing. Traditional file systems scale based on the amount of stored data versus remaining available address space, which may not be much. As the file system reaches its maximum capacity, administrators must expand the entire file system including operating system, file system, and application in order to scale the archive. In contrast, Centera expands to petabyte-scale capacities thanks to its flat address space. It also leverages its architecture to distribute management controls across the entire archive infrastructure. For example, if a Centera disk or node fails, the archive cluster knows how to self-heal without manual intervention. This distributed management structure extends to cover the deployment, scaling, recovery and protection of all the archival objects being stored by Centera.

Centera optimizes archiving, information governance and compliance. Users may choose from 300 native, integrated archiving applications to manage archival needs for email, files, medical imaging, content management, video, voice, and more on the single Centera archiving platform. In addition, Centera offers Compliance Edition Plus for compliance and eDiscovery, and Governance Edition for data retention management.
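The content addressing, single-instance retention and metadata-only search described in this section can be sketched roughly as follows. This is a simplified illustration, not Centera's implementation: the text does not say which hash Centera uses, so SHA-256 is an assumption here, and the sketch hashes only the content, whereas Centera's identifiers also cover metadata and data/metadata relationships. The class and method names are hypothetical.

```python
import hashlib


class ContentAddressedArchive:
    """Toy content-addressed store with single-instance retention (hypothetical)."""

    def __init__(self):
        self._blobs = {}        # content address -> bytes, stored exactly once
        self._references = {}   # content address -> list of per-user metadata

    def store(self, content, user_metadata):
        # The address is derived from the content itself (SHA-256 is an assumption).
        address = hashlib.sha256(content).hexdigest()
        self._blobs.setdefault(address, content)   # keep a single instance
        self._references.setdefault(address, []).append(dict(user_metadata))
        return address

    def retrieve(self, address):
        content = self._blobs[address]
        # Re-hashing on read shows how the address doubles as an authenticity check.
        if hashlib.sha256(content).hexdigest() != address:
            raise ValueError("stored content no longer matches its address")
        return content

    def find(self, key, value):
        # Search walks metadata only; content objects are never opened.
        return [address for address, refs in self._references.items()
                if any(ref.get(key) == value for ref in refs)]

    def stored_bytes(self):
        return sum(len(blob) for blob in self._blobs.values())


archive = ContentAddressedArchive()
deck = b"quarterly-review.ppt contents"
first = archive.store(deck, {"user": "alice", "action": "sent"})
second = archive.store(deck, {"user": "bob", "action": "received"})
```

Both calls return the same address, and the archive holds one copy of the deck plus two metadata records, which is the storage reduction the single-instance paragraph describes.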
Copyright The TANEJA Group, Inc. 2012. All Rights Reserved. 8 of 12 87 Elm Street, Suite 900  Hopkinton, MA 01748  T: 508.435.2556  F: 508.435.2557 www.tanejagroup.com
Centera Compliance Edition Plus captures and preserves original content, protecting data and proving chain of custody for legal eDiscovery and litigation. Retention classes assign a logical reference to each electronic record object; policies enforce data retention and safe disposition. Centera Governance Edition enforces internal policies for data retention and disposition. Policies may be organizational or application-specific, which improves corporate accountability, reduces the cost of eDiscovery and compliance, and proves the integrity of governance controls.

To the Cloud: Atmos Architecture

EMC’s Atmos supports the same CAS API as Centera for seamless migration, and brings object storage into the cloud with massive scalability and geographic federation supported with multi-tenancy, cloud provisioning and global access features. While Atmos is readily leveraged to extend active global archives, it also offers an exceptional platform for web and mobile application development. Atmos even enables new opportunities for global “big” data aggregation and distribution.

Atmos is at heart a software storage system for building private and public cloud storage. Atmos implementations are available from EMC either already integrated into pre-packaged physical building blocks or as a virtual machine solution for VMware vSphere that can leverage other EMC or third-party storage resources. Additionally, there is a rich ecosystem of service providers providing Atmos as cloud Storage-as-a-Service directly. Any and all of these options can be federated together as needed within and across a given organization.

EMC uses REST and SOAP web services, and has also implemented file services on top of Atmos to serve underlying objects through the lens of either an NFS or CIFS file server.
When NFS or CIFS shares are defined, they are assigned to specific Atmos nodes (or dedicated pairs for HA) and utilize the Atmos node’s inherent Linux capabilities (leveraging an Installable File System with the FUSE extension). Layering a file system over Atmos imposes some constraints regarding universal access, but also enables both traditional and transitional applications and file system type usage.

EMC Atmos Windows and Linux users can also leverage the EMC GeoDrive add-on that installs on a single user workstation or server to provide remote virtual NFS/CIFS style access (over REST) to Atmos object storage. GeoDrive supports local caching of files for offline use and eventual synchronization on reconnection. One of the major benefits of GeoDrive is enabling a user to access large amounts of protected storage from anywhere. It can also be used for the disaster recovery of files pushed or mirrored into Atmos.

Atmos technically maintains a given piece of data as an object with associated metadata that includes the object ID, system and user-defined metadata fields and the internal object layout information (and parent/child information for objects saved through a file system “namespace” interface). Applications and users can store arbitrary metadata with each object that can be leveraged by group management policies.

Policies can be created at the tenant level as a design scheme to provide various service levels of performance, access, and data protection, based on some awareness of the multi-site architecture of the cloud implementation. They are then assigned to subtenants, who need not be aware of the underlying implementation, to apply as target service levels to their objects. For example, the power to explicitly enforce compression of image files (e.g. JPEGs) after a number of days would present a significant capacity optimization for a web-based application dealing with millions of images.
In addition to supporting compliance and retention policies, metadata can be used to drive automated file distribution, access control and data protection activities optimizing for the appropriate level of data resiliency, performance and availability. For most applications, thoughtful use of user metadata can remove any need to implement a separate management tracking database for stored objects.
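The metadata-driven policy idea above, including the example of compressing image files after a number of days, can be sketched as below. This is an illustration under stated assumptions rather than Atmos policy syntax: `StoredObject`, `CompressionPolicy`, the `type` metadata key and the 30-day threshold are all hypothetical, and setting a flag stands in for the actual compression step.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class StoredObject:
    object_id: str
    created: date
    metadata: dict = field(default_factory=dict)   # user-defined metadata
    compressed: bool = False


@dataclass
class CompressionPolicy:
    match: dict          # metadata key/value pairs an object must carry
    min_age_days: int    # only objects at least this old are affected

    def applies_to(self, obj, today):
        old_enough = (today - obj.created) >= timedelta(days=self.min_age_days)
        matches = all(obj.metadata.get(k) == v for k, v in self.match.items())
        return old_enough and matches and not obj.compressed


def apply_policy(policy, objects, today):
    """Apply the policy to every qualifying object; return the IDs it changed."""
    changed = []
    for obj in objects:
        if policy.applies_to(obj, today):
            obj.compressed = True   # stand-in for the real compression step
            changed.append(obj.object_id)
    return changed


today = date(2012, 11, 30)
objects = [
    StoredObject("a", date(2012, 9, 1), {"type": "jpeg"}),
    StoredObject("b", date(2012, 11, 25), {"type": "jpeg"}),
    StoredObject("c", date(2012, 9, 1), {"type": "pdf"}),
]
policy = CompressionPolicy(match={"type": "jpeg"}, min_age_days=30)
changed = apply_policy(policy, objects, today)
```

Only object "a" is both a JPEG and older than 30 days, so it is the only one the policy touches. The selection relies entirely on metadata stored with each object, so no separate tracking database is involved.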
Replication is controlled by automated policies which can mirror data objects at many points in an object’s lifecycle both within and across multiple sites. Within a data center site, replication might for example be set to happen synchronously upon ingestion, while replication between sites might be set asynchronously and launched with an arbitrary delay to allow for data settling. Replications can be targeted to specific locations, or abstractly sent to “other” sites as the system decides. For performance and availability, replicas are all active for read access (objects are inherently immutable so there is no issue with having to manage distributed locking mechanisms). Because Atmos is “multi-site active/active”, any site can fulfill new object write requests when the local primary site is unavailable.

In addition to full replication, EMC also provides an erasure coding option called GeoParity. Instead of keeping two or more full 100% copies, “9/12” erasure coding stores an “expanded” object, broken up into 12 segments, that contains only 33% additional encoded “redundant” data. The original data can be reconstructed dynamically from any 9 of the segments. These segments are cleverly distributed so that the object can survive (and even be accessed during) multiple failures. For greater protection there is also a “10/16” coding with a 60% capacity overhead. Erasure coding does impact access performance, especially at ingestion, but provides great fault tolerance with much lower capacity overhead than full replication. Of course, policies can be written to convert replicated objects to erasure coded schemes as they age appropriately.

With object stores there is generally no need for low-level RAID or disk level protection and Atmos is no exception.
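The capacity figures quoted for GeoParity follow directly from the segment counts, and a short calculation makes the arithmetic explicit. The sketch assumes, as the text states for 9/12, that the data can be rebuilt from any k of the n segments, and it extends that assumption to 10/16. The function names are illustrative only.

```python
from fractions import Fraction


def erasure_overhead(data_segments, total_segments):
    """Extra capacity, as a fraction of the original size, for k-of-n coding."""
    return Fraction(total_segments - data_segments, data_segments)


def tolerated_losses(data_segments, total_segments):
    """Segments that can be lost while any k of n still rebuild the object."""
    return total_segments - data_segments


def replication_overhead(copies):
    """Extra capacity when keeping this many full copies of an object."""
    return Fraction(copies - 1, 1)


for k, n in [(9, 12), (10, 16)]:
    print(f"{k}/{n}: overhead {float(erasure_overhead(k, n)):.0%}, "
          f"survives loss of {tolerated_losses(k, n)} segments")
print(f"2 full copies: overhead {float(replication_overhead(2)):.0%}")
```

For 9/12 the overhead is 3/9, about 33%, and for 10/16 it is 6/10, or 60%, matching the figures in the text. Two full copies cost 100% extra capacity by comparison.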
Upon hardware failures, replications and/or GeoParity across nodes (RAIN) combined with built-in node auto-healing features suffice to provide the full data protection as determined by the service level “policies” implemented for each type of data object. Atmos can withstand the loss of any disk, node, rack, or even site.

Atmos Pre-built Hardware Configurations

EMC Atmos pre-configured hardware “appliances” consist of a rack/cabinet containing from 4 to 16 Atmos nodes in various configurations and disk capacities. Flexible configurations enable smooth scalability, and allow for mixes of capacity and performance in and across Atmos sites. An Atmos storage node consists of a 1GbE server front-end running the Atmos storage services, connected to one or more SAS-attached disk array enclosures (DAE), each containing 15 7200RPM disks of 1 to 3TB. Every node runs all object storage services (the first two nodes in each site also run the site metadata locator service that indexes which node contains which objects), supporting tremendous horizontal system scalability.

EMC has also introduced its new Atmos G3 series for new levels of density and energy efficiency. G3-Dense-480 is the first in the Atmos G3 series and consists of 4, 6, or 8 nodes with 480 disks in 40U, using 3TB drives.

TABLE: ALIGNING TOP CLOUD USE CASES WITH EMC ATMOS

Use case: Medical Archiving
Challenge: Over 800 million medical imaging procedures a year require huge storage scalability; collaboration and compliance increase complexity.
Benefits: Vendor Neutral Archive (VNA) on Atmos: integrates with EMR/EHR and improves PACS for better patient care and collaboration, improves data lifecycle management, reduces IT costs, and preserves HIPAA compliance.

Use case: File Archiving
Challenge: Corporate file sharing is popular with employees but syncing and sharing are hard to manage. Employees will frequently share files anyway over mobile devices, leaving corporations accountable for risky behavior.
Benefits: With EMC Sync & Share, users can securely share Atmos files across mobile devices, Linux and Windows. GeoDrive creates a Dropbox-like service that is secure and manageable, powered by Atmos’ fast performance. Atmos policies monitor changes to data and provide access control, benefitting regulated verticals like finance.

Use case: Archive as a Service
Challenge: Both the enterprise and storage service providers struggle to provide IT services to their respective customers. Provisioning, maintenance, and security are all difficult issues in traditional storage offerings.
Benefits: The Atmos Cloud Delivery Platform enables corporations and service providers to meter capacity, bandwidth, and usage across tenants. Provisioning is automated by tenant, and Atmos allows tenants to safely self-manage and access their own storage.

Use case: Managed Service Providers
Challenge: Many MSPs suffer from narrow profit margins because of the expense of delivering storage to customers. Managing multiple tenants, manual provisioning and maintaining service level agreements all cut into revenue and make it too expensive to add new storage services.
Benefits: Atmos lets MSPs efficiently offer storage as a service and better monetize new service offerings. MSPs can monitor capacity and usage for chargeback, reduce provisioning costs, and replace multiple tenant management systems with a single system. Dynamic scaling, high availability and security cost-effectively meet service level requirements.

Use case: Content-Rich Web Applications
Challenge: Traditional storage is a poor environment for Web application development, which needs highly scalable capacity for multiple large data sets, a secure environment for test/dev and application testing in real-time environments.
Benefits: Atmos provides location transparency for global applications and a highly mobile user base. The single namespace means that application developers never need to recode pathnames and locations, and do not need to code for limited storage environments. Self-management options make it easy for customers to provision their own storage, and REST APIs reduce application complexity.

Taneja Group Opinion

When on-premise archive solutions smoothly integrate with federated storage, public and private clouds provide extensive scalability and global availability. Yet we see too many end-users treating the cloud as just another storage tier for low-value retained data. This is a huge waste of cloud possibilities but we understand why it happens: cloud platforms with poor performance and delivery mechanisms can make cloud-based storage more trouble than it’s worth.

But when we talk about EMC Atmos we are not talking about a low-cost storage tier, far from it. We are describing the heart of business innovation based on highly secure and highly accessible global data stores. EMC’s long expertise with object-based storage has kept Centera relevant and has extended dynamic data management to the cloud with Atmos. The Atmos-fueled cloud replaces hierarchical file storage while allowing the secure flow of information between the data center, the distributed cloud, and global access points. Customers profit from greatly improved application and data delivery, and the deep business value inherent in their valuable data.
When a company is dealing with geographic reach and large growing volumes of rich content, it should look to object-based storage in the cloud. We fully support EMC in its push to scale capacity, performance, availability and management far beyond what traditional file systems are capable of, and more massively than ever before.

NOTICE: The information and product recommendations made by Taneja Group are based upon public information and sources and may also include personal opinions both of Taneja Group and others, all of which we believe to be accurate and reliable. However, as market conditions change and are not within our control, the information and recommendations are made without warranty of any kind. All product names used and mentioned herein are the trademarks of their respective owners. Taneja Group, Inc. assumes no responsibility or liability for any damages whatsoever (including incidental, consequential or otherwise), caused by your use of, or reliance upon, the information and recommendations presented herein, nor for any inadvertent errors that may appear in this document.