“Big Data” and “The Cloud”
             Robert J. Abate, CBIP, CDMP
             Independent Consultant




             Webinar:   March 20th, 2012
                        2PM EST / 11AM PST
“Big Data” And “The Cloud” - Agenda

                                    The Industry Is Abuzz…

                                    The Challenges Of Big Data

                                    Architectural Solutions & The Cloud

                                    It’s A Brave New World

                                    Case Studies

                                    Questions & Answers
2               Big Data & The Cloud – March 20th, 2012   © 2012 – Dataversity & Robert J. Abate
The Industry Is Abuzz…
           “Despite the hype,
           most firms find the
           technology useful to
           operate on data they
           already have”
              Source: Forrester, June 2011
Everyone Is Talking About Big Data…




     “Big data will represent a hugely disruptive force during the next five
     years – enabling levels of insight – that are currently unachievable through
     any other means” (Gartner: May 2011)
     “Big Data: Huge Management Implications with Enormous Returns” (IDC: March 2011)
     “Big data is still in mostly uncharted territory, but a surprising number are
     actually doing something with it” (Forrester: June 2011)
     “61% of respondents feel big data will fundamentally change the way their
     business works” (CIO Insight: November 2010)

     “Most enterprise data warehouse (EDW) and BI teams currently lack a
     clear understanding of big data technologies, potential application areas,
     and why ‘big data BI’ contrasts with traditional BI tools. It differs
     dramatically from traditional BI in terms of both capabilities and in the
     technologies used to achieve those capability breakthroughs” (Gartner: January 2012)
What Are The Drivers For Big Data/Cloud

     We Are In The Information Age
       Every corporation today is in the “Data Business”
     We Are Inundated With Data
       Types
       Sources
       Varieties
     Data Is Growing Exponentially
       So are the challenges
     Data Complexity Is Increasing
       Causing insight to be lost

Pictorial Representation Of Information




Big Data Is More Than Just Volume
     Consider: Master Data, Fidelity, Complexity, Validity, Perishability,
     Linking Data
     Structured Data: POS transactions, call detail records, credit card
     transactions, shipping updates, purchase orders, payments, shipments,
     account transactions
     Unstructured Data: Web logs, newsfeeds, social media, geo-location,
     mobile, consumer comments, claims, doctor’s notes, clinical studies,
     images, video, audio
     Device-generated Data: RFID sensors, smart meters, smart grids, GPS
     spatial, micro-payments

     [Figure: big data sources arranged around four dimensions (Velocity,
     Volume, Variety, Complexity): Transactional Data, Industry-specific,
     Web traffic, Video, Social, Text, Sensor/location-based, Audio,
     Documents, Images, Smart Grid]
Big Data’s Potential Is Limitless

    TODAY
     Less than 10% of enterprise information
     “Rear-view” mirror reporting, dashboards and analysis
        Days, weeks, months, or even quarters old
     Incomplete, inaccurate, and disjointed data
     Architectures and methods that take 6 to 18 months to exploit

    TOMORROW
     Vast majority of available sources and external data
     Forward looking or “Windshield-view” predictions with recommendations
        Real-time or near real-time
     Correlated, high confidence, governed data
     Vastly accelerated time to market
Time Really Is Money!

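    “THE TIME VALUE CURVE”
    © 2007 - Dr. Richard Hackathorn, Bolder Technology, Inc., All Rights Reserved. Used with Permission.

    [Figure: value (vertical axis) declining over time (horizontal axis) across the data lifecycle]
     Business Event
        Capture Latency
     Data Ready For Analysis
        Analysis Latency
     Information Delivered
        Decision Latency
     Action Taken
     Action Time runs from Business Event to Action Taken; the drop in
     value over that interval is the Value Lost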
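
The curve can be made concrete with a little arithmetic: action time is the sum of the three latencies. The sketch below is illustrative only. The exponential decay model, the 24-hour half-life, and the sample timings are all assumptions made for the example, not figures from the talk or from Dr. Hackathorn's work.

```python
from dataclasses import dataclass


@dataclass
class Latencies:
    """Hours spent in each stage between a business event and the response."""
    capture: float   # business event -> data ready for analysis
    analysis: float  # data ready for analysis -> information delivered
    decision: float  # information delivered -> action taken

    @property
    def action_time(self) -> float:
        # Action time is the sum of the three latencies on the curve.
        return self.capture + self.analysis + self.decision


def value_remaining(hours: float, half_life_hours: float = 24.0) -> float:
    """Fraction of the original value left after `hours` (assumed decay model)."""
    return 0.5 ** (hours / half_life_hours)


# Hypothetical timings for a batch pipeline and a near real-time pipeline.
batch = Latencies(capture=24, analysis=48, decision=24)
near_real_time = Latencies(capture=1, analysis=2, decision=3)

for name, lat in [("batch", batch), ("near real-time", near_real_time)]:
    lost = 1 - value_remaining(lat.action_time)
    print(f"{name}: action time {lat.action_time}h, value lost {lost:.0%}")
```

Under these assumed numbers, shrinking any one of the three latencies shortens the action time and reduces the value lost, which is the argument the slide makes.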
Data Is Coming At Us Faster

          In A Recent TDWI Survey Of 450 CIOs
               17% have a real-time data warehouse
               90% plan on having a real-time warehouse
               75% will replace their current warehouse to get to a real-time solution
          Big Data Projects Are Enterprise-Scale
               When asked: “What Is The Scope Of Your Big Data Initiative?”
                    Enterprise          65%
                    Line of business     8%
                    Departmental         8%
                    Project-based        8%
                    Regional             5%
                    Other                5%
     Source: Forrester® June 2011 Global Big Data Online Survey
Data Is Coming From All Directions…

      Data now commonly enters the enterprise
      from external sources
        Government (Census, Revenues, …)
        Nielsen, NPD Group (Sales)
        Bloomberg, NYSE (Financial Position)
        Experian, TransUnion, Equifax (Credit Reporting)
        Google Maps, MapInfo (Geospatial, …)
        Radian6, Biz360, … (Client Trend Data)
        Etc.
Need For “Trust In Data”

      Compliance with laws
        Sarbanes-Oxley [SOX], Basel II, HIPAA, etc.
      Lack of confidence in the data
        Reports utilizing same data do not report same totals or
        computations
      Data not defined and readily available
        Multiple sources of data have to be rationalized at each project start-
        up thereby wasting valuable time & $ on every project
      Data timeliness
        Manual process to collect, analyze and provide results
      Data integrity
        Unknown filters, varying calculations/computations, fields used for
        data that does not match their names, data passed along from one
        person to another to another to another…
Summation Of Industry “Buzz”

      Business mandate to obtain more value out
      of the data (get answers)
      Variety of sources, amounts, types and
      granularity of data that customers want to
      integrate is growing exponentially
      Need to shrink the latency between the
      business event and the data availability for
      analysis and decision-making
      Advancing agility of information is key
      Need for data trust and compliance with
      regulations
The Challenges Of Big Data

            “If It Was That
            Easy, Everyone
            Would Be Doing It”
                     Source: Unknown
The Information Issue Is?

      Too many organizations are not using
      information to its full advantage!
        1 in 3 business leaders frequently make critical
        decisions without the information they need
        1 in 2 business leaders do not have access to the
        information across their organization needed to do
        their jobs.
        3 in 4 business leaders say more predictive
        information would drive better decisions
         Source: IBM Institute for Business Value, March 2009

Business Alignment & Trust

         A Recent CIO Insight Poll Of CIOs Found
               56% of respondents say they feel overwhelmed by the
               amount of data their enterprise manages
               33% of respondents want even more sources of data, despite
               their feelings of being overwhelmed by it
               62% of respondents say they’re frequently interrupted by
               irrelevant incoming data
               43% of respondents say they’re dissatisfied with the current
               tools they use to filter out irrelevant data
               46% of respondents say they’ve made inaccurate business
               decisions as a result of bad or outdated data
               One in Three report that they “can’t find the right people with
               the right data”
     Source: “The Big Data Conundrum”, http://www.cioinsight.com/c/a/Storage/The-Big-Data-Conundrum-568229/
Viewed Another Way…

                                              If a football team had
                                              these players on the
                                              field:
                                                          Only 4 of the 11 players on
                                                          the field would know which
                                                          goal is theirs
                                                          Only 6 of the 11 would care
                                                          Only 3 of the 11 would know
                                                          what position they play and
                                                          what they are supposed to do
                                                          9 players out of 11 would, in
                                                          some way, be competing
                                                          against their own team rather
                                                          than the opponent
BI Perception Is Complicated & Slow

      BI/DW is perceived as not “enabling” the business
        Inhibitor to corporate progress: IT systems cannot be changed
        fast enough to meet market demands, seize opportunity or comply
        with a new requirement
        Weak alignment between IT and business strategy: marked by
        an intractable language barrier
        Business not always sure what information or dimensions they
        want or need to answer questions about what to do next
        BI/DW has not been known as a source of innovation
      The complexity of systems has caused BI/DW to be
      reactive rather than proactive
        Siloed solutions, databases and applications with trapped business rules
        Multiple sources of information and no single “truth”
        No “Architectural Blueprints” to the enterprise…
BI & D/W – The “Old Way”
      Data Chaos → Defined Data → Master Data → Integrated Information → Business Intelligence → D/W KPI’s & Dashboards

      PROCESSES   Data Discovery, DQ / Data Governance, Data Integration, BI & Data Mining
      TOOLS       Profiling, Metadata / MDM, Data Modeling & ETL, BI / DW / OLAP

      Data Chaos
      • Same type of data is different in diverse systems
      • EG: AT&T is the same as AT&T Inc
      Defined Data
      • Defined common meanings
      • EG: Determine the sources, types, and properties of grouped
        (i.e.: customer) records
      Master Data
      • Publish and subscribe to master data
      • EG: Single view of customer across all information systems
      Integrated Information
      • Bring metadata together with modeled information for reporting (BI)
        and warehousing (drilling and hierarchies)
      Business Intelligence
      • Analyzing the data by looking into history
      • Viewing graphs of historical information
      D/W KPI’s & Dashboards
      • Drilling into information to find and analyze trends
      • KPI’s and metrics that offer a glimpse into historical performance
      • Exception reporting and alerts
The “Intelligence” Maturity Model




Advancing The Maturity Of BI




The Big Data Method
      Data Chaos → Data Analysis → Data Matching → Integrated Information → Data Analytics → Business Performance Optimization

      PROCESSES   Data Discovery, DQ / Data Governance, Analytics Utilizing Data Scientists
      TOOLS       Profiling & Matching / DQ, Query Federation, “R”,

      Data Chaos
      • Same type of data is different in diverse systems
      • EG: AT&T is the same as AT&T Inc
      Defined Data
      • Defined common meanings
      • EG: Determine the sources, types, and properties of grouped
        (i.e.: customer) records
      Data Matching
      • Profiling of information to determine quality
      • Automated analysis to match information
      Integrated Information
      • Bring metadata together from matching into data stores and share it
        with analysis toolsets
      • Organizing information for rapid retrieval
      Data Analytics
      • Using Data Scientists, evaluate data utilizing mathematical
        algorithms and visualization toolsets
      Performance Optimization
      • Using analytics, changes to business models are made
      • Analysis of models improves the business and optimizes business
        performance
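
The “AT&T is the same as AT&T Inc” example of automated matching can be sketched in a few lines of Python. This is a minimal illustration, not the matching method of any particular DQ tool: the suffix list, the function names, and the sample records are hypothetical, and real matching engines add fuzzy comparison and survivorship rules.

```python
import re
from collections import defaultdict

# Hypothetical list of legal suffixes to ignore when comparing company names.
LEGAL_SUFFIXES = {"inc", "incorporated", "corp", "corporation", "co", "llc", "ltd"}


def match_key(name: str) -> str:
    """Reduce a company name to a comparison key (lowercase, no punctuation or suffix)."""
    tokens = re.sub(r"[^a-z0-9&\s]", " ", name.lower()).split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def group_matches(names):
    """Group raw names that share the same comparison key."""
    groups = defaultdict(list)
    for name in names:
        groups[match_key(name)].append(name)
    return dict(groups)


# Hypothetical records as they might appear in different source systems.
records = ["AT&T", "AT&T Inc", "AT&T, Inc.", "Verizon Corp.", "verizon"]
for key, members in group_matches(records).items():
    print(key, "->", members)
```

Each group is a candidate “same entity” set that a data steward or downstream rule can confirm before the records are linked.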

Architectural Solutions &
       The Cloud
            “You never change things by
            fighting the existing reality.
            To change something, build a
            new model that makes the
            existing model obsolete.”
                  Richard Buckminster Fuller
Big Data Required A Big Change

      Consider that 100 GB would store the US Census DB
      “basic” information set for every living human being on
      the planet:
         Age, Sex, Income, Ethnicity, Language, Religion, Housing Status, Location
         packed into a 128-bit record
         That equates to about 6.75 billion rows of about 10 columns
      Consider the Large Hadron Collider at CERN
         Expected to produce 150,000 times as much raw data each year
      What makes data sets large is repeated observations
      over time / space (temporal or spatial dimensions)
         Web log has Millions [M] of visits over a handful of pages
         Retailer has 100K products, M customers, but Billions of transactions
         Hi-res scientific data like fMRI: 1K-GB per view
      Cardinalities (distinct observations) were usually small
      relative to the total # of observations
         This was starting to change with the advent of device-supplied information,
         sensors and other semi-structured and unstructured data sources
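
The 100 GB census figure in the first point above can be checked with simple arithmetic. The sketch assumes one 128-bit record per person, roughly 6.75 billion people, and decimal gigabytes; the variable names are illustrative.

```python
WORLD_POPULATION = 6_750_000_000  # about 6.75 billion people, one row each
BITS_PER_ROW = 128                # the ~10 attributes packed into one 128-bit record

bytes_per_row = BITS_PER_ROW // 8
total_bytes = WORLD_POPULATION * bytes_per_row
total_gb = total_bytes / 10**9    # decimal gigabytes

print(f"{bytes_per_row} bytes/row x {WORLD_POPULATION:,} rows = {total_gb:.0f} GB")
# prints: 16 bytes/row x 6,750,000,000 rows = 108 GB
```

At 16 bytes per row the table comes to about 108 GB, which matches the “about 100 GB” claim.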
A Change In Technology Was Needed

         Consider that
         Relational technologies
         were invented to get
         data in and organized,
         not designed nor
         organized to get it out
               RDBMSs were designed for
               efficient transaction
               processing on large data
               sets
                    Adding, Updating
                    Searching for & retrieving
                    small amounts of data
     Source: ACM Website “The Pathologies of Big Data”, Adam Jacobs, 7/6/09
Data Warehousing Was A “Fix”

         DW was classically designed as a “copy of
         transaction data specifically structured for
         query and analysis”
               General approach was bulk ETL into a DB designed for
               queries
         Big data caused this “Fix” to break
               “Traditional RDBMS-based dimensional modeling and cube-
               based OLAP turns out to be too slow or too limited to support
               asking the really interesting questions of warehoused data”

         “To achieve acceptable performance for highly order-dependent
           queries on truly large data, one must be willing to consider
                abandoning the purely relational database model”

     Source: ACM Website “The Pathologies of Big Data”, Adam Jacobs, 7/6/09
Then Change Came In Technologies…

      The advent of the cloud and falling storage costs
        Infrastructure utilization increased dramatically
        TCO fell as the cost of storage and memory dropped
        significantly, spawning powerful computing
        paradigms and appliances
      The advent of commodity-based
      processing in a grid or MPP configuration
        Usage of existing hardware in a grid paradigm
        supporting queries across entire datasets
        “Hadoop” & MPP Shared-Nothing Architectures
Technology Solutions Appeared

      Massively Parallel
      Processing
        Teradata, Greenplum, etc.
      Grid
        Hadoop, MapReduce,
        Cassandra, etc.
      Columnar
        ParAccel, Vertica, Sybase,
        Sand Technologies, etc.
      Hardware
      Appliances
        DATAllegro, Netezza,
        Oracle Exadata, etc.

      [Figure: A visualization of a network of Facebook connections, from
      previous related research by Mucha and others.
      Credit: Amanda L. Traud, Christina Frost, UNC-Chapel Hill.
      Source: http://www.physorg.com/news192985912.html]
Virtualization & The Cloud




Data Virtualization In The Cloud




Advances Provided Answers To Silos

          “What Areas Do Your Big Data
          Initiatives Address?”




     Source: Forrester® June 2011 Global Big Data Online Survey
It’s A Brave New World…

           “Who Owns Or Drives
           Your Big Data
           Initiatives?”
            Source: Forrester, June 2011



           Business/IT collaboration                              70%
           Mostly business-driven, with minimal IT involvement    15%
           Mostly IT-driven, with minimal business involvement    12%
           Don’t know                                              2%
           Other                                                   2%
From The Old Stack To A New Ecosystem

      Data integration without pre-processing
         Ability to locate and to query federated sources of data and content without costly data
         modeling and ETL transformation
      Variety of sources (Mergers & Acquisitions, Growth, Services)
         Inability to rapidly add new data sources because of tightly coupled business rules
      Need for flexible data structures
         Current structures are rigid and are views of the sources or the business requirements
      Incorporation of unstructured data including social media
         Need tools to integrate and analyze unstructured sources that are not currently used
      Need to incorporate and utilize metadata
         Metadata is disjointed, confined and incompatible – need a uniform, agile approach
      Dynamic information with views for a reason
         Need creation and structuring of views that support dynamic information for purpose
      Information management and governance in a regulated world
         Security and entitlement checking integrated with query processing
         Information grants handled through XACML obligations
The New “Data Fabric” Transformation

      Coordinates ingestion of information
      no matter what the source
      Micro-batch takes the place of batch
      Tagging replaces transformation
      Federated query replaces ETL
      Query direction removes the need for
      optimization of data stores
      Purposeful view is the new master
      data repository
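
“Federated query replaces ETL” and “tagging replaces transformation” can be illustrated with a small Python sketch. It uses two attached SQLite databases as stand-ins for independent source systems; this is only an analogy for a data virtualization layer, and the schemas, table names, and tags are hypothetical.

```python
import sqlite3

# Two independent stores stand in for separate source systems (hypothetical schemas).
conn = sqlite3.connect(":memory:")                 # "orders" system
conn.execute("ATTACH DATABASE ':memory:' AS crm")  # "CRM" system, a separate database

conn.execute("CREATE TABLE main.orders (customer_id INTEGER, amount REAL)")
conn.execute("CREATE TABLE crm.customers (id INTEGER, name TEXT, tags TEXT)")
conn.executemany("INSERT INTO main.orders VALUES (?, ?)",
                 [(1, 120.0), (1, 80.0), (2, 45.5)])
# Tagging instead of transformation: source rows are annotated, not reshaped.
conn.executemany("INSERT INTO crm.customers VALUES (?, ?, ?)",
                 [(1, "Acme", "retail,priority"), (2, "Globex", "wholesale")])

# A "purposeful view" expressed as one query that spans both stores at read time.
FEDERATED_QUERY = """
    SELECT c.name, c.tags, SUM(o.amount) AS total
    FROM crm.customers AS c
    JOIN main.orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name, c.tags
    ORDER BY c.id
"""
rows = conn.execute(FEDERATED_QUERY).fetchall()
for name, tags, total in rows:
    print(name, tags, total)
```

No data is copied into a modeled warehouse table here; the join is resolved when the question is asked, which is the trade the slide describes.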
Newest Trends In Big Data & The Cloud

          Compelling Analytics Provide Extreme ROI
                Data Visualization Technologies
                      Heat, Clouds, Clusters, Flows
                Mixing Structured, Semi and Unstructured Sources
                Self-service analytics - Build your own sandbox!
         Data visualization is the study of the visual representation of data, meaning
         "information that has been abstracted in some schematic form, including
         attributes or variables for the units of information"

          Big Data Cloud Encircled Warehouses
                Data Virtualization
                      Abstracting the data from the systems
                      Complements existing data warehouses
                Many times the size of structured warehouse
                Provides for rapid analytic iterations
     Source: Wikipedia - http://en.wikipedia.org/wiki/Data_visualization
Data Visualization In Practice
     [Figure: “WorldWideWeb Around Wikipedia”: Wikipedia as part of the
     world wide web. Created by Chris 73 | Talk 09:56, 18 Jul 2004 (UTC)
     using TouchGraph GoogleBrowser V1.01]

     Source: Wikipedia - http://en.wikipedia.org/wiki/Data_visualization
A Picture Is Worth A Thousand Words




     Source: Greenplum, An EMC Corporation
Mixing Structured, Semi & Unstructured Sources…




     Source: Information Builders
Big Data Cloud Encircled Warehouses




     Source: EMC Corporation
Case Studies

     In the real world, we
     find out the reasons
     why Murphy’s Law is
     so prevalent…
Telecom Provider Finds Answers…

         Before investing tens of millions in infrastructure, a
         telecom firm learned where to invest its money…
                                        Challenge
                                                   100TB Traditional EDW, Single Source Of Truth
                                                   Operational Reporting & Financial Consolidation
                                                   Heavy Governance And Control
                                                   Unable To Support Critical Business Initiatives
                                                   Customer Loyalty And Churn The #1 Business
                                                   Initiative From The CEO

                                             Enterprise Big Data Cloud
                                             Surrounded Warehouse
                                                   Extracted Data From EDW & Other Sources
                                                   Generated Social Graph From Call Detail
                                                   And Subscriber Data
                                                   Within 2 Weeks Found “Connected” Subscribers
                                                   7X More Likely To Churn Than Average Users
                                                   Now Deploying 1PB Production
     Source: Greenplum, an EMC Corporation
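
The idea of generating a social graph from call detail records and comparing churn among “connected” subscribers can be sketched as follows. This is a toy illustration and not the firm's actual analysis: the records, the churned set, and the definition of “connected” (at least one contact who has churned) are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical call detail records: (caller, callee). Real CDRs carry far more fields.
calls = [("ann", "bob"), ("bob", "cat"), ("cat", "ann"),
         ("dan", "eve"), ("eve", "fay"), ("gus", "hal")]
# Hypothetical subscribers who have already churned.
churned = {"bob", "cat", "hal"}

# Build an undirected social graph: an edge means two subscribers were on a call together.
graph = defaultdict(set)
for caller, callee in calls:
    graph[caller].add(callee)
    graph[callee].add(caller)


def churn_rate(subscribers):
    """Share of the given subscribers who churned (0.0 for an empty group)."""
    subscribers = list(subscribers)
    if not subscribers:
        return 0.0
    return sum(s in churned for s in subscribers) / len(subscribers)


# "Connected" subscribers have at least one churner among their contacts.
connected = {s for s, contacts in graph.items() if contacts & churned}
others = set(graph) - connected

print("connected:", sorted(connected), churn_rate(connected))
print("others:   ", sorted(others), churn_rate(others))
```

Comparing the two rates gives the kind of lift figure the case study reports; on production data the graph would be built and queried in the parallel platform rather than in memory.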
Questions & Answers


                                                     Open Exchange Of
                                                     Ideas

                                                     Speaker Contact
                                                     Information:

                                                     Robert J. Abate
                                                     r.j.abate@att.net
                                                     (201) 745-7680


Curriculum Vitae Of Presenter

      Robert J. Abate, CBIP, CDMP
      As a hands-on, accomplished Information Technology
      professional, Mr. Abate offers 30 years of experience in
      Architectures, Applications, Business Intelligence & Analytics,
      Infrastructure, and IT strategy. He is credited as one of the first to
      publish on Services Oriented Architectures (1996), and is a
      respected IT thought leader within the field. He holds a
      Bachelor of Science in Electrical Engineering, and is a Certified
      Business Intelligence Professional and a Certified Data
      Management Professional in four disciplines. Mr. Abate chairs
      and presents at global conferences, is a member of the board of
      DAMA, and is a respected author and industry thought leader.
      Mr. Abate can frequently be heard giving talks on topics
      such as “The Convergence Of SOA & BI,” “Best Practices In
      Enterprise Information Management,” “Making Big Data Analytics
      Actionable,” and “Data Services & Virtualization.”

The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 

Big Data & the Cloud

  • 1. “Big Data” and “The Cloud” Robert J. Abate, CBIP, CDMP Independent Consultant Webinar: March 20th, 2012 2PM EST / 11AM PST
  • 2. “Big Data” And “The Cloud” - Agenda The Industry Is A Buzz… The Challenges Of Big Data Architectural Solutions & The Cloud It’s A Brave New World Case Studies Questions & Answers 2 Big Data & The Cloud – March 20th, 2012 © 2012 – Dataversity & Robert J. Abate
  • 3. The Industry Is A Buzz… “Despite the hype, most firms find the technology useful to operate on data they already have” Source: Forrester, June 2011
  • 4. Everyone Is Talking About Big Data… “Big data will represent a hugely disruptive force during the next five years – enabling levels of insight – that are currently unachievable through any other means” (Gartner: May 2011). “Big Data: Huge Management Implications with Enormous Returns” (IDC: March 2011). “Big data is still in mostly uncharted territory, but a surprising number is actually doing something with it” (Forrester: June 2011). “61% of respondents feel big data will fundamentally change the way their business works” (CIO/Insight: November 2010). “Most enterprise data warehouse (EDW) and BI teams currently lack a clear understanding of big data technologies, potential application areas, and why ‘big data BI’ contrasts with traditional BI tools. It differs dramatically from traditional BI in terms of both capabilities and in the technologies used to achieve those capability breakthroughs” (Gartner: January 2012).
  • 5. What Are The Drivers For Big Data/Cloud? We Are In The Information Age: every corporation today is in the “Data Business”. We Are Inundated In Data: types, sources, varieties. Data Is Growing Exponentially: so are the challenges. Data Complexity Is Increasing: causing insight to be lost.
  • 6. Pictorial Representation Of Information
  • 7. Big Data Is More Than Just Volume. Consider: Master Data, Fidelity, Complexity, Validity, Perishability, Linking Data. Transactional Data (Structured Data): POS transactions, call detail records, credit card transactions, shipping updates, purchase orders, payments, shipments, account transactions. Unstructured Data: Web logs, newsfeeds, social media, geo-location, mobile, consumer comments, claims, doctor’s notes, clinical studies, images, video, audio. Device-generated Data: RFID sensors, smart meters, smart grids, GPS spatial, micro-payments. [Diagram labels: Velocity, Volume, Variety, Complexity; Industry-specific, Web traffic, Video, Social, Text, Sensor/location-based, Audio, Documents, Images, Smart Grid]
  • 8. Big Data’s Potential Is Limitless. TODAY: Less than 10% of enterprises’ available information. “Rear-view” mirror reporting, dashboards and analysis. Days, weeks, months, or even quarters old. Incomplete, inaccurate, and disjointed data. Architectures and methods that take 6 to 18 months to exploit. TOMORROW: Vast majority of information sources and external data. Forward looking or “Windshield-view” predictions with recommendations. Real-time or near real-time. Correlated, high confidence, governed data. Vastly accelerated time to market.
  • 9. Time Really Is Money! “THE TIME VALUE CURVE” © 2007 - Dr. Richard Hackathorn, Bolder Technology, Inc., All Rights Reserved. Used with Permission. [Figure: value lost over time across the data lifecycle. Points on the curve: Business Event, Data Ready For Analysis, Information Delivered, Action Taken. Intervals between them: Capture Latency, Analysis Latency, Decision Latency. Action Time spans the whole sequence. Axes: Value against Time.]
  • 10. Data Is Coming At Us Faster. In A Recent TDWI Survey Of 450 CIOs: 17% have a real-time data warehouse; 90% plan on having a real-time warehouse; 75% will replace to get to a real-time solution. Big Data Projects Are Enterprise-Scale. When asked “What Is The Scope Of Your Big Data Initiative?”: Enterprise 65%; Line of business 8%; Departmental 8%; Project-based 8%; Regional 5%; Other 5%. Source: Forrester® June 2011 Global Big Data Online Survey
  • 11. Data Is Coming From All Directions… Data is now commonly entering the enterprise from external sources: Government (Census, Revenues, …); Nielsen, NPD Group (Sales); Bloomberg, NYSE (Financial Position); Experian, TransUnion, Equifax (Credit Reporting); Google Maps, MapInfo (Geospatial, …); Radian 6, Biz360, … (Client Trend Data); etc.
  • 12. Need For “Trust In Data”. Compliance with laws: Sarbanes-Oxley [SOX], Basel II, HIPAA, etc. Lack of confidence in the data: reports utilizing the same data do not report the same totals or computations. Data not defined and readily available: multiple sources of data have to be rationalized at each project start-up, thereby wasting valuable time & $ on every project. Data timeliness: manual process to collect, analyze and provide results. Data integrity: unknown filters, varying calculations/computations, fields used for data not indicative of field names, data passed along from one person to another to another to another…
  • 13. Summation Of Industry “Buzz”. Business mandate to obtain more value out of the data (get answers). Variety of sources, amounts, types and granularity of data that customers want to integrate is growing exponentially. Need to shrink the latency between the business event and the data availability for analysis and decision-making. Advancing agility of information is key. Need for Data trust and Compliance with regulations.
  • 14. The Challenges Of Big Data “If It Was That Easy, Everyone Would Be Doing It” Source: Unknown
  • 15. The Information Issue Is? Too many organizations are not using information to its full advantage! 1 in 3 business leaders frequently make critical decisions without the information they need. 1 in 2 business leaders do not have access to the information across their organization needed to do their jobs. 3 in 4 business leaders say more predictive information would drive better decisions. Source: IBM Institute for Business Value, March 2009
  • 16. Business Alignment & Trust. A Recent CIO:INSIGHT Poll of CIOs Found: 56% of respondents say they feel overwhelmed by the amount of data their enterprise manages; 33% of respondents want even more sources of data, despite their feelings of being overwhelmed by it; 62% of respondents say they’re frequently interrupted by irrelevant incoming data; 43% of respondents say they’re dissatisfied with the current tools they use to filter out irrelevant data; 46% of respondents say they’ve made inaccurate business decisions as a result of bad or outdated data; One in Three report that they “can’t find the right people with the right data”. Source: “The Big Data Conundrum”, http://www.cioinsight.com/c/a/Storage/The-Big-Data-Conundrum-568229/
  • 17. Viewed Another Way… If a football team had these players on the field: Only 4 of the 11 players on the field would know which goal is theirs. Only 6 of the 11 would care. Only 3 of the 11 would know what position they play and what they are supposed to do. 9 players out of 11 would, in some way, be competing against their own team rather than the opponent.
  • 18. BI Perception Is Complicated & Slow. BI/DW is perceived as not “enabling” the business. Inhibitor to corporate progress. IT systems cannot be changed fast enough to meet market demands, seize opportunity or comply with a new requirement. Weak alignment between IT and business strategy, marked by an intractable language barrier. Business not always sure what information or dimensions they want or need to answer questions about what to do next. BI/DW has not been known as a source of innovations. The complexity of systems has caused BI/DW to be reactive rather than proactive. Silo’d solutions, db’s and applications with trapped business rules. Multiple sources of information and no single “truth”. No “Architectural Blueprints” to the enterprise…
  • 19. BI & D/W – The “Old Way”. Stages (in order): Data Chaos, Defined Data, Master Data, Integrated Information, Business Intelligence, D/W KPI’s & Dashboards. PROCESSES: Data Discovery; DQ / Data Governance; Data Integration; BI & Data Mining. TOOLS: Profiling; Metadata / MDM; Data Modeling & ETL; BI / DW / OLAP. Data Chaos: same type data is different in diverse systems (EG: AT&T is the same as AT&T Inc). Defined Data: defined common meanings (EG: determine the sources, types, and properties of grouped (i.e.: customer) records). Master Data: publish and subscribe to master data (EG: single view of customer across all information systems). Integrated Information: bring metadata together with modeled information for reporting (BI) and warehousing (drilling and hierarchies). Business Intelligence: analyzing the data by looking into history; viewing graphs of historical information. D/W KPI’s & Dashboards: drilling into information to find and analyze trends; KPI’s and metrics that offer a glimpse into historical performance; exception reporting and alerts.
  • 20. The “Intelligence” Maturity Model
  • 21. Advancing The Maturity Of BI
  • 22. The Big Data Method. Stages (in order): Data Chaos, Data Analysis, Data Matching, Integrated Information, Data Analytics, Performance Optimization. PROCESSES: Data Discovery; DQ / Data Governance; Analytics Utilizing Data Scientists. TOOLS: Profiling & Matching / DQ; Query Federation; “R”. Data Chaos: same type data is different in diverse systems (EG: AT&T is the same as AT&T Inc). Defined Data: defined common meanings (EG: determine the sources, types, and properties of grouped (i.e.: customer) records). Data Matching: profiling of information to determine quality; automated analysis to match information. Integrated Information: bring metadata together from matching into data stores and sharing with analysis toolsets; organizing information for rapid retrieval. Data Analytics: using Data Scientists, evaluate data utilizing mathematical algorithms and visualization toolsets. Performance Optimization: using analytics, changes to business models are made; analysis of models improves business and optimizes business performance.
  • 23. Architectural Solutions & The Cloud “You never change things by fighting the existing reality. To change something, build a new model that makes the existing model obsolete.” Richard Buckminster Fuller
  • 24. Big Data Required A Big Change. Consider: 100 GB would store the entire US Census DB “basic” information set for every living human being on the planet: Age, Sex, Income, Ethnicity, Language, Religion, Housing Status, Location packed into a 128-bit set. That equates to about 6.75 billion rows of about 10 columns. Consider the Large Hadron Collider within the CERN Laboratories: expected to produce 150,000 times as much raw data each year. What makes data sets large is repeated observations over time / space (spatial or temporal dimensions): a Web log has Millions [M] of visits over a handful of pages; a Retailer has 100K products, M customers, but Billions of transactions; Hi-Res Scientific like fMRI 1K-GB per view. Cardinalities (distinct observations) were usually small with regard to total # of observations. This was starting to change with the advent of device-supplied information, sensors and other semi-structured and unstructured data sources.
  • 25. A Change In Technology Was Needed. Consider that Relational technologies were invented to get data in and organized, not designed nor organized to get it out. RDBMS’s were designed for efficient transaction processing on large data sets: adding, updating, searching for & retrieving small amounts of data. Source: ACM Website “The Pathologies of Big Data”, Adam Jacobs, 7/6/09
  • 26. Data Warehousing Was A “Fix”. DW was classically designed as a “copy of transaction data specifically structured for query and analysis”. General approach was bulk ETL into a DB designed for queries. Big data caused this “Fix” to break: “Traditional RDBMS-based dimensional modeling and cube-based OLAP turns out to be too slow or too limited to support asking the really interesting questions of warehoused data”. “To achieve acceptable performance for highly order-dependent queries on truly large data, one must be willing to consider abandoning the purely relational database model”. Source: ACM Website “The Pathologies of Big Data”, Adam Jacobs, 7/6/09
  • 27. Then Change Came In Technologies… The advent of cloud and storage costs: infrastructure utilization increased dramatically; low TCO, and cost of storage and memory dropped significantly, spawning powerful computing paradigms and appliances. The advent of commodity-based processing in a grid or MPP config: usage of existing hardware in a grid paradigm supporting queries across entire datasets; “Hadoop” & MPP Shared Nothing Architectures.
  • 28. Technology Solutions Appeared. Massively Parallel Processing: Teradata, Greenplum, etc. Grid: Hadoop, MapReduce, Cassandra, etc. Columnar: ParAccel, Vertica, Sybase, Sand Technologies, etc. Hardware Appliances: DATAllegro, Netezza, Oracle Exadata, etc. [Image: A visualization of a network of Facebook connections, from previous related research by Mucha and others. Credit: Amanda L. Traud, Christina Frost, UNC-Chapel Hill. Source: http://www.physorg.com/news192985912.html]
  • 29. Virtualization & The Cloud
  • 30. Data Virtualization In The Cloud
  • 31. Advances Provided Answers To Silos “What Areas Do Your Big Data Initiatives Address?” Source: Forrester® June 2011 Global Big Data Online Survey
  • 32. It’s A Brave New World… “Who Owns Or Drives Your Big Data Initiatives?” Business/IT collaboration: 70%. Mostly business-driven, with minimal IT involvement: 15%. Mostly IT-driven, with minimal business involvement: 12%. Don’t know: 2%. Other: 2%. Source: Forrester, June 2011
  • 33. From The Old Stack To A New Ecosystem. Data integration without pre-processing: ability to locate and to query federated sources of data and content without costly data modeling and ETL transformation. Variety of sources (Mergers & Acquisitions, Growth, Services): inability to rapidly add new data sources because of tightly coupled business rules. Need for flexible data structures: current structures are rigid and are views of the sources or the business requirements. Incorporation of unstructured data including social media: need tools to integrate and analyze unstructured sources that are not currently used. Need to incorporate and utilize metadata: metadata is disjointed, confined and incompatible; need a uniform, agile approach. Dynamic information with views for a reason: need creation and structuring of views that support dynamic information for purpose. Information management and governance in a regulated world: security and entitlement checking integrated with query processing; information grants handled through XACML obligations.
  • 34. The New “Data Fabric” Transformation. Coordinates ingestion of information no matter what the source. Micro-batch takes the place of batch. Tagging replaces transformation. Federated query replaces ETL. Query direction removes the need for optimization of data stores. Purposeful view is the new master data repository.
  • 35. Newest Trends In Big Data & The Cloud. Compelling Analytics Provide Extreme ROI. Data Visualization Technologies: Heat, Clouds, Clusters, Flows. Mixing Structured, Semi and Unstructured Sources. Self-service analytics - Build your own sandbox! Big Data Cloud Encircled Warehouses. Data Virtualization. Abstracting the data from the systems. Complements existing data warehouses. Many times the size of structured warehouse. Provides for rapid analytic iterations. Data visualization is the study of the visual representation of data, meaning "information that has been abstracted in some schematic form, including attributes or variables for the units of information" (Source: Wikipedia - http://en.wikipedia.org/wiki/Data_visualization).
  • 36. Data Visualization In Practice WorldWideWeb Around Wikipedia - Wikipedia as part of the world wide web Created by Chris 73 | Talk 09:56, 18 Jul 2004 (UTC) using TouchGraph GoogleBrowser V1.01 Source: Wikipedia - http://en.wikipedia.org/wiki/Data_visualization
  • 37. A Picture Is Worth A Thousand Words Source: Greenplum, An EMC Corporation
  • 38. Mixing Structured, Semi & Unstructured Sources… Source: Information Builders
  • 39. Big Data Cloud Encircled Warehouses Source: EMC Corporation
  • 40. Case Studies In the real world, we find out the reasons why Murphy’s Law is so prevalent…
  • 41. Telecomm Provider Finds Answers… Before investing tens of millions in infrastructure, a telecomm firm learned where to invest their monies… Challenge: 100TB Traditional EDW, Single Source Of Truth; Operational Reporting & Financial Consolidation; Heavy Governance And Control; Unable To Support Critical Business Initiatives; Customer Loyalty And Churn The #1 Business Initiative From The CEO. Enterprise Big Data Cloud Surrounded Warehouse: Extracted Data From EDW & Other Sources; Generated Social Graph From Call Detail And Subscriber Data; Within 2 Weeks Found “Connected” Subscribers 7X More Likely To Churn Than Average Users; Now Deploying 1PB Production. Source: Greenplum, an EMC Corporation
  • 42. Questions & Answers Open Exchange Of Ideas Speaker Contact Information: Robert J. Abate r.j.abate@att.net (201) 745-7680
  • 43. Curriculum Vitae Of Presenter: Robert J. Abate, CBIP, CDMP. As a hands-on, accomplished Information Technology professional, Mr. Abate offers 30 years of experience in Architectures, Applications, Business Intelligence & Analytics, Infrastructure, and IT strategy. He is credited as one of the first to publish on Services Oriented Architectures (1996), and is a respected IT thought leader within the field. He holds a Bachelor of Science in Electrical Engineering, and is a Certified Business Intelligence Professional and a Certified Data Management Professional in four disciplines. Mr. Abate both chairs and presents at global conferences, is a member of the board of DAMA, and is a respected author and industry thought-leader. Mr. Abate frequently can be heard giving talks on topics such as “The Convergence Of SOA & BI,” “Best Practices In Enterprise Information Management,” “Making Big Data Analytics Actionable”, and “Data Services & Virtualization”.
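
The storage estimate on slide 24 (a 128-bit record for every living person fitting in roughly 100 GB) can be checked with quick arithmetic. The sketch below assumes a world population of about 6.75 billion and decimal gigabytes; neither assumption is stated on the slide, and the variable names are illustrative only.

```python
# Back-of-the-envelope check of the slide 24 storage estimate.
# Assumptions (not from the slide): ~6.75 billion people, 1 GB = 10**9 bytes.
WORLD_POPULATION = 6_750_000_000   # one row per living person
BITS_PER_ROW = 128                 # the packed "basic" record from the slide

bytes_per_row = BITS_PER_ROW // 8
total_bytes = WORLD_POPULATION * bytes_per_row
total_gb = total_bytes / 1e9

print(f"{bytes_per_row} bytes per row, about {total_gb:.0f} GB in total")
```

Under these assumptions the total comes to about 108 GB, which is in line with the slide's round figure of 100 GB.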
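
The "Data Matching" step on slide 22 uses the example that AT&T is the same as AT&T Inc. A minimal sketch of that idea follows. The suffix list, the `match_key` and `group_matches` helpers, and the sample records are all hypothetical; real matching tools use far richer rules than this simple name normalization.

```python
import re

# Hypothetical list of legal suffixes to ignore when comparing company names.
LEGAL_SUFFIXES = {"inc", "incorporated", "corp", "corporation", "co", "llc", "ltd"}

def match_key(name: str) -> str:
    """Reduce a company name to a comparison key."""
    # Lowercase and keep only letters, digits and '&' as tokens.
    tokens = re.findall(r"[a-z0-9&]+", name.lower())
    # Drop trailing legal suffixes such as "Inc" or "Corp".
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)

def group_matches(names):
    """Group raw names that share the same comparison key."""
    groups = {}
    for name in names:
        groups.setdefault(match_key(name), []).append(name)
    return groups

# Illustrative records only.
records = ["AT&T", "AT&T Inc", "AT&T Inc.", "Verizon Corp"]
groups = group_matches(records)
for key, members in groups.items():
    print(key, "->", members)
```

Grouping by a normalized key keeps the original spellings intact, so the matched records can still be traced back to their source systems.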
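
The case study on slide 41 describes generating a social graph from call detail and subscriber data and then looking at "connected" subscribers. The sketch below shows one way such a graph could be built. It assumes that "connected" means having at least one contact who has already churned, which the slide does not define, and the call records and churn list are made-up toy data.

```python
from collections import defaultdict

# Hypothetical call detail records: (caller, callee) pairs.
calls = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "e"), ("f", "g")]
# Hypothetical set of subscribers who have already churned.
churned = {"a", "d"}

def build_graph(call_records):
    """Build an undirected graph: subscriber -> set of contacts."""
    graph = defaultdict(set)
    for caller, callee in call_records:
        graph[caller].add(callee)
        graph[callee].add(caller)
    return graph

def connected_to_churner(graph, churned_subscribers):
    """Active subscribers with at least one churned contact."""
    return {
        sub
        for sub, contacts in graph.items()
        if sub not in churned_subscribers and contacts & churned_subscribers
    }

graph = build_graph(calls)
at_risk = connected_to_churner(graph, churned)
print(sorted(at_risk))
```

Comparing the later churn rate of the flagged group against everyone else would be the next step; that comparison is omitted here because it needs real outcome data.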