SlideShare a Scribd company logo
1 of 94
THE 3V’S OF BIG DATA: VARIETY, VELOCIT
and VOLUME
    SPEAKER: Nick Weir
               CEO
               ChoozON


Friday, July 27, 2012
Mining	
  Big	
  Data	
  and	
  the	
  3V’s:
            The	
  Business	
  and	
  Science	
  of	
  Big	
  Data
                                                             Nick	
  Weir,	
  Ph.D.	
  
                                                                               CEO
                                                             ChoozOn	
  Corpora,on
                                                               Twi$er:	
  @usamaf

                                                                      March	
  21,	
  2012
                                             GigaOM	
  Structure:	
  Data	
  Conference
                                                                        New	
  York,	
  NY




Friday, July 27, 2012
Outline
        • Big	
  Data	
  all	
  around	
  us
        • The	
  role	
  of	
  Data	
  Mining	
  and	
  Predic6ve	
  
          Analy6cs
        • On-­‐line	
  data	
  and	
  facts
        • Case	
  studies	
  from	
  mul6ple	
  ver6cals:
               1. Yahoo!	
  Big	
  Data
               2. Social	
  Network	
  Data
               3. ChoozOn	
  Big	
  Data	
  from	
  offers



Friday, July 27, 2012
What	
  MaAers	
  in	
  the	
  Age	
  of	
  Big	
  Data?	
  
               1.Being	
  able	
  to	
  exploit	
  all	
  the	
  data	
  that	
  is	
  available	
  
                        • not	
  just	
  what	
  you've	
  got	
  available	
  
                        • what	
  you	
  can	
  acquire	
  and	
  use	
  to	
  enhance	
  your	
  acFons

               2.	
  Prolifera:ng	
  analy:cs	
  throughout	
  your	
  organiza:on
                        • make	
  every	
  part	
  of	
  your	
  business	
  smarter

               3.	
  Driving	
  significant	
  business	
  value	
  
                        • embedding	
  analyFcs	
  into	
  every	
  area	
  of	
  your	
  business	
  
                          can	
  help	
  you	
  drive	
  top	
  line	
  revenues	
  and/or	
  boJom	
  line	
  
                          cost	
  efficiencies	
  


Friday, July 27, 2012
Why	
  Big	
  Data?




Friday, July 27, 2012
Why	
  Big	
  Data?
    A	
  new	
  term,	
  with	
  associated	
  “Data	
  Scien:st”	
  posi:ons:




Friday, July 27, 2012
Why	
  Big	
  Data?
    A	
  new	
  term,	
  with	
  associated	
  “Data	
  Scien:st”	
  posi:ons:
    • Big	
  Data:	
  is	
  a	
  mix	
  of	
  structured,	
  semi-­‐structured,	
  and	
  
      unstructured	
  data:
           –Typically	
  breaks	
  barriers	
  for	
  tradi:onal	
  RDB	
  storage
           –Typically	
  breaks	
  limits	
  of	
  indexing	
  by	
  “rows”
           –Typically	
  requires	
  intensive	
  pre-­‐processing	
  before	
  each	
  
            query	
  to	
  extract	
  “some	
  structure”	
  –	
  usually	
  using	
  Map-­‐
            Reduce	
  type	
  opera:ons




Friday, July 27, 2012
Why	
  Big	
  Data?
    A	
  new	
  term,	
  with	
  associated	
  “Data	
  Scien:st”	
  posi:ons:
    • Big	
  Data:	
  is	
  a	
  mix	
  of	
  structured,	
  semi-­‐structured,	
  and	
  
      unstructured	
  data:
           –Typically	
  breaks	
  barriers	
  for	
  tradi:onal	
  RDB	
  storage
           –Typically	
  breaks	
  limits	
  of	
  indexing	
  by	
  “rows”
           –Typically	
  requires	
  intensive	
  pre-­‐processing	
  before	
  each	
  
            query	
  to	
  extract	
  “some	
  structure”	
  –	
  usually	
  using	
  Map-­‐
            Reduce	
  type	
  opera:ons
    • Above	
  leads	
  to	
  “messy”	
  situa6ons	
  with	
  no	
  standard	
  
      recipes	
  or	
  architecture:	
  hence	
  the	
  need	
  for	
  “data	
  
      scien6sts”	
  
           –Conduct	
  “Data	
  Expedi:ons”	
  

Friday, July 27, 2012
What	
  Makes	
  Data	
  “Big	
  Data”?




Friday, July 27, 2012
What	
  Makes	
  Data	
  “Big	
  Data”?
    • Big	
  Data	
  is	
  Characterized	
  by	
  the	
  3-­‐V’s:
      –Volume:	
  larger	
  than	
  “normal”	
  –	
  challenging	
  to	
  load	
  and	
  
               process
                  • Expensive	
  to	
  do	
  ETL
                  • Expensive	
  to	
  figure	
  out	
  how	
  to	
  index	
  and	
  retrieve
                  • MulQple	
  dimensions	
  that	
  are	
  “key”
           –Velocity:	
  Rate	
  of	
  arrival	
  poses	
  real-­‐,me	
  constraints	
  on	
  what	
  
               are	
  typically	
  “batch	
  ETL”	
  opera,ons
                  • If	
  you	
  fall	
  behind	
  catching	
  up	
  is	
  extremely	
  expensive	
  (replicate	
  very	
  expensive	
  
                    systems)
                  • Must	
  keep	
  up	
  with	
  rate	
  and	
  service	
  queries	
  on-­‐the-­‐fly
           –Variety:	
  Mix	
  of	
  data	
  types	
  and	
  varying	
  degrees	
  of	
  structure
                  • Non-­‐standard	
  schema
                  • Lots	
  of	
  BLOB’s	
  and	
  CLOB’s
                  • DB	
  queries	
  don’t	
  know	
  what	
  to	
  do	
  with	
  semi-­‐structured	
  and	
  unstructured	
  

Friday, July 27, 2012
The	
  DisQncQon	
  between	
  “Data”	
  and	
  “Big	
  
                           Data”	
  is	
  fast	
  disappearing
        • Most	
  real	
  data	
  sets	
  nowadays	
  come	
  with	
  a	
  serious	
  mix	
  of	
  semi-­‐
          structured	
  and	
  unstructured	
  components:
               –   Images
               –   Video
               –   Text	
  descrip:ons	
  and	
  news,	
  blogs,	
  etc…
               –   User	
  and	
  customer	
  commentary
               –   Reac:ons	
  on	
  social	
  media:	
  e.g.	
  TwiTer	
  is	
  a	
  mix	
  of	
  data	
  anyway
        • Using	
  standard	
  transforms,	
  enQty	
  extracQon,	
  and	
  new	
  
          generaQon	
  tools	
  to	
  transform	
  unstructured	
  raw	
  data	
  into	
  semi-­‐
          structured	
  analyzable	
  data	
  
        • Hadoop	
  vs.	
  Not	
  Hadoop	
  -­‐	
  	
  when	
  to	
  use	
  what	
  kind	
  of	
  techniques	
  
          requiring	
  Map-­‐Reduce	
  and	
  grid	
  compuQng



Friday, July 27, 2012
Text	
  Data:	
  The	
  Big	
  Driver

        • While	
  we	
  speak	
  of	
  “big	
  data”	
  and	
  the	
  “Variety”	
  in	
  3-­‐
          V’s
        • Reality:	
  biggest	
  driver	
  of	
  growth	
  of	
  Big	
  Data	
  has	
  been	
  
          text	
  data
        • In	
  fact	
  Map-­‐Reduce	
  became	
  popularized	
  by	
  Google	
  to	
  
          address	
  the	
  problem	
  of	
  processing	
  large	
  amounts	
  of	
  
          text	
  data:	
  
               – Indexing	
  a	
  full	
  copy	
  of	
  the	
  web
               – Frequent	
  re-­‐indexing
               – Many	
  operaQons	
  with	
  each	
  being	
  a	
  simple	
  operaQon	
  but	
  



Friday, July 27, 2012
To	
  Hadoop	
  or	
  not	
  to	
  Hadoop?
     When	
  to	
  use	
  techniques	
  requiring	
  Map-­‐Reduce	
  and	
  grid	
  compu?ng?
     • Typically	
  organizaFons	
  try	
  to	
  use	
  Map-­‐Reduce	
  for	
  everything	
  to	
  do	
  with	
  
       Big	
  Data
            – Inefficient	
  and	
  oIen	
  irra6onal
            – Certain	
  opera6ons	
  require	
  specialized	
  storage
                  • UpdaFng	
  segment	
  memberships	
  over	
  large	
  numbers	
  of	
  users
                  • Defining	
  new	
  segments	
  on	
  user	
  or	
  usage	
  data
     • Map-­‐Reduce	
  is	
  useful	
  when	
  a	
  very	
  simple	
  operaFon	
  is	
  to	
  be	
  applied	
  on	
  
       a	
  large	
  body	
  of	
  unstructured	
  data
            – Typically	
  this	
  is	
  during	
  en6ty	
  and	
  aAribute	
  extrac6on
            – S6ll	
  need	
  Big	
  Data	
  analysis	
  post-­‐Hadoop
     • Map-­‐Reduce	
  is	
  not	
  efficient	
  or	
  effecFve	
  for	
  tasks	
  involving	
  deeper	
  
       staFsFcal	
  modeling
            – 	
  Good	
  for	
  gathering	
  counts	
  and	
  simple	
  (sufficient)	
  sta6s6cs
                  • E.g.	
  how	
  many	
  Fmes	
  a	
  keyword	
  occurs,	
  quick	
  aggregaFon	
  of	
  simple	
  facts	
  in	
  unstructured	
  
                    data,	
  esFmates	
  of	
  variances,	
  density,	
  etc…
            – Mostly	
  pre-­‐processing	
  for	
  Data	
  Mining
Friday, July 27, 2012
Hadoop	
  Use	
  Cases	
  by	
  Data	
  Type




Friday, July 27, 2012
Friday, July 27, 2012
Friday, July 27, 2012
Big	
  Data	
  Applica:ons	
  and	
  Uses

                                                                              100%	
  Capture
                                IT	
  Log	
  &	
  Security	
  
                              Forensics	
  &	
  AnalyFcs         Find	
  New	
  Signal   Predict	
  Events


                                                                           Product	
  Planning
                             Automated	
  Device	
  Data
                                                                            Failure	
  Analysis
                                   AnalyFcs
                                                                            ProacFve	
  Fixes
                                                                             RecommendaQon
                                     AdverFsing                               SegmentaQon
                                      AnalyFcs                                 Social	
  Media

                                                                          Hadoop	
  +	
  	
  MPP	
  +	
  EDW
                                   Big	
  Data                   Cost	
  ReducQon Ad	
  Hoc	
  Insight
                              Warehouse	
  AnalyFcs                       PredicQve	
  AnalyQcs




Friday, July 27, 2012
So	
  this	
  Internet	
  thing	
  is	
  going	
  to	
  
                                 be	
  big!
                        Big	
  Opportunity,	
  Big	
  Data,	
  Big	
  
                                   Challenges!



Friday, July 27, 2012
Stats	
  about	
  on-­‐line	
  usage




Friday, July 27, 2012
Stats	
  about	
  on-­‐line	
  usage
       • How	
  many	
  people	
  are	
  on-­‐line	
  today?	
  
             –2.1	
  Billion	
  (per	
  Comscore	
  es:mates)
             –30%	
  of	
  world	
  Popula:on




Friday, July 27, 2012
Stats	
  about	
  on-­‐line	
  usage
       • How	
  many	
  people	
  are	
  on-­‐line	
  today?	
  
             –2.1	
  Billion	
  (per	
  Comscore	
  es:mates)
             –30%	
  of	
  world	
  Popula:on

       • How	
  much	
  Fme	
  is	
  spent	
  on-­‐line	
  per	
  month	
  by	
  the	
  
         whole	
  world?
             –4M	
  person-­‐years	
  per	
  month




Friday, July 27, 2012
Stats	
  about	
  on-­‐line	
  usage
       • How	
  many	
  people	
  are	
  on-­‐line	
  today?	
  
             –2.1	
  Billion	
  (per	
  Comscore	
  es:mates)
             –30%	
  of	
  world	
  Popula:on

       • How	
  much	
  Fme	
  is	
  spent	
  on-­‐line	
  per	
  month	
  by	
  the	
  
         whole	
  world?
             –4M	
  person-­‐years	
  per	
  month

       • How	
  many	
  hours	
  per	
  month	
  per	
  Internet	
  User?
             –16	
  hours	
  (global	
  average)
             –32	
  hours	
  (U.S.	
  Average)


Friday, July 27, 2012
Stats	
  about	
  on-­‐line	
  usage
       • How	
  many	
  people	
  are	
  on-­‐line	
  today?	
  
             –2.1	
  Billion	
  (per	
  Comscore	
  es:mates)
             –30%	
  of	
  world	
  Popula:on

       • How	
  much	
  Fme	
  is	
  spent	
  on-­‐line	
  per	
  month	
  by	
  the	
  
         whole	
  world?
             –4M	
  person-­‐years	
  per	
  month

       • How	
  many	
  hours	
  per	
  month	
  per	
  Internet	
  User?
             –16	
  hours	
  (global	
  average)
             –32	
  hours	
  (U.S.	
  Average)
         *Sources: Feb.2012 - from Go-Gulf.com compiled from Comscoredatamine.com, Nielsen.com, thisDigitalLife.com,
         PewInternet.org



Friday, July 27, 2012
Volume,	
  Velocity,	
  Variety




Friday, July 27, 2012
Volume,	
  Velocity,	
  Variety
   • Google:	
  	
  How	
  many	
  queries	
  per	
  day?




Friday, July 27, 2012
Volume,	
  Velocity,	
  Variety
   • Google:	
  	
  How	
  many	
  queries	
  per	
  day?
          – More	
  than	
  1	
  Billion




Friday, July 27, 2012
Volume,	
  Velocity,	
  Variety
   • Google:	
  	
  How	
  many	
  queries	
  per	
  day?
          – More	
  than	
  1	
  Billion
   • Twi$er:	
  How	
  many	
  Tweets/day?




Friday, July 27, 2012
Volume,	
  Velocity,	
  Variety
   • Google:	
  	
  How	
  many	
  queries	
  per	
  day?
          – More	
  than	
  1	
  Billion
   • Twi$er:	
  How	
  many	
  Tweets/day?
          – More	
  than	
  250M




Friday, July 27, 2012
Volume,	
  Velocity,	
  Variety
   • Google:	
  	
  How	
  many	
  queries	
  per	
  day?
          – More	
  than	
  1	
  Billion
   • Twi$er:	
  How	
  many	
  Tweets/day?
          – More	
  than	
  250M
   • Facebook:	
  Updates	
  per	
  day?




Friday, July 27, 2012
Volume,	
  Velocity,	
  Variety
   • Google:	
  	
  How	
  many	
  queries	
  per	
  day?
          – More	
  than	
  1	
  Billion
   • Twi$er:	
  How	
  many	
  Tweets/day?
          – More	
  than	
  250M
   • Facebook:	
  Updates	
  per	
  day?
          – More	
  than	
  800M




Friday, July 27, 2012
Volume,	
  Velocity,	
  Variety
   • Google:	
  	
  How	
  many	
  queries	
  per	
  day?
          – More	
  than	
  1	
  Billion
   • Twi$er:	
  How	
  many	
  Tweets/day?
          – More	
  than	
  250M
   • Facebook:	
  Updates	
  per	
  day?
          – More	
  than	
  800M
   • YouTube:	
  Views/day




Friday, July 27, 2012
Volume,	
  Velocity,	
  Variety
   • Google:	
  	
  How	
  many	
  queries	
  per	
  day?
          – More	
  than	
  1	
  Billion
   • Twi$er:	
  How	
  many	
  Tweets/day?
          – More	
  than	
  250M
   • Facebook:	
  Updates	
  per	
  day?
          – More	
  than	
  800M
   • YouTube:	
  Views/day
          – 4	
  Billion	
  views
          – 60	
  hours	
  of	
  video	
  uploaded	
  every	
  minute!




Friday, July 27, 2012
Volume,	
  Velocity,	
  Variety
   • Google:	
  	
  How	
  many	
  queries	
  per	
  day?
          – More	
  than	
  1	
  Billion
   • Twi$er:	
  How	
  many	
  Tweets/day?
          – More	
  than	
  250M
   • Facebook:	
  Updates	
  per	
  day?
          – More	
  than	
  800M
   • YouTube:	
  Views/day
          – 4	
  Billion	
  views
          – 60	
  hours	
  of	
  video	
  uploaded	
  every	
  minute!
   • Social	
  Networks:	
  users	
  who	
  have	
  used	
  sites	
  for	
  spying	
  
     on	
  their	
  partners?
     –56%


Friday, July 27, 2012
Volume,	
  Velocity,	
  Variety
   • Google:	
  	
  How	
  many	
  queries	
  per	
  day?
          – More	
  than	
  1	
  Billion
   • Twi$er:	
  How	
  many	
  Tweets/day?
          – More	
  than	
  250M
   • Facebook:	
  Updates	
  per	
  day?
          – More	
  than	
  800M
   • YouTube:	
  Views/day
          – 4	
  Billion	
  views
          – 60	
  hours	
  of	
  video	
  uploaded	
  every	
  minute!
   • Social	
  Networks:	
  users	
  who	
  have	
  used	
  sites	
  for	
  spying	
  
     on	
  their	
  partners?
     –56%
                 *Sources: Feb.2012 - from Go-Gulf.com compiled from Comscoredatamine.com, Nielsen.com,
                 thisDigitalLife.com, PewInternet.org




Friday, July 27, 2012
Turning	
  the	
  three	
  Vs	
  of	
  Big	
  Data	
  into	
  
                                             Value




Friday, July 27, 2012
Turning	
  the	
  three	
  Vs	
  of	
  Big	
  Data	
  into	
  
                                                Value
                        Do	
  we	
  understand	
  what	
  each	
  individual	
  is	
  trying	
  to	
  
                          achieve?




Friday, July 27, 2012
Turning	
  the	
  three	
  Vs	
  of	
  Big	
  Data	
  into	
  
                                                Value
                        Do	
  we	
  understand	
  what	
  each	
  individual	
  is	
  trying	
  to	
  
                          achieve?
                        • What	
  is	
  user	
  intent?




Friday, July 27, 2012
Turning	
  the	
  three	
  Vs	
  of	
  Big	
  Data	
  into	
  
                                                Value
                        Do	
  we	
  understand	
  what	
  each	
  individual	
  is	
  trying	
  to	
  
                          achieve?
                        • What	
  is	
  user	
  intent?
                        • Cri,cal	
  in	
  mone,za,on,	
  adver,sing,	
  etc…




Friday, July 27, 2012
Turning	
  the	
  three	
  Vs	
  of	
  Big	
  Data	
  into	
  
                                                Value
                        Do	
  we	
  understand	
  what	
  each	
  individual	
  is	
  trying	
  to	
  
                          achieve?
                        • What	
  is	
  user	
  intent?
                        • Cri,cal	
  in	
  mone,za,on,	
  adver,sing,	
  etc…
                        Do	
  we	
  understand	
  what	
  a	
  community’s	
  sen:ment	
  is?




Friday, July 27, 2012
Turning	
  the	
  three	
  Vs	
  of	
  Big	
  Data	
  into	
  
                                                Value
                        Do	
  we	
  understand	
  what	
  each	
  individual	
  is	
  trying	
  to	
  
                          achieve?
                        • What	
  is	
  user	
  intent?
                        • Cri,cal	
  in	
  mone,za,on,	
  adver,sing,	
  etc…
                        Do	
  we	
  understand	
  what	
  a	
  community’s	
  sen:ment	
  is?
                        • 	
   What	
  is	
  the	
  emoFon?




Friday, July 27, 2012
Turning	
  the	
  three	
  Vs	
  of	
  Big	
  Data	
  into	
  
                                                 Value
                        Do	
  we	
  understand	
  what	
  each	
  individual	
  is	
  trying	
  to	
  
                          achieve?
                        • What	
  is	
  user	
  intent?
                        • Cri,cal	
  in	
  mone,za,on,	
  adver,sing,	
  etc…
                        Do	
  we	
  understand	
  what	
  a	
  community’s	
  sen:ment	
  is?
                        • 	
   What	
  is	
  the	
  emoFon?
                        • 	
  Is	
  it	
  negaFve	
  or	
  posiFve?




Friday, July 27, 2012
Turning	
  the	
  three	
  Vs	
  of	
  Big	
  Data	
  into	
  
                                                 Value
                        Do	
  we	
  understand	
  what	
  each	
  individual	
  is	
  trying	
  to	
  
                          achieve?
                        • What	
  is	
  user	
  intent?
                        • Cri,cal	
  in	
  mone,za,on,	
  adver,sing,	
  etc…
                        Do	
  we	
  understand	
  what	
  a	
  community’s	
  sen:ment	
  is?
                        • 	
   What	
  is	
  the	
  emoFon?
                        • 	
  Is	
  it	
  negaFve	
  or	
  posiFve?
                        • 	
  What	
  is	
  the	
  health	
  of	
  my	
  brand	
  online?




Friday, July 27, 2012
Turning	
  the	
  three	
  Vs	
  of	
  Big	
  Data	
  into	
  
                                                 Value
                        Do	
  we	
  understand	
  what	
  each	
  individual	
  is	
  trying	
  to	
  
                          achieve?
                        • What	
  is	
  user	
  intent?
                        • Cri,cal	
  in	
  mone,za,on,	
  adver,sing,	
  etc…
                        Do	
  we	
  understand	
  what	
  a	
  community’s	
  sen:ment	
  is?
                        • 	
   What	
  is	
  the	
  emoFon?
                        • 	
  Is	
  it	
  negaFve	
  or	
  posiFve?
                        • 	
  What	
  is	
  the	
  health	
  of	
  my	
  brand	
  online?
                        Do	
  we	
  understand	
  context	
  and	
  content?



Friday, July 27, 2012
Turning	
  the	
  three	
  Vs	
  of	
  Big	
  Data	
  into	
  
                                                 Value
                        Do	
  we	
  understand	
  what	
  each	
  individual	
  is	
  trying	
  to	
  
                          achieve?
                        • What	
  is	
  user	
  intent?
                        • Cri,cal	
  in	
  mone,za,on,	
  adver,sing,	
  etc…
                        Do	
  we	
  understand	
  what	
  a	
  community’s	
  sen:ment	
  is?
                        • 	
   What	
  is	
  the	
  emoFon?
                        • 	
  Is	
  it	
  negaFve	
  or	
  posiFve?
                        • 	
  What	
  is	
  the	
  health	
  of	
  my	
  brand	
  online?
                        Do	
  we	
  understand	
  context	
  and	
  content?
                        • 	
   What	
  are	
  appropriate	
  ads?


Friday, July 27, 2012
Turning	
  the	
  three	
  Vs	
  of	
  Big	
  Data	
  into	
  
                                                 Value
                        Do	
  we	
  understand	
  what	
  each	
  individual	
  is	
  trying	
  to	
  
                          achieve?
                        • What	
  is	
  user	
  intent?
                        • Cri,cal	
  in	
  mone,za,on,	
  adver,sing,	
  etc…
                        Do	
  we	
  understand	
  what	
  a	
  community’s	
  sen:ment	
  is?
                        • 	
   What	
  is	
  the	
  emoFon?
                        • 	
  Is	
  it	
  negaFve	
  or	
  posiFve?
                        • 	
  What	
  is	
  the	
  health	
  of	
  my	
  brand	
  online?
                        Do	
  we	
  understand	
  context	
  and	
  content?
                        • 	
   What	
  are	
  appropriate	
  ads?
                        • 	
  Is	
  it	
  Ok	
  to	
  associate	
  my	
  brand	
  with	
  this	
  content?

Friday, July 27, 2012
Turning	
  the	
  three	
  Vs	
  of	
  Big	
  Data	
  into	
  
                                                 Value
                        Do	
  we	
  understand	
  what	
  each	
  individual	
  is	
  trying	
  to	
  
                          achieve?
                        • What	
  is	
  user	
  intent?
                        • Cri,cal	
  in	
  mone,za,on,	
  adver,sing,	
  etc…
                        Do	
  we	
  understand	
  what	
  a	
  community’s	
  sen:ment	
  is?
                        • 	
   What	
  is	
  the	
  emoFon?
                        • 	
  Is	
  it	
  negaFve	
  or	
  posiFve?
                        • 	
  What	
  is	
  the	
  health	
  of	
  my	
  brand	
  online?
                        Do	
  we	
  understand	
  context	
  and	
  content?
                        • 	
   What	
  are	
  appropriate	
  ads?
                        • 	
  Is	
  it	
  Ok	
  to	
  associate	
  my	
  brand	
  with	
  this	
  content?
                        • Is	
  content	
  sad?,	
  happy?,	
  serious?,	
  informaFve?
Friday, July 27, 2012
Case	
  Studies
                          Yahoo!	
  Big	
  Data




                                                  18
Friday, July 27, 2012
Yahoo!	
  –	
  One	
  of	
  Largest	
  Des:na:ons	
  
                          on	
  the	
  Web
                                                                                                                     More people visited
                                                                                                                        Yahoo! in the past
                                                                                                                        month than:
    80%	
  of	
  the	
  U.S.	
  Internet	
  popula4on	
  uses	
  Yahoo!	
                                            • Use coupons
    –	
  Over	
  600	
  million	
  users	
  per	
  month	
  globally!                                                • Vote
    • Global	
  network	
  of	
  content,	
  commerce,	
  media,	
  search	
  and	
  access	
                        • Recycle
      products                                                                                                       • Exercise regularly
    • 100+	
  properFes	
  including	
  mail,	
  TV,	
  news,	
  shopping,	
  finance,	
  autos,	
                    • Have children
      travel,	
  games,	
  movies,	
  health,	
  etc.                                                                  living at home
                                                                                                                     • Wear sunscreen
    • 25+	
  terabytes	
  of	
  data	
  collected	
  each	
  day                                                       regularly
          • Represen:ng	
  1000’s	
  of	
  cataloged	
  consumer	
  behaviors

      Data is used to develop content, consumer, category and campaign insights for
      our key content partners and large advertisers

                                Sources: Mediamark Research, Spring 2004 and comScore Media Metrix, February 2005.


Friday, July 27, 2012
Yahoo!	
  Big	
  Data	
  –	
  A	
  league	
  of	
  its	
  own…




                         GRAND CHALLENGE PROBLEMS OF DATA PROCESSING
                        TRAVEL, CREDIT CARD PROCESSING, STOCK EXCHANGE, RETAIL, INTERNET

                          Y! Data Challenge Exceeds others by 2 orders of magnitude



Friday, July 27, 2012
Behavioral	
  Targe:ng	
  (BT)




Friday, July 27, 2012
Behavioral	
  Targe:ng	
  (BT)




                                           Targe:ng	
  your	
  ads	
  to	
  consumers	
  
                                             whose	
  recent	
  behaviors	
  online	
  
                                                 indicate	
  that	
  your	
  product	
  
                                               category	
  is	
  relevant	
  to	
  them

Friday, July 27, 2012
Yahoo!	
  User	
  DNA




    • On a per consumer basis: maintain a
      behavioral/interests profile and profitability (user
      value and LTV) metrics
Friday, July 27, 2012
How	
  it	
  works	
  |	
  Network	
  +	
  Interests	
  +	
  Modelling




Friday, July 27, 2012
How	
  it	
  works	
  |	
  Network	
  +	
  Interests	
  +	
  Modelling
                                                          Analyze predictive patterns for purchase
                                                          cycles in over 100 product categories
                        Varying Product Purchase Cycles




Friday, July 27, 2012
How	
  it	
  works	
  |	
  Network	
  +	
  Interests	
  +	
  Modelling
                                                    Analyze predictive patterns for purchase
                                                    cycles in over 100 product categories
                        Match Users to the Models


                                                    In each category, build models to describe
                                                    behaviour most likely to lead to an ad
                                                    response (i.e. click).




Friday, July 27, 2012
How	
  it	
  works	
  |	
  Network	
  +	
  Interests	
  +	
  Modelling
                                                   Analyze predictive patterns for purchase
                                                   cycles in over 100 product categories
                        Rewarding Good Behaviour


                                                   In each category, build models to describe
                                                   behaviour most likely to lead to an ad
                                                   response (i.e. click).


                                                   Score each user for fit with every category…
                                                   daily.




Friday, July 27, 2012
How	
  it	
  works	
  |	
  Network	
  +	
  Interests	
  +	
  Modelling
                                                       Analyze predictive patterns for purchase
                                                       cycles in over 100 product categories
                        Identify Most Relevant Users


                                                       In each category, build models to describe
                                                       behaviour most likely to lead to an ad
                                                       response (i.e. click).


                                                       Score each user for fit with every category…
                                                       daily.


                                                       Target ads to users who get highest
                                                       ‘relevance’ scores in the targeting categories




Friday, July 27, 2012
Recency	
  MaTers,	
  So	
  Does	
  Intensity

            Active now…                …and with feeling




Friday, July 27, 2012
DifferenQaQon	
  |	
  Category	
  specific	
  modelling




Friday, July 27, 2012
DifferenQaQon	
  |	
  Category	
  specific	
  modelling




Friday, July 27, 2012
DifferenQaQon	
  |	
  Category	
  specific	
  modelling




Friday, July 27, 2012
DifferenQaQon	
  |	
  Category	
  specific	
  modelling




Friday, July 27, 2012
Automobile	
  Purchase	
  Intender	
  Example

    • A	
  test	
  ad-­‐campaign	
  with	
  a	
  major	
  Euro	
  automobile	
  
      manufacturer
          – Designed	
  a	
  test	
  that	
  served	
  the	
  same	
  ad	
  creaQve	
  to	
  test	
  and	
  
            control	
  groups	
  on	
  Yahoo
          – Success	
  metric:	
  performing	
  specific	
  acQons	
  on	
  Jaguar	
  website

    • Test	
  results:	
  900%	
  conversion	
  lij	
  vs.	
  control	
  group
          – Purchase	
  Intenders	
  were	
  9	
  Qmes	
  more	
  likely	
  to	
  configure	
  a	
  
            vehicle,	
  request	
  a	
  price	
  quote	
  or	
  locate	
  a	
  dealer	
  than	
  consumers	
  
            in	
  the	
  control	
  group
          – ~3x	
  higher	
  click	
  through	
  rates	
  vs.	
  control	
  group

Friday, July 27, 2012
Mortgage	
  Intender	
  Example
                                                     Results from a client campaign on Yahoo!

   Example: Mortgages                                                 Network

                                                                                                                            +626%
                                                                                                                           Conv Lift




   We found:                                                                              +122%
                                                                                         CTR Lift
   1,900,000 people
                                       Y! Finance Audience
                                                       Mortgage Loans Target
                                                                       Mortgage Loans Target
   looking for
   mortgage loans.                            Example search terms qualified for this target:

                                              Mortgages                  Home Loans Refinancing                             Ditech

                                              Example Yahoo! Pages visited:
                                              Financing section in Real Estate
                                              Mortgage Loans area in Finance
                                              Real Estate section in Yellow Pages


                                        Source: Campaign Click thru Rate lift is determined by Yahoo! Internal research.
                                        Conversion is the number of qualified leads from clicks over number of
                                        impressions served. Audience size represents the audience within this
                                        behavioral interest category that has the highest propensity to engage
   Date: March 2006                     with a brand or product and to click on an offer.


Friday, July 27, 2012
Experience summary at Yahoo!
             • Dealing with the largest data source in the
               world (25 Terabyte per day)
             • BT business was grown from $20M to about
               $500M in 3 years of investment!
             • Building the largest database systems:
                    – World’s largest Oracle data warehouse
                    – World’s largest single DB
                    – Over 300 data mart data
                    – Analytics with thousands of KPI’s
                    – Over 5000 users of reports
                    – Largest targeting system in the world
             • Big demands for grid computing (Hadoop)
Friday, July 27, 2012
Social	
  Network
                        Social	
  Network	
  Marke:ng
                        Understanding	
  Context	
  for	
  
                                      Ads


Friday, July 27, 2012
Case Study: TWITTER Social Marketing?
                    Viacom’s VH1 Twitter campaign on ANVIL (the movie)
                              (week of May 14th , 2009 – see AdAge article)

          Marketing ANVIL movie
          Very niche audience, how do you reach them?




Friday, July 27, 2012
Case Study: TWITTER Social Marketing?
                    Viacom’s VH1 Twitter campaign on ANVIL (the movie)
                                 (week of May 14th , 2009 – see AdAge article)

          Marketing ANVIL movie
          Very niche audience, how do you reach them?

                 Diffusion 	

                  Give 20 free DVD’s to major related artists/groups, ask them
                 To notify Twitter groups – reached over 2M people




Friday, July 27, 2012
Case Study: TWITTER Social Marketing?
                    Viacom’s VH1 Twitter campaign on ANVIL (the movie)
                                 (week of May 14th , 2009 – see AdAge article)

          Marketing ANVIL movie
          Very niche audience, how do you reach them?

                 Diffusion 	

                  Give 20 free DVD’s to major related artists/groups, ask them
                 To notify Twitter groups – reached over 2M people
                 Social Identity: power of word of mouth...
                 What is the Cost to VH-1?
                 Compare with traditional approach: TV commercials to promote a documentary
                 film?




Friday, July 27, 2012
The	
  Display	
  Ads	
  Challenge	
  Today
       Damaging to Brand                        Irrelevant and Damaging to Brand




                                                Mismatch




                        Completely Irrelevant




Friday, July 27, 2012
NetSeer:	
  Intent	
  for	
  Display
  • And	
  Currently	
  Processing	
  4	
  Billion	
  Impressions	
  per	
  
    Day




Friday, July 27, 2012
NetSeer:	
  Intent	
  for	
  Display
  • And	
  Currently	
  Processing	
  4	
  Billion	
  Impressions	
  per	
  
    Day




Friday, July 27, 2012
NetSeer:	
  Intent	
  for	
  Display
  • And	
  Currently	
  Processing	
  4	
  Billion	
  Impressions	
  per	
  
    Day




Friday, July 27, 2012
NetSeer:	
  Intent	
  for	
  Display
  • And	
  Currently	
  Processing	
  4	
  Billion	
  Impressions	
  per	
  
    Day




Friday, July 27, 2012
Problem:	
  Hard	
  to	
  Understand	
  User	
  Intent
                        Contextual	
  Ad	
  served	
  by	
  Google




Friday, July 27, 2012
Problem:	
  Hard	
  to	
  Understand	
  User	
  Intent
                        Contextual	
  Ad	
  served	
  by	
  Google   What	
  NetSeer	
  Sees:	
  




Friday, July 27, 2012
Specialized	
  Search	
  through	
  Big	
  Data	
  
                         AnalyFcs	
  over	
  the	
  Offers	
  Universe




Friday, July 27, 2012
Chaos	
  for	
  
         Consumers	
  




                             Consumer




Friday, July 27, 2012
Chaos	
  for	
  
         Consumers	
  


                                              Deep
                                            Discount
                                              Sites        Coupon
                               Flash
                                                            Sites
                               Sale
                               Sites
                                                                        Loyalty
                                                                       Program
                                                                          s
                             Daily
                             Deals
                             Sites
                                             Consumer                Online
                                                                    Loyalty
                                                                    Program
                                                        Social         s
                                       Search
                                                       Network
                                       Engines
                                                          s




Friday, July 27, 2012
The	
  Business	
  Model	
  behind	
  ChoozOn’s	
  Use	
  of	
  Big	
  
                                       Data




Friday, July 27, 2012
The	
  Business	
  Model	
  behind	
  ChoozOn’s	
  Use	
  of	
  Big	
  
                                       Data


                                Consumers
                                • Value from brands they love
                                • Tame the deal chaos




Friday, July 27, 2012
The	
  Business	
  Model	
  behind	
  ChoozOn’s	
  Use	
  of	
  Big	
  
                                       Data


                                Consumers
                                • Value from brands they love
                                • Tame the deal chaos




                               Marketers
                               • Reach targeted consumers
                               • Build loyalty
                               • Create brand evangelists



Friday, July 27, 2012
The	
  Business	
  Model	
  behind	
  ChoozOn’s	
  Use	
  of	
  Big	
  
                                       Data


                                Consumers
                                • Value from brands they love
                                • Tame the deal chaos




                               Marketers
                               • Reach targeted consumers
                               • Build loyalty
                               • Create brand evangelists



Friday, July 27, 2012
Carol’s	
  Consumer	
  Network




Friday, July 27, 2012
Carol’s	
  Consumer	
  Network

                        “Chozen”	
  Brands




Friday, July 27, 2012
Carol’s	
  Consumer	
  Network

                        “Chozen”	
  Brands               Interests	
  (Intent)




Friday, July 27, 2012
Carol’s	
  Consumer	
  Network

                        “Chozen”	
  Brands               Interests	
  (Intent)




                                                        Shopping	
  Pals




Friday, July 27, 2012
Carol’s	
  Consumer	
  Network

                        “Chozen”	
  Brands               Interests	
  (Intent)




                                                        Shopping	
  Pals
                             Deal	
  Clubs



Friday, July 27, 2012
What	
  Carol	
  Gets

                        Carol’s	
  Consumer	
  Network




Friday, July 27, 2012
What	
  Carol	
  Gets

                        Carol’s	
  Consumer	
  Network             The	
  Universe	
  of	
  Deals
                                                         Affiliate	
  Deals
                                                         Loyalty	
  Programs
                                                         Daily	
  Deals
                                                         Flash	
  Sales




                                               Intelligent Matching



                        A	
  Personal	
  Shopper	
  for	
  Deals

Friday, July 27, 2012
What	
  Carol	
  Gets

                        Carol’s	
  Consumer	
  Network             The	
  Universe	
  of	
  Deals
                                                         Affiliate	
  Deals                    1,600+	
  brands
                                                         Loyalty	
  Programs
                                                                                            100,000+	
  offers
                                                         Daily	
  Deals
                                                         Flash	
  Sales




                                               Intelligent Matching



                        A	
  Personal	
  Shopper	
  for	
  Deals

Friday, July 27, 2012
What	
  Carol	
  Gets

                        Carol’s	
  Consumer	
  Network             The	
  Universe	
  of	
  Deals
                                                         Affiliate	
  Deals                    1,600+	
  brands
                                                         Loyalty	
  Programs
                                                                                            100,000+	
  offers
                                                         Daily	
  Deals
                                                         Flash	
  Sales
                                                          PLUS her                            Inbox™


                                               Intelligent Matching



                        A	
  Personal	
  Shopper	
  for	
  Deals

Friday, July 27, 2012
Thank	
  You!	
  
                        TwiAer:	
  @choozon

                          nick@choozon.net
                         www.ChoozOn.com

Friday, July 27, 2012
Friday, July 27, 2012

More Related Content

What's hot

Time, Context and Causality in Recommender Systems
Time, Context and Causality in Recommender SystemsTime, Context and Causality in Recommender Systems
Time, Context and Causality in Recommender SystemsYves Raimond
 
Adopting OpenTelemetry
Adopting OpenTelemetryAdopting OpenTelemetry
Adopting OpenTelemetryVincent Behar
 
Introduction to Open Telemetry as Observability Library
Introduction to Open  Telemetry as Observability LibraryIntroduction to Open  Telemetry as Observability Library
Introduction to Open Telemetry as Observability LibraryTonny Adhi Sabastian
 
Imply at Apache Druid Meetup in London 1-15-20
Imply at Apache Druid Meetup in London 1-15-20Imply at Apache Druid Meetup in London 1-15-20
Imply at Apache Druid Meetup in London 1-15-20Jelena Zanko
 
Manage Microservices Chaos and Complexity with Observability
Manage Microservices Chaos and Complexity with ObservabilityManage Microservices Chaos and Complexity with Observability
Manage Microservices Chaos and Complexity with ObservabilityNGINX, Inc.
 
Stability Patterns for Microservices
Stability Patterns for MicroservicesStability Patterns for Microservices
Stability Patterns for Microservicespflueras
 
Improving go-git performance
Improving go-git performanceImproving go-git performance
Improving go-git performancesource{d}
 
8 aboriginal-ways-of-learning-factsheet
8 aboriginal-ways-of-learning-factsheet8 aboriginal-ways-of-learning-factsheet
8 aboriginal-ways-of-learning-factsheetClaudia Kopanovski
 
Accelerate Microservices Deployments with Automation
Accelerate Microservices Deployments with AutomationAccelerate Microservices Deployments with Automation
Accelerate Microservices Deployments with AutomationNGINX, Inc.
 
Adopting Open Telemetry as Distributed Tracer on your Microservices at Kubern...
Adopting Open Telemetry as Distributed Tracer on your Microservices at Kubern...Adopting Open Telemetry as Distributed Tracer on your Microservices at Kubern...
Adopting Open Telemetry as Distributed Tracer on your Microservices at Kubern...Tonny Adhi Sabastian
 
Building a Real-Time Gaming Analytics Service with Apache Druid
Building a Real-Time Gaming Analytics Service with Apache DruidBuilding a Real-Time Gaming Analytics Service with Apache Druid
Building a Real-Time Gaming Analytics Service with Apache DruidImply
 
Data Stream Processing with Apache Flink
Data Stream Processing with Apache FlinkData Stream Processing with Apache Flink
Data Stream Processing with Apache FlinkFabian Hueske
 
KCD-OpenTelemetry.pdf
KCD-OpenTelemetry.pdfKCD-OpenTelemetry.pdf
KCD-OpenTelemetry.pdfRui Liu
 
Tableau Conference 2018: Binging on Data - Enabling Analytics at Netflix
Tableau Conference 2018: Binging on Data - Enabling Analytics at NetflixTableau Conference 2018: Binging on Data - Enabling Analytics at Netflix
Tableau Conference 2018: Binging on Data - Enabling Analytics at NetflixBlake Irvine
 
Understand your system like never before with OpenTelemetry, Grafana, and Pro...
Understand your system like never before with OpenTelemetry, Grafana, and Pro...Understand your system like never before with OpenTelemetry, Grafana, and Pro...
Understand your system like never before with OpenTelemetry, Grafana, and Pro...LibbySchulze
 
THE STATE OF OPENTELEMETRY, DOTAN HOROVITS, Logz.io
THE STATE OF OPENTELEMETRY, DOTAN HOROVITS, Logz.ioTHE STATE OF OPENTELEMETRY, DOTAN HOROVITS, Logz.io
THE STATE OF OPENTELEMETRY, DOTAN HOROVITS, Logz.ioDevOpsDays Tel Aviv
 
Tutorial on Deep Learning in Recommender System, Lars summer school 2019
Tutorial on Deep Learning in Recommender System, Lars summer school 2019Tutorial on Deep Learning in Recommender System, Lars summer school 2019
Tutorial on Deep Learning in Recommender System, Lars summer school 2019Anoop Deoras
 
ALD for energy application - Lithium ion battery and fuel cells
ALD for energy application - Lithium ion battery and fuel cellsALD for energy application - Lithium ion battery and fuel cells
ALD for energy application - Lithium ion battery and fuel cellsLaurent Lecordier
 
Next Gen Monitoring with INT
Next Gen Monitoring with INTNext Gen Monitoring with INT
Next Gen Monitoring with INTMyNOG
 

What's hot (20)

Time, Context and Causality in Recommender Systems
Time, Context and Causality in Recommender SystemsTime, Context and Causality in Recommender Systems
Time, Context and Causality in Recommender Systems
 
Adopting OpenTelemetry
Adopting OpenTelemetryAdopting OpenTelemetry
Adopting OpenTelemetry
 
Introduction to Open Telemetry as Observability Library
Introduction to Open  Telemetry as Observability LibraryIntroduction to Open  Telemetry as Observability Library
Introduction to Open Telemetry as Observability Library
 
Imply at Apache Druid Meetup in London 1-15-20
Imply at Apache Druid Meetup in London 1-15-20Imply at Apache Druid Meetup in London 1-15-20
Imply at Apache Druid Meetup in London 1-15-20
 
Manage Microservices Chaos and Complexity with Observability
Manage Microservices Chaos and Complexity with ObservabilityManage Microservices Chaos and Complexity with Observability
Manage Microservices Chaos and Complexity with Observability
 
Stability Patterns for Microservices
Stability Patterns for MicroservicesStability Patterns for Microservices
Stability Patterns for Microservices
 
Improving go-git performance
Improving go-git performanceImproving go-git performance
Improving go-git performance
 
8 aboriginal-ways-of-learning-factsheet
8 aboriginal-ways-of-learning-factsheet8 aboriginal-ways-of-learning-factsheet
8 aboriginal-ways-of-learning-factsheet
 
Accelerate Microservices Deployments with Automation
Accelerate Microservices Deployments with AutomationAccelerate Microservices Deployments with Automation
Accelerate Microservices Deployments with Automation
 
Adopting Open Telemetry as Distributed Tracer on your Microservices at Kubern...
Adopting Open Telemetry as Distributed Tracer on your Microservices at Kubern...Adopting Open Telemetry as Distributed Tracer on your Microservices at Kubern...
Adopting Open Telemetry as Distributed Tracer on your Microservices at Kubern...
 
Building a Real-Time Gaming Analytics Service with Apache Druid
Building a Real-Time Gaming Analytics Service with Apache DruidBuilding a Real-Time Gaming Analytics Service with Apache Druid
Building a Real-Time Gaming Analytics Service with Apache Druid
 
Data Stream Processing with Apache Flink
Data Stream Processing with Apache FlinkData Stream Processing with Apache Flink
Data Stream Processing with Apache Flink
 
KCD-OpenTelemetry.pdf
KCD-OpenTelemetry.pdfKCD-OpenTelemetry.pdf
KCD-OpenTelemetry.pdf
 
Observability
ObservabilityObservability
Observability
 
Tableau Conference 2018: Binging on Data - Enabling Analytics at Netflix
Tableau Conference 2018: Binging on Data - Enabling Analytics at NetflixTableau Conference 2018: Binging on Data - Enabling Analytics at Netflix
Tableau Conference 2018: Binging on Data - Enabling Analytics at Netflix
 
Understand your system like never before with OpenTelemetry, Grafana, and Pro...
Understand your system like never before with OpenTelemetry, Grafana, and Pro...Understand your system like never before with OpenTelemetry, Grafana, and Pro...
Understand your system like never before with OpenTelemetry, Grafana, and Pro...
 
THE STATE OF OPENTELEMETRY, DOTAN HOROVITS, Logz.io
THE STATE OF OPENTELEMETRY, DOTAN HOROVITS, Logz.ioTHE STATE OF OPENTELEMETRY, DOTAN HOROVITS, Logz.io
THE STATE OF OPENTELEMETRY, DOTAN HOROVITS, Logz.io
 
Tutorial on Deep Learning in Recommender System, Lars summer school 2019
Tutorial on Deep Learning in Recommender System, Lars summer school 2019Tutorial on Deep Learning in Recommender System, Lars summer school 2019
Tutorial on Deep Learning in Recommender System, Lars summer school 2019
 
ALD for energy application - Lithium ion battery and fuel cells
ALD for energy application - Lithium ion battery and fuel cellsALD for energy application - Lithium ion battery and fuel cells
ALD for energy application - Lithium ion battery and fuel cells
 
Next Gen Monitoring with INT
Next Gen Monitoring with INTNext Gen Monitoring with INT
Next Gen Monitoring with INT
 

Viewers also liked

Project the travling
Project the travlingProject the travling
Project the travlingsaeed4u2ever
 
Trends: Results of the 2015 Inbound Marketing for Higher Education Survey
Trends: Results of the 2015 Inbound Marketing for Higher Education SurveyTrends: Results of the 2015 Inbound Marketing for Higher Education Survey
Trends: Results of the 2015 Inbound Marketing for Higher Education SurveyConverge Consulting
 
Indiopedia: a System to Acquire Legal Knowledge from the Crowd Using Games Te...
Indiopedia: a System to Acquire Legal Knowledge from the Crowd Using Games Te...Indiopedia: a System to Acquire Legal Knowledge from the Crowd Using Games Te...
Indiopedia: a System to Acquire Legal Knowledge from the Crowd Using Games Te...Instituto i3G
 
CapstonePaperFinal
CapstonePaperFinalCapstonePaperFinal
CapstonePaperFinalLeia Dolphy
 
Gamificação para o fortalecimento da cidadania: uma análise do Swapp mGov2
Gamificação para o fortalecimento da cidadania: uma análise do Swapp mGov2Gamificação para o fortalecimento da cidadania: uma análise do Swapp mGov2
Gamificação para o fortalecimento da cidadania: uma análise do Swapp mGov2Instituto i3G
 
Pulsar - Distributed pub/sub platform
Pulsar - Distributed pub/sub platformPulsar - Distributed pub/sub platform
Pulsar - Distributed pub/sub platformMatteo Merli
 
Apache con2016final
Apache con2016final Apache con2016final
Apache con2016final Salesforce
 
Rapid Prototyping in PySpark Streaming: The Thermodynamics of Docker Containe...
Rapid Prototyping in PySpark Streaming: The Thermodynamics of Docker Containe...Rapid Prototyping in PySpark Streaming: The Thermodynamics of Docker Containe...
Rapid Prototyping in PySpark Streaming: The Thermodynamics of Docker Containe...Richard Seymour
 
How to mea­sure and improve brain-based out­comes that mat­ter in health care
How to mea­sure and improve brain-based out­comes that mat­ter in health careHow to mea­sure and improve brain-based out­comes that mat­ter in health care
How to mea­sure and improve brain-based out­comes that mat­ter in health careSharpBrains
 
Spark Streaming the Industrial IoT
Spark Streaming the Industrial IoTSpark Streaming the Industrial IoT
Spark Streaming the Industrial IoTJim Haughwout
 
Enable breakthrough in Parkinson disease research- Ido Karavany-
Enable breakthrough in Parkinson disease research- Ido Karavany-Enable breakthrough in Parkinson disease research- Ido Karavany-
Enable breakthrough in Parkinson disease research- Ido Karavany-Spark Summit
 

Viewers also liked (14)

Affiliates brochure
Affiliates brochureAffiliates brochure
Affiliates brochure
 
Archivo6 drywall
Archivo6 drywallArchivo6 drywall
Archivo6 drywall
 
Project the travling
Project the travlingProject the travling
Project the travling
 
Trends: Results of the 2015 Inbound Marketing for Higher Education Survey
Trends: Results of the 2015 Inbound Marketing for Higher Education SurveyTrends: Results of the 2015 Inbound Marketing for Higher Education Survey
Trends: Results of the 2015 Inbound Marketing for Higher Education Survey
 
CV_Radoslavov_Rosen
CV_Radoslavov_RosenCV_Radoslavov_Rosen
CV_Radoslavov_Rosen
 
Indiopedia: a System to Acquire Legal Knowledge from the Crowd Using Games Te...
Indiopedia: a System to Acquire Legal Knowledge from the Crowd Using Games Te...Indiopedia: a System to Acquire Legal Knowledge from the Crowd Using Games Te...
Indiopedia: a System to Acquire Legal Knowledge from the Crowd Using Games Te...
 
CapstonePaperFinal
CapstonePaperFinalCapstonePaperFinal
CapstonePaperFinal
 
Gamificação para o fortalecimento da cidadania: uma análise do Swapp mGov2
Gamificação para o fortalecimento da cidadania: uma análise do Swapp mGov2Gamificação para o fortalecimento da cidadania: uma análise do Swapp mGov2
Gamificação para o fortalecimento da cidadania: uma análise do Swapp mGov2
 
Pulsar - Distributed pub/sub platform
Pulsar - Distributed pub/sub platformPulsar - Distributed pub/sub platform
Pulsar - Distributed pub/sub platform
 
Apache con2016final
Apache con2016final Apache con2016final
Apache con2016final
 
Rapid Prototyping in PySpark Streaming: The Thermodynamics of Docker Containe...
Rapid Prototyping in PySpark Streaming: The Thermodynamics of Docker Containe...Rapid Prototyping in PySpark Streaming: The Thermodynamics of Docker Containe...
Rapid Prototyping in PySpark Streaming: The Thermodynamics of Docker Containe...
 
How to mea­sure and improve brain-based out­comes that mat­ter in health care
How to mea­sure and improve brain-based out­comes that mat­ter in health careHow to mea­sure and improve brain-based out­comes that mat­ter in health care
How to mea­sure and improve brain-based out­comes that mat­ter in health care
 
Spark Streaming the Industrial IoT
Spark Streaming the Industrial IoTSpark Streaming the Industrial IoT
Spark Streaming the Industrial IoT
 
Enable breakthrough in Parkinson disease research- Ido Karavany-
Enable breakthrough in Parkinson disease research- Ido Karavany-Enable breakthrough in Parkinson disease research- Ido Karavany-
Enable breakthrough in Parkinson disease research- Ido Karavany-
 

Similar to THE 3V’S OF BIG DATA: VARIETY, VELOCITY, and VOLUME

THE 3V's OF BIG DATA: VARIETY, VELOCITY, AND VOLUME from Structure:Data 2012
THE 3V's OF BIG DATA: VARIETY, VELOCITY, AND VOLUME from Structure:Data 2012THE 3V's OF BIG DATA: VARIETY, VELOCITY, AND VOLUME from Structure:Data 2012
THE 3V's OF BIG DATA: VARIETY, VELOCITY, AND VOLUME from Structure:Data 2012Gigaom
 
Big Data Analytics: Applications and Opportunities in On-line Predictive Mode...
Big Data Analytics: Applications and Opportunities in On-line Predictive Mode...Big Data Analytics: Applications and Opportunities in On-line Predictive Mode...
Big Data Analytics: Applications and Opportunities in On-line Predictive Mode...BigMine
 
Oh! Session on Introduction to BIG Data
Oh! Session on Introduction to BIG DataOh! Session on Introduction to BIG Data
Oh! Session on Introduction to BIG DataPrakalp Agarwal
 
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data Torrent
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data TorrentSeagate: Sensor Overload! Taming The Raging Manufacturing Big Data Torrent
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data TorrentSeeling Cheung
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big DataRoi Blanco
 
Anexinet Big Data Solutions
Anexinet Big Data SolutionsAnexinet Big Data Solutions
Anexinet Big Data SolutionsMark Kromer
 
Big data and bi best practices slidedeck
Big data and bi best practices slidedeckBig data and bi best practices slidedeck
Big data and bi best practices slidedeckActian Corporation
 
Big Data and BI Best Practices
Big Data and BI Best PracticesBig Data and BI Best Practices
Big Data and BI Best PracticesYellowfin
 
An Overview of BigData
An Overview of BigDataAn Overview of BigData
An Overview of BigDataValarmathi V
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big DataIMC Institute
 
SQL vs NoSQL: Big Data Adoption & Success in the Enterprise
SQL vs NoSQL: Big Data Adoption & Success in the EnterpriseSQL vs NoSQL: Big Data Adoption & Success in the Enterprise
SQL vs NoSQL: Big Data Adoption & Success in the EnterpriseAnita Luthra
 
Analytic Platforms in the Real World with 451Research and Calpont_July 2012
Analytic Platforms in the Real World with 451Research and Calpont_July 2012Analytic Platforms in the Real World with 451Research and Calpont_July 2012
Analytic Platforms in the Real World with 451Research and Calpont_July 2012Calpont Corporation
 
Big Data Analytics Materials, Chapter: 1
Big Data Analytics Materials, Chapter: 1Big Data Analytics Materials, Chapter: 1
Big Data Analytics Materials, Chapter: 1RUHULAMINHAZARIKA
 
Architecting for Big Data: Trends, Tips, and Deployment Options
Architecting for Big Data: Trends, Tips, and Deployment OptionsArchitecting for Big Data: Trends, Tips, and Deployment Options
Architecting for Big Data: Trends, Tips, and Deployment OptionsCaserta
 
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization Denodo
 
Big data4businessusers
Big data4businessusersBig data4businessusers
Big data4businessusersBob Hardaway
 

Similar to THE 3V’S OF BIG DATA: VARIETY, VELOCITY, and VOLUME (20)

THE 3V's OF BIG DATA: VARIETY, VELOCITY, AND VOLUME from Structure:Data 2012
THE 3V's OF BIG DATA: VARIETY, VELOCITY, AND VOLUME from Structure:Data 2012THE 3V's OF BIG DATA: VARIETY, VELOCITY, AND VOLUME from Structure:Data 2012
THE 3V's OF BIG DATA: VARIETY, VELOCITY, AND VOLUME from Structure:Data 2012
 
Big Data Analytics: Applications and Opportunities in On-line Predictive Mode...
Big Data Analytics: Applications and Opportunities in On-line Predictive Mode...Big Data Analytics: Applications and Opportunities in On-line Predictive Mode...
Big Data Analytics: Applications and Opportunities in On-line Predictive Mode...
 
Oh! Session on Introduction to BIG Data
Oh! Session on Introduction to BIG DataOh! Session on Introduction to BIG Data
Oh! Session on Introduction to BIG Data
 
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data Torrent
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data TorrentSeagate: Sensor Overload! Taming The Raging Manufacturing Big Data Torrent
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data Torrent
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
 
Anexinet Big Data Solutions
Anexinet Big Data SolutionsAnexinet Big Data Solutions
Anexinet Big Data Solutions
 
Big data and bi best practices slidedeck
Big data and bi best practices slidedeckBig data and bi best practices slidedeck
Big data and bi best practices slidedeck
 
Big Data and BI Best Practices
Big Data and BI Best PracticesBig Data and BI Best Practices
Big Data and BI Best Practices
 
An Overview of BigData
An Overview of BigDataAn Overview of BigData
An Overview of BigData
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
 
SQL vs NoSQL: Big Data Adoption & Success in the Enterprise
SQL vs NoSQL: Big Data Adoption & Success in the EnterpriseSQL vs NoSQL: Big Data Adoption & Success in the Enterprise
SQL vs NoSQL: Big Data Adoption & Success in the Enterprise
 
Analytic Platforms in the Real World with 451Research and Calpont_July 2012
Analytic Platforms in the Real World with 451Research and Calpont_July 2012Analytic Platforms in the Real World with 451Research and Calpont_July 2012
Analytic Platforms in the Real World with 451Research and Calpont_July 2012
 
Big Data Analytics Materials, Chapter: 1
Big Data Analytics Materials, Chapter: 1Big Data Analytics Materials, Chapter: 1
Big Data Analytics Materials, Chapter: 1
 
Architecting for Big Data: Trends, Tips, and Deployment Options
Architecting for Big Data: Trends, Tips, and Deployment OptionsArchitecting for Big Data: Trends, Tips, and Deployment Options
Architecting for Big Data: Trends, Tips, and Deployment Options
 
Dw 07032018-dr pl pradhan
Dw 07032018-dr pl pradhanDw 07032018-dr pl pradhan
Dw 07032018-dr pl pradhan
 
Big data analytics
Big data analyticsBig data analytics
Big data analytics
 
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
 
Big Data
Big DataBig Data
Big Data
 
Big data ppt
Big data pptBig data ppt
Big data ppt
 
Big data4businessusers
Big data4businessusersBig data4businessusers
Big data4businessusers
 

More from Gigaom

Structure 2014 - The strategic value of the cloud - Joe Weinman
Structure 2014 - The strategic value of the cloud - Joe WeinmanStructure 2014 - The strategic value of the cloud - Joe Weinman
Structure 2014 - The strategic value of the cloud - Joe WeinmanGigaom
 
Structure 2014 - The right and wrong way to scale - Rackspace
Structure 2014 - The right and wrong way to scale - RackspaceStructure 2014 - The right and wrong way to scale - Rackspace
Structure 2014 - The right and wrong way to scale - RackspaceGigaom
 
Structure 2014 - The future of cloud computing survey results
Structure 2014 - The future of cloud computing survey resultsStructure 2014 - The future of cloud computing survey results
Structure 2014 - The future of cloud computing survey resultsGigaom
 
Structure 2014 - Launchpad Competition
Structure 2014 - Launchpad CompetitionStructure 2014 - Launchpad Competition
Structure 2014 - Launchpad CompetitionGigaom
 
Structure 2014 - Disrupting the data center - Intel sponsor workshop
Structure 2014 - Disrupting the data center - Intel sponsor workshopStructure 2014 - Disrupting the data center - Intel sponsor workshop
Structure 2014 - Disrupting the data center - Intel sponsor workshopGigaom
 
Structure 2014 - Cloud trends - Battery
Structure 2014 - Cloud trends - BatteryStructure 2014 - Cloud trends - Battery
Structure 2014 - Cloud trends - BatteryGigaom
 
Structure Data 2014: HOW MICRODATA CAN SAY A LOT ABOUT MACROECONOMICS, David ...
Structure Data 2014: HOW MICRODATA CAN SAY A LOT ABOUT MACROECONOMICS, David ...Structure Data 2014: HOW MICRODATA CAN SAY A LOT ABOUT MACROECONOMICS, David ...
Structure Data 2014: HOW MICRODATA CAN SAY A LOT ABOUT MACROECONOMICS, David ...Gigaom
 
Structure Data 2014: QLIK SPONSOR WORKSHOP: ANALYTICS THE WAY NATURE INTENDED...
Structure Data 2014: QLIK SPONSOR WORKSHOP: ANALYTICS THE WAY NATURE INTENDED...Structure Data 2014: QLIK SPONSOR WORKSHOP: ANALYTICS THE WAY NATURE INTENDED...
Structure Data 2014: QLIK SPONSOR WORKSHOP: ANALYTICS THE WAY NATURE INTENDED...Gigaom
 
Structure Data 2014: FIVE MYTHS ABOUT BIG DATA, Amit Bendov
Structure Data 2014: FIVE MYTHS ABOUT BIG DATA, Amit BendovStructure Data 2014: FIVE MYTHS ABOUT BIG DATA, Amit Bendov
Structure Data 2014: FIVE MYTHS ABOUT BIG DATA, Amit BendovGigaom
 
Structure Data 2014: AMID BILLIONS OF METRICS, YOUR SOFTWARE IS TRYING TO TEL...
Structure Data 2014: AMID BILLIONS OF METRICS, YOUR SOFTWARE IS TRYING TO TEL...Structure Data 2014: AMID BILLIONS OF METRICS, YOUR SOFTWARE IS TRYING TO TEL...
Structure Data 2014: AMID BILLIONS OF METRICS, YOUR SOFTWARE IS TRYING TO TEL...Gigaom
 
Structure Data 2014: SISENSE SPONSOR WORKSHOP: ON BEER, CHIPS AND DATA,
Structure Data 2014: SISENSE SPONSOR WORKSHOP: ON BEER, CHIPS AND DATA, Structure Data 2014: SISENSE SPONSOR WORKSHOP: ON BEER, CHIPS AND DATA,
Structure Data 2014: SISENSE SPONSOR WORKSHOP: ON BEER, CHIPS AND DATA, Gigaom
 
Structure Data 2014: INVERTING 80/20: BEYOND BESPOKE BIG DATA, Ari Gesher
Structure Data 2014: INVERTING 80/20: BEYOND BESPOKE BIG DATA, Ari GesherStructure Data 2014: INVERTING 80/20: BEYOND BESPOKE BIG DATA, Ari Gesher
Structure Data 2014: INVERTING 80/20: BEYOND BESPOKE BIG DATA, Ari GesherGigaom
 
Structure Data 2014: TRACKING A SOCCER GAME WITH BIG DATA, Chris Haddad
Structure Data 2014: TRACKING A SOCCER GAME WITH BIG DATA, Chris HaddadStructure Data 2014: TRACKING A SOCCER GAME WITH BIG DATA, Chris Haddad
Structure Data 2014: TRACKING A SOCCER GAME WITH BIG DATA, Chris HaddadGigaom
 
Structure Data 2014: TECH AGAINST HUMAN TRAFFICKING AND ILLICIT NETWORKS, Jus...
Structure Data 2014: TECH AGAINST HUMAN TRAFFICKING AND ILLICIT NETWORKS, Jus...Structure Data 2014: TECH AGAINST HUMAN TRAFFICKING AND ILLICIT NETWORKS, Jus...
Structure Data 2014: TECH AGAINST HUMAN TRAFFICKING AND ILLICIT NETWORKS, Jus...Gigaom
 
Structure Data 2014: DATA DRIVEN DESIGN AT FORMULA ONE SPEED, Geoff McGrath
Structure Data 2014: DATA DRIVEN DESIGN AT FORMULA ONE SPEED, Geoff McGrathStructure Data 2014: DATA DRIVEN DESIGN AT FORMULA ONE SPEED, Geoff McGrath
Structure Data 2014: DATA DRIVEN DESIGN AT FORMULA ONE SPEED, Geoff McGrathGigaom
 
Structure Data 2014: IS VIDEO BIG DATA?, Steve Russell
Structure Data 2014: IS VIDEO BIG DATA?, Steve RussellStructure Data 2014: IS VIDEO BIG DATA?, Steve Russell
Structure Data 2014: IS VIDEO BIG DATA?, Steve RussellGigaom
 
Structure Data 2014: BIG DATA ANALYTICS RE-INVENTED, Ryan Waite
Structure Data 2014: BIG DATA ANALYTICS RE-INVENTED, Ryan WaiteStructure Data 2014: BIG DATA ANALYTICS RE-INVENTED, Ryan Waite
Structure Data 2014: BIG DATA ANALYTICS RE-INVENTED, Ryan WaiteGigaom
 
How Data is Remaking E-commerce - from Roadmap 2013
How Data is Remaking E-commerce - from Roadmap 2013How Data is Remaking E-commerce - from Roadmap 2013
How Data is Remaking E-commerce - from Roadmap 2013Gigaom
 
25 Favorite Experiences in Tech - from Roadmap 2013
25 Favorite Experiences in Tech - from Roadmap 201325 Favorite Experiences in Tech - from Roadmap 2013
25 Favorite Experiences in Tech - from Roadmap 2013Gigaom
 
How Moore’s Law is Influencing Design - from Roadmap 2013
How Moore’s Law is Influencing Design - from Roadmap 2013How Moore’s Law is Influencing Design - from Roadmap 2013
How Moore’s Law is Influencing Design - from Roadmap 2013Gigaom
 

More from Gigaom (20)

Structure 2014 - The strategic value of the cloud - Joe Weinman
Structure 2014 - The strategic value of the cloud - Joe WeinmanStructure 2014 - The strategic value of the cloud - Joe Weinman
Structure 2014 - The strategic value of the cloud - Joe Weinman
 
Structure 2014 - The right and wrong way to scale - Rackspace
Structure 2014 - The right and wrong way to scale - RackspaceStructure 2014 - The right and wrong way to scale - Rackspace
Structure 2014 - The right and wrong way to scale - Rackspace
 
Structure 2014 - The future of cloud computing survey results
Structure 2014 - The future of cloud computing survey resultsStructure 2014 - The future of cloud computing survey results
Structure 2014 - The future of cloud computing survey results
 
Structure 2014 - Launchpad Competition
Structure 2014 - Launchpad CompetitionStructure 2014 - Launchpad Competition
Structure 2014 - Launchpad Competition
 
Structure 2014 - Disrupting the data center - Intel sponsor workshop
Structure 2014 - Disrupting the data center - Intel sponsor workshopStructure 2014 - Disrupting the data center - Intel sponsor workshop
Structure 2014 - Disrupting the data center - Intel sponsor workshop
 
Structure 2014 - Cloud trends - Battery
Structure 2014 - Cloud trends - BatteryStructure 2014 - Cloud trends - Battery
Structure 2014 - Cloud trends - Battery
 
Structure Data 2014: HOW MICRODATA CAN SAY A LOT ABOUT MACROECONOMICS, David ...
Structure Data 2014: HOW MICRODATA CAN SAY A LOT ABOUT MACROECONOMICS, David ...Structure Data 2014: HOW MICRODATA CAN SAY A LOT ABOUT MACROECONOMICS, David ...
Structure Data 2014: HOW MICRODATA CAN SAY A LOT ABOUT MACROECONOMICS, David ...
 
Structure Data 2014: QLIK SPONSOR WORKSHOP: ANALYTICS THE WAY NATURE INTENDED...
Structure Data 2014: QLIK SPONSOR WORKSHOP: ANALYTICS THE WAY NATURE INTENDED...Structure Data 2014: QLIK SPONSOR WORKSHOP: ANALYTICS THE WAY NATURE INTENDED...
Structure Data 2014: QLIK SPONSOR WORKSHOP: ANALYTICS THE WAY NATURE INTENDED...
 
Structure Data 2014: FIVE MYTHS ABOUT BIG DATA, Amit Bendov
Structure Data 2014: FIVE MYTHS ABOUT BIG DATA, Amit BendovStructure Data 2014: FIVE MYTHS ABOUT BIG DATA, Amit Bendov
Structure Data 2014: FIVE MYTHS ABOUT BIG DATA, Amit Bendov
 
Structure Data 2014: AMID BILLIONS OF METRICS, YOUR SOFTWARE IS TRYING TO TEL...
Structure Data 2014: AMID BILLIONS OF METRICS, YOUR SOFTWARE IS TRYING TO TEL...Structure Data 2014: AMID BILLIONS OF METRICS, YOUR SOFTWARE IS TRYING TO TEL...
Structure Data 2014: AMID BILLIONS OF METRICS, YOUR SOFTWARE IS TRYING TO TEL...
 
Structure Data 2014: SISENSE SPONSOR WORKSHOP: ON BEER, CHIPS AND DATA,
Structure Data 2014: SISENSE SPONSOR WORKSHOP: ON BEER, CHIPS AND DATA, Structure Data 2014: SISENSE SPONSOR WORKSHOP: ON BEER, CHIPS AND DATA,
Structure Data 2014: SISENSE SPONSOR WORKSHOP: ON BEER, CHIPS AND DATA,
 
Structure Data 2014: INVERTING 80/20: BEYOND BESPOKE BIG DATA, Ari Gesher
Structure Data 2014: INVERTING 80/20: BEYOND BESPOKE BIG DATA, Ari GesherStructure Data 2014: INVERTING 80/20: BEYOND BESPOKE BIG DATA, Ari Gesher
Structure Data 2014: INVERTING 80/20: BEYOND BESPOKE BIG DATA, Ari Gesher
 
Structure Data 2014: TRACKING A SOCCER GAME WITH BIG DATA, Chris Haddad
Structure Data 2014: TRACKING A SOCCER GAME WITH BIG DATA, Chris HaddadStructure Data 2014: TRACKING A SOCCER GAME WITH BIG DATA, Chris Haddad
Structure Data 2014: TRACKING A SOCCER GAME WITH BIG DATA, Chris Haddad
 
Structure Data 2014: TECH AGAINST HUMAN TRAFFICKING AND ILLICIT NETWORKS, Jus...
Structure Data 2014: TECH AGAINST HUMAN TRAFFICKING AND ILLICIT NETWORKS, Jus...Structure Data 2014: TECH AGAINST HUMAN TRAFFICKING AND ILLICIT NETWORKS, Jus...
Structure Data 2014: TECH AGAINST HUMAN TRAFFICKING AND ILLICIT NETWORKS, Jus...
 
Structure Data 2014: DATA DRIVEN DESIGN AT FORMULA ONE SPEED, Geoff McGrath
Structure Data 2014: DATA DRIVEN DESIGN AT FORMULA ONE SPEED, Geoff McGrathStructure Data 2014: DATA DRIVEN DESIGN AT FORMULA ONE SPEED, Geoff McGrath
Structure Data 2014: DATA DRIVEN DESIGN AT FORMULA ONE SPEED, Geoff McGrath
 
Structure Data 2014: IS VIDEO BIG DATA?, Steve Russell
Structure Data 2014: IS VIDEO BIG DATA?, Steve RussellStructure Data 2014: IS VIDEO BIG DATA?, Steve Russell
Structure Data 2014: IS VIDEO BIG DATA?, Steve Russell
 
Structure Data 2014: BIG DATA ANALYTICS RE-INVENTED, Ryan Waite
Structure Data 2014: BIG DATA ANALYTICS RE-INVENTED, Ryan WaiteStructure Data 2014: BIG DATA ANALYTICS RE-INVENTED, Ryan Waite
Structure Data 2014: BIG DATA ANALYTICS RE-INVENTED, Ryan Waite
 
How Data is Remaking E-commerce - from Roadmap 2013
How Data is Remaking E-commerce - from Roadmap 2013How Data is Remaking E-commerce - from Roadmap 2013
How Data is Remaking E-commerce - from Roadmap 2013
 
25 Favorite Experiences in Tech - from Roadmap 2013
25 Favorite Experiences in Tech - from Roadmap 201325 Favorite Experiences in Tech - from Roadmap 2013
25 Favorite Experiences in Tech - from Roadmap 2013
 
How Moore’s Law is Influencing Design - from Roadmap 2013
How Moore’s Law is Influencing Design - from Roadmap 2013How Moore’s Law is Influencing Design - from Roadmap 2013
How Moore’s Law is Influencing Design - from Roadmap 2013
 

Recently uploaded

Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsIgniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsSafe Software
 
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDEADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDELiveplex
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1DianaGray10
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxMatsuo Lab
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfinfogdgmi
 
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Adtran
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Brian Pichman
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6DianaGray10
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesDavid Newbury
 
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IES VE
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfDianaGray10
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPathCommunity
 
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...DianaGray10
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024D Cloud Solutions
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UbiTrack UK
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemAsko Soukka
 
20230202 - Introduction to tis-py
20230202 - Introduction to tis-py20230202 - Introduction to tis-py
20230202 - Introduction to tis-pyJamie (Taka) Wang
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.YounusS2
 

Recently uploaded (20)

Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsIgniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
 
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDEADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptx
 
20150722 - AGV
20150722 - AGV20150722 - AGV
20150722 - AGV
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdf
 
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6
 
201610817 - edge part1
201610817 - edge part1201610817 - edge part1
201610817 - edge part1
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond Ontologies
 
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation Developers
 
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystem
 
20230202 - Introduction to tis-py
20230202 - Introduction to tis-py20230202 - Introduction to tis-py
20230202 - Introduction to tis-py
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.
 

THE 3V’S OF BIG DATA: VARIETY, VELOCITY, and VOLUME

  • 1. THE 3V’S OF BIG DATA: VARIETY, VELOCIT and VOLUME SPEAKER: Nick Weir CEO ChoozON Friday, July 27, 2012
  • 2. Mining  Big  Data  and  the  3V’s: The  Business  and  Science  of  Big  Data Nick  Weir,  Ph.D.   CEO ChoozOn  Corpora,on Twi$er:  @usamaf March  21,  2012 GigaOM  Structure:  Data  Conference New  York,  NY Friday, July 27, 2012
  • 3. Outline • Big  Data  all  around  us • The  role  of  Data  Mining  and  Predic6ve   Analy6cs • On-­‐line  data  and  facts • Case  studies  from  mul6ple  ver6cals: 1. Yahoo!  Big  Data 2. Social  Network  Data 3. ChoozOn  Big  Data  from  offers Friday, July 27, 2012
  • 4. What  MaAers  in  the  Age  of  Big  Data?   1.Being  able  to  exploit  all  the  data  that  is  available   • not  just  what  you've  got  available   • what  you  can  acquire  and  use  to  enhance  your  acFons 2.  Prolifera:ng  analy:cs  throughout  your  organiza:on • make  every  part  of  your  business  smarter 3.  Driving  significant  business  value   • embedding  analyFcs  into  every  area  of  your  business   can  help  you  drive  top  line  revenues  and/or  boJom  line   cost  efficiencies   Friday, July 27, 2012
  • 6. Why  Big  Data? A  new  term,  with  associated  “Data  Scien:st”  posi:ons: Friday, July 27, 2012
  • 7. Why  Big  Data? A  new  term,  with  associated  “Data  Scien:st”  posi:ons: • Big  Data:  is  a  mix  of  structured,  semi-­‐structured,  and   unstructured  data: –Typically  breaks  barriers  for  tradi:onal  RDB  storage –Typically  breaks  limits  of  indexing  by  “rows” –Typically  requires  intensive  pre-­‐processing  before  each   query  to  extract  “some  structure”  –  usually  using  Map-­‐ Reduce  type  opera:ons Friday, July 27, 2012
  • 8. Why  Big  Data? A  new  term,  with  associated  “Data  Scien:st”  posi:ons: • Big  Data:  is  a  mix  of  structured,  semi-­‐structured,  and   unstructured  data: –Typically  breaks  barriers  for  tradi:onal  RDB  storage –Typically  breaks  limits  of  indexing  by  “rows” –Typically  requires  intensive  pre-­‐processing  before  each   query  to  extract  “some  structure”  –  usually  using  Map-­‐ Reduce  type  opera:ons • Above  leads  to  “messy”  situa6ons  with  no  standard   recipes  or  architecture:  hence  the  need  for  “data   scien6sts”   –Conduct  “Data  Expedi:ons”   Friday, July 27, 2012
  • 9. What  Makes  Data  “Big  Data”? Friday, July 27, 2012
  • 10. What  Makes  Data  “Big  Data”? • Big  Data  is  Characterized  by  the  3-­‐V’s: –Volume:  larger  than  “normal”  –  challenging  to  load  and   process • Expensive  to  do  ETL • Expensive  to  figure  out  how  to  index  and  retrieve • MulQple  dimensions  that  are  “key” –Velocity:  Rate  of  arrival  poses  real-­‐,me  constraints  on  what   are  typically  “batch  ETL”  opera,ons • If  you  fall  behind  catching  up  is  extremely  expensive  (replicate  very  expensive   systems) • Must  keep  up  with  rate  and  service  queries  on-­‐the-­‐fly –Variety:  Mix  of  data  types  and  varying  degrees  of  structure • Non-­‐standard  schema • Lots  of  BLOB’s  and  CLOB’s • DB  queries  don’t  know  what  to  do  with  semi-­‐structured  and  unstructured   Friday, July 27, 2012
  • 11. The  DisQncQon  between  “Data”  and  “Big   Data”  is  fast  disappearing • Most  real  data  sets  nowadays  come  with  a  serious  mix  of  semi-­‐ structured  and  unstructured  components: – Images – Video – Text  descrip:ons  and  news,  blogs,  etc… – User  and  customer  commentary – Reac:ons  on  social  media:  e.g.  TwiTer  is  a  mix  of  data  anyway • Using  standard  transforms,  enQty  extracQon,  and  new   generaQon  tools  to  transform  unstructured  raw  data  into  semi-­‐ structured  analyzable  data   • Hadoop  vs.  Not  Hadoop  -­‐    when  to  use  what  kind  of  techniques   requiring  Map-­‐Reduce  and  grid  compuQng Friday, July 27, 2012
  • 12. Text  Data:  The  Big  Driver • While  we  speak  of  “big  data”  and  the  “Variety”  in  3-­‐ V’s • Reality:  biggest  driver  of  growth  of  Big  Data  has  been   text  data • In  fact  Map-­‐Reduce  became  popularized  by  Google  to   address  the  problem  of  processing  large  amounts  of   text  data:   – Indexing  a  full  copy  of  the  web – Frequent  re-­‐indexing – Many  operaQons  with  each  being  a  simple  operaQon  but   Friday, July 27, 2012
  • 13. To  Hadoop  or  not  to  Hadoop? When  to  use  techniques  requiring  Map-­‐Reduce  and  grid  compu?ng? • Typically  organizaFons  try  to  use  Map-­‐Reduce  for  everything  to  do  with   Big  Data – Inefficient  and  oIen  irra6onal – Certain  opera6ons  require  specialized  storage • UpdaFng  segment  memberships  over  large  numbers  of  users • Defining  new  segments  on  user  or  usage  data • Map-­‐Reduce  is  useful  when  a  very  simple  operaFon  is  to  be  applied  on   a  large  body  of  unstructured  data – Typically  this  is  during  en6ty  and  aAribute  extrac6on – S6ll  need  Big  Data  analysis  post-­‐Hadoop • Map-­‐Reduce  is  not  efficient  or  effecFve  for  tasks  involving  deeper   staFsFcal  modeling –  Good  for  gathering  counts  and  simple  (sufficient)  sta6s6cs • E.g.  how  many  Fmes  a  keyword  occurs,  quick  aggregaFon  of  simple  facts  in  unstructured   data,  esFmates  of  variances,  density,  etc… – Mostly  pre-­‐processing  for  Data  Mining Friday, July 27, 2012
  • 14. Hadoop  Use  Cases  by  Data  Type Friday, July 27, 2012
  • 17. Big  Data  Applica:ons  and  Uses 100%  Capture IT  Log  &  Security   Forensics  &  AnalyFcs Find  New  Signal Predict  Events Product  Planning Automated  Device  Data Failure  Analysis AnalyFcs ProacFve  Fixes RecommendaQon AdverFsing SegmentaQon AnalyFcs Social  Media Hadoop  +    MPP  +  EDW Big  Data Cost  ReducQon Ad  Hoc  Insight Warehouse  AnalyFcs PredicQve  AnalyQcs Friday, July 27, 2012
  • 18. So  this  Internet  thing  is  going  to   be  big! Big  Opportunity,  Big  Data,  Big   Challenges! Friday, July 27, 2012
  • 19. Stats  about  on-­‐line  usage Friday, July 27, 2012
  • 20. Stats  about  on-­‐line  usage • How  many  people  are  on-­‐line  today?   –2.1  Billion  (per  Comscore  es:mates) –30%  of  world  Popula:on Friday, July 27, 2012
  • 21. Stats  about  on-­‐line  usage • How  many  people  are  on-­‐line  today?   –2.1  Billion  (per  Comscore  es:mates) –30%  of  world  Popula:on • How  much  Fme  is  spent  on-­‐line  per  month  by  the   whole  world? –4M  person-­‐years  per  month Friday, July 27, 2012
  • 22. Stats  about  on-­‐line  usage • How  many  people  are  on-­‐line  today?   –2.1  Billion  (per  Comscore  es:mates) –30%  of  world  Popula:on • How  much  Fme  is  spent  on-­‐line  per  month  by  the   whole  world? –4M  person-­‐years  per  month • How  many  hours  per  month  per  Internet  User? –16  hours  (global  average) –32  hours  (U.S.  Average) Friday, July 27, 2012
  • 23. Stats  about  on-­‐line  usage • How  many  people  are  on-­‐line  today?   –2.1  Billion  (per  Comscore  es:mates) –30%  of  world  Popula:on • How  much  Fme  is  spent  on-­‐line  per  month  by  the   whole  world? –4M  person-­‐years  per  month • How  many  hours  per  month  per  Internet  User? –16  hours  (global  average) –32  hours  (U.S.  Average) *Sources: Feb.2012 - from Go-Gulf.com compiled from Comscoredatamine.com, Nielsen.com, thisDigitalLife.com, PewInternet.org Friday, July 27, 2012
  • 25. Volume,  Velocity,  Variety • Google:    How  many  queries  per  day? Friday, July 27, 2012
  • 26. Volume,  Velocity,  Variety • Google:    How  many  queries  per  day? – More  than  1  Billion Friday, July 27, 2012
  • 27. Volume,  Velocity,  Variety • Google:    How  many  queries  per  day? – More  than  1  Billion • Twi$er:  How  many  Tweets/day? Friday, July 27, 2012
  • 28. Volume,  Velocity,  Variety • Google:    How  many  queries  per  day? – More  than  1  Billion • Twi$er:  How  many  Tweets/day? – More  than  250M Friday, July 27, 2012
  • 29. Volume,  Velocity,  Variety • Google:    How  many  queries  per  day? – More  than  1  Billion • Twi$er:  How  many  Tweets/day? – More  than  250M • Facebook:  Updates  per  day? Friday, July 27, 2012
  • 30. Volume,  Velocity,  Variety • Google:    How  many  queries  per  day? – More  than  1  Billion • Twi$er:  How  many  Tweets/day? – More  than  250M • Facebook:  Updates  per  day? – More  than  800M Friday, July 27, 2012
  • 31. Volume,  Velocity,  Variety • Google:    How  many  queries  per  day? – More  than  1  Billion • Twi$er:  How  many  Tweets/day? – More  than  250M • Facebook:  Updates  per  day? – More  than  800M • YouTube:  Views/day Friday, July 27, 2012
  • 32. Volume,  Velocity,  Variety • Google:    How  many  queries  per  day? – More  than  1  Billion • Twi$er:  How  many  Tweets/day? – More  than  250M • Facebook:  Updates  per  day? – More  than  800M • YouTube:  Views/day – 4  Billion  views – 60  hours  of  video  uploaded  every  minute! Friday, July 27, 2012
  • 33. Volume,  Velocity,  Variety • Google:    How  many  queries  per  day? – More  than  1  Billion • Twi$er:  How  many  Tweets/day? – More  than  250M • Facebook:  Updates  per  day? – More  than  800M • YouTube:  Views/day – 4  Billion  views – 60  hours  of  video  uploaded  every  minute! • Social  Networks:  users  who  have  used  sites  for  spying   on  their  partners? –56% Friday, July 27, 2012
  • 34. Volume,  Velocity,  Variety • Google:    How  many  queries  per  day? – More  than  1  Billion • Twi$er:  How  many  Tweets/day? – More  than  250M • Facebook:  Updates  per  day? – More  than  800M • YouTube:  Views/day – 4  Billion  views – 60  hours  of  video  uploaded  every  minute! • Social  Networks:  users  who  have  used  sites  for  spying   on  their  partners? –56% *Sources: Feb.2012 - from Go-Gulf.com compiled from Comscoredatamine.com, Nielsen.com, thisDigitalLife.com, PewInternet.org Friday, July 27, 2012
  • 35. Turning  the  three  Vs  of  Big  Data  into   Value Friday, July 27, 2012
  • 36. Turning  the  three  Vs  of  Big  Data  into   Value Do  we  understand  what  each  individual  is  trying  to   achieve? Friday, July 27, 2012
  • 37. Turning  the  three  Vs  of  Big  Data  into   Value Do  we  understand  what  each  individual  is  trying  to   achieve? • What  is  user  intent? Friday, July 27, 2012
  • 38. Turning  the  three  Vs  of  Big  Data  into   Value Do  we  understand  what  each  individual  is  trying  to   achieve? • What  is  user  intent? • Cri,cal  in  mone,za,on,  adver,sing,  etc… Friday, July 27, 2012
  • 39. Turning  the  three  Vs  of  Big  Data  into   Value Do  we  understand  what  each  individual  is  trying  to   achieve? • What  is  user  intent? • Cri,cal  in  mone,za,on,  adver,sing,  etc… Do  we  understand  what  a  community’s  sen:ment  is? Friday, July 27, 2012
  • 40. Turning  the  three  Vs  of  Big  Data  into   Value Do  we  understand  what  each  individual  is  trying  to   achieve? • What  is  user  intent? • Cri,cal  in  mone,za,on,  adver,sing,  etc… Do  we  understand  what  a  community’s  sen:ment  is? •   What  is  the  emoFon? Friday, July 27, 2012
  • 41. Turning  the  three  Vs  of  Big  Data  into   Value Do  we  understand  what  each  individual  is  trying  to   achieve? • What  is  user  intent? • Cri,cal  in  mone,za,on,  adver,sing,  etc… Do  we  understand  what  a  community’s  sen:ment  is? •   What  is  the  emoFon? •  Is  it  negaFve  or  posiFve? Friday, July 27, 2012
  • 42. Turning  the  three  Vs  of  Big  Data  into   Value Do  we  understand  what  each  individual  is  trying  to   achieve? • What  is  user  intent? • Cri,cal  in  mone,za,on,  adver,sing,  etc… Do  we  understand  what  a  community’s  sen:ment  is? •   What  is  the  emoFon? •  Is  it  negaFve  or  posiFve? •  What  is  the  health  of  my  brand  online? Friday, July 27, 2012
  • 43. Turning  the  three  Vs  of  Big  Data  into   Value Do  we  understand  what  each  individual  is  trying  to   achieve? • What  is  user  intent? • Cri,cal  in  mone,za,on,  adver,sing,  etc… Do  we  understand  what  a  community’s  sen:ment  is? •   What  is  the  emoFon? •  Is  it  negaFve  or  posiFve? •  What  is  the  health  of  my  brand  online? Do  we  understand  context  and  content? Friday, July 27, 2012
  • 44. Turning  the  three  Vs  of  Big  Data  into   Value Do  we  understand  what  each  individual  is  trying  to   achieve? • What  is  user  intent? • Cri,cal  in  mone,za,on,  adver,sing,  etc… Do  we  understand  what  a  community’s  sen:ment  is? •   What  is  the  emoFon? •  Is  it  negaFve  or  posiFve? •  What  is  the  health  of  my  brand  online? Do  we  understand  context  and  content? •   What  are  appropriate  ads? Friday, July 27, 2012
  • 45. Turning  the  three  Vs  of  Big  Data  into   Value Do  we  understand  what  each  individual  is  trying  to   achieve? • What  is  user  intent? • Cri,cal  in  mone,za,on,  adver,sing,  etc… Do  we  understand  what  a  community’s  sen:ment  is? •   What  is  the  emoFon? •  Is  it  negaFve  or  posiFve? •  What  is  the  health  of  my  brand  online? Do  we  understand  context  and  content? •   What  are  appropriate  ads? •  Is  it  Ok  to  associate  my  brand  with  this  content? Friday, July 27, 2012
  • 46. Turning  the  three  Vs  of  Big  Data  into   Value Do  we  understand  what  each  individual  is  trying  to   achieve? • What  is  user  intent? • Cri,cal  in  mone,za,on,  adver,sing,  etc… Do  we  understand  what  a  community’s  sen:ment  is? •   What  is  the  emoFon? •  Is  it  negaFve  or  posiFve? •  What  is  the  health  of  my  brand  online? Do  we  understand  context  and  content? •   What  are  appropriate  ads? •  Is  it  Ok  to  associate  my  brand  with  this  content? • Is  content  sad?,  happy?,  serious?,  informaFve? Friday, July 27, 2012
  • 47. Case  Studies Yahoo!  Big  Data 18 Friday, July 27, 2012
  • 48. Yahoo!  –  One  of  Largest  Des:na:ons   on  the  Web More people visited Yahoo! in the past month than: 80%  of  the  U.S.  Internet  popula4on  uses  Yahoo!   • Use coupons –  Over  600  million  users  per  month  globally! • Vote • Global  network  of  content,  commerce,  media,  search  and  access   • Recycle products • Exercise regularly • 100+  properFes  including  mail,  TV,  news,  shopping,  finance,  autos,   • Have children travel,  games,  movies,  health,  etc. living at home • Wear sunscreen • 25+  terabytes  of  data  collected  each  day regularly • Represen:ng  1000’s  of  cataloged  consumer  behaviors Data is used to develop content, consumer, category and campaign insights for our key content partners and large advertisers Sources: Mediamark Research, Spring 2004 and comScore Media Metrix, February 2005. Friday, July 27, 2012
  • 49. Yahoo!  Big  Data  –  A  league  of  its  own… GRAND CHALLENGE PROBLEMS OF DATA PROCESSING TRAVEL, CREDIT CARD PROCESSING, STOCK EXCHANGE, RETAIL, INTERNET Y! Data Challenge Exceeds others by 2 orders of magnitude Friday, July 27, 2012
  • 51. Behavioral  Targe:ng  (BT) Targe:ng  your  ads  to  consumers   whose  recent  behaviors  online   indicate  that  your  product   category  is  relevant  to  them Friday, July 27, 2012
  • 52. Yahoo!  User  DNA • On a per consumer basis: maintain a behavioral/interests profile and profitability (user value and LTV) metrics Friday, July 27, 2012
  • 53. How  it  works  |  Network  +  Interests  +  Modelling Friday, July 27, 2012
  • 54. How  it  works  |  Network  +  Interests  +  Modelling Analyze predictive patterns for purchase cycles in over 100 product categories Varying Product Purchase Cycles Friday, July 27, 2012
  • 55. How  it  works  |  Network  +  Interests  +  Modelling Analyze predictive patterns for purchase cycles in over 100 product categories Match Users to the Models In each category, build models to describe behaviour most likely to lead to an ad response (i.e. click). Friday, July 27, 2012
  • 56. How  it  works  |  Network  +  Interests  +  Modelling Analyze predictive patterns for purchase cycles in over 100 product categories Rewarding Good Behaviour In each category, build models to describe behaviour most likely to lead to an ad response (i.e. click). Score each user for fit with every category… daily. Friday, July 27, 2012
  • 57. How  it  works  |  Network  +  Interests  +  Modelling Analyze predictive patterns for purchase cycles in over 100 product categories Identify Most Relevant Users In each category, build models to describe behaviour most likely to lead to an ad response (i.e. click). Score each user for fit with every category… daily. Target ads to users who get highest ‘relevance’ scores in the targeting categories Friday, July 27, 2012
  • 58. Recency  MaTers,  So  Does  Intensity Active now… …and with feeling Friday, July 27, 2012
  • 59. DifferenQaQon  |  Category  specific  modelling Friday, July 27, 2012
  • 60. DifferenQaQon  |  Category  specific  modelling Friday, July 27, 2012
  • 61. DifferenQaQon  |  Category  specific  modelling Friday, July 27, 2012
  • 62. DifferenQaQon  |  Category  specific  modelling Friday, July 27, 2012
  • 63. Automobile  Purchase  Intender  Example • A  test  ad-­‐campaign  with  a  major  Euro  automobile   manufacturer – Designed  a  test  that  served  the  same  ad  creaQve  to  test  and   control  groups  on  Yahoo – Success  metric:  performing  specific  acQons  on  Jaguar  website • Test  results:  900%  conversion  lij  vs.  control  group – Purchase  Intenders  were  9  Qmes  more  likely  to  configure  a   vehicle,  request  a  price  quote  or  locate  a  dealer  than  consumers   in  the  control  group – ~3x  higher  click  through  rates  vs.  control  group Friday, July 27, 2012
  • 64. Mortgage  Intender  Example Results from a client campaign on Yahoo! Example: Mortgages Network +626% Conv Lift We found: +122% CTR Lift 1,900,000 people Y! Finance Audience Mortgage Loans Target Mortgage Loans Target looking for mortgage loans. Example search terms qualified for this target: Mortgages Home Loans Refinancing Ditech Example Yahoo! Pages visited: Financing section in Real Estate Mortgage Loans area in Finance Real Estate section in Yellow Pages Source: Campaign Click thru Rate lift is determined by Yahoo! Internal research. Conversion is the number of qualified leads from clicks over number of impressions served. Audience size represents the audience within this behavioral interest category that has the highest propensity to engage Date: March 2006 with a brand or product and to click on an offer. Friday, July 27, 2012
  • 65. Experience summary at Yahoo! • Dealing with the largest data source in the world (25 Terabyte per day) • BT business was grown from $20M to about $500M in 3 years of investment! • Building the largest database systems: – World’s largest Oracle data warehouse – World’s largest single DB – Over 300 data mart data – Analytics with thousands of KPI’s – Over 5000 users of reports – Largest targeting system in the world • Big demands for grid computing (Hadoop) Friday, July 27, 2012
  • 66. Social  Network Social  Network  Marke:ng Understanding  Context  for   Ads Friday, July 27, 2012
  • 67. Case Study: TWITTER Social Marketing? Viacom’s VH1 Twitter campaign on ANVIL (the movie) (week of May 14th , 2009 – see AdAge article) Marketing ANVIL movie Very niche audience, how do you reach them? Friday, July 27, 2012
  • 68. Case Study: TWITTER Social Marketing? Viacom’s VH1 Twitter campaign on ANVIL (the movie) (week of May 14th , 2009 – see AdAge article) Marketing ANVIL movie Very niche audience, how do you reach them? Diffusion Give 20 free DVD’s to major related artists/groups, ask them To notify Twitter groups – reached over 2M people Friday, July 27, 2012
  • 69. Case Study: TWITTER Social Marketing? Viacom’s VH1 Twitter campaign on ANVIL (the movie) (week of May 14th , 2009 – see AdAge article) Marketing ANVIL movie Very niche audience, how do you reach them? Diffusion Give 20 free DVD’s to major related artists/groups, ask them To notify Twitter groups – reached over 2M people Social Identity: power of word of mouth... What is the Cost to VH-1? Compare with traditional approach: TV commercials to promote a documentary film? Friday, July 27, 2012
  • 70. The  Display  Ads  Challenge  Today Damaging to Brand Irrelevant and Damaging to Brand Mismatch Completely Irrelevant Friday, July 27, 2012
  • 71. NetSeer:  Intent  for  Display • And  Currently  Processing  4  Billion  Impressions  per   Day Friday, July 27, 2012
  • 72. NetSeer:  Intent  for  Display • And  Currently  Processing  4  Billion  Impressions  per   Day Friday, July 27, 2012
  • 73. NetSeer:  Intent  for  Display • And  Currently  Processing  4  Billion  Impressions  per   Day Friday, July 27, 2012
  • 74. NetSeer:  Intent  for  Display • And  Currently  Processing  4  Billion  Impressions  per   Day Friday, July 27, 2012
  • 75. Problem:  Hard  to  Understand  User  Intent Contextual  Ad  served  by  Google Friday, July 27, 2012
  • 76. Problem:  Hard  to  Understand  User  Intent Contextual  Ad  served  by  Google What  NetSeer  Sees:   Friday, July 27, 2012
  • 77. Specialized  Search  through  Big  Data   AnalyFcs  over  the  Offers  Universe Friday, July 27, 2012
  • 78. Chaos  for   Consumers   Consumer Friday, July 27, 2012
  • 79. Chaos  for   Consumers   Deep Discount Sites Coupon Flash Sites Sale Sites Loyalty Program s Daily Deals Sites Consumer Online Loyalty Program Social s Search Network Engines s Friday, July 27, 2012
  • 80. The  Business  Model  behind  ChoozOn’s  Use  of  Big   Data Friday, July 27, 2012
  • 81. The  Business  Model  behind  ChoozOn’s  Use  of  Big   Data Consumers • Value from brands they love • Tame the deal chaos Friday, July 27, 2012
  • 82. The  Business  Model  behind  ChoozOn’s  Use  of  Big   Data Consumers • Value from brands they love • Tame the deal chaos Marketers • Reach targeted consumers • Build loyalty • Create brand evangelists Friday, July 27, 2012
  • 83. The  Business  Model  behind  ChoozOn’s  Use  of  Big   Data Consumers • Value from brands they love • Tame the deal chaos Marketers • Reach targeted consumers • Build loyalty • Create brand evangelists Friday, July 27, 2012
  • 85. Carol’s  Consumer  Network “Chozen”  Brands Friday, July 27, 2012
  • 86. Carol’s  Consumer  Network “Chozen”  Brands Interests  (Intent) Friday, July 27, 2012
  • 87. Carol’s  Consumer  Network “Chozen”  Brands Interests  (Intent) Shopping  Pals Friday, July 27, 2012
  • 88. Carol’s  Consumer  Network “Chozen”  Brands Interests  (Intent) Shopping  Pals Deal  Clubs Friday, July 27, 2012
  • 89. What  Carol  Gets Carol’s  Consumer  Network Friday, July 27, 2012
  • 90. What  Carol  Gets Carol’s  Consumer  Network The  Universe  of  Deals Affiliate  Deals Loyalty  Programs Daily  Deals Flash  Sales Intelligent Matching A  Personal  Shopper  for  Deals Friday, July 27, 2012
  • 91. What  Carol  Gets Carol’s  Consumer  Network The  Universe  of  Deals Affiliate  Deals 1,600+  brands Loyalty  Programs 100,000+  offers Daily  Deals Flash  Sales Intelligent Matching A  Personal  Shopper  for  Deals Friday, July 27, 2012
  • 92. What  Carol  Gets Carol’s  Consumer  Network The  Universe  of  Deals Affiliate  Deals 1,600+  brands Loyalty  Programs 100,000+  offers Daily  Deals Flash  Sales PLUS her Inbox™ Intelligent Matching A  Personal  Shopper  for  Deals Friday, July 27, 2012
  • 93. Thank  You!   TwiAer:  @choozon nick@choozon.net www.ChoozOn.com Friday, July 27, 2012