Big Data represents an opportunity for organizations with data analysis needs. Companies need to prepare a number of functions to address the Big Data challenge.
The following presentation describes the Big Data landscape for marketing technology, introducing several applications, and describing the three key aspects a media agency must focus on when dealing with Big Data analysis applications.
5. #amecsummit
WiFi Access
User name:
Password:
Data Sources
Social Media
Highly unstructured / Heterogeneous
Fine grained (seconds, posts/events)
~ 1,200 M posts/year (1.56 TB)
~ 0.5 M social ad events/year (250 GB)
Digital Display Advertising
Structured / Heterogeneous
Fine grained (seconds, events)
~ 145,000 M ad serving events (11 TB)
Search Engine Marketing
Structured / Heterogeneous
Fine grained (seconds, events)
~ 191 M search events (20 GB)
Site Analytics
Structured / Heterogeneous
Data aggregated (daily, weekly, monthly)
~ 51 M records (11 GB)
Off-line Advertising
Structured / Highly Heterogeneous
Data aggregated (minutes, hourly, daily,
weekly, monthly, …)
Not So Big Data
(but must be integrated with the rest
of the data sources)
Customer data
(CRM, Sales, Visits to Store, …)
Structured? / Highly Heterogeneous
Fine grained and aggregated
(Must be integrated with the rest of
the data sources)
6. #amecsummit
Data Sources
Data Heterogeneity
From structured to unstructured formats
Non-standardised formats
Different levels of aggregation
From fine-grained posts and events
To time-series aggregated at different time periods (every second, hourly, daily, etc.)
Different data sizes
Big Data vs. “not so big” Data (both useful)
Data Integration is a challenge
Even for the same kinds of data (Twitter vs. Facebook, Google vs. Affiliate Networks)
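As an illustration of the different levels of aggregation listed above, the following minimal sketch turns fine-grained events into a daily time series. The function name `aggregate_daily` and the sample timestamps are assumptions made for this example, not part of the presentation.

```python
from collections import Counter
from datetime import datetime


def aggregate_daily(timestamps):
    """Count events per calendar day from ISO 8601 timestamp strings."""
    counts = Counter(
        datetime.fromisoformat(ts).date().isoformat() for ts in timestamps
    )
    # Return the days in chronological order.
    return dict(sorted(counts.items()))


# Hypothetical fine-grained events (one timestamp per post or ad event).
events = [
    "2013-06-05T09:15:02",
    "2013-06-05T17:40:11",
    "2013-06-06T08:01:59",
]
daily = aggregate_daily(events)
print(daily)
```

Once every source is reduced to the same period (here, days), fine-grained sources such as social media can be compared with sources that are only available in aggregated form.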
7. #amecsummit
Example Applications
Brand Reputation Monitoring
Market Research
Social CRM
Measuring the Performance of Communication
Supporting Digital Media Planning and Buying
8. #amecsummit
What are the 3 things that a company needs to focus on?
2. Alignment between technology and specific
customer needs
From data observations to actionable knowledge
Deliverables must go beyond basic metrics extracted from
social media monitoring
9. #amecsummit
SoA: Brand Reputation Monitoring
Obtain brand reputation KPIs and monitor their evolution over time
Brand Popularity
Volume of mentions of a brand in social media
Brand Valuation
Sentiment of the opinions mentioning the brand
Brand Attributes
Topics mentioned when talking about the brand
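The three KPIs above can be sketched as follows. This is a simplified illustration: it assumes each post already carries a sentiment label (-1, 0 or 1) and a list of topics produced by some upstream analysis, and it detects mentions by naive substring matching. The function `brand_kpis`, the brand "Acme" and the sample posts are all hypothetical.

```python
from collections import Counter


def brand_kpis(posts, brand):
    """Compute popularity, valuation and attributes for one brand.

    posts: list of dicts with 'text', 'sentiment' (-1, 0 or 1) and 'topics'.
    """
    # Naive mention detection: case-insensitive substring match (assumption).
    mentions = [p for p in posts if brand.lower() in p["text"].lower()]
    popularity = len(mentions)
    # Valuation as the mean sentiment of the mentions.
    valuation = (
        sum(p["sentiment"] for p in mentions) / popularity if mentions else 0.0
    )
    # Attributes as the most frequent topics among the mentions.
    attributes = Counter(t for p in mentions for t in p["topics"])
    return {
        "popularity": popularity,
        "valuation": valuation,
        "attributes": attributes.most_common(3),
    }


# Hypothetical, already-annotated posts.
posts = [
    {"text": "Acme has great prices", "sentiment": 1, "topics": ["price"]},
    {"text": "acme support was slow and pricey", "sentiment": -1,
     "topics": ["price", "service"]},
    {"text": "Nothing to do with the brand", "sentiment": 0, "topics": ["other"]},
]
kpis = brand_kpis(posts, "Acme")
print(kpis)
```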
11. #amecsummit
SoA: Brand Reputation Monitoring
So what?
Are these basic insights enough?
What is the status of my market by consumer
segment?
How can I focus CRM in social media on the appropriate
targets?
What is the performance of my advertising campaigns
on paid media?
12. #amecsummit
From brand reputation to market monitoring
McKinsey (2009). The Consumer Decision Journey
The Consumer Decision Journey + Socio-demographic Segmentation
[Figure: example charts segmenting consumers by gender, age, location and purchase intention (e.g., 80%/20% and 75%/25% splits)]
14. #amecsummit
Social CRM
Challenge
Handling thousands of brand consumers in social media
Community managers cannot handle every social media
user
Approach
Dealing with communities
Focusing communication on relevant consumers
Discovering information propagation paths
Big graph analysis techniques
Thousands of brand consumers (nodes)
Millions of relationships (friend of, follows, …)
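A toy version of the graph-based approach above might look like the sketch below. It uses connected components as a crude stand-in for community detection and follower counts as a crude proxy for relevance; real applications would need more refined graph analysis. The function names and the sample "follows" edges are assumptions made for this example.

```python
from collections import defaultdict


def communities(edges):
    """Group users into connected components (undirected view of the graph)."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, groups = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Depth-first traversal collecting everything reachable from start.
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(group)
    return groups


def top_followed(edges, k=1):
    """Rank users by number of followers; edge (a, b) means a follows b."""
    followers = defaultdict(int)
    for _, followed in edges:
        followers[followed] += 1
    # Sort by follower count (descending), then by name for stable output.
    return sorted(followers, key=lambda u: (-followers[u], u))[:k]


# Hypothetical "follows" relationships between brand consumers.
follows = [
    ("ana", "brandfan"),
    ("luis", "brandfan"),
    ("ana", "luis"),
    ("eva", "tom"),
]
groups = communities(follows)
leaders = top_followed(follows)
print(groups, leaders)
```

A community manager could then address each group through its most-followed member instead of contacting every user individually.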
15. #amecsummit
Measuring the Performance of Communication
Challenge
Measure the influence of paid media
on earned media
Approach
Generate time series for buzz and
advertising pressure
Detect events on time series
Find explanations for the events
Correlate buzz with advertising
pressure
Brand popularity
Campaign GRPs
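The correlation step above can be illustrated with a small sketch that compares an advertising-pressure series with a buzz series at several time lags, using the Pearson correlation coefficient. The choice of Pearson correlation, the function names and the sample data (constructed so that buzz follows GRPs by one period) are assumptions made for this example; the code also assumes neither series is constant.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def lagged_correlation(ads, buzz, lag):
    """Correlate advertising at time t with buzz at time t + lag (lag >= 0)."""
    if lag == 0:
        return pearson(ads, buzz)
    return pearson(ads[:-lag], buzz[lag:])


# Hypothetical daily series: campaign GRPs and brand mentions.
grps = [10, 40, 20, 5, 30, 15, 0]
mentions = [100, 120, 180, 140, 110, 160, 130]

# Pick the lag (0 to 2 periods) with the strongest correlation.
best_lag = max(range(3), key=lambda k: lagged_correlation(grps, mentions, k))
print(best_lag, lagged_correlation(grps, mentions, best_lag))
```

A high correlation at some lag is only a starting point: it still has to be checked against the detected events and their explanations before drawing conclusions about the campaign.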
17. #amecsummit
Cannot be automatized
Business objectives and
application success criteria
must be defined by the
Organization Strategist and
Customers
BIG Data application
requirements
Cross Industry Standard Process for Data Mining
Business Understanding
18. #amecsummit
Data Understanding
Cannot be automatized
Domain Experts and Data
Analysts
To become familiar with the data
To identify data quality problems
and solutions
To assess that the data is valid for
achieving business objectives
Cross Industry Standard Process for Data Mining
19. #amecsummit
Data preparation
Can be automatized
Examples:
Remove spam from content gathered
Filter content according to its
language
Filter content according to its context,
etc.
Application Developers
BIG Data Warehousing skills (e.g.,
HIVE, PIG, …)
AI skills (e.g., NLP)
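The data preparation examples above (spam removal and language filtering) could be automated along the lines of the following sketch. The spam markers and the stop-word lists are deliberately tiny, hypothetical heuristics chosen for illustration; a production pipeline would rely on trained NLP components instead.

```python
# Hypothetical spam phrases and stop-word lists, for illustration only.
SPAM_MARKERS = ("buy followers", "click here", "free iphone")
STOPWORDS = {
    "en": {"the", "and", "is", "of"},
    "es": {"el", "la", "y", "de", "es"},
}


def is_spam(text):
    """Flag a post as spam if it contains any known spam phrase."""
    lowered = text.lower()
    return any(marker in lowered for marker in SPAM_MARKERS)


def guess_language(text):
    """Guess the language by counting stop words; None if nothing matches."""
    words = set(text.lower().split())
    scores = {lang: len(words & stop) for lang, stop in STOPWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def prepare(posts, language):
    """Keep only non-spam posts written in the requested language."""
    return [
        p for p in posts
        if not is_spam(p) and guess_language(p) == language
    ]


posts = [
    "The service of this brand is great",
    "Click here for a FREE iPhone and the best deals",
    "El servicio de la marca es bueno",
]
clean = prepare(posts, "en")
print(clean)
```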
20. #amecsummit
Modeling
Can be automatized
Application Developers with expertise in
Data Mining and AI
Data mining frameworks
Machine learning classifiers
Clustering techniques
Statistical analysis tools
…
21. #amecsummit
Evaluation
Cannot be automatized
Data Scientists are required to validate
the models
Assessing the correctness of the model
Validating whether a correlation found implies
causality
Assessing that the sample used is
representative
…
22. #amecsummit
Deployment
Cannot be automatized
Knowledge obtained must be translated
into a final report aligned with business
for consumers
Data insights enhanced by Consultants'
work
From observations to decision support
From KPIs to recommendations
Editor's Notes
The presentation is structured according to the three key aspects that, in our opinion, a company must focus on. The first one is technology. In this section I have characterised the data sources and enumerated a set of applications exploiting them. Data sources are characterised by the huge volume of heterogeneous data they contain. Applications are characterised by the heavyweight data analysis processes they apply to the data extracted from such data sources.
Data sources used by marketing applications are diverse. Regarding data structure, they range from structured ones (e.g., off-line advertising) to highly unstructured ones (e.g., social media). Regarding the format of the data they contain, data sources are typically heterogeneous (e.g., Twitter and Facebook data formats). Regarding the granularity of the data, they range from sources with fine-grained data (e.g., at the level of impression or click in digital display advertising) to data aggregated monthly (e.g., off-line media audience studies). Regarding size, they range from Big Data sources with several TB (e.g., digital display advertising and social media) to small ones (e.g., off-line advertising).
Data Heterogeneity
From structured to unstructured formats
Formats non-standardised
Different levels of aggregation
From fine-grained posts and events
To time-series aggregated at different time periods (every second, hourly, daily, etc.)
Different data sizes
Big Data vs. “not so big” Data (both useful)
Data Integration is a challenge
Even for the same kinds of data (Twitter vs. Facebook, Google vs. Affiliate Networks)
Social media content: Twitter vs. Facebook
Digital display advertising: Google vs. Affiliate Networks
Different kinds of applications can be developed using the aforementioned data sources (individually or in combination). The first kind are Brand Reputation Monitoring applications, which make use of social media data. The second are Market Research applications, which typically combine different kinds of data sources, such as social media data for opinion studies and customer data (e.g., sales data). Another kind are Social CRM applications, which consist of managing relationships with customers in social media. The fourth group are applications for measuring the performance of communication, which also involve integrating heterogeneous data sources (e.g., online and offline advertising data sources with site analytics). Finally, a promising, less-explored group of applications are those that automatize the process of media planning by analysing the insights obtained from the marketing data sources (e.g., placing advertising on TV programs according to the engagement obtained from Social TV data sources).
The second key aspect a company should focus on regarding Big Data applications is the alignment between technology and specific customer needs: moving from enumerating facts extracted from data to using such facts for decision support. In our opinion, such needs go beyond basic social media monitoring.
Brand Reputation Monitoring applications are state-of-the-art applications of social media analysis, which capture a set of basic KPIs and monitor their evolution over time. Such indicators are typically:
Brand popularity (i.e., volume of mentions of a brand in social media)
Brand valuation (i.e., sentiment of the opinions mentioning the brand: positive, neutral or negative)
Brand attributes (i.e., topics mentioned when talking about the brand)
In our opinion, Brand Reputation Monitoring applications are not enough to capture the complexity of the analyses required for social media studies. Volume and sentiment are metrics that cannot explain the status of the market by themselves. In addition, such applications perform a poor segmentation of consumers (age, gender, income, place of residence). As these applications focus on opinions (not on customers), it is difficult to perform Social CRM with them. Finally, it is difficult for a marketing company to measure business-relevant KPIs, such as the performance of an advertising campaign. Next, we will explain several innovative applications being developed by Havas Media Group that try to go further by aligning the analysis processes with marketing-specific business cases.
The first example consists of monitoring the status of a market by obtaining the decision state of consumers regarding the acquisition of a product. Specifically, we analyse social media opinions about brands and align social media users with the stages of the Consumer Decision Journey. These stages are the following:
AWARENESS: The consumer considers an initial set of brands, based on brand perceptions and exposure to recent touch points
EVALUATION: Consumers add or subtract brands as they evaluate what they want
PURCHASE: Ultimately, the consumer selects a brand at the moment of purchase
POST-PURCHASE: After purchasing a product or service, the consumer builds expectations based on experience to inform the next decision journey
We combine the detection of the stage in the Consumer Decision Journey with a socio-demographic segmentation of social media users.
This slide presents example visualisations of the application being developed. The picture on the left reflects the volume of users at a particular stage, segmented by place of residence. The picture in the upper right corner shows the distribution of consumers by several dimensions (e.g., age, gender, sentiment, location, …). The picture in the lower right corner shows the evolution of the distribution of consumers by stage in the Consumer Decision Journey.
Another example application is Social CRM.
Challenge
Handling thousands of brand consumers in social media
Community managers cannot handle every social media user
Timelines are huge
Approach
Dealing with communities instead of dealing with consumers
Detecting clusters of consumers
Focusing communication on relevant consumers
Detecting user roles (opinion leaders, influencers, …)
Discovering information propagation paths
Big graph analysis techniques
Thousands of brand consumers (nodes)
Millions of relationships (friend of, follows, …)
The last application we want to show is one used for measuring the performance of communication.
Challenge
Measure how marketing campaigns (paid media) influence word of mouth (earned media)
Approach
Generate time series for buzz and advertising pressure
Apply time series data mining techniques
Detect events on time series
Find explanations for the detected events using trending topics
Correlate buzz and advertising time series (thousands of tests)
The third key aspect a company should focus on is human resources and methodology. In our opinion, preparing for Big Data applications is not only about technology. The appropriate skills must be found, and the appropriate workflow must be implemented.