Big Data: Opportunities, Strategies and Challenges
Executive Summary
Gregg Barrett
Acknowledgment
This report draws extensively on, and focuses on, the work and viewpoints of industry participants
including:
Diversity Limited
Economist Intelligence Unit
Gartner
HBR
Hortonworks
IBM
ITG
Intel
McKinsey
Ordnance Survey
John Standish Consulting
Christopher Bienko @ IBM
Dirk deRoos @ IBM
John Choi @ IBM
Marc Andrews @ IBM
Paul Zikopoulos @ IBM
Rick Buglio @ IBM
Strategy Meets Action
References are included in-text as well as in the References section at the end of the report.
Challenges facing the industry
Difficult and uncertain economic conditions, low interest rates, decreasing underwriting profitability,
higher combined ratios and low investment returns are placing insurers under stress. Insurers also
have to confront commoditisation of the business, more informed consumers, high customer churn
rates, new distribution channels and strong competition. As if this were not enough, natural perils,
increased regulatory intervention and greater demands for transparency by regulators, together
with ever-increasing compliance requirements, are placing immense strain on the capabilities of
insurers.
According to IBM (2013), to thrive in this environment insurers must gain a specific set of capabilities
that will allow them to:
- Build a customer-centric business model
- Find profitable ways to sustain growth
- Develop new, competitively priced products
- Increase claims efficiency and effectiveness
- Improve capital management and investment decisions
- Improve risk management and regulatory reporting
(IBM, 2013, pg. 2)
Insurers are turning to analytics
The business of insurance is based on analysing data to understand and evaluate risks. Two important
insurance professions, actuarial and underwriting, emerged at the beginning of the modern insurance
era in the 17th century. These both revolve around and are dependent upon the analysis of data.
(Strategy Meets Action, 2012, pg. 3)
While the insurance industry has long been recognized for analysing data, the new news involves the
overwhelming amount of data that is now available for analysis and the sophistication of the
technology tools that can be used to perform the analysis. The opportunities for advanced analysis
are many and the potential business impact is enormous.
(Strategy Meets Action, 2013, pg. 3)
The Concept of Big Data
In simple terms Big Data refers to a data environment that cannot be handled by traditional
technologies.
Big Data is often described in terms of the three V’s, and if you are at IBM, it is likely to be the four V’s.
Figure 1 below illustrates the IBM four V representation of Big Data:
Figure 1: Big Data in dimensions
Figure 1. Four dimensions of big data. Copyright 2012 by IBM. Reprinted with permission.
Volume refers to the quantity (gigabytes, terabytes, petabytes etc.) of data that organizations are
trying to harness. Importantly there is no specific measure of volume that defines Big Data, as what
constitutes truly “high” volume varies by industry and even geography. What is clear is that data
volumes continue to rise.
Variety refers to different types (forms) of data and data sources. Data types include numeric, text,
image, audio, web and log files, whether structured or unstructured. The
growth of data sources such as social media, smart devices, sensors and the Internet of Things has not
only resulted in increases in the volume of data but increases in the types of data as well.
Velocity refers to the speed at which data is created, processed and analysed. Velocity affects latency,
which is the lag between when data is created or captured and when it is processed into an
output form for decision-making purposes. Importantly, certain types of data must be analysed in
real time to be of value to the business, a task that places impossible demands on traditional systems,
where the ability to capture, store and analyse data in real time is severely limited.
Veracity refers to the level of reliability associated with certain types of data. According to IBM some
data is inherently uncertain, for example: sentiment and truthfulness in humans; GPS sensors
bouncing among the skyscrapers of Manhattan; weather conditions; economic factors; and the future.
When dealing with these types of data, no amount of data cleansing can correct for it. Yet despite
uncertainty, the data still contains valuable information. The need to acknowledge and embrace this
uncertainty is a hallmark of Big Data. (IBM, 2012, pg. 5)
The Big Data Impact
According to McKinsey (2011), Big Data creates value in several ways:
- Creating transparency
- Enabling experimentation to discover needs, expose variability, and improve performance
- Segmenting populations to customize actions
- Replacing/supporting human decision making with automated algorithms
- Innovating new business models, products, and services
To understand the impact at an organisational level, Erik Brynjolfsson with a team at MIT, working in
partnership with McKinsey, Lorin Hitt at Wharton and the MIT doctoral student Heekyung Kim,
conducted structured interviews with executives at 330 public North American companies about their
organizational and technology management practices, and gathered performance data from their
annual reports and independent sources.
Based on the analyses they conducted, one relationship stood out: the more companies characterized
themselves as data-driven, the better they performed on objective measures of financial and
operational results. In particular, companies in the top third of their industry in the use of data-driven
decision making were, on average, 5% more productive and 6% more profitable than their
competitors. This performance difference remained robust after accounting for the contributions of
labour, capital, purchased services, and traditional IT investment. (HBR, 2012)
Further, an IBM study based on survey responses from more than 1,000 business and IT executives in
more than 60 countries revealed four transformative shifts in the use of Big Data:
1. A solid majority of organizations are now realizing a return on their Big Data investments
within a year.
2. Customer centricity still dominates analytics activities, but organizations are increasingly
solving operational challenges using Big Data.
3. Integrating digital capabilities into business processes is transforming organizations.
4. The value driver for Big Data has shifted from volume to velocity.
(IBM, 2014, pg. 1)
While Big Data has created significant opportunity, it has also brought new challenges. According
to Zikopoulos, deRoos, Bienko, Buglio and Andrews (2014), some challenges include:
- Greater volumes of data than ever before
Placing more demands on the organisation’s security plan.
- The experimental and analytical usage of the data
Democratizing data within the organisation requires building trust into the Big Data
platform. A data governance framework covering lineage, ownership etc. is required for any
successful Big Data project.
- The nature and characteristics of Big Data
The data consists of more sensitive personal details than ever before raising governance, risk
and compliance concerns.
- The adoption of technologies that are still maturing
Big Data technologies like Hadoop (and much of the NoSQL world) do not have all of the
enterprise hardening from a security perspective that’s needed, and there’s no doubt
compromises are being made.
A look at Big Data in Insurance
Exploration and discovery
Big Data necessitates an approach of exploration and discovery. As articulated by Gartner (2013),
business analysts have typically worked to a requirements-based model, answering clearly-defined
business questions. Big Data, however, demands a different approach, using opportunistic analytics
and exploring answers to ill-formed or non-existent questions. (Gartner, 2013, pg. 1)
Figure 2: Culture change - Discovery versus control
Figure 2. A better assessment of the data around and connected to a single piece of information enables a more complete,
in-context understanding. Copyright 2013 by IBM. Reprinted with permission.
Moving to a data driven culture
Gartner (2014) has found that many insurance IT departments lack a consistent, enterprise-wide
business intelligence and data management strategy, because of siloed, line-of-business-centric IT
systems. (Gartner, 2014, pg. 6)
In embracing the Big Data paradigm the Economist Intelligence Unit (2013) suggests moving towards
what they call a “data driven culture”. According to the report, in promoting a data driven culture
organisations should consider:
- Data-driven companies place a high value on sharing. Companies own data, not employees. Data
are a resource that can power growth, not something to be hoarded.
- Shared data should be utilised by as many employees as possible, which in practice means rolling
out training wherever it is needed.
- Data collection needs to be a primary activity across departments
- Perhaps most importantly, implementing a data driven culture requires buy-in from the top;
without that, little will change.
(Economist Intelligence Unit, 2013, pg. 11)
Emerging techniques in Big Data on the insurance front
According to Ordnance Survey (2013) the following are some of the emerging techniques being
deployed by insurers:
- Predictive modelling: already well used by insurance companies, this works even better when
more data is fed into the model.
- Data-clustering: automated grouping of similar data points can provide new insights into
apparently familiar situations. Livehoods.org is an example of how social media and ‘machine
learning’ can reveal previously-unseen patterns.
- Sentiment analysis: textual keyword analysis can help analyse the mood of Twitter chatter on a
given topic or brand.
- Web crawling: sophisticated programmes that can identify an individual’s ‘web footprint’ as a
result of posting on social media websites, blogs and photo-sharing services. Using data-matching,
this can be linked to public records and data from other third parties to build a multi-dimensional
profile of an individual.
(Ordnance Survey, 2013, pg. 22)
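As an illustration of the textual keyword analysis described under sentiment analysis, the following is a minimal sketch only. The keyword lists, the sample posts and the function names are hypothetical and are not taken from the Ordnance Survey report; production systems typically rely on much larger lexicons or trained models.

```python
import re
from collections import Counter

# Hypothetical keyword lists; a real lexicon would be far larger.
POSITIVE = {"great", "fast", "helpful", "fair", "easy"}
NEGATIVE = {"slow", "denied", "rude", "unfair", "terrible"}

def sentiment_score(text):
    """Return (positive keyword hits - negative keyword hits) for one post."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    positive = sum(counts[w] for w in POSITIVE)
    negative = sum(counts[w] for w in NEGATIVE)
    return positive - negative

def mood(posts):
    """Summarise the overall mood of a collection of posts."""
    total = sum(sentiment_score(p) for p in posts)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"

# Hypothetical posts mentioning an insurer.
posts = [
    "Claim paid in two days, great and fast service",
    "My claim was denied and the agent was rude",
    "Slow response, terrible experience",
]
print([sentiment_score(p) for p in posts])
print(mood(posts))
```

Simple keyword counting of this kind misses sarcasm, negation and context, which is one reason the veracity of social media data has to be treated with care.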
Data protection, a lurking risk
In addition to the transformative shifts in the use of Big Data mentioned earlier, the same IBM report
found that respondents rated data protection lowest on the list of data priorities; only 11 percent of
respondents identified it as a “top three” priority. Given the proliferation of large-scale data breaches in
recent years, organizations that do not take adequate precautions to safeguard data risk losing the
confidence of customers and business partners, as well as incurring legal and remediation fees. Moreover,
business leaders should thoughtfully consider how their organizations use data to minimize any potential
backlash over perceived privacy infringement. (IBM, 2014, pg. 9)
Skills gap
The Big Data environment requires a skill set that is new to most organisations – requiring people with
deep expertise in statistics and machine learning, as well as managers and analysts who know how to
operate companies by using insights from Big Data.
According to McKinsey (2011), the United States alone faces a shortage of 140,000 to 190,000 people
with deep analytical skills as well as 1.5 million managers and analysts to analyse Big Data and make
decisions based on their findings.
In addressing the skills gap, IBM (2014) suggests organisations should consider the following:
Learn from the best within your organization.
- Tap into the pockets of talent within the organization - those few using predictive or
prescriptive analytics - to expand the skills of others.
- Create a strong internal professional program to arm analysts and executives who already
understand the organization’s business fundamentals with analytics. Sharing resources and
knowledge is a cost-effective way to build skills and helps limit the need to seek talent
elsewhere.
Externally supplement skills based on business case.
Not all organizations need a data scientist full time; the same is true for niche analytics skills that may
be used only to solve specific challenges.
- Organizations should invest in the talent and skills they need to solve the majority of their
analytics demands
- Consider vendors to supplement critical niche skills that are hard to find and expensive to
employ.
(IBM, 2014, pg. 15)
Big Data technologies
Apache Hadoop is the starting point for most organizations wanting to take the plunge into Big Data
analysis.
The Hadoop ecosystem
In their book, Big Data Beyond the Hype, Zikopoulos, deRoos, Bienko, Buglio and Andrews (2014)
classify Hadoop as an ecosystem of software packages that provides a computing framework. These
include MapReduce, which leverages a K/V (key/value) processing framework (don’t confuse that with
a K/V database); a file system (HDFS); and many other software packages that support everything from
importing and exporting data (Sqoop) to storing transactional data (HBase), orchestration (Avro and
ZooKeeper), and more.
When you hear that someone is running a Hadoop cluster, it’s likely to mean MapReduce (or some
other framework like Spark) running on HDFS, but others will be using HBase (which also runs on
HDFS). Vendors in this space include IBM (with BigInsights for Hadoop), Cloudera, Hortonworks, MapR,
and Pivotal. On the other hand, NoSQL refers to non-RDBMS SQL database solutions such as HBase,
Cassandra, MongoDB, Riak, and CouchDB, among others.
(Zikopoulos, deRoos, Bienko, Buglio, Andrews, 2014, pg. 38)
Key components of many Big Data environments:
MapReduce
MapReduce is a system for parallel processing of large data sets.
IBM (2015) offers an analogy: you can think of map and reduce tasks as the way a census
was conducted in Roman times, where the census bureau would dispatch its people to each city in the
empire. Each census taker in each city would be tasked to count the number of people in that city and
then return their results to the capital city. At the capital, the results from each city would be reduced
to a single count (sum of all cities) to determine the overall population of the empire. This mapping of
people to cities, in parallel, and then combining the results (reducing) is much more efficient than
sending a single person to count every person in the empire in a serial fashion. (IBM, 2015)
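The census analogy can be sketched in a few lines of Python. This is a single-process illustration of the map, group and reduce steps, not the Hadoop MapReduce API; the records and the function names (map_phase, shuffle, reduce_phase) are hypothetical.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical census records: (city, person). Names are illustrative only.
records = [
    ("Rome", "Aulus"), ("Rome", "Livia"), ("Rome", "Marcus"),
    ("Ostia", "Gaius"), ("Ostia", "Julia"),
    ("Capua", "Titus"),
]

def map_phase(record):
    """Emit a (key, value) pair: one count per person, keyed by city."""
    city, _person = record
    return (city, 1)

def shuffle(pairs):
    """Group emitted values by key, as a framework does between the two phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(values):
    """Combine all values for one key into a single count."""
    return reduce(lambda a, b: a + b, values, 0)

# Each record is mapped independently, so this step could run in parallel.
mapped = [map_phase(r) for r in records]
grouped = shuffle(mapped)
city_counts = {city: reduce_phase(values) for city, values in grouped.items()}

# Final reduce at the "capital": sum the per-city counts.
empire_total = reduce_phase(city_counts.values())

print(city_counts)
print(empire_total)
```

In a real cluster the map calls run on the nodes that hold the data and only the small per-city counts travel to the reducer, which is where the efficiency gain in the analogy comes from.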
Hadoop
MapReduce is the heart of Hadoop. Hadoop is an open source software stack that runs on a cluster of
machines. Hadoop provides distributed storage and distributed processing for very large data sets.
NoSQL
NoSQL is a database environment. Using the definition from Planet Cassandra (2015), a NoSQL
database environment is, simply put, a non-relational and largely distributed database system that
enables rapid, ad-hoc organization and analysis of extremely high-volume, disparate data types.
NoSQL databases were developed in response to the sheer volume of data being generated, stored
and analyzed by modern users (user-generated data) and their applications (machine-generated data).
(Planet Cassandra, 2015)
Spark
What is Spark and what does it mean for Hadoop?
IBM (2014) refers to Spark as an open source engine for fast, large-scale data processing that can be
used with Hadoop, boasting speeds up to 100 times faster than Hadoop MapReduce in memory, or 10
times faster on disk. As with the early enthusiasm around Hadoop, Spark should not be thought of as
a singular platform for analytics, as it can be used with existing investments for the widest variety of
data types and analytics workloads. (IBM, 2014)
Figure 3: Example of a Big Data environment
Figure 3. Application Enrichment with Hadoop. Copyright 2013 by Hortonworks Inc. Reprinted with permission.
The impact of Hadoop
According to IBM (2015), Hadoop changes the economics and dynamics of large-scale computing by
enabling a solution that is:
- Scalable: Add new nodes as needed without changing data formats, how data is loaded, how
jobs are written or the applications on top.
- Cost-effective: Hadoop brings massively parallel computing to commodity servers. The result
is a significant decrease in the cost per terabyte of storage, which in turn makes it affordable
to model all your data.
- Flexible: Hadoop is schema-less, and can absorb any type of data, structured or not, from a
number of sources. Data from multiple sources can be joined and aggregated in arbitrary
ways, enabling deeper analyses than any one system can provide by itself.
- Fault-tolerant: When you lose a node, the system redirects work to another location of the
data and continues processing without missing a beat.
(IBM, 2015, pg. 2)
Hadoop challenges
Hadoop is not without its own set of challenges. According to IBM (2014), four key areas of
Hadoop need to mature in order to drive wider adoption:
1) Performance
2) A reduction in the skills required
3) Data governance
4) Deep integration with existing technologies
(IBM, 2014)
Along similar lines, a recent TDWI Research (2015) survey found respondents struggling with the
following barriers to Hadoop implementation:
Barriers to Hadoop:
- Skills gap
- Weak business support
- Security concerns
- Data management hurdles
- Tool deficiencies
- Containing costs
(TDWI Research, 2015)
According to a study by the International Technology Group, organisations need to be particularly
mindful of the highly skilled programming requirements demanded by most Hadoop environments,
noting that:
Although the field of players has since expanded to include hundreds of venture capital-funded start-
ups, along with established systems and services vendors and large end users, social media businesses
continue to control Hadoop. Most of the more than one billion lines of code – more than 90 percent,
according to some estimates – in the Apache Hadoop stack has to date been contributed by these.
The priorities of this group have inevitably influenced Hadoop evolution. There tends to be an
assumption that Hadoop developers are highly skilled, capable of working with “raw” open source
code and configuring software components on a case-by-case basis as needs change. Manual coding
is the norm.
Decades of experience have shown that, regardless of which technologies are employed, manual
coding offers lower developer productivity and greater potential for errors than more sophisticated
techniques.
(ITG, 2013, pg. 2)
Big Data in the context of traditional technologies
The Big Data environment has been brought about by the advancement in technology enabling the
processing and storage of the volume, variety, velocity and veracity of data, which is beyond the
capabilities of traditional technology.
Big Data supplements traditional systems
As illustrated in Figure 3, the Big Data environment supports traditional technology, extending
capabilities into areas previously unsupported.
Gartner (2013) suggests that Big Data doesn't replace traditional data and analytics:
“…big data technologies are not really replacing incumbents such as business intelligence, relational
database management systems and enterprise data warehouses. Instead, they supplement traditional
information management and analytics.” (Gartner, 2013, pg. 13)
Examples of three insurance use cases with Big Data
According to Gartner (2013), Big Data and the associated technology have been shown to provide the
following benefits:
- Detection and prevention of fraud or other security violations
- High ROI
- Little operational disruption
(Gartner, 2013, pg. 5)
Big Data to fight fraud
According to John Standish Consulting (2013), mobilizing Big Data is gaining wider attention in anti-
fraud circles. Insurers are sitting on troves of data, hard and soft. Much is never accessed for fraud-
fighting. Insurers can dramatically increase their anti-fraud assertiveness by insightfully accessing,
analyzing and mobilizing their large volumes of untapped data.
Marshaling analytics and big data with current rules and indicators into a seamless and unified anti-
fraud effort creates an expansive world of possibilities.
- Imagine the ability to search a billion rows of data and derive incisive answers to complex
questions in seconds.
- Imagine being able to comb through huge numbers of claim files quickly.
- Imagine more-quickly linking numerous ring members and entities acting in well-disguised
concert. These suspects likely could not be detected with sole or even primary reliance on
basic methods such as fraud indicators.
- Ultimately, imagine analyzing entire caseloads faster and more completely, thus addressing
the largest fraud problems and cost drivers in any of an insurer’s coverage territories.
(Standish, 2013)
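One way to picture the linking of ring members described above is to group claims that share identifiers such as a phone number, address or repair garage. The sketch below uses a union-find structure for this; it is an illustrative assumption, not the method described by Standish or used by any vendor, and the claims, identifiers and the link_claims function are hypothetical.

```python
from collections import defaultdict

# Hypothetical claims: each lists identifiers that appear on the claim file.
claims = {
    "C1": {"phone:555-0101", "garage:Acme Auto"},
    "C2": {"phone:555-0101", "addr:12 Elm St"},
    "C3": {"addr:12 Elm St", "garage:Best Body"},
    "C4": {"phone:555-0199", "garage:City Repairs"},
}

def link_claims(claims):
    """Group claims that share any identifier, directly or through a chain."""
    parent = {claim_id: claim_id for claim_id in claims}

    def find(x):
        # Follow parent pointers to the group representative, shortening the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    first_seen = {}  # identifier -> first claim on which it appeared
    for claim_id, identifiers in claims.items():
        for ident in identifiers:
            if ident in first_seen:
                union(claim_id, first_seen[ident])
            else:
                first_seen[ident] = claim_id

    groups = defaultdict(set)
    for claim_id in claims:
        groups[find(claim_id)].add(claim_id)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])

print(link_claims(claims))
```

Here C1 and C3 share no identifier directly, yet they land in the same group through C2. Indirect links of this kind are hard to spot with per-claim fraud indicators alone.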
Case study: Fraud at IBC
The Insurance Bureau of Canada (IBC) is the national insurance industry association representing
Canada’s home, car and business insurers. Because investigation of cases of suspected automobile
insurance fraud often took several years, the company’s investigative services division wanted to
accelerate its process. The IBC worked with IBM to conduct a proof of concept (POC) in Ontario,
Canada that explored new ways to increase the efficiency of fraud identification. The POC showed
how IBM solutions for big data can help identify suspect individuals and flag suspicious claims. IBM
solutions also help users visualize relationships and linkages to increase the accuracy and speed of
discovering potential fraud. In the POC, more than 233,000 claims from six years were analyzed. The
IBM solutions identified more than 2,000 suspected fraudulent claims with a value of CAD41 million.
IBM and the IBC estimate that these solutions could save the Ontario automobile insurance industry
approximately CAD200 million per year.
(IBM, 2012)
Big Data for customer segmentation
Case study: Customer segmentation at Progressive
In July 2012, Progressive Insurance released new findings from an analysis of five billion real-time
driving miles, confirming that driving behaviour has more than twice the predictive power of any other
insurance rating factor. Loss costs for drivers with the highest-risk driving behaviour are approximately
two-and-a-half times the costs for drivers with the lowest-risk behaviour. These results suggest that
car insurance rates could be far more personalized than they are today.
Progressive has also found that 70% of drivers who have signed up for its Snapshot usage-based insurance (UBI) program pay
less for their insurance. The program involves installing a small monitoring device in the car (900,000
drivers have already done this) and driving normally. After the device has collected enough data,
customers receive a personalized rate for their insurance. Progressive is currently expanding access to
Snapshot to all drivers - not just Progressive customers - who can take a free test drive of the
technology and after 30 days find out whether their own driving behaviour can lower the price they
pay for insurance.
The problem with today's less granular systems of customer classification in the property and casualty
insurance market is that the majority of drivers who present a lower risk subsidize the minority of
higher-risk drivers.
(Gartner, 2013, pg. 5)
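The cross-subsidy described above can be made concrete with a small worked example. Only the 2.5 ratio between highest-risk and lowest-risk loss costs comes from the case study; the portfolio mix and the loss cost of 400 are hypothetical figures chosen for illustration, and expenses and margins are ignored.

```python
# Hypothetical portfolio: 80 lower-risk and 20 higher-risk drivers.
low_risk_drivers = 80
high_risk_drivers = 20

low_risk_cost = 400.0                    # hypothetical annual loss cost per driver
high_risk_cost = 2.5 * low_risk_cost     # ratio reported in the case study

total_cost = (low_risk_drivers * low_risk_cost
              + high_risk_drivers * high_risk_cost)
total_drivers = low_risk_drivers + high_risk_drivers

# With no way to tell the groups apart, everyone pays the average loss cost.
flat_rate = total_cost / total_drivers

overpayment_low = flat_rate - low_risk_cost       # paid above own expected cost
underpayment_high = high_risk_cost - flat_rate    # paid below own expected cost

print(flat_rate, overpayment_low, underpayment_high)
```

Under these assumptions each lower-risk driver pays 120 more than their expected loss cost, and that surplus funds the 480 shortfall on each higher-risk driver. Behaviour-based pricing removes that transfer.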
Big Data for underwriting
Case study: Improving underwriting decisions
A large global property casualty insurance company wanted to accelerate catastrophe risk modelling
in order to improve underwriting decisions and determine when to cap exposures in its portfolio. The
current modelling environment was too slow and unable to handle the large-scale data volumes that
the company wanted to analyze. The goal was to run multiple scenarios and model losses in hours,
but the current environment required up to 16 weeks. As a result, the company conducted analysis
only three or four times per year. A proof of concept demonstrated that the company could improve
performance by 100 times, accelerating query execution from three minutes to less than three
seconds.
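As a quick check on the figures quoted in this case study, the arithmetic can be worked through as follows. The three inputs come from the text above; the variable names are illustrative.

```python
baseline_seconds = 3 * 60      # "three minutes", from the case study
reported_speedup = 100         # "100 times", from the case study
target_seconds = 3             # "less than three seconds", from the case study

# Query time implied by a 100-fold improvement.
accelerated_seconds = baseline_seconds / reported_speedup

# Improvement needed just to reach the three-second mark.
minimum_speedup = baseline_seconds / target_seconds

print(accelerated_seconds)
print(minimum_speedup)
```

A 100-fold gain brings a three-minute query down to 1.8 seconds, so the two reported figures are consistent with each other; any gain above 60-fold would already put the query under three seconds.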
The company decided to implement IBM solutions for big data, and can now run multiple catastrophe
risk models every month instead of only three or four times per year. Once data is refreshed, the
company can create “what-if” scenarios in hours rather than weeks. With a better and faster
understanding of exposures and probable maximum losses, the company can take action sooner to
change loss reserves and optimize its portfolio.
(IBM, 2013, pg. 7)
Costs associated with typical Big Data implementations
Although a Big Data environment such as that illustrated in Figure 3 can be constructed from open
source software, such as Hadoop and a NoSQL database such as MongoDB, there are still substantial
costs involved. These include:
1) Hardware costs
2) IT and operational costs in setting up a machine cluster and supporting it
3) Cost of personnel to work on the ecosystem
These costs are NOT trivial for the following reasons:
- Dealing with cutting edge technology and finding people who know the technology is
challenging
- The technology introduces a different programming paradigm, frequently requiring additional
training of existing engineering teams
- These technologies are new and still evolving and are not yet mature in the enterprise
ecosystem
- The hardware is server grade and large clusters require resources including network
administration, security administration, system administration etc., as well as data centre
operational costs including electricity, cooling etc.
Infrastructure as a Service (IaaS)
One consideration that can mitigate the cost implications of hardware and support personnel is the
use of a cloud offering. As pointed out by Intel (2015), clouds are already deployed on pools of
server, storage, and networking resources and can scale up or down as needed. Cloud computing
offers a cost-effective way to support Big Data technologies and the advanced analytics applications
that can drive business value.
Diversity Limited (2010) defines Infrastructure as a Service (IaaS) as “a way of delivering Cloud
Computing infrastructure – servers, storage, network and operating systems – as an on-demand
service. Rather than purchasing servers, software, datacenter space or network equipment,
organisations instead buy those resources as a fully outsourced service on demand.”
Recommended course for Big Data
IBM (2015) recommends that organisations consider the following when embarking on the Big Data
journey:
1. Choose projects with a high potential return on investment, for which data sources are
readily accessible and already in electronic form, and establish clear goals and quantifiable
metrics. There should be a strong business need for making the resulting data easily
accessible to broad user communities.
2. The data architecture should be extensible to allow addition of other data sources, including
streaming data, as needed.
3. As the project continues, create a feedback loop to inform other departments of insights
derived about products, marketing and sales. This helps promote the value of analytics,
builds a culture that focuses on deriving even better information from analytics, and instils a
high level of trust in the data’s veracity and completeness.
4. Surround Hadoop with a strong ecosystem of Big Data tools and analytics capabilities. The
richer the portfolio of capabilities in the selected Hadoop solution, the more freedom teams
have to solve problems and advance the organization’s insights.
(IBM, 2015, pg. 4)
Recommended Big Data platform
- Utilise an IaaS offering
- Explore the MapR and the IBM BigInsights offerings further.
IBM BigInsights example:
IBM BigInsights is based on 100 percent open source Hadoop. It extends Hadoop with enterprise-
grade technology including administration and integration capabilities, visualization and discovery
tools as well as security, audit history and performance management.
According to IBM, the BigInsights platform offers:
- Increased performance: An average 4 times performance gain over open source Hadoop.
- Usability: BigInsights is optimized for a wide range of roles, including integration developers,
administrators, data scientists, analysts and line-of-business contacts.
- Integrated with IBM Watson™ Foundations big data platform: BigInsights comes bundled with
search and streaming analytics capabilities.
- Analytics: Built-in Hadoop analytics capabilities for machine data, social data, text and Big R
enable you to locate actionable insights from data in the Hadoop cluster rather than having
to move the data around.
Figure 4: Three-year Costs for Use of IBM InfoSphere BigInsights and Open Source Apache
Hadoop for Major Applications – Averages for All Installations
Figure 4. Three-year Costs for Use of IBM InfoSphere BigInsights and Open Source Apache
Hadoop for Major Applications. Copyright 2013 by the International Technology Group. Reprinted with permission.
Conclusion
Big Data is having a substantive impact on the P&C insurance industry. Insurers are combining Big Data
and analytics to overcome many of the challenges confronting the industry, and to support new
capabilities. Although implementing a Big Data platform is not without its challenges, through careful
consideration the organisation should be able to generate an appreciable return on its Big Data and
analytics initiative. The availability of IaaS platforms for Big Data reduces many of the initial risks that
would traditionally be associated with such projects. In addition, the Big Data offerings from MapR
Technologies and IBM, based on initial research, appear to be strong candidates for evaluation.
References
Diversity Limited. (2010). Moving your infrastructure to the cloud. [pdf]. Retrieved from
http://diversity.net.nz/wp-content/uploads/2011/01/Moving-to-the-Clouds.pdf
Economist Intelligence Unit. (2013). Fostering a data-driven culture. [pdf].
Retrieved from
http://www.economistinsights.com/search/node/sites%20default%20files%20downloads%20Tableau%20DataCulture%20130219%20pdf
Gartner. (2013). Characteristics of the traditional versus the big data approach. [Table]. Retrieved from Gartner. (2013).
Big data business benefits are hampered by 'culture clash'. [pdf]. Retrieved from
https://www.gartner.com/doc/2588415
Gartner. (2013). Use big data to solve fraud and security problems. [pdf]. Retrieved from
https://www.gartner.com/doc/2397715
Gartner. (2013). How IT should deepen big data analysis to support customer-centricity. [pdf].
Retrieved from https://www.gartner.com/doc/2531116
Gartner. (2013). Consistent view of the customer for big data. [Diagram]. Retrieved from Gartner. (2013). How IT should
deepen big data analysis to support customer-centricity. [pdf].
Retrieved from https://www.gartner.com/doc/2531116
Gartner. (2014). Agenda overview for p&c and life insurance. [pdf].
Retrieved from https://www.gartner.com/doc/2643327
HBR. (2012). Big Data: The management revolution. [pdf].
Retrieved from https://hbr.org/2012/10/big-data-the-management-revolution/ar
Hortonworks. (2013). Application enrichment with hadoop. [Diagram]. Retrieved from Hortonworks. (2013).
Apache Hadoop patterns of use. [pdf]. Retrieved from http://hortonworks.com/blog/apache-hadoop-patterns-of-use-refine-enrich-and-explore/
IBM. (2012). Four dimensions of big data. [Diagram]. Retrieved from IBM. (2012). Analytics: the real-world use of big
data. [pdf]. Retrieved from
http://public.dhe.ibm.com/common/ssi/ecm/en/gbe03519usen/GBE03519USEN.PDF
IBM. (2012). Analytics: the real-world use of big data. [pdf].
Retrieved from http://public.dhe.ibm.com/common/ssi/ecm/en/gbe03519usen/GBE03519USEN.PDF
IBM. (2012). Insurance bureau of Canada. [pdf]. Retrieved from
http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?subtype=AB&infotype=PM&appname=SWGE_IM_IM_USEN&htmlfid=IMC14775USEN&attachment=IMC14775USEN.PDF
IBM. (2013). A better assessment of the data around and connected to a single piece of information enables a more
complete, in-context understanding. [Diagram]. Retrieved from IBM. (2013). The future of insurance. [pdf].
Retrieved from http://public.dhe.ibm.com/common/ssi/ecm/en/imw14671usen/IMW14671USEN.PDF
IBM. (2013). Harnessing the power of big data and analytics for insurance. [pdf]. Retrieved from
http://public.dhe.ibm.com/common/ssi/ecm/en/imw14672usen/IMW14672USEN.PDF
IBM. (2014). Analytics: The speed advantage. [pdf].
Retrieved from http://www-935.ibm.com/services/us/gbs/thoughtleadership/2014analytics/
IBM. (2014). IBM expands Hadoop commitment with support for Spark. [blog].
Retrieved from http://www.ibmbigdatahub.com/blog/ibm-expands-hadoop-commitment-support-spark
IBM. (2015). Analytics: What is MapReduce. [web page].
Retrieved from http://www-01.ibm.com/software/data/infosphere/hadoop/mapreduce/
IBM. (2015). BigInsights for Apache Hadoop quick start edition. [pdf].
Retrieved from http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=PM&subtype=BR&htmlfid=IMB14164USEN#loaded
IBM. (2015). Making the case for Hadoop and big data in the enterprise. [pdf].
Retrieved from http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=PM&subtype=BK&htmlfid=IMM14161USEN#loaded
ITG. (2013). Business case for enterprise big data deployments. [pdf].
Retrieved from http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?htmlfid=IME14028USEN&appname=skmwww
ITG. (2013). Three-year Costs for Use of IBM InfoSphere BigInsights and Open Source Apache Hadoop for Major Applications. [Diagram].
Retrieved from ITG. (2013). Business case for enterprise big data deployments. [pdf].
Retrieved from http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?htmlfid=IME14028USEN&appname=skmwww
Intel. (2015). Big data cloud technology. [pdf].
Retrieved from http://www.intel.co.za/content/dam/www/public/us/en/documents/product-briefs/big-data-cloud-technologies-brief.pdf
McKinsey. (2011). Big data: The next frontier for innovation, competition, and productivity. [pdf].
Retrieved from
http://www.mckinsey.com/insights/business_technology/big_data_the_next_frontier_for_innovation
Ordnance Survey. (2013). The big data rush: how data analytics can yield underwriting gold. [pdf].
Retrieved from http://events.marketforce.eu.com/big-data-underwriting-report-email
Planet Cassandra. (2015). NoSQL databases defined and explained. [web page].
Retrieved from http://www.planetcassandra.org/what-is-nosql/
Standish, J. (2013). Speed to detection - strategically leveraging advanced analytics for insurance fraud. [blog].
Retrieved from
http://www.johnstandishconsultinggroup.com/JohnStandishConsultingGroup.com/Blog/Entries/2013/8/9_Speed_to_Detection_-_Strategically_Leveraging_Advanced_Analytics_for_Insurance_Fraud.html
Strategy Meets Action. (2012). Data and analytics in insurance. [pdf].
Retrieved from https://www.acord.org/library/Documents/2012_SMA_Data_Analytics.pdf
Strategy Meets Action. (2013). Data and analytics in insurance: P&C plans and priorities for 2013 and beyond. [pdf].
Retrieved from https://strategymeetsaction.com/data-and-analytics-in-insurance-p-and-c-plans-and-priorities-for-2013-and-beyond/
Zikopoulos, P., deRoos, D., Bienko, C., Buglio, R., Andrews, M. (2014). Big data beyond the hype. [pdf].
Retrieved from
https://www.ibm.com/developerworks/community/blogs/SusanVisser/entry/big_data_beyond_the_hype_a_guide_to_conversations_for_today_s_data_center?lang=en

"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 

Recently uploaded (20)

New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 

Big Data: Opportunities, Strategy and Challenges

  • 1. Big Data: Opportunities, Strategies and Challenges Executive Summary Gregg Barrett
  • 2. Acknowledgment
    This report draws extensively on, and focuses on, the work and viewpoints of industry participants including:
    Diversity Limited
    Economist Intelligence Unit
    Gartner
    HBR
    Hortonworks
    IBM
    ITG
    Intel
    McKinsey
    Ordnance Survey
    John Standish Consulting
    Christopher Bienko @ IBM
    Dirk deRoos @ IBM
    John Choi @ IBM
    Marc Andrews @ IBM
    Paul Zikopoulos @ IBM
    Rick Buglio @ IBM
    Strategy Meets Action
    References are included in-text as well as in the References section at the end of the report.
  • 3. Challenges facing the industry
    Difficult and uncertain economic conditions, low interest rates, decreasing underwriting profitability, higher combined ratios and low investment returns are placing insurers under stress. Insurers also have to confront commoditisation of the business, more informed consumers, high customer churn rates, new distribution channels and strong competition. As if this were not enough, natural perils, increases in regulatory intervention and greater demands for transparency by regulators, together with ever-increasing compliance requirements, are placing immense strain on the capabilities of insurers.
    According to IBM (2013), to thrive in this environment insurers must gain a specific set of capabilities that will allow them to:
    - Build a customer-centric business model
    - Find profitable ways to sustain growth
    - Develop new, competitively priced products
    - Increase claims efficiency and effectiveness
    - Improve capital management and investment decisions
    - Improve risk management and regulatory reporting
    (IBM, 2013, pg. 2)
    Insurers are turning to analytics
    The business of insurance is based on analysing data to understand and evaluate risks. Two important insurance professions, actuarial and underwriting, emerged at the beginning of the modern insurance era in the 17th century. These both revolve around and are dependent upon the analysis of data. (Strategy Meets Action, 2012, pg. 3)
    While the insurance industry has long been recognized for analysing data, the new news involves the overwhelming amount of data that is now available for analysis and the sophistication of the technology tools that can be used to perform the analysis. The opportunities for advanced analysis are many and the potential business impact is enormous. (Strategy Meets Action, 2013, pg. 3)
  • 4. The Concept of Big Data
    In simple terms, Big Data refers to a data environment that cannot be handled by traditional technologies. Big Data is often described in terms of the three V’s and, if you are at IBM, it is likely to be the four V’s. Figure 1 below illustrates the IBM four V representation of Big Data:
    Figure 1: Big Data in dimensions
    Figure 1. Four dimensions of big data. Copyright 2012 by IBM. Reprinted with permission.
    Volume refers to the quantity (gigabytes, terabytes, petabytes etc.) of data that organizations are trying to harness. Importantly, there is no specific measure of volume that defines Big Data, as what constitutes truly “high” volume varies by industry and even geography. What is clear is that data volumes continue to rise.
    Variety refers to different types (forms) of data and data sources. Data types include numeric, text, image, audio, web, log files etc., whether structured or unstructured. The growth of data sources such as social media, smart devices, sensors and the Internet of Things has resulted not only in increases in the volume of data but in increases in the types of data as well.
    Velocity refers to the speed at which data is created, processed and analysed. Velocity impacts latency, which is the lag time between when data is created or captured and when it is processed into an output form for decision-making purposes. Importantly, certain types of data must be analysed in real-time to be of value to the business, a task that places impossible demands on traditional systems where the ability to capture, store and analyse data in real-time is severely limited.
    Veracity refers to the level of reliability associated with certain types of data. According to IBM, some data is inherently uncertain, for example: sentiment and truthfulness in humans; GPS sensors bouncing among the skyscrapers of Manhattan; weather conditions; economic factors; and the future. When dealing with these types of data, no amount of data cleansing can correct for it. Yet despite uncertainty, the data still contains valuable information. The need to acknowledge and embrace this uncertainty is a hallmark of Big Data. (IBM, 2012, pg. 5)
  • 5. The Big Data Impact
    According to McKinsey (2011), Big Data creates value in several ways:
    - Creating transparency
    - Enabling experimentation to discover needs, expose variability, and improve performance
    - Segmenting populations to customize actions
    - Replacing/supporting human decision making with automated algorithms
    - Innovating new business models, products, and services
    To understand the impact at an organisational level, Erik Brynjolfsson with a team at MIT, working in partnership with McKinsey, Lorin Hitt at Wharton and the MIT doctoral student Heekyung Kim, conducted structured interviews with executives at 330 public North American companies about their organizational and technology management practices, and gathered performance data from their annual reports and independent sources. Based on the analyses they conducted, one relationship stood out: the more companies characterized themselves as data-driven, the better they performed on objective measures of financial and operational results. In particular, companies in the top third of their industry in the use of data-driven decision making were, on average, 5% more productive and 6% more profitable than their competitors. This performance difference remained robust after accounting for the contributions of labour, capital, purchased services, and traditional IT investment. (HBR, 2012)
    Further, an IBM study based on survey responses from more than 1,000 business and IT executives in more than 60 countries revealed four transformative shifts in the use of Big Data:
    1. A solid majority of organizations are now realizing a return on their Big Data investments within a year.
    2. Customer centricity still dominates analytics activities, but organizations are increasingly solving operational challenges using Big Data.
    3. Integrating digital capabilities into business processes is transforming organizations.
    4. The value driver for Big Data has shifted from volume to velocity.
    (IBM, 2014, pg. 1)
    While Big Data has resulted in significant opportunity, it has also brought new challenges. According to Zikopoulos, deRoos, Bienko, Buglio and Andrews (2014), some challenges include:
    - Greater volumes of data than ever before
      Placing more demands on the organisation’s security plan.
    - The experimental and analytical usage of the data
      Democratizing data within the organisation requires building trust into the Big Data platform. A data governance framework covering lineage, ownership etc. is required for any successful Big Data project.
    - The nature and characteristics of Big Data
      The data consists of more sensitive personal details than ever before, raising governance, risk and compliance concerns.
    - The adoption of technologies that are still maturing
  • 6. Big Data technologies like Hadoop (and much of the NoSQL world) do not have all of the enterprise hardening from a security perspective that’s needed, and there’s no doubt compromises are being made.
    A look at Big Data in Insurance
    Exploration and discovery
    Big Data necessitates an approach of exploration and discovery. As articulated by Gartner (2013), business analysts have typically worked to a requirements-based model, answering clearly-defined business questions. Big Data, however, demands a different approach, using opportunistic analytics and exploring answers to ill-formed or non-existent questions. (Gartner, 2013, pg. 1)
    Figure 2: Culture change - Discovery versus control
    Figure 2. A better assessment of the data around and connected to a single piece of information enables a more complete, in-context understanding. Copyright 2013 by IBM. Reprinted with permission.
    Moving to a data driven culture
    Gartner (2014) has found that many insurance IT departments lack a consistent, enterprise-wide business intelligence and data management strategy because of siloed, line-of-business-centric IT systems. (Gartner, 2014, pg. 6)
    In embracing the Big Data paradigm, the Economist Intelligence Unit (2013) suggests moving towards what it calls a “data driven culture”. According to the report, in promoting a data driven culture organisations should consider:
    - Data-driven companies place a high value on sharing. Companies own data, not employees. Data are a resource that can power growth, not something to be hoarded.
    - Shared data should be utilised by as many employees as possible, which in practice means rolling out training wherever it is needed.
  • 7. - Data collection needs to be a primary activity across departments.
    - Perhaps most importantly, implementing a data driven culture requires buy-in from the top; without that, little will change.
    (Economist Intelligence Unit, 2013, pg. 11)
    Emerging techniques in Big Data on the insurance front
    According to Ordnance Survey (2013), the following are some of the emerging techniques being deployed by insurers:
    - Predictive modelling: already well used by insurance companies, this works even better when more data is fed into the model.
    - Data-clustering: automated grouping of similar data points can provide new insights into apparently familiar situations. Livehoods.org is an example of how social media and ‘machine learning’ can reveal previously-unseen patterns.
    - Sentiment analysis: textual keyword analysis can help analyse the mood of Twitter chatter on a given topic or brand.
    - Web crawling: sophisticated programmes that can identify an individual’s ‘web footprint’ as a result of posting on social media websites, blogs and photo-sharing services. Using data-matching, this can be linked to public records and data from other third parties to build a multi-dimensional profile of an individual.
    (Ordnance Survey, 2013, pg. 22)
    Data protection, a lurking risk
    In addition to the transformative shifts in the use of Big Data mentioned earlier, the same IBM report found that respondents rated data protection lowest on the list of data priorities; only 11 percent of respondents identified it as a “top three” priority. Given the proliferation of large-scale data breaches in recent years, organizations risk the loss of customer and business partner confidence, as well as legal and remediation fees, if adequate precautions are not taken to safeguard data. Moreover, business leaders should thoughtfully consider how their organizations use data to minimize any potential backlash in perceived privacy infringement. (IBM, 2014, pg. 9)
    Skills gap
    The Big Data environment requires a skill set that is new to most organisations, requiring people with deep expertise in statistics and machine learning, as well as managers and analysts who know how to operate companies by using insights from Big Data. According to McKinsey (2011), the United States alone faces a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts to analyse Big Data and make decisions based on their findings.
    In addressing the skills gap, IBM (2014) suggests organisations should consider the following:
  • 8. Learn from the best within your organization.
    - Tap into the pockets of talent within the organization - those few using predictive or prescriptive analytics - to expand the skills of others.
    - Create a strong internal professional program to arm analysts and executives who already understand the organization’s business fundamentals with analytics.
    Sharing resources and knowledge is a cost-effective way to build skills and helps limit the need to seek talent elsewhere.
    Externally supplement skills based on business case.
    Not all organizations need a data scientist full time; the same is true for niche analytics skills that may be used only to solve specific challenges.
    - Organizations should invest in the talent and skills they need to solve the majority of their analytics demands.
    - Consider vendors to supplement critical niche skills that are hard to find and expensive to employ.
    (IBM, 2014, pg. 15)
    Big Data technologies
    Apache Hadoop is the starting point for most organizations wanting to take the plunge into Big Data analysis.
    The Hadoop ecosystem
    In their book, Big Data Beyond the Hype, Zikopoulos, deRoos, Bienko, Buglio and Andrews (2014) classify Hadoop as an ecosystem of software packages that provides a computing framework. These include MapReduce, which leverages a K/V (key/value) processing framework (don’t confuse that with a K/V database); a file system (HDFS); and many other software packages that support everything from importing and exporting data (Sqoop) to storing transactional data (HBase), orchestration (Avro and ZooKeeper), and more. When you hear that someone is running a Hadoop cluster, it’s likely to mean MapReduce (or some other framework like Spark) running on HDFS, but others will be using HBase (which also runs on HDFS). Vendors in this space include IBM (with BigInsights for Hadoop), Cloudera, Hortonworks, MapR, and Pivotal. On the other hand, NoSQL refers to non-RDBMS SQL database solutions such as HBase, Cassandra, MongoDB, Riak, and CouchDB, among others. (Zikopoulos, deRoos, Bienko, Buglio, Andrews, 2014, pg. 38)
    Key components of many Big Data environments:
    MapReduce
    MapReduce is a system for parallel processing of large data sets. According to IBM (2015), as an analogy, you can think of map and reduce tasks as the way a census was conducted in Roman times, where the census bureau would dispatch its people to each city in the empire. Each census taker in each city would be tasked to count the number of people in that city and then return their results to the capital city. At the capital, the results from each city would be reduced to a single count (sum of all cities) to determine the overall population of the empire. This mapping of people to cities, in parallel, and then combining the results (reducing) is much more efficient than sending a single person to count every person in the empire in a serial fashion. (IBM, 2015)
  • 9. Hadoop
    MapReduce is the heart of Hadoop. Hadoop is an open source software stack that runs on a cluster of machines. Hadoop provides distributed storage and distributed processing for very large data sets.
    NoSQL
    NoSQL is a database environment. Using the definition from Planet Cassandra (2015), a NoSQL database environment is, simply put, a non-relational and largely distributed database system that enables rapid, ad-hoc organization and analysis of extremely high-volume, disparate data types. NoSQL databases were developed in response to the sheer volume of data being generated, stored and analyzed by modern users (user-generated data) and their applications (machine-generated data). (Planet Cassandra, 2015)
    Spark
    What is Spark and what does it mean for Hadoop? IBM (2014) refers to Spark as an open source engine for fast, large-scale data processing that can be used with Hadoop, boasting speeds up to 100 times faster than Hadoop MapReduce in memory, or 10 times faster on disk. As with the early enthusiasm around Hadoop, Spark should not be thought of as a singular platform for analytics, as it can be used with existing investments for the widest variety of data types and analytics workloads. (IBM, 2014)
    Figure 3: Example of a Big Data environment
    Figure 3. Application Enrichment with Hadoop. Copyright 2013 by Hortonworks Inc. Reprinted with permission.
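The Roman census analogy used for MapReduce above can be sketched in a few lines of Python. This is a minimal, single-process illustration rather than Hadoop code: the records, the function names (`map_phase`, `shuffle`, `reduce_phase`) and the city names are all hypothetical, and a real Hadoop job would run the map and reduce tasks in parallel across a cluster.

```python
from collections import defaultdict

# Hypothetical census records: one (city, person) pair per inhabitant.
records = [
    ("Rome", "Gaius"), ("Rome", "Livia"), ("Rome", "Marcus"),
    ("Ostia", "Julia"), ("Ostia", "Titus"),
    ("Capua", "Claudia"),
]


def map_phase(records):
    """Map step: emit a (city, 1) key/value pair for every person counted."""
    return [(city, 1) for city, _person in records]


def shuffle(pairs):
    """Group the emitted values by key so each city's counts sit together."""
    grouped = defaultdict(list)
    for city, count in pairs:
        grouped[city].append(count)
    return dict(grouped)


def reduce_phase(grouped):
    """Reduce step: sum the grouped counts for each city."""
    return {city: sum(counts) for city, counts in grouped.items()}


# Each census taker counts one city; the capital then combines the results.
city_counts = reduce_phase(shuffle(map_phase(records)))
empire_total = sum(city_counts.values())

print(city_counts)
print(empire_total)
```

In this sketch each city plays the role of a key, so the per-city counts could in principle be computed independently of one another, which is the property that lets a framework spread the work over many machines.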
  • 10. The impact of Hadoop
    According to IBM (2015), Hadoop changes the economics and dynamics of large-scale computing by enabling a solution that is:
    - Scalable: Add new nodes as needed without changing data formats, how data is loaded, how jobs are written or the applications on top.
    - Cost-effective: Hadoop brings massively parallel computing to commodity servers. The result is a significant decrease in the cost per terabyte of storage, which in turn makes it affordable to model all your data.
    - Flexible: Hadoop is schema-less, and can absorb any type of data, structured or not, from a number of sources. Data from multiple sources can be joined and aggregated in arbitrary ways, enabling deeper analyses than any one system can provide by itself.
    - Fault-tolerant: When you lose a node, the system redirects work to another location of the data and continues processing without missing a beat.
    (IBM, 2015, pg. 2)
    Hadoop challenges
    Hadoop is not without its own set of challenges. According to IBM (2014), there are four key areas of Hadoop that need to mature in order to drive wider adoption. These include:
    1) Performance
    2) A reduction in the skills required
    3) Data governance
    4) Deep integration with existing technologies
    (IBM, 2014)
    Along similar lines, TDWI Research (2015) found in a recent survey that respondents were struggling with the following barriers to Hadoop implementation:
    Barriers to Hadoop:
    - Skills gap
    - Weak business support
    - Security concerns
    - Data management hurdles
    - Tool deficiencies
    - Containing costs
    (TDWI Research, 2015)
    According to a study by the International Technology Group, organisations need to be particularly mindful of the highly skilled programming requirements demanded of most Hadoop environments, noting that:
    Although the field of players has since expanded to include hundreds of venture capital-funded start-ups, along with established systems and services vendors and large end users, social media businesses continue to control Hadoop. Most of the more than one billion lines of code – more than 90 percent, according to some estimates – in the Apache Hadoop stack has to date been contributed by these.
  • 11. The priorities of this group have inevitably influenced Hadoop evolution. There tends to be an assumption that Hadoop developers are highly skilled, capable of working with “raw” open source code and configuring software components on a case-by-case basis as needs change. Manual coding is the norm. Decades of experience have shown that, regardless of which technologies are employed, manual coding offers lower developer productivity and greater potential for errors than more sophisticated techniques. (ITG, 2013, pg. 2)
    Big Data in the context of traditional technologies
    The Big Data environment has been brought about by the advancement in technology enabling the processing and storage of the volume, variety, velocity and veracity of data, which is beyond the capabilities of traditional technology.
    Big Data supplements traditional systems
    As illustrated in Figure 3, the Big Data environment supports traditional technology, extending capabilities into areas previously unsupported. Gartner (2013) suggests that Big Data doesn't replace traditional data and analytics:
    “… big data technologies are not really replacing incumbents such as business intelligence, relational database management systems and enterprise data warehouses. Instead, they supplement traditional information management and analytics.” (Gartner, 2013, pg. 13)
    Examples of three insurance use cases with Big Data
    According to Gartner (2013), Big Data and the associated technology have been shown to provide the following benefits:
    - Detection and prevention of fraud or other security violations
    - High ROI
    - Little operational disruption
    (Gartner, 2013, pg. 5)
    Big Data to fight fraud
    According to John Standish Consulting (2013), mobilizing Big Data is gaining wider attention in anti-fraud circles. Insurers are sitting on troves of data, hard and soft. Much is never accessed for fraud-fighting. Insurers can dramatically increase their anti-fraud assertiveness by insightfully accessing, analyzing and mobilizing their large volumes of untapped data. Marshaling analytics and big data with current rules and indicators into a seamless and unified anti-fraud effort creates an expansive world of possibilities.
    - Imagine the ability to search a billion rows of data and derive incisive answers to complex questions in seconds.
    - Imagine being able to comb through huge numbers of claim files quickly.
  • 12. 11 - Imagine more-quickly linking numerous ring members and entities acting in well-disguised concert. These suspects likely could not be detected with sole or even primary reliance on basic methods such as fraud indicators. - Ultimately, imagine analyzing entire caseloads faster and more completely, thus addressing the largest fraud problems and cost drivers in any of an insurer’s coverage territories. (Standish, 2013) Case study: Fraud at IBC The Insurance Bureau of Canada (IBC) is the national insurance industry association representing Canada’s home, car and business insurers. Because investigation of cases of suspected automobile insurance fraud often took several years, the company’s investigative services division wanted to accelerate its’ process. The IBC worked with IBM to conduct a proof of concept (POC) in Ontario, Canada that explored new ways to increase the efficiency of fraud identification. The POC showed how IBM solutions for big data can help identify suspect individuals and flag suspicious claims. IBM solutions also help users visualize relationships and linkages to increase the accuracy and speed of discovering potential fraud. In the POC, more than 233,000 claims from six years were analyzed. The IBM solutions identified more than 2,000 suspected fraudulent claims with a value of CAD41 million. IBM and the IBC estimate that these solutions could save the Ontario automobile insurance industry approximately CAD200 million per year. (IBM, 2012) Big Data for customer segmentation Case study: Customer segmentation at Progressive In July 2012, Progressive Insurance released new findings from an analysis of five billion real-time driving miles, confirming that driving behaviour has more than twice the predictive power of any other insurance rating factor. Loss costs for drivers with the highest-risk driving behaviour are approximately two-and-a-half times the costs for drivers with the lowest-risk behaviour. 
These results suggest that car insurance rates could be far more personalized than they are today. Progressive has also found that 70% of drivers who have signed up for its’ Snapshot UBI program pay less for their insurance. The program involves installing a small monitoring device in the car (900,000 drivers have already done this) and driving normally. After the device has collected enough data, customers receive a personalized rate for their insurance. Progressive is currently expanding access to Snapshot to all of its’ drivers - not just Progressive customers - who can take a free test drive of the technology and after 30 days find out whether their own driving behaviour can lower the price they pay for insurance. The problem with today's less granular systems of customer classification in the property and casualty insurance market is that the majority of drivers who present a lower risk subsidize the minority of higher-risk drivers. (Gartner, 2013, pg. 5) Big Data for underwriting Case study: Improving underwriting decisions A large global property casualty insurance company wanted to accelerate catastrophe risk modelling in order to improve underwriting decisions and determine when to cap exposures in its’ portfolio. The current modelling environment was too slow and unable to handle the large-scale data volumes that
  • 13. 12 the company wanted to analyze. The goal was to run multiple scenarios and model losses in hours, but the current environment required up to 16 weeks. As a result, the company conducted analysis only three or four times per year. A proof of concept demonstrated that the company could improve performance by 100 times, accelerating query execution from three minutes to less than three seconds. The company decided to implement IBM solutions for big data, and can now run multiple catastrophe risk models every month instead of only three or four times per year. Once data is refreshed, the company can create “what-if” scenarios in hours rather than weeks. With a better and faster understanding of exposures and probable maximum losses, the company can take action sooner to change loss reserves and optimize its’ portfolio. (IBM, 2013, pg. 7) Costs associated with typical Big Data implementations Although a Big Data environment such as that illustrated in Figure 3 can be constructed from open source software, such as Hadoop and a NoSQL database such as MongoDB, there are still substantial costs involved. These include: 1) Hardware costs 2) IT and operational costs in setting up a machine cluster and supporting it 3) Cost of personnel to work on the ecosystem These costs are NOT trivial for the following reasons: - Dealing with cutting edge technology and finding people who know the technology is challenging - The technology introduces a different programming paradigm, frequently requiring additional training of existing engineering teams - These technologies are new and still evolving and are not yet mature in the enterprise ecosystem - The hardware is server grade and large clusters require resources including network administration, security administration, system administration etc., as well as data centre operational costs including electricity, cooling etc. 
Infrastructure as a Service (IaaS)
One consideration that can mitigate the cost implications of hardware and support personnel is the use of a cloud offering. As pointed out by Intel (2015), clouds are already deployed on pools of server, storage, and networking resources and can scale up or down as needed. Cloud computing offers a cost-effective way to support Big Data technologies and the advanced analytics applications that can drive business value.
Diversity Limited (2010) defines Infrastructure as a Service (IaaS) as “a way of delivering Cloud Computing infrastructure – servers, storage, network and operating systems – as an on-demand service. Rather than purchasing servers, software, datacenter space or network equipment, organisations instead buy those resources as a fully outsourced service on demand.”
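The trade-off between owning a cluster and buying capacity on demand can be sketched as a simple multi-year cost comparison. The following is a minimal sketch under stated assumptions: the class names, cost categories and every figure are hypothetical placeholders chosen for illustration, not numbers from Intel, Diversity Limited or any other cited source.

```python
from dataclasses import dataclass


@dataclass
class OnPremiseCosts:
    """Hypothetical cost model for an owned cluster."""
    hardware: float           # one-off purchase of servers
    annual_operations: float  # IT support, electricity, cooling
    annual_personnel: float   # staff working on the ecosystem

    def total(self, years: int) -> float:
        recurring = self.annual_operations + self.annual_personnel
        return self.hardware + years * recurring


@dataclass
class IaasCosts:
    """Hypothetical cost model for on-demand cloud capacity."""
    hourly_rate_per_node: float
    nodes: int
    hours_per_year: float     # capacity is paid for only while in use
    annual_personnel: float

    def total(self, years: int) -> float:
        usage = self.hourly_rate_per_node * self.nodes * self.hours_per_year
        return years * (usage + self.annual_personnel)


if __name__ == "__main__":
    # Placeholder inputs only; replace with the organisation's own estimates.
    on_premise = OnPremiseCosts(hardware=500_000.0,
                                annual_operations=120_000.0,
                                annual_personnel=300_000.0)
    iaas = IaasCosts(hourly_rate_per_node=0.5, nodes=20,
                     hours_per_year=2_000.0, annual_personnel=200_000.0)
    for years in (1, 3):
        print(years, on_premise.total(years), iaas.total(years))
```

The point of the model is structural rather than numerical: the owned cluster carries an up-front hardware cost regardless of use, whereas the on-demand option scales with the hours actually consumed.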
Recommended course for Big Data
IBM (2015) recommends that organisations consider the following when embarking on the Big Data journey:
1. Choose projects with a high potential return on investment, for which data sources are readily accessible and already in electronic form, and establish clear goals and quantifiable metrics. There should be a strong business need for making the resulting data easily accessible to broad user communities.
2. The data architecture should be extensible to allow addition of other data sources, including streaming data, as needed.
3. As the project continues, create a feedback loop to inform other departments of insights derived about products, marketing and sales. This helps promote the value of analytics, builds a culture that focuses on deriving even better information from analytics, and instils a high level of trust in the data’s veracity and completeness.
4. Surround Hadoop with a strong ecosystem of Big Data tools and analytics capabilities. The richer the portfolio of capabilities in the selected Hadoop solution, the more freedom teams have to solve problems and advance the organization’s insights.
(IBM, 2015, pg. 4)
Recommended Big Data platform
- Utilise an IaaS offering
- Explore the MapR and the IBM BigInsights offerings further.
IBM BigInsights example:
IBM BigInsights is based on 100 percent open source Hadoop. It extends Hadoop with enterprise-grade technology including administration and integration capabilities, visualization and discovery tools as well as security, audit history and performance management. According to IBM, the BigInsights platform offers:
- Increased performance: An average 4 times performance gain over open source Hadoop.
- Usability: BigInsights is optimized for a wide range of roles, including integration developers, administrators, data scientists, analysts and line-of-business contacts.
- Integrated with IBM Watson™ Foundations big data platform: BigInsights comes bundled with search and streaming analytics capabilities.
- Analytics: Built-in Hadoop analytics capabilities for machine data, social data, text and Big R enable you to locate actionable insights from data in the Hadoop cluster rather than having to move the data around.
Figure 4: Three-year Costs for Use of IBM InfoSphere BigInsights and Open Source Apache Hadoop for Major Applications – Averages for All Installations
Figure 4. Three-year Costs for Use of IBM InfoSphere BigInsights and Open Source Apache Hadoop for Major Applications. Copyright 2013 by the International Technology Group. Reprinted with permission.
Conclusion
Big Data is having a substantive impact on the P&C insurance industry. Insurers are combining Big Data and analytics to overcome many of the challenges confronting the industry, and to support new capabilities. Although implementing a Big Data platform is not without its challenges, through careful consideration the organisation should be able to generate an appreciable return on its Big Data and analytics initiative. The availability of IaaS platforms for Big Data reduces many of the initial risks that would traditionally be associated with such projects. In addition, the Big Data offerings from MapR Technologies and IBM, based on initial research, appear to be strong candidates for evaluation.
References
Diversity Limited. (2010). Moving your infrastructure to the cloud. [pdf]. Retrieved from http://diversity.net.nz/wp-content/uploads/2011/01/Moving-to-the-Clouds.pdf
Economist Intelligence Unit. (2013). Fostering a data-driven culture. [pdf]. Retrieved from http://www.economistinsights.com/search/node/sites%20default%20files%20downloads%20Tableau%20DataCulture%20130219%20pdf
Gartner. (2013). Characteristics of the traditional versus the big data approach. [Table]. Retrieved from Gartner. (2013). Big data business benefits are hampered by 'culture clash'. [pdf]. Retrieved from https://www.gartner.com/doc/2588415
Gartner. (2013). Use big data to solve fraud and security problems. [pdf]. Retrieved from https://www.gartner.com/doc/2397715
Gartner. (2013). How IT should deepen big data analysis to support customer-centricity. [pdf]. Retrieved from https://www.gartner.com/doc/2531116
Gartner. (2013). Consistent view of the customer for big data. [Diagram]. Retrieved from Gartner. (2013). How IT should deepen big data analysis to support customer-centricity. [pdf]. Retrieved from https://www.gartner.com/doc/2531116
Gartner. (2014). Agenda overview for p&c and life insurance. [pdf]. Retrieved from https://www.gartner.com/doc/2643327
HBR. (2012). Big Data: The management revolution. [pdf]. Retrieved from https://hbr.org/2012/10/big-data-the-management-revolution/ar
Hortonworks. (2013). Application enrichment with hadoop. [Diagram]. Retrieved from Hortonworks. (2013). Apache Hadoop patterns of use. [pdf]. Retrieved from http://hortonworks.com/blog/apache-hadoop-patterns-of-use-refine-enrich-and-explore/
IBM. (2012). Four dimensions of big data. [Diagram]. Retrieved from IBM. (2012). Analytics: the real-world use of big data. [pdf]. Retrieved from http://public.dhe.ibm.com/common/ssi/ecm/en/gbe03519usen/GBE03519USEN.PDF
IBM. (2012). Analytics: the real-world use of big data. [pdf]. Retrieved from http://public.dhe.ibm.com/common/ssi/ecm/en/gbe03519usen/GBE03519USEN.PDF
IBM. (2012). Insurance bureau of Canada. [pdf]. Retrieved from http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?subtype=AB&infotype=PM&appname=SWGE_IM_IM_USEN&htmlfid=IMC14775USEN&attachment=IMC14775USEN.PDF
IBM. (2013). A better assessment of the data around and connected to a single piece of information enables a more complete, in-context understanding. [Diagram]. Retrieved from IBM. (2013). The future of insurance. [pdf]. Retrieved from http://public.dhe.ibm.com/common/ssi/ecm/en/imw14671usen/IMW14671USEN.PDF
IBM. (2013). Harnessing the power of big data and analytics for insurance. [pdf]. Retrieved from http://public.dhe.ibm.com/common/ssi/ecm/en/imw14672usen/IMW14672USEN.PDF
IBM. (2014). Analytics: The speed advantage. [pdf]. Retrieved from http://www-935.ibm.com/services/us/gbs/thoughtleadership/2014analytics/
IBM. (2014). IBM expands hadoop commitment with support for spark. [blog]. Retrieved from http://www.ibmbigdatahub.com/blog/ibm-expands-hadoop-commitment-support-spark
IBM. (2015). Analytics: What is mapreduce. [web page]. Retrieved from http://www-01.ibm.com/software/data/infosphere/hadoop/mapreduce/
IBM. (2015). BigInsights for apache hadoop quick start edition. [pdf]. Retrieved from http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=PM&subtype=BR&htmlfid=IMB14164USEN#loaded
IBM. (2015). Making the case for hadoop and big data in the enterprise. [pdf]. Retrieved from http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=PM&subtype=BK&htmlfid=IMM14161USEN#loaded
ITG. (2013). Business case for enterprise big data deployments. [pdf]. Retrieved from http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?htmlfid=IME14028USEN&appname=skmwww
ITG. (2013). Three-year Costs for Use of IBM InfoSphere BigInsights and Open Source Apache Hadoop for Major Applications. [Diagram]. Retrieved from ITG. (2013). Business case for enterprise big data deployments. [pdf]. Retrieved from http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?htmlfid=IME14028USEN&appname=skmwww
Intel. (2015). Big data cloud technology. [pdf]. Retrieved from http://www.intel.co.za/content/dam/www/public/us/en/documents/product-briefs/big-data-cloud-technologies-brief.pdf
McKinsey. (2011). Big data: The next frontier for innovation, competition, and productivity. [pdf]. Retrieved from http://www.mckinsey.com/insights/business_technology/big_data_the_next_frontier_for_innovation
Ordnance Survey. (2013). The big data rush: how data analytics can yield underwriting gold. [pdf]. Retrieved from http://events.marketforce.eu.com/big-data-underwriting-report-email
Planet Cassandra. (2015). Nosql databases defined and explained. [web page]. Retrieved from http://www.planetcassandra.org/what-is-nosql/
Standish, J. (2013). Speed to detection - strategically leveraging advanced analytics for insurance fraud. [blog]. Retrieved from http://www.johnstandishconsultinggroup.com/JohnStandishConsultingGroup.com/Blog/Entries/2013/8/9_Speed_to_Detection_-_Strategically_Leveraging_Advanced_Analytics_for_Insurance_Fraud.html
Strategy Meets Action. (2012). Data and analytics in insurance. [pdf]. Retrieved from https://www.acord.org/library/Documents/2012_SMA_Data_Analytics.pdf
Strategy Meets Action. (2013). Data and analytics in insurance: p&c plans and priorities for 2013 and beyond. [pdf]. Retrieved from https://strategymeetsaction.com/data-and-analytics-in-insurance-p-and-c-plans-and-priorities-for-2013-and-beyond/
Zikopoulos, P., deRoos, D., Bienko, C., Buglio, R., Andrews, M. (2014). Big data beyond the hype. [pdf]. Retrieved from https://www.ibm.com/developerworks/community/blogs/SusanVisser/entry/big_data_beyond_the_hype_a_guide_to_conversations_for_today_s_data_center?lang=en