Data Science Startup
Discussion Document
Intent
This overview is not intended to be a business case for data science. It is expected that you are already familiar with the
value proposition. However, a reference to several case study examples has been included at the end of this document
as a reminder of the broad applicability of the subject at hand.
The intent of this document is to set in motion the discussion for the creation of a startup in South Africa that is focused
on data science. To be clear, the objective of this startup is to:
 Capture the best talent that exists in South Africa in data science
 Be the leader in data science in Southern Africa and the go-to for organisations seeking services, products and
training
 Be a leader in the global data science marketplace by having the best people in the business and a competitive
advantage over international firms based on lower people costs
Do not be deluded into thinking that this undertaking is at all easy. However, the challenges inherent in this undertaking are
an opportunity, as they serve as barriers to entry for those seeking to compete.
copyright Gregg Barrett August 2016
Example of some of the current uses of data science
- Detection of unauthorized trading activity
- Identification of data abuse to protect sensitive information and intellectual property
- Preparing for major political and economic transformations
- Accurate rating for insurance underwriting
- Predicting disease outbreaks
- Detection of information security threats
- Identification of poachers from real time drone footage and audio networks
- Managing datacentre infrastructure
- Predicting part failure
- Customer/contact centre support
- Market making in securities
- Credit scoring
- Accelerating biomedical research
- Discovery of patterns of behaviour and links between key actors
- Anticipating emerging threats such as the planning of terrorist attacks
- Improving patient outcomes
- Predicting the path of wildfires
- Detection and elimination of sophisticated criminal activity
- Autonomous driving vehicles
- Product recommendations
- Improving transportation efficiency
- Understanding consumer sentiment
- Language translation
- New craft beer recipes!
Describing data science is like trying to describe a sunset.
It should be easy, but somehow capturing the words is impossible
(Booz Allen Hamilton, 2015)
We shall use the following definition
Data science is the utilisation of a vast set of tools for modelling and understanding complex datasets.
To simplify matters, we shall consider:
 analytics
 machine learning
 artificial intelligence
 and big data
as being part of our data science framework.
Data science is NOT:
 fancy looking reports (product of SQL queries)
 spiffy dashboards (sexy bar graphs and pie charts)
 a wonderfully expensive Business Intelligence offering
The future of data science
What happened? A company that wasn’t even in your industry launched a new product and has completely flattened you. Sound familiar? It
does for anyone who’s familiar with Uber. Uber first launched as a transportation service, using data and analytics to provide customers with
easy, accessible and fast transportation directly from their phone. Now, Uber has since expanded to beyond just transportation, offering
additional services from consumers’ phones such as meals and delivery. (IBM, 2016)
Some of the hottest, most critical domains in which data science will be applied in the coming years include:
 Cybersecurity including advanced detection, modelling, prediction, and prescriptive analytics
 Healthcare including genomics, precision medicine, population health, healthcare delivery, health data sharing and integration,
health record mining, and wearable device analytics
 IoT (Internet of Things) including sensor analytics, smart data, and emergent discovery alerting and response
 Customer Engagement and Experience including 360-degree view, gamification, and just-in-time personalization
 Smart X, where X = cities, highways, cars, delivery systems, supply chain, and more
 Precision Y, where Y = medicine, farming, harvesting, manufacturing, pricing, and more
 Personalized Z, where Z = marketing, advertising, healthcare, learning, and more
 Human capital (talent) and organizational analytics
 Societal good (Booz Allen Hamilton, 2015)
Examples of those with data science at their core
Two of the world’s most successful hedge funds:
 Renaissance Technologies LLC
 Bridgewater Associates
A British startup founded in 2010 and acquired by Google in 2014 for around 600 million USD:
 DeepMind
One of the first data science consulting firms, founded in 1995:
 Elder Research
A startup focused on autonomous driving:
 comma.ai
A startup focused on cybersecurity:
 SparkCognition
Fighting blind without data science
Float like a butterfly, sting like a bee,
most firms in South Africa can’t hit what they can’t see.
Why South Africa
 The value proposition for data science in South Africa is no different from that in other countries.
 Globally, skills are in short supply, and in South Africa the problem is even more acute.
 For the handful of people in South Africa with the necessary competence, opportunities abroad are compelling, as
compensation is around 3 times what they would receive in South Africa.
 Data science in South Africa is for the most part in a nascent state. Leading solution providers, for example, have no
presence anywhere on the African continent:
MapR, Cloudera, Hortonworks
Datameer, Trifacta, Paxata
Palantir, Elder Research, Alpine Data Labs
RapidMiner, SparkCognition, Pivotal Software
 For international organisations, weakness in the South African economy and the South African rand makes the value
proposition of a South African based provider compelling.
It’s more about people than about machines
At the very core of this undertaking are people; they are the key to success. Only the truly brilliant will do. They are the
outliers and are not easily sourced or recruited. Fortunately, these people tend to be averse to labels such as “Fortune 500”,
“multinational” and “blue chip organisation”, which evoke thoughts of stifling bureaucracy and politics. A startup is what
appeals to them: a place where they have their say, are individuals within a team, have a stake in something that can make a
difference and can be themselves.
They are a rather scarce commodity in South Africa. However, this presents an opportunity, as the scarcity of talent
serves as an impediment to firms seeking to compete and build competence in this space.
Capturing the best and the brightest in the data science market in South Africa is a primary objective.
What it takes to manage such an operation.
Winner-takes-all
In this field, one brilliant person can deliver the work of 10 average persons. It is critical that every individual who is part of this startup has skin in the game through an
equity stake. The equity position serves to attract and retain the people we seek.
People cost is the single largest cost, but also a source of competitive advantage. As a guide for data science positions in the United States:
Entry level position: 100 000 USD base salary
Mid-level position: 150 000 – 250 000 USD base salary
Senior level position: 300 000 – 500 000 USD base salary
South Africa cannot compete with such levels of compensation, which is a contributing factor to why much of the talent leaves the country. However, we do not need such compensation
levels in order to be successful. It is estimated that we can comfortably operate at around 65% to 75% of the cost of a comparable firm in the US. A cost saving of 25% or more
will be a major competitive differentiator and particularly attractive to international firms.
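As a quick check on the figures above, operating at 65% to 75% of a US firm's cost implies a saving of between 25% and 35%, so 25% is the conservative end of the band. The sketch below only restates that arithmetic; the function name `saving_band` is illustrative and not part of the original document.

```python
def saving_band(cost_low_pct: int, cost_high_pct: int) -> tuple[int, int]:
    """Return the (minimum, maximum) saving in percent for a cost band.

    Both arguments express our operating cost as a percentage of the
    cost of a comparable US firm.
    """
    if not 0 <= cost_low_pct <= cost_high_pct <= 100:
        raise ValueError("expected 0 <= cost_low_pct <= cost_high_pct <= 100")
    # The highest cost gives the smallest saving, and vice versa.
    return 100 - cost_high_pct, 100 - cost_low_pct


min_saving, max_saving = saving_band(65, 75)
print(f"Saving versus a US firm: {min_saving}% to {max_saving}%")
```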
As a guide in South Africa we would aim for:
Entry level position: 600 000 ZAR base salary
Mid-level position: 800 000 ZAR base salary
Senior level position: 1 000 000 ZAR base salary
We believe the following strategy will be attractive:
 compensation levels higher than what is currently offered by local organisations
 an equity position
 being part of a startup composed only of the best
 an opportunity to make a major impact
About people
I said that the best in this business are a rather scarce commodity, but what do they look like? Here are a couple of
examples:
 Gabor Melis
 George Hotz
What are some of the skills that these persons possess? The document “The Quest for Unicorns” by Elder Research serves as a
good starting point:
The Quest for Unicorns by Elder Research
The following article from The Economist gives some insight into just how intense the arms race for talent has become:
As Silicon Valley fights for talent, universities struggle to hold on to their stars
Options for South African organisations pushing forward on data science
1. Build the capability internally: Such an approach will be challenging, with most firms not even knowing where to
start. The shortage of talent simply compounds the problem.
2. Retain the services of an outside firm: There are several outside of South Africa. Such an approach will be costly
though due to dollar exposure. Therefore, the likely approach will be to restrict the search to the local market,
supporting our proposition - and a proposition that conversely will be appealing to international firms.
3. Incubate/finance a separate entity and in so doing gain the necessary business capability as well as the added
benefit of an equity position which could generate financial gain.
What is needed
The following options are being considered:
Startup via funding: funding the startup as a wholly separate entity for a three-year stretch in exchange for an equity
position
Startup via incubation: incubating the startup within an existing organisation, where the startup generates value for the
organisation and where the organisation has an equity position in the startup with a view to a spin off once it has
reached sufficient scale
Startup via initial clients: securing sufficient initial clients under contract to cover start-up costs
Budget
We are looking to put together a 5 to 10 person team. This would require a budget of 5 – 10 million ZAR a year for three
years.
The budget calculation is rather straightforward:
5 million ZAR a year for a 5 person team
10 million ZAR a year for a 10 person team
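The two figures above imply roughly 1 million ZAR per person per year. A minimal sketch of the calculation follows, under that assumption; the constant and function names are illustrative only, and the three-year totals are derived from the stated annual figures rather than quoted from the document.

```python
# Assumption: implied by "5 million ZAR a year for a 5 person team".
COST_PER_PERSON_ZAR = 1_000_000
# From the text: the budget is required "a year for three years".
FUNDING_YEARS = 3


def annual_budget(team_size: int) -> int:
    """Annual budget in ZAR for a team of the given size."""
    return team_size * COST_PER_PERSON_ZAR


def total_budget(team_size: int, years: int = FUNDING_YEARS) -> int:
    """Total budget in ZAR over the whole funding period."""
    return annual_budget(team_size) * years


for size in (5, 10):
    print(f"{size}-person team: {annual_budget(size):,} ZAR a year, "
          f"{total_budget(size):,} ZAR over {FUNDING_YEARS} years")
```

On these assumptions the full three-year commitment ranges from 15 million ZAR for a 5-person team to 30 million ZAR for a 10-person team.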
The nature of the business means that it does not require investment in physical assets. Electricity and an internet
connection for access to cloud infrastructure are the primary requirements. The startup is thus minimally exposed to
risks in the South African operating environment. Further, cloud infrastructure requirements are scaled as and when
needed – pay as you go.
Risk
The probability of increases in income tax and corporate tax rates in South Africa is viewed as a risk which could place
upward pressure on operating costs. However, there are options to mitigate this risk.
Revenue sources
 Consulting
Strategy
Execution
 Product
Product will be created as and when the need arises. However, consulting would be the initial focus, with product
being a longer-term focus.
 Training
Approach
The approach is to be as agnostic as possible when it comes to platform/technology/products.
We would also seek to develop academic collaboration with the likes of UCT and WITS.
Example of the Bloomberg Labs Data Science program.
Areas for consulting
The Cross Industry Standard Process for Data Mining (CRISP-DM) is a data mining process model that provides a reference methodology for
conducting data mining. The tasks and outputs listed in the approach give examples of areas where consulting work can be provided in executing a
data science project.
Figure 1: Generic tasks (bold) and outputs (italic) of the CRISP-DM reference model
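Because the figure itself is not reproduced in this text, the six CRISP-DM phases and their generic tasks, as listed in the CRISP-DM 1.0 guide cited in the references, can be sketched as a simple lookup. The names `CRISP_DM_TASKS` and `consulting_areas` are illustrative, and the outputs shown in italic in the figure are omitted here.

```python
# Phases and generic tasks of the CRISP-DM reference model (CRISP-DM 1.0).
# The dictionary name and helper below are illustrative, not from the source.
CRISP_DM_TASKS = {
    "Business Understanding": [
        "Determine Business Objectives", "Assess Situation",
        "Determine Data Mining Goals", "Produce Project Plan",
    ],
    "Data Understanding": [
        "Collect Initial Data", "Describe Data",
        "Explore Data", "Verify Data Quality",
    ],
    "Data Preparation": [
        "Select Data", "Clean Data", "Construct Data",
        "Integrate Data", "Format Data",
    ],
    "Modeling": [
        "Select Modeling Technique", "Generate Test Design",
        "Build Model", "Assess Model",
    ],
    "Evaluation": [
        "Evaluate Results", "Review Process", "Determine Next Steps",
    ],
    "Deployment": [
        "Plan Deployment", "Plan Monitoring and Maintenance",
        "Produce Final Report", "Review Project",
    ],
}


def consulting_areas(phase: str) -> list[str]:
    """Return the generic tasks for a phase, each a possible consulting area."""
    return CRISP_DM_TASKS[phase]


for phase, tasks in CRISP_DM_TASKS.items():
    print(f"{phase}: {', '.join(tasks)}")
```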
Optionality through data driven business models
There is a growing trend of data driven technology companies utilising their own solutions to compete with incumbents
in the marketplace, as opposed to licensing their offerings to those incumbents. For example, let’s say that Google
finds a new way to price and deliver insurance. Rather than licensing it to existing participants in the insurance market,
an approach that now seems to be considered more frequently is for the company to set up its own insurance entity. With
negative interest rates in many parts of the world, capital is abundant and operating licenses are not impossible to
obtain.
Mondo is an example of such thinking:
Digital challenger bank Mondo just got its banking licence
Uber is another:
Uber’s First Self-Driving Fleet Arrives in Pittsburgh This Month
Data Charlatans
I spoke earlier of the need to recruit the best and the brightest. Why, you ask? Get things wrong and at best you look silly;
at worst you blow things up:
Example of getting it wrong and looking silly:
John Gray: Steven Pinker is wrong about violence and war
Example of blowing up:
Recipe for Disaster: The Formula That Killed Wall Street
Big Data brings its own set of challenges:
Beware the Big Errors of ‘Big Data’
Beyond Big Data: Identifying Important Information for Real World Challenges
A note for insurance
Traditional actuarial approaches are no match for the data and computing resources now available, with the likes of Gradient Boosting
Machines, Neural Networks and ensembles of such providing far superior levels of accuracy.
“As more insurers use predictive analytics, those not doing so will be increasingly exposed to adverse selection because their market
will be limited to a subsection for the general population that has worse-than-average loss ratios.” (Nyce, 2007)
Analytics has the potential to make a positive impact on virtually every aspect of the insurance life cycle.
 Product development
 Marketing and distribution
 Pricing and underwriting
 Risk control
 Claims management
 Performance management (Accenture, 2013, pg. 5)
For a more comprehensive overview of data science in insurance:
Value proposition of analytics in P&C insurance
Further reading of potential interest
Bridgewater Associates building an artificial intelligence competence: Bridgewater Is Said to Start Artificial-Intelligence Team
Bloomberg LP building a machine learning competence: Bloomberg and “the magic” of machine learning
Example of Google using its DeepMind unit to save on energy consumption: Google Cuts Its Giant Electricity Bill With DeepMind-Powered AI
Example of the arms race for data: Tiny Satellites: The Latest Innovation Hedge Funds Are Using to Get a Leg Up
Case studies on data science abound on the internet, for example:
Healthcare: When Health Care Gets a Healthy Dose of Data – Intermountain Healthcare
Industrial: The Industrial Internet – GE Digital
Automotive: The Connected Vehicle Data Platform – Ford Motor Company
Insurance: Geospatial Analytics – Progressive Insurance
Case Studies from MIT Sloan Management Review: MIT Sloan Management Review Case Studies
Case Studies from Elder Research:
Defense and intelligence: Automating Textual Data Discovery And Analysis
Nonprofit Service Organization: Determining Influential Factors for Conference Satisfaction
Pharmaceutical: Discovering the Efficacy of a New Drug
Retail, Consumer Electronics: Enhancing Customer Loyalty
Government, Healthcare: Improving Claims Approval Speed and Accuracy
Retail Banking, Financial Services: Improving Credit Card Risk Scoring
Telecommunications: Improving Customer Retention and Profitability
Healthcare Insurance: Improving Provider Performance and Patient Outcomes
Retail Banking, Financial Services: Predicting Financial Account Churn
Oil and Gas: Predicting Natural Gas Well Freezing
Government: Prioritizing Building Lease Renewals
Healthcare Insurance: Prioritizing Long-Term Care Claims
Government: Reducing Fraud, Waste, and Abuse
Retail, Computer and Electronic, Product Manufacturing: Reducing Service Provider and Warranty Fraud
IT Management: Staffing Optimization
Insurance: Understanding Customer Sentiment
Retail, Commercial Software: Using Log Analytics to Improve User Experience
There is no shortage of conferences either, for example: Bloomberg Data for Good Exchange
Organized around the following topic areas:
- Justice and fairness, including criminal justice, discrimination, algorithmic bias, workers’ rights, voting rights, etc.
- Economic development, including housing, job security, immigration, wages, challenges coming from the “gig” economy, remittance services, etc.
- Security and safety, including emergency services, cyber-attacks, dark web and illegal content, gun control, resilience, etc.
- Public service delivery, including transportation, sustainability, biodiversity and health monitoring, public health, etc.
Compiled by:
Gregg Barrett
References
Accenture. (2013). The digital insurer: achieving payback in insurance analytics. [pdf].
Retrieved from http://www.accenture.com/us-en/Pages/insight-payback-insurance-analytics.aspx
Booz Allen Hamilton. (2015). The field guide to data science. [pdf]. Retrieved from
https://www.boozallen.com/content/dam/boozallen/documents/2015/12/2015-FIeld-Guide-To-Data-Science.pdf
CRISP-DM. (2000). Generic tasks (bold) and outputs (italic) of the CRISP-DM reference model. [Figure]. In CRISP-DM 1.0. [pdf].
Retrieved from https://the-modeling-agency.com/crisp-dm.pdf
IBM. (2016). Why data science should be your priority. [pdf]. Retrieved from
http://www.ibmbigdatahub.com/blog/why-data-science-should-be-your-top-priority
Nyce, C. (2007). Predictive analytics white paper. [pdf]. Retrieved from
http://www.theinstitutes.org/doc/predictivemodelingwhitepaper.pdf

More Related Content

What's hot

Regulating corporate vc
Regulating corporate vcRegulating corporate vc
Regulating corporate vcIan Beckett
 
Atomico Need-to-Know 8 September 2017
Atomico Need-to-Know 8 September 2017Atomico Need-to-Know 8 September 2017
Atomico Need-to-Know 8 September 2017Atomico
 
DealMarket Digest Issue 132 - 14 March 2014
DealMarket Digest Issue 132 - 14 March 2014DealMarket Digest Issue 132 - 14 March 2014
DealMarket Digest Issue 132 - 14 March 2014Urs Haeusler
 
GSAMPerspectives7-BigData-Edition
GSAMPerspectives7-BigData-EditionGSAMPerspectives7-BigData-Edition
GSAMPerspectives7-BigData-EditionGang Li
 
Thalento® Presentation HRM Expo Russia 2014: "Big Data, is Talent Analytics t...
Thalento® Presentation HRM Expo Russia 2014: "Big Data, is Talent Analytics t...Thalento® Presentation HRM Expo Russia 2014: "Big Data, is Talent Analytics t...
Thalento® Presentation HRM Expo Russia 2014: "Big Data, is Talent Analytics t...Thalento®
 
BCG WCIT2019 Plenary Presentation
BCG WCIT2019 Plenary PresentationBCG WCIT2019 Plenary Presentation
BCG WCIT2019 Plenary PresentationMiguel Carrasco
 
Atomico Need-to-Know 17 September 2018
Atomico Need-to-Know 17 September 2018Atomico Need-to-Know 17 September 2018
Atomico Need-to-Know 17 September 2018Atomico
 
Legal tech trends 2019 report
Legal tech trends 2019 reportLegal tech trends 2019 report
Legal tech trends 2019 reportDan Storbaek
 
ConsumerComplaints_VisualReport
ConsumerComplaints_VisualReportConsumerComplaints_VisualReport
ConsumerComplaints_VisualReportAakash Parwani
 
Tracxn - Report - Artificial Intelligence
Tracxn - Report - Artificial IntelligenceTracxn - Report - Artificial Intelligence
Tracxn - Report - Artificial IntelligenceSharad Maheshwari
 

What's hot (18)

Regulating corporate vc
Regulating corporate vcRegulating corporate vc
Regulating corporate vc
 
The 25 Predictions About The Future Of Big Data
The 25 Predictions About The Future Of Big DataThe 25 Predictions About The Future Of Big Data
The 25 Predictions About The Future Of Big Data
 
Atomico Need-to-Know 8 September 2017
Atomico Need-to-Know 8 September 2017Atomico Need-to-Know 8 September 2017
Atomico Need-to-Know 8 September 2017
 
DealMarket Digest Issue 132 - 14 March 2014
DealMarket Digest Issue 132 - 14 March 2014DealMarket Digest Issue 132 - 14 March 2014
DealMarket Digest Issue 132 - 14 March 2014
 
The Bionic Company
The Bionic CompanyThe Bionic Company
The Bionic Company
 
SVB China Startup Outlook 2016
SVB China Startup Outlook 2016SVB China Startup Outlook 2016
SVB China Startup Outlook 2016
 
GSAMPerspectives7-BigData-Edition
GSAMPerspectives7-BigData-EditionGSAMPerspectives7-BigData-Edition
GSAMPerspectives7-BigData-Edition
 
Artificial Intelligence in the startup world
Artificial Intelligence in the startup worldArtificial Intelligence in the startup world
Artificial Intelligence in the startup world
 
Thalento® Presentation HRM Expo Russia 2014: "Big Data, is Talent Analytics t...
Thalento® Presentation HRM Expo Russia 2014: "Big Data, is Talent Analytics t...Thalento® Presentation HRM Expo Russia 2014: "Big Data, is Talent Analytics t...
Thalento® Presentation HRM Expo Russia 2014: "Big Data, is Talent Analytics t...
 
Final iam97 unicorns patent strategies
Final iam97 unicorns patent strategiesFinal iam97 unicorns patent strategies
Final iam97 unicorns patent strategies
 
The Future of Work
The Future of WorkThe Future of Work
The Future of Work
 
BCG WCIT2019 Plenary Presentation
BCG WCIT2019 Plenary PresentationBCG WCIT2019 Plenary Presentation
BCG WCIT2019 Plenary Presentation
 
Atomico Need-to-Know 17 September 2018
Atomico Need-to-Know 17 September 2018Atomico Need-to-Know 17 September 2018
Atomico Need-to-Know 17 September 2018
 
Intervyo document
Intervyo documentIntervyo document
Intervyo document
 
Legal tech trends 2019 report
Legal tech trends 2019 reportLegal tech trends 2019 report
Legal tech trends 2019 report
 
ConsumerComplaints_VisualReport
ConsumerComplaints_VisualReportConsumerComplaints_VisualReport
ConsumerComplaints_VisualReport
 
Impact at Scale
Impact at ScaleImpact at Scale
Impact at Scale
 
Tracxn - Report - Artificial Intelligence
Tracxn - Report - Artificial IntelligenceTracxn - Report - Artificial Intelligence
Tracxn - Report - Artificial Intelligence
 

Viewers also liked

Rio Cloud Computing Meetup 25/01/2017 - Lançamentos do AWS re:Invent 2016
Rio Cloud Computing Meetup 25/01/2017 - Lançamentos do AWS re:Invent 2016Rio Cloud Computing Meetup 25/01/2017 - Lançamentos do AWS re:Invent 2016
Rio Cloud Computing Meetup 25/01/2017 - Lançamentos do AWS re:Invent 2016Filipe Barretto
 
SPSNL17 - Securing Office 365 and Microsoft Azure like a rock star (or groupi...
SPSNL17 - Securing Office 365 and Microsoft Azure like a rock star (or groupi...SPSNL17 - Securing Office 365 and Microsoft Azure like a rock star (or groupi...
SPSNL17 - Securing Office 365 and Microsoft Azure like a rock star (or groupi...DIWUG
 
Running Business Critical Workloads on AWS – Nam Je Cho
Running Business Critical Workloads on AWS – Nam Je ChoRunning Business Critical Workloads on AWS – Nam Je Cho
Running Business Critical Workloads on AWS – Nam Je ChoAmazon Web Services
 
High Availability Architecture for Legacy Stuff - a 10.000 feet overview
High Availability Architecture for Legacy Stuff - a 10.000 feet overviewHigh Availability Architecture for Legacy Stuff - a 10.000 feet overview
High Availability Architecture for Legacy Stuff - a 10.000 feet overviewMarco Amado
 
Harmonizing Multi-tenant HBase Clusters for Managing Workload Diversity
Harmonizing Multi-tenant HBase Clusters for Managing Workload DiversityHarmonizing Multi-tenant HBase Clusters for Managing Workload Diversity
Harmonizing Multi-tenant HBase Clusters for Managing Workload DiversityHBaseCon
 
Cwin16 tls-s2-0945-going cloud native
Cwin16 tls-s2-0945-going cloud nativeCwin16 tls-s2-0945-going cloud native
Cwin16 tls-s2-0945-going cloud nativeCapgemini
 
Plan de transport 2014: le Brabant Flamand
Plan de transport 2014: le Brabant FlamandPlan de transport 2014: le Brabant Flamand
Plan de transport 2014: le Brabant FlamandSNCB
 
Nato Constitution- & Laws. Chris Helweg
Nato Constitution-  &  Laws. Chris HelwegNato Constitution-  &  Laws. Chris Helweg
Nato Constitution- & Laws. Chris HelwegChris Helweg
 
15 oefeningen schakelen van weerstanden
15 oefeningen schakelen van weerstanden15 oefeningen schakelen van weerstanden
15 oefeningen schakelen van weerstandenFreddy Van Eynde
 
Emerging Technologies: Heroku for ISVs (October 13, 2014)
Emerging Technologies: Heroku for ISVs (October 13, 2014)Emerging Technologies: Heroku for ISVs (October 13, 2014)
Emerging Technologies: Heroku for ISVs (October 13, 2014)Salesforce Partners
 
The Loss of HMAS SYDNEY 2: Medical Aspects- Westphalen
The Loss of HMAS SYDNEY 2: Medical Aspects- WestphalenThe Loss of HMAS SYDNEY 2: Medical Aspects- Westphalen
The Loss of HMAS SYDNEY 2: Medical Aspects- WestphalenLeishman Associates
 
Production testing and disaster recovery
Production testing and disaster recoveryProduction testing and disaster recovery
Production testing and disaster recoveryBizTalk360
 
Big Data Commercialization and associated IoT Platform Implications by Ramnik...
Big Data Commercialization and associated IoT Platform Implications by Ramnik...Big Data Commercialization and associated IoT Platform Implications by Ramnik...
Big Data Commercialization and associated IoT Platform Implications by Ramnik...Data Con LA
 
Challenges and outlook with Big Data
Challenges and outlook with Big Data Challenges and outlook with Big Data
Challenges and outlook with Big Data IJCERT JOURNAL
 
소셜 코딩 GitHub & branch & branch strategy
소셜 코딩 GitHub & branch & branch strategy소셜 코딩 GitHub & branch & branch strategy
소셜 코딩 GitHub & branch & branch strategyKenu, GwangNam Heo
 
Cassandra Talk: Austin JUG
Cassandra Talk: Austin JUGCassandra Talk: Austin JUG
Cassandra Talk: Austin JUGStu Hood
 
codeless/serverless develop
codeless/serverless develop codeless/serverless develop
codeless/serverless develop Tomoyuki Obi
 
Workshop 2: Building a streaming data platform on AWS
Workshop 2: Building a streaming data platform on AWSWorkshop 2: Building a streaming data platform on AWS
Workshop 2: Building a streaming data platform on AWSAmazon Web Services
 

Viewers also liked (20)

Rio Cloud Computing Meetup 25/01/2017 - Lançamentos do AWS re:Invent 2016
Rio Cloud Computing Meetup 25/01/2017 - Lançamentos do AWS re:Invent 2016Rio Cloud Computing Meetup 25/01/2017 - Lançamentos do AWS re:Invent 2016
Rio Cloud Computing Meetup 25/01/2017 - Lançamentos do AWS re:Invent 2016
 
PaaS for Dummies
PaaS for DummiesPaaS for Dummies
PaaS for Dummies
 
SPSNL17 - Securing Office 365 and Microsoft Azure like a rock star (or groupi...
SPSNL17 - Securing Office 365 and Microsoft Azure like a rock star (or groupi...SPSNL17 - Securing Office 365 and Microsoft Azure like a rock star (or groupi...
SPSNL17 - Securing Office 365 and Microsoft Azure like a rock star (or groupi...
 
Running Business Critical Workloads on AWS – Nam Je Cho
Running Business Critical Workloads on AWS – Nam Je ChoRunning Business Critical Workloads on AWS – Nam Je Cho
Running Business Critical Workloads on AWS – Nam Je Cho
 
High Availability Architecture for Legacy Stuff - a 10.000 feet overview
High Availability Architecture for Legacy Stuff - a 10.000 feet overviewHigh Availability Architecture for Legacy Stuff - a 10.000 feet overview
High Availability Architecture for Legacy Stuff - a 10.000 feet overview
 
Harmonizing Multi-tenant HBase Clusters for Managing Workload Diversity
Harmonizing Multi-tenant HBase Clusters for Managing Workload DiversityHarmonizing Multi-tenant HBase Clusters for Managing Workload Diversity
Harmonizing Multi-tenant HBase Clusters for Managing Workload Diversity
 
Cwin16 tls-s2-0945-going cloud native
Cwin16 tls-s2-0945-going cloud nativeCwin16 tls-s2-0945-going cloud native
Cwin16 tls-s2-0945-going cloud native
 
Bennett raglinphotography
Bennett raglinphotographyBennett raglinphotography
Bennett raglinphotography
 
Plan de transport 2014: le Brabant Flamand
Plan de transport 2014: le Brabant FlamandPlan de transport 2014: le Brabant Flamand
Plan de transport 2014: le Brabant Flamand
 
Nato Constitution- & Laws. Chris Helweg
Nato Constitution-  &  Laws. Chris HelwegNato Constitution-  &  Laws. Chris Helweg
Nato Constitution- & Laws. Chris Helweg
 
15 oefeningen schakelen van weerstanden
15 oefeningen schakelen van weerstanden15 oefeningen schakelen van weerstanden
15 oefeningen schakelen van weerstanden
 
Emerging Technologies: Heroku for ISVs (October 13, 2014)
Emerging Technologies: Heroku for ISVs (October 13, 2014)Emerging Technologies: Heroku for ISVs (October 13, 2014)
Emerging Technologies: Heroku for ISVs (October 13, 2014)
 
The Loss of HMAS SYDNEY 2: Medical Aspects- Westphalen
The Loss of HMAS SYDNEY 2: Medical Aspects- WestphalenThe Loss of HMAS SYDNEY 2: Medical Aspects- Westphalen
The Loss of HMAS SYDNEY 2: Medical Aspects- Westphalen
 
Production testing and disaster recovery
Production testing and disaster recoveryProduction testing and disaster recovery
Production testing and disaster recovery
 
Big Data Commercialization and associated IoT Platform Implications by Ramnik...
Big Data Commercialization and associated IoT Platform Implications by Ramnik...Big Data Commercialization and associated IoT Platform Implications by Ramnik...
Big Data Commercialization and associated IoT Platform Implications by Ramnik...
 
Challenges and outlook with Big Data
Challenges and outlook with Big Data Challenges and outlook with Big Data
Challenges and outlook with Big Data
 
소셜 코딩 GitHub & branch & branch strategy
소셜 코딩 GitHub & branch & branch strategy소셜 코딩 GitHub & branch & branch strategy
소셜 코딩 GitHub & branch & branch strategy
 
Cassandra Talk: Austin JUG
Cassandra Talk: Austin JUGCassandra Talk: Austin JUG
Cassandra Talk: Austin JUG
 
codeless/serverless develop
codeless/serverless develop codeless/serverless develop
codeless/serverless develop
 
Workshop 2: Building a streaming data platform on AWS
Workshop 2: Building a streaming data platform on AWSWorkshop 2: Building a streaming data platform on AWS
Workshop 2: Building a streaming data platform on AWS
 

Data science unit introduction

  • 2. Intent
    This overview is not intended to be a business case for data science. It is expected that you are already familiar with the value proposition. However, a reference to several case study examples has been included at the end of this document as a reminder of the broad applicability of the subject at hand.
    The intent of this document is to set in motion the discussion for the creation of a startup in South Africa that is focused on data science. To be clear, the objective of this startup is to:
    - Capture the best talent that exists in South Africa in data science
    - Be the leader in data science in Southern Africa and the go-to for organisations seeking services, products and training
    - Be a leader in the global data science marketplace by having the best people in the business and a competitive advantage over international firms based on lower people costs
    Do not be deluded into thinking that this undertaking is at all easy. However, the challenges inherent in this undertaking are an opportunity, as they serve as barriers to entry for those seeking to compete.
    copyright Gregg Barrett August 2016
  • 3. Examples of some of the current uses of data science
    - Detection of unauthorized trading activity
    - Accelerating biomedical research
    - Identification of data abuse to protect sensitive information and intellectual property
    - Discovery of patterns of behaviour and links between key actors
    - Preparing for major political and economic transformations
    - Anticipating emerging threats such as the planning of terrorist attacks
    - Accurate rating for insurance underwriting
    - Improving patient outcomes
    - Predicting disease outbreaks
    - Predicting the path of wildfires
    - Detection of information security threats
    - Detection and elimination of sophisticated criminal activity
    - Identification of poachers from real time drone footage and audio networks
    - Autonomous driving vehicles
    - Managing datacentre infrastructure
    - Product recommendations
    - Predicting part failure
    - Improving transportation efficiency
    - Customer/contact centre support
    - Understanding consumer sentiment
    - Market making in securities
    - Language translation
    - Credit scoring
    - New craft beer recipes!
  • 4. Describing data science is like trying to describe a sunset. It should be easy, but somehow capturing the words is impossible (Booz Allen Hamilton, 2015)
  • 5. We shall use the following definition:
    Data science is the utilisation of a vast set of tools for modelling and understanding complex datasets.
    To simplify matters we shall consider:
    - analytics
    - machine learning
    - artificial intelligence
    - and big data
    as being part of our data science framework.
    Data science is NOT:
    - fancy looking reports (product of SQL queries)
    - spiffy dashboards (sexy bar graphs and pie charts)
    - a wonderfully expensive Business Intelligence offering
  • 6. The future of data science
    What happened? A company that wasn’t even in your industry launched a new product and has completely flattened you. Sound familiar? It does for anyone who’s familiar with Uber. Uber first launched as a transportation service, using data and analytics to provide customers with easy, accessible and fast transportation directly from their phone. Now, Uber has since expanded to beyond just transportation, offering additional services from consumers’ phones such as meals and delivery. (IBM, 2016)
    Some of the hottest, most critical domains in which data science will be applied in the coming years include:
    - Cybersecurity including advanced detection, modelling, prediction, and prescriptive analytics
    - Healthcare including genomics, precision medicine, population health, healthcare delivery, health data sharing and integration, health record mining, and wearable device analytics
    - IoT (Internet of Things) including sensor analytics, smart data, and emergent discovery alerting and response
    - Customer Engagement and Experience including 360-degree view, gamification, and just-in-time personalization
    - Smart X, where X = cities, highways, cars, delivery systems, supply chain, and more
    - Precision Y, where Y = medicine, farming, harvesting, manufacturing, pricing, and more
    - Personalized Z, where Z = marketing, advertising, healthcare, learning, and more
    - Human capital (talent) and organizational analytics
    - Societal good
    (Booz Allen Hamilton, 2015)
  • 7. Examples of those with data science at their core
    Two of the world's most successful hedge funds:
    - Renaissance Technologies LLC
    - Bridgewater Associates
    A British startup founded in 2010, acquired by Google in 2014 for around 600 million USD:
    - DeepMind
    One of the first data science consulting firms, founded in 1995:
    - Elder Research
    A startup focused on autonomous driving:
    - comma.ai
    A startup focused on cybersecurity:
    - SparkCognition
  • 8. Fighting blind without data science
    Float like a butterfly, sting like a bee, for most firms in South Africa they can’t hit what they can’t see.
  • 9. Why South Africa
    - The value proposition for data science in South Africa is no different from that in other countries.
    - Globally, skills are in short supply, and in South Africa the problem is even more acute.
    - For the handful of persons in South Africa with the necessary competence, opportunities abroad are compelling, as compensation is around 3 times what they would receive in South Africa.
    - Data science in South Africa is for the most part in a nascent state. Leading solution providers, for example, have no presence anywhere on the African continent: MapR, Cloudera, Hortonworks, Datameer, Trifacta, Paxata, Palantir, Elder Research, Alpine Data Labs, RapidMiner, SparkCognition, Pivotal Software.
    - For international organisations, weakness in the South African economy and the South African rand makes the value proposition of a South African based provider compelling.
  • 10. It’s more about people than about machines
    At the very core of this undertaking are people: they are the key to success. Only the truly brilliant will do. They are the outliers and are not easily sourced or recruited.
    Fortunately, these people tend to be averse to labels such as “Fortune 500”, “multinational” and “blue chip organisation”, which invoke thoughts of stifling bureaucracy and politics. A startup is what appeals to them, where they have their say, are individuals within a team, have a stake in something that can make a difference and where they can be themselves.
    They are a rather scarce commodity in South Africa. However, this presents an opportunity, as the scarcity of talent serves as an impediment to firms seeking to compete and build competence in this space. Capturing the best and the brightest in the data science market in South Africa is a primary objective.
    What it takes to manage such an operation.
  • 11. Winner-takes-all
    In this field one brilliant person can deliver the work of 10 average persons. It is critical that every individual who is a part of this startup has skin-in-the-game through an equity stake. The equity position serves to attract and retain the people we seek.
    People cost is the single largest cost, but also a source of competitive advantage. As a guide for data science positions in the United States:
    - Entry level position: 100 000 USD base salary
    - Mid-level position: 150 000 to 250 000 USD base salary
    - Senior level position: 300 000 to 500 000 USD base salary
    South Africa cannot compete with such levels of compensation, which is a contributing factor in why much of the talent leaves the country. However, we do not need such compensation levels in order to be successful. It is estimated that we can comfortably operate at around 65% to 75% of the cost of a comparable firm in the US. A cost saving of 25% will be a major competitive differentiator and particularly attractive to international firms.
    As a guide in South Africa we would aim for:
    - Entry level position: 600 000 ZAR base salary
    - Mid-level position: 800 000 ZAR base salary
    - Senior level position: 1 000 000 ZAR base salary
    We believe the following strategy will be attractive:
    - compensation levels higher than what is currently offered by local organisations
    - an equity position
    - being part of a startup composed only of the best
    - an opportunity to make a major impact
  • 12. About people
    I said that the best in this business are a rather scarce commodity, but what do they look like? Here are a couple of examples:
    - Gabor Melis
    - George Hotz
    What are some of the skills that these persons possess? The document “The Quest for Unicorns” by Elder Research serves as a good starting point:
    The Quest for Unicorns by Elder Research
    The following article from The Economist gives some insight into just how intense the arms race for talent has become:
    As Silicon Valley fights for talent, universities struggle to hold on to their stars
  • 13. Options for South African organisations pushing forward on data science 1. Build the capability internally: Such an approach will be challenging, with most firms not even knowing where to start. The shortage of talent simply compounds the problem. 2. Retain the services of an outside firm: There are several outside of South Africa. Such an approach will be costly, though, due to dollar exposure. Therefore, the likely approach will be to restrict the search to the local market, which supports our proposition. Conversely, the same proposition will be appealing to international firms. 3. Incubate/finance a separate entity, and in so doing gain the necessary business capability as well as the added benefit of an equity position which could generate financial gain.
  • 14. What is needed The following options are being considered: Startup via funding: funding the startup as a wholly separate entity for a three-year period in exchange for an equity position. Startup via incubation: incubating the startup within an existing organisation, where the startup generates value for the organisation and where the organisation has an equity position in the startup, with a view to a spin-off once it has reached sufficient scale. Startup via initial clients: securing sufficient initial clients under contract to cover start-up costs.
  • 15. Budget We are looking to put together a 5 to 10 person team. This would require a budget of 5 – 10 million ZAR a year for three years. The budget calculation is rather straightforward: 5 million ZAR a year for a 5-person team; 10 million ZAR a year for a 10-person team. The nature of the business means that it does not require investment in physical assets. Electricity and an internet connection for access to cloud infrastructure are the primary requirements. The startup is thus minimally exposed to risks in the South African operating environment. Further, cloud infrastructure requirements are scaled as and when needed (pay as you go). Risk The probability of increases in income tax and corporate tax rates in South Africa is viewed as a risk which could place upward pressure on operating costs. However, there are options to mitigate this risk.
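The budget arithmetic on the slide above can be sketched in a few lines of Python. This is only an illustration: the per-person figure of 1 million ZAR a year is an assumption inferred from the slide's numbers (5 million ZAR for a 5-person team), and the function names are hypothetical.

```python
# Assumption: 1 million ZAR per person per year, inferred from the slide's
# "5 million ZAR a year for a 5 person team"; not stated by the author.
COST_PER_PERSON_ZAR = 1_000_000
FUNDING_YEARS = 3  # the slide asks for funding over three years


def annual_budget(team_size: int) -> int:
    """Annual budget in ZAR for a team of the given size."""
    return team_size * COST_PER_PERSON_ZAR


def total_budget(team_size: int, years: int = FUNDING_YEARS) -> int:
    """Total budget in ZAR over the whole funding period."""
    return annual_budget(team_size) * years


if __name__ == "__main__":
    for size in (5, 10):
        print(f"{size}-person team: {annual_budget(size):,} ZAR a year, "
              f"{total_budget(size):,} ZAR over {FUNDING_YEARS} years")
```

Under that assumption, the three-year commitment would range from 15 million ZAR for a 5-person team to 30 million ZAR for a 10-person team.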
  • 16. Revenue sources  Consulting: strategy and execution  Product: product will be created as and when the need arises. However, consulting would be the initial focus, with product being a longer-term focus.  Training Approach: The approach is to be as agnostic as possible when it comes to platform/technology/products. We would also seek to develop academic collaboration with the likes of UCT and WITS. Example: the Bloomberg Labs Data Science program.
  • 17. Areas for consulting The Cross Industry Standard Process for Data Mining (CRISP-DM) is a data mining process model that provides a reference methodology for conducting data mining. The tasks and outputs listed in the model give examples of areas where consulting work can be provided in executing a data science project. Figure 1: Generic tasks (bold) and outputs (italic) of the CRISP-DM reference model
  • 18. Optionality through data driven business models There is a growing trend of data driven technology companies utilising their own solutions to compete with incumbents in the marketplace, as opposed to licensing their offerings to established incumbents. For example, let’s say that Google finds a new way to price and deliver insurance. Rather than licensing it to existing participants in the insurance market, an approach that seems to be considered more and more often is for the company to set up its own insurance entity. With negative interest rates in many parts of the world, capital is abundant, and operating licenses are not impossible to obtain. Mondo is an example of such thinking: Digital challenger bank Mondo just got its banking licence Uber is another: Uber’s First Self-Driving Fleet Arrives in Pittsburgh This Month
  • 19. Data Charlatans I spoke earlier of the need to recruit the best and the brightest. Why, you ask? Get things wrong and at best you look silly, at worst you blow things up: Example of getting it wrong and looking silly: John Gray: Steven Pinker is wrong about violence and war Example of blowing up: Recipe for Disaster: The Formula That Killed Wall Street Big Data brings its own set of challenges: Beware the Big Errors of ‘Big Data’ Beyond Big Data: Identifying Important Information for Real World Challenges
  • 20. A note for insurance Traditional actuarial approaches are no match for the data and computing resources now available, with the likes of Gradient Boosting Machines, Neural Networks and ensembles of such providing far superior levels of accuracy. “As more insurers use predictive analytics, those not doing so will be increasingly exposed to adverse selection because their market will be limited to a subsection for the general population that has worse-than-average loss ratios.” (Nyce, 2007) Analytics has the potential to make a positive impact on virtually every aspect of the insurance life cycle.  Product development  Marketing and distribution  Pricing and underwriting  Risk control  Claims management  Performance management (Accenture, 2013, pg. 5) For a more comprehensive overview of data science in insurance: Value proposition of analytics in P&C insurance
  • 21. Further reading of potential interest Bridgewater Associates building an artificial intelligence competence: Bridgewater Is Said to Start Artificial-Intelligence Team Bloomberg LP building a machine learning competence: Bloomberg and “the magic” of machine learning Example of Google using its DeepMind unit to save on energy consumption: Google Cuts Its Giant Electricity Bill With DeepMind-Powered AI Example of the arms race for data: Tiny Satellites: The Latest Innovation Hedge Funds Are Using to Get a Leg Up
  • 22. Case studies on data science abound on the internet, for example: Healthcare: When Health Care Gets a Healthy Dose of Data – Intermountain Healthcare; Industrial: The Industrial Internet – GE Digital; Automotive: The Connected Vehicle Data Platform – Ford Motor Company; Insurance: Geospatial Analytics – Progressive Insurance. Case Studies from MIT Sloan Management Review: MIT Sloan Management Review Case Studies. Case Studies from Elder Research: Defense and intelligence: Automating Textual Data Discovery And Analysis; Nonprofit Service Organization: Determining Influential Factors for Conference Satisfaction; Pharmaceutical: Discovering the Efficacy of a New Drug; Retail, Consumer Electronics: Enhancing Customer Loyalty; Government, Healthcare: Improving Claims Approval Speed and Accuracy; Retail Banking, Financial Services: Improving Credit Card Risk Scoring; Telecommunications: Improving Customer Retention and Profitability; Healthcare Insurance: Improving Provider Performance and Patient Outcomes; Retail Banking, Financial Services: Predicting Financial Account Churn; Oil and Gas: Predicting Natural Gas Well Freezing; Government: Prioritizing Building Lease Renewals; Healthcare Insurance: Prioritizing Long-Term Care Claims; Government: Reducing Fraud, Waste, and Abuse; Retail, Computer and Electronic, Product Manufacturing: Reducing Service Provider and Warranty Fraud; IT Management: Staffing Optimization; Insurance: Understanding Customer Sentiment; Retail, Commercial Software: Using Log Analytics to Improve User Experience. There is no shortage of conferences either, for example: Bloomberg Data for Good Exchange, organized around the following topic areas: - Justice and fairness, including criminal justice, discrimination, algorithmic bias, workers’ rights, voting rights, etc. - Economic development, including housing, job security, immigration, wages, challenges coming from the “gig” economy, remittance services, etc.
- Security and safety, including emergency services, cyber-attacks, dark web and illegal content, gun control, resilience, etc. - Public service delivery, including transportation, sustainability, biodiversity and health monitoring, public health, etc.
  • 23. Compiled by: Gregg Barrett
  • 24. Reference Accenture. (2013). The digital insurer: achieving payback in insurance analytics. [pdf]. Retrieved from http://www.accenture.com/us-en/Pages/insight-payback-insurance-analytics.aspx Booz Allen Hamilton. (2015). The field guide to data science. [pdf]. Retrieved from https://www.boozallen.com/content/dam/boozallen/documents/2015/12/2015-FIeld-Guide-To-Data-Science.pdf CRISP-DM. (2000). Generic tasks (bold) and outputs (italic) of the CRISP-DM reference model. [Figure]. Retrieved from CRISP-DM. (2000). CRISP-DM 1.0. [pdf]. Retrieved from https://the-modeling-agency.com/crisp-dm.pdf IBM. (2016). Why data science should be your priority. [pdf]. Retrieved from http://www.ibmbigdatahub.com/blog/why-data-science-should-be-your-top-priority Nyce, C. (2007). Predictive analytics white paper. [pdf]. Retrieved from http://www.theinstitutes.org/doc/predictivemodelingwhitepaper.pdf