© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
AWS re:Invent
American Heart Association: Finding Cures to Heart Disease Through the Power of Technology
Dr. Jennifer Hall, Chief—Institute for Precision Cardiovascular Medicine, @jen_precision
Laura Stevens, Research Fellow—American Heart Association
Bob Strahan, Sr. Consultant—AWS Professional Services
ABD316
November 28, 2017
Agenda
PART 1: The AHA Precision Medicine Platform—why it’s important and what it is/will be used for
Inspire with the purpose, mission, and relevance of the platform. Life is why!
PART 2: The heart of the architecture—harmonize, search, and analyze
Share how PMP got started, key goals, core concepts, challenges, and solutions
PART 3: The PMP today
Live demo of today’s PMP—dataset search, request access, launch workspace, do science
PART 4: What’s Next?
Adopt new AWS services and tools; learn more about deploying and managing PMP; learn how
to deploy the reference solution and experiment for yourself; final takeaways
PART 5: Questions
The AHA Precision Medicine Platform
PART 1
The global impact of cardiovascular diseases
Cardiovascular diseases are the number 1 cause of death in the world.
Every 2 seconds, someone around the world dies from cardiovascular disease.
Over 75% of cardiovascular disease deaths take place in low- and middle-income countries.
The global cost of cardiovascular disease is approximately $900 billion and will exceed $1 trillion by 2030.
Technology transforming science:
improving health and health-care outcomes
Access, sharing, analytics, improved patient outcomes
The promise of personalized medicine
What problem are we trying to solve?
NEW TECHNOLOGY
AND CAPABILITIES
OLD MODELS
FOR DATA ACCESS AND ANALYTICS
Modernizing health care in the cloud:
New ways to develop, integrate, and utilize data
New models for research:
Removing traditional barriers to scientific discovery
Faster
access to data
One marketplace
for datasets
More time/money
spent on research
Improved patient
outcomes
The heart of the architecture
PART 2
It started with a conceptual architecture
Amazon S3
upload
At the heart are three core concepts
Harmonize: Enable search and analysis across datasets
Search: Find the data you care about
Analyze: Prove your hypotheses, create and share insights,
advance science
Scenario
To prove or refute your hypothesis, you need to find
datasets, combine them, and analyze their data.
Discordant datasets
Created at different times by different people
Use different names to mean the same thing, and the same
names to mean different things
Use different units of measurement, different scales, different categories
Instruments weren’t calibrated to a common standard
DATA QUALITY ISSUES
Harmonization
Harmonization challenges
Sometimes harmonization is easier said than done. From easier to harder:
Standardize variable names
Standardize units of measurement
Align continuous with categorized values
Align readings from different instruments/calibrations
Align measurements when procedures vary
Align survey responses when questions vary
Information missing from dataset
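
The easier end of this scale can be illustrated in a few lines. What follows is a minimal sketch in plain Python, not the PMP's notebook code: the variable names (`wt_lb`, `weight_kg`, `age_group`), the sample records, and the age bands are all hypothetical. It standardizes a unit of measurement and aligns a continuous value with a dataset that only recorded categories.

```python
# Hypothetical records from two studies that disagree on names, units,
# and granularity.
study_a = [{"wt_lb": 154.0, "age": 47}, {"wt_lb": 198.0, "age": 63}]
study_b = [{"weight_kg": 81.5, "age_group": "40-59"}]

LB_TO_KG = 0.45359237  # exact definition of the international pound

# Assumed age bands; a real study would take these from its own codebook.
AGE_BANDS = [(0, 39, "0-39"), (40, 59, "40-59"), (60, 200, "60+")]


def age_to_group(age):
    """Map a continuous age onto the categories used by the coarser dataset."""
    for low, high, label in AGE_BANDS:
        if low <= age <= high:
            return label
    raise ValueError(f"age out of range: {age}")


def harmonize_a(record):
    """Convert a study A record to the shared schema (weight_kg, age_group)."""
    return {
        "weight_kg": round(record["wt_lb"] * LB_TO_KG, 1),
        "age_group": age_to_group(record["age"]),
    }


harmonized = [harmonize_a(r) for r in study_a] + study_b
```

The alignment only runs one way: a continuous age can be mapped to a band, but a band cannot be turned back into an age, so the combined dataset is limited to the coarser representation.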
Harmonize on AWS
Amazon S3
Python or R
Jupyter Notebooks
Apache Spark
Amazon EMR (compute)
Raw datasets in Amazon S3, processed on Amazon EMR, produce a notebook plus harmonized datasets in Amazon S3
Harmonize (architecture diagram)
Raw datasets pass through four steps (1 Ingest, 2 Explore, 3 Harmonize, 4 Store) on Amazon EMR (Spark), with raw and harmonized data in Amazon S3.
Load & explore
Transform
Rename variables
Standardize ‘gender’ categories
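
The slide's notebook screenshot is not preserved in this transcript, so here is a minimal sketch of the two transform steps in plain Python (the PMP notebooks themselves run on Spark). The column names, the rename mapping, and the raw gender codings are hypothetical examples, not values from any AHA dataset.

```python
# Hypothetical mapping from one raw dataset's column names to shared names.
RENAME = {"SEX": "gender", "SBP": "systolic_bp", "PID": "participant_id"}

# Hypothetical raw codings for 'gender' seen across datasets.
GENDER_CODES = {
    "m": "male", "male": "male", "1": "male",
    "f": "female", "female": "female", "2": "female",
}


def transform(row):
    """Rename variables, then standardize the 'gender' categories."""
    out = {RENAME.get(name, name): value for name, value in row.items()}
    raw = str(out.get("gender", "")).strip().lower()
    # Anything not in the mapping is kept visible as 'unknown', not dropped.
    out["gender"] = GENDER_CODES.get(raw, "unknown")
    return out


raw_rows = [
    {"PID": "a-001", "SEX": "M", "SBP": 128},
    {"PID": "a-002", "SEX": 2, "SBP": 141},
    {"PID": "a-003", "SEX": None, "SBP": 117},
]
clean_rows = [transform(r) for r in raw_rows]
```

Keeping the mappings as data rather than as branching logic makes it easier to review them against each dataset's codebook.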
Add variables
Create dictionary
Variable name
Value distribution stats
Descriptive metadata
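
A data dictionary with these three parts could be assembled roughly as follows. This is a sketch using only the Python standard library; the rows, the descriptions, and the entry layout are assumptions made for illustration, and the PMP's own dictionary format may differ.

```python
import statistics
from collections import Counter

# Hypothetical harmonized rows and hand-written descriptions.
rows = [
    {"gender": "male", "systolic_bp": 128},
    {"gender": "female", "systolic_bp": 141},
    {"gender": "female", "systolic_bp": 117},
]
descriptions = {
    "gender": "Self-reported gender, standardized categories",
    "systolic_bp": "Systolic blood pressure in mmHg",
}


def describe(name, values):
    """Build one dictionary entry: name, distribution stats, metadata."""
    present = [v for v in values if v is not None]
    entry = {
        "name": name,
        "description": descriptions.get(name, ""),
        "count": len(present),
        "missing": len(values) - len(present),
    }
    if present and all(isinstance(v, (int, float)) for v in present):
        entry["type"] = "numeric"
        entry["stats"] = {
            "min": min(present),
            "max": max(present),
            "mean": statistics.mean(present),
        }
    else:
        # Categorical variables get a count per distinct value.
        entry["type"] = "categorical"
        entry["stats"] = dict(Counter(present))
    return entry


variables = sorted({name for row in rows for name in row})
dictionary = [describe(v, [row.get(v) for row in rows]) for v in variables]
```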
Save data & dictionary to Amazon S3 & Amazon Elasticsearch Service
Save subset of variables to Amazon ES
Save all variables to Amazon S3 as SparkSQL table
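
The "subset of variables" step might look like the sketch below, which builds a request body in the newline-delimited JSON format used by the Elasticsearch bulk API. The index name, field list, and rows are hypothetical, and the body is only constructed here, not sent. Older Elasticsearch versions also expect a `_type` in each action line. Writing the full table to Amazon S3 is not shown.

```python
import json

# Hypothetical names: index, searchable subset, and harmonized rows.
INDEX = "harmonized-datasets"
SEARCH_FIELDS = ["dataset", "gender", "age_group"]

rows = [
    {"participant_id": "a-001", "dataset": "study_a", "gender": "male",
     "age_group": "40-59", "systolic_bp": 128},
    {"participant_id": "a-002", "dataset": "study_a", "gender": "female",
     "age_group": "60+", "systolic_bp": 141},
]


def bulk_payload(rows, index, fields, id_field="participant_id"):
    """Build an Elasticsearch bulk request body holding only `fields`."""
    lines = []
    for row in rows:
        action = {"index": {"_index": index, "_id": row[id_field]}}
        doc = {name: row[name] for name in fields if name in row}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    # The bulk API expects newline-delimited JSON ending with a newline.
    return "\n".join(lines) + "\n"


payload = bulk_payload(rows, INDEX, SEARCH_FIELDS)
```

Indexing only the searchable subset keeps the search index small, while the complete set of variables stays available in Amazon S3 for analysis.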
Harmonization notebooks: Save notebook
Notebook saves itself, along with the harmonized data it creates
Search & discovery
• Quickly find relevant data
• Preliminary analysis of filtered data
• Link to harmonization notebook
• Amazon ES
• Index created from Spark harmonization
• Data (search)
• Metadata (UI filters)
• Filter accordion panel and dashboard
Embedded dashboard
Links to harmonization notebooks
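
To make the split between searchable data and filterable metadata concrete, here is a sketch of how a search page could combine free text with UI filters using the Elasticsearch query DSL (`bool`, `multi_match`, `term`, `range`). The field names (`name`, `description`, `gender`, `age`) are hypothetical, and the PMP's real queries are not shown in the slides.

```python
import json


def build_search(text, filters=None, age_range=None, size=20):
    """Combine free-text search with exact-match UI filters."""
    query = {"bool": {"must": [], "filter": []}}
    if text:
        query["bool"]["must"].append(
            {"multi_match": {"query": text, "fields": ["name", "description"]}}
        )
    else:
        query["bool"]["must"].append({"match_all": {}})
    # Each selected UI filter becomes an exact-match term clause.
    for field, value in sorted((filters or {}).items()):
        query["bool"]["filter"].append({"term": {field: value}})
    if age_range:
        low, high = age_range
        query["bool"]["filter"].append(
            {"range": {"age": {"gte": low, "lte": high}}}
        )
    return {"size": size, "query": query}


body = build_search("blood pressure", {"gender": "female"}, age_range=(40, 59))
print(json.dumps(body, indent=2))
```

Clauses under `filter` narrow the result set without affecting relevance scoring, which suits checkbox-style filters in a UI.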
Search page—dynamic UI filter bar
Harmonize / Search and discover (architecture diagram)
Harmonize: raw datasets pass through four steps (1 Ingest, 2 Explore, 3 Harmonize, 4 Store) on Amazon EMR (Spark), with raw and harmonized data in Amazon S3.
Search and discover: ALB, Amazon EC2 Container Service (Amazon ECS), NGINX, Kibana, search filters, ES proxy, Amazon ES.
Researcher workspaces
- Python or R
- Jupyter Notebooks
- Spark
- EMR
- Create clear, beautiful, executable, reproducible science
Same platform as harmonization
From: Notebook Gallery
Researcher accounts
• Researcher workspaces can run in different (researcher-dedicated) AWS accounts
• Amazon S3 bucket access policies are used to grant secure dataset access to researcher accounts (or users/roles within an account)
• Researchers can bring their own datasets into Amazon S3 buckets in their own accounts
• Amazon EMR/Spark supports a wide variety of scalable data science and genomics tools
• In the PMP, researcher workspaces are created, secured, and managed on the researcher’s behalf
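
As an illustration of the bucket-policy bullet, the sketch below generates a read-only policy for one researcher account. The bucket name, prefix, and account ID are placeholders, and the choice of actions (`s3:ListBucket`, `s3:GetObject`) is an assumption about what "dataset access" would need. The PMP's actual policies are not shown in the slides.

```python
import json


def dataset_read_policy(bucket, prefix, researcher_account_id):
    """Bucket policy statements granting one AWS account read-only access."""
    principal = {"AWS": f"arn:aws:iam::{researcher_account_id}:root"}
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Allow listing, but only under the approved dataset prefix.
                "Sid": "ListDatasetPrefix",
                "Effect": "Allow",
                "Principal": principal,
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
            {
                # Allow reading objects under the same prefix.
                "Sid": "ReadDatasetObjects",
                "Effect": "Allow",
                "Principal": principal,
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
        ],
    }


policy = dataset_read_policy("example-harmonized-data", "study_a", "111122223333")
print(json.dumps(policy, indent=2))
```

For cross-account access, the researcher account generally also has to allow the same actions in an IAM policy attached to its own users or roles. The bucket policy alone covers only the data owner's side.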
The PMP today
Demo: precision.heart.org
PART 3
Demo: precision.heart.org
What’s next?
PART 4
Serverless harmonization
- AWS Glue
- Serverless
- Schema inference
- PySpark code
- Extensible (custom code/libraries)
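
For readers unfamiliar with the term, schema inference means deriving column types from the data itself. The sketch below shows the general idea on a small CSV sample using only the standard library. It is not how AWS Glue crawlers are implemented, and the sample data and type names are assumptions.

```python
import csv
import io

# Hypothetical CSV sample with no declared schema.
SAMPLE = """participant_id,age,systolic_bp,gender
a-001,47,128.5,male
a-002,63,141,female
"""


def infer_type(values):
    """Return the narrowest of int, double, string that fits all values."""

    def fits(cast):
        try:
            for value in values:
                cast(value)
            return True
        except ValueError:
            return False

    if fits(int):
        return "int"
    if fits(float):
        return "double"
    return "string"


def infer_schema(csv_text):
    """Infer a column-name-to-type mapping from CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    return {
        name: infer_type([row[name] for row in rows if row[name] != ""])
        for name in reader.fieldnames
    }


schema = infer_schema(SAMPLE)
```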
Standardize on Parquet file format
All harmonized datasets saved as Parquet files instead of CSV
All harmonized datasets in the AWS Glue Data Catalog
Genomics files, too!
- FASTA, BAM, VCF to Parquet using ADAM
Enables SQL analysis via Amazon Athena and/or Amazon Redshift Spectrum
• Fast and scalable
• Access to SQL-based BI and data science tools
Serverless analytics
- Amazon Athena
  - Serverless SQL
  - Data stays on Amazon S3
  - Tables created by Spark harmonization
- Amazon QuickSight
  - Serverless, easy, fast, beautiful, shareable
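
A query against a harmonized table might be prepared as in the sketch below, which builds the arguments for Athena's StartQueryExecution API without sending them. The database, table, column names, and results location are hypothetical. The boto3 call appears only in a comment because it needs AWS credentials.

```python
def athena_request(database, table, output_location):
    """Build keyword arguments for Athena's StartQueryExecution API."""
    sql = (
        "SELECT gender, COUNT(*) AS participants, "
        "AVG(systolic_bp) AS mean_systolic_bp "
        f"FROM {table} GROUP BY gender ORDER BY gender"
    )
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


request = athena_request(
    "pmp_harmonized", "study_a", "s3://example-athena-results/"
)

# With boto3 installed and AWS credentials configured, the request could be
# submitted like this (not executed here):
#   import boto3
#   boto3.client("athena").start_query_execution(**request)
```

Because the data stays in Amazon S3, the query reads the harmonized files in place and writes its results to the configured output location.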
DevSecOps, compliance, etc.
Partner-led PMP session in HCLS track (HLC309)
• Architecture
• Compliance
• Security
• Monitoring (New! Amazon Macie)
• Processes
• DevOps
• Much more
Harmonize, search, & analyze loosely coupled datasets on AWS (AWS Big Data Blog)
• Companion sample app: deploy reference architecture and samples with one-click launch button
www.amazon.com/harmonize-search-analyze
Key lessons learned
• Academic community in cardiovascular and stroke sciences needs
more computer engineers, architects, and bioinformaticists
• Scientific community needs more tools within a cloud
marketplace that do not always require coding
• Scientific community needs data-use cases to help inform them
of the power of cloud computing
• Academic communities are not geared toward a financially simple
model of cloud computing
Benefits of working with AWS
• A multitude of services that accelerate scientific discovery by improving access to data, how data is searched, and the workspace tools used to analyze data
• AWS brings experts in cloud architecture, engineering, and software who enable scientists to accelerate their research
• AWS and AHA are partnering to bring educational tools to the marketplace for youth, trainees, and professionals
The future of precision.heart.org
• Marketplace for scientists, clinicians, patients, consumers, and
communities
• Community forum
• Link to journals
• Educational tools
• Data challenges
• Artificial intelligence in clinical medicine, scientific discoveries, and communities
• Direct-to-participant recruitment—MyResearchLegacy.org
Questions
PART 5
Thank you!
DEMO

More Related Content

What's hot

ABD206-Building Visualizations and Dashboards with Amazon QuickSight
ABD206-Building Visualizations and Dashboards with Amazon QuickSightABD206-Building Visualizations and Dashboards with Amazon QuickSight
ABD206-Building Visualizations and Dashboards with Amazon QuickSightAmazon Web Services
 
ABD209_Accelerating the Speed of Innovation with a Data Sciences Data & Analy...
ABD209_Accelerating the Speed of Innovation with a Data Sciences Data & Analy...ABD209_Accelerating the Speed of Innovation with a Data Sciences Data & Analy...
ABD209_Accelerating the Speed of Innovation with a Data Sciences Data & Analy...Amazon Web Services
 
How Twilio Scaled Its Data Driven Culture - ABD309 - re:Invent 2017
How Twilio Scaled Its Data Driven Culture - ABD309 - re:Invent 2017How Twilio Scaled Its Data Driven Culture - ABD309 - re:Invent 2017
How Twilio Scaled Its Data Driven Culture - ABD309 - re:Invent 2017Amazon Web Services
 
ABD315_Serverless ETL with AWS Glue
ABD315_Serverless ETL with AWS GlueABD315_Serverless ETL with AWS Glue
ABD315_Serverless ETL with AWS GlueAmazon Web Services
 
ABD218_How Euroleague Basketball Uses IoT Analytics to Engage Fans- ABD218
ABD218_How Euroleague Basketball Uses IoT Analytics to Engage Fans- ABD218ABD218_How Euroleague Basketball Uses IoT Analytics to Engage Fans- ABD218
ABD218_How Euroleague Basketball Uses IoT Analytics to Engage Fans- ABD218Amazon Web Services
 
ABD324_Migrating Your Oracle Data Warehouse to Amazon Redshift Using AWS DMS ...
ABD324_Migrating Your Oracle Data Warehouse to Amazon Redshift Using AWS DMS ...ABD324_Migrating Your Oracle Data Warehouse to Amazon Redshift Using AWS DMS ...
ABD324_Migrating Your Oracle Data Warehouse to Amazon Redshift Using AWS DMS ...Amazon Web Services
 
Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...
Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...
Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...Amazon Web Services
 
ABD215_Serverless Data Prep with AWS Glue
ABD215_Serverless Data Prep with AWS GlueABD215_Serverless Data Prep with AWS Glue
ABD215_Serverless Data Prep with AWS GlueAmazon Web Services
 
FINRA's Managed Data Lake: Next-Gen Analytics in the Cloud - ENT328 - re:Inve...
FINRA's Managed Data Lake: Next-Gen Analytics in the Cloud - ENT328 - re:Inve...FINRA's Managed Data Lake: Next-Gen Analytics in the Cloud - ENT328 - re:Inve...
FINRA's Managed Data Lake: Next-Gen Analytics in the Cloud - ENT328 - re:Inve...Amazon Web Services
 
ABD317_Building Your First Big Data Application on AWS - ABD317
ABD317_Building Your First Big Data Application on AWS - ABD317ABD317_Building Your First Big Data Application on AWS - ABD317
ABD317_Building Your First Big Data Application on AWS - ABD317Amazon Web Services
 
ABD303_Developing an Insights Platform—the Sysco Journey from Disparate Syste...
ABD303_Developing an Insights Platform—the Sysco Journey from Disparate Syste...ABD303_Developing an Insights Platform—the Sysco Journey from Disparate Syste...
ABD303_Developing an Insights Platform—the Sysco Journey from Disparate Syste...Amazon Web Services
 
HLC301-Simplifying Healthcare Data Management on AWS.pdf
HLC301-Simplifying Healthcare Data Management on AWS.pdfHLC301-Simplifying Healthcare Data Management on AWS.pdf
HLC301-Simplifying Healthcare Data Management on AWS.pdfAmazon Web Services
 
Best Practices for Distributed Machine Learning and Predictive Analytics Usin...
Best Practices for Distributed Machine Learning and Predictive Analytics Usin...Best Practices for Distributed Machine Learning and Predictive Analytics Usin...
Best Practices for Distributed Machine Learning and Predictive Analytics Usin...Amazon Web Services
 
FSV302_An Architecture for Trade Capture and Regulatory Reporting
FSV302_An Architecture for Trade Capture and Regulatory ReportingFSV302_An Architecture for Trade Capture and Regulatory Reporting
FSV302_An Architecture for Trade Capture and Regulatory ReportingAmazon Web Services
 
Best Practices for Building Your Data Lake on AWS
Best Practices for Building Your Data Lake on AWSBest Practices for Building Your Data Lake on AWS
Best Practices for Building Your Data Lake on AWSAmazon Web Services
 
How TrueCar Gains Actionable Insights with Splunk Cloud PPT
How TrueCar Gains Actionable Insights with Splunk Cloud PPTHow TrueCar Gains Actionable Insights with Splunk Cloud PPT
How TrueCar Gains Actionable Insights with Splunk Cloud PPTAmazon Web Services
 
GPSWKS401_Designing a Cloud Enterprise Data Warehouse
GPSWKS401_Designing a Cloud Enterprise Data WarehouseGPSWKS401_Designing a Cloud Enterprise Data Warehouse
GPSWKS401_Designing a Cloud Enterprise Data WarehouseAmazon Web Services
 
BigDL Deep Learning in Apache Spark - AWS re:invent 2017
BigDL Deep Learning in Apache Spark - AWS re:invent 2017BigDL Deep Learning in Apache Spark - AWS re:invent 2017
BigDL Deep Learning in Apache Spark - AWS re:invent 2017Dave Nielsen
 
Visualize your data in Data Lake with AWS Athena and AWS Quicksight Hands-on ...
Visualize your data in Data Lake with AWS Athena and AWS Quicksight Hands-on ...Visualize your data in Data Lake with AWS Athena and AWS Quicksight Hands-on ...
Visualize your data in Data Lake with AWS Athena and AWS Quicksight Hands-on ...Amazon Web Services
 

What's hot (20)

ABD206-Building Visualizations and Dashboards with Amazon QuickSight
ABD206-Building Visualizations and Dashboards with Amazon QuickSightABD206-Building Visualizations and Dashboards with Amazon QuickSight
ABD206-Building Visualizations and Dashboards with Amazon QuickSight
 
ABD209_Accelerating the Speed of Innovation with a Data Sciences Data & Analy...
ABD209_Accelerating the Speed of Innovation with a Data Sciences Data & Analy...ABD209_Accelerating the Speed of Innovation with a Data Sciences Data & Analy...
ABD209_Accelerating the Speed of Innovation with a Data Sciences Data & Analy...
 
How Twilio Scaled Its Data Driven Culture - ABD309 - re:Invent 2017
How Twilio Scaled Its Data Driven Culture - ABD309 - re:Invent 2017How Twilio Scaled Its Data Driven Culture - ABD309 - re:Invent 2017
How Twilio Scaled Its Data Driven Culture - ABD309 - re:Invent 2017
 
ABD315_Serverless ETL with AWS Glue
ABD315_Serverless ETL with AWS GlueABD315_Serverless ETL with AWS Glue
ABD315_Serverless ETL with AWS Glue
 
Building Data Lakes with AWS
Building Data Lakes with AWSBuilding Data Lakes with AWS
Building Data Lakes with AWS
 
ABD218_How Euroleague Basketball Uses IoT Analytics to Engage Fans- ABD218
ABD218_How Euroleague Basketball Uses IoT Analytics to Engage Fans- ABD218ABD218_How Euroleague Basketball Uses IoT Analytics to Engage Fans- ABD218
ABD218_How Euroleague Basketball Uses IoT Analytics to Engage Fans- ABD218
 
ABD324_Migrating Your Oracle Data Warehouse to Amazon Redshift Using AWS DMS ...
ABD324_Migrating Your Oracle Data Warehouse to Amazon Redshift Using AWS DMS ...ABD324_Migrating Your Oracle Data Warehouse to Amazon Redshift Using AWS DMS ...
ABD324_Migrating Your Oracle Data Warehouse to Amazon Redshift Using AWS DMS ...
 
Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...
Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...
Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...
 
ABD215_Serverless Data Prep with AWS Glue
ABD215_Serverless Data Prep with AWS GlueABD215_Serverless Data Prep with AWS Glue
ABD215_Serverless Data Prep with AWS Glue
 
FINRA's Managed Data Lake: Next-Gen Analytics in the Cloud - ENT328 - re:Inve...
FINRA's Managed Data Lake: Next-Gen Analytics in the Cloud - ENT328 - re:Inve...FINRA's Managed Data Lake: Next-Gen Analytics in the Cloud - ENT328 - re:Inve...
FINRA's Managed Data Lake: Next-Gen Analytics in the Cloud - ENT328 - re:Inve...
 
ABD317_Building Your First Big Data Application on AWS - ABD317
ABD317_Building Your First Big Data Application on AWS - ABD317ABD317_Building Your First Big Data Application on AWS - ABD317
ABD317_Building Your First Big Data Application on AWS - ABD317
 
ABD303_Developing an Insights Platform—the Sysco Journey from Disparate Syste...
ABD303_Developing an Insights Platform—the Sysco Journey from Disparate Syste...ABD303_Developing an Insights Platform—the Sysco Journey from Disparate Syste...
ABD303_Developing an Insights Platform—the Sysco Journey from Disparate Syste...
 
HLC301-Simplifying Healthcare Data Management on AWS.pdf
HLC301-Simplifying Healthcare Data Management on AWS.pdfHLC301-Simplifying Healthcare Data Management on AWS.pdf
HLC301-Simplifying Healthcare Data Management on AWS.pdf
 
Best Practices for Distributed Machine Learning and Predictive Analytics Usin...
Best Practices for Distributed Machine Learning and Predictive Analytics Usin...Best Practices for Distributed Machine Learning and Predictive Analytics Usin...
Best Practices for Distributed Machine Learning and Predictive Analytics Usin...
 
FSV302_An Architecture for Trade Capture and Regulatory Reporting
FSV302_An Architecture for Trade Capture and Regulatory ReportingFSV302_An Architecture for Trade Capture and Regulatory Reporting
FSV302_An Architecture for Trade Capture and Regulatory Reporting
 
Best Practices for Building Your Data Lake on AWS
Best Practices for Building Your Data Lake on AWSBest Practices for Building Your Data Lake on AWS
Best Practices for Building Your Data Lake on AWS
 
How TrueCar Gains Actionable Insights with Splunk Cloud PPT
How TrueCar Gains Actionable Insights with Splunk Cloud PPTHow TrueCar Gains Actionable Insights with Splunk Cloud PPT
How TrueCar Gains Actionable Insights with Splunk Cloud PPT
 
GPSWKS401_Designing a Cloud Enterprise Data Warehouse
GPSWKS401_Designing a Cloud Enterprise Data WarehouseGPSWKS401_Designing a Cloud Enterprise Data Warehouse
GPSWKS401_Designing a Cloud Enterprise Data Warehouse
 
BigDL Deep Learning in Apache Spark - AWS re:invent 2017
BigDL Deep Learning in Apache Spark - AWS re:invent 2017BigDL Deep Learning in Apache Spark - AWS re:invent 2017
BigDL Deep Learning in Apache Spark - AWS re:invent 2017
 
Visualize your data in Data Lake with AWS Athena and AWS Quicksight Hands-on ...
Visualize your data in Data Lake with AWS Athena and AWS Quicksight Hands-on ...Visualize your data in Data Lake with AWS Athena and AWS Quicksight Hands-on ...
Visualize your data in Data Lake with AWS Athena and AWS Quicksight Hands-on ...
 

Similar to ABD316_American Heart Association Finding Cures to Heart Disease Through the Power of Technology

HLC309_The American Heart Association and How to Build a Secure and Collabora...
HLC309_The American Heart Association and How to Build a Secure and Collabora...HLC309_The American Heart Association and How to Build a Secure and Collabora...
HLC309_The American Heart Association and How to Build a Secure and Collabora...Amazon Web Services
 
ABD318_Architecting a data lake with Amazon S3, Amazon Kinesis, AWS Glue and ...
ABD318_Architecting a data lake with Amazon S3, Amazon Kinesis, AWS Glue and ...ABD318_Architecting a data lake with Amazon S3, Amazon Kinesis, AWS Glue and ...
ABD318_Architecting a data lake with Amazon S3, Amazon Kinesis, AWS Glue and ...Amazon Web Services
 
Architecting an Open Data Lake for the Enterprise
Architecting an Open Data Lake for the EnterpriseArchitecting an Open Data Lake for the Enterprise
Architecting an Open Data Lake for the EnterpriseAmazon Web Services
 
STG206_Big Data Data Lakes and Data Oceans
STG206_Big Data Data Lakes and Data OceansSTG206_Big Data Data Lakes and Data Oceans
STG206_Big Data Data Lakes and Data OceansAmazon Web Services
 
GPSWKS301_Comprehensive Big Data Architecture Made Easy
GPSWKS301_Comprehensive Big Data Architecture Made EasyGPSWKS301_Comprehensive Big Data Architecture Made Easy
GPSWKS301_Comprehensive Big Data Architecture Made EasyAmazon Web Services
 
Comprehensive Big Data Analytics Architecture Made Easy - The AWS Marketplace...
Comprehensive Big Data Analytics Architecture Made Easy - The AWS Marketplace...Comprehensive Big Data Analytics Architecture Made Easy - The AWS Marketplace...
Comprehensive Big Data Analytics Architecture Made Easy - The AWS Marketplace...Amazon Web Services
 
How to build a data lake with aws glue data catalog (ABD213-R) re:Invent 2017
How to build a data lake with aws glue data catalog (ABD213-R)  re:Invent 2017How to build a data lake with aws glue data catalog (ABD213-R)  re:Invent 2017
How to build a data lake with aws glue data catalog (ABD213-R) re:Invent 2017Amazon Web Services
 
Tensors for topic modeling and deep learning on AWS Sagemaker
Tensors for topic modeling and deep learning on AWS SagemakerTensors for topic modeling and deep learning on AWS Sagemaker
Tensors for topic modeling and deep learning on AWS SagemakerAnima Anandkumar
 
How Amazon.com Uses AWS Analytics: Data Analytics Week SF
How Amazon.com Uses AWS Analytics: Data Analytics Week SFHow Amazon.com Uses AWS Analytics: Data Analytics Week SF
How Amazon.com Uses AWS Analytics: Data Analytics Week SFAmazon Web Services
 
Real-time Analytics using Data from IoT Devices - AWS Online Tech Talks
Real-time Analytics using Data from IoT Devices - AWS Online Tech TalksReal-time Analytics using Data from IoT Devices - AWS Online Tech Talks
Real-time Analytics using Data from IoT Devices - AWS Online Tech TalksAmazon Web Services
 
How Amazon.com Uses AWS Analytics
How Amazon.com Uses AWS AnalyticsHow Amazon.com Uses AWS Analytics
How Amazon.com Uses AWS AnalyticsAmazon Web Services
 
How Amazon.com uses AWS Analytics
How Amazon.com uses AWS AnalyticsHow Amazon.com uses AWS Analytics
How Amazon.com uses AWS AnalyticsAmazon Web Services
 
Leveraging Cloud Analytics to Support Data-Driven Decisions
Leveraging Cloud Analytics to Support Data-Driven DecisionsLeveraging Cloud Analytics to Support Data-Driven Decisions
Leveraging Cloud Analytics to Support Data-Driven DecisionsAmazon Web Services
 
Empowering Every Brain! How Brain Power is using AWS-Powered AI in their Miss...
Empowering Every Brain! How Brain Power is using AWS-Powered AI in their Miss...Empowering Every Brain! How Brain Power is using AWS-Powered AI in their Miss...
Empowering Every Brain! How Brain Power is using AWS-Powered AI in their Miss...Amazon Web Services
 
How Amazon.com uses AWS Analytics
How Amazon.com uses AWS AnalyticsHow Amazon.com uses AWS Analytics
How Amazon.com uses AWS AnalyticsAmazon Web Services
 
Value of Data Beyond Analytics by Darin Briskman
 Value of Data Beyond Analytics by Darin Briskman Value of Data Beyond Analytics by Darin Briskman
Value of Data Beyond Analytics by Darin BriskmanSameer Kenkare
 
Create Advanced Text Analytics Solutions with NLP - BDA310 - Chicago AWS Summit
Create Advanced Text Analytics Solutions with NLP - BDA310 - Chicago AWS SummitCreate Advanced Text Analytics Solutions with NLP - BDA310 - Chicago AWS Summit
Create Advanced Text Analytics Solutions with NLP - BDA310 - Chicago AWS SummitAmazon Web Services
 
Deliver Voice Automated Serverless BI Solutions in Under 3 Hours - ABD325 - r...
Deliver Voice Automated Serverless BI Solutions in Under 3 Hours - ABD325 - r...Deliver Voice Automated Serverless BI Solutions in Under 3 Hours - ABD325 - r...
Deliver Voice Automated Serverless BI Solutions in Under 3 Hours - ABD325 - r...Amazon Web Services
 
Self-Service Analytics with AWS Big Data and Tableau - ARC217 - re:Invent 2017
Self-Service Analytics with AWS Big Data and Tableau - ARC217 - re:Invent 2017Self-Service Analytics with AWS Big Data and Tableau - ARC217 - re:Invent 2017
Self-Service Analytics with AWS Big Data and Tableau - ARC217 - re:Invent 2017Amazon Web Services
 

Similar to ABD316_American Heart Association Finding Cures to Heart Disease Through the Power of Technology (20)

HLC309_The American Heart Association and How to Build a Secure and Collabora...
HLC309_The American Heart Association and How to Build a Secure and Collabora...HLC309_The American Heart Association and How to Build a Secure and Collabora...
HLC309_The American Heart Association and How to Build a Secure and Collabora...
 
ABD318_Architecting a data lake with Amazon S3, Amazon Kinesis, AWS Glue and ...
ABD318_Architecting a data lake with Amazon S3, Amazon Kinesis, AWS Glue and ...ABD318_Architecting a data lake with Amazon S3, Amazon Kinesis, AWS Glue and ...
ABD318_Architecting a data lake with Amazon S3, Amazon Kinesis, AWS Glue and ...
 
Architecting an Open Data Lake for the Enterprise
Architecting an Open Data Lake for the EnterpriseArchitecting an Open Data Lake for the Enterprise
Architecting an Open Data Lake for the Enterprise
 
STG206_Big Data Data Lakes and Data Oceans
STG206_Big Data Data Lakes and Data OceansSTG206_Big Data Data Lakes and Data Oceans
STG206_Big Data Data Lakes and Data Oceans
 
GPSWKS301_Comprehensive Big Data Architecture Made Easy
GPSWKS301_Comprehensive Big Data Architecture Made EasyGPSWKS301_Comprehensive Big Data Architecture Made Easy
GPSWKS301_Comprehensive Big Data Architecture Made Easy
 
Comprehensive Big Data Analytics Architecture Made Easy - The AWS Marketplace...
Comprehensive Big Data Analytics Architecture Made Easy - The AWS Marketplace...Comprehensive Big Data Analytics Architecture Made Easy - The AWS Marketplace...
Comprehensive Big Data Analytics Architecture Made Easy - The AWS Marketplace...
 
How to build a data lake with aws glue data catalog (ABD213-R) re:Invent 2017
How to build a data lake with aws glue data catalog (ABD213-R)  re:Invent 2017How to build a data lake with aws glue data catalog (ABD213-R)  re:Invent 2017
How to build a data lake with aws glue data catalog (ABD213-R) re:Invent 2017
 
Tensors for topic modeling and deep learning on AWS Sagemaker
Tensors for topic modeling and deep learning on AWS SagemakerTensors for topic modeling and deep learning on AWS Sagemaker
Tensors for topic modeling and deep learning on AWS Sagemaker
 
How Amazon.com Uses AWS Analytics: Data Analytics Week SF
How Amazon.com Uses AWS Analytics: Data Analytics Week SFHow Amazon.com Uses AWS Analytics: Data Analytics Week SF
How Amazon.com Uses AWS Analytics: Data Analytics Week SF
 
How Amazon uses AWS Analytics
How Amazon uses AWS AnalyticsHow Amazon uses AWS Analytics
How Amazon uses AWS Analytics
 
Real-time Analytics using Data from IoT Devices - AWS Online Tech Talks
Real-time Analytics using Data from IoT Devices - AWS Online Tech TalksReal-time Analytics using Data from IoT Devices - AWS Online Tech Talks
Real-time Analytics using Data from IoT Devices - AWS Online Tech Talks
 
How Amazon.com Uses AWS Analytics
How Amazon.com Uses AWS AnalyticsHow Amazon.com Uses AWS Analytics
How Amazon.com Uses AWS Analytics
 
How Amazon.com uses AWS Analytics
How Amazon.com uses AWS AnalyticsHow Amazon.com uses AWS Analytics
How Amazon.com uses AWS Analytics
 
Leveraging Cloud Analytics to Support Data-Driven Decisions
Leveraging Cloud Analytics to Support Data-Driven DecisionsLeveraging Cloud Analytics to Support Data-Driven Decisions
Leveraging Cloud Analytics to Support Data-Driven Decisions
 
Empowering Every Brain! How Brain Power is using AWS-Powered AI in their Miss...
Empowering Every Brain! How Brain Power is using AWS-Powered AI in their Miss...Empowering Every Brain! How Brain Power is using AWS-Powered AI in their Miss...
Empowering Every Brain! How Brain Power is using AWS-Powered AI in their Miss...
 
How Amazon.com uses AWS Analytics
How Amazon.com uses AWS AnalyticsHow Amazon.com uses AWS Analytics
How Amazon.com uses AWS Analytics
 
Value of Data Beyond Analytics by Darin Briskman
 Value of Data Beyond Analytics by Darin Briskman Value of Data Beyond Analytics by Darin Briskman
Value of Data Beyond Analytics by Darin Briskman
 
Create Advanced Text Analytics Solutions with NLP - BDA310 - Chicago AWS Summit
Create Advanced Text Analytics Solutions with NLP - BDA310 - Chicago AWS SummitCreate Advanced Text Analytics Solutions with NLP - BDA310 - Chicago AWS Summit
Create Advanced Text Analytics Solutions with NLP - BDA310 - Chicago AWS Summit
 
Deliver Voice Automated Serverless BI Solutions in Under 3 Hours - ABD325 - r...
Deliver Voice Automated Serverless BI Solutions in Under 3 Hours - ABD325 - r...Deliver Voice Automated Serverless BI Solutions in Under 3 Hours - ABD325 - r...
Deliver Voice Automated Serverless BI Solutions in Under 3 Hours - ABD325 - r...
 
Self-Service Analytics with AWS Big Data and Tableau - ARC217 - re:Invent 2017
Self-Service Analytics with AWS Big Data and Tableau - ARC217 - re:Invent 2017Self-Service Analytics with AWS Big Data and Tableau - ARC217 - re:Invent 2017
Self-Service Analytics with AWS Big Data and Tableau - ARC217 - re:Invent 2017
 


ABD316_American Heart Association Finding Cures to Heart Disease Through the Power of Technology

  • 1. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. AWS re:Invent American Heart Association: Finding Cures to Heart Disease Through the Power of Technology Dr. Jennifer Hall, Chief—Institute for Precision Cardiovascular Medicine, @jen_precision Laura Stevens, Research Fellow—American Heart Association Bob Strahan, Sr. Consultant—AWS Professional Services ABD316 November 28, 2017
  • 2. Agenda. PART 1: The AHA Precision Medicine Platform: why it’s important and what it is/will be used for. Inspire with the purpose, mission, and relevance of the platform. Life is why! PART 2: The heart of the architecture: harmonize, search, and analyze. Share how PMP got started, key goals, core concepts, challenges, and solutions. PART 3: The PMP today. Live demo of today’s PMP: dataset search, request access, launch workspace, do science. PART 4: What’s next? Adopt new AWS services and tools; learn more about deploying and managing PMP; learn how to deploy the reference solution and experiment for yourself; final takeaways. PART 5: Questions.
  • 3. PART 1: The AHA Precision Medicine Platform
  • 4. The global impact of cardiovascular diseases. Cardiovascular diseases are the number 1 cause of death in the world. Every 2 seconds, someone around the world dies from cardiovascular disease. Over 75% of cardiovascular disease deaths take place in low- and middle-income countries. The global cost of cardiovascular disease is approximately $900 billion and will exceed $1 trillion by 2030.
  • 5. Technology transforming science: improving health and health-care outcomes. Access; sharing; analytics; improved patient outcomes.
  • 6. The promise of personalized medicine
  • 7. What problem are we trying to solve? New technology and capabilities; old models for data access and analytics.
  • 8. Modernizing health care in the cloud: New ways to develop, integrate, and utilize data
  • 9. New models for research: Removing traditional barriers to scientific discovery. Faster access to data; one marketplace for datasets; more time/money spent on research; improved patient outcomes.
  • 10. PART 2: The heart of the architecture
  • 11. It started with a conceptual architecture. [Diagram labels: Amazon S3, Amazon S3 upload]
  • 12. At the heart are three core concepts. Harmonize: Enable search and analysis across datasets. Search: Find the data you care about. Analyze: Prove your hypotheses, create and share insights, advance science.
  • 13. Scenario: To prove or refute your hypothesis, you need to find datasets, combine them, and analyze their data.
  • 14. Discordant datasets (data quality issues). Created at different times by different people. Use different names to mean the same thing, and the same names to mean different things. Use different units of measurement, different scales, different categories. Instruments weren’t calibrated to a common standard.
  • 15. Harmonization
  • 16. Harmonization challenges: Sometimes harmonization is easier said than done. On a scale from easier to harder: standardize variable names; standardize units of measurement; align continuous with categorized values; align readings from different instruments/calibrations; align measurements when procedures vary; align survey responses when questions vary; information missing from dataset.
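Two of the easier items on the harmonization challenges slide, standardizing units of measurement and aligning continuous with categorized values, can be sketched in plain Python. This is only an illustration, not the PMP's harmonization code: the function names, the accepted unit spellings, and the age bands are all assumptions chosen for the example.

```python
# Hypothetical helpers; names, unit spellings, and bands are illustrative only.

LB_TO_KG = 0.45359237  # exact definition of the international pound in kilograms


def weight_to_kg(value, unit):
    """Convert a weight reading to kilograms, whatever unit the source dataset used."""
    unit = unit.strip().lower()
    if unit in ("kg", "kilogram", "kilograms"):
        return float(value)
    if unit in ("lb", "lbs", "pound", "pounds"):
        return float(value) * LB_TO_KG
    raise ValueError(f"unknown weight unit: {unit!r}")


# Illustrative bands (lower bound inclusive, upper bound exclusive).
# A real study would take its bands from the dataset that is already categorized.
AGE_BANDS = [(0, 40, "<40"), (40, 60, "40-59"), (60, float("inf"), "60+")]


def age_to_band(age):
    """Map a continuous age onto the categories used by a dataset that only stores bands."""
    for low, high, label in AGE_BANDS:
        if low <= age < high:
            return label
    raise ValueError(f"age out of range: {age!r}")
```

Converting the continuous dataset down to the coarser categories loses detail, but it is usually the only direction available, since a band such as "40-59" cannot be turned back into an exact age.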
  • 17. Harmonize on AWS. Raw datasets in Amazon S3; Python or R in Jupyter Notebooks; Apache Spark on Amazon EMR (compute); output: notebook plus harmonized datasets in Amazon S3.
  • 18. [Diagram] Harmonize workflow steps: 1 Ingest, 2 Explore, 3 Harmonize, 4 Store. Components: raw datasets, Amazon EMR (Spark), Amazon S3 (raw and harmonized).
  • 19. Load & explore
  • 20. Transform: rename variables; standardize ‘gender’ categories
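The two transforms named on this slide, renaming variables and standardizing ‘gender’ categories, might look like the following in plain Python. The talk runs these steps in Spark notebooks. This sketch uses ordinary dictionaries so it runs anywhere, and the source column names (`sex`, `sbp`) and the accepted spellings are assumptions, not values from the actual datasets.

```python
# Hypothetical mapping from one source dataset's column names to harmonized names.
RENAMES = {"sex": "gender", "sbp": "systolic_bp"}

# Hypothetical spellings seen in source data, mapped to one standard category each.
GENDER_MAP = {"m": "male", "male": "male", "f": "female", "female": "female"}


def standardize_gender(value):
    """Return 'male', 'female', or 'unknown' for any raw gender value."""
    if value is None:
        return "unknown"
    return GENDER_MAP.get(str(value).strip().lower(), "unknown")


def harmonize_record(record):
    """Rename variables, then standardize the gender category, returning a new dict."""
    renamed = {RENAMES.get(name, name): value for name, value in record.items()}
    if "gender" in renamed:
        renamed["gender"] = standardize_gender(renamed["gender"])
    return renamed
```

Unrecognized values are mapped to "unknown" rather than dropped, so the row count of the harmonized dataset still matches the raw one.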
  • 21. Add variables
  • 22. Create dictionary: variable name; value distribution stats; descriptive metadata
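A data dictionary with the three parts listed on this slide (variable name, value distribution stats, descriptive metadata) can be sketched with the Python standard library. The entry layout and field names below are assumptions made for illustration, and the slides do not show the PMP's actual dictionary schema.

```python
import statistics
from collections import Counter


def build_dictionary(rows, descriptions):
    """Build one dictionary entry per variable found in `rows` (a list of dicts).

    `descriptions` maps variable name -> free-text descriptive metadata.
    """
    names = sorted({name for row in rows for name in row})
    entries = []
    for name in names:
        values = [row[name] for row in rows if row.get(name) is not None]
        entry = {
            "name": name,
            "description": descriptions.get(name, ""),
            "count": len(values),
            "missing": len(rows) - len(values),
        }
        numeric = bool(values) and all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
        )
        if numeric:
            # Distribution stats for continuous variables.
            entry.update(type="numeric", min=min(values), max=max(values),
                         mean=statistics.mean(values))
        else:
            # Most frequent categories for everything else.
            entry.update(type="categorical",
                         top_values=Counter(map(str, values)).most_common(5))
        entries.append(entry)
    return entries
```

Recording the missing count per variable also makes the "information missing from dataset" problem visible to anyone browsing the dictionary.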
  • 23. Save data & dictionary to Amazon S3 & Amazon Elasticsearch Service. Save subset of variables to Amazon ES. Save all variables to Amazon S3 as SparkSQL table.
  • 24. Harmonization notebooks: Save notebook. Notebook saves itself, along with the harmonized data it creates.
  • 25. Search & discovery • Quickly find relevant data • Preliminary analysis of filtered data • Link to harmonization notebook • Amazon ES • Index created from Spark harmonization • Data (search) • Metadata (UI filters) • Filter accordion panel and dashboard. Embedded dashboard; links to harmonization notebooks.
  • 26. Search page: dynamic UI filter bar
  • 27. [Diagram] Search and discover, added to the harmonize workflow (1 Ingest, 2 Explore, 3 Harmonize, 4 Store; raw datasets, Amazon EMR (Spark), Amazon S3 raw and harmonized). Components: Amazon ES, ES Proxy, Kibana, search filters, NGINX, ALB, Amazon EC2 Container Service (Amazon ECS).
  • 28. Researcher workspaces: Python or R; Jupyter Notebooks; Spark; EMR. Create clear, beautiful, executable, reproducible science. Same platform as harmonization. From: Notebook Gallery.
  • 29. Researcher accounts • Researcher workspaces can run in different (researcher-dedicated) AWS accounts • Amazon S3 bucket access policies are used to grant secure dataset access to researcher accounts (or users/roles within an account) • Researchers can bring their own datasets into Amazon S3 buckets in their own accounts • Amazon EMR/Spark supports a wide variety of scalable data science and genomics tools • In the PMP, researcher workspaces are created, secured, and managed on the researcher’s behalf
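The bucket-policy approach described on the researcher accounts slide can be illustrated by generating a read-only policy document for one dataset prefix. This is a minimal sketch, not the PMP's actual policy: the bucket name, prefix, and 12-digit account ID in the usage note are placeholders, and a production policy would likely add further conditions (for example, requiring encrypted transport).

```python
import json


def dataset_read_policy(bucket, prefix, account_id):
    """Return an S3 bucket policy granting one AWS account read-only access to a prefix."""
    principal = {"AWS": f"arn:aws:iam::{account_id}:root"}
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Allow listing, but only for keys under the dataset prefix.
                "Sid": "ListDatasetPrefix",
                "Effect": "Allow",
                "Principal": principal,
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
            {
                # Allow downloading objects under the dataset prefix.
                "Sid": "ReadDatasetObjects",
                "Effect": "Allow",
                "Principal": principal,
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
        ],
    }


def to_json(policy):
    """Serialize the policy for attaching to the bucket."""
    return json.dumps(policy, indent=2)
```

For example, `to_json(dataset_read_policy("example-harmonized-data", "study-a", "111122223333"))` yields a document that could be attached to the dataset bucket. Granting access to the account root principal leaves it to the researcher account to delegate to specific users or roles, which matches the "users/roles within an account" option on the slide.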
  • 30. PART 3: The PMP today. Demo: precision.heart.org
  • 31. Demo: precision.heart.org
  • 32. PART 4: What’s next?
  • 33. Serverless harmonization: AWS Glue. Serverless; schema inference; PySpark code; extensible (custom code/libraries).
  • 34. Standardize on Parquet file format. All harmonized datasets saved as Parquet files instead of CSV. All harmonized datasets in AWS Glue catalog. Genomics files, too: FASTA, BAM, VCF to Parquet using ADAM. Enables SQL analysis via Amazon Athena and/or Amazon Redshift Spectrum • Fast and scalable • Access to SQL-based BI and data science tools
  • 35. Serverless analytics. Amazon Athena: serverless SQL; data stays on Amazon S3; tables created by Spark harmonization. Amazon QuickSight: serverless, easy, fast, beautiful, shareable.
  • 36. DevSecOps, compliance, etc. Partner-led PMP session in HCLS track (HLC309) • Architecture • Compliance • Security • Monitoring (New! Amazon Macie) • Processes • DevOps • Much more
  • 37. Harmonize, search, & analyze loosely coupled datasets on AWS (AWS Big Data Blog) • Companion sample app: deploy reference architecture and samples with one-click launch button. www.amazon.com/harmonize-search-analyze
  • 38. Key lessons learned • Academic community in cardiovascular and stroke sciences needs more computer engineers, architects, and bioinformaticists • Scientific community needs more tools within a cloud marketplace that do not always require coding • Scientific community needs data-use cases to help inform them of the power of cloud computing • Academic communities are not geared toward a financially simple model of cloud computing
  • 39. Benefits of working with AWS • Multitude of services that help accelerate scientific discoveries by improving access to data, how data is searched, and the tools in the workspace to analyze data • AWS brings experts in cloud architecture, engineering, and software who enable scientists to accelerate their research • AWS and AHA are partnering to bring educational tools to the marketplace for youth, trainees, and professionals
  • 40. The future of precision.heart.org • Marketplace for scientists, clinicians, patients, consumers, and communities • Community forum • Link to journals • Educational tools • Data challenges • Artificial intelligence in clinical medicine, scientific discoveries, and communities • Direct-to-participant recruitment: MyResearchLegacy.org
  • 41. PART 5: Questions
  • 42. Thank you!
  • 43. DEMO