Single-cell Data on Polly
Polly by Elucidata
Polly, Elucidata’s data harmonization platform, delivers the highest
quality single-cell data to fit diverse analysis methods & pipelines. All
datasets are Polly Verified, i.e., harmonized through a configurable,
granular & transparent curation process.
Streamlined Journey to Improving Quality of Single-cell Data
Data at Source: Tabular file (MTX, CSV), Txt File
● 50% Missing annotations
● <2% Harmonized
● Different access nuances
● Formats vary across datasets & samples
Polly Harmonization: Processing, Metadata Harmonization, Cell Annotation, Quality Assurance
Data on Polly
● <1% Missing annotations
● 100% Harmonized
● 4X New fields added
● Consistent H5AD format
Single-Cell Data Options on Polly
Raw Counts
What is it? Raw unfiltered counts extracted from the source, cleaned and metadata annotated
Useful for: Re-processing and annotating data with in-house pipelines
Output file(s):
● Unfiltered Raw Counts with 30 metadata fields (H5AD)
Polly Processed Counts
What is it? Harmonized single-cell data, consistently processed & cell type annotated using a validated Polly pipeline
Useful for: Making data comparable & interoperable for large-scale comparative analyses
Output file(s):
● Polly Processed Counts with cell type annotations and 32 other fields (H5AD)
● Raw Counts with 30 fields (H5AD)
Author Processed Counts
What is it? Single-cell data that is processed & cell type annotated using author-provided parameters
Useful for: Replicating a published study of interest
Output file(s):
● Author Processed Counts with cell type annotations & 32 other fields (H5AD)
● Raw Counts with 30 fields (H5AD)
Why Access Single-Cell Data on Polly?
Data You Can Trust
~50 QA checks performed on all data/metadata to ensure quality and provenance.
Complete Transparency
Learn how each dataset was processed and annotated with comprehensive QA reports.
Customizable Harmonization
Request custom metadata fields or cell type annotation with your own markers.
Flexible Ways to Consume Data
Work with Polly’s data on tools and environments of your choice. No download restrictions applied!
How We Deliver: Data Concierge
Data Audits
● Experts identify datasets relevant to your research on/off Polly
● Requirement gathering for curation & processing of found data
Store in your Atlas
● Domain specific repository of Analysis-Ready data
● All datasets are QC-ed, Custom Curated & Polly Verified
Exploration and Analysis
● Explore on Polly via CellxGene
● Download data with Polly’s APIs or GUI, explore on tools of choice
● Customized solutions as a service: GSEA, Knowledge Graphs, ML Classifiers and Dashboards for analysis & visualization
Target Identification & Validation with Curated Single-cell Data: Case Study
About the Customer
The customer is an early-stage therapeutics startup based in Boston that is developing biologics for inflammatory and
autoimmune diseases. The company was looking to identify potential targets for these indications.
Objective
Find and integrate single-cell datasets specific to inflammatory diseases from public sources.
Perform meta-analysis to arrive at fibroblast-specific gene targets for further exploration.
Finding Relevant Datasets
How Was the Data Processed?
● Data at Source: mtx, csv, tsv, h5ad, seurat, h5
● Unfiltered Raw Counts: H5AD files with HUGO symbols, QC metrics, curated metadata fields
● Filtering & Normalization: Consistent filtering criteria, normalization & batch effect correction
● Cell Type Annotation: Marker list from publications to derive cell annotations
● Store on Atlas: H5AD with curated metadata and consistently annotated cells
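The filtering, normalization and marker-based annotation stages can be illustrated with a minimal sketch. It uses only the Python standard library and a toy count table. The gene names, marker lists, thresholds and function names are illustrative assumptions, not the parameters or API of the Polly pipeline, and batch effect correction is not shown.

```python
import math

# Hypothetical toy data: COUNTS[cell] holds raw counts in GENES order.
# All values and thresholds below are assumptions for illustration only.
GENES = ["COL1A1", "DCN", "CD3E", "CD79A"]
COUNTS = {
    "cell_a": [30, 12, 0, 0],
    "cell_b": [0, 1, 25, 0],
    "cell_c": [0, 0, 1, 18],
    "cell_d": [1, 0, 0, 0],  # very few counts, expected to be filtered out
}
# Assumed marker lists, standing in for lists taken from publications.
MARKERS = {
    "Fibroblasts": ["COL1A1", "DCN"],
    "T cells": ["CD3E"],
    "B cells": ["CD79A"],
}


def filter_cells(counts, min_counts=10):
    """Keep cells whose total count meets one fixed threshold."""
    return {cell: v for cell, v in counts.items() if sum(v) >= min_counts}


def normalize(counts, target_sum=10_000):
    """Scale each cell to target_sum total counts, then apply log1p."""
    normalized = {}
    for cell, values in counts.items():
        total = sum(values)
        normalized[cell] = [math.log1p(v * target_sum / total) for v in values]
    return normalized


def annotate(normalized, genes, markers):
    """Label each cell with the type whose markers have the highest mean expression."""
    index = {gene: i for i, gene in enumerate(genes)}
    labels = {}
    for cell, values in normalized.items():
        scores = {
            cell_type: sum(values[index[g]] for g in gene_list) / len(gene_list)
            for cell_type, gene_list in markers.items()
        }
        labels[cell] = max(scores, key=scores.get)
    return labels


kept = filter_cells(COUNTS)
labels = annotate(normalize(kept), GENES, MARKERS)
print(labels)
```

Applying the same threshold and the same scaling to every dataset is what makes the outputs comparable across studies. A production pipeline would typically annotate clusters rather than individual cells and would add batch correction before integration.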
Meta-Analysis for Target Identification and Validation
● 13 datasets identified and 3 datasets merged
● Differential expression analysis of merged data to get the top 250 differentially expressed genes (DEGs)
● Refine results to top 20 genes with a random forest (RF) model and point-biserial scores
● Examine expression and narrow down to 10 genes
● Review literature and perform pathway analysis to arrive at 5 targets
[Figure: UMAP (UMAP 1 vs UMAP 2) of integrated cell types: B cells, T cells, Myeloid cells, Plasma, Stem or Enterocyte cells, Mast cells, Vascular cells, Fibroblasts]
[Figure: Expression of Gene 1 in Diseased vs Normal samples, Fibroblast vs Other cells]
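The point-biserial score used in the refinement step can be sketched as follows. A point-biserial correlation measures how strongly a continuous value (here, a gene's expression) tracks a binary label (here, diseased vs normal). The gene names, expression values and the top-2 cutoff are hypothetical, and the deck does not say how these scores were combined with the random forest model, so that part is not shown.

```python
import math
from statistics import mean, pstdev


def point_biserial(values, labels):
    """Point-biserial correlation between continuous values and 0/1 labels."""
    group1 = [v for v, label in zip(values, labels) if label == 1]
    group0 = [v for v, label in zip(values, labels) if label == 0]
    n1, n0, n = len(group1), len(group0), len(values)
    spread = pstdev(values)  # population standard deviation of all values
    if n1 == 0 or n0 == 0 or spread == 0:
        return 0.0  # undefined in these cases; treated as 0 in this sketch
    return (mean(group1) - mean(group0)) / spread * math.sqrt(n1 * n0 / n**2)


# Hypothetical expression of three placeholder genes across six fibroblasts.
# LABELS: 1 = diseased, 0 = normal.
LABELS = [1, 1, 1, 0, 0, 0]
EXPRESSION = {
    "GENE_A": [5.0, 6.0, 7.0, 1.0, 2.0, 3.0],
    "GENE_B": [2.0, 3.0, 2.0, 3.0, 2.0, 3.0],
    "GENE_C": [1.0, 1.0, 2.0, 6.0, 7.0, 6.0],
}

scores = {gene: point_biserial(v, LABELS) for gene, v in EXPRESSION.items()}
# Rank by absolute correlation so both up- and down-regulated genes surface.
ranked = sorted(scores, key=lambda gene: abs(scores[gene]), reverse=True)
top = ranked[:2]
print(top)
```

Ranking by absolute value is a design choice in this sketch. If only genes elevated in disease are of interest, ranking by the signed score would be the natural alternative.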
Impact
Single Cell Data Curation
156 scRNA-Seq datasets specific to inflammatory diseases
were identified and annotated with relevant metadata information
Target Identification & Validation
Shortlisted 4 novel targets and validated 5 pre-identified targets
using meta-analysis
Time Savings
4X acceleration in the target identification process (from 8-10 months
to 2.5 months)
Reach out to us at info@elucidata.io or Book a Demo
with us to learn more.
