The views expressed in this presentation are those of the author and do not necessarily reflect the views or policies of the U.S. EPA
Chemistry data delivery from the US-EPA
to support environmental chemistry
Antony Williams
(…but I represent many contributors!)
Government Cheminformatics Conference, October 2023
This talk is to open discussions…
• There are many tools developed by our cheminformatics team
and across other centers in EPA. I will represent ours only…
• We have production level public-facing tools, proof-of-concept
public-facing tools, and many tools in development…
• Moving from proof-of-concept to public-facing can take a while
• This talk (and others from our team) is to make you aware of
our efforts and encourage discussions
1
Free-Access Cheminformatics Tools
• The Center for Computational Toxicology and Exposure has
delivered many tools
– CompTox Chemicals Dashboard
– Proof-of-Concept cheminformatics modules
• Chemicals Hazard Profiling
• AnalyticalQC data
• Chemical Transformations Database
• Structure standardizer
• Chemical Safety Profiling
• All chemicals are stored/curated in DSSTox
2
DSSTox Database
3
Accessing DSSTox chemistry: CompTox Chemicals Dashboard
• A publicly accessible website delivering:
– 1.2M chemicals with related property data
– Related substances: transformation products, monomers/polymers
– Experimental/predicted physicochemical property data
– Experimental human and ecological hazard data
– Integration with “biological assay data” (ToxCast/Tox21)
– Information regarding chemicals in consumer products
– Links to other agency websites and public data resources
– “Batch searching” for tens to thousands of chemicals
4
CompTox Chemicals Dashboard
https://comptox.epa.gov/dashboard
5
1 of ~1.2M Chemical Pages
Experimental/Predicted Properties
6
Lots of “proof-of-concept” tools in development
• PoCs are research software builds to prove approaches
before moving into production software environments
• PoCs are to figure out how to address specific questions
• Assemble data, develop data model(s), test user interface
approaches, work with test user base to garner feedback
• Since PoCs are internal access, data refreshes and application
updates can be more frequent
• Underlying APIs are being used in our research
7
PoCs have been rebuilt for production
• Examples of PoCs integrated into production apps
– WebTEST predictions on the Dashboard
– Structure/substructure/similarity search
8
Cheminformatics PoC Modules
https://www.epa.gov/chemical-research/cheminformatics
7 integrated modules built using common
cheminformatics & structure-handling algorithms
9
Previously called the “Hazard Comparison Dashboard”
Underlying “Resolver” searching for all identifiers
10
SMILES
NAMES
CAS RN
InChIKey
DTXSIDs
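
A minimal sketch of how a resolver might decide which kind of identifier it has been given. The patterns and the function names (`classify_identifier`, `is_valid_casrn`) are assumptions for illustration only and are not the EPA Resolver's actual logic. DTXSIDs, InChIKeys and CAS RNs have recognizable shapes (and a CAS RN carries a check digit), while names and SMILES are left in one bucket because a pattern alone cannot separate them reliably.

```python
import re

# Illustrative patterns for the identifier types listed on the slide.
# These are assumptions, not the rules used by the EPA Resolver.
DTXSID_RE = re.compile(r"^DTXSID\d+$")
INCHIKEY_RE = re.compile(r"^[A-Z]{14}-[A-Z]{10}-[A-Z]$")
CASRN_RE = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_valid_casrn(text: str) -> bool:
    """Check the CAS RN layout and its check digit."""
    match = CASRN_RE.match(text)
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    # Weight the digits 1, 2, 3, ... starting from the rightmost one.
    total = sum(int(d) * w for w, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(match.group(3))


def classify_identifier(text: str) -> str:
    """Return a coarse identifier type for a search string."""
    query = text.strip()
    if DTXSID_RE.match(query):
        return "dtxsid"
    if INCHIKEY_RE.match(query):
        return "inchikey"
    if is_valid_casrn(query):
        return "casrn"
    # Names and SMILES cannot be told apart by pattern alone; a real
    # resolver would try a structure parser and then a name lookup.
    return "name_or_smiles"
```

For example, `classify_identifier("80-05-7")` returns `"casrn"`, while a string with the same layout but a wrong check digit falls through to `"name_or_smiles"`.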
Hazard Profile
11
Hazard Profile
On-Hover view of trumping scheme call
12
Hazard Profile
On-click view of underlying data
13
Easy Export of all data to Excel or SDF
14
Linked to Chemical Transformation Simulator
15
Linked to Chemical Transformation Simulator
16
Simple Analog “read-across”
• Suppose a chemical has limited data – perform an analog
search to find related chemicals with data
17
Simple Analog “read-across”
Similarity
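
A minimal sketch of an analog search by similarity. The slides do not say which metric or fingerprint the module uses; Tanimoto similarity over sets of "on" fingerprint bits is assumed here because it is a common choice, and `find_analogs`, the threshold, and the toy fingerprints are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

# A fingerprint is modeled as the set of bit positions that are "on".
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Shared bits divided by the bits present in either fingerprint."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def find_analogs(
    query: Fingerprint,
    library: Dict[str, Fingerprint],
    threshold: float = 0.7,
) -> List[Tuple[str, float]]:
    """Return library entries at or above the threshold, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

Lowering the threshold widens the analog set at the cost of less closely related chemicals, which is the usual trade-off when a chemical has limited data of its own.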
18
WebTEST Batch Prediction
• Batch calculation of all WebTEST predictions
• Display of experimental and predicted data and reports
19
QSAR-Ready/MS-Ready Standardizer
• “QSAR and MS-Ready” standardization underpins models and linking
• MS-Ready is ESSENTIAL to our support of Non-Targeted Analysis
• QSAR-Ready rules need tweaking
20
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-018-0299-2
QSAR/MS-Ready standardizer
• We now CONTROL the rules…add new rules, edit existing rules
21
Example: Tautomer Rules
• We control rules for
– Tautomers
– Mesomers
– Neutralize/De-radicalize
– Break salts
– Standard checks
– etc….
• Necessary for mapping
chemicals in DSSTox
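
A minimal sketch of a rule-driven standardizer in which rules can be added or edited independently, using "break salts" as the only rule. This is not the EPA standardizer: the rule list, the function names, and the "keep the fragment with the most atoms" heuristic are assumptions, and production code would use a cheminformatics toolkit rather than string handling.

```python
import re
from typing import Callable, List

# One SMILES atom token: a bracket atom, a two-letter halogen, or a
# single-letter organic-subset atom (upper case or aromatic lower case).
ATOM_RE = re.compile(r"\[[^\]]+\]|Br|Cl|[BCNOPSFI]|[bcnops]")

# A rule takes a SMILES string and returns a (possibly changed) SMILES.
Rule = Callable[[str], str]


def break_salts(smiles: str) -> str:
    """Keep the fragment with the most atoms; on a tie keep the first."""
    fragments = smiles.split(".")
    return max(fragments, key=lambda frag: len(ATOM_RE.findall(frag)))


def standardize(smiles: str, rules: List[Rule]) -> str:
    """Apply each rule in order, feeding the output into the next rule."""
    for rule in rules:
        smiles = rule(smiles)
    return smiles


# Hypothetical rule set: new rules are appended here, existing ones edited.
RULES: List[Rule] = [break_salts]
```

With this layout, `standardize("CC(=O)[O-].[Na+]", RULES)` drops the sodium counterion and returns `"CC(=O)[O-]"`; a separate neutralization rule would be needed to reach the neutral acid.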
22
Structure Alerts Module
• Structure “Alerts” module based on:
– SMARTS (PAINS)
– ToxPrints (Ashby and TTC)
– SMILES (IARC 1, 2, 3a and 3b)
23
ID Chemical aim ashby iarc1 …
“PREDICT 2.0” model delivery
24
Excel report for models for each data set
• Cover sheet with model metadata
• Training and test set statistics
25
• Prediction results for each method
We need to add in “PFAS-ToxPrints”
26
Analytical QC data for Tox21 (more tomorrow)
• >9000 chemicals with >40,000 spectra (LCMS, GCMS & NMR)
27
AMOS database
AMOS: Analytical Methods and Open Spectra
29
Chemical Transformation Simulator
30
Chemical Transformation Simulator Database
31
ChET
ChET Reaction Map Lists
33
ChET Visual Reaction Maps
• Compare and overlap maps
• Load all maps containing a
particular chemical
• Prune and filter maps
34
Chemical Space Mapping (CheMSTER)
Chemical Mapping of Space Translated into Enhanced
Representations
35
• Initially built to support
NTA research
• Functionality to overlap
and compare datasets
• Selection of chemicals
based on variables
(predicted properties)
• Plug in a growing set of models
to add variables for
comparison
Perfect Example of FAIR Data and APIs
• We owe a lot to FAIR data and availability of information
• We curate a lot of our chemistry data using public resources
such as PubChem, ChEBI, Common Chemistry and others
• The availability of Public APIs takes things to another level!
• We have been using the PubChem API to harvest data so
we can build new applications, like the Safety Module
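
A minimal sketch of harvesting data through the PubChem PUG REST API, looking a compound up by InChIKey and requesting a few computed properties. The choice of properties and the helper names are assumptions for illustration; the slides do not say which PubChem records feed the Safety Module.

```python
import json
from typing import Dict, List
from urllib.parse import quote
from urllib.request import urlopen

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def property_url(inchikey: str, properties: List[str]) -> str:
    """Build a PUG REST URL for compound properties, keyed by InChIKey."""
    wanted = ",".join(properties)
    return f"{PUG_REST}/compound/inchikey/{quote(inchikey)}/property/{wanted}/JSON"


def fetch_properties(inchikey: str, properties: List[str]) -> List[Dict]:
    """Call PubChem and return the list of property records."""
    with urlopen(property_url(inchikey, properties), timeout=30) as response:
        payload = json.load(response)
    return payload["PropertyTable"]["Properties"]
```

Keeping URL construction separate from the network call makes the request easy to inspect and to throttle, which matters when harvesting many records from a public service.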
36
Cheminformatics Safety Module
Integrate multiple data streams…
37
The CompTox API is now public
https://api-ccte.epa.gov/docs/index.html
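
A minimal sketch of calling the public API for one chemical by DTXSID. The endpoint path (`/chemical/detail/search/by-dtxsid/`) and the `x-api-key` header are assumptions that should be checked against the documentation page above, and the API key is a placeholder.

```python
import json
from urllib.request import Request, urlopen

BASE_URL = "https://api-ccte.epa.gov"


def build_detail_request(dtxsid: str, api_key: str) -> Request:
    """Prepare a GET request for chemical details by DTXSID."""
    # ASSUMPTION: the path and header name below are taken from memory of
    # the public docs, not from the slides; verify them before relying on this.
    url = f"{BASE_URL}/chemical/detail/search/by-dtxsid/{dtxsid}"
    headers = {"x-api-key": api_key, "Accept": "application/json"}
    return Request(url, headers=headers)


def fetch_detail(dtxsid: str, api_key: str) -> dict:
    """Send the request and decode the JSON response."""
    with urlopen(build_detail_request(dtxsid, api_key), timeout=30) as response:
        return json.load(response)
```

Usage would look like `fetch_detail("DTXSID7020182", "YOUR-KEY")`, where the DTXSID is only an example identifier.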
38
Conclusions
• The underpinning chemistry data come from the DSSTox database
• The CompTox Chemicals Dashboard provides public access to DSSTox
and other related databases
• Proof-of-Concept (PoC) tools are built to prove approaches
• The effort is both cost- and time-efficient
• Everything is increasingly API driven and APIs are now public
39
Acknowledgments
• Our DSSTox curation team
• AMOS – Greg Janesch and Tyler Carr
• AnalyticalQC Viewer – Christian Ramsland
• Cheminformatics Modules – Nate Charest, Charlie Lowe,
Todd Martin
• ChET – Adam Edelman-Munoz, Caroline Stevens and team
• CheMSTER – Nate Charest and Adam Edelman-Munoz
• Our SCDCD colleagues and DevOps team
40
