SlideShare a Scribd company logo
1 of 38
Using Cheminformatics Approaches to Develop
a Structure Searchable Database
of Analytical Methods
Antony Williams1, Greg Janesch2, Tyler Carr2,
Sakuntala Sivasupramaniam3, and Nancy Baker4
1. Center for Computational Toxicology and Exposure, US-EPA, RTP, NC
2. ORAU Student Services Contractor
3. Senior Environmental Employment (SEE) Program
4. ParlezChem Inc.
http://www.orcid.org/0000-0002-2668-4821
March 2024: Spring Fall Meeting, New Orleans, LA
The views expressed in this presentation are those of the author and do not necessarily reflect the views or policies of the U.S. EPA
Building a Methods Database
• Simple Vision: I want to find the best method(s) associated with a
chemical and/or class of chemicals
• Answer the question “I cannot find a method for my chemical” - HELP
• The Approach:
– Aggregate MS method documents (and adjust the definition of “what is a useful method”)
– Extract chemistry (mostly CASRN and Names)
– Map CASRN and Names to structures
– Deliver a proof-of-concept application to search a database by names, CASRNs, InChIKeys
and ultimately structure
1
Most people start with Google
• People have many places to search for methods, and
there is no one integration hub, except for search engines
• Search engines can return so many hits – then you filter
by analytes, matrix, analytical methodology, so many
synonyms and abbreviations for so many chemicals
2
Synonyms, Abbreviations and Chemicals
3
• CAS Numbers, Names
and Abbreviations can
limit what’s possible…
Might this be a better view?
4
Might this be a better view?
5
When methods are mapped to chemistry…
• The advantages of mapping chemicals directly to methods
– When chemicals are mapped it opens access to many other tools
– Chemical structures allow for QSAR modeling
6
…and what if we could then profile toxicity?
7
…and what if we could then profile toxicity?
8
…or simply harvest data from the
CompTox Chemicals Dashboard
9
What data would you like???
10
Introducing AMOS
Analytical Methods and Spectra Database
• Three types of data in the database:
– Methods (regulatory, lab manuals and SOPs, publications, tech notes)
– Spectra (from public domain and our own laboratories)
– Monographs (harvested from SWGDRUG and other sites)
• Some methods have associated spectra
• Some data are just externally linked
• Currently contains around 263,000 spectra, 600,000 external
links, 4800 “Fact Sheets” and ~5000 methods
• Spectra – LC-MS, GC-MS, NMR
• ALL data are growing in number
11
Where are there methods?
• Agency-based methods
– EPA, USGS, USDA, CDC, FDA, OSHA, DEA, ATSDR, NIOSH …
• ASTM and ISO methods
• Vendor application notes – Bruker, Waters, Agilent, Sciex,
Shimadzu, LECO, Thermo
• Peer-reviewed articles
• Laboratory Documents – lab manuals, SOPs
12
A view of the methods list
13
Filtering the list for interests…
• Look for pesticides studied in water, by GC/MS, after 1990
14
Where are there methods?
• 900 method documents from the EPA harvested
15
Many Scanned Documents!!!
16
• Methods generally developed by
the agrochemical companies
• Include parents plus degradation
products
• Lots of scanned, old, documents
but the historical records are still
of significant use
• Electronic document forms of old
documents still of benefit
Embedding the old Method PDFs
17
Embedding New Method PDFs
18
When Methods are OPEN Access
19
When Methods are PubMed OPEN Access
20
Proprietary Methods for INTERNAL Access
21
If there is no method for your chemical
• Use “Chemical Similarity Searching” so that you can find
chemicals that are similar in structure space
• Use the “Tanimoto Similarity Search
22
Searching for a chemical – CASRN, Name
Direct structure searching coming
23
When Methods are Not Enough
• EPA is highly active in the field of non-targeted analysis
• We have been applying lots of cheminformatics approaches
24
Building a spectrum library to search against
25
Linking to actual spectra
26
Linking to actual spectra
27
There are errors EVERYWHERE: 110-75-8
It can be challenging
9/365 chemicals…
There are MANY errors in public
spectral databases
30
There are MANY errors in public
spectral databases
31
There are MANY errors in public
spectral databases
32
There are MANY errors in public
spectral databases
33
• Chemical structure representations would ideally
be standardized…consider tautomeric forms
• Not all substances are explicit and can be
ambiguous representations
Fact Sheets – 4800 of them and growing
34
>40,000 AnalyticalQC spectra
35
Conclusions
• We have built the first chemical structure indexed open
database of “methods”.
• Methods are not just “approved methods” but also
standard operating procedures, application notes, lab
manuals, regulatory methods etc.
• Integrating methods to experimental spectral data will
serve our non-targeted analysis efforts
• Work is underway to make it public
36
If you want to help…
• Send information regarding analytical methods and method
articles to williams.antony@epa.gov
• If you want to add methods or spectral data please contact me
37

More Related Content

Similar to Applying Cheminformatics to Develop a Structure Searchable Database of Analytical Methods

Consensus ranking and fragmentation prediction for identification of unknowns...
Consensus ranking and fragmentation prediction for identification of unknowns...Consensus ranking and fragmentation prediction for identification of unknowns...
Consensus ranking and fragmentation prediction for identification of unknowns...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Structure standardization approaches for mass spectrometry data integration
Structure standardization approaches for  mass spectrometry data integrationStructure standardization approaches for  mass spectrometry data integration
Structure standardization approaches for mass spectrometry data integration
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Structure identification by Mass Spectrometry Non-Targeted Analysis using the...
Structure identification by Mass Spectrometry Non-Targeted Analysis using the...Structure identification by Mass Spectrometry Non-Targeted Analysis using the...
Structure identification by Mass Spectrometry Non-Targeted Analysis using the...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Structure identification approaches using the EPA CompTox Chemicals Dashboard...
Structure identification approaches using the EPA CompTox Chemicals Dashboard...Structure identification approaches using the EPA CompTox Chemicals Dashboard...
Structure identification approaches using the EPA CompTox Chemicals Dashboard...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
The US-EPA CompTox Chemicals Dashboard – a key player in the domain of Open S...
The US-EPA CompTox Chemicals Dashboard – a key player in the domain of Open S...The US-EPA CompTox Chemicals Dashboard – a key player in the domain of Open S...
The US-EPA CompTox Chemicals Dashboard – a key player in the domain of Open S...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Structure Identification Using High Resolution Mass Spectrometry Data and the...
Structure Identification Using High Resolution Mass Spectrometry Data and the...Structure Identification Using High Resolution Mass Spectrometry Data and the...
Structure Identification Using High Resolution Mass Spectrometry Data and the...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Non-targeted analysis supported by data and cheminformatics delivered via the...
Non-targeted analysis supported by data and cheminformatics delivered via the...Non-targeted analysis supported by data and cheminformatics delivered via the...
Non-targeted analysis supported by data and cheminformatics delivered via the...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Cheminformatics approaches to support chemical identification delivered via t...
Cheminformatics approaches to support chemical identification delivered via t...Cheminformatics approaches to support chemical identification delivered via t...
Cheminformatics approaches to support chemical identification delivered via t...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Non-targeted analysis supported by data and cheminformatics delivered via the...
Non-targeted analysis supported by data and cheminformatics delivered via the...Non-targeted analysis supported by data and cheminformatics delivered via the...
Non-targeted analysis supported by data and cheminformatics delivered via the...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
US-EPA Chemicals Dashboard – an integrated data hub for environmental science
US-EPA Chemicals Dashboard – an integrated data hub for environmental scienceUS-EPA Chemicals Dashboard – an integrated data hub for environmental science
US-EPA Chemicals Dashboard – an integrated data hub for environmental science
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Applications of the US EPA’s CompTox chemicals dashboard to support structure...
Applications of the US EPA’s CompTox chemicals dashboard to support structure...Applications of the US EPA’s CompTox chemicals dashboard to support structure...
Applications of the US EPA’s CompTox chemicals dashboard to support structure...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Chemical identification of unknowns in high resolution mass spectrometry usin...
Chemical identification of unknowns in high resolution mass spectrometry usin...Chemical identification of unknowns in high resolution mass spectrometry usin...
Chemical identification of unknowns in high resolution mass spectrometry usin...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Development of a Tool for Systematic Integration of Traditional and New Appro...
Development of a Tool for Systematic Integration of Traditional and New Appro...Development of a Tool for Systematic Integration of Traditional and New Appro...
Development of a Tool for Systematic Integration of Traditional and New Appro...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 

Similar to Applying Cheminformatics to Develop a Structure Searchable Database of Analytical Methods (20)

Accessing Environmental Chemistry Data via Data Dashboards
Accessing Environmental Chemistry Data via Data Dashboards Accessing Environmental Chemistry Data via Data Dashboards
Accessing Environmental Chemistry Data via Data Dashboards
 
Consensus ranking and fragmentation prediction for identification of unknowns...
Consensus ranking and fragmentation prediction for identification of unknowns...Consensus ranking and fragmentation prediction for identification of unknowns...
Consensus ranking and fragmentation prediction for identification of unknowns...
 
Structure standardization approaches for mass spectrometry data integration
Structure standardization approaches for  mass spectrometry data integrationStructure standardization approaches for  mass spectrometry data integration
Structure standardization approaches for mass spectrometry data integration
 
Structure identification by Mass Spectrometry Non-Targeted Analysis using the...
Structure identification by Mass Spectrometry Non-Targeted Analysis using the...Structure identification by Mass Spectrometry Non-Targeted Analysis using the...
Structure identification by Mass Spectrometry Non-Targeted Analysis using the...
 
Structure identification approaches using the EPA CompTox Chemicals Dashboard...
Structure identification approaches using the EPA CompTox Chemicals Dashboard...Structure identification approaches using the EPA CompTox Chemicals Dashboard...
Structure identification approaches using the EPA CompTox Chemicals Dashboard...
 
The US-EPA CompTox Chemicals Dashboard – a key player in the domain of Open S...
The US-EPA CompTox Chemicals Dashboard – a key player in the domain of Open S...The US-EPA CompTox Chemicals Dashboard – a key player in the domain of Open S...
The US-EPA CompTox Chemicals Dashboard – a key player in the domain of Open S...
 
Structure Identification Using High Resolution Mass Spectrometry Data and the...
Structure Identification Using High Resolution Mass Spectrometry Data and the...Structure Identification Using High Resolution Mass Spectrometry Data and the...
Structure Identification Using High Resolution Mass Spectrometry Data and the...
 
Non-targeted analysis supported by data and cheminformatics delivered via the...
Non-targeted analysis supported by data and cheminformatics delivered via the...Non-targeted analysis supported by data and cheminformatics delivered via the...
Non-targeted analysis supported by data and cheminformatics delivered via the...
 
TRIANGLE AREA MASS SPECTOMETRY MEETING: Structure Identification Approaches U...
TRIANGLE AREA MASS SPECTOMETRY MEETING: Structure Identification Approaches U...TRIANGLE AREA MASS SPECTOMETRY MEETING: Structure Identification Approaches U...
TRIANGLE AREA MASS SPECTOMETRY MEETING: Structure Identification Approaches U...
 
Cheminformatics approaches to support chemical identification delivered via t...
Cheminformatics approaches to support chemical identification delivered via t...Cheminformatics approaches to support chemical identification delivered via t...
Cheminformatics approaches to support chemical identification delivered via t...
 
The US-EPA CompTox Chemicals Dashboard to support Non-Targeted Analysis
The US-EPA CompTox Chemicals Dashboard to support Non-Targeted AnalysisThe US-EPA CompTox Chemicals Dashboard to support Non-Targeted Analysis
The US-EPA CompTox Chemicals Dashboard to support Non-Targeted Analysis
 
Non-targeted analysis supported by data and cheminformatics delivered via the...
Non-targeted analysis supported by data and cheminformatics delivered via the...Non-targeted analysis supported by data and cheminformatics delivered via the...
Non-targeted analysis supported by data and cheminformatics delivered via the...
 
US-EPA Chemicals Dashboard – an integrated data hub for environmental science
US-EPA Chemicals Dashboard – an integrated data hub for environmental scienceUS-EPA Chemicals Dashboard – an integrated data hub for environmental science
US-EPA Chemicals Dashboard – an integrated data hub for environmental science
 
Cheminformatics Support for MS Supporting Exposomics
Cheminformatics Support for MS Supporting ExposomicsCheminformatics Support for MS Supporting Exposomics
Cheminformatics Support for MS Supporting Exposomics
 
Applications of the US EPA’s CompTox chemicals dashboard to support structure...
Applications of the US EPA’s CompTox chemicals dashboard to support structure...Applications of the US EPA’s CompTox chemicals dashboard to support structure...
Applications of the US EPA’s CompTox chemicals dashboard to support structure...
 
US-EPA Cheminformatics Support for Delivering Data Related to Chemicals of E...
US-EPA Cheminformatics Support for Delivering Data Related to Chemicals of E...US-EPA Cheminformatics Support for Delivering Data Related to Chemicals of E...
US-EPA Cheminformatics Support for Delivering Data Related to Chemicals of E...
 
Integrating Mass Spectrometry Non-Targeted Analysis and Computational Toxico...
Integrating Mass Spectrometry  Non-Targeted Analysis and Computational Toxico...Integrating Mass Spectrometry  Non-Targeted Analysis and Computational Toxico...
Integrating Mass Spectrometry Non-Targeted Analysis and Computational Toxico...
 
Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
 
Chemical identification of unknowns in high resolution mass spectrometry usin...
Chemical identification of unknowns in high resolution mass spectrometry usin...Chemical identification of unknowns in high resolution mass spectrometry usin...
Chemical identification of unknowns in high resolution mass spectrometry usin...
 
Development of a Tool for Systematic Integration of Traditional and New Appro...
Development of a Tool for Systematic Integration of Traditional and New Appro...Development of a Tool for Systematic Integration of Traditional and New Appro...
Development of a Tool for Systematic Integration of Traditional and New Appro...
 

Recently uploaded

GENETICALLY MODIFIED ORGANISM'S PRESENTATION.ppt
GENETICALLY MODIFIED ORGANISM'S PRESENTATION.pptGENETICALLY MODIFIED ORGANISM'S PRESENTATION.ppt
GENETICALLY MODIFIED ORGANISM'S PRESENTATION.ppt
SyedArifMalki
 
The Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptxThe Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptx
seri bangash
 
POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.
Cherry
 
THE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptx
THE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptxTHE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptx
THE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptx
ANSARKHAN96
 
COMPOSTING : types of compost, merits and demerits
COMPOSTING : types of compost, merits and demeritsCOMPOSTING : types of compost, merits and demerits
COMPOSTING : types of compost, merits and demerits
Cherry
 
Reboulia: features, anatomy, morphology etc.
Reboulia: features, anatomy, morphology etc.Reboulia: features, anatomy, morphology etc.
Reboulia: features, anatomy, morphology etc.
Cherry
 
Phenolics: types, biosynthesis and functions.
Phenolics: types, biosynthesis and functions.Phenolics: types, biosynthesis and functions.
Phenolics: types, biosynthesis and functions.
Cherry
 
Lipids: types, structure and important functions.
Lipids: types, structure and important functions.Lipids: types, structure and important functions.
Lipids: types, structure and important functions.
Cherry
 
Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.
Cherry
 

Recently uploaded (20)

ABHISHEK ANTIBIOTICS PPT MICROBIOLOGY // USES OF ANTIOBIOTICS TYPES OF ANTIB...
ABHISHEK ANTIBIOTICS PPT MICROBIOLOGY  // USES OF ANTIOBIOTICS TYPES OF ANTIB...ABHISHEK ANTIBIOTICS PPT MICROBIOLOGY  // USES OF ANTIOBIOTICS TYPES OF ANTIB...
ABHISHEK ANTIBIOTICS PPT MICROBIOLOGY // USES OF ANTIOBIOTICS TYPES OF ANTIB...
 
Molecular phylogeny, molecular clock hypothesis, molecular evolution, kimuras...
Molecular phylogeny, molecular clock hypothesis, molecular evolution, kimuras...Molecular phylogeny, molecular clock hypothesis, molecular evolution, kimuras...
Molecular phylogeny, molecular clock hypothesis, molecular evolution, kimuras...
 
GENETICALLY MODIFIED ORGANISM'S PRESENTATION.ppt
GENETICALLY MODIFIED ORGANISM'S PRESENTATION.pptGENETICALLY MODIFIED ORGANISM'S PRESENTATION.ppt
GENETICALLY MODIFIED ORGANISM'S PRESENTATION.ppt
 
The Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptxThe Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptx
 
GBSN - Microbiology (Unit 4) Concept of Asepsis
GBSN - Microbiology (Unit 4) Concept of AsepsisGBSN - Microbiology (Unit 4) Concept of Asepsis
GBSN - Microbiology (Unit 4) Concept of Asepsis
 
POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.
 
Cyanide resistant respiration pathway.pptx
Cyanide resistant respiration pathway.pptxCyanide resistant respiration pathway.pptx
Cyanide resistant respiration pathway.pptx
 
THE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptx
THE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptxTHE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptx
THE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptx
 
Precision Silviculture and Silviculture practices of bamboo.pptx
Precision Silviculture and Silviculture practices of bamboo.pptxPrecision Silviculture and Silviculture practices of bamboo.pptx
Precision Silviculture and Silviculture practices of bamboo.pptx
 
COMPOSTING : types of compost, merits and demerits
COMPOSTING : types of compost, merits and demeritsCOMPOSTING : types of compost, merits and demerits
COMPOSTING : types of compost, merits and demerits
 
Reboulia: features, anatomy, morphology etc.
Reboulia: features, anatomy, morphology etc.Reboulia: features, anatomy, morphology etc.
Reboulia: features, anatomy, morphology etc.
 
Plasmid: types, structure and functions.
Plasmid: types, structure and functions.Plasmid: types, structure and functions.
Plasmid: types, structure and functions.
 
Phenolics: types, biosynthesis and functions.
Phenolics: types, biosynthesis and functions.Phenolics: types, biosynthesis and functions.
Phenolics: types, biosynthesis and functions.
 
Lipids: types, structure and important functions.
Lipids: types, structure and important functions.Lipids: types, structure and important functions.
Lipids: types, structure and important functions.
 
Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.
 
X-rays from a Central “Exhaust Vent” of the Galactic Center Chimney
X-rays from a Central “Exhaust Vent” of the Galactic Center ChimneyX-rays from a Central “Exhaust Vent” of the Galactic Center Chimney
X-rays from a Central “Exhaust Vent” of the Galactic Center Chimney
 
GBSN - Microbiology (Unit 5) Concept of isolation
GBSN - Microbiology (Unit 5) Concept of isolationGBSN - Microbiology (Unit 5) Concept of isolation
GBSN - Microbiology (Unit 5) Concept of isolation
 
SaffronCrocusGenomicsThessalonikiOnlineMay2024TalkOnline.pptx
SaffronCrocusGenomicsThessalonikiOnlineMay2024TalkOnline.pptxSaffronCrocusGenomicsThessalonikiOnlineMay2024TalkOnline.pptx
SaffronCrocusGenomicsThessalonikiOnlineMay2024TalkOnline.pptx
 
Taphonomy and Quality of the Fossil Record
Taphonomy and Quality of the  Fossil RecordTaphonomy and Quality of the  Fossil Record
Taphonomy and Quality of the Fossil Record
 
CONTRIBUTION OF PANCHANAN MAHESHWARI.pptx
CONTRIBUTION OF PANCHANAN MAHESHWARI.pptxCONTRIBUTION OF PANCHANAN MAHESHWARI.pptx
CONTRIBUTION OF PANCHANAN MAHESHWARI.pptx
 

Applying Cheminformatics to Develop a Structure Searchable Database of Analytical Methods

  • 1. Using Cheminformatics Approaches to Develop a Structure Searchable Database of Analytical Methods Antony Williams1, Greg Janesch2, Tyler Carr2, Sakuntala Sivasupramaniam3, and Nancy Baker4 1. Center for Computational Toxicology and Exposure, US-EPA, RTP, NC 2. ORAU Student Services Contractor 3. Senior Environmental Employment (SEE) Program 4. ParlezChem Inc. http://www.orcid.org/0000-0002-2668-4821 March 2024: Spring Fall Meeting, New Orleans, LA The views expressed in this presentation are those of the author and do not necessarily reflect the views or policies of the U.S. EPA
  • 2. Building a Methods Database • Simple Vision: I want to find the best method(s) associated with a chemical and/or class of chemicals • Answer the question “I cannot find a method for my chemical” - HELP • The Approach: – Aggregate MS method documents (and adjust the definition of “what is a useful method”) – Extract chemistry (mostly CASRN and Names) – Map CASRN and Names to structures – Deliver a proof-of-concept application to search a database by names, CASRNs, InChIKeys and ultimately structure 1
  • 3. Most people start with Google • People have many places to search for methods, and there is no one integration hub, except for search engines • Search engines can return so many hits – then you filter by analytes, matrix, analytical methodology, so many synonyms and abbreviations for so many chemicals 2
  • 4. Synonyms, Abbreviations and Chemicals 3 • CAS Numbers, Names and Abbreviations can limit what’s possible…
  • 5. Might this be a better view? 4
  • 6. Might this be a better view? 5
  • 7. When methods are mapped to chemistry… • The advantages of mapping chemicals directly to methods – When chemicals are mapped it opens access to many other tools – Chemical structures allow for QSAR modeling 6
  • 8. …and what if we could then profile toxicity? 7
  • 9. …and what if we could then profile toxicity? 8
  • 10. …or simply harvest data from the CompTox Chemicals Dashboard 9
  • 11. What data would you like??? 10
  • 12. Introducing AMOS Analytical Methods and Spectra Database • Three types of data in the database: – Methods (regulatory, lab manuals and SOPs, publications, tech notes) – Spectra (from public domain and our own laboratories) – Monographs (harvested from SWGDRUG and other sites) • Some methods have associated spectra • Some data are just externally linked • Currently contains around 263,000 spectra, 600,000 external links, 4800 “Fact Sheets” and ~5000 methods • Spectra – LC-MS, GC-MS, NMR • ALL data are growing in number 11
  • 13. Where are there methods? • Agency-based methods – EPA, USGS, USDA, CDC, FDA, OSHA, DEA, ATSDR, NIOSH … • ASTM and ISO methods • Vendor application notes – Bruker, Waters, Agilent, Sciex, Shimadzu, LECO, Thermo • Peer-reviewed articles • Laboratory Documents – lab manuals, SOPs 12
  • 14. A view of the methods list 13
  • 15. Filtering the list for interests… • Look for pesticides studied in water, by GC/MS, after 1990 14
  • 16. Where are there methods? • 900 method documents from the EPA harvested 15
  • 17. Many Scanned Documents!!! 16 • Methods generally developed by the agrochemical companies • Include parents plus degradation products • Lots of scanned, old, documents but the historical records are still of significant use • Electronic document forms of old documents still of benefit
  • 18. Embedding the old Method PDFs 17
  • 20. When Methods are OPEN Access 19
  • 21. When Methods are PubMed OPEN Access 20
  • 22. Proprietary Methods for INTERNAL Access 21
  • 23. If there is no method for your chemical • Use “Chemical Similarity Searching” so that you can find chemicals that are similar in structure space • Use the “Tanimoto Similarity Search 22
  • 24. Searching for a chemical – CASRN, Name Direct structure searching coming 23
  • 25. When Methods are Not Enough • EPA is highly active in the field of non-targeted analysis • We have been applying lots of cheminformatics approaches 24
  • 26. Building a spectrum library to search against 25
  • 27. Linking to actual spectra 26
  • 28. Linking to actual spectra 27
  • 29. There are errors EVERYWHERE: 110-75-8
  • 30. It can be challenging 9/365 chemicals…
  • 31. There are MANY errors in public spectral databases 30
  • 32. There are MANY errors in public spectral databases 31
  • 33. There are MANY errors in public spectral databases 32
  • 34. There are MANY errors in public spectral databases 33 • Chemical structure representations would ideally be standardized…consider tautomeric forms • Not all substances are explicit and can be ambiguous representations
  • 35. Fact Sheets – 4800 of them and growing 34
  • 37. Conclusions • We have built the first chemical structure indexed open database of “methods”. • Methods are not just “approved methods” but also standard operating procedures, application notes, lab manuals, regulatory methods etc. • Integrating methods to experimental spectral data will serve our non-targeted analysis efforts • Work is underway to make it public 36
  • 38. If you want to help… • Send information regarding analytical methods and method articles to williams.antony@epa.gov • If you want to add methods or spectral data please contact me 37