Big data exposome and pediatric outcomes

Talk at NEMOURS Hot Topics on 12/8/2016

Published in: Health & Medicine

Big data exposome and pediatric outcomes

  1. 1. Big data streams to elucidate the role of environmental exposures in pediatric outcomes Chirag J Patel Hot Topics! 12/8/2016 chirag@hms.harvard.edu @chiragjp www.chiragjpgroup.org
  2. 2. Conflicts of Interest Disclosure: None
  3. 3. P = G + E. Phenotypes are a function of inherited and environmental factors.
  4. 4. P = G + E. Phenotype (P): Type 2 Diabetes, Cancer, Birthweight, Birth timing. Phenotypes are a function of inherited and environmental factors.
  5. 5. P = G + E. Phenotype (P): Type 2 Diabetes, Cancer, Birthweight, Birth timing. Genome (G): polymorphisms. Phenotypes are a function of inherited and environmental factors.
  6. 6. P = G + E. Phenotype (P): Type 2 Diabetes, Cancer, Birthweight, Birth timing. Genome (G): polymorphisms. Environment (E): infectious agents, nutrients, pollutants, drugs. Phenotypes are a function of inherited and environmental factors.
  7. 7. P = G + E However: we lack methods to discover the role of E in phenotypes and disease for precision medicine.
  8. 8. ... and the case is different with genetics (e.g., genomics)! Over 1,400 Genome-wide Association Studies (GWAS). Source: NHGRI GWAS Catalog, https://www.genome.gov/ [Figure: G: dad, mom, me]
  9. 9. A similar platform for discovery should exist for E!
  10. 10. A similar platform for discovery should exist for E! Why?
  11. 11. $\sigma^2_P = \sigma^2_G + \sigma^2_E$
  12. 12. $H^2 = \sigma^2_G / \sigma^2_P$. Heritability ($H^2$) is the range of phenotypic variability attributed to genetic variability in a population. It is an indicator of the proportion of phenotypic differences attributed to G.
  13. 13. Height is an example of a heritable trait: Francis Galton shows how it's done (1887). The mid-height of 205 parents described 60% of the variability of 928 offspring. [Formula shown on slide: $\sigma^2_G / \sigma^2_P$]
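The variance decomposition and heritability ratio on the slides above can be illustrated with a small simulation. This is a minimal sketch, not the speaker's analysis: it assumes G and E are independent, normally distributed, and additive, and the function name `simulate_heritability` and the variance values are hypothetical choices made only for illustration.

```python
import random
import statistics


def simulate_heritability(var_g, var_e, n=20000, seed=0):
    """Simulate P = G + E and return the estimated H^2 = Var(G) / Var(P).

    Assumption (not from the slides): G and E are independent Gaussian
    draws with mean 0 and the given variances.
    """
    rng = random.Random(seed)
    g = [rng.gauss(0.0, var_g ** 0.5) for _ in range(n)]
    e = [rng.gauss(0.0, var_e ** 0.5) for _ in range(n)]
    p = [gi + ei for gi, ei in zip(g, e)]  # phenotype = genetic + environmental part
    return statistics.pvariance(g) / statistics.pvariance(p)


# With Var(G) = 1 and Var(E) = 3, the expected heritability is 1 / (1 + 3) = 0.25.
h2 = simulate_heritability(var_g=1.0, var_e=3.0)
print(f"Estimated H^2: {h2:.3f}")
```

Under these assumptions the remaining share, 1 - H^2, is the portion of phenotypic variance left to E, which is the quantity the following slides argue is under-explored.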
  14. 14. H2 estimates for complex traits are low and variable: massive opportunity for high-throughput E research. [Figure: heritability, Var(G)/Var(Phenotype), on a 0 to 100 scale, for complex traits listed from eye color, hair curliness, type-1 diabetes, and height through to lung cancer, leukemia, and stomach cancer. Source: SNPedia.com]
  15. 15. H2 estimates for complex traits are low and variable: massive opportunity for high-throughput E research. [Same figure. Source: SNPedia.com] Highlighted: Type 2 Diabetes (25%)
  16. 16. H2 estimates for complex traits are low and variable: massive opportunity for high-throughput E research. [Same figure. Source: SNPedia.com] Highlighted: Type 2 Diabetes (25%), Heart Disease (30-60%)
  17. 17. H2 estimates for complex traits are low and variable: massive opportunity for high-throughput E research. [Same figure. Source: SNPedia.com] Highlighted: Preterm Birth (37%), Birthweight (40%), Timing (20%)
  18. 18. H2 estimates for complex traits are low and variable: massive opportunity for high-throughput E research. [Same figure. Source: SNPedia.com] Highlighted: H2 < 50%
  19. 19. Explaining the other 50%: A new data-driven paradigm for robust discovery of E in disease via EWAS and the exposome. [Figure from a Science Perspectives article (www.sciencemag.org, October 21, 2010); the surrounding article text is cropped on the slide. Caption: "Characterizing the exposome. The exposome represents the combined exposures from all sources that reach the internal chemical environment. Toxicologically important classes of exposome chemicals are shown. Signatures and biomarkers can detect these agents in blood or serum." External environment: radiation, diet, pollution, infections, drugs, life-style, stress. Internal chemical environment: xenobiotics, inflammation, preexisting disease, lipid peroxidation, oxidative stress, gut flora. Chemical classes: reactive electrophiles, metals, endocrine disrupters, immune modulators, receptor-binding proteins.] Wild, 2005; Rappaport and Smith, 2010, 2011; Buck-Louis and Sundaram, 2012; Miller and Jones, 2014; Patel CJ and Ioannidis JPA, 2014
  20. 20. Explaining the other 50%: A new data-driven paradigm for robust discovery of E in disease via EWAS and the exposome. What to measure? [Same figure and citations as slide 19.]
  21. 21. Explaining the other 50%: A new data-driven paradigm for robust discovery of E in disease via EWAS and the exposome. What to measure? How to measure? [Same figure and citations as slide 19.]
  22. 22. Explaining the other 50%: A new data-driven paradigm for robust discovery of E in disease via EWAS and the exposome. What to measure? How to measure? How to analyze in relation to health? [Same figure and citations as slide 19.]
  23. 23. Explaining the other 50%: A new data-driven paradigm for robust discovery of E in disease via EWAS and the exposome. What to measure? How to measure? How to analyze in relation to health? [Same figure and citations as slide 19.]
  24. 24. Explaining the other 50%: A new data-driven paradigm for robust discovery of E in disease via EWAS and the exposome. What to measure? How to measure? How to analyze in relation to health? “A more comprehensive view of environmental exposure is needed ... to discover major causes of diseases...” [Same figure and citations as slide 19.]
  25. 25. Connecting Environmental Exposure with Disease: Missing the “System” of Exposures? [2x2 table: exposed (E+) vs. unexposed (E-) by diseased vs. non-diseased, with a question mark.] We are exposed to many things, but studies do not assess the multiplicity. The literature of associations is fragmented. It is a challenge to discover E associated with disease.
  26. 26. Example of fragmentation and vibration of effects: Is everything we eat associated with cancer? (AJCN, 2012; JCE, 2015) 50 random ingredients from the Boston Cooking School Cookbook. Any associated with cancer?
  27. 27. Example of fragmentation and vibration of effects: Is everything we eat associated with cancer? (AJCN, 2012; JCE, 2015) 50 random ingredients from the Boston Cooking School Cookbook. Any associated with cancer? Of 50, 40 were studied in relation to cancer risk.
  28. 28. Example of fragmentation and vibration of effects: Is everything we eat associated with cancer? (AJCN, 2012; JCE, 2015) 50 random ingredients from the Boston Cooking School Cookbook. Any associated with cancer? Of 50, 40 were studied in relation to cancer risk. [Figure caption, cropped on the slide: "FIGURE 1. Effect estimates reported in the literature by malignancy type (top) or ingredient (bottom). Only ingredients with ≥10 studie… outliers are not shown (effect estimates >10)."]
  29. 29. Example of fragmentation and vibration of effects: Is everything we eat associated with cancer? (AJCN, 2012; JCE, 2015) 50 random ingredients from the Boston Cooking School Cookbook. Any associated with cancer? Of 50, 40 were studied in relation to cancer risk. Weak statistical evidence: non-replicated, inconsistent effects, non-standardized. [Same figure as slide 28.]
  30. 30. Example of fragmentation and vibration of effects: Is everything we eat associated with cancer? (AJCN, 2012; JCE, 2015) 50 random ingredients from the Boston Cooking School Cookbook. Any associated with cancer? Of 50, 40 were studied in relation to cancer risk. Weak statistical evidence: non-replicated, inconsistent effects, non-standardized. [Same figure as slide 28; axis: relative risk, labeled from "protection" to "risk".]
  31. 31. Connecting Environmental Exposure with Disease: Missing the “System” of Exposures? [2x2 table: exposed (E+) vs. unexposed (E-) by diseased vs. non-diseased, with a question mark.] We are exposed to many things, but studies do not assess the multiplicity. The literature of associations is fragmented. It is a challenge to discover E associated with disease.
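Why multiplicity matters can be shown with simple arithmetic. The sketch below is an illustration under stated assumptions, not a method taken from the talk: it assumes every tested exposure is truly unassociated with disease and that the tests are independent, and it uses the Bonferroni correction as one familiar, conservative adjustment. The function names are hypothetical, and m = 50 merely echoes the 50-ingredient example on the earlier slides.

```python
def expected_false_positives(m, alpha=0.05):
    """Expected number of 'significant' results among m true-null tests."""
    return m * alpha


def prob_any_false_positive(m, alpha=0.05):
    """Chance of at least one false positive among m independent null tests."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_threshold(m, alpha=0.05):
    """Per-test p-value threshold that keeps the family-wise error rate at alpha."""
    return alpha / m


m = 50  # illustrative number of exposures tested against one outcome
print(f"Expected false positives: {expected_false_positives(m):.1f}")
print(f"P(at least one false positive): {prob_any_false_positive(m):.3f}")
print(f"Bonferroni per-test threshold: {bonferroni_threshold(m):.4f}")
```

Testing exposures one at a time across separate studies hides this multiplicity, which is one reason a fragmented literature can appear to link almost any exposure to disease.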
  32. 32. Moving beyond the lamppost for discovery (image courtesy: colabria.com)
  33. 33. Gold standard for breadth of human exposure information: National Health and Nutrition Examination Survey¹. Since the 1960s; now biennial, 1999 onwards; 10,000 participants per survey. [Slide background, NHANES brochure excerpt:]
      The sample for the survey is selected to represent the U.S. population of all ages. To produce reliable statistics, NHANES over-samples persons 60 and older, African Americans, and Hispanics. Since the United States has experienced dramatic growth in the number of older people during this century, the aging population has major implications for health care needs, public policy, and research priorities. NCHS is working with public health agencies to increase the knowledge of the health status of older Americans. NHANES has a primary role in this endeavor. All participants visit the physician. Dietary interviews and body measurements are included for everyone. All but the very young have a blood sample taken and will have a dental screening. Depending upon the age of the participant, the rest of the examination includes tests and procedures to assess the various aspects of health listed above. In general, the older the individual, the more extensive the examination.
      Survey Operations: Health interviews are conducted in respondents’ homes. Health measurements are performed in specially-designed and equipped mobile centers, which travel to locations throughout the country. The study team consists of a physician, medical and health technicians, as well as dietary and health interviewers. Many of the study staff are bilingual (English/Spanish). An advanced computer system using high-end servers, desktop PCs, and wide-area networking collect and process all of the NHANES data, nearly eliminating the need for paper forms and manual coding operations. This system allows interviewers to use notebook computers with electronic pens. The staff at the mobile center can automatically transmit data into data bases through such devices as digital scales and stadiometers. Touch-sensitive computer screens let respondents enter their own responses to certain sensitive questions in complete privacy. Survey information is available to NCHS staff within 24 hours of collection, which enhances the capability of collecting quality data and increases the speed with which results are released to the public. In each location, local health and government officials are notified of the upcoming survey. Households in the study area receive a letter from the NCHS Director to introduce the survey. Local media may feature stories about the survey. NHANES is designed to facilitate and encourage participation. Transportation is provided to and from the mobile center if necessary. Participants receive compensation and a report of medical findings is given to each participant. All information collected in the survey is kept strictly confidential. Privacy is protected by public laws.
      Uses of the Data: Information from NHANES is made available through an extensive series of publications and articles in scientific and technical journals. For data users and researchers throughout the world, survey data are available on the internet and on easy-to-use CD-ROMs. Research organizations, universities, health care providers, and educators benefit from survey information. Primary data users are federal agencies that collaborated in the design and development of the survey. The National Institutes of Health, the Food and Drug Administration, and CDC are among the agencies that rely upon NHANES to provide data essential for the implementation and evaluation of program activities. The U.S. Department of Agriculture and NCHS cooperate in planning and reporting dietary and nutrition information from the survey. NHANES’ partnership with the U.S. Environmental Protection Agency allows continued study of the many important environmental influences on our health.
      • Physical fitness and physical functioning • Reproductive history and sexual behavior • Respiratory disease (asthma, chronic bronchitis, emphysema) • Sexually transmitted diseases • Vision
      ¹ http://www.cdc.gov/nchs/nhanes.htm
  34. 34. Gold standard for breadth of human exposure information: National Health and Nutrition Examination Survey (NHANES), http://www.cdc.gov/nchs/nhanes.htm. Since the 1960s; now biennial (1999 onwards); 10,000 participants per survey. [NHANES brochure excerpt repeated from slide 33.] Measured in NHANES: >250 exposures (serum + urine); >1,000 genetic loci; >85 quantitative clinical traits (e.g., serum glucose, lipids, body mass index)
  35. 35. What maternal E are associated with preterm birth (< 37 weeks)? Reprod Tox, 2014
  40. 40. What maternal E are associated with preterm birth (< 37 weeks)?: What did we screen in moms? Infectious agents (24): hepatitis, HIV, Staph. aureus. Nutrients and vitamins (32): vitamin D, carotenes. Plastics and consumables (49): phthalates, bisphenol A. Pesticides and air-related pollutants (95): atrazine; cadmium; hydrocarbons; polychlorinated biphenyls; volatile organic compounds.
  45. 45. What maternal E are associated with preterm birth (< 37 weeks)?: Method for screening for associations. Reprod Tox, 2014. NHANES 1999-2006: 5772 reporting live births. Pregnant in the year prior to survey? 842 participants. Any child born preterm (< 37 weeks)? Any preterm birth N=62; no preterm birth N=718. Logistic regression (age, race, poverty/income, education, number of births).
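The screening step on this slide, one logistic regression per exposure, can be sketched roughly as follows. This is a minimal, unadjusted illustration in plain Python: the function names (`fit_logistic`, `odds_ratio_per_sd`, `screen`) are hypothetical, and the covariate adjustment listed on the slide (age, race, poverty/income, education, number of births) is deliberately left out to keep the example short. It is not the code used in the cited study.

```python
import math
import statistics


def fit_logistic(x, y, max_iter=50, tol=1e-10):
    """Fit logit P(y=1) = b0 + b1*x by Newton-Raphson and return (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p          # score for the intercept
            g1 += (yi - p) * xi   # score for the slope
            h00 += w              # entries of the 2x2 information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve (information matrix) * step = score
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    return b0, b1


def odds_ratio_per_sd(exposure, outcome):
    """Unadjusted odds ratio for a 1 SD increase in the exposure."""
    mean = statistics.mean(exposure)
    sd = statistics.stdev(exposure)
    z = [(v - mean) / sd for v in exposure]  # z-score so the slope is per SD
    _, slope = fit_logistic(z, outcome)
    return math.exp(slope)


def screen(exposures, outcome):
    """Apply the same model to every exposure; return {name: OR per SD}."""
    return {name: odds_ratio_per_sd(values, outcome)
            for name, values in exposures.items()}
```

A real screen would also need a p-value per exposure (for example from a Wald test), the covariates above, and the NHANES survey weights; a statistics package is the more practical choice for that.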
  48. 48. What maternal E are associated with preterm birth (< 37 weeks)?: Volcano plot of 201 associations. Reprod Tox, 2014. [Volcano plot: x-axis, odds ratio; y-axis, −log10(p-value).] Labeled associations: urine bisphenol A (OR: 1.9); serum iron (OR: 1.6); urine Cs (OR: 1.9); urine hydroxypyrene (OR: 1.8).
  52. 52. Tentative evaluation of higher bisphenol A levels in moms who gave preterm birth in a tertiary clinic. Reprod Tox, 2014. Lucile Packard Children's Hospital: 37 consenting mothers with urine (during gestation) → bisphenol A. Child born preterm? Preterm N=16; no preterm N=21. Urine bisphenol A: 0.07 ug/mL vs. 0.03 ug/mL. Odds ratio (1 SD change): 3.5 (p=0.1) (covariates: age, race, creatinine, gestational age).
  53. 53. The spectrum of associations depends on age: What E factors are associated with mortality?
  58. 58. EWAS to search for exposures and behaviors associated with all-cause mortality. IJE, 2013. NHANES 1999-2004 with National Death Index linked mortality; 246 behaviors and exposures (serum/urine/self-report). NHANES 1999-2001: N=330 to 6008 (26 to 655 deaths), ~5.5 years of follow-up; Cox proportional hazards (baseline exposure and time to death); false discovery rate < 5%. Then NHANES 2003-2004: N=177 to 3258 (20 to 202 deaths), ~2.8 years of follow-up; p < 0.05.
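The slide reads as a two-stage screen: associations must pass a false discovery rate below 5% in the first cohort and then p < 0.05 in the second. A minimal sketch of that filter is below. It assumes the Benjamini-Hochberg procedure for the FDR step, which the slide does not name, and it takes the per-exposure p-values as given (in the study they would come from the Cox models). The function names and the exposure labels in the usage example are hypothetical, and the p-values there are invented purely for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


def two_stage_screen(discovery_p, replication_p, fdr=0.05, alpha=0.05):
    """Keep exposures with FDR < fdr in discovery and p < alpha in replication."""
    names = list(discovery_p)
    q = benjamini_hochberg([discovery_p[n] for n in names])
    passed = [n for n, qv in zip(names, q) if qv < fdr]
    # An exposure missing from the replication cohort cannot be validated.
    return [n for n in passed if replication_p.get(n, 1.0) < alpha]


# Invented p-values, for illustration only.
discovery = {"exposure_a": 0.0001, "exposure_b": 0.001,
             "exposure_c": 0.5, "exposure_d": 0.8}
replication = {"exposure_a": 0.01, "exposure_b": 0.2,
               "exposure_c": 0.03, "exposure_d": 0.9}
validated = two_stage_screen(discovery, replication)
```

Here `exposure_b` passes the discovery threshold but fails replication, and `exposure_c` has a small replication p-value but never passed discovery, so only `exposure_a` is retained.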
  66. 66. EWAS (re)-identifies factors associated with all-cause mortality: Volcano plot of 200 associations. [Volcano plot: x-axis, adjusted hazard ratio (0.4 to 2.8); y-axis, −log10(p-value) (0 to 8).] Numbered exposure and behavior points: 1 Physical activity; 2 Does anyone smoke in home?; 3 Cadmium; 4 Cadmium, urine; 5 Past smoker; 6 Current smoker; 7 trans-lycopene (11). Numbered points for age, sex, income, education, race/ethnicity, occupation [in red]: 1 age (10 year increment); 2 SES_1; 3 male; 4 SES_0; 5 black; 6 SES_2; 7 SES_3; 8 education_hs; 9 other_eth; 10 mexican; 11 occupation_blue_semi; 12 education_less_hs; 13 occupation_never; 14 occupation_blue_high; 15 occupation_white_semi; 16 other_hispanic (69). Callouts: age (10 years); income (quintile 1); income (quintile 2); income (quintile 3); male; black; anyone smoke in home?; serum and urine cadmium [1 SD]; past smoker?; current smoker?; serum lycopene [1 SD]; physical activity [low, moderate, high activity]* (*derived from METs per activity and categorized by Health.gov guidelines). R2 ~ 2%
  67. 67. Remember: >50% of disease risk and phenotypic variability is in E! H2 < 50%. [Bar chart: heritability, Var(G)/Var(Phenotype), on a 0 to 100 scale. Source: SNPedia.com.] Traits shown: Eye color; Hair curliness; Type-1 diabetes; Height; Schizophrenia; Epilepsy; Graves' disease; Celiac disease; Polycystic ovary syndrome; Attention deficit hyperactivity disorder; Bipolar disorder; Obesity; Alzheimer's disease; Anorexia nervosa; Psoriasis; Bone mineral density; Menarche, age at; Nicotine dependence; Sexual orientation; Alcoholism; Lupus; Rheumatoid arthritis; Crohn's disease; Migraine; Thyroid cancer; Autism; Blood pressure, diastolic; Body mass index; Depression; Coronary artery disease; Insomnia; Menopause, age at; Heart disease; Prostate cancer; QT interval; Breast cancer; Ovarian cancer; Hangover; Stroke; Asthma; Blood pressure, systolic; Hypertension; Osteoarthritis; Parkinson's disease; Longevity; Type-2 diabetes; Gallstone disease; Testicular cancer; Cervical cancer; Sciatica; Bladder cancer; Colon cancer; Lung cancer; Leukemia; Stomach cancer.
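The arithmetic behind this reminder can be written out under a simplifying assumption that the slide does not state: phenotypic variance splits additively into a genetic part and an environmental part, with no gene-environment covariance or interaction. In that case E is everything that is not G, including measurement error.

```latex
\begin{align}
\mathrm{Var}(P) &= \mathrm{Var}(G) + \mathrm{Var}(E), \\
H^2 &= \frac{\mathrm{Var}(G)}{\mathrm{Var}(P)}, \\
\frac{\mathrm{Var}(E)}{\mathrm{Var}(P)} &= 1 - H^2 > 0.5
\quad \text{whenever } H^2 < 0.5 .
\end{align}
```

So for any trait on the chart with heritability below 50%, more than half of the phenotypic variance is left to E under this decomposition.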
  68. 68. Where can it be found? >50% of disease risk and phenotypic variability is in E!
  69. 69. How can we proceed to study the elusive environment in large scale for discovery-based research? JAMA, 2014; JECH, 2014; Proc Symp Biocomp, 2015.
[Excerpt shown on slide: JAMA Viewpoint (Opinion), "Studying the Elusive Environment in Large Scale." Chirag J. Patel, PhD, Center for Biomedical Informatics, Harvard Medical School, Boston, Massachusetts. John P. A. Ioannidis, MD, DSc, Stanford Prevention Research Center, Department of Health Research and Policy, Department of Medicine, Stanford University School of Medicine, Stanford, California, Department of Statistics, Stanford University School of Humanities and Sciences, Stanford, California, and Meta-Research Innovation Center at Stanford (METRICS), Stanford, California.]
It is possible that more than 50% of complex disease risk is attributed to differences in an individual's environment.1 Air pollution, smoking, and diet are documented environmental factors affecting health, yet these factors are but a fraction of the "exposome," the totality of the exposure load occurring throughout a person's lifetime.1 Investigating one or a handful of exposures at a time has led to a highly fragmented literature of epidemiologic associations. Much of that literature is not reproducible, and selective reporting may be a major reason for the lack of reproducibility. A new model is required to discover environmental exposures associated with disease while mitigating possibilities of selective reporting.
To remedy the lack of reproducibility and concerns of validity, multiple personal exposures can be assessed simultaneously in terms of their association with a condition or disease of interest; the strongest associations can then be tentatively validated in independent data sets (eg, as done in references 2 and 3).2,3 The main advantages of this process include the ability to search the list of exposures and adjust for multiplicity systematically and report all the probed associations instead of only the most significant results. The term "environment-wide association studies" (EWAS) has been used to describe this approach (an analogy to genome-wide association studies). For example, Wang et al4 screened more than 2000 chemicals in serum to discover endogenous exposures associated with risk for cardiovascular disease.
There are notable hurdles in analyzing "big" environmental data. These same problems affect epidemiology of 1-risk-factor-at-a-time, but in EWAS their prevalence becomes more clearly manifest at large scale. When study- [...] the EWAS vantage point, intervening on β-carotene (Figure, D) seems a futile exercise given its complex relationship with other nutrients and pollutants.
Given this complexity, how can studies of environmental risk move forward? First, EWAS analyses should be applied to multiple data sets, and consistency can be formally examined for all assessed correlations. Second, the temporal relationship between exposure and changes in health parameters may offer helpful hints about which of the signals are more than simple correlations. Third, standardized adjusted analyses, in which adjustments are performed systematically and in the same way across multiple data sets, may also help. This is in stark contrast with the current model, whereby most epidemiologic studies use single data sets without replication as well as non-time-dependent assessments, and reported adjustments are markedly different across reports and data sets, even those performed by the same team (different approaches increase validity but must be reconciled and assimilated).
However, eventually for most environmental correlates, there may be unsurpassable difficulty establishing potential causal inferences based on observational data alone. Factors that seem protective may sometimes be tested in randomized trials. The complexity of the multiple correlations also highlights the challenge that intervening to modify 1 putative risk factor also may inadvertently affect multiple other correlated factors. Even when a seemingly simple intervention is tested in randomized trials (affecting a single risk factor among the many correlations), the intervention is not really simple. In essence what is tested are multiple perturbations of factors correlated with the one targeted for interven- [...]
  70. 70. How can we proceed to study the elusive environment in large scale for discovery-based research? JAMA, 2014; JECH, 2014; Proc Symp Biocomp, 2015. [JAMA Viewpoint excerpt "Studying the Elusive Environment in Large Scale" (Patel and Ioannidis), repeated from slide 69.] • new 'omics technologies
  71. 71. Studying the Elusive Environment in Large Scale. It is possible that more than 50% of complex disease risk is attributed to differences in an individual's environment.1 Air pollution, smoking, and diet are documented environmental factors affecting health, yet these factors are but a fraction of the "exposome," the totality of the exposure load occurring throughout a person's lifetime.1 Investigating one or a handful of exposures at a time has led to a highly fragmented literature of epidemiologic associations. Much of that literature is not reproducible, and selective reporting may be a major reason for the lack of reproducibility. A new model is required to discover environmental exposures associated with disease while mitigating possibilities of selective reporting. To remedy the lack of reproducibility and concerns of validity, multiple personal exposures can be assessed simultaneously in terms of their association with a condition or disease of interest; the strongest associations can then be tentatively validated in independent data sets (eg, as done in references 2 and 3).2,3 The main advantages of this process include the ability to search the list of exposures and adjust for multiplicity systematically and report all the probed associations instead of only the most significant results. The term "environment-wide association studies" (EWAS) has been used to describe this approach (an analogy to genome-wide association studies). For example, Wang et al4 screened more than 2000 chemicals in serum to discover endogenous exposures associated with risk for cardiovascular disease. There are notable hurdles in analyzing "big" environmental data. These same problems affect epidemiology of 1-risk-factor-at-a-time, but in EWAS their prevalence becomes more clearly manifest at large scale. When study- [...] the EWAS vantage point, intervening on β-carotene (Figure, D) seems a futile exercise given its complex relationship with other nutrients and pollutants.
Given this complexity, how can studies of environmental risk move forward? First, EWAS analyses should be applied to multiple data sets, and consistency can be formally examined for all assessed correlations. Second, the temporal relationship between exposure and changes in health parameters may offer helpful hints about which of the signals are more than simple correlations. Third, standardized adjusted analyses, in which adjustments are performed systematically and in the same way across multiple data sets, may also help. This is in stark contrast with the current model, whereby most epidemiologic studies use single data sets without replication as well as non–time-dependent assessments, and reported adjustments are markedly different across reports and data sets, even those performed by the same team (different approaches increase validity but must be reconciled and assimilated). However, eventually for most environmental correlates, there may be unsurpassable difficulty establishing potential causal inferences based on observational data alone. Factors that seem protective may sometimes be tested in randomized trials. The complexity of the multiple correlations also highlights the challenge that intervening to modify 1 putative risk factor also may inadvertently affect multiple other correlated factors. Even when a seemingly simple intervention is tested in randomized trials (affecting a single risk factor among the many correlations), the intervention is not really simple. In essence what is tested are multiple perturbations of factors correlated with the one targeted for interven- [...] VIEWPOINT. Chirag J. Patel, PhD, Center for Biomedical Informatics, Harvard Medical School, Boston, Massachusetts. John P. A. Ioannidis, MD, DSc, Stanford Prevention Research Center, Department of Health Research and Policy, Department of Medicine, Stanford University School of Medicine, Stanford, California; Department of Statistics, Stanford University School of Humanities and Sciences, Stanford, California; and Meta-Research Innovation Center at Stanford (METRICS), Stanford, California.
JAMA, 2014; JECH, 2014; Proc Symp Biocomp, 2015. How can we proceed to study the elusive environment in large scale for discovery-based research? • longitudinal publicly available data • new 'omics technologies
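The EWAS workflow described in the Viewpoint text on this slide (test many exposures at once, adjust for multiplicity, report every probed association, then check the strongest hits in an independent data set) can be sketched in a few lines. This is only an illustration under stated assumptions: the data are simulated, the names `simulate`, `ewas_scan`, and `exposure_0` are hypothetical, the association measure is a plain Pearson correlation with a Fisher z approximation for the p-value, and the Benjamini-Hochberg procedure stands in for whatever multiplicity adjustment a real study would choose. None of it is taken from the authors' own analyses.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def p_value(r, n):
    """Two-sided p-value for r using the Fisher z approximation."""
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p to the smallest
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def ewas_scan(exposures, outcome):
    """Test every exposure against the outcome and keep all results."""
    names = list(exposures)
    n = len(outcome)
    rs = [pearson(exposures[k], outcome) for k in names]
    ps = [p_value(r, n) for r in rs]
    qs = benjamini_hochberg(ps)
    return {k: {"r": r, "p": p, "q": q} for k, r, p, q in zip(names, rs, ps, qs)}


def simulate(seed, n=500, n_exposures=50):
    """Hypothetical data: only exposure_0 is truly related to the outcome."""
    rng = random.Random(seed)
    exposures = {
        f"exposure_{j}": [rng.gauss(0, 1) for _ in range(n)]
        for j in range(n_exposures)
    }
    outcome = [0.5 * e + rng.gauss(0, 1) for e in exposures["exposure_0"]]
    return exposures, outcome


# Discovery scan, then tentative validation in an independent simulated data set.
discovery = ewas_scan(*simulate(seed=1))
replication = ewas_scan(*simulate(seed=2))
hits = [k for k, v in discovery.items() if v["q"] < 0.05]
validated = [
    k for k in hits
    if replication[k]["p"] < 0.05 and replication[k]["r"] * discovery[k]["r"] > 0
]
print(len(discovery), "associations reported; validated:", validated)
```

The scan returns a result for every exposure rather than only the significant ones, which mirrors the point about reporting all probed associations; the replication step requires both a small p-value and the same direction of effect.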
  72. 72. JAMA, 2014; JECH, 2014; Proc Symp Biocomp, 2015. How can we proceed to study the elusive environment in large scale for discovery-based research? • longitudinal publicly available data [Slide background: page images of the JAMA Viewpoint "Studying the Elusive Environment in Large Scale" by Patel and Ioannidis, repeated from slide 71.] High-throughput ascertainment of endogenous indicators of environmental exposure that may reflect the exposome increasingly attract attention, and their performance needs to be carefully evaluated.
These include chemical detection of indicators of exposure through [...] US federally funded gene expression experiment data be deposited in public repositories such as the Gene Expression Omnibus [...] repository has been instrumental in development of technolo[...] measurement of gene expression, data standardization, and [...]
Figure. Correlation Interdependency Globes for 4 Environmental Exposures (Cotinine, Mercury, Cadmium, Trans-β-Carotene) in National Health and Nutrition Examination Survey (NHANES) Participants, 2003-2004. Panels: A, Serum cotinine; B, Serum total mercury; C, Serum cadmium; D, Serum trans-β-carotene. Total correlations, as listed on the figure: 37, 42, 68, 68. Legend: negative correlation, positive correlation; infectious agents, pollutants, nutrients and vitamins, demographic attributes. Each correlation interdependency globe includes 317 environmental exposures represented by the nodes around the periphery of the globe. Pairwise correlations are depicted by edges (lines) between the node of interest (arrowhead) and other nodes. Correlations with absolute values exceeding 0.2 are shown (stronges[...] The size of each node is proportional to the number of edges for a node, and thickness of each edge indicates the magnitude of the correlation.
• data mining and informatics to tackle complexity: what causes what? confounding • new 'omics technologies
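The figure caption above describes each globe as the set of pairwise correlations between one exposure of interest and all others, keeping only those with absolute value above 0.2. A minimal sketch of that filtering step follows, assuming simulated data and hypothetical variable names (`focus`, `pos_partner`, `neg_partner`, `unrelated`); it does not use NHANES data or reproduce the published figure.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def globe_edges(data, focus, threshold=0.2):
    """Edges from the focus variable to every other variable with |r| > threshold."""
    edges = {}
    for name, values in data.items():
        if name == focus:
            continue
        r = pearson(data[focus], values)
        if abs(r) > threshold:
            edges[name] = r  # sign gives direction, magnitude gives edge thickness
    return edges


# Hypothetical data: one positively related, one negatively related, one unrelated variable.
rng = random.Random(0)
n = 2000
focus_values = [rng.gauss(0, 1) for _ in range(n)]
data = {
    "focus": focus_values,
    "pos_partner": [v + rng.gauss(0, 1) for v in focus_values],
    "neg_partner": [-v + rng.gauss(0, 1) for v in focus_values],
    "unrelated": [rng.gauss(0, 1) for _ in range(n)],
}

edges = globe_edges(data, "focus")
print(len(edges), "total correlations for the focus node")
for name, r in sorted(edges.items()):
    print(name, "positive" if r > 0 else "negative", round(abs(r), 2))
```

The edge count per node corresponds to the "total correlations" figure reported for each panel, and the sign of each retained correlation corresponds to the positive/negative edge coloring in the legend.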
  73. 73. JAMA, 2014; JECH, 2014; Proc Symp Biocomp, 2015. How can we proceed to study the elusive environment in large scale for discovery-based research? • longitudinal publicly available data • data mining and informatics to tackle complexity: what causes what? confounding • new 'omics technologies. EWAS. [Slide background: page images and figure from the JAMA Viewpoint "Studying the Elusive Environment in Large Scale" by Patel and Ioannidis, repeated from slides 71 and 72.]
  74. 74. JAMA, 2014; JECH, 2014; Proc Symp Biocomp, 2015. How can we proceed to study the elusive environment in large scale for discovery-based research? • longitudinal publicly available data • data mining and informatics to tackle complexity: what causes what? confounding • new 'omics technologies. EWAS. [Slide background: page images and figure from the JAMA Viewpoint "Studying the Elusive Environment in Large Scale" by Patel and Ioannidis, repeated from slides 71 and 72.]
  75. 75. With Paul Avillach, Michael McDuffie, Jeremy Easton-Marks, Cartik Saravanamuthu, and the BD2K PIC-SURE team. 40K participants; >1000 indicators of exposure. Data and API available now: http://nhanes.hms.harvard.edu Download all the data: NHANES exposome browser
