Neo4j GraphDay Munich - Graphs to fight Diabetes

Neo4j GraphDay Munich Health & Life Sciences
Dr. Alexander Jarasch, DZD


  1. 1. Graphs to fight diabetes Dr. Alexander Jarasch Head of Data and Knowledge Management The German Center for Diabetes Research (DZD)
  2. 2. Evolutionary advantage becomes disadvantage energy storage essential for survival upon lack of food energy storage essential for survival upon food abundance
  3. 3. What is diabetes mellitus? • metabolic disease • insulin production in the pancreas is reduced, or the body responds poorly to insulin (insulin = hormone the body needs to get glucose out of the bloodstream and into the cells) • consequences: • less absorption of sugar • sugar is not stored in liver and muscle cells • persistently high levels of sugar in the blood (hyperglycemia) • severe complications • currently not curable (only treatable) [Diagram: diabetes types: T1D, T2D, gestational diabetes, special types]
  4. 4. Diabetes TYPE 1 (T1D) • approx. 5-10 % of diabetes patients have T1D • often starts in childhood • autoimmune reaction • independent of lifestyle • patients need an external insulin source throughout their life • approx. 20 genes involved • currently, T1D is not curable
  5. 5. Diabetes TYPE 2 (T2D) • approx. 90-95 % of diabetes patients have T2D (mostly after age 40) • insulin resistance, pancreas is not able to produce enough insulin • symptoms develop slowly • >150 genes have been identified that increase risk • “the cocktail of evil“: predisposition + overweight + physical inactivity
  6. 6. Some numbers (worldwide) • 1 in 11 adults has diabetes (425 million) • quadrupled since 1980 • 12 % of global health expenditure is spent on diabetes ($727 billion) • over 1 million children and adolescents have type 1 diabetes • two-thirds of people with diabetes are of working age (327 million) • three quarters of people with diabetes live in low- and middle-income countries • 1 in 2 adults with diabetes is undiagnosed (212 million) • Source: International Diabetes Federation (IDF), 2017
  7. 7. Some numbers (USA and Germany) • USA: 30 million have diabetes (9.4 % of adults) [1] • +1’500’000 p.a. • 84 million with prediabetes [2] • $327 billion costs p.a. [1] ($237 billion medical costs, $90 billion reduced productivity) [2] • Germany: 7 million have diabetes (7.4 % of adults) [1] • +500’000 p.a. • ~7 million with prediabetes or undiagnosed • 16 billion € costs p.a. [1] • Sources: [1] www.statistica.com [2] American Diabetes Association
  8. 8. Overweight/obesity in the US (1985-2009) obese adults in the US (BMI* >= 30) *BMI = 30: 5'11" = 220.46 lbs (180 cm = 100 kg)
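The BMI threshold on this slide can be checked with a short calculation. This is a minimal sketch, assuming the standard definition of BMI as weight in kilograms divided by height in metres squared; the function names are illustrative and not from the talk.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in metres squared."""
    return weight_kg / height_m ** 2


def weight_at_bmi(target_bmi: float, height_m: float) -> float:
    """Weight in kilograms that yields target_bmi at the given height."""
    return target_bmi * height_m ** 2


if __name__ == "__main__":
    # The slide's example person: 180 cm, 100 kg.
    print(round(bmi(100, 1.80), 1))
    # Weight at which a 180 cm adult reaches the obesity threshold of BMI 30.
    print(round(weight_at_bmi(30, 1.80), 1))
```

At 180 cm, 100 kg corresponds to a BMI of about 30.9, so the slide's 100 kg is a rounded figure; BMI 30 is reached at roughly 97 kg.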
  9. 9. Complications develop after many years • kidney: diabetic nephropathy, 40 % of kidney failure/dialysis • feet: 70 % of all foot amputations • eyes: diabetic retinopathy, 30 % of loss of sight • brain: 2-4 fold increased risk for stroke • acute cardiac death: main cause of death of diabetic patients (33 % of all heart attacks) • nerves: diabetic neuropathy, amputations of extremities
  10. 10. Complex emergence / complex disease lifestyle gene epigenetics metabolism cellular processes environment
  11. 11. Inherited lifestyle genetically identical epigenetically different
  12. 12. Epigenetics – beyond generation [Plot: weight [g] vs. age [weeks]; daughters of obese mice having diabetes vs. daughters of healthy mice] Huypens and Beckers, Nat Genet. 2016
  13. 13. The German Center for Diabetes Research funded by the Federal Ministry for Education and Research and the states 5 Partners, 5 associated partners – 400 researchers (basic research and university hospitals) DZD bundles competencies so that those affected benefit more quickly from research results. academic, non-profit
  14. 14. The German Center for Diabetes Research hospitals prevention nutrition / diet beta cells genetics therapy clinical studies cohorts basic research healthcare diabetes treatment diabetes prevention prevention of complications
  15. 15. Goal: better diabetes prevention and therapy personalized prevention and therapy identify and cluster diabetes subtypes individualized treatment of subtypes
  16. 16. How do we fight diabetes with graphs?
  17. 17. The challenge Easy question -> Complex query Find information within our organisation
  18. 18. Originally different research areas Hospitals Basic Research Data Analysis
  19. 19. We all “serve“ the same “customer“ Hospitals Basic Research Data Analysis
  20. 20. But we all see the “customer“ a little differently “Patient“ “Gene“ “Study“ “Metabolite“ “drug“ “statistics“ 64kg, 178cm, male C6H12O6 Metformin T2D AAGCTTCACATGG cell insulin resistance inactive mice prediabetic pig microscope image complications
  21. 21. Look at our “customer“ in a new way “Patient“ “Gene“ “Study“ “Metabolite“ “drug“ “statistics“ 64kg, 178cm, male C6H12O6 Metformin T2D AAGCTTCACATGG cell insulin resistance inactive mice prediabetic pig microscope image complications
  22. 22. Look at our “customer“ from many perspectives simultaneously – connect data Hospitals Basic Research Data Analysis data
  23. 23. Connect data – one option Hospitals Basic Research Data Analysis “Patient“ 64kg, 178cm, male “drug“ Metformin “Study“ T2D insulin resistance “Gene“ AAGCTTCACATGG “Metabolite“ C6H12O6 cell inactive mice prediabetic pig “statistics“ microscope image complications
  24. 24. Connect data – better option “Patient“ 64kg, 178cm, male “drug“ Metformin “Study“ T2D “statistics“ “Gene“ AAGCTTCACATGG “Metabolite“ C6H12O6 insulin resistance cell inactive mice prediabetic pig microscope image complications
  25. 25. DZDConnect – a Neo4j graph database Graph that can help answer biomedical questions • across locations • across disciplines • across species • extendable • scalable • visualizable
  26. 26. Homogeneous and heterogeneous data
  27. 27. (First) connect data at a meta level RAW DATA RAW DATA RAW DATA RAW DATA RAW DATA
  28. 28. Classify types of data
  29. 29. Classify types of data clin. study clin. study clin. study statistics statistics RNA DNA RNA DNA images chemistry patient patient patient bio sample bio sample bio sample wet lab chemistry drug
  30. 30. Connect types of data statistics statistics RNA DNA images chemistry patient wet lab chemistry drug patient patient bio sample bio sample bio sample clin. study clin. study clin. study RNA DNA
  31. 31. Build graph model clin. study statistics RNA DNA images bio sample wet lab chemistry drug patient
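The graph model on this slide can be sketched as a small labeled property graph. This is a minimal in-memory illustration in plain Python, not DZDConnect's actual schema: the node labels follow the slide, while the relationship types (`ENROLLED_IN`, `DONATED`, `ANALYZED_BY`) and the `Graph` class are assumptions made for the example.

```python
from collections import defaultdict


class Graph:
    """Tiny in-memory labeled property graph (illustrative only)."""

    def __init__(self):
        self.nodes = {}                 # node id -> {"label": str, "props": dict}
        self.edges = defaultdict(list)  # source id -> [(relationship type, target id)]

    def add_node(self, node_id, label, **props):
        self.nodes[node_id] = {"label": label, "props": props}

    def add_edge(self, source, rel_type, target):
        self.edges[source].append((rel_type, target))

    def neighbors(self, node_id, rel_type=None):
        """Ids of nodes reachable over one outgoing edge, optionally filtered by type."""
        return [t for r, t in self.edges[node_id] if rel_type in (None, r)]


g = Graph()
g.add_node("p1", "Patient", sex="male", weight_kg=64, height_cm=178)
g.add_node("s1", "ClinicalStudy", name="T2D")
g.add_node("b1", "BioSample", type="blood")
g.add_node("e1", "WetLab", method="RNA")

# Relationship types are hypothetical names chosen for this sketch.
g.add_edge("p1", "ENROLLED_IN", "s1")
g.add_edge("p1", "DONATED", "b1")
g.add_edge("b1", "ANALYZED_BY", "e1")

print(g.neighbors("p1"))
print(g.neighbors("p1", "DONATED"))
```

Keeping nodes and typed relationships separate is what lets a new data type be added later without restructuring what is already stored.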
  32. 32. Why graph? • in „biology“ everything is connected anyway • data is connected • human readable – easy to understand for non-computer scientists • easy to query: queries read much like questions asked in natural language • scalable • easy to adopt and extend • visualization
  33. 33. Meta data • name: IL-6, unit: mg/ml, sample: blood, organism: pig, amount: 50ml, aliquots: 362, location: Freezer68 • name: pancreas dissection, format: TIFF, dimension: 3840x2160, amount: 125, staining: no staining, microscope: Zeiss Light sheet Z1, location: Dresden • title: „about diabetes and Alzheimer‘s“, PMID: 1255864, doi: http://doi.102r3d, year: 2016, journal: Diabetes
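Meta data like this maps naturally to key-value properties on a node. The following is a minimal sketch, assuming the property values of the first example on the slide and a hypothetical `comparable` helper that treats two measurements as comparable when name, unit, sample type and organism agree. The second record and the choice of keys are invented for illustration.

```python
# Properties copied from the slide's first example node.
il6_pig = {
    "name": "IL-6", "unit": "mg/ml", "sample": "blood",
    "organism": "pig", "amount": "50ml", "aliquots": 362,
    "location": "Freezer68",
}

# Hypothetical second record, invented for this example.
il6_human = dict(il6_pig, organism="human", aliquots=10, location="FreezerX")

# Keys that must match for two measurements to count as comparable (an assumption).
COMPARABLE_KEYS = ("name", "unit", "sample", "organism")


def comparable(a: dict, b: dict, keys=COMPARABLE_KEYS) -> bool:
    """True if both records agree on every key in `keys`."""
    return all(a.get(k) == b.get(k) for k in keys)


print(comparable(il6_pig, il6_pig))
print(comparable(il6_pig, il6_human))
# Dropping the organism requirement allows a cross-species comparison.
print(comparable(il6_pig, il6_human, keys=("name", "unit", "sample")))
```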
  34. 34. Extend graph Literature protein database other diseases Electronic Laboratory Notebook
  35. 35. lipid metabolism Diabetes is a metabolic disease
  36. 36. Extending our graph RNA-seq proteomics Associations ~800 million nodes ~800 million relationships Dr. Martin Preusse Dr. Nikola Müller
  37. 37. Extending our graph Dr. Jan Krumsiek, Assistant Professor, Weill Cornell Medicine, NYC metabolic pathway data from 15-20 very rich data sources ~900’000 nodes ~1.7 million relationships phenotype associations studies
  38. 38. Summary “Patient“ 64kg, 178cm, male “drug“ Metformin “Study“ T2D “statistics“ “Gene“ AAGCTTCACATGG “Metabolite“ C6H12O6 insulin resistance cell inactive mice prediabetic pig microscope image complications
  39. 39. Examples
  40. 40. How many biosamples were acquired in visit 17 of ‘PLIS‘ and which parameters were measured? Goals: 1. Connect data from our clinical studies and biobanks 2. Researchers can easily browse through measured parameters and available biosamples 3. Meta data of parameters helps to assess which samples are comparable
  41. 41. name: HMGU name: AJ position: data mgmt name: PLIS multi-center: true recruiting: closed analysis: on-going no. of patients: 1105 visit: 17 name: blood type: OGTT number of samples: 3436 organism: Human name: laboratory
  42. 42. Study
  43. 43. Study Person Visit
  44. 44. Study Person Visit BioSample Experiment Parameter
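The question from this example (how many biosamples were acquired in visit 17 of 'PLIS', and which parameters were measured) amounts to following a path from Study to Visit to BioSample to Experiment to Parameter. Below is a minimal sketch in plain Python using the values shown on the example slide (PLIS, visit 17, blood, OGTT, 3436 samples). The relationship names, the dictionary layout and the parameter names `glucose` and `insulin` are assumptions for illustration, not the DZDConnect schema.

```python
# Nodes keyed by id; labels follow the slide, property values come from the example slide.
nodes = {
    "study": {"label": "Study", "name": "PLIS"},
    "visit": {"label": "Visit", "visit": 17},
    "sample": {"label": "BioSample", "name": "blood", "type": "OGTT",
               "number_of_samples": 3436, "organism": "Human"},
    "exp": {"label": "Experiment", "name": "laboratory"},
    # Parameter names are hypothetical placeholders.
    "par1": {"label": "Parameter", "name": "glucose"},
    "par2": {"label": "Parameter", "name": "insulin"},
}

# Directed edges (source, relationship, target); relationship names are assumptions.
edges = [
    ("study", "HAS_VISIT", "visit"),
    ("visit", "COLLECTED", "sample"),
    ("sample", "USED_IN", "exp"),
    ("exp", "MEASURED", "par1"),
    ("exp", "MEASURED", "par2"),
]


def follow(sources, rel):
    """Ids reachable from any node in `sources` over one edge of type `rel`."""
    return [t for s, r, t in edges if s in sources and r == rel]


def biosamples_and_parameters(study_name, visit_no):
    """Total biosample count and sorted parameter names for one visit of one study."""
    studies = [n for n, p in nodes.items()
               if p["label"] == "Study" and p["name"] == study_name]
    visits = [v for v in follow(studies, "HAS_VISIT")
              if nodes[v]["visit"] == visit_no]
    samples = follow(visits, "COLLECTED")
    experiments = follow(samples, "USED_IN")
    parameters = follow(experiments, "MEASURED")
    total = sum(nodes[s]["number_of_samples"] for s in samples)
    return total, sorted(nodes[p]["name"] for p in parameters)


print(biosamples_and_parameters("PLIS", 17))
```

In a graph database the same traversal would be written as a single pattern match rather than as chained lookups, which is what keeps an easy question from turning into a complex query.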
  45. 45. Can human T2D genes be studied in the pre-diabetic pig model? Goals: 1. Connect data from different species (i.e. mice, pig, human) 2. Connect multiomics data 3. Researchers can easily find information between human and comparable data from animal models
  46. 46. genomics transcriptomics metabolomics proteomics
  47. 47. Human GWAS catalog (Diabetes) 103 genes 97 genes 96 genes 16 enzymes 63 compounds 31 compounds 7 compounds 16 metabolites Targeted metabolomics analysis in prediabetic pig ENSEMBL gene names (human) KEGG gene IDs KEGG Enzyme KEGG compounds Biocrates IDs 7/16 metabolites Xxaa C11:0 Xxaa C11:1 Xxaa C11:2 Xxaa C11:3 Xxaa C11:4 Xxaa C11:5 Xxaa C11:6 genomics transcriptomics proteomics metabolomics pathway analysis
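The pathway analysis on this slide is a chain of identifier mappings: human gene names to KEGG gene IDs, to KEGG enzymes, to KEGG compounds, to Biocrates IDs, which are then intersected with the metabolites measured in the pig. The following is a minimal sketch of that chaining. Every identifier below is an invented placeholder, and the real mapping tables would come from the respective databases.

```python
def map_ids(ids, table):
    """Map each id through `table` (id -> set of ids); ids without an entry drop out."""
    mapped = set()
    for i in ids:
        mapped |= table.get(i, set())
    return mapped


# Placeholder mapping tables (all identifiers are invented for illustration).
gene_to_kegg_gene = {"GENE_A": {"kg:1"}, "GENE_B": {"kg:2"}, "GENE_C": set()}
kegg_gene_to_enzyme = {"kg:1": {"ec:1"}, "kg:2": {"ec:1", "ec:2"}}
enzyme_to_compound = {"ec:1": {"cpd:1", "cpd:2"}, "ec:2": {"cpd:3"}}
compound_to_biocrates = {"cpd:1": {"bio:1"}, "cpd:3": {"bio:3"}}

# Metabolites measured in the animal model (placeholders).
measured_in_pig = {"bio:1", "bio:2"}

ids = {"GENE_A", "GENE_B", "GENE_C"}
for table in (gene_to_kegg_gene, kegg_gene_to_enzyme,
              enzyme_to_compound, compound_to_biocrates):
    ids = map_ids(ids, table)
    print(len(ids), sorted(ids))

# Metabolites that are both linked to the human genes and measured in the pig.
overlap = ids & measured_in_pig
print(sorted(overlap))
```

As on the slide, the set can shrink at each step because not every identifier has a counterpart in the next database.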
  48. 48. Outlook
  49. 49. Automatically learn from large literature texts
  50. 50. Natural language processing (NLP) example Identification of genetic elements in metabolism by high-throughput mouse phenotyping. Metabolic diseases are a worldwide problem but the underlying genetic factors and their relevance to metabolic disease remain incompletely understood. Genome-wide research is needed to characterize so-far unannotated mammalian metabolic genes. Here, we generate and analyze metabolic phenotypic data of 2016 knockout mouse strains under the aegis of the International Mouse Phenotyping Consortium (IMPC) and find 974 gene knockouts with strong metabolic phenotypes. 429 of those had no previous link to metabolism and 51 genes remain functionally completely unannotated. We compared human orthologues of these uncharacterized genes in five GWAS consortia and indeed 23 candidate genes, like ABC1, XYZ2, are associated with metabolic disease. We further identify common regulatory elements in promoters of candidate genes. As each regulatory element is composed of several transcription factor binding sites, our data reveal an extensive metabolic phenotype-associated network of co-regulated genes. Our systematic mouse phenotype analysis thus paves the way for full functional annotation of the genome. Metabolic disorders, including obesity and type 2 diabetes mellitus, are major challenges for public health. Rozman and Hrabe de Angelis, Nat Commun. 2018 NLP method by GraphAware
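To illustrate the idea of turning text into graph edges, here is a deliberately naive sketch. It is not GraphAware's NLP method, which the talk does not describe. It uses a regular expression to pick out gene-like symbols (capital letters followed by digits) and links them to a disease term when both occur in the same sentence. The pattern, the `ASSOCIATED_WITH` relationship name and the function name are assumptions.

```python
import re

# Naive pattern for gene-like symbols: two or more capitals followed by digits.
GENE_PATTERN = re.compile(r"\b[A-Z]{2,}[0-9]+\b")


def extract_edges(text, disease_terms):
    """Return (gene, 'ASSOCIATED_WITH', disease) triples for sentence-level co-occurrence."""
    triples = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        genes = GENE_PATTERN.findall(sentence)
        for term in disease_terms:
            if term in sentence.lower():
                triples.extend((g, "ASSOCIATED_WITH", term) for g in genes)
    return triples


# Shortened excerpt of the abstract shown on the slide.
abstract = (
    "Metabolic diseases are a worldwide problem. "
    "We compared human orthologues of these uncharacterized genes in five GWAS "
    "consortia and indeed 23 candidate genes, like ABC1, XYZ2, are associated "
    "with metabolic disease."
)

for triple in extract_edges(abstract, ["metabolic disease"]):
    print(triple)
```

A real pipeline would need proper entity recognition and normalization against gene and disease vocabularies. Simple co-occurrence over-generates links and misses gene names that do not fit the pattern.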
  51. 51. Alzheimer‘s cancer cardio vascular diseases diabetes Lung diseases infectious diseases Find connections...
  52. 52. Machine learning for personalized prevention and therapy identify and cluster diabetes subtypes individualized treatment of subtypes Expert Knowledge validation of personalized treatment Graph Technology
  53. 53. DDPC – Digital Diabetes Prevention Center • pattern recognition in huge amounts of data • (un)supervised ML methods to identify subtypes of diabetes • developing/validating individualized prevention/therapy transparency to people benefit for people benefit for society
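As a rough illustration of what unsupervised subtype identification means, the sketch below runs k-means on invented two-feature patient records. It is not the DDPC's method: the features (fasting glucose, BMI), the data points, and the choice of k-means with fixed starting centroids are all assumptions made for the example.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means on tuples of floats; returns (centroids, cluster index per point)."""
    labels = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid (squared Euclidean).
        labels = [
            min(range(len(centroids)),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])))
            for point in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for c in range(len(centroids)):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return centroids, labels


# Invented records: (fasting glucose in mmol/l, BMI).
points = [(5.0, 22.0), (5.2, 24.0), (5.1, 23.0),
          (8.0, 33.0), (8.4, 35.0), (8.2, 34.0)]

centroids, labels = kmeans(points, centroids=[(5.0, 22.0), (8.0, 33.0)])
print(labels)
print(centroids)
```

In practice the features would be scaled before clustering, and the number of clusters would have to be validated against expert knowledge rather than fixed in advance.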
  54. 54. Next level in diabetes prevention and treatment Hospitals Basic Research Data Analysis
  55. 55. Acknowledgements The scientists of the DZD at: Funding by:
  56. 56. Thank you
