
Workshop negations

Using SVMs with the Command Relation Feature to Identify Negated Events in Biomedical Literature


  1. 1. Using SVMs with the Command Relation Feature to Identify Negated Events in Biomedical Literature • Farzaneh Sarafraz (sarafraf@cs.man.ac.uk), Goran Nenadic (g.nenadic@manchester.ac.uk) • School of Computer Science, University of Manchester
  2. 2. Outline • Motivation & aim • Molecular events • Data & experiments • Methods • Discussion • Summary 2 / 27
  3. 3. Motivation & aim • Biomedical literature • 2000 papers published every day • Biomedical information extraction needed • Improve IE by negation information • Negative results are interesting and reported • “The IKK complex, but not p90 (rsk), is responsible for the in vivo phosphorylation of I-kappa-B-alpha.” • Resources • Shared tasks, data • Linguistic tools (syntactic parsers) 3 / 27
  4. 4. Problem statement • Given • PubMed abstracts • Protein/gene mentions annotated • Molecular events annotated • Wanted for every event • Negated or not • Classification problem 4 / 27
  5. 5. Molecular events • “We further show that Nmi interacts with all STATs except Stat2.” • An event consists of a trigger and its participants • Event type: {binding, transcription, regulation, expression} • Participation type: {theme, cause} • Participant type: {gene/protein, event}
  6. 6. Molecular events – class I • One theme (gene/protein) • “The effect of this synergism was perceptible at the level of induction of the IL-2 gene.” • Trigger: induction • Type: gene expression • Theme: IL-2 • Types: transcription, gene expression, phosphorylation, protein catabolism, localization 6 / 27
  7. 7. Molecular events – class II • One or more themes (gene/protein) • “We further show that Nmi interacts with all STATs except Stat2.” • Trigger: interacts • Type: binding • Themes: Nmi, Stat2 • Negated • Type: Binding 7 / 27
  8. 8. Molecular events – class III • 1 theme, 0 or 1 cause • may be gene/protein or other events • Types: regulation types • “Overexpression of full-length ALG-4 induced transcription of FasL and, consequently, apoptosis.”
      Event     Trigger            Type              Theme     Cause
      Event 1   “transcription”    Transcription     FasL
      Event 2   “Overexpression”   Gene expression   ALG-4
      Event 3   “Overexpression”   Regulation        Event 2
      Event 4   “induced”          Regulation        Event 1   Event 3
  9. 9. Data: BioNLP’09 • Training: 800 abstracts • Test: 260 abstracts • Gold annotations • Event trigger, type, participants, negation • Negation cue not annotated
      Event class   Training data       Development data    Test data
                    total    negated    total    negated
      Class I       2,858    131        559      26
      Class II      887      44         249      15
      Class III     4,870    440        987      66
      Total         9,685    615        1,795    107
  10. 10. Methodologies • Rule-based • The command relation • Classification • SVM on event representation • Lexical features: negation cue, POS • Syntactic features: command • Semantic features: event types • Baseline • NegEx: event triggers as “terms” 10 / 27
  11. 11. Evaluation measures
      $\text{Precision} = \frac{TP}{TP + FP}$
      $\text{Recall} = \text{Sensitivity} = \frac{TP}{TP + FN}$
      $F_1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$
      $\text{Specificity} = \frac{TN}{TN + FP}$
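The four evaluation measures can be computed directly from confusion-matrix counts. The following is a minimal sketch; the function name `evaluation_measures` and the example counts are hypothetical and serve only to exercise the formulas. A measure whose denominator is zero is returned as `None`.

```python
def evaluation_measures(tp, fp, tn, fn):
    """Return precision, recall, F1 and specificity from confusion counts.

    A measure whose denominator is zero is undefined and returned as None.
    """
    precision = tp / (tp + fp) if (tp + fp) else None
    recall = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    if precision is None or recall is None or (precision + recall) == 0:
        f1 = None
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "specificity": specificity,
    }


# Hypothetical counts, not taken from the slides.
scores = evaluation_measures(tp=30, fp=10, tn=950, fn=10)
print(scores)
```

Because negated events are rare in this task, specificity stays high even for weak classifiers, so precision, recall and F1 are the more informative columns.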
  12. 12. Baseline results
      Approach                   P      R      F1     Spec.
      No negation detection      -      0%     -      94%
      Any negation cue present   20%    78%    32%    81%
      NegEx                      36%    37%    36%    93%
  13. 13. The command relation • If a and b are nodes in the constituency parse tree of a sentence, then a X-commands b iff the lowest ancestor of a with label X is also an ancestor of b. Ronald Langacker, On Pronominalization and the Chain of Command, in D. Reibel and S. Schane (eds.) Modern Studies in English, Prentice-Hall, Englewood Cliffs, NJ. 160-186. 1969. 13 / 27
  14. 14. Example of the command relation • Tree diagram: an outer S dominates a and an inner S; the inner S dominates b • a S-commands b. • b does not S-command a.
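The definition of X-command can be sketched in a few lines of code. This is an illustration only: the `Node` class and the function names are hypothetical, "ancestor" is taken to mean proper ancestor, and the example tree (an outer S over a and an inner S over b) is the simplest shape consistent with the two statements on the slide.

```python
class Node:
    """A constituency-tree node with a label and a parent pointer."""

    def __init__(self, label, children=()):
        self.label = label
        self.parent = None
        self.children = list(children)
        for child in self.children:
            child.parent = self


def ancestors(node):
    """Yield the proper ancestors of node, nearest first."""
    current = node.parent
    while current is not None:
        yield current
        current = current.parent


def x_commands(a, b, label):
    """True iff the lowest ancestor of a labelled `label` is also an ancestor of b."""
    for anc in ancestors(a):
        if anc.label == label:
            return any(anc is other for other in ancestors(b))
    return False  # a has no ancestor with that label


# Assumed example tree: outer S dominates a and an inner S; inner S dominates b.
a = Node("a")
b = Node("b")
inner_s = Node("S", [b])
outer_s = Node("S", [a, inner_s])

print(x_commands(a, b, "S"))  # True
print(x_commands(b, a, "S"))  # False
```

The relation is asymmetric here because the lowest S above b is the inner S, which does not dominate a.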
  15. 15. X-command in action • Parse tree, in bracket notation: [S We now [VP show that [S a mutant motif that exchanges the terminal 3' C for a G [VP fails to bind the p50 homodimer.]]]]
  16. 16. Rule-based method • An event is negated if • A negation cue exists; and • One of the following holds (each evaluated as a separate rule variant) • Negation cue S-commands any participant • Negation cue S-commands trigger • Negation cue S-commands both • Negation cue VP-commands both 16 / 27
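The rule variants can be sketched as one function with a selectable variant. This is a hedged illustration, not the authors' implementation: each node is represented by a hypothetical list of its ancestors as `(node_id, label)` pairs, "both" is assumed to mean the trigger and at least one participant, and the example paths are invented, loosely modelled on "... fails to bind the p50 homodimer".

```python
def x_commands(path_a, path_b, label):
    """True iff the lowest `label` ancestor of a is also an ancestor of b.

    Each path lists a node's ancestors as (node_id, label) pairs, root first.
    """
    for ancestor in reversed(path_a):
        if ancestor[1] == label:
            return ancestor in path_b
    return False


def is_negated(cue, trigger, participants, variant):
    """Apply one rule variant; `cue` is None when no negation cue exists."""
    if cue is None:
        return False
    any_participant_s = any(x_commands(cue, p, "S") for p in participants)
    if variant == "S-commands any participant":
        return any_participant_s
    if variant == "S-commands trigger":
        return x_commands(cue, trigger, "S")
    if variant == "S-commands both":
        # Assumption: "both" = the trigger and at least one participant.
        return x_commands(cue, trigger, "S") and any_participant_s
    if variant == "VP-commands both":
        # Same assumption about "both", with VP as the commanding label.
        return x_commands(cue, trigger, "VP") and any(
            x_commands(cue, p, "VP") for p in participants
        )
    raise ValueError(f"unknown variant: {variant}")


# Hypothetical ancestor paths (not produced by a real parser).
S0, VP1, S2, VP3, NP4 = (0, "S"), (1, "VP"), (2, "S"), (3, "VP"), (4, "NP")
cue = [S0, VP1, S2, VP3]          # "fails"
trigger = [S0, VP1, S2, VP3]      # "bind"
theme = [S0, VP1, S2, VP3, NP4]   # "p50"
outside = [S0]                    # a mention outside the embedded clause

print(is_negated(cue, trigger, [theme], "VP-commands both"))    # True
print(is_negated(cue, trigger, [outside], "VP-commands both"))  # False
```

VP-command is the stricter test: the cue's lowest VP covers less of the sentence than its lowest S, which fits a rule that trades recall for precision.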
  17. 17. Results of rule-based method
      Approach                                   P      R      F1     Spec.
      Negation cue S-commands any participant    23%    76%    35%    84%
      Negation cue S-commands trigger            23%    68%    34%    85%
      Negation cue S-commands both               23%    68%    35%    86%
      Negation cue VP-commands both              42%
  18. 18. SVM features • Semantic features • Event type • Lexical features • Sentence contains negation cue • Negation cue • Syntactic features • POS of neg cue • POS of event trigger • POS of the participants • Parse tree distance between trigger & cue • Type of smallest phrase containing trigger & cue • Cue S-commands any participant • Cue S-commands trigger 18 / 27
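To make the feature list concrete, the sketch below maps one event to named features and then to a numeric row of the kind an SVM library expects. Everything here is an assumption for illustration: the event dictionary schema, the feature names, the `NONE` and `-1` placeholders for a missing cue, and the example values are hypothetical and not taken from the authors' system.

```python
def event_features(event):
    """Map an event description (hypothetical dict schema) to named features."""
    cue = event.get("cue")  # None when the sentence has no negation cue
    return {
        "event_type": event["type"],
        "has_cue": cue is not None,
        "cue": cue["word"] if cue else "NONE",
        "cue_pos": cue["pos"] if cue else "NONE",
        "trigger_pos": event["trigger_pos"],
        "participant_pos": "+".join(sorted(event["participant_pos"])),
        "smallest_phrase": event.get("smallest_phrase", "NONE"),
        "cue_s_commands_participant": bool(event.get("cue_s_commands_participant")),
        "cue_s_commands_trigger": bool(event.get("cue_s_commands_trigger")),
        "tree_distance": event.get("tree_distance", -1),
    }


def one_hot(feature_dicts):
    """Encode feature dicts as numeric rows; categorical values become 0/1 columns."""
    columns = []
    for feats in feature_dicts:
        for name, value in feats.items():
            key = name if isinstance(value, (bool, int)) else f"{name}={value}"
            if key not in columns:
                columns.append(key)
    rows = []
    for feats in feature_dicts:
        row = dict.fromkeys(columns, 0)
        for name, value in feats.items():
            if isinstance(value, (bool, int)):
                row[name] = int(value)
            else:
                row[f"{name}={value}"] = 1
        rows.append([row[c] for c in columns])
    return columns, rows


# Two hypothetical events: one with a negation cue, one without.
events = [
    {"type": "binding", "cue": {"word": "except", "pos": "IN"},
     "trigger_pos": "VBZ", "participant_pos": ["NNP", "NNP"],
     "smallest_phrase": "VP", "cue_s_commands_participant": True,
     "cue_s_commands_trigger": True, "tree_distance": 4},
    {"type": "gene expression", "cue": None,
     "trigger_pos": "NN", "participant_pos": ["NN"]},
]
columns, rows = one_hot([event_features(e) for e in events])
print(len(columns), rows[0])
```

The resulting rows could be passed to any SVM implementation; the choice of library and kernel is not stated on the slides.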
  19. 19. Results of single SVM, incremental feature sets
      Feature set      P      R      F1     Spec.
      Features 1-7     43%    8%     14%    99.2%
      Features 1-8     73%    19%    30%    99.3%
      Features 1-9     71%    38%    49%    99.2%
      Features 1-10    76%    38%    51%    99.2%
  23. 23. Results of single SVM, incremental feature sets
      Features:
      1. Event type
      2. Sentence contains neg cue
      3. Neg cue
      4. POS of neg cue
      5. POS of event trigger
      6. POS of the participants
      7. Type of smallest phrase containing trigger & cue
      8. Cue S-commands any participant
      9. Cue S-commands trigger
      10. Parse tree distance between trigger & cue
      Feature set      P      R      F1     Spec.
      Features 1-7     43%    8%     14%    99.2%
      Features 1-8     73%    19%    30%    99.3%
      Features 1-9     71%    38%    49%    99.2%
      Features 1-10    76%    38%    51%    99.2%
  24. 24. Results of separate SVMs for each class
      Event class                     P       R      F1     Spec.
      Class I (559 events)            94%     65%    77%    99.8%
      Class II (249 events)           100%    33%    50%    100%
      Class III (987 events)          81%     44%    57%    99.2%
      Micro-average (1,795 events)    88%     49%    63%    99.4%
      Macro-average (3 classes)       92%     47%    62%    99.7%
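The macro-average row can be checked against the per-class rows. The sketch below assumes that the macro-average is the unweighted mean over the three classes and that the macro F1 is computed from the macro-averaged precision and recall; the second point is an assumption, chosen because it reproduces the reported 62% (the plain mean of the per-class F1 values is about 61%). The micro-average pools counts across classes and cannot be recomputed from the percentages alone.

```python
# Per-class percentages as reported in the results table.
per_class = {
    "Class I": {"P": 94.0, "R": 65.0, "Spec": 99.8},
    "Class II": {"P": 100.0, "R": 33.0, "Spec": 100.0},
    "Class III": {"P": 81.0, "R": 44.0, "Spec": 99.2},
}


def macro_average(scores, measure):
    """Unweighted mean of one measure over all classes."""
    return sum(s[measure] for s in scores.values()) / len(scores)


macro_p = macro_average(per_class, "P")
macro_r = macro_average(per_class, "R")
macro_spec = macro_average(per_class, "Spec")
# Assumption: macro F1 is derived from macro P and macro R.
macro_f1 = 2 * macro_p * macro_r / (macro_p + macro_r)

print(round(macro_p), round(macro_r), round(macro_f1), round(macro_spec, 1))
```

Rounded, these values agree with the macro-average row: 92%, 47%, 62% and 99.7%.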
  25. 25. Future work • Use class-specific features • Study other variants of command • Combine negation detection with automatic event detection instead of using ‘gold’ events • Use negation detection on a larger scale dataset (MEDLINE) to find contradictions & contrasts in the biomedical literature 25 / 27
  26. 26. Conclusions • SVM for extracting negated events • >99% specificity • 63% F-measure (micro average) • Different classes of events behave differently • To detect negated molecular events • Event trigger & surface distances not enough • Semantic & command features useful • Event participants as important as triggers • Apply on large scale data – MEDLINE 26 / 27
  27. 27. Acknowledgements • Organisers of BioNLP’09 • GN TEAM • Casey Bergman’s lab – Faculty of Life Sciences, University of Manchester • James Eales – University of Manchester • Jonathan Caruana – University College London • Web service soon available at http://gnode1.mib.man.ac.uk/negmole 27 / 27
  28. 28. X-command in action • Parse tree, in bracket notation: [S We now [VP show that [S a mutant motif that exchanges the terminal 3' C for a G [VP fails to bind the p50 homodimer that [S is upregulated in LPS tolerant human Mono Mac 6 cells.]]]]]
