“... a ... transcription factor, ARR1, activates the gene SHY2 ...” (PMID 19039136)
“SHY2 also regulates the cytokinin biosynthesis enzyme IPT5 ...” (PMC 2688277)
“... disruption of IPT1, IPT3, IPT5 and IPT7 resulted in significant reductions in cytokinin (CK) levels ...” (PMC 3280229)
“Cytokinins: metabolism and function in plant adaptation to environmental stresses” (PMID 22236698)
Gene recognition
“Constitutive overexpression of KRP2 slightly above its endogenous level inhibited the mitotic cell cycle-specific CDKA;1 kinase complexes”
➔ Challenge: Lexical variants & synonyms: KRP2, KRP-2, KIP-related protein 2, ICK2
➔ Solution: Dictionaries and/or Machine Learning (ML) to recognize lexical clues such as uppercasing, context words, and abbreviations specifically mentioned in the text
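The dictionary half of this solution can be sketched in a few lines. This is a minimal illustration only: the synonym list is the one from the slide, while the choice of "KRP2" as canonical identifier and the helper names `build_pattern` and `recognize_genes` are assumptions made for the example, not part of any described system.

```python
import re

# Synonyms as listed on the slide; using "KRP2" as the canonical ID is an assumption.
SYNONYMS = {
    "KRP2": ["KRP2", "KRP-2", "KIP-related protein 2", "ICK2"],
}

def build_pattern(synonyms):
    """Compile one regex that matches any known variant as a whole token."""
    lookup = {}
    for canonical, variants in synonyms.items():
        for variant in variants:
            lookup[variant.lower()] = canonical
    # Try longer variants first so multi-word names are not cut short.
    alternatives = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r"(?<![\w-])(" + "|".join(re.escape(a) for a in alternatives) + r")(?![\w-])",
        re.IGNORECASE,
    )
    return pattern, lookup

def recognize_genes(text, synonyms=SYNONYMS):
    """Return (mention, canonical ID) pairs for every dictionary hit in the text."""
    pattern, lookup = build_pattern(synonyms)
    return [(m.group(1), lookup[m.group(1).lower()]) for m in pattern.finditer(text)]

sentence = ("Constitutive overexpression of KRP2 slightly above its endogenous level "
            "inhibited the mitotic cell cycle-specific CDKA;1 kinase complexes")
print(recognize_genes(sentence))
```

A plain dictionary like this only finds variants it already knows; the ML component mentioned on the slide is what would pick up unseen spellings from clues such as casing and context.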
Event extraction
“Constitutive overexpression of KRP2 slightly above its endogenous level inhibited the mitotic cell cycle-specific CDKA;1 kinase complexes”
➔ Challenge: The same interaction can be expressed in many ways in English, with different words & different grammar. E.g. “binding of X with Y” vs. “X and Y interact”
➔ Solution: Supervised ML to learn lexical, syntactic and grammatical patterns from annotated examples, generalize them, and use them for prediction on unseen text
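To make the challenge concrete, the sketch below maps the two phrasings from the slide onto one normalized event. It uses two hand-written regular expressions, which is not the supervised ML approach named above; the patterns only stand in for what a trained model would learn from annotated examples. The event label "Binding" and the function name `extract_binding_events` are assumptions for illustration.

```python
import re

# Hand-written stand-ins for patterns a supervised model would learn (assumption).
PATTERNS = [
    re.compile(r"binding of ([\w;-]+) (?:with|to) ([\w;-]+)", re.IGNORECASE),
    re.compile(r"([\w;-]+) and ([\w;-]+) interact", re.IGNORECASE),
]

def extract_binding_events(sentence):
    """Return normalized ("Binding", {participants}) events found in a sentence."""
    events = []
    for pattern in PATTERNS:
        for match in pattern.finditer(sentence):
            # Binding is symmetric, so the two participants form an unordered pair.
            events.append(("Binding", frozenset(match.group(1, 2))))
    return events

print(extract_binding_events("binding of X with Y"))
print(extract_binding_events("X and Y interact"))
```

Both sentences yield the same event, which is the generalization a learned model has to achieve across far more varied wording than two fixed patterns can cover.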
Unstructured trial texts ➔ Structured data

... were 26 complete responses (16%) and 0 partial responses (0%) ... The median progression-free survival time was 65 months ...

Measure   Value
CR        16%
PR        0%
PFS       65 mo.

One-hundred sixty-seven patients were treated: 91 in arm A, 48 in arm B, and 28 in arm C. The RR was 10% in arm A and zero in arms B and C.

          Patients   RR      Responders
Arm A     91         10%     9
Arm B     48         0%      0
Arm C     28         0%      0
All       167        5.4%    9
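The second table can be reproduced from the sentence with a small amount of parsing and arithmetic. The sketch below is illustrative only: the helper names `arm_sizes` and `build_table` are hypothetical, the per-arm response rates are entered by hand because a phrase like "zero in arms B and C" needs more than a simple pattern, and the responder counts are assumed to be the reported rate times the arm size, rounded.

```python
import re

text = ("One-hundred sixty-seven patients were treated: 91 in arm A, "
        "48 in arm B, and 28 in arm C. The RR was 10% in arm A and zero in arms B and C.")

def arm_sizes(text):
    """Return {arm: patient count} from phrases such as '91 in arm A'."""
    return {arm: int(n) for n, arm in re.findall(r"(\d+) in arm ([A-Z])", text)}

# Response rates typed in by hand from the sentence (assumption, not parsed).
RESPONSE_RATE = {"A": 0.10, "B": 0.0, "C": 0.0}

def build_table(text, rates=RESPONSE_RATE):
    """Return rows of (label, patients, response rate, responders), plus a total row."""
    rows = []
    for arm, n in arm_sizes(text).items():
        responders = round(n * rates[arm])  # assumed: rate x size, rounded
        rows.append((f"Arm {arm}", n, rates[arm], responders))
    total_n = sum(row[1] for row in rows)
    total_responders = sum(row[3] for row in rows)
    rows.append(("All", total_n, total_responders / total_n, total_responders))
    return rows

for label, n, rate, responders in build_table(text):
    print(f"{label:<6} {n:>4} {rate:>6.1%} {responders:>3}")
```

The overall response rate follows from the totals: 9 responders out of 167 patients is about 5.4%.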
from structured databases
Augmented with text mining information
75% of all protein-protein interactions extracted from text were factually correct. Only 35% of them could be found in structured PPI databases.
There is a need for (curated) NLP results to obtain a more complete picture!
Van Landeghem, et al. 2013. The Plant Cell
exposed to 25 mM mannitol
➢ This induces osmotic stress similar to drought
24 samples in total
➢ Time measurements: after 1.5h, 3h, 12h, 24h
➢ 3 biological replicates + control experiments
Skirycz, et al. 2011. Plant Cell
At each time point, the analysis identifies which genes are differentially expressed relative to normal plants that were not exposed to mannitol (drought)
the changes in regulation after 1.5 hours of drought stress
These changes represent the plant’s mechanism for coping with its changing environment!
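One simple way to think about such changes in regulation is as the difference between two edge sets, one per condition. The sketch below is a toy set-difference illustration and not the algorithm of any particular tool; the gene names, edges, and the function name `differential_network` are hypothetical placeholders, not data from the study.

```python
# Toy networks as sets of (source, target, interaction type) edges.
# All genes and edges here are hypothetical placeholders.
reference = {
    ("geneA", "geneB", "activates"),
    ("geneB", "geneC", "inhibits"),
}
stressed = {
    ("geneA", "geneB", "activates"),
    ("geneC", "geneD", "activates"),
}

def differential_network(reference, condition):
    """Split edges into those lost, gained, or unchanged under the condition."""
    return {
        "lost": reference - condition,
        "gained": condition - reference,
        "unchanged": reference & condition,
    }

diff = differential_network(reference, stressed)
for kind, edges in diff.items():
    print(kind, sorted(edges))
```

Real differential network analysis also has to deal with edge weights, directionality, and changes in interaction type, which plain set operations do not capture.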
[Figure: networks under normal conditions vs. drought]
Osmotic use-case: 3 genes were found to be interesting:
• PIL5, HY5, TCH3
• Phenotypic analyses on hy5 mutants under mannitol treatment
• Confirmed that HY5 is involved in mannitol-responsive networks in growing Arabidopsis leaves
Van Landeghem, et al. 2016. BMC Bioinformatics
Stefanie De Bodt, Thomas Van Parys, Zuzanna Drebert, Yves Van de Peer, Dirk Inzé
Turku University: Filip Ginter, Jari Björne, Tapio Salakoski
J&J: Johannes Hermann, Francisco Talamas, Henry Lin
Large-scale text mining resource for PubMed: http://evexdb.org
Cytoscape app for differential network analyses: http://apps.cytoscape.org/apps/diffany