Some Perspectives on Network Modeling in Therapeutic Target Prediction
Drug target identification is of significant commercial interest to
pharmaceutical companies, and there is a vast amount of research done related
to the topic of therapeutic target identification. Interdisciplinary research
in this area involves both the biological network community and the graph
algorithms community. Key steps of a typical therapeutic target identification
problem include synthesizing or inferring the complex network of interactions
relevant to the disease, connecting this network to the disease-specific
behavior, and predicting which components are key mediators of the behavior.
All of these steps involve graph theoretical or graph algorithmic aspects. In
this perspective, we provide modelling and algorithmic perspectives for
therapeutic target identification and highlight a number of algorithmic
advances that have received relatively little attention so far, in the hope
of strengthening the ties between these two research communities.
Congenial Causal Inference with Binary Structural Nested Mean Models
Structural nested mean models (SNMMs) are among the fundamental tools for
inferring causal effects of time-dependent exposures from longitudinal studies.
With binary outcomes, however, current methods for estimating multiplicative
and additive SNMM parameters suffer from variation dependence between the
causal SNMM parameters and the non-causal nuisance parameters. Estimating
methods for logistic SNMMs do not suffer from this dependence. Unfortunately,
in contrast with the multiplicative and additive models, unbiased estimation of
the causal parameters of a logistic SNMM relies on additional modeling
assumptions even when the treatment probabilities are known. These difficulties
have hindered the uptake of SNMMs in epidemiological practice, where binary
outcomes are common. We solve the variation dependence problem for the binary
multiplicative SNMM by a reparametrization of the non-causal nuisance
parameters. Our novel nuisance parameters are variation independent of the
causal parameters, and hence allow the fitting of a multiplicative SNMM by
unconstrained maximum likelihood. They also allow one to construct true (i.e.,
congenial) doubly robust estimators of the causal parameters. Along the way, we
prove that an additive SNMM with binary outcomes does not admit a variation
independent parametrization, thus explaining why we restrict ourselves to the
multiplicative SNMM.
New challenges for text mining: mapping between text and manually curated pathways
Background: Associating literature with pathways poses new challenges to the Text Mining (TM) community. There are three main challenges to this task: (1) the identification of the mapping position of a specific entity or reaction in a given pathway, (2) the recognition of the causal relationships among multiple reactions, and (3) the formulation and implementation of required inferences based on biological domain knowledge.
Results: To address these challenges, we constructed new resources to link the text with a model pathway: the GENIA pathway corpus with event annotation and the NF-kB pathway. Through their detailed analysis, we address the untapped resource, ‘bio-inference,’ as well as the differences between text and pathway representation. Here, we show the precise comparisons of their representations and the nine classes of ‘bio-inference’ schemes observed in the pathway corpus.
Conclusions: We believe that the creation of such rich resources and their detailed analysis is a significant first step for accelerating research on the automatic construction of pathways from text.
MKEM: a Multi-level Knowledge Emergence Model for mining undiscovered public knowledge
Background: Since Swanson proposed the Undiscovered Public Knowledge (UPK) model, there have been many approaches to uncover UPK by mining the biomedical literature. These earlier works, however, required substantial manual intervention to reduce the number of possible connections and were mainly applied to disease-effect relations. With the advancement in biomedical science, it has become imperative to extract and combine information from multiple disjoint research efforts, studies and articles to infer new hypotheses and expand knowledge.
Methods: We propose MKEM, a Multi-level Knowledge Emergence Model, to discover implicit relationships using Natural Language Processing techniques such as Link Grammar and ontologies such as Unified Medical Language System (UMLS) MetaMap. The contribution of MKEM is as follows: First, we propose a flexible knowledge emergence model to extract implicit relationships across different levels, such as the molecular level for genes and proteins and the phenomic level for diseases and treatments. Second, we employ MetaMap for tagging biological concepts. Third, we provide an empirical and systematic approach to discover novel relationships.
Results: We applied our system to 5000 abstracts downloaded from the PubMed database. We carried out a performance evaluation, as a gold standard is not yet available. Our system performed with good precision and recall, and we generated 24 hypotheses.
Conclusions: Our experiments show that MKEM is a powerful tool to discover hidden relationships residing in extracted entities that were represented by our Substance-Effect-Process-Disease-Body Part (SEPDB) model.
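Swanson's UPK idea, referred to in the abstract above, is commonly summarised as an "ABC" pattern: if one body of literature links A to B and a separate one links B to C, then an A to C link is a candidate hypothesis. The sketch below illustrates only that generic pattern. It is not the MKEM pipeline (no Link Grammar, MetaMap, or multi-level model), and the function name `abc_hypotheses` and all entity labels are hypothetical placeholders.

```python
from collections import defaultdict


def abc_hypotheses(relations):
    """Return candidate A->C links supported by at least one intermediate B.

    relations: iterable of (a, b) pairs, each meaning "a is linked to b"
    in some article. The result maps (a, c) to the set of bridging terms b.
    """
    known = set(relations)
    successors = defaultdict(set)
    for a, b in known:
        successors[a].add(b)

    hypotheses = defaultdict(set)
    for a, b in known:
        for c in successors.get(b, ()):
            # Keep only links that are not already stated directly.
            if c != a and (a, c) not in known:
                hypotheses[(a, c)].add(b)
    return dict(hypotheses)


# Toy relations with made-up labels (assumption, not data from the paper).
toy_relations = [
    ("substance_S", "effect_E"),
    ("effect_E", "disease_D"),
    ("substance_S", "process_P"),
    ("process_P", "disease_D"),
    ("substance_S", "body_part_B"),
]
candidates = abc_hypotheses(toy_relations)
```

Here the only candidate is the link from `substance_S` to `disease_D`, bridged by both `effect_E` and `process_P`. A real system would still need to rank and filter such candidates.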
Inferring User Knowledge Level from Eye Movement Patterns
The acquisition of information and the search interaction process are influenced strongly by a person’s use of their knowledge of the domain and the task. In this paper we show that a user’s level of domain knowledge can be inferred from their interactive search behaviors without considering the content of queries or documents. A technique is presented to model a user’s information acquisition process during search using only measurements of eye movement patterns. In a user study (n=40) of search in the domain of genomics, a representation of the participant’s domain knowledge was constructed using self-ratings of knowledge of genomics-related terms (n=409). Cognitive effort features associated with reading eye movement patterns were calculated for each reading instance during the search tasks. The results show correlations between the cognitive effort due to reading and an individual’s level of domain knowledge. We construct exploratory regression models that suggest it is possible to build models that can make predictions of the user’s level of knowledge based on real-time measurements of eye movement patterns during a task session.
HyperTraPS: Inferring probabilistic patterns of trait acquisition in evolutionary and disease progression pathways
The explosion of data throughout the biomedical sciences provides unprecedented opportunities to learn about the dynamics of evolution and disease progression, but harnessing these large and diverse datasets remains challenging. Here, we describe a highly generalisable statistical platform to infer the dynamic pathways by which many, potentially interacting, discrete traits are acquired or lost over time in biomedical systems. The platform uses HyperTraPS (hypercubic transition path sampling) to learn progression pathways from cross-sectional, longitudinal, or phylogenetically-linked data with unprecedented efficiency, readily distinguishing multiple competing pathways, and identifying the most parsimonious mechanisms underlying given observations. Its Bayesian structure quantifies uncertainty in pathway structure and allows interpretable predictions of behaviours, such as which symptom a patient will acquire next. We exploit the model’s topology to provide visualisation tools for intuitive assessment of multiple, variable pathways. We apply the method to ovarian cancer progression and the evolution of multidrug resistance in tuberculosis, demonstrating its power to reveal previously undetected dynamic pathways.
Joint Structure Learning of Multiple Non-Exchangeable Networks
Several methods have recently been developed for joint structure learning of
multiple (related) graphical models or networks. These methods treat individual
networks as exchangeable, such that each pair of networks is equally
encouraged to have similar structures. However, in many practical applications,
exchangeability in this sense may not hold, as some pairs of networks may be
more closely related than others, for example due to group and sub-group
structure in the data. Here we present a novel Bayesian formulation that
generalises joint structure learning beyond the exchangeable case. In addition
to a general framework for joint learning, we (i) provide a novel default prior
over the joint structure space that requires no user input; (ii) allow for
latent networks; (iii) give an efficient, exact algorithm for the case of time
series data and dynamic Bayesian networks. We present empirical results on
non-exchangeable populations, including a real data example from biology, where
cell-line-specific networks are related according to genomic features.
Comment: To appear in Proceedings of the Seventeenth International Conference
on Artificial Intelligence and Statistics (AISTATS)
Integrated Bio-Entity Network: A System for Biological Knowledge Discovery
A significant part of our biological knowledge is centered on relationships between biological entities (bio-entities) such as proteins, genes, small molecules, pathways, gene ontology (GO) terms and diseases. Accumulated at an increasing speed, the information on bio-entity relationships is archived in different forms at scattered places. Most of such information is buried in scientific literature as unstructured text. Organizing heterogeneous information in a structured form not only facilitates study of biological systems using integrative approaches, but also allows discovery of new knowledge in an automatic and systematic way. In this study, we performed a large scale integration of bio-entity relationship information from both databases containing manually annotated, structured information and automatic information extraction of unstructured text in scientific literature. The relationship information we integrated in this study includes protein–protein interactions, protein/gene regulations, protein–small molecule interactions, protein–GO relationships, protein–pathway relationships, and pathway–disease relationships. The relationship information is organized in a graph data structure, named integrated bio-entity network (IBN), where the vertices are the bio-entities and edges represent their relationships. Under this framework, graph theoretic algorithms can be designed to perform various knowledge discovery tasks. We designed breadth-first search with pruning (BFSP) and most probable path (MPP) algorithms to automatically generate hypotheses, that is, the indirect relationships with high probabilities in the network. We show that IBN can be used to generate plausible hypotheses, which not only help to better understand the complex interactions in biological systems, but also provide guidance for experimental designs.
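The notion of a most probable path between two bio-entities, mentioned in the abstract above, can be illustrated with a standard technique: if each edge carries a probability and edges are treated as independent, maximising the product of probabilities along a path is equivalent to minimising the sum of -log(p), which Dijkstra's algorithm handles. The sketch below is an illustration under those assumptions, not the authors' MPP or BFSP implementation. The directed adjacency-list format, the function name `most_probable_path`, and the toy entities and probabilities are all hypothetical.

```python
import heapq
import math


def most_probable_path(graph, source, target):
    """Find the path from source to target with the largest probability product.

    graph: dict mapping a node to a list of (neighbor, probability) pairs,
    with probabilities in (0, 1]. Returns (path, probability), or (None, 0.0)
    if the target cannot be reached.
    """
    best = {source: 0.0}   # smallest accumulated -log(p) found so far
    prev = {}              # predecessor on the best known path
    heap = [(0.0, source)]
    done = set()
    while heap:
        cost, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if node == target:
            break
        for nbr, p in graph.get(node, []):
            if p <= 0.0:
                continue  # an impossible edge cannot lie on any path
            new_cost = cost - math.log(p)
            if new_cost < best.get(nbr, math.inf):
                best[nbr] = new_cost
                prev[nbr] = node
                heapq.heappush(heap, (new_cost, nbr))
    if target not in best:
        return None, 0.0
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path, math.exp(-best[target])


# Toy network with made-up entities and edge probabilities (assumption).
toy_ibn = {
    "drug_X": [("protein_A", 0.9), ("protein_B", 0.5)],
    "protein_A": [("pathway_P", 0.8)],
    "protein_B": [("pathway_P", 0.9), ("disease_D", 0.3)],
    "pathway_P": [("disease_D", 0.7)],
}
path, prob = most_probable_path(toy_ibn, "drug_X", "disease_D")
```

In this toy network the route through `protein_A` and `pathway_P` wins with probability 0.9 × 0.8 × 0.7 = 0.504, ahead of the shorter route through `protein_B` (0.5 × 0.3 = 0.15). Working in log space also avoids numerical underflow on long paths.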
In Silico Approaches and the Role of Ontologies in Aging Research
The 2013 Rostock Symposium on Systems Biology and Bioinformatics in Aging Research was again dedicated to dissecting the aging process using in silico means. A particular focus was on ontologies, as these are a key technology to systematically integrate heterogeneous information about the aging process. Related topics were databases and data integration. Other talks tackled modeling issues and applications, the latter including talks focussed on marker development and cellular stress as well as on diseases, in particular on diseases of kidney and skin.