80 research outputs found

    CausalBuilder: bringing the MI2CAST causal interaction annotation standard to the curator

    Molecular causal interactions are defined as regulatory connections between biological components. They are commonly retrieved from biological experiments and can be used for connecting biological molecules together to enable the building of regulatory computational models that represent biological systems. However, including a molecular causal interaction in a model requires assessing its relevance to that model, based on the detailed knowledge about the biomolecules, interaction type and biological context. In order to standardize the representation of this knowledge in ‘causal statements’, we recently developed the Minimum Information about a Molecular Interaction Causal Statement (MI2CAST) guidelines. Here, we introduce causalBuilder: an intuitive web-based curation interface for the annotation of molecular causal interactions that comply with the MI2CAST standard. The causalBuilder prototype essentially embeds the MI2CAST curation guidelines in its interface and makes its rules easy to follow by a curator. In addition, causalBuilder serves as an original application of the Visual Syntax Method general-purpose curation technology and provides both curators and tool developers with an interface that can be fully configured to allow focusing on selected MI2CAST concepts to annotate. After the information is entered, the causalBuilder prototype produces genuine causal statements that can be exported in different formats.

    UniBioDicts: Unified access to biological dictionaries

    We present a set of software packages that provide uniform access to diverse biological vocabulary resources that are instrumental for current biocuration efforts and tools. The Unified Biological Dictionaries (UniBioDicts or UBDs) provide a single query-interface for accessing the online API services of leading biological data providers. Given a search string, UBDs return a list of matching term, identifier and metadata units from databases (e.g. UniProt), controlled vocabularies (e.g. PSI-MI) and ontologies (e.g. GO, via BioPortal). This functionality can be connected to input fields (user-interface components) that offer autocomplete lookup for these dictionaries. UBDs create a unified gateway for accessing life science concepts, helping curators find annotation terms across resources (based on descriptive metadata and unambiguous identifiers), and helping data users search and retrieve the right query terms.
    This is an open access article distributed under the terms of the Creative Commons CC BY license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
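The single query-interface idea described in this abstract can be illustrated with a minimal Python sketch. It is not the UniBioDicts API: the `Entry` and `UnifiedDictionary` names, the in-memory sources, and the `DEMO:` identifiers are all hypothetical, and the real packages query online services rather than local lists.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Entry:
    """One term/identifier/metadata unit (hypothetical structure)."""
    term: str
    identifier: str
    source: str
    description: str = ""


class UnifiedDictionary:
    """Toy single entry point over several vocabulary sources."""

    def __init__(self) -> None:
        self._sources: Dict[str, List[Entry]] = {}

    def register(self, source: str, entries: List[Entry]) -> None:
        # Store a copy so later changes to the caller's list have no effect.
        self._sources[source] = list(entries)

    def search(self, query: str, limit: int = 10) -> List[Entry]:
        """Return entries whose term contains the query, prefix matches first."""
        q = query.strip().lower()
        if not q:
            return []
        hits = [
            entry
            for entries in self._sources.values()
            for entry in entries
            if q in entry.term.lower()
        ]
        # Prefix matches sort before other matches, then alphabetically by term.
        hits.sort(key=lambda e: (not e.term.lower().startswith(q), e.term.lower(), e.source))
        return hits[:limit]


# Toy data only: these are placeholder identifiers, not real database records.
ubd = UnifiedDictionary()
ubd.register("demo-ontology", [
    Entry("apoptotic process", "DEMO:0001", "demo-ontology"),
    Entry("regulation of apoptotic process", "DEMO:0002", "demo-ontology"),
])
ubd.register("demo-proteins", [
    Entry("apoptosis regulator (toy protein)", "DEMO:P1", "demo-proteins"),
])

for hit in ubd.search("apop"):
    print(hit.identifier, hit.term, hit.source)
```

An autocomplete input field could call `search` on each keystroke and display the returned term together with its identifier and source, so that a curator can tell apart identical terms from different resources.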

    A Large-Scale Neutral Comparison Study of Survival Models on Low-Dimensional Data

    This work presents the first large-scale neutral benchmark experiment focused on single-event, right-censored, low-dimensional survival data. Benchmark experiments are essential in methodological research to scientifically compare new and existing model classes through proper empirical evaluation. Existing benchmarks in the survival literature are often narrow in scope, focusing, for example, on high-dimensional data. Additionally, they may lack appropriate tuning or evaluation procedures, or are qualitative reviews, rather than quantitative comparisons. This comprehensive study aims to fill the gap by neutrally evaluating a broad range of methods and providing generalizable conclusions. We benchmark 18 models, ranging from classical statistical approaches to many common machine learning methods, on 32 publicly available datasets. The benchmark tunes for both a discrimination measure and a proper scoring rule to assess performance in different settings. Evaluating on 8 survival metrics, we assess discrimination, calibration, and overall predictive performance of the tested models. Using discrimination measures, we find that no method significantly outperforms the Cox model. However, (tuned) Accelerated Failure Time models were able to achieve significantly better results with respect to overall predictive performance as measured by the right-censored log-likelihood. Machine learning methods that performed comparably well include Oblique Random Survival Forests under discrimination, and Cox-based likelihood-boosting under overall predictive performance. We conclude that for predictive purposes in the standard survival analysis setting of low-dimensional, right-censored data, the Cox Proportional Hazards model remains a simple and robust method, sufficient for practitioners.
    Comment: 42 pages, 28 figures
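A discrimination measure scores how well predicted risks order subjects by their observed event times. The abstract does not name the measures it uses, so the sketch below assumes Harrell's concordance index, a common choice for right-censored data. The function name and the toy inputs are illustrative, and pairs with tied times are simply skipped.

```python
from typing import Sequence


def harrell_c_index(
    times: Sequence[float],
    events: Sequence[bool],
    risk_scores: Sequence[float],
) -> float:
    """Harrell's C for right-censored data.

    A pair (i, j) is comparable when subject i had an observed event and a
    strictly shorter time than subject j. The pair is concordant when the
    model assigns subject i the higher risk; tied risks count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy example: the third subject is censored at time 6.
times = [2.0, 4.0, 6.0, 8.0]
events = [True, True, False, True]
print(harrell_c_index(times, events, [0.9, 0.7, 0.5, 0.1]))  # risks fall as time rises
```

A value of 1.0 means every comparable pair is ordered correctly, 0.5 corresponds to random ordering, and 0.0 means every pair is reversed.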

    IIb-RAD-sequencing coupled with random forest classification indicates regional population structuring and sex-specific differentiation in salmon lice (Lepeophtheirus salmonis)

    The aquaculture industry has been dealing with salmon lice infestations, which pose a serious threat to salmonid farming. Several treatment approaches have been used to control the parasite. Treatment effectiveness must be optimized, and the systematic genetic differences between subpopulations must be studied to monitor louse species and enhance targeted control measures. We have used IIb-RAD sequencing in tandem with a random forest classification algorithm to detect the regional genetic structure of the Norwegian salmon lice and identify important markers for sex differentiation of this species. We identified 19,428 single nucleotide polymorphisms (SNPs) from 95 individuals of salmon lice. These SNPs, however, could not distinguish the differential structure of lice populations. Using the random forest algorithm, we selected 91 SNPs important for geographical classification and 14 SNPs important for sex classification. The geographically important SNP data substantially improved the genetic understanding of the population structure and classified regional demographic clusters along the Norwegian coast. We also uncovered SNP markers that could help determine the sex of the salmon louse. A large portion of the SNPs identified as being under directional selection was also ranked as highly important by random forest. According to our findings, there is a regional population structure of salmon lice associated with the geographical location along the Norwegian coastline.

    Fine tuning a logical model of cancer cells to predict drug synergies: combining manual curation and automated parameterization

    Treatment with combinations of drugs carries great promise for personalized therapy for a variety of diseases. We have previously shown that synergistic combinations of cancer signaling inhibitors can be identified based on a logical framework, by manual model definition. We now demonstrate how automated adjustments of both model topology and logic equations can greatly reduce the workload traditionally associated with logical model optimization. Our methodology allows the exploration of larger model ensembles that all obey a set of observations, while being less constrained for parts of the model where parameterization is not guided by biological data. We benchmark the synergy prediction performance of our logical models in a dataset of 153 targeted drug combinations. We show that well-performing manual models faithfully represent measured biomarker data and that their performance can be outmatched by automated parameterization using a genetic algorithm. Whereas the predictive performance of a curated model is strongly affected by simulated curation errors, data-guided deletion of a small subset of regulatory model edges can significantly improve prediction quality. With correct topology we find evidence of some tolerance to simulated errors in the biomarker calibration data, yet performance decreases with reduced data quality. Moreover, we show that predictive logical models are valuable for proposing mechanisms underpinning observed synergies. With our framework we predict the synergy of joint inhibition of PI3K and TAK1, and further substantiate this prediction with observations in cancer cell cultures and in xenograft experiments.
    Funding: NTNU Health. Liaison Committee between the Central Norway Regional Health Authority (RHA) and the Norwegian University of Science and Technology (NTNU). ERA PerMed ONCOLOGICS. The Research Council of Norway (RCN) under the framework of the European Research Area (ERA) PerMed program, grant 329059.
    Peer reviewed. Postprint (published version).

    jones-lab-tamu/usefun: v1.0.0

    CBAS implements the precision-recall bootstrap method of statistical comparison from this repository/version in Python instead of R. All credit goes to the original authors. Please visit their repository: https://github.com/bblodfon/usefun