
    Efficient representation of uncertainty in multiple sequence alignments using directed acyclic graphs

    Background: A standard procedure in many areas of bioinformatics is to use a single multiple sequence alignment (MSA) as the basis for various types of analysis. However, downstream results may be highly sensitive to the alignment used, and neglecting the uncertainty in the alignment can lead to significant bias in the resulting inference. In recent years, a number of approaches have been developed for probabilistic sampling of alignments, rather than simply generating a single optimum. However, this type of probabilistic information is currently not widely used in the context of downstream inference, since most existing algorithms are set up to make use of a single alignment.
    Results: In this work we present a framework for representing a set of sampled alignments as a directed acyclic graph (DAG) whose nodes are alignment columns; each path through this DAG then represents a valid alignment. Since the probabilities of individual columns can be estimated from empirical frequencies, this approach enables sample-based estimation of posterior alignment probabilities. Moreover, due to conditional independencies between columns, the graph structure encodes a much larger set of alignments than the original set of sampled MSAs, such that the effective sample size is greatly increased.
    Conclusions: The alignment DAG provides a natural way to represent a distribution in the space of MSAs, and allows for existing algorithms to be efficiently scaled up to operate on large sets of alignments. As an example, we show how this can be used to compute marginal probabilities for tree topologies, averaging over a very large number of MSAs. This framework can also be used to generate a statistically meaningful summary alignment; example applications show that this summary alignment is consistently more accurate than the majority of the alignment samples, leading to improvements in downstream tree inference.
Implementations of the methods described in this article are available at http://statalign.github.io/WeaveAlign
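
The idea of a DAG whose nodes are alignment columns can be illustrated with a small Python sketch. This is not the WeaveAlign implementation. The column encoding (each sequence contributes the number of residues consumed so far plus a residue/gap flag), the function names, and the toy alignments are all assumptions made for illustration. The sketch assumes every column contains at least one residue, which keeps the graph acyclic.

```python
from collections import Counter, defaultdict

START, END = "START", "END"

def column_keys(alignment):
    """Encode each column of an alignment (list of equal-length gapped strings).

    Assumed encoding: per sequence, (residues consumed so far, is_residue).
    Identical columns from different sampled alignments get identical keys.
    """
    consumed = [0] * len(alignment)
    keys = []
    for j in range(len(alignment[0])):
        key = []
        for i, row in enumerate(alignment):
            is_residue = row[j] != "-"
            key.append((consumed[i], is_residue))
            if is_residue:
                consumed[i] += 1
        keys.append(tuple(key))
    return keys

def build_dag(samples):
    """Merge sampled alignments into one graph and estimate column frequencies."""
    col_counts = Counter()
    edges = defaultdict(set)
    for aln in samples:
        path = [START] + column_keys(aln) + [END]
        col_counts.update(path[1:-1])
        for u, v in zip(path, path[1:]):
            edges[u].add(v)
    freqs = {col: n / len(samples) for col, n in col_counts.items()}
    return edges, freqs

def count_paths(edges, node=START, memo=None):
    """Count START-to-END paths, i.e. the alignments the graph encodes."""
    if memo is None:
        memo = {}
    if node == END:
        return 1
    if node not in memo:
        memo[node] = sum(count_paths(edges, nxt, memo) for nxt in edges[node])
    return memo[node]

# Two toy sampled alignments of the hypothetical sequences "ABC" and "XBY".
samples = [
    ["ABC-",
     "XB-Y"],
    ["A-BC",
     "-XBY"],
]
edges, freqs = build_dag(samples)
n_paths = count_paths(edges)
```

Here the two samples share only the column that aligns the two `B` residues, yet they differ both before and after it, so the graph encodes four distinct alignments (`n_paths` is 4) from two samples. That shared column has an empirical frequency of 1.0 in `freqs`, while every other column has 0.5.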

    An alternating least square based algorithm for predicting patient survivability

    Breast cancer is the most common cancer among women worldwide. Using machine learning to predict breast-cancer patients' survivability has drawn considerable research interest. However, the task still faces many issues, such as missing-value imputation. The main objective of this paper is therefore to develop a novel imputation algorithm inspired by recommendation systems. More precisely, features with missing values are regarded as items to be evaluated for recommendation. A matrix factorisation algorithm (Alternating Least Squares, ALS) is then employed to replace missing values, and four different prediction strategies based on the ALS result are discussed. The proposed ALS-based imputation algorithm is evaluated using a large patient dataset from the Surveillance, Epidemiology, and End Results (SEER) program. Experimental results demonstrate a significant improvement in survivability prediction compared to existing methods.
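
As a rough illustration of ALS-style imputation, the following Python sketch fills missing cells of a patient-by-feature matrix with a rank-1 factorisation. It is a simplified stand-in, not the paper's algorithm. The rank-1 restriction, the regularisation value, the function name, and the toy matrix are assumptions, and the four prediction strategies mentioned in the abstract are not reproduced.

```python
def als_impute_rank1(X, reg=1e-3, n_iters=200):
    """Fill None entries of X (list of rows) with a rank-1 model x_ij ~= u_i * v_j.

    Alternates two closed-form ridge least-squares updates, each using
    only the observed entries: first every u_i with v fixed, then every
    v_j with u fixed.
    """
    n, m = len(X), len(X[0])
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Update row factors with column factors held fixed.
        for i in range(n):
            obs = [j for j in range(m) if X[i][j] is not None]
            num = sum(v[j] * X[i][j] for j in obs)
            den = sum(v[j] ** 2 for j in obs) + reg
            u[i] = num / den
        # Update column factors with row factors held fixed.
        for j in range(m):
            obs = [i for i in range(n) if X[i][j] is not None]
            num = sum(u[i] * X[i][j] for i in obs)
            den = sum(u[i] ** 2 for i in obs) + reg
            v[j] = num / den
    # Keep observed values; replace only the missing ones.
    return [
        [X[i][j] if X[i][j] is not None else u[i] * v[j] for j in range(m)]
        for i in range(n)
    ]

# Toy matrix: outer product of [1, 2, 3] and [1, 2, 4] with two cells hidden.
X = [
    [1.0, 2.0, 4.0],
    [2.0, 4.0, None],   # hidden true value: 8
    [None, 6.0, 12.0],  # hidden true value: 3
]
filled = als_impute_rank1(X)
```

Because the toy matrix is exactly rank 1, the filled values can be compared against the hidden true values (8 and 3). Real clinical data would need a higher rank, feature scaling, and held-out validation to choose the rank and regularisation.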

    Non-vitamin K antagonist oral anticoagulants (NOACs) for thromboembolic prevention, are they safe in congenital heart disease? Results of a worldwide study.

    Current guidelines consider vitamin K antagonists (VKA) the oral anticoagulant agents of choice in adults with atrial arrhythmias (AA) and moderate or complex forms of congenital heart disease, significant valvular lesions, or bioprosthetic valves, pending safety data on non-VKA oral anticoagulants (NOACs). Therefore, the international NOTE registry was initiated to assess safety, change in adherence and quality of life (QoL) associated with NOACs in adults with congenital heart disease (ACHD). An international multicenter prospective study of NOACs in ACHD was established. Follow-up occurred at 6 months and yearly thereafter. Primary endpoints were thromboembolism and major bleeding. Secondary endpoints included minor bleeding, change in therapy adherence (≥80% medication refill rate, ≥6 out of 8 on Morisky-8 questionnaire) and QoL (SF-36 questionnaire). In total, 530 ACHD patients (mean age 47 ± 15 years; 55% male) with predominantly moderate or complex defects (85%), significant valvular lesions (46%) and/or bioprosthetic valves (11%) using NOACs (rivaroxaban 43%; apixaban 39%; dabigatran 12%; edoxaban 7%) were enrolled. The most common indication was AA (91%). Over a median follow-up of 1.0 [IQR 0.0-2.0] year, the thromboembolic event rate was 1.0% [95% CI 0.4-2.0] (n = 6) per year, with a 1.1% [95% CI 0.5-2.2] (n = 7) annualized rate of major bleeding and a 6.3% [95% CI 4.5-8.5] (n = 37) annualized rate of minor bleeding. Adherence was sufficient during 2 years of follow-up in 80-93% of patients. At 1-year follow-up, among the subset of previous VKA users who completed the survey (n = 33), QoL improved in 6 out of 8 domains (p < 0.05). Initial results from our worldwide prospective study suggest that NOACs are safe and may be effective for thromboembolic prevention in adults with heterogeneous forms of congenital heart disease.