169 research outputs found

    An Exact Algorithm for Side-Chain Placement in Protein Design

    Computational protein design aims at constructing novel or improved functions on the structure of a given protein backbone and has important applications in the pharmaceutical and biotechnical industry. The underlying combinatorial side-chain placement problem consists of choosing a side-chain placement for each residue position such that the resulting overall energy is minimized. The choice of the side-chain then also determines the amino acid for this position. Many algorithms for this NP-hard problem have been proposed in the context of homology modeling, but they reach their limits when faced with large protein design instances. In this paper, we propose a new exact method for the side-chain placement problem that works well even for the large instance sizes that appear in protein design. Our main contribution is a dedicated branch-and-bound algorithm that combines tight upper and lower bounds resulting from a novel Lagrangian relaxation approach for side-chain placement. Our experimental results show that our method outperforms alternative state-of-the-art exact approaches and makes it possible to optimally solve large protein design instances routinely.
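
    A minimal sketch (not the paper's method) of the underlying optimization: each residue position chooses one rotamer, the total energy is the sum of self energies plus pairwise interaction energies, and a plain branch-and-bound search prunes partial assignments whose lower bound already exceeds the best energy found. The simple bound below is valid only under the assumption of nonnegative pairwise energies; the paper's Lagrangian-relaxation bounds are far tighter. All names and the toy instance are hypothetical.

    import math

    def side_chain_bnb(E_self, E_pair):
        """E_self[i][r]: self energy of rotamer r at position i.
        E_pair[(i, j)][r][s]: interaction energy of rotamers r, s at positions i < j."""
        n = len(E_self)
        best = {"energy": math.inf, "assignment": None}

        def partial_energy(partial):
            # Self terms plus pairwise terms of the already-assigned prefix.
            k = len(partial)
            e = sum(E_self[i][partial[i]] for i in range(k))
            e += sum(E_pair[(i, j)][partial[i]][partial[j]]
                     for i in range(k) for j in range(i + 1, k))
            return e

        def lower_bound(partial):
            # Add the best reachable self energy of every open position; valid as a
            # lower bound only because pair energies are assumed nonnegative here.
            return partial_energy(partial) + sum(
                min(E_self[i]) for i in range(len(partial), n))

        def branch(partial):
            if len(partial) == n:
                e = partial_energy(partial)
                if e < best["energy"]:
                    best["energy"], best["assignment"] = e, list(partial)
                return
            pos = len(partial)
            for rot in range(len(E_self[pos])):
                if lower_bound(partial + [rot]) < best["energy"]:
                    branch(partial + [rot])

        branch([])
        return best["energy"], best["assignment"]

    # Toy instance: 2 positions with 2 rotamers each.
    E_self = [[1.0, 0.5], [0.2, 0.8]]
    E_pair = {(0, 1): [[0.0, 0.3], [0.4, 0.0]]}
    print(side_chain_bnb(E_self, E_pair))  # -> (1.1, [1, 0])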

    Double Exponential Instability of Triangular Arbitrage Systems

    If financial markets displayed the informational efficiency postulated in the efficient markets hypothesis (EMH), arbitrage operations would be self-extinguishing. The present paper considers arbitrage sequences in foreign exchange (FX) markets, in which trading platforms and information are fragmented. In Kozyakin et al. (2010) and Cross et al. (2012) it was shown that sequences of triangular arbitrage operations in FX markets containing 4 currencies and trader-arbitrageurs tend to display periodicity or grow exponentially rather than being self-extinguishing. This paper extends the analysis to 5- or higher-order currency worlds. The key findings are that in a 5-currency world arbitrage sequences may also follow an exponential law as well as display periodicity, but that in higher-order currency worlds a double exponential law may additionally apply. There is an "inheritance of instability" in the higher-order currency worlds. Profitable arbitrage operations are thus endemic rather than displaying the self-extinguishing properties implied by the EMH.
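
    As a toy illustration of the arbitrage operations being studied (not the paper's dynamical model), the check below scans a table of cross rates for a three-currency cycle whose round-trip product exceeds one after a proportional transaction fee. The rate table, fee, and function name are hypothetical.

    from itertools import permutations

    def triangular_arbitrage(rate, fee=0.0):
        """rate[a][b]: units of currency b received per unit of currency a.
        Returns cycles a -> b -> c -> a that remain profitable after fees."""
        profitable = []
        for a, b, c in permutations(list(rate), 3):
            gross = rate[a][b] * rate[b][c] * rate[c][a]
            net = gross * (1.0 - fee) ** 3
            if net > 1.0:
                profitable.append(((a, b, c), net))
        return profitable

    # Deliberately inconsistent quotes, as might arise on fragmented platforms:
    rates = {
        "USD": {"EUR": 0.92, "GBP": 0.79},
        "EUR": {"USD": 1.10, "GBP": 0.87},
        "GBP": {"USD": 1.28, "EUR": 1.16},
    }
    print(triangular_arbitrage(rates, fee=0.001))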

    Algorithm engineering for optimal alignment of protein structure distance matrices

    Protein structural alignment is an important problem in computational biology. In this paper, we present the first successes in provably optimal pairwise alignment of protein inter-residue distance matrices, using the popular Dali scoring function. We introduce the structural alignment problem formally, which enables us to express a variety of scoring functions used in previous work as special cases in a unified framework. Further, we propose the first mathematical model for computing optimal structural alignments based on dense inter-residue distance matrices. To this end, we reformulate the problem as a special graph problem and give a tight integer linear programming model. We then present algorithm engineering techniques to handle the huge integer linear programs that arise from real-life distance matrix alignment problems. Applying these techniques, we can compute provably optimal Dali alignments for the very first time.
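
    For concreteness, the sketch below evaluates a Dali-style score for a given alignment of two inter-residue distance matrices, assuming the commonly cited elastic similarity term phi = (0.2 - |dA - dB| / d_avg) * exp(-(d_avg / 20)^2) with a 0.2 contribution on the diagonal; the hard part addressed by the paper is finding the alignment that maximizes this score, which the sketch does not attempt. Function and variable names are illustrative.

    import math

    def dali_score(dist_a, dist_b, alignment, theta=0.20, alpha=20.0):
        """dist_a, dist_b: inter-residue distance matrices (in Angstrom).
        alignment: list of pairs (i, k) matching residue i of A to residue k of B."""
        score = 0.0
        for i, k in alignment:
            for j, l in alignment:
                if i == j:
                    score += theta  # diagonal contribution
                    continue
                d_avg = 0.5 * (dist_a[i][j] + dist_b[k][l])
                if d_avg == 0.0:
                    continue
                score += (theta - abs(dist_a[i][j] - dist_b[k][l]) / d_avg) \
                         * math.exp(-(d_avg / alpha) ** 2)
        return score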

    A critical evaluation of network and pathway based classifiers for outcome prediction in breast cancer

    Recently, several classifiers that combine primary tumor data, like gene expression data, and secondary data sources, such as protein-protein interaction networks, have been proposed for predicting outcome in breast cancer. In these approaches, new composite features are typically constructed by aggregating the expression levels of several genes. The secondary data sources are employed to guide this aggregation. Although many studies claim that these approaches improve classification performance over single gene classifiers, the gain in performance is difficult to assess. This stems mainly from the fact that different breast cancer data sets and validation procedures are employed to assess the performance. Here we address these issues by employing a large cohort of six breast cancer data sets as a benchmark set and by performing an unbiased evaluation of the classification accuracies of the different approaches. Contrary to previous claims, we find that composite feature classifiers do not outperform simple single gene classifiers. We investigate the effect of (1) the number of selected features, (2) the specific gene set from which features are selected, (3) the size of the training set and (4) the heterogeneity of the data set on the performance of composite feature and single gene classifiers. Strikingly, we find that randomization of secondary data sources, which destroys all biological information in these sources, does not result in a deterioration in the performance of composite feature classifiers. Finally, we show that when a proper correction for gene set size is performed, the stability of single gene sets is similar to the stability of composite feature sets. Based on these results, there is currently no reason to prefer prognostic classifiers based on composite features over single gene classifiers for predicting outcome in breast cancer.
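
    To make the two classifier families concrete, the sketch below contrasts single-gene features (expression values used directly) with composite features that average expression over gene sets taken from a secondary source such as an interaction-network module. The data layout and module lists are hypothetical.

    import numpy as np

    def single_gene_features(expr):
        """expr: samples x genes expression matrix, used as-is."""
        return expr

    def composite_features(expr, gene_sets):
        """gene_sets: lists of gene (column) indices from a secondary data source."""
        return np.column_stack([expr[:, genes].mean(axis=1) for genes in gene_sets])

    # Toy usage: 4 samples, 6 genes, two modules from a hypothetical network.
    expr = np.random.rand(4, 6)
    modules = [[0, 2, 3], [1, 4, 5]]
    X_single = single_gene_features(expr)            # shape (4, 6)
    X_composite = composite_features(expr, modules)  # shape (4, 2)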

    Improving mungbean growth in a semiarid dryland system with agricultural waste biochars and cattle manure

    Mungbean (Vigna radiata L.) productivity in drylands has recently decreased due to soil fertility degradation. The objective of this study was to evaluate the effect of biochar type and cattle manure rate on the growth of mungbean in a semi-arid dark soil. A 3 x 5 factorial completely randomized block design with four replicates was used to arrange the treatments for the field trial. Two biochars (rice husk and sawdust) at 10 t/ha, in combination with four rates of cattle manure (1, 3, 5 and 10 t/ha) and a control (without biochar and cattle manure), were applied to the soil, incubated for three weeks and then planted with mungbean cv. Fore Belu. The results revealed that, compared to the control, additions of biochar and cattle manure increased soil moisture and soil electrical conductivity by 2-4% and 0.15-0.20, respectively; decreased soil temperature and bulk density by 1-2 °C and 0.2 g/cm3, respectively; and increased plant height, stem diameter, root length, and total, shoot and root dry weights by 4 cm, 0.1 cm, 5 cm, 7 g, 0.9 g and 6 g, respectively. The best growth of mungbean was obtained from the addition of sawdust biochar at 10 t/ha and cattle manure at 3 t/ha.

    A Parsimony Approach to Biological Pathway Reconstruction/Inference for Genomes and Metagenomes

    A common biological pathway reconstruction approach, as implemented by many automatic biological pathway services (such as the KAAS and RAST servers) and in the functional annotation of metagenomic sequences, starts with the identification of protein functions or families (e.g., KO families for the KEGG database and FIG families for the SEED database) in the query sequences, followed by a direct mapping of the identified protein families onto pathways. Given a predicted patchwork of individual biochemical steps, some metric must be applied in deciding which pathways actually exist in the genome or metagenome represented by the sequences. Commonly, and straightforwardly, a complete biological pathway is deemed present in a dataset if at least one of the steps associated with the pathway is found. We report, however, that this naïve mapping approach leads to an inflated estimate of biological pathways, and thus overestimates the functional diversity of the sample from which the DNA sequences are derived. We developed a parsimony approach, called MinPath (Minimal set of Pathways), for biological pathway reconstruction using protein family predictions, which yields a more conservative, yet more faithful, estimation of the biological pathways for a query dataset. MinPath identified far fewer pathways for the genomes collected in the KEGG database than the naïve mapping approach, eliminating some obviously spurious pathway annotations. Results from applying MinPath to several metagenomes indicate that the common methods used for metagenome annotation may significantly overestimate the biological pathways encoded by microbial communities.
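
    The parsimony idea can be sketched as a set-cover problem: keep the smallest collection of pathways that still explains every observed protein family. The greedy approximation below only illustrates that idea (MinPath itself solves the problem with integer programming); the toy annotations are hypothetical.

    def parsimony_pathways(pathway_to_families, observed_families):
        """pathway_to_families: dict pathway -> set of protein families it contains.
        observed_families: set of families detected in the genome or metagenome."""
        # Only families annotated to at least one known pathway can be explained.
        annotated = set().union(*pathway_to_families.values())
        uncovered = observed_families & annotated
        chosen = []
        while uncovered:
            # Pick the pathway explaining the most still-unexplained families.
            best = max(pathway_to_families,
                       key=lambda p: len(pathway_to_families[p] & uncovered))
            chosen.append(best)
            uncovered -= pathway_to_families[best]
        return chosen

    # The naive mapping would instead report every pathway with >= 1 observed family.
    p2f = {"pathA": {"K1", "K2"}, "pathB": {"K2"}, "pathC": {"K3"}}
    print(parsimony_pathways(p2f, {"K1", "K2", "K3"}))  # -> ['pathA', 'pathC']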

    Optimizing topological cascade resilience based on the structure of terrorist networks

    Complex socioeconomic networks such as information, finance and even terrorist networks need resilience to cascades, to prevent the failure of a single node from causing a far-reaching domino effect. We show that terrorist and guerrilla networks are uniquely cascade-resilient while maintaining high efficiency, but they become more vulnerable beyond a certain threshold. We also introduce an optimization method for constructing networks with high passive cascade resilience. The optimal networks are found to be based on cells, where each cell has a star topology. Counterintuitively, we find that there are conditions where networks should not be modified to stop cascades because doing so would come at a disproportionate loss of efficiency. Implementation of these findings can lead to more cascade-resilient networks in many diverse areas.
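
    A toy cascade model (an illustration only, not the model analyzed in the paper): a node fails once at least a given fraction of its neighbors has failed, and the cascade size measures how far the failure of a single seed spreads. The threshold value and example network are hypothetical.

    from collections import deque

    def cascade_size(adj, seed, threshold=0.5):
        """adj: dict node -> set of neighbors; seed: the initially failed node."""
        failed = {seed}
        queue = deque(adj[seed])
        while queue:
            node = queue.popleft()
            if node in failed or not adj[node]:
                continue
            if len(adj[node] & failed) / len(adj[node]) >= threshold:
                failed.add(node)
                queue.extend(adj[node] - failed)
        return len(failed)

    # A star-topology cell, the building block the paper's optimization favors:
    # in this toy model the failure of a leaf stays local.
    star = {"hub": {"a", "b", "c"}, "a": {"hub"}, "b": {"hub"}, "c": {"hub"}}
    print(cascade_size(star, "a"))  # -> 1 (hub sees 1 of 3 neighbors failed, 1/3 < 0.5)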

    Simultaneous Optimization of Both Node and Edge Conservation in Network Alignment via WAVE

    Network alignment can be used to transfer functional knowledge between conserved regions of different networks. Typically, existing methods use a node cost function (NCF) to compute similarity between nodes in different networks and an alignment strategy (AS) to find high-scoring alignments with respect to the total NCF over all aligned nodes (or node conservation). But they then evaluate the quality of their alignments via some other measure that is different from the node conservation measure used to guide the alignment construction process. Typically, one measures the amount of conserved edges, but only after alignments are produced. Hence, a recent attempt aimed to directly maximize the amount of conserved edges while constructing alignments, which improved alignment accuracy. Here, we aim to directly maximize both node and edge conservation during alignment construction to further improve alignment accuracy. For this, we design a novel measure of edge conservation that (unlike existing measures, which treat each conserved edge the same) weighs each conserved edge so that edges with highly NCF-similar end nodes are favored. As a result, we introduce a novel AS, Weighted Alignment VotEr (WAVE), which can optimize any measure of node and edge conservation and which can be used with any NCF or combination of multiple NCFs. Using WAVE on top of established state-of-the-art NCFs leads to superior alignments compared to existing methods that optimize only node conservation or only edge conservation, or that treat each conserved edge the same. And while we evaluate WAVE in the computational biology domain, it is easily applicable in any domain.
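
    A hedged sketch of the weighting idea (the paper's exact objective may differ): each conserved edge is credited in proportion to the NCF similarity of its aligned end nodes, rather than every conserved edge counting equally. Data structures and names are illustrative.

    def weighted_edge_conservation(edges1, edges2, mapping, node_sim):
        """edges1, edges2: sets of frozenset edges of the two networks.
        mapping: dict aligning nodes of network 1 to nodes of network 2.
        node_sim: dict (n1, n2) -> NCF similarity in [0, 1]."""
        score = 0.0
        for edge in edges1:
            u, v = tuple(edge)
            if frozenset({mapping[u], mapping[v]}) in edges2:  # edge is conserved
                score += 0.5 * (node_sim[(u, mapping[u])] + node_sim[(v, mapping[v])])
        return score

    # Unweighted edge conservation would instead add 1.0 for every conserved edge.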