14 research outputs found

    Algorithms in Computational Biology

    In this thesis we are concerned with constructing algorithms that address problems of biological relevance. This activity is part of a broader interdisciplinary area called computational biology, or bioinformatics, that focuses on utilizing the capacities of computers to gain knowledge from biological data. The majority of problems in computational biology relate to molecular or evolutionary biology, and focus on analyzing and comparing the genetic material of organisms. One deciding factor in shaping the area of computational biology is that DNA, RNA and proteins, which are responsible for storing and utilizing the genetic material in an organism, can be described as strings over finite alphabets. This string representation of biomolecules allows a wide range of string algorithms to be applied to analyzing and comparing biological data. We contribute to the field of computational biology by constructing and analyzing algorithms that address problems of relevance to biological sequence analysis and structure prediction.

    Triplet Supertrees

    Phylogenetic trees are used in the interdisciplinary field of bioinformatics to relate and classify species according to evolutionary history. Constructing large trees is infeasible because the search space on trees is very large. As a result, supertree methods have been the focus of much research. By building trees from smaller, already constructed trees, supertrees are a feasible way of constructing large trees. Several supertree methods have previously been developed. One, called Q(I)LI, decomposes its input trees to quartet trees, which are trees of four species. The supertree is assembled by combining all the quartet trees. This thesis reports on the design and implementation of a new algorithm called T(I)LI which uses triplets instead of quartets. In an experimental study, it is shown that T(I)LI outperforms Q(I)LI in efficiency, and under som

    PATBox: A Toolbox for Classification and Analysis of P-Type ATPases

    P-Type ATPases are part of the regulatory system of the cell, where they are responsible for transporting ions and lipids through the cell membrane. These pumps are found in all eukaryotes and their malfunction has been found to cause several severe diseases. Knowing which substrate is pumped by a certain P-Type ATPase is therefore vital. The P-Type ATPases can be divided into 11 subtypes based on their specificity, that is, the substrate that they pump. Determining the subtype experimentally is time-consuming. Thus it is of great interest to be able to accurately predict the subtype based on the amino acid sequence only. We present an approach to P-Type ATPase sequence classification based on k-nearest neighbors, similar to a homology search, and show that this method performs very well and, to the best of our knowledge, better than any existing method despite its simplicity. The classifier is made available as a web service at http://services.birc.au.dk/patbox/, which also provides access to a database of potential P-Type ATPases and their predicted subtypes.
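    The k-nearest-neighbors idea described in this abstract can be illustrated with a minimal sketch. It is not the PATBox implementation: the similarity measure (Jaccard overlap of 3-mers) is an assumed stand-in for a homology-search score, and the function names, toy sequences and labels "A" and "B" are hypothetical.

```python
from collections import Counter


def kmers(seq, size=3):
    """Return the set of overlapping substrings of length `size` in `seq`."""
    return {seq[i:i + size] for i in range(len(seq) - size + 1)}


def similarity(a, b, size=3):
    """Jaccard similarity of k-mer sets (assumed stand-in for an alignment score)."""
    ka, kb = kmers(a, size), kmers(b, size)
    union = ka | kb
    return len(ka & kb) / len(union) if union else 0.0


def knn_predict(query, labelled, k=1, weighted=False):
    """Predict a label for `query` from its k most similar labelled sequences.

    `labelled` is a list of (sequence, label) pairs. With weighted=True each
    neighbor votes with its similarity score; otherwise each vote counts 1.
    """
    scored = sorted(
        ((similarity(query, seq), label) for seq, label in labelled),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter()
    for score, label in scored[:k]:
        votes[label] += score if weighted else 1
    return votes.most_common(1)[0][0]


# Toy data: made-up sequences with hypothetical subtype labels.
training = [
    ("MKTAYIAKQR", "A"),
    ("MKTAYIAKQL", "A"),
    ("GGSLVPRGSH", "B"),
    ("GGSLVPRGSW", "B"),
]
print(knn_predict("MKTAYIAKQK", training, k=3))
```

    With k = 1 the prediction is simply the label of the single most similar known sequence, which is what makes the approach resemble a homology search. The weighted flag switches between plain majority voting and similarity-weighted voting.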

    This document in subdirectory DS/00/4/ Algorithms in Computational Biology

    Reproduction of all or part of this work is permitted for educational or research use on condition that this copyright notice is included in any copy. See back inner page for a list of recent BRICS Dissertation Series publications. Copies may be obtained by contacting: BRIC

    The results of 20 runs of 5-fold cross-validation for 1 ≤ k ≤ 50.

    The weighted and unweighted approaches both perform well for small k. For k = 1 we obtain an accuracy of 100%. Dots are outliers. Lines show accuracy for reduced datasets.