901 research outputs found

    Hyper‐Heuristics and Metaheuristics for Selected Bio‐Inspired Combinatorial Optimization Problems

    Many decision and optimization problems arising in the bioinformatics field are time‐demanding, and several algorithms have been designed to solve them or to improve on the current best solution approach. Modeling and implementing a new heuristic algorithm may be time‐consuming, but it has strong motivations: on the one hand, even a small improvement in the solution may be worth the long time spent constructing a new method; on the other hand, there are problems for which good‐enough solutions are acceptable and can be achieved at a much lower computational cost. In the first case, specially designed heuristics or metaheuristics are needed, while in the latter, hyper‐heuristics can be proposed. The paper describes both approaches in different problem domains.

    Exploring Hidden Genetic Divergence Within Sunda Colugos by Means of Novel DNA Capture Methods and Next Generation Sequencing

    It has been the goal of biologists to catalog and protect genetic diversity and variation among biological organisms. The amount of diversity cataloged is growing every year. In the twelve years between the latest two publications of Mammal Species of the World, the number of mammalian species increased from 4998 to 5339 (~7%). This number is expected to increase substantially, especially with the advent and application of new genomic approaches to assess levels of species diversity. This increased diversity is partially due to increased taxonomic investigation in Southeast Asia, which is known for being a hot spot of species richness. This richness has been shown in recent years to be continually threatened by human-induced habitat loss, as is the case for a poorly known group of mammals, the flying lemurs, or colugos. The colugo is a small arboreal mammal that inhabits more than fifty islands in the SE Asian archipelago and adjacent mainland areas of the Malay Peninsula, Thailand and Vietnam. The colugo has extremely inefficient terrestrial locomotor capabilities, which restrict it to forested areas, where it is capable of gliding over one hundred meters between trees. This study proposes a molecular phylogenetic analysis of the Sunda colugo (Galeopterus variegatus) to redefine the evolutionary relationships between disjunct populations of this poorly understood mammal, using a novel DNA capture method to isolate degraded mtDNA fragments from museum samples by hybridization to DNA fragments derived from a modern colugo genome. The results demonstrate extremely efficient cross-species capture of mtDNA sequences as much as 10-15% divergent from the probe, combined with Next Generation Sequencing technologies to obtain high depth of coverage of hybridized sequences. Phylogenetic results indicate the widespread presence of species-level taxonomic units both within and between the islands of the Southeast Asian archipelago. This novel approach to ancient DNA capture has potentially broad implications for the conservation of this enigmatic mammal, and further suggests that vicariant evolutionary analysis of colugos will be invaluable for defining the biogeographic history of the SE Asian archipelago.

    On the role of metaheuristic optimization in bioinformatics

    Metaheuristic algorithms are employed to solve complex and large-scale optimization problems in many different fields, from transportation and smart cities to finance. This paper discusses how metaheuristic algorithms are being applied to solve different optimization problems in the area of bioinformatics. While the text provides references to many optimization problems in the area, it focuses on those that have attracted more interest from the optimization community. Among the problems analyzed, the paper discusses in more detail the molecular docking problem, the protein structure prediction, phylogenetic inference, and different string problems. In addition, references to other relevant optimization problems are also given, including those related to medical imaging or gene selection for classification. From the previous analysis, the paper generates insights on research opportunities for the Operations Research and Computer Science communities in the field of bioinformatics

    New algorithms for DNA sequencing by hybridization

    The reconstruction of DNA sequences from DNA fragments is one of the most challenging problems in computational biology. In recent years the specific problem of DNA sequencing by hybridization has attracted considerable interest in the optimization community. Despite the fact that well-working constructive heuristics are often the basis for well-working metaheuristics, only two constructive heuristics exist. Both approaches were proposed by Blazewicz and colleagues; the first one is a look-ahead greedy technique, and the second one is a constructive technique based on constructing reliable sub-sequences. Our motivation was twofold. First, we wanted to develop better constructive heuristics. Second, on the basis of these heuristics we wanted to develop new state-of-the-art metaheuristics for DNA sequencing by hybridization. In the first part of the paper we present our constructive heuristics. We show that the results of the best constructive heuristic are comparable to the results of existing metaheuristics, while using less computational time. In the second part of the paper we propose an ant colony optimization (ACO) approach and apply it in a so-called multi-level framework. Both the ACO algorithm and the multi-level framework are based on our constructive heuristics. The computational results show that our algorithm is currently a state-of-the-art algorithm for DNA sequencing by hybridization.
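
To make the idea of a constructive heuristic for sequencing by hybridization concrete, here is a minimal sketch of a plain greedy builder. It is not the look-ahead technique of Blazewicz and colleagues, nor the heuristics proposed in the paper; it only illustrates the general pattern of extending a partial sequence with the unused oligonucleotide that overlaps it most. The function names, the known starting oligonucleotide, and the error-free toy spectrum are assumptions made for illustration.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest proper suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)) - 1, 0, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def greedy_sbh(spectrum, start, target_len):
    """Greedily rebuild a sequence from a spectrum of equal-length oligos.

    Assumes (hypothetically) that the first oligo and the target length are
    known. Ties are broken lexicographically so the result is deterministic.
    """
    remaining = set(spectrum) - {start}
    seq = start
    oligo_len = len(start)
    while remaining and len(seq) < target_len:
        tail = seq[-oligo_len:]
        # Pick the unused oligo with the largest overlap with the current tail.
        best = max(sorted(remaining), key=lambda o: overlap(tail, o))
        k = overlap(tail, best)
        if len(seq) + len(best) - k > target_len:
            break  # appending would exceed the target length
        seq += best[k:]
        remaining.remove(best)
    return seq


# Toy, error-free spectrum of the sequence ACGTTGCA with oligo length 3.
toy_spectrum = ["ACG", "CGT", "GTT", "TTG", "TGC", "GCA"]
result = greedy_sbh(toy_spectrum, "ACG", 8)
```

On real spectra with missing or spurious oligonucleotides, a purely greedy choice like this can go wrong early, which is the kind of weakness that look-ahead and metaheuristic approaches aim to reduce.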

    A thermodynamic approach to designing structure-free combinatorial DNA word sets

    An algorithm is presented for the generation of sets of non-interacting DNA sequences, employing existing thermodynamic models for the prediction of duplex stabilities and secondary structures. A DNA ‘word’ structure is employed in which individual DNA ‘words’ of a given length (e.g. 12mer and 16mer) may be concatenated into longer sequences (e.g. four tandem words and six tandem words). This approach, where multiple word variants are used at each tandem word position, allows very large sets of non-interacting DNA strands to be assembled from combinations of the individual words. Word sets were generated and their figures of merit are compared to sets as described previously in the literature (e.g. 4, 8, 12, 15 and 16mer). The predicted hybridization behavior was experimentally verified on selected members of the sets using standard UV hyperchromism measurements of duplex melting temperatures (T(m)s). Additional experimental validation was obtained by using the sequences in formulating and solving a small example of a DNA computing problem

    Groups without cultured representatives dominate eukaryotic picophytoplankton in the oligotrophic South East Pacific Ocean

    Background: Photosynthetic picoeukaryotes (PPE) with a cell size less than 3 µm play a critical role in oceanic primary production. In recent years, the composition of marine picoeukaryote communities has been intensively investigated by molecular approaches, but their photosynthetic fraction remains poorly characterized. This is largely because the classical approach that relies on constructing 18S rRNA gene clone libraries from filtered seawater samples using universal eukaryotic primers is heavily biased toward heterotrophs, especially alveolates and stramenopiles, despite the fact that autotrophic cells in general outnumber heterotrophic ones in the euphotic zone. Methodology/Principal Findings: In order to better assess the composition of the eukaryotic picophytoplankton in the South East Pacific Ocean, encompassing the most oligotrophic oceanic regions on earth, we used a novel approach based on flow cytometry sorting followed by construction of 18S rRNA gene clone libraries. This strategy dramatically increased the recovery of sequences from putative autotrophic groups. The composition of the PPE community appeared highly variable both vertically down the water column and horizontally across the South East Pacific Ocean. In the central gyre, uncultivated lineages dominated: a recently discovered clade of Prasinophyceae (IX), clades of marine Chrysophyceae and Haptophyta, the latter division containing a potentially new class besides Prymnesiophyceae and Pavlophyceae. In contrast, on the edge of the gyre and in the coastal Chilean upwelling, groups with cultivated representatives (Prasinophyceae clade VII and Mamiellales) dominated. Conclusions/Significance: Our data demonstrate that a very large fraction of the eukaryotic picophytoplankton still escapes cultivation. The use of flow cytometry sorting should prove very useful to better characterize specific plankton populations by molecular approaches such as gene cloning or metagenomics, and also to bring strains representative of these novel groups into culture.

    A systems-based approach for detecting molecular interactions across tissues.

    Current high-throughput gene expression experiments have a straightforward design of examining the gene expression of one group or condition relative to that of another. The data are typically analyzed as if they represented strictly intracellular events, and genes are often treated as coming from a homogeneous population. Although intracellular events are crucial to nearly all biological processes, cell-cell interactions are often just as important, especially when gene expression data are generated from heterogeneous cell populations, such as whole tissues. Cell-cell molecular interactions are generally lost in the available analytical procedures and, as a result, are not examined experimentally, at least not accurately or efficiently. Most importantly, this imposes major limitations when studying gene expression changes in multiple samples that interact with one another. To address the limitations of current techniques, we have developed a novel systems-based approach that expands the traditional analysis of gene expression in two stages. This includes a novel sequence-based meta-analytic tool, AbsIDconvert, that allows for conversion of annotated features using an interval tree for storing and querying absolute genomic coordinates for comparison of multi-scale macro-molecule identifiers across platforms and/or organisms. In addition, a systems-based heuristic algorithm is developed to find intercellular interactions between two sets of genes, potentially from different tissues, by utilizing location information for each gene along with the information available in secondary databases in the form of interactions, pathways and signaling. AbsIDconvert is shown to provide high accuracy in identifier conversion compared to other available methodologies (typically at an average rate of 84%) while maintaining a higher efficiency (O(n log n)). Our intercellular interaction approach and underlying visualization show promise in allowing researchers to uncover novel intercellular signaling pathways in a way that has not been possible until now.
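
The abstract mentions an interval tree for storing and querying absolute genomic coordinates. As a rough illustration of that data structure only, and not of AbsIDconvert's actual implementation, the sketch below builds a balanced tree keyed on interval start, where each node caches the largest end coordinate in its subtree so that whole subtrees can be skipped during an overlap query. The feature identifiers and coordinates are made up.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# (start, end, feature_id), treated as a closed interval; ids are hypothetical.
Interval = Tuple[int, int, str]


@dataclass
class Node:
    interval: Interval
    max_end: int  # largest end coordinate anywhere in this subtree
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(intervals: List[Interval]) -> Optional[Node]:
    """Build a balanced tree ordered by start coordinate."""
    def rec(items: List[Interval]) -> Optional[Node]:
        if not items:
            return None
        mid = len(items) // 2
        left, right = rec(items[:mid]), rec(items[mid + 1:])
        max_end = max([items[mid][1]] + [c.max_end for c in (left, right) if c])
        return Node(items[mid], max_end, left, right)
    return rec(sorted(intervals))


def query(node: Optional[Node], start: int, end: int) -> List[str]:
    """Return ids of all stored intervals that overlap [start, end]."""
    if node is None or node.max_end < start:
        return []  # nothing in this subtree reaches the query range
    hits = query(node.left, start, end)
    s, e, feature_id = node.interval
    if s <= end and start <= e:
        hits.append(feature_id)
    if s <= end:  # right subtree starts at or after s, so skip it otherwise
        hits += query(node.right, start, end)
    return hits


tree = build([(100, 200, "geneA"), (150, 300, "probeB"), (400, 500, "geneC")])
overlapping = query(tree, 180, 250)
```

Building the tree is dominated by the initial sort, which is O(n log n) in the number of intervals.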