901 research outputs found
Hyper-Heuristics and Metaheuristics for Selected Bio-Inspired Combinatorial Optimization Problems
Many decision and optimization problems arising in the field of bioinformatics are time-demanding, and several algorithms have been designed to solve these problems or to improve on their current best solution approach. Modeling and implementing a new heuristic algorithm may be time-consuming but has strong motivations: on the one hand, even a small improvement in solution quality may be worth the long time spent on the construction of a new method; on the other hand, there are problems for which good-enough solutions are acceptable and could be achieved at a much lower computational cost. In the first case, specially designed heuristics or metaheuristics are needed, while in the latter, hyper-heuristics can be proposed. The paper describes both approaches in different problem domains.
Exploring Hidden Genetic Divergence Within Sunda Colugos by Means of Novel DNA Capture Methods and Next Generation Sequencing
It has been the goal of biologists to catalog and protect genetic diversity and variation
among biological organisms. The amount of diversity cataloged is growing every year.
In the twelve years between the latest two publications of Mammal Species of the World,
the number of mammalian species increased from 4998 to 5339 (~7%). This number is
expected to increase substantially, especially with the advent and application of new
genomic approaches to assess levels of species diversity. This increased diversity is
partially due to increased taxonomic investigation in Southeast Asia, which is known for
being a hot spot of species richness. This richness has been shown in recent years to be
continually threatened by human-induced habitat loss, as is the case of a poorly known
group of mammals, the flying lemurs, or colugos. The colugo is a small arboreal
mammal that inhabits more than fifty islands in the SE Asian archipelago and adjacent
mainland areas of the Malay Peninsula, Thailand and Vietnam. The colugo has
extremely inefficient terrestrial locomotor capabilities, which isolate the colugo to
forested areas, where it is capable of gliding over one hundred meters between trees.
This study proposes a molecular phylogenetic analysis of the Sunda colugo (Galeopterus
variegatus) to redefine the evolutionary relationships between disjunct populations of
this poorly understood mammal, using a novel DNA capture method to isolate degraded
mtDNA fragments from museum samples, by hybridization to DNA fragments derived
from a modern colugo genome. The results demonstrate extremely efficient cross-species
capture of mtDNA sequences as great as 10-15% divergent from the probe,
combined with Next Generation Sequencing Technologies to obtain high depth of
coverage of hybridized sequences. Phylogenetic results indicate the widespread
presence of species-level taxonomic units both within and between the islands of the
Southeast Asian archipelago. This novel approach to ancient DNA capture has
potentially broad implications for the conservation of this enigmatic mammal, and
further suggests that vicariant evolutionary analysis of colugos will be invaluable for
defining the biogeographic history of the SE Asian archipelago.
On the role of metaheuristic optimization in bioinformatics
Metaheuristic algorithms are employed to solve complex and large-scale optimization problems in many different fields, from transportation and smart cities to finance. This paper discusses how metaheuristic algorithms are being applied to solve different optimization problems in the area of bioinformatics. While the text provides references to many optimization problems in the area, it focuses on those that have attracted more interest from the optimization community. Among the problems analyzed, the paper discusses in more detail the molecular docking problem, the protein structure prediction, phylogenetic inference, and different string problems. In addition, references to other relevant optimization problems are also given, including those related to medical imaging or gene selection for classification. From the previous analysis, the paper generates insights on research opportunities for the Operations Research and Computer Science communities in the field of bioinformatics
New algorithms for DNA sequencing by hybridization
The reconstruction of DNA sequences from DNA fragments is one of the most challenging problems in computational biology. In recent years the specific problem of DNA sequencing by hybridization has attracted quite a lot of interest in the optimization community. Despite the fact that well-working constructive heuristics are often the basis for well-working metaheuristics, only two constructive heuristics exist. Both approaches were proposed by Blazewicz and colleagues; the first one is a look-ahead greedy technique, and the second one is a constructive technique based on constructing reliable sub-sequences. Our motivation was twofold. First, we wanted to develop better constructive heuristics. Second, on the basis of these heuristics we wanted to develop new state-of-the-art metaheuristics for DNA sequencing by hybridization. In the first part of the paper we present our constructive heuristics. We show that the results of the best constructive heuristic are comparable to the results of existing metaheuristics, while using less computational time. In the second part of the paper we propose an ant colony optimization (ACO) approach and apply it in a so-called multi-level framework. Both the ACO algorithm and the multi-level framework are based on our constructive heuristics. The computational results show that our algorithm is currently a state-of-the-art algorithm for DNA sequencing by hybridization.
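To make the idea of a constructive heuristic for sequencing by hybridization concrete, the following is a minimal sketch of a plain greedy construction. It is not the look-ahead technique or the sub-sequence technique attributed to Blazewicz and colleagues, nor the heuristics proposed in the paper. It assumes an error-free spectrum of distinct oligonucleotides, a known starting oligonucleotide, and a known target length; the function names `overlap` and `greedy_sbh` are hypothetical.

```python
from typing import Iterable


def overlap(a: str, b: str) -> int:
    """Length of the longest proper suffix of `a` that is a prefix of `b`."""
    for k in range(min(len(a), len(b)) - 1, 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_sbh(spectrum: Iterable[str], start: str, target_len: int) -> str:
    """Greedily extend `start` with the unused oligo that overlaps best.

    Assumption: the spectrum is error-free and contains no repeated oligos,
    which real hybridization experiments do not guarantee.
    """
    unused = set(spectrum) - {start}
    seq, last = start, start
    while unused and len(seq) < target_len:
        # Sorting makes tie-breaking deterministic.
        best = max(sorted(unused), key=lambda oligo: overlap(last, oligo))
        k = overlap(last, best)
        if len(seq) + len(best) - k > target_len:
            break  # the extension would exceed the target length
        seq += best[k:]
        last = best
        unused.remove(best)
    return seq


# Toy spectrum: all 3-mers of a short, made-up sequence.
toy_spectrum = ["ACG", "CGT", "GTT", "TTG", "TGC", "GCA"]
reconstructed = greedy_sbh(toy_spectrum, start="ACG", target_len=8)
```

On spectra with errors or repeats, a purely greedy choice can commit to a wrong extension early, which is one reason look-ahead variants and metaheuristics are studied for this problem.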
A thermodynamic approach to designing structure-free combinatorial DNA word sets
An algorithm is presented for the generation of sets of non-interacting DNA sequences, employing existing thermodynamic models for the prediction of duplex stabilities and secondary structures. A DNA "word" structure is employed in which individual DNA "words" of a given length (e.g. 12mer and 16mer) may be concatenated into longer sequences (e.g. four tandem words and six tandem words). This approach, where multiple word variants are used at each tandem word position, allows very large sets of non-interacting DNA strands to be assembled from combinations of the individual words. Word sets were generated and their figures of merit are compared to sets as described previously in the literature (e.g. 4, 8, 12, 15 and 16mer). The predicted hybridization behavior was experimentally verified on selected members of the sets using standard UV hyperchromism measurements of duplex melting temperatures (Tm values). Additional experimental validation was obtained by using the sequences in formulating and solving a small example of a DNA computing problem.
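The combinatorial growth described above can be illustrated with a short sketch: with v word variants at each of p tandem positions, v^p distinct strands can be assembled. The words below are made-up 4-mers chosen only for illustration; they have not been screened with any thermodynamic model, and the function name `assemble_strands` is hypothetical.

```python
from itertools import product
from typing import List, Sequence


def assemble_strands(words_per_position: Sequence[Sequence[str]]) -> List[str]:
    """Concatenate one word variant per tandem position, in every combination."""
    return ["".join(combo) for combo in product(*words_per_position)]


# Hypothetical word variants: 2 variants at each of 3 tandem positions.
positions = [
    ["ACGT", "TGCA"],
    ["GGAT", "CCTA"],
    ["ATTC", "GAAG"],
]
strands = assemble_strands(positions)
strand_count = len(strands)  # 2 ** 3 combinations
```

In the approach the abstract describes, the individual words would first be selected using thermodynamic predictions so that the assembled strands do not interact; this sketch covers only the assembly step.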
Groups without cultured representatives dominate eukaryotic picophytoplankton in the oligotrophic South East Pacific Ocean
Background: Photosynthetic picoeukaryotes (PPE) with a cell size less than 3 µm play a critical role in oceanic primary production. In recent years, the composition of marine picoeukaryote communities has been intensively investigated by molecular approaches, but their photosynthetic fraction remains poorly characterized. This is largely because the classical approach that relies on constructing 18S rRNA gene clone libraries from filtered seawater samples using universal eukaryotic primers is heavily biased toward heterotrophs, especially alveolates and stramenopiles, despite the fact that autotrophic cells in general outnumber heterotrophic ones in the euphotic zone.
Methodology/Principal Findings: In order to better assess the composition of the eukaryotic picophytoplankton in the South East Pacific Ocean, encompassing the most oligotrophic oceanic regions on earth, we used a novel approach based on flow cytometry sorting followed by construction of 18S rRNA gene clone libraries. This strategy dramatically increased the recovery of sequences from putative autotrophic groups. The composition of the PPE community appeared highly variable both vertically down the water column and horizontally across the South East Pacific Ocean. In the central gyre, uncultivated lineages dominated: a recently discovered clade of Prasinophyceae (IX), clades of marine Chrysophyceae and Haptophyta, the latter division containing a potentially new class besides Prymnesiophyceae and Pavlophyceae. In contrast, on the edge of the gyre and in the coastal Chilean upwelling, groups with cultivated representatives (Prasinophyceae clade VII and Mamiellales) dominated.
Conclusions/Significance: Our data demonstrate that a very large fraction of the eukaryotic picophytoplankton still escapes cultivation. The use of flow cytometry sorting should prove very useful to better characterize specific plankton populations by molecular approaches such as gene cloning or metagenomics, and also to bring into culture strains representative of these novel groups.
A systems-based approach for detecting molecular interactions across tissues.
Current high-throughput gene expression experiments have a straightforward design of examining the gene expression of one group or condition relative to that of another. The data are typically analyzed as if they represent strictly intracellular events, and genes are often treated as coming from a homogeneous population. Although intracellular events are crucial to nearly all biological processes, cell-cell interactions are often just as important, especially when gene expression data are generated from heterogeneous cell populations, such as from whole tissues. Cell-cell molecular interactions are generally lost in the available analytical procedures and, as a result, are not examined experimentally, at least not accurately or efficiently. Most importantly, this imposes major limitations when studying gene expression changes in multiple samples that interact with one another. In order to address the limitations of current techniques, we have developed a novel systems-based approach that expands the traditional analysis of gene expression in two stages. This includes a novel sequence-based meta-analytic tool, AbsIDconvert, that allows for conversion of annotated features using an interval tree for storing and querying absolute genomic coordinates for comparison of multi-scale macro-molecule identifiers across platforms and/or organisms. In addition, a systems-based heuristic algorithm is developed to find intercellular interactions between two sets of genes, potentially from different tissues, by utilizing location information of each gene along with the information available in the secondary databases in the form of interactions, pathways and signaling. AbsIDconvert is shown to provide high accuracy in identifier conversion as compared to other available methodologies (typically at an average rate of 84%) while maintaining a higher efficiency (O(n log n)).
Our intercellular interaction approach and underlying visualization show promise in allowing researchers to uncover novel intercellular signaling pathways, which has not been possible to this point.
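As an illustration of how an interval tree can store and query genomic coordinates, the following is a minimal sketch of a centered interval tree. It is not the AbsIDconvert implementation, whose internals the abstract does not describe; the names `Node`, `build` and `query`, and the feature identifiers and coordinates, are hypothetical. Intervals are treated as closed ranges on a single coordinate axis.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Interval = Tuple[int, int, str]  # (start, end, identifier), both ends inclusive


@dataclass
class Node:
    center: int
    by_start: List[Interval]  # intervals containing center, ascending start
    by_end: List[Interval]    # the same intervals, descending end
    left: Optional["Node"]
    right: Optional["Node"]


def build(intervals: List[Interval]) -> Optional[Node]:
    """Build a centered interval tree from a list of closed intervals."""
    if not intervals:
        return None
    points = sorted(p for s, e, _ in intervals for p in (s, e))
    center = points[len(points) // 2]  # an endpoint, so `mid` is never empty
    mid = [iv for iv in intervals if iv[0] <= center <= iv[1]]
    left = [iv for iv in intervals if iv[1] < center]
    right = [iv for iv in intervals if iv[0] > center]
    return Node(
        center,
        sorted(mid, key=lambda iv: iv[0]),
        sorted(mid, key=lambda iv: iv[1], reverse=True),
        build(left),
        build(right),
    )


def query(node: Optional[Node], qs: int, qe: int,
          out: Optional[List[Interval]] = None) -> List[Interval]:
    """Collect all stored intervals that overlap the closed range [qs, qe]."""
    if out is None:
        out = []
    if node is None:
        return out
    if qe < node.center:
        # Intervals at this node end at or after center, so only start matters.
        for iv in node.by_start:
            if iv[0] > qe:
                break
            out.append(iv)
        query(node.left, qs, qe, out)
    elif qs > node.center:
        # Intervals at this node start at or before center, so only end matters.
        for iv in node.by_end:
            if iv[1] < qs:
                break
            out.append(iv)
        query(node.right, qs, qe, out)
    else:
        # The query range contains center, so every interval here overlaps.
        out.extend(node.by_start)
        query(node.left, qs, qe, out)
        query(node.right, qs, qe, out)
    return out


# Hypothetical annotated features on one chromosome.
features = [(100, 200, "geneA"), (150, 300, "geneB"),
            (400, 500, "geneC"), (50, 60, "geneD")]
tree = build(features)
hits = sorted(name for _, _, name in query(tree, 180, 410))
```

Converting identifiers between platforms then amounts to querying the coordinates of a feature from one annotation against a tree built from another and reporting the overlapping features.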
Molecular phylogeny, phylogeography and population genetics of the red seaweed genus Asparagopsis
The red seaweed genus Asparagopsis Montagne (Bonnemaisoniales) was studied with respect to its taxonomy, phylogeny, phylogeography and population genetics. The representatives of this genus, A. armata Harvey and A. taxiformis (Delile) Trevisan, are notorious invaders. Both species occur worldwide and show disjunct distribution patterns. Such patterns may result from recent jump-dispersal or from fragmentation of once panglobally distributed species. First, a phylogeographic approach was deployed in order to delineate the taxonomic units at a local scale and to assess whether European populations of each of the species originated from a single introduction or multiple cryptic ones. Results showed that the two recognized species, A. armata and A. taxiformis, are also genetically distinct. Asparagopsis armata was found to consist of a single species worldwide, whereas A. taxiformis constituted three and probably four morphologically cryptic but genetically distinct lineages. At times, lineages were encountered in sympatry and two of them were detected in the Mediterranean Sea.
In order to confirm distinction between lineages and to assess invasive potential and colonization mechanisms of the species along the western Italian coast, eight nuclear microsatellite markers were identified against the invasive lineage 2 of A. taxiformis. The markers cross-hybridised only with lineages 1 and 2. Moreover, it was demonstrated that carpogonia present on many female thalli can affect microsatellite reading patterns because of external (male) allelic contribution. Even after removal of the carpogonia, gametophyte thalli exhibited multiple allelic patterns, which is indicative of polyploidy. The markers were then used to assess genetic structure and diversity within and among Mediterranean populations of A. taxiformis lineages 1 and 2. Analyses based on statistics developed for polyploid species showed that the lineage 1 population (HAW) was distinct from Mediterranean lineage 2 populations. A geographically distant Californian lineage 2 population was genetically distinct from the Mediterranean ones as well. The Mediterranean lineage 2 samples showed panmixia. High genotypic diversity, high gene flow, and low differentiation encountered amongst these populations are probably due to a recent invasion of this lineage into the basin.