200 research outputs found

    Computational Molecular Biology

    Computational biology is a fairly new subject that arose in response to the computational problems posed by the analysis and processing of biomolecular sequence and structure data. The field was initiated in the late 1960s and early 1970s, largely by pioneers working in the life sciences. Physicists and mathematicians entered the field in the 1970s and 1980s, while computer science became involved with the new biological problems in the late 1980s. Computational problems have gained further importance in molecular biology through the various genome projects, which produce enormous amounts of data. For this bibliography we focus on those areas of computational molecular biology that involve discrete algorithms or discrete optimization. We thus neglect several other areas of computational molecular biology, such as most of the literature on the protein folding problem, databases for molecular and genetic data, and genetic mapping algorithms. Due to the availability of review papers and a bibliography this bibliography

    A fast implementation of shortest common superstring approximation and its application to Relative Lempel-Ziv compression

    The objective of the shortest common superstring problem is to find a string of minimum length that contains all keywords in the given input as substrings. Shortest common superstrings have many applications in data compression and bioinformatics. For example, a common superstring can be seen as a compressed form of the keywords it is generated from. Since the shortest common superstring problem is NP-hard, we focus on approximation algorithms that implement the so-called greedy heuristic. It turns out that the actual shortest common superstring is not always needed. Instead, it is often enough to find an approximate solution of sufficient quality. We provide an implementation of Ukkonen's linear-time algorithm for the greedy heuristic. The practical performance of this implementation is measured by comparing it to another implementation of the same heuristic. We also hypothesize that shortest common superstrings can be used to improve the compression ratio of the Relative Lempel-Ziv data compression algorithm. This hypothesis is examined and shown to be valid.

    Quantitative analyses in basic, translational and clinical biomedical research: metabolism, vaccine design and preterm delivery prediction

    There is nothing more important than preserving life, and the thesis presented here is framed in the field of quantitative biomedicine (or systems biomedicine), whose objective is to apply physico-mathematical techniques in biomedical research in order to enhance the understanding of the basis of life and its pathologies and, ultimately, to defend human health. In this thesis, we have applied physico-mathematical methods at the three fundamental levels of biomedical research: basic, translational and clinical. At the basic level, since all pathologies have their basis in the cell, we have performed two studies to deepen the understanding of cellular metabolic functionality. In the first work, we have quantitatively analyzed, for the first time, calcium-dependent chloride currents inside the cell, which has revealed the existence of a dynamical structure characterized by highly organized data sequences, non-trivial long-term correlations that last on average 7.66 seconds, and a "crossover" effect with transitions between persistent and anti-persistent behaviors. In the second investigation, using delay differential equations, we have modeled the adenylate energy system, which is the principal source of cellular energy. This study has shown that the cellular energy charge is determined by an oscillatory non-stationary invariant function, bounded between 0.7 and 0.95. At the translational level, we have developed a new method for vaccine design that, besides obtaining high coverage, is capable of giving protection against viruses with high mutation rates such as HIV, HCV or influenza. Finally, at the clinical level, we have first proven that the classic quantitative measure of uterine contractions (Montevideo Units) is incapable of predicting the immediacy of preterm labor. Then, by applying autoregressive techniques, we have designed a novel tool for forecasting premature delivery, based on only 30 minutes of uterine dynamics.
    Altogether, these investigations have led to four scientific publications and, as far as we know, our work is the first European thesis that integrates in a single framework the application of mathematical knowledge to biomedical fields at the three main stages of biomedical research: basic, translational and clinical.

    Greedy Shortest Common Superstring Approximation in Compact Space

    Given a set of strings, the shortest common superstring problem is to find the shortest possible string that contains all the input strings. The problem is NP-hard, but a lot of work has gone into designing approximation algorithms for solving the problem. We present the first time- and space-efficient implementation of the classic greedy heuristic, which merges strings in decreasing order of overlap length. Our implementation works in O(n log σ) time and bits of space, where n is the total length of the input strings in characters, and σ is the size of the alphabet. After index construction, a practical implementation of our algorithm uses roughly 5n log σ bits of space and reasonable time for a real dataset that consists of DNA fragments. Peer reviewed
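
    The greedy heuristic named in this abstract (repeatedly merge the two strings with the longest overlap) can be illustrated with a naive sketch. This is not the compact-space implementation the abstract describes, nor Ukkonen's linear-time algorithm: it runs in polynomial time with plain Python strings, and the function names `overlap` and `greedy_superstring` are chosen here purely for illustration.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_superstring(strings):
    """Naive greedy approximation of a shortest common superstring.

    Illustrative only: brute-force overlap search, not a linear-time
    or compact-space implementation.
    """
    # Remove duplicates, then strings already contained in another string.
    unique = list(dict.fromkeys(strings))
    items = [s for s in unique if not any(s != t and s in t for t in unique)]

    while len(items) > 1:
        # Find the ordered pair (i, j) with the longest overlap.
        best_k, best_i, best_j = -1, 0, 1
        for i, a in enumerate(items):
            for j, b in enumerate(items):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        # Merge the pair, writing the overlapping part only once.
        merged = items[best_i] + items[best_j][best_k:]
        items = [s for idx, s in enumerate(items) if idx not in (best_i, best_j)]
        items.append(merged)

    return items[0] if items else ""


print(greedy_superstring(["abc", "bcd", "cde"]))  # abcde
```

    Ties between equally long overlaps are broken here by scan order, which is an arbitrary choice of this sketch; the result always contains every input string but is not guaranteed to be the shortest possible superstring.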

    Subsequences and Supersequences of Strings

    Stringology, the study of strings, is a branch of algorithmics which has been the subject of mounting interest in recent years. Very recently, two books [M. Crochemore and W. Rytter, Text Algorithms, Oxford University Press, 1995] and [G. Stephen, String Searching Algorithms, World Scientific, 1994] have been published on the subject, and at least two others are known to be in preparation. Problems on strings arise in information retrieval, version control, automatic spelling correction, and many other domains. However, the greatest motivation for recent work in stringology has come from the field of molecular biology. String problems occur, for example, in genetic sequence construction, genetic sequence comparison, and phylogenetic tree construction. In this thesis we study a variety of string problems from a theoretical perspective. In particular, we focus on problems involving subsequences and supersequences of strings.

    A Coverage Study of the CMSSM Based on ATLAS Sensitivity Using Fast Neural Networks Techniques

    We assess the coverage properties of confidence and credible intervals on the CMSSM parameter space inferred from a Bayesian posterior and the profile likelihood based on an ATLAS sensitivity study. In order to make those calculations feasible, we introduce a new method based on neural networks to approximate the mapping between CMSSM parameters and weak-scale particle masses. Our method reduces the computational effort needed to sample the CMSSM parameter space by a factor of ~10^4 with respect to conventional techniques. We find that both the Bayesian posterior and the profile likelihood intervals can significantly over-cover, and we trace the origin of this effect to physical boundaries in the parameter space. Finally, we point out that the effects intrinsic to the statistical procedure are conflated with simplifications to the likelihood functions from the experiments themselves. Comment: Further checks about accuracy of neural network approximation, fixed typos, added refs. Main results unchanged. Matches version accepted by JHE