    Robust Padé Approximation via SVD

    Optimal Arrangement of Keys in a Hash Table

    OMA 2011: orthology inference among 1000 complete genomes

    OMA (Orthologous MAtrix) is a database that identifies orthologs among publicly available, complete genomes. Initiated in 2004, the project is at its 11th release. It now includes 1000 genomes, making it one of the largest resources of its kind. Here, we describe recent developments in terms of species covered; the algorithmic pipeline—in particular regarding the treatment of alternative splicing, and new features of the web (OMA Browser) and programming interface (SOAP API). In the second part, we review the various representations provided by OMA and their typical applications. The database is publicly accessible at http://omabrowser.org

    A General Approach for Predicting the Filtration of Soft and Permeable Colloids: The Milk Example

    Membrane filtration operations (ultra-, microfiltration) are now extensively used for concentrating or separating an ever-growing variety of colloidal dispersions. However, the phenomena that determine the efficiency of these operations are not yet fully understood. This is especially the case when dealing with colloids that are soft, deformable, and permeable. In this paper, we propose a methodology for building a model that is able to predict the performance (flux, concentration profiles) of the filtration of such objects in relation to the operating conditions. This is done by focusing on the case of milk filtration, all experiments being performed with dispersions of milk casein micelles, which are a kind of "natural" colloidal microgel. Using this example, we develop the general idea that a filtration model can always be built for a given colloidal dispersion as long as this dispersion has been characterized in terms of osmotic pressure Π and hydraulic permeability k. For soft and permeable colloids, the major issue is that the permeability k cannot be assessed in a trivial way, as it can for hard-sphere colloids. To get around this difficulty, we follow two distinct approaches to actually measure k: a direct approach, involving osmotic stress experiments, and a reverse-calculation approach, which consists of estimating k through well-controlled filtration experiments. The resulting filtration model is then validated against experimental measurements obtained from combined milk filtration/SAXS experiments. We also give precise examples of how the model can be used, as well as a brief discussion of the possible universality of the approach presented here.

    Fast index based algorithms and software for matching position specific scoring matrices

    BACKGROUND: In biological sequence analysis, position specific scoring matrices (PSSMs) are widely used to represent sequence motifs in nucleotide as well as amino acid sequences. Searching with PSSMs in complete genomes or large sequence databases is a common, but computationally expensive task. RESULTS: We present a new non-heuristic algorithm, called ESAsearch, to efficiently find matches of PSSMs in large databases. Our approach preprocesses the search space, e.g., a complete genome or a set of protein sequences, and builds an enhanced suffix array that is stored on file. This allows the searching of a database with a PSSM in sublinear expected time. Since ESAsearch benefits from small alphabets, we present a variant operating on sequences recoded according to a reduced alphabet. We also address the problem of non-comparable PSSM-scores by developing a method which allows the efficient computation of a matrix similarity threshold for a PSSM, given an E-value or a p-value. Our method is based on dynamic programming and, in contrast to other methods, it employs lazy evaluation of the dynamic programming matrix. We evaluated algorithm ESAsearch with nucleotide PSSMs and with amino acid PSSMs. Compared to the best previous methods, ESAsearch shows speedups of a factor between 17 and 275 for nucleotide PSSMs, and speedups up to factor 1.8 for amino acid PSSMs. Comparisons with the most widely used programs even show speedups by a factor of at least 3.8. Alphabet reduction yields an additional speedup factor of 2 on amino acid sequences compared to results achieved with the 20 symbol standard alphabet. The lazy evaluation method is also much faster than previous methods, with speedups of a factor between 3 and 330. 
    CONCLUSION: Our analysis of ESAsearch reveals sublinear runtime in the expected case, and linear runtime in the worst case for sequences not shorter than $|\mathcal{A}|^{m} + m - 1$, where m is the length of the PSSM and $\mathcal{A}$ a finite alphabet. In practice, ESAsearch shows superior performance over the most widely used programs, especially for DNA sequences. The new algorithm for accurate on-the-fly calculations of thresholds has the potential to replace formerly used approximation approaches. Beyond the algorithmic contributions, we provide a robust, well documented, and easy to use software package, implementing the ideas and algorithms presented in this manuscript.
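
    To make the two tasks in this abstract concrete, the sketch below shows a naive baseline rather than ESAsearch itself: a linear sliding-window PSSM scan, and a score threshold derived from a p-value by dynamic programming over the exact score distribution. It does not use an enhanced suffix array, and the dynamic programming table is filled in full rather than lazily. The function names, the integer-scored toy matrix, and the i.i.d. background model are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy representation: one dict per motif position,
# mapping each alphabet symbol to an integer score.
PSSM = List[Dict[str, int]]


def scan(pssm: PSSM, seq: str, threshold: int) -> List[Tuple[int, int]]:
    """Naive scan: score every window of length m and keep those >= threshold."""
    m = len(pssm)
    hits = []
    for i in range(len(seq) - m + 1):
        score = sum(col[seq[i + j]] for j, col in enumerate(pssm))
        if score >= threshold:
            hits.append((i, score))
    return hits


def score_distribution(pssm: PSSM, background: Dict[str, float]) -> Dict[int, float]:
    """Exact distribution of the match score under an i.i.d. background.

    Dynamic programming over motif positions: after processing position j,
    dist[s] is the probability that a random prefix of length j+1 scores s.
    """
    dist = {0: 1.0}
    for col in pssm:
        nxt: Dict[int, float] = {}
        for s, p in dist.items():
            for sym, q in background.items():
                t = s + col[sym]
                nxt[t] = nxt.get(t, 0.0) + p * q
        dist = nxt
    return dist


def threshold_for_pvalue(pssm: PSSM, background: Dict[str, float], pvalue: float) -> int:
    """Smallest score t such that P(score >= t) <= pvalue."""
    dist = score_distribution(pssm, background)
    tail = 0.0
    best = max(dist) + 1  # a threshold no window can reach
    for s in sorted(dist, reverse=True):
        if tail + dist[s] > pvalue:
            break
        tail += dist[s]
        best = s
    return best


# Toy example (made-up scores, uniform DNA background).
toy_pssm: PSSM = [
    {"A": 2, "C": 0, "G": 0, "T": -1},
    {"A": 0, "C": 3, "G": 0, "T": -1},
]
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

t = threshold_for_pvalue(toy_pssm, uniform, 1 / 16)
print(t, scan(toy_pssm, "AACGAC", t))
```

    The scan above costs time proportional to the sequence length times the motif length for every query; the abstract's contribution is to replace it with a search over a precomputed index, and to avoid filling the whole score-distribution table when only the tail is needed.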