
    CLUSS: Clustering of protein sequences based on a new similarity measure

    Abstract

    Background: The rapid burgeoning of available protein data makes the use of clustering within families of proteins increasingly important. The challenge is to identify subfamilies of evolutionarily related sequences. This identification reveals phylogenetic relationships, which provide prior knowledge to help researchers understand biological phenomena. A good evolutionary model is essential to achieve a clustering that reflects the biological reality, and an accurate estimate of protein sequence similarity is crucial to the building of such a model. Most existing algorithms estimate this similarity using techniques that are not necessarily biologically plausible, especially for hard-to-align sequences such as proteins with different domain structures, which cause many difficulties for alignment-dependent algorithms. In this paper, we propose a novel similarity measure based on matching amino acid subsequences. This measure, named SMS for Substitution Matching Similarity, is specifically designed for application to non-aligned protein sequences. It allows us to develop a new alignment-free algorithm, named CLUSS, for clustering protein families. To the best of our knowledge, this is the first alignment-free algorithm for clustering protein sequences. Unlike other clustering algorithms, CLUSS is effective on both alignable and non-alignable protein families. In the rest of the paper, we use the term "phylogenetic" in the sense of "relatedness of biological functions".

    Results: To show the effectiveness of CLUSS, we performed an extensive clustering on the COG database. To demonstrate its ability to deal with hard-to-align sequences, we tested it on the GH2 family. In addition, we carried out experimental comparisons of CLUSS with a variety of mainstream algorithms. These comparisons were made on hard-to-align and easy-to-align protein sequences. The results of these experiments show the superiority of CLUSS in yielding clusters of proteins with similar functional activity.

    Conclusion: We have developed an effective method and tool for clustering protein sequences to meet the needs of biologists in terms of phylogenetic analysis and prediction of biological functions. Compared to existing clustering methods, CLUSS more accurately highlights the functional characteristics of the clustered families. It provides biologists with a new and plausible instrument for the analysis of protein sequences, especially those that cause problems for alignment-dependent algorithms.

    Lattice Boltzmann simulations of soft matter systems

    This article concerns numerical simulations of the dynamics of particles immersed in a continuum solvent. As prototypical systems, we consider colloidal dispersions of spherical particles and solutions of uncharged polymers. After a brief explanation of the concept of hydrodynamic interactions, we give a general overview of the various simulation methods that have been developed to cope with the resulting computational problems. We then focus on the approach we have developed, which couples a system of particles to a lattice Boltzmann model representing the solvent degrees of freedom. The standard D3Q19 lattice Boltzmann model is derived and explained in depth, followed by a detailed discussion of complementary methods for the coupling of solvent and solute. Colloidal dispersions are best described in terms of extended particles with appropriate boundary conditions at the surfaces, while particles with internal degrees of freedom are easier to simulate as an arrangement of mass points with frictional coupling to the solvent. In both cases, particular care has been taken to simulate thermal fluctuations in a consistent way. The usefulness of this methodology is illustrated by studies from our own research, where the dynamics of colloidal and polymeric systems has been investigated in both equilibrium and nonequilibrium situations.
    Comment: Review article, submitted to Advances in Polymer Science. 16 figures, 76 pages.
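
As a small illustration of the D3Q19 model named in this abstract, the sketch below builds the usual 19-vector velocity set with its standard textbook weights (1/3 for the rest vector, 1/18 for the six face neighbours, 1/36 for the twelve edge neighbours) and computes the low-order moments of the weights. The function names `d3q19` and `second_moment` are hypothetical helpers chosen for this example; nothing here is taken from the article's own code, and lattice units with unit spacing and time step are assumed.

```python
from fractions import Fraction
from itertools import product


def d3q19():
    """Return (velocities, weights) for the standard D3Q19 lattice.

    Assumes lattice units; weights are the usual textbook values.
    """
    velocities, weights = [], []
    for c in product((-1, 0, 1), repeat=3):
        norm2 = sum(x * x for x in c)
        if norm2 == 0:
            w = Fraction(1, 3)    # rest vector
        elif norm2 == 1:
            w = Fraction(1, 18)   # 6 face neighbours
        elif norm2 == 2:
            w = Fraction(1, 36)   # 12 edge neighbours
        else:
            continue              # 8 corner vectors are not part of D3Q19
        velocities.append(c)
        weights.append(w)
    return velocities, weights


def second_moment(velocities, weights, a, b):
    """Sum over i of w_i * c_i[a] * c_i[b] for Cartesian axes a and b."""
    return sum(w * c[a] * c[b] for c, w in zip(velocities, weights))


velocities, weights = d3q19()
print(len(velocities), sum(weights), second_moment(velocities, weights, 0, 0))
```

Exact fractions are used so that the isotropy conditions (weights summing to one, diagonal second moment equal to the squared lattice speed of sound 1/3, vanishing off-diagonal terms) can be checked without floating-point tolerance.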

    The bear in Eurasian plant names: Motivations and models

    Ethnolinguistic studies are important for understanding an ethnic group's ideas about the world, as expressed in its language. Comparing corresponding aspects of such knowledge might help clarify problems of origin for certain concepts and words, e.g. whether they form a common heritage, have an independent origin, or are borrowings or calques. The current study was conducted on material in Slavonic, Baltic, Germanic, Romance, Finno-Ugrian, Turkic and Albanian languages. The bear was chosen as a large, dangerous animal, important in traditional culture, whose name is widely reflected in folk plant names. The phytonyms for comparison were mostly obtained from dictionaries and other publications, and supplemented with data from databases, the co-authors' field data, and archival sources (dialect and folklore materials). More than 1200 phytonym use records (combinations of a local name and a meaning) for 364 plant and fungal taxa were recorded to help find out the reasoning behind bear-nomination in various languages, as well as the differences and similarities between the patterns among them. Among the most common taxa with bear-related phytonyms were Arctostaphylos uva-ursi (L.) Spreng., Heracleum sphondylium L., Acanthus mollis L., and Allium ursinum L., with Latin loan translation contributing a high proportion of the phytonyms. Some plants have many and various bear-related phytonyms, while others have only one or two bear names. Features like form and/or surface generated the richest pool of names, while features such as colour seemed to provoke rather few associations with bears. The unevenness of bear phytonyms in the chosen languages was not related to the size of the language or to the present occurrence of the Brown Bear in the region. However, it may, at least to a certain extent, be related to the amount of historical ethnolinguistic research done on the selected languages.