
    "The numerical accuracy of truncated Ewald sums for periodic systems with long-range Coulomb interactions"

    Ewald summation is widely used to calculate electrostatic interactions in computer simulations of condensed-matter systems. We present an analysis of the errors arising from truncating the infinite real- and Fourier-space lattice sums in the Ewald formulation. We derive an optimal choice for the Fourier-space cutoff given a screening parameter η. We find that the number of vectors in Fourier space required to achieve a given accuracy scales as η³. The proposed method can be used to determine computationally efficient parameters for Ewald sums, to assess the quality of Ewald-sum implementations, and to compare different implementations. Comment: 6 pages, 3 figures (Encapsulated PostScript), LaTeX
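
    The η³ scaling is straightforward to reproduce from the standard leading-order truncation-error estimates for Ewald sums. The sketch below is a generic illustration under that assumption, not the paper's specific derivation; the function name and parameters are hypothetical.

    import math

    def ewald_cutoffs(eta, tol, box_length):
        """Pick real- and Fourier-space cutoffs for a target accuracy.

        Assumes the standard leading-order error estimates: the real-space
        term decays like exp(-(eta*r_c)^2) and the Fourier-space term like
        exp(-k_c^2 / (4*eta^2)).
        """
        s = math.sqrt(-math.log(tol))   # dimensionless accuracy parameter
        r_cut = s / eta                 # real-space cutoff radius
        k_cut = 2.0 * eta * s           # Fourier-space cutoff wavevector
        # Number of k-vectors inside the cutoff sphere; since k_cut is
        # proportional to eta, this count scales as eta^3, matching the
        # scaling reported in the abstract.
        n_kvec = (4.0 / 3.0) * math.pi * (k_cut * box_length / (2.0 * math.pi)) ** 3
        return r_cut, k_cut, int(n_kvec)

    print(ewald_cutoffs(eta=0.3, tol=1e-6, box_length=20.0))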

    Making Connections: The Uses and Meanings of Needle Arts in The Color Purple and The Mountain and the Valley

    In this article, the author examines the symbolic use of colour and cloth in two novels, The Color Purple and The Mountain and the Valley. By comparing the quilting and rug-hooking activities of the characters within these works, she explores the emancipatory aspects of women's needlework. Furthermore, she examines the authors' differing subjectivities of gender, race, and culture by contrasting their approaches to documenting the relationship between women and their fibre arts.

    Strength in numbers: achieving greater accuracy in MHC-I binding prediction by combining the results from multiple prediction tools

    BACKGROUND: Peptides derived from endogenous antigens can bind to MHC class I molecules. Those which bind with high affinity can invoke a CD8+ immune response, resulting in the destruction of infected cells. Much work in immunoinformatics has involved the algorithmic prediction of peptide binding affinity to various MHC-I alleles. A number of tools for MHC-I binding prediction have been developed, many of which are available on the web. RESULTS: We hypothesize that peptides predicted by a number of tools are more likely to bind than those predicted by just one tool, and that the likelihood of a particular peptide being a binder is related to the number of tools that predict it, as well as the accuracy of those tools. To this end, we have built and tested a heuristic-based method of making MHC-binding predictions by combining the results from multiple tools. The predictive performance of each individual tool is first ascertained. These performance data are used to derive weights such that the predictions of tools with better accuracy are given greater credence. The combined tool was evaluated using ten-fold cross-validation and was found to significantly outperform the individual tools when a high specificity threshold is used. It performs comparably to the best-performing individual tools at lower specificity thresholds. Finally, it also outperforms the combination of the tools resulting from linear discriminant analysis. CONCLUSION: A heuristic-based method of combining the results of the individual tools better facilitates the scanning of large proteomes for potential epitopes, yielding more actual high-affinity binders while reporting very few false positives.
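
    A minimal sketch of this kind of accuracy-weighted combination follows; the weighting rule and the tool names are illustrative assumptions, not the paper's exact heuristic.

    def combine_predictions(tool_calls, tool_accuracies, threshold=0.5):
        """Combine binary binder/non-binder calls from several prediction tools.

        tool_calls:      dict mapping tool name -> 1 (predicted binder) or 0
        tool_accuracies: dict mapping tool name -> measured accuracy, used as
                         a weight so better tools are given greater credence
        """
        total = sum(tool_accuracies.values())
        score = sum(tool_accuracies[t] * c for t, c in tool_calls.items()) / total
        return score >= threshold, score

    # Three tools disagree on one peptide; the more accurate tools dominate.
    calls = {"toolA": 1, "toolB": 1, "toolC": 0}
    accs = {"toolA": 0.90, "toolB": 0.75, "toolC": 0.60}
    print(combine_predictions(calls, accs))   # (True, 0.733...)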

    Determination of the minimum number of microarray experiments for discovery of gene expression patterns

    BACKGROUND: One type of DNA microarray experiment is discovery of gene expression patterns for a cell line undergoing a biological process over a series of time points. Two important issues with such an experiment are the number of time points, and the interval between them. In the absence of biological knowledge regarding appropriate values, it is natural to ask whether the behaviour of progressively generated data may by itself determine a threshold beyond which further microarray experiments do not contribute to pattern discovery. Additionally, such a threshold implies a minimum number of microarray experiments, which is important given the cost of these experiments. RESULTS: We have developed a method for determining the minimum number of microarray experiments (i.e. time points) for temporal gene expression, assuming that the span between time points is given and the hierarchical clustering technique is used for gene expression pattern discovery. The key idea is a similarity measure for two clusterings, expressed as a function of the data at progressive time points. While the experiments are underway, this function is evaluated. When the function reaches its maximum, it indicates that the set of experiments has reached a saturated state, and further experiments do not contribute to the discrimination of patterns. CONCLUSION: The method has been verified with two previously published gene expression datasets. For both experiments, the number of time points determined with our method is smaller than the number used in the published experiments. The overall approach is also applicable to other clustering techniques.
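
    The stopping rule can be sketched as follows. The adjusted Rand index is used here as a stand-in for the paper's clustering-similarity function, and the parameter choices are illustrative.

    import numpy as np
    from scipy.cluster.hierarchy import fcluster, linkage
    from sklearn.metrics import adjusted_rand_score

    def cluster_labels(data, n_clusters):
        """Hierarchically cluster genes (rows) and cut the tree into n_clusters."""
        return fcluster(linkage(data, method="average"), n_clusters, criterion="maxclust")

    def saturation_point(expression, n_clusters=5, min_points=3):
        """Find the time point after which one more experiment no longer changes
        the clustering: track the similarity between clusterings built from the
        first t and first t+1 time points; its maximum marks saturation."""
        sims = []
        for t in range(min_points, expression.shape[1]):
            prev = cluster_labels(expression[:, :t], n_clusters)
            curr = cluster_labels(expression[:, :t + 1], n_clusters)
            sims.append(adjusted_rand_score(prev, curr))
        return int(np.argmax(sims)) + min_points, sims

    rng = np.random.default_rng(0)
    genes = rng.normal(size=(100, 12))   # 100 genes measured at 12 time points
    print(saturation_point(genes))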

    A better sequence-read simulator program for metagenomics

    BACKGROUND: There are many programs available for generating simulated whole-genome shotgun sequence reads. The data generated by many of these programs follow predefined models, which limits their use to the authors' original intentions. For example, many models assume that read lengths follow a uniform or normal distribution. Other programs generate models from actual sequencing data, but are limited to reads from single-genome studies. To our knowledge, there are no programs that allow a user to generate simulated data following non-parametric read-length distributions and quality profiles based on empirically derived information from metagenomics sequencing data. RESULTS: We present BEAR (Better Emulation for Artificial Reads), a program that uses a machine-learning approach to generate reads with lengths and quality values that closely match empirically derived distributions. BEAR can emulate reads from various sequencing platforms, including Illumina, 454, and Ion Torrent. BEAR requires minimal user input, as it automatically determines appropriate parameter settings from user-supplied data. BEAR also uses a unique method for deriving run-specific error rates, and extracts useful statistics from the metagenomic data itself, such as quality-error models. Many existing simulators are specific to a particular sequencing technology; however, BEAR is not restricted in this way. Because of its flexibility, BEAR is particularly useful for emulating the behaviour of technologies like Ion Torrent, for which no dedicated sequencing simulators are currently available. BEAR is also the first metagenomic sequencing simulator program that automates the process of generating abundances, which can be an arduous task. CONCLUSIONS: BEAR is useful for evaluating data processing tools in genomics. It has many advantages over existing comparable software, such as generating more realistic reads and being independent of sequencing technology, and has features particularly useful for metagenomics work.
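
    The core idea, sampling read lengths from an empirical rather than a parametric distribution, can be sketched in a few lines. This is a toy illustration, not BEAR's implementation; the function names are hypothetical.

    import random
    from collections import Counter

    def empirical_length_sampler(observed_lengths):
        """Build a non-parametric read-length sampler from observed reads,
        instead of assuming a uniform or normal length distribution."""
        counts = Counter(observed_lengths)
        lengths = list(counts)
        weights = [counts[n] for n in lengths]
        return lambda: random.choices(lengths, weights=weights)[0]

    def simulate_read(genome, draw_length):
        """Draw a read of empirically sampled length from a random position."""
        n = draw_length()
        start = random.randrange(len(genome) - n + 1)
        return genome[start:start + n]

    observed = [98, 100, 100, 101, 150, 150, 150, 250]   # toy length sample
    sampler = empirical_length_sampler(observed)
    print(simulate_read("ACGT" * 500, sampler))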

    Comparing the Similarity of Different Groups of Bacteria to the Human Proteome

    Numerous aspects of the relationship between bacteria and humans have been investigated. One aspect that has recently received attention is sequence overlap at the proteomic level. However, there has not yet been a study that comprehensively characterizes the level of sequence overlap between bacteria and humans, especially as it relates to bacterial characteristics like pathogenicity, G-C content, and proteome size. In this study, we began by performing a general characterization of the range of bacteria-human similarity at the proteomic level, and identified characteristics of the most- and least-similar bacterial species. We then examined the relationship between proteomic similarity and numerous other variables. While pathogens and nonpathogens had comparable similarity to the human proteome, pathogens causing chronic infections were found to be more similar to the human proteome than those causing acute infections. Although no general correspondence between a bacterium’s proteome size and its similarity to the human proteome was noted, no bacteria with small proteomes had high similarity to the human proteome. Finally, we discovered an interesting relationship between similarity and a bacterium’s G-C content. While the relationship between bacteria and humans has been studied from many angles, their proteomic similarity still needs to be examined in more detail. This paper sheds further light on this relationship, particularly with respect to immunity and pathogenicity.
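
    One simple way to quantify proteome-level sequence overlap is the fraction of a bacterium's short peptides that also occur in the human proteome. The k-mer measure below is an assumption for illustration, not necessarily the metric used in the paper.

    def kmer_set(proteome, k=5):
        """Collect all length-k peptides occurring in a list of protein sequences."""
        kmers = set()
        for seq in proteome:
            kmers.update(seq[i:i + k] for i in range(len(seq) - k + 1))
        return kmers

    def proteome_similarity(bacterial, human, k=5):
        """Fraction of the bacterial proteome's k-mers also found in the human proteome."""
        b, h = kmer_set(bacterial, k), kmer_set(human, k)
        return len(b & h) / len(b) if b else 0.0

    # Toy sequences; real inputs would be full proteomes.
    print(proteome_similarity(["MKTAYIAKQR", "MSIMAYIAKQ"], ["MAYIAKQRQI"]))   # 0.4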

    The oligodeoxynucleotide sequences corresponding to never-expressed peptide motifs are mainly located in the non-coding strand

    <p>Abstract</p> <p>Background</p> <p>We study the usage of specific peptide platforms in protein composition. Using the pentapeptide as a unit of length, we find that in the universal proteome many pentapeptides are heavily repeated (even thousands of times), whereas some are quite rare, and a small number do not appear at all. To understand the physico-chemical-biological basis underlying peptide usage at the proteomic level, in this study we analyse the energetic costs for the synthesis of rare and never-expressed versus frequent pentapeptides. In addition, we explore residue bulkiness, hydrophobicity, and codon number as factors able to modulate specific peptide frequencies. Then, the possible influence of amino acid composition is investigated in zero- and high-frequency pentapeptide sets by analysing the frequencies of the corresponding inverse-sequence pentapeptides. As a final step, we analyse the pentadecamer oligodeoxynucleotide sequences corresponding to the never-expressed pentapeptides.</p> <p>Results</p> <p>We find that only DNA context-dependent constraints (such as oligodeoxynucleotide sequence location in the minus strand, introns, pseudogenes, frameshifts, etc.) provide a coherent mechanistic platform to explain the occurrence of never-expressed versus frequent pentapeptides in the protein world.</p> <p>Conclusions</p> <p>This study is of importance in cell biology. Indeed, the rarity (or lack of expression) of specific 5-mer peptide modules implies the rarity (or lack of expression) of the corresponding <it>n</it>-mer peptide sequences (with <it>n </it>< 5), so possibly modulating protein compositional trends. Moreover the data might further our understanding of the role exerted by rare pentapeptide modules as critical biological effectors in protein-protein interactions.</p