115 research outputs found

    Lower bounds on multiple sequence alignment using exact 3-way alignment

    BACKGROUND: Multiple sequence alignment is fundamental. Exponential growth in computation time appears to be inevitable when an optimal alignment is required for many sequences. Exact costs of optimum alignments are therefore rarely computed. Consequently much effort has been invested in algorithms for alignment that are heuristic, or explore a restricted class of solutions. These give an upper bound on the alignment cost, but it is equally important to determine the quality of the solution obtained. In the absence of an optimal alignment with which to compare, lower bounds may be calculated to assess the quality of the alignment. As more effort is invested in improving upper bounds (alignment algorithms), it is therefore important to improve lower bounds as well. Although numerous cost metrics can be used to determine the quality of an alignment, many are based on sum-of-pairs (SP) measures and their generalizations. RESULTS: Two standard and two new methods are considered for using exact 2-way and 3-way alignments to compute lower bounds on total SP alignment cost; one new method fares well with respect to accuracy, while the other reduces the computation time. The first employs exhaustive computation of exact 3-way alignments, while the second employs an efficient heuristic to compute a much smaller number of exact 3-way alignments. Calculating all 3-way alignments exactly and computing their average improves lower bounds on sum of SP cost in v-way alignments. However judicious selection of a subset of all 3-way alignments can yield a further improvement with minimal additional effort. On the other hand, a simple heuristic to select a random subset of 3-way alignments (a random packing) yields accuracy comparable to averaging all 3-way alignments with substantially less computational effort. CONCLUSION: Calculation of lower bounds on SP cost (and thus the quality of an alignment) can be improved by employing a mixture of 3-way and 2-way alignments.
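
The standard 2-way bound mentioned in the abstract above can be illustrated with a small sketch. It is not the paper's implementation: it assumes unit costs (mismatch 1, gap against a residue 1, gap against gap 0), and the function names `edit_distance`, `sp_cost` and `pairwise_lower_bound` are hypothetical. Under these assumptions, every pair of rows in a multiple alignment induces a pairwise alignment that costs at least the optimal pairwise cost, so summing the optimal pairwise costs gives a lower bound on the SP cost of any alignment.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Optimal 2-way alignment cost under the assumed unit costs."""
    prev = list(range(len(b) + 1))  # cost of aligning "" with b[:j]
    for i, ca in enumerate(a, start=1):
        curr = [i]  # cost of aligning a[:i] with ""
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # ca aligned against a gap
                curr[j - 1] + 1,           # cb aligned against a gap
                prev[j - 1] + (ca != cb),  # match (0) or mismatch (1)
            ))
        prev = curr
    return prev[-1]


def sp_cost(alignment: list[str]) -> int:
    """SP cost of a given alignment: rows of equal length, '-' marks a gap."""
    total = 0
    for row1, row2 in combinations(alignment, 2):
        # A column costs 1 when the two symbols differ; gap/gap costs 0.
        total += sum(x != y for x, y in zip(row1, row2))
    return total


def pairwise_lower_bound(seqs: list[str]) -> int:
    """Sum of optimal pairwise costs: a lower bound on any SP alignment cost."""
    return sum(edit_distance(a, b) for a, b in combinations(seqs, 2))


seqs = ["ACGT", "AGT", "ACT"]
candidate = ["ACGT", "A-GT", "AC-T"]  # one (not necessarily optimal) alignment
print(pairwise_lower_bound(seqs), "<=", sp_cost(candidate))
```

The gap between the two printed numbers is the kind of quality measure the abstract describes: the heuristic alignment supplies the upper bound, and the pairwise sum supplies the lower bound. The 3-way methods in the abstract tighten the lower side and are not shown here.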

    Minimal Absent Words in Four Human Genome Assemblies

    Minimal absent words have been computed in genomes of organisms from all domains of life. Here, we aim to contribute to the catalogue of human genomic variation by investigating the variation in number and content of minimal absent words within a species, using four human genome assemblies. We compare the reference human genome GRCh37 assembly, the HuRef assembly of the genome of Craig Venter, the NA12878 assembly from cell line GM12878, and the YH assembly of the genome of a Han Chinese individual. We find the variation in number and content of minimal absent words between assemblies more significant for large and very large minimal absent words, where the biases of sequencing and assembly methodologies become more pronounced. Moreover, we find generally greater similarity between the human genome assemblies sequenced with capillary-based technologies (GRCh37 and HuRef) than between the human genome assemblies sequenced with massively parallel technologies (NA12878 and YH). Finally, as expected, we find the overall variation in number and content of minimal absent words within a species to be generally smaller than the variation between species.
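
For readers unfamiliar with the term used in the abstract above: a word is a minimal absent word of a sequence if it does not occur in the sequence but every proper factor of it does. The brute-force sketch below illustrates that definition on a toy string. It is not the method used in the paper, it would not scale to a genome, and it ignores the reverse-complement strand. The function name `minimal_absent_words` and the parameters `alphabet` and `max_len` are hypothetical.

```python
def minimal_absent_words(s: str, alphabet: str = "ACGT", max_len: int = 6) -> set[str]:
    """Return all minimal absent words of s with length at most max_len."""
    # Every substring of s with length 1..max_len, plus the empty word.
    factors = {
        s[i:j]
        for i in range(len(s))
        for j in range(i + 1, min(len(s), i + max_len) + 1)
    }
    factors.add("")

    maws = set()
    for prefix in factors:          # prefix plays the role of w[:-1], which occurs in s
        if len(prefix) >= max_len:
            continue                # extending it would exceed max_len
        for c in alphabet:
            w = prefix + c
            # w is minimal absent when w is absent but both its longest
            # proper prefix (w[:-1]) and longest proper suffix (w[1:]) occur.
            if w not in factors and w[1:] in factors:
                maws.add(w)
    return maws


print(sorted(minimal_absent_words("AAB", alphabet="AB")))
```

On the toy input, "AAA" qualifies because "AA" occurs but "AAA" does not, while "ABA" does not qualify because its factor "BA" is itself absent.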

    Application of Homozygosity Haplotype Analysis to Genetic Mapping with High-Density SNP Genotype Data

    BACKGROUND: In families segregating a monogenic genetic disorder with a single disease gene introduction, patients share a mutation-carrying chromosomal interval that is identical by descent (IBD). Such a shared chromosomal interval or haplotype, surrounding the actual pathogenic mutation, is typically detected and defined by multipoint linkage and phased haplotype analysis using microsatellite or SNP genotype data. High-density SNP genotype data presents a computational challenge for conventional genetic analyses. A novel non-parametric method termed Homozygosity Haplotype (HH) was recently proposed for the genome-wide search of the autosomal segments shared among patients using high-density SNP genotype data. METHODOLOGY/PRINCIPAL FINDINGS: The applicability and effectiveness of HH in identifying potential linkage of disease-causative genes with high-density SNP genotype data were studied with a series of monogenic disorders ascertained in eastern Canadian populations. The HH approach was validated using the genotypes of patients from a family affected with a rare autosomal dominant disease, Schnyder crystalline corneal dystrophy. HH accurately detected the approximately 1 Mb genomic interval encompassing the causative gene UBIAD1 using the genotypes of only four affected subjects. The successful application of HH to identify the potential linkage for a family with pericentral retinal disorder indicates that HH can be applied to perform family-based association analysis by treating affected and unaffected family members as cases and controls respectively. A new strategy for the genome-wide screening of known causative genes or loci with HH was proposed, as shown by its application to a myoclonus dystonia cohort and a renal failure cohort. CONCLUSIONS/SIGNIFICANCE: Our study of the HH approach demonstrates that HH is very efficient and effective in identifying potential disease-linked regions. HH has the potential to be used as an efficient alternative approach to sequencing or microsatellite-based fine mapping for screening the known causative genes in genetic disease study.

    TagCleaner: Identification and removal of tag sequences from genomic and metagenomic datasets

    BACKGROUND: Sequencing metagenomes that were pre-amplified with primer-based methods requires the removal of the additional tag sequences from the datasets. The sequenced reads can contain deletions or insertions due to sequencing limitations, and the primer sequence may contain ambiguous bases. Furthermore, the tag sequence may be unavailable or incorrectly reported. Because of the potential for downstream inaccuracies introduced by unwanted sequence contaminations, it is important to use reliable tools for pre-processing sequence data. RESULTS: TagCleaner is a web application developed to automatically identify and remove known or unknown tag sequences allowing insertions and deletions in the dataset. TagCleaner is designed to filter the trimmed reads for duplicates, short reads, and reads with high rates of ambiguous sequences. An additional screening for and splitting of fragment-to-fragment concatenations that gave rise to artificial concatenated sequences can increase the quality of the dataset. Users may modify the different filter parameters according to their own preferences. CONCLUSIONS: TagCleaner is a publicly available web application that is able to automatically detect and efficiently remove tag sequences from metagenomic datasets. It is easily configurable and provides a user-friendly interface. The interactive web interface facilitates export functionality for subsequent data processing, and is available at http://edwards.sdsu.edu/tagcleaner.

    The Consumption, Production and Regulation of Alcohol in the UK: The Relevance of the Ambivalence of the Carnivalesque

    Alcohol consumption in 21st-century Britain is of significant interest to government, media and academics. Some have referred to a ‘new culture of intoxication’ or ‘calculated hedonism’, fostered by the drinks industry, and enabled by a neoliberal policymaking context. This article argues that the ‘carnivalesque’ is a better concept through which to understand alcohol’s place in British society today. The concept of the carnivalesque conveys an earthy yet extraordinary culture of drinking, as well as ritual elements with a lack of comfort and security that characterise the night-time economy for many people. This night-time carnival, as well as being something experienced by participants, is also a spectacle, with gendered and classed dynamics. It is suggested that this concept is helpful in making sense of common understandings of alcohol that run through the spheres not only of alcohol consumption but also production and regulation.

    Construction, Concentration, and (Dis)Continuities in Social Valuations

    I review and integrate recent sociological research that makes progress on three interrelated questions pertaining to social valuation: (a) the degree of social construction relative to objective constraints; (b) the degree of concentration in social valuations at a single point in time; and (c) the conditions that govern two broad forms of temporal discontinuity: (i) fashion cycles, especially in cultural expression and in managerial practices, and (ii) bubble/crash dynamics, as witnessed in such domains as authoritarian regimes and financial markets. In the course of the review, I argue for the importance of identifying how objective conditions constrain social construction and suggest two contrarian mechanisms by which this is accomplished (valuation opportunism and valuation entrepreneurship) and the conditions under which they are more or less effective.

    Current anti-doping policy: a critical appraisal

    BACKGROUND: Current anti-doping in competitive sports is advocated for reasons of fair-play and concern for the athlete's health. With the inception of the World Anti-Doping Agency (WADA), anti-doping effort has been considerably intensified. Resources invested in anti-doping are rising steeply and increasingly involve public funding. Most of the effort concerns elite athletes with much less impact on amateur sports and the general public. DISCUSSION: We review this recent development of increasingly severe anti-doping control measures and find them based on questionable ethical grounds. The ethical foundation of the war on doping consists of largely unsubstantiated assumptions about fairness in sports and the concept of a "level playing field". Moreover, it relies on dubious claims about the protection of an athlete's health and the value of the essentialist view that sports achievements reflect natural capacities. In addition, costly anti-doping efforts in elite competitive sports concern only a small fraction of the population. From a public health perspective this is problematic since the high prevalence of uncontrolled, medically unsupervised doping practiced in amateur sports and doping-like behaviour in the general population (substance use for performance enhancement outside sport) exposes greater numbers of people to potential harm. In addition, anti-doping has pushed doping and doping-like behaviour underground, thus fostering dangerous practices such as sharing needles for injection. Finally, we argue that the involvement of the medical profession in doping and anti-doping challenges the principles of non-maleficence and of privacy protection. As such, current anti-doping measures potentially introduce problems of greater impact than are solved, and place physicians working with athletes or in anti-doping settings in an ethically difficult position. In response, we argue on behalf of enhancement practices in sports within a framework of medical supervision. SUMMARY: Current anti-doping strategy is aimed at eradication of doping in elite sports by means of all-out repression, buttressed by a war-like ideology similar to the public discourse sustaining international efforts against illicit drugs. Rather than striving for eradication of doping in sports, which appears to be an unattainable goal, a more pragmatic approach aimed at controlled use and harm reduction may be a viable alternative to cope with doping and doping-like behaviour.