    Twinscan: A Software Package for Homology-Based Gene Prediction

    A complete mapping from genome to proteome would constitute a foundation for genome-based biology and provide targets for pharmaceutical and therapeutic intervention. This is one reason gene structure prediction has been a major subfield of computational biology for over 20 years. Many of the widely used gene prediction systems were developed in the 1990s and are unable to take advantage of the revolution in comparative genomics brought on by the sequencing of the entire genomes of an increasing number of vertebrates. Twinscan is a new system for high-throughput gene-structure prediction that exploits the patterns of conservation observed in alignments between a target genomic sequence and its homologous sequence in other organisms. The approach employs a symbolic conservation sequence that effectively combines many local alignments into a single global alignment. This has several important properties that make Twinscan particularly useful for high-throughput gene prediction. For mammals, Twinscan has been shown to be significantly more accurate and reliable by all measures than any non-comparative genomic method. Twinscan is based on, and includes as a component, the same hidden Markov model topology as Genscan, a popular non-homology-based gene prediction program. Twinscan has an object-oriented design and is implemented in the C++ programming language. Twinscan’s three major components are probabilistic models of the DNA sequence and of the conservation sequence, together with a dynamic programming framework. Both the models and the computational structure are complicated aggregate classes. In this report, the design and implementation of Twinscan are described at the source-code level for the first time.
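
    The idea of collapsing many local alignments into one symbolic conservation sequence can be illustrated with a minimal sketch. It is not Twinscan's C++ implementation. The three-symbol alphabet ('|' match, ':' mismatch or informant gap, '.' unaligned), the function name, the input layout, and the rule that earlier alignments in the list take priority are all assumptions made for illustration.

```python
def conservation_sequence(target_len, alignments):
    """Project local alignments onto one symbol per target base.

    alignments: list of (target_start, target_aligned, informant_aligned),
    assumed to be ordered best-first; the aligned strings have equal
    length and may contain '-' for gaps. This layout is hypothetical.
    """
    symbols = ["."] * target_len          # '.' = no alignment covers this base
    for start, target_aln, informant_aln in alignments:
        pos = start
        for t, i in zip(target_aln, informant_aln):
            if t == "-":
                continue                  # gap in target: no target base consumed
            if symbols[pos] == ".":       # keep the symbol from an earlier alignment
                symbols[pos] = "|" if t == i else ":"
            pos += 1
    return "".join(symbols)


# Target ACGTACGTAC with two overlapping local alignments.
example = conservation_sequence(
    10,
    [(2, "GTAC", "GTTC"), (4, "AC-GT", "ACAGA")],
)
print(example)  # ..||:||:..
```

    The result has exactly one symbol per target base, which is what would let a probabilistic model read it in parallel with the DNA sequence.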

    Improving Data Collection and Documentation within a Post-Hospital Discharge Follow-Up Phone Call Program

    Hospital discharge with quality patient education and instruction is paramount for successful recovery and for reducing readmissions. Identifying gaps in the hospital discharge process can lead to strategies for improving it. A project was designed to improve data collection, documentation of discrepancies (gaps/events), and analysis of data relating to patient and process outcomes of a post-discharge follow-up phone call program. Interventions included a newly implemented algorithm to outline a phone call workflow, refine data collection, and evaluate outcomes. Data collection that increases the identification of gaps in care from discharge to follow-up aids in improving patient outcomes.

    Chapter Functional Annotation of Rare Genetic Variants

    Genome-wide association studies have successfully identified a growing number of common variants that robustly associate with a wide range of complex diseases and phenotypes. In the majority of cases though, the variants are predicted to have small to modest effect sizes, and, due to the technologies used, many of the signals discovered so far may not be the causal loci. As rare variation studies begin to explore the lower ranges of the allele frequency spectrum, using whole genome or whole exome sequencing to capture a larger proportion of variants, we expect to find variants with a more direct causal role in the phenotype(s) of interest. Interpreting possible functional mechanisms linking variants with phenotypes will become increasingly important.

    Using several pair-wise informant sequences for de novo prediction of alternatively spliced transcripts

    BACKGROUND: As part of the ENCODE Genome Annotation Assessment Project (EGASP), we developed the MARS extension to the Twinscan algorithm. MARS is designed to find human alternatively spliced transcripts that are conserved in only one or a limited number of extant species. MARS is able to use an arbitrary number of informant sequences and predicts a number of alternative transcripts at each gene locus. RESULTS: MARS uses the mouse, rat, dog, opossum, chicken, and frog genome sequences as pairwise informant sources for Twinscan and combines the resulting transcript predictions into genes based on coding (CDS) region overlap. Based on the EGASP assessment, MARS is one of the more accurate dual-genome prediction programs. Compared to the GENCODE annotation, we find that predictive sensitivity increases, while specificity decreases, as more informant species are used. MARS correctly predicts alternatively spliced transcripts for 11 of the 236 multi-exon GENCODE genes that are alternatively spliced in the coding region of their transcripts. For these genes a total of 24 correct transcripts are predicted. CONCLUSION: The MARS algorithm is able to predict alternatively spliced transcripts without the use of expressed sequence information, although the number of loci in which multiple predicted transcripts match multiple alternatively spliced transcripts in the GENCODE annotation is relatively small.
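
    Combining per-informant transcript predictions into genes by CDS overlap can be sketched as single-linkage clustering. The abstract does not give the exact overlap criterion MARS uses, so this sketch assumes that two transcripts belong to the same gene if they are on the same strand and any of their CDS intervals share at least one base. The function names, the transcript identifiers, and the half-open coordinate convention are hypothetical.

```python
def cds_overlap(cds_a, cds_b):
    """True if any half-open (start, end) interval in cds_a overlaps one in cds_b."""
    return any(s1 < e2 and s2 < e1 for s1, e1 in cds_a for s2, e2 in cds_b)


def cluster_into_genes(transcripts):
    """Group transcripts into genes by transitive CDS overlap.

    transcripts: dict mapping transcript id -> (strand, [(start, end), ...]).
    Returns a sorted list of sorted id lists, one list per gene.
    """
    ids = sorted(transcripts)
    parent = {t: t for t in ids}          # union-find forest

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Pairwise comparison is quadratic; acceptable for a sketch.
    for n, a in enumerate(ids):
        for b in ids[n + 1:]:
            strand_a, cds_a = transcripts[a]
            strand_b, cds_b = transcripts[b]
            if strand_a == strand_b and cds_overlap(cds_a, cds_b):
                parent[find(a)] = find(b)

    genes = {}
    for t in ids:
        genes.setdefault(find(t), []).append(t)
    return sorted(genes.values())


# One hypothetical prediction per informant run.
predictions = {
    "mouse.1": ("+", [(100, 200), (300, 400)]),
    "rat.1": ("+", [(150, 250)]),
    "dog.1": ("+", [(380, 500)]),
    "chicken.1": ("-", [(120, 180)]),
    "frog.1": ("+", [(900, 950)]),
}
print(cluster_into_genes(predictions))
```

    In this example rat.1 and dog.1 do not overlap each other directly but are joined through mouse.1, while the minus-strand and distant predictions remain separate genes.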

    Sharp contact corners, fretting and cracks

    Contacts with sharp edges subject to oscillatory loading are likely to nucleate cracks from the corners, if the loading is sufficiently severe. To a first approximation, the corners behave like notches, where the local elastic behaviour is relieved by plasticity, and which in turn causes irreversibilities that give rise to crack nucleation, but also by frictional slip. One question we aim to answer here is: when is the frictional slip enveloped by plastic slip, so that the corner is effectively a notch in a monolithic material? We do this by employing the classical Williams asymptotic solution to model the contact corner, and, in doing so, we render the solution completely general in the sense that it is independent of the overall geometry of the components. We then re-define the independent parameters describing the properties of the Williams solution by using the inherent length scale, a procedure that was described at the first IJFatigue and FFEMS joint workshop [1]. By proceeding in this way, we can provide a self-contained solution that can be ‘pasted in’ to any complete contact problem, and hence the likelihood of crack nucleation, and the circumstances under which it might occur, can be classified. Further, this reformulation of Williams' solution provides a clear means of obtaining the strength (defined by crack nucleation conditions) of a material pair with a particular contact angle. This means that the results from a test carried out using a laboratory specimen may easily be carried over to any complicated contact problem found in engineering practice, and a mechanical test of the prototypical geometry, which may often be quite difficult, is avoided.

    Uncovering information on expression of natural antisense transcripts in Affymetrix MOE430 datasets

    BACKGROUND: The function and significance of the widespread expression of natural antisense transcripts (NATs) are largely unknown. The ability to quantitatively assess changes in NAT expression for many different transcripts in multiple samples would facilitate our understanding of this relatively new class of RNA molecules. RESULTS: Here, we demonstrate that standard expression analysis Affymetrix MOE430 and HG-U133 GeneChips contain hundreds of probe sets that detect NATs. Probe sets carrying a "Negative Strand Matching Probes" annotation in NetAffx were validated using Ensembl by manual and automated approaches. More than 50% of the 1,113 probe sets with "Negative Strand Matching Probes" on the MOE430 2.0 GeneChip were confirmed as detecting NATs. Expression of selected antisense transcripts as indicated by Affymetrix data was confirmed using strand-specific RT-PCR. Thus, Affymetrix datasets can be mined to reveal information about the regulated expression of a considerable number of NATs. In a correlation analysis of 179 sense-antisense (SAS) probe set pairs using publicly available data from 1,637 MOE430 2.0 GeneChips, a significant number of SAS transcript pairs were found to be positively correlated. CONCLUSION: Standard expression analysis Affymetrix GeneChips can be used to measure many different NATs. The large number of samples deposited in microarray databases represents a valuable resource for a quantitative analysis of NAT expression and regulation in different cells, tissues and biological conditions.
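
    A correlation analysis over sense-antisense probe set pairs could look roughly like the sketch below. The abstract does not say which correlation measure or significance test was used, so Pearson's r is an assumption here, and the probe set names, data layout, and function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return float("nan")               # undefined for a constant profile
    return cov / (sd_x * sd_y)


def correlate_pairs(expression, pairs):
    """Return {(sense_id, antisense_id): r} for each probe set pair.

    expression: dict mapping probe set id -> list of values, one per chip,
    with chips in the same order for every probe set.
    """
    return {(s, a): pearson(expression[s], expression[a]) for s, a in pairs}


# Toy data: four chips, two hypothetical sense/antisense pairs.
expression = {
    "sense_A": [1.0, 2.0, 3.0, 4.0],
    "anti_A": [2.0, 4.0, 6.0, 8.0],
    "sense_B": [1.0, 2.0, 3.0, 4.0],
    "anti_B": [4.0, 3.0, 2.0, 1.0],
}
pairs = [("sense_A", "anti_A"), ("sense_B", "anti_B")]
for pair, r in correlate_pairs(expression, pairs).items():
    print(pair, round(r, 3))
```

    With real data, each coefficient would still need a significance assessment across the full set of chips before a pair is called positively correlated.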

    Consistent annotation of gene expression arrays

    BACKGROUND: Gene expression arrays are valuable and widely used tools for biomedical research. Today's commercial arrays attempt to measure the expression level of all of the genes in the genome. Effectively translating the results from the microarray into a biological interpretation requires an accurate mapping between the probesets on the array and the genes that they are targeting. Although major array manufacturers provide annotations of their gene expression arrays, the methods used by various manufacturers are different and the annotations are difficult to keep up to date in the rapidly changing world of biological sequence databases. RESULTS: We have created a consistent microarray annotation protocol applicable to all of the major array manufacturers. We constantly keep our annotations updated with the latest Ensembl Gene predictions, and thus cross-referenced with a large number of external biomedical sequence database identifiers. We show that these annotations are accurate and address in detail reasons for the minority of probesets that cannot be annotated. Annotations are publicly accessible through the Ensembl Genome Browser and programmatically through the Ensembl Application Programming Interface. They are also seamlessly integrated into the BioMart data-mining tool and the biomaRt package of BioConductor. CONCLUSIONS: Consistent, accurate and updated gene expression array annotations remain critical for biological research. Our annotations facilitate accurate biological interpretation of gene expression profiles.