11 research outputs found

    <i>CPHL1,</i> a Novel Ceruloplasmin-Like Gene

    <div><p>Standard UCSC Genome Browser view of the <i>CP</i> locus showing a 90-kb “desert” separating it from the next known gene, <i>LOC116441,</i> and GESTALT view of the same locus, indicating the extent of the transcribed region predicted by ROAST (red bar in ROAST track) and the predicted gene structure for <i>CPHL1.</i> Interspersed repeats are color-coded, with red, green, pink, and brown bars representing Alu, MIR, LINE, and other repeats, respectively, and bar height indicating repeat age (younger repeats are taller); the megabase scale starts at the p telomere. The newly discovered gene overlaps with a gene structure predicted by Twinscan (chr3.151.005.a) but shares only seven of 21 exons, one imprecisely. GenScan predicts a much longer structure continuous with the <i>CP</i> gene, sharing 14 exons with <i>CPHL1</i>, of which ten are precisely predicted.</p><p>Inset: Phylogenetic analysis of the CP/CPHL1 family rooted using the hephaestin protein sequence as outgroup. Numbers above branches represent percentage bootstrap support over 1,000 replicates; the horizontal bar indicates 10% divergence along each branch.</p></div>

    A Third Basic Concept

    <p>By studying various sources of sequence information (pink boxes), genes have been identified using a variety of computational methods based on the identification of gene structure and/or the identification of sequence conservation. The FEAST methods represent a third basic concept, in which sustained transcriptional activity is inferred by its mutational and selective effects on the genomic sequence, the “transcriptional footprints.” Light blue boxes indicate the three basic concepts for gene prediction. The dashed vertical line separates gene prediction (to the left) from gene identification (to the right): the latter is based on the analysis of sequences expressed from the same locus.</p>

    The Highest-Scoring Novel Predicted Transcript, <i>LOC401237</i>

    <div><p>VISTA and GESTALT analyses of the <i>LOC401237</i> locus, showing sequence conservation with the mouse, chicken, and frog orthologous loci; the observed intron-exon structure of <i>LOC401237</i> and location of neighboring genes, with black circles representing CpG islands; the integrated FEAST scores for the forward (+) and reverse (−) strands, with the black arrow representing the calculated maximal segment; the repeat distribution on both strands, with red, green, pink, and brown bars, respectively, representing Alu, MIR, LINE, and other repeats, and bar height indicating repeat age (younger repeats are taller); the megabase scale, range 21.7 to 22.4 Mb from the p telomere.</p><p>Inset on top: Detail on the conserved intronic noncoding sequences, between two nonconserved exons.</p></div>

    FEAST Reanalysis of Known Genes

    <p>Scatterplot of FEAST scores versus gene length for known genes from the UCSC Genome Bioinformatics Site [<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.0020018#pcbi-0020018-b020" target="_blank">20</a>]. Genes overlapping known genes on the complementary strand were excluded. Scores greater than 3 are considered significant.</p>

    Genomewide Comparison of Gene Annotations

    <p>The matrix of disagreement measures for all pairs of annotation methods is represented by points in two dimensions using MDS. Filled black circles represent experimentally observed transcripts, the vast majority being in the “RNA” set. Triangles represent methods involving significant manual curation and/or based on the RNA set. “S,” “H,” and “F” represent methods based on gene structure prediction, hybrid methods (gene structure and sequence similarity), and methods measuring footprints of transcription, respectively. The combined FEAST method was excluded from the MDS analysis, and its projected location (squared F) was calculated later (see <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.0020018#s4" target="_blank">Materials and Methods</a>). Note that, like geographical maps of intercity distances, MDS representations have no axes.</p>

    Information Flow in FEAST

    <p>The genomic sequence is analyzed using RepeatMasker, yielding a masked sequence (studied for its base composition), a repeat table, and an alignment file, which is used to list mutations in repeats and to produce a “sequence mask.” Both the original sequence and the sequence mask are studied using polyadq, yielding tables of predicted PASs. The nucleotide composition of the unique sequence and the mutations within repeats are tabulated as well. The tables are then analyzed to calculate skews, which are finally used to produce predictive scores, separately for each method (Greens, ROAST, CHOWDER, and PASTA) or in combination (FEAST).</p>

    GESTALT View of the <i>AGBL1</i> Locus between the <i>AKAP13</i> and <i>NTRK3</i> Genes on Human Chromosome 15, 84.1 to 86.1 Mb from the p Telomere

    <p>PASTA, Greens, CHOWDER, and FEAST predictions are displayed for each strand in brown, green, pink, and red, respectively, with lighter shades indicating less significant scores. In the FEAST track, actual scores are indicated in red, and maximal segments are displayed in blue. The <i>AGBL1</i> gene structure was modeled based on translated sequence similarity to the AGTPBP1 protein.</p>

    FEAST Reanalysis of Existing Annotation

    <p>Success rates for FEAST reanalysis of known genes (top left), experimental gene annotations (center and bottom left), and gene predictions (right). Gene annotations were stratified by length into three classes: short (<10 kb), medium (10 to 100 kb), and long (>100 kb); the number of genes in each class is given above each bar. FEAST scores were stratified into three groups: nonsignificant (white, −2 < Z < 2), significant for the expected strand (shades of brown, Z > 2), and significant for the wrong strand (shades of red, Z < −2). The Z < −4 and Z > 4 bins include potentially large values as displayed in <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.0020018#pcbi-0020018-g002" target="_blank">Figure 2</a>. Columns labeled with asterisks include the gene regions longer than 100 kb remaining after subtraction of overlaps with known genes, on which FEAST had been trained.</p>
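The stratification described in this caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names, the example data, and the handling of the exact boundaries (a length of exactly 100 kb counted as medium, Z of exactly ±2 counted as nonsignificant) are hypothetical choices that the caption does not specify.

```python
from collections import Counter

def length_class(length_bp):
    """Length classes from the caption: short (<10 kb), medium (10 to 100 kb), long (>100 kb)."""
    if length_bp < 10_000:
        return "short"
    if length_bp <= 100_000:  # assumption: exactly 100 kb is counted as medium
        return "medium"
    return "long"

def score_class(z):
    """Score classes from the caption, relative to the annotated strand."""
    if z > 2:
        return "expected_strand"
    if z < -2:
        return "wrong_strand"
    return "nonsignificant"  # assumption: Z of exactly +2 or -2 falls here

def stratify(annotations):
    """Count annotations per (length class, score class) pair.

    `annotations` is an iterable of (length_bp, z) tuples.
    """
    counts = Counter()
    for length_bp, z in annotations:
        counts[(length_class(length_bp), score_class(z))] += 1
    return counts

# Hypothetical example data, not taken from the paper.
example = [(5_000, 0.4), (50_000, 3.1), (250_000, -2.7), (80_000, 4.5)]
table = stratify(example)
```

Dividing each count by the total for its length class would give the per-class success rates that the bars in the figure summarize.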

    FEAST Scores at Gene Boundaries

    <p>The average FEAST scores for known genes (thick black, <i>n</i> = 10,023), aligned at the position of gene start, show a sharp shift from nonsignificant values (near 0) outside the gene, to significant values at the 5′ end of the gene. The opposite shift is seen at the gene end, although it is more gradual. RNA cluster sequences (thin red, <i>n</i> = 13,749) show a very similar graph. Twinscan predictions (dashed green, <i>n</i> = 9,131) display positive FEAST scores outside the predicted regions, suggesting an underprediction of gene ends, particularly toward the 5′ end. Known genes, RNA clusters, and Twinscan predictions shorter than 20 kb were excluded from this analysis.</p>

    Glycocapture-Assisted Global Quantitative Proteomics (gagQP) Reveals Multiorgan Responses in Serum Toxicoproteome

    Blood is an ideal window for viewing our health and disease status. Because blood circulates throughout the entire body and carries secreted, shed, and excreted signature proteins from every organ and tissue type, the blood proteome can be used to achieve a comprehensive assessment of multiple-organ physiology and pathology. To date, the blood proteome has been frequently examined for diseases of individual organs; studies of compound insults affecting multiple organs, however, remain scarce. We believe that characterizing peripheral blood for organ-specific proteins affords a powerful strategy for early detection, staging, and monitoring of diseases and their treatments at a whole-body level. In this paper we test this hypothesis by examining a mouse model of acetaminophen (APAP)-induced hepatic and extra-hepatic toxicity. We used a glycocapture-assisted global quantitative proteomics (gagQP) approach to study serum proteins and validated our results using Western blot. We discovered both hepatic and extra-hepatic organ-specific proteins in mouse sera. Our validation showed that selected organ-specific proteins changed their blood concentration during the course of toxicity development and recovery. Interestingly, in a time-course study, the peak response time differed among proteins specific to different organs. The collected molecular information sheds light on a complex, dynamic, and interwoven multiorgan APAP toxicity. The developed technique and the identified protein markers are translatable to human studies. We hope our work can broaden the utility of blood proteomics in diagnosis and research of the whole-body response to pathogenic cues.