
    The UCSC Proteome Browser

    The University of California Santa Cruz (UCSC) Proteome Browser provides a wealth of protein information presented in graphical images and with links to other protein-related Internet sites. The Proteome Browser is tightly integrated with the UCSC Genome Browser. For the first time, Genome Browser users have both the genome and proteome worlds at their fingertips simultaneously. The Proteome Browser displays tracks of protein and genomic sequences, exon structure, polarity, hydrophobicity, locations of cysteine and glycosylation potential, Superfamily domains and amino acids that deviate from normal abundance. Histograms show genome-wide distribution of protein properties, including isoelectric point, molecular weight, number of exons, InterPro domains and cysteine locations, together with specific property values of the selected protein. The Proteome Browser also provides links to gene annotations in the Genome Browser, the Known Genes details page and the Gene Sorter; domain information from Superfamily, InterPro and Pfam; three-dimensional structures at the Protein Data Bank and ModBase; and pathway data at KEGG, BioCarta/CGAP and BioCyc. As of August 2004, the Proteome Browser is available for human, mouse and rat proteomes. The browser may be accessed from any Known Genes details page of the Genome Browser at http://genome.ucsc.edu. A user's guide is also available on this website.

    UCSC genome browser tutorial

    The University of California Santa Cruz (UCSC) Genome Bioinformatics website consists of a suite of free, open-source, on-line tools that can be used to browse, analyze, and query genomic data. These tools are available to anyone who has an Internet browser and an interest in genomics. The website provides a quick and easy-to-use visual display of genomic data. It places annotation tracks beneath genome coordinate positions, allowing rapid visual correlation of different types of information. Many of the annotation tracks are submitted by scientists worldwide; the others are computed by the UCSC Genome Bioinformatics group from publicly available sequence data. It also allows users to upload and display their own experimental results or annotation sets by creating a custom track. The suite of tools, downloadable data files, and links to documentation and other information can be found at http://genome.ucsc.edu/.

    UCSC Data Integrator and Variant Annotation Integrator.

    Two new tools on the UCSC Genome Browser web site provide improved ways of combining information from multiple datasets, optionally including the user's own custom track data and/or data from track hubs. The Data Integrator combines columns from multiple data tracks, showing all items from the first track along with overlapping items from the other tracks. The Variant Annotation Integrator is tailored to adding functional annotations to variant calls; it offers a more restricted set of underlying data tracks but adds predictions of each variant's consequences for any overlapping or nearby gene transcript. When available, it optionally adds further annotations, including effect prediction scores from dbNSFP for missense mutations, ENCODE regulatory summary tracks and conservation scores. Availability and implementation: The web tools are freely available at http://genome.ucsc.edu/ and the underlying database is available for download at http://hgdownload.cse.ucsc.edu/. The software (written in C and JavaScript) is available from https://genome-store.ucsc.edu/ and is freely available for academic and non-profit usage; commercial users must obtain a license. Contact: [email protected]. Supplementary information: Supplementary data are available at Bioinformatics online.
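
The Data Integrator behaviour described above (every item from the first track, plus whatever overlaps it in the other tracks) can be illustrated with a small sketch. This is not the UCSC implementation, which the abstract says is written in C and JavaScript. The `Item` type, the `integrate` function and the sample coordinates are hypothetical, and intervals are assumed to be half-open.

```python
from typing import NamedTuple


class Item(NamedTuple):
    """One annotation item on an assumed half-open interval [start, end)."""
    chrom: str
    start: int
    end: int
    name: str


def overlaps(a: Item, b: Item) -> bool:
    """True when both items share a chromosome and at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def integrate(primary, others):
    """Return one row per primary item: its name, then one column per
    other track holding the comma-joined names of overlapping items.
    Primary items with no overlaps are kept, with empty columns."""
    rows = []
    for item in primary:
        row = [item.name]
        for track in others:
            hits = [o.name for o in track if overlaps(item, o)]
            row.append(",".join(hits))
        rows.append(row)
    return rows


# Hypothetical sample data, for illustration only.
genes = [
    Item("chr1", 100, 200, "geneA"),
    Item("chr1", 500, 600, "geneB"),
]
variants = [
    Item("chr1", 150, 151, "var1"),
    Item("chr1", 190, 191, "var2"),
    Item("chr2", 150, 151, "var3"),
]

for row in integrate(genes, [variants]):
    print("\t".join(row))
```

The nested scan is quadratic and is only meant to make the join semantics concrete; a real tool working on genome-scale tracks would presumably use an indexed or sorted-sweep approach instead.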

    The ENCODE Project at UC Santa Cruz

    The goal of the Encyclopedia Of DNA Elements (ENCODE) Project is to identify all functional elements in the human genome. The pilot phase is for comparison of existing methods and for the development of new methods to rigorously analyze a defined 1% of the human genome sequence. Experimental datasets are focused on the origin of replication, DNase I hypersensitivity, chromatin immunoprecipitation, promoter function, gene structure, pseudogenes, non-protein-coding RNAs, transcribed RNAs, multiple sequence alignment and evolutionarily constrained elements. The ENCODE project at UCSC website is the primary portal for the sequence-based data produced as part of the ENCODE project. In the pilot phase of the project, over 30 labs provided experimental results for a total of 56 browser tracks supported by 385 database tables. The site provides researchers with a number of tools that allow them to visualize and analyze the data as well as download data for local analyses. This paper describes the portal to the data, highlights the data that has been made available, and presents the tools that have been developed within the ENCODE project. Access to the data and types of interactive analysis that are possible are illustrated through supplemental examples.

    ENCODE whole-genome data in the UCSC genome browser (2011 update)

    The ENCODE project is an international consortium with a goal of cataloguing all the functional elements in the human genome. The ENCODE Data Coordination Center (DCC) at the University of California, Santa Cruz serves as the central repository for ENCODE data. In this role, the DCC offers a collection of high-throughput, genome-wide data generated with technologies such as ChIP-Seq, RNA-Seq, DNA digestion and others. This data helps illuminate transcription factor-binding sites, histone marks, chromatin accessibility, DNA methylation, RNA expression, RNA binding and other cell-state indicators. It includes sequences with quality scores, alignments, signals calculated from the alignments, and in most cases, element or peak calls calculated from the signal data. Each data set is available for visualization and download via the UCSC Genome Browser (http://genome.ucsc.edu/). ENCODE data can also be retrieved using a metadata system that captures the experimental parameters of each assay. The ENCODE web portal at UCSC (http://encodeproject.org/) provides information about the ENCODE data and links for access.

    The UCSC cancer genomics browser: update 2011

    The UCSC Cancer Genomics Browser (https://genome-cancer.ucsc.edu) comprises a suite of web-based tools to integrate, visualize and analyze cancer genomics and clinical data. The browser displays whole-genome views of genome-wide experimental measurements for multiple samples alongside their associated clinical information. Multiple data sets can be viewed simultaneously as coordinated ‘heatmap tracks’ to compare across studies or different data modalities. Users can order, filter, aggregate, classify and display data interactively based on any given feature set including clinical features, annotated biological pathways and user-contributed collections of genes. Integrated standard statistical tools provide dynamic quantitative analysis within all available data sets. The browser hosts a growing body of publicly available cancer genomics data from a variety of cancer types, including data generated from the Cancer Genome Atlas project. Multiple consortiums use the browser on confidential prepublication data enabled by private installations. Many new features have been added, including the hgMicroscope tumor image viewer, hgSignature for real-time genomic signature evaluation on any browser track, and ‘PARADIGM’ pathway tracks to display integrative pathway activities. The browser is integrated with the UCSC Genome Browser, thus inheriting and integrating the Genome Browser’s rich set of human biology and genetics data that enhances the interpretability of the cancer genomics data.

    Gene Expression Variability within and between Human Populations and Implications toward Disease Susceptibility

    Variations in gene expression level might lead to phenotypic diversity across individuals or populations. Although many human genes are found to have differential mRNA levels between populations, the extent of gene expression that could vary within and between populations largely remains elusive. To investigate the dynamic range of gene expression, we analyzed the expression variability of ∼18,000 human genes across individuals within HapMap populations. Although ∼20% of human genes show differentiated mRNA levels between populations, our results show that expression variability of most human genes in one population is not significantly deviant from another population, except for a small fraction that do show substantially higher expression variability in a particular population. By associating expression variability with sequence polymorphism, intriguingly, we found SNPs in the untranslated regions (5′ and 3′UTRs) of these variable genes show consistently elevated population heterozygosity. We performed differential expression analysis on a genome-wide scale, and found substantially reduced expression variability for a large number of genes, prohibiting them from being differentially expressed between populations. Functional analysis revealed that genes with the greatest within-population expression variability are significantly enriched for chemokine signaling in HIV-1 infection, and for HIV-interacting proteins that control viral entry, replication, and propagation. This observation, combined with the finding that known human HIV host factors show substantially elevated expression variability, suggests that gene expression variability might explain differential HIV susceptibility across individuals.

    Finishing the euchromatic sequence of the human genome

    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.

    The UCSC Genome Browser database: 2014 update.
