292 research outputs found

    Multidimensional annotation of the Escherichia coli K-12 genome

    The annotation of the Escherichia coli K-12 genome in the EcoCyc database is one of the most accurate, complete and multidimensional genome annotations. Of the 4460 E. coli genes, EcoCyc assigns biochemical functions to 76%, and 66% of all genes had their functions determined experimentally. EcoCyc assigns E. coli genes to Gene Ontology and to MultiFun. Seventy-five percent of gene products contain reviews authored by the EcoCyc project that summarize the experimental literature about the gene product. EcoCyc information was derived from 15 000 publications. The database contains extensive descriptions of E. coli cellular networks, describing its metabolic, transport and transcriptional regulatory processes. A comparison to genome annotations for other model organisms shows that the E. coli genome contains the most experimentally determined gene functions in both relative and absolute terms: 2941 (66%) for E. coli, 2319 (37%) for Saccharomyces cerevisiae, 1816 (5%) for Arabidopsis thaliana, 1456 (4%) for Mus musculus and 614 (4%) for Drosophila melanogaster. Database queries to EcoCyc survey the global properties of E. coli cellular networks and illuminate the extent of information gaps for E. coli, such as dead-end metabolites. EcoCyc provides a genome browser with novel properties, and a novel interactive display of transcriptional regulatory networks

    A proposal for a coordinated effort for the determination of brainwide neuroanatomical connectivity in model organisms at a mesoscopic scale

    In this era of complete genomes, our knowledge of neuroanatomical circuitry remains surprisingly sparse. Such knowledge is, however, critical for both basic and clinical research into brain function. Here we advocate for a concerted effort to fill this gap through systematic, experimental mapping of neural circuits at a mesoscopic scale of resolution suitable for comprehensive, brain-wide coverage, using injections of tracers or viral vectors. We detail the scientific and medical rationale and briefly review existing knowledge and experimental techniques. We define a set of desiderata, including brain-wide coverage; validated and extensible experimental techniques suitable for standardization and automation; a centralized, open-access data repository; compatibility with existing resources; and tractability with current informatics technology. We discuss a hypothetical but tractable plan for mouse, additional efforts for the macaque, and technique development for human. We estimate that the mouse connectivity project could be completed within five years with a comparatively modest budget. Comment: 41 pages

    A Visual Spreadsheet using HTML5 for Whole Genome Display

    Modern sequencing technology has enabled the cheap, rapid production of whole genomes. There is a need for visualization tools that show the data collected about a whole genome, such as genes, proteins, annotations, and expression data. Several common approaches exist: the genome browser, where sequence features are displayed as visual elements in tracks aligned with their genome coordinates; visual networks, where data elements are represented as nodes and relationships as edges; and the traditional spreadsheet, where each row captures textual information about a gene or genome, such as identifiers, descriptions, or sequences. Our study focuses on the last approach and introduces some advanced features. To build the system, commonly used similar systems were reviewed, and during implementation some software artifacts, such as JavaScript libraries, were reused to reduce the complexity of software development. An incremental method was used to develop the webpage: collecting the data from the AspGD database, analyzing them, coding, and then testing them one at a time. Our research group studies fungal genomes, so the spreadsheets were tested by displaying each of the Aspergilli genomes in the AspGD database (www.aspgd.org). We have developed CGene and CGenome, pronounced See-Gene and See-Genome respectively, as HTML5 web-based spreadsheets that can incorporate visual displays, as well as text, within the spreadsheet cells. Current displays use Scalable Vector Graphics (SVG) to present these spreadsheets, which are generated from standard GFF3 files, standard output files from InterProScan, and AspGD files (the AspGD Gene Ontology Annotations File and the Chromosomal Feature File). All these files are analyzed to present them in a visual way that requires less effort to understand. The main aim of our study is to take advantage of the human ability to recognize patterns. The user can see the genes or genomes of interest as a row-by-row visualization. This can play a powerful role in easing the understanding of quantitative data by replacing them with graphical figures that make comparison easier
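
The spreadsheet idea in the abstract above (one row per gene, with an SVG graphic inside a cell, generated from GFF3 input) can be illustrated with a minimal sketch. This is not the CGene/CGenome implementation: the function names, the sample GFF3 records, and the choice to draw each gene as a bar scaled to a fixed region length are all assumptions made for illustration.

```python
from html import escape


def parse_gff3_genes(text):
    """Return (seqid, start, end, strand, gene_id) for each 'gene' feature."""
    genes = []
    for line in text.splitlines():
        # Skip blank lines and GFF3 directives/comments.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        # GFF3 records have nine tab-separated columns; column 3 is the type.
        if len(cols) != 9 or cols[2] != "gene":
            continue
        # Column 9 holds semicolon-separated key=value attributes.
        attrs = dict(p.split("=", 1) for p in cols[8].split(";") if "=" in p)
        genes.append(
            (cols[0], int(cols[3]), int(cols[4]), cols[6], attrs.get("ID", ""))
        )
    return genes


def gene_cell_svg(start, end, span, width=200, height=12):
    """Draw one gene as a bar scaled to a region of `span` bases (hypothetical layout)."""
    x = (start - 1) / span * width
    w = max(1.0, (end - start + 1) / span * width)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
        f'<rect x="{x:.1f}" y="2" width="{w:.1f}" height="{height - 4}"/></svg>'
    )


def render_table(genes, span):
    """Build an HTML table with one row per gene and an SVG graphic in the last cell."""
    rows = []
    for seqid, start, end, strand, gene_id in genes:
        rows.append(
            f"<tr><td>{escape(gene_id)}</td><td>{escape(seqid)}</td>"
            f"<td>{escape(strand)}</td><td>{gene_cell_svg(start, end, span)}</td></tr>"
        )
    return "<table>\n" + "\n".join(rows) + "\n</table>"


# Made-up sample records, not taken from AspGD.
SAMPLE_GFF3 = (
    "##gff-version 3\n"
    "chr1\texample\tgene\t101\t300\t.\t+\t.\tID=geneA;Name=hypA\n"
    "chr1\texample\tmRNA\t101\t300\t.\t+\t.\tID=mrnaA;Parent=geneA\n"
    "chr1\texample\tgene\t501\t1000\t.\t-\t.\tID=geneB\n"
)

if __name__ == "__main__":
    print(render_table(parse_gff3_genes(SAMPLE_GFF3), span=1000))
```

Running the script prints an HTML table that a browser can display directly; a real tool would also need to handle multiple sequences, URL-encoded attribute values, and the other input files the abstract mentions.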

    BioIMAX : a Web2.0 approach to visual data mining in bioimage data

    Loyek C. BioIMAX : a Web2.0 approach to visual data mining in bioimage data. Bielefeld: Universität Bielefeld; 2012

    Knowledge visualization: From theory to practice

    Visualizations have been known as efficient tools that can help users analyze complex data. However, understanding the displayed data and finding underlying knowledge is still difficult. In this work, a new approach is proposed based on understanding the definition of knowledge. Although there are many definitions used in different areas, this work focuses on representing knowledge as a part of a visualization and showing the benefit of adopting knowledge representation. Specifically, this work begins with understanding interaction and reasoning in visual analytics systems, then a new definition of knowledge visualization and its underlying knowledge conversion processes are proposed. The definition of knowledge is differentiated as either explicit or tacit knowledge. Instead of directly representing data, the value of the explicit knowledge associated with the data is determined based on a cost/benefit analysis. In accordance with its importance, the knowledge is displayed to help the user understand the complex data through visual analytical reasoning and discovery