5 research outputs found

    Statistical Viewer: a tool to upload and integrate linkage and association data as plots displayed within the Ensembl genome browser

    BACKGROUND: To facilitate efficient selection and prioritization of candidate complex disease susceptibility genes for association analysis, increasingly comprehensive annotation tools are essential to integrate, visualize and analyze the vast quantities of disparate data generated by genomic screens, public human genome sequence annotation and ancillary biological databases. We have developed a plug-in package for Ensembl called "Statistical Viewer" that facilitates the analysis of genomic features and annotation in the regions of interest defined by linkage analysis. RESULTS: Statistical Viewer is an add-on package to the open-source Ensembl Genome Browser and Annotation System that displays disease study-specific linkage and/or association data as two-dimensional plots in new panels in the context of Ensembl's Contig View and Cyto View pages. An enhanced upload server facilitates the upload, in the form of Excel files, of statistical data as well as additional feature annotation to be displayed in DAS tracks. The Statistical View panel, drawn directly under the ideogram, shows LOD score values for markers from a study of interest plotted against their position in base pairs. A module called "Get Map" converts the genetic locations of markers to genomic coordinates. The graph, placed under the corresponding ideogram, features a synchronized vertical sliding selection box that is integrated into Ensembl's Contig View and Cyto View pages and is used to choose the region displayed in Ensembl's "Overview" and "Detailed View" panels. To resolve association and fine-mapping data plots, a "Detailed Statistic View" plot corresponding to the "Detailed View" may be displayed underneath.
CONCLUSION: Features mapping to regions of linkage are accentuated when Statistic View is used in conjunction with the Distributed Annotation System (DAS) to display supplemental laboratory information, such as differentially expressed disease genes, in private data tracks. Statistic View is a novel and powerful visual feature that enhances Ensembl's utility as a valuable resource for integrative genomic-based approaches to the identification of candidate disease susceptibility genes. At present there are no other tools that provide for the visualization of two-dimensional plots of quantitative data scores against genomic coordinates in the context of a primary public genome annotation browser.
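The abstract does not say how "Get Map" converts genetic locations to genomic coordinates. One common way to do such a conversion is linear interpolation between flanking markers whose genetic (cM) and physical (bp) positions are both known. The sketch below illustrates that general technique only; it is not the tool's actual method, and the function name `cm_to_bp` and the reference values are hypothetical.

```python
from bisect import bisect_left

# Hypothetical reference markers: (genetic position in cM, physical position in bp),
# sorted by genetic position. The values are invented for illustration.
REFERENCE_MAP = [(0.0, 1_000_000), (10.0, 5_000_000), (25.0, 20_000_000)]


def cm_to_bp(cm, reference=REFERENCE_MAP):
    """Linearly interpolate a base-pair position for a genetic position in cM."""
    cms = [c for c, _ in reference]
    if not cms[0] <= cm <= cms[-1]:
        raise ValueError("position lies outside the reference map")
    i = bisect_left(cms, cm)
    if cms[i] == cm:
        # Exact hit on a reference marker: no interpolation needed.
        return reference[i][1]
    # Interpolate between the two flanking reference markers.
    (c0, b0), (c1, b1) = reference[i - 1], reference[i]
    return round(b0 + (cm - c0) / (c1 - c0) * (b1 - b0))
```

With these invented reference markers, a marker at 5 cM lies halfway between the first two entries, so it would be plotted at the midpoint of their base-pair positions.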

    A History of Genomics across Species, Communities and Projects

    Dissemination and visualisation of biological data

    With the recent waves of technological advances, the amount of biological data being generated has exploded. As a consequence of this data deluge, new challenges have emerged in the field of biological data management. To maximize the knowledge extracted from the huge amount of biological data produced, it is of great importance for the research community that the challenges of data dissemination and visualisation are tackled. Opening and sharing our data and working collaboratively will benefit the scientific community as a whole, and to move towards that end, new developments, tools and techniques are needed. Nowadays, many small research groups are capable of producing important and interesting datasets. The release of those datasets can greatly increase their scientific value. In addition, the development of new data analysis algorithms benefits greatly from the availability of a large corpus of annotated datasets for training and testing purposes, giving new and better algorithms to the biomedical sciences in return. None of this would be feasible without large amounts of biological data made freely and publicly available. Dissemination: The Distributed Annotation System (DAS) is a protocol designed to publish and integrate annotations on biological entities in a distributed way. DAS is structured as a client-server system in which the client retrieves data from one or more servers in order to process and visualise it further. Nowadays, setting up a DAS server imposes some requirements not met by many research groups. With the aim of removing the hassle of setting up a DAS server, a new software platform has been developed: easyDAS. easyDAS is a hosted platform that creates DAS servers automatically. Using a simple web interface, the user can upload a data file and describe its contents; a new DAS server is then created automatically and the data becomes publicly available to DAS clients.
Visualisation: Genome browsers are one of the most broadly used visualisation paradigms for genomic data. A genome browser is capable of displaying different sets of features positioned relative to a sequence. It is possible to explore the sequence and the features by moving around and zooming in and out. When this project was started, in 2007, all major genome browsers offered a rather static experience. It was possible to browse and explore data, but this was done through a set of buttons that moved the view a fixed number of bases to the left or right or zoomed in and out. From an architectural point of view, all web-based genome browsers were very similar: they all had a relatively thin client-side part in charge of showing images and big backend servers taking care of everything else. Every change in the display parameters made by the user triggered a request to the server, which hurt the perceived responsiveness. We created a new prototype genome browser called GenExp, an interactive web-based browser with canvas-based client-side data rendering. It offers fluid, direct interaction with the genome representation: the mouse can be used to drag it, and the mouse wheel changes the zoom level. GenExp also offers some quite unique features, such as its multi-window capabilities, which allow a user to create an arbitrary number of independent or linked genome windows, and its ability to save and share browsing sessions. GenExp is a DAS client and all data is retrieved from DAS sources. It is possible to add any available DAS data source, including all data in Ensembl and UCSC and even custom sources created with easyDAS. In addition, we developed a JavaScript DAS client library, jsDAS. jsDAS is a complete DAS client library that takes care of everything DAS-related in a JavaScript application. jsDAS is agnostic of other JavaScript libraries and can be used to add DAS capabilities to any web application.
All software developed in this thesis is freely available under an open source license. Recent technological improvements have led to an explosion in the amount of biological data being generated and to new challenges in the field of biological data management. To maximise the knowledge we can extract from these huge quantities of data, we need to solve the problems associated with their analysis, and in particular with their dissemination and visualisation. Sharing these data freely and at no cost can greatly benefit the scientific community and society in general, but doing so requires new tools and techniques. Currently, many groups are able to generate large datasets, and publishing them can greatly increase their scientific value. In addition, the availability of large datasets is necessary for the development of new analysis algorithms. It is therefore important that the biological data being generated be accessible in a simple, standardised and free way. Dissemination: The Distributed Annotation System (DAS) is a protocol designed for the publication and integration of annotations on biological entities in a distributed way. DAS follows a client-server scheme in which the client obtains data from one or more servers in order to combine, process or visualise them. Today, however, creating a DAS server requires knowledge and infrastructure beyond the resources of many research groups. For this reason we created easyDAS, a platform for the automatic creation of DAS servers. With easyDAS, a user can create a DAS server through a simple web interface in just a few clicks. Visualisation: Genome browsers are one of the most widely used paradigms for visualising genomic data and make it possible to view datasets positioned along a sequence. These data can be explored by moving along the sequence.
When this project started, in 2007, all the major genome browsers offered limited interactivity based on the use of buttons. From an architectural point of view, all web-based browsers were very similar: a simple client in charge of showing the images and a complex server in charge of obtaining the data, processing them and generating the images. Thus, every change in the display parameters required a new request to the server, with a very negative impact on the perceived response speed. We created a prototype genome browser called GenExp. It is an interactive web-based browser that uses canvas for client-side drawing and offers direct manipulation of the genome representation. GenExp also has some unique features, such as the ability to create multiple visualisation windows and to save and share browsing sessions. Moreover, because it is a DAS client, it can integrate data from any DAS server, such as those of Ensembl, UCSC or even those created with easyDAS. In addition, we developed jsDAS, the first complete DAS client library written in JavaScript. jsDAS can be integrated into any DAS application to give it access to data from DAS servers. All software developed within this thesis is freely available under an open source license.
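The client side of the DAS exchange described in this abstract can be sketched in a few lines. The example assumes the DAS `features` command with a `segment=REF:start,stop` query and a heavily simplified DASGFF-style XML response. The server URL, source name and sample document are invented for illustration, and the helper names `features_url` and `parse_features` are hypothetical. This is not the jsDAS API, which is a JavaScript library.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote


def features_url(server, source, segment, start, stop):
    """Build a DAS 'features' request URL for one segment range."""
    base = server.rstrip("/")
    return f"{base}/das/{quote(source)}/features?segment={quote(str(segment))}:{start},{stop}"


def parse_features(xml_text):
    """Extract (id, type, start, end) tuples from a DASGFF-style response."""
    root = ET.fromstring(xml_text)
    features = []
    for feat in root.iter("FEATURE"):
        features.append((
            feat.get("id"),
            feat.findtext("TYPE"),
            int(feat.findtext("START")),
            int(feat.findtext("END")),
        ))
    return features


# Simplified, invented response; real servers return more elements and attributes.
SAMPLE_RESPONSE = """<DASGFF><GFF><SEGMENT id="1" start="1000" stop="2000">
<FEATURE id="f1"><TYPE id="gene">gene</TYPE><START>1200</START><END>1800</END></FEATURE>
</SEGMENT></GFF></DASGFF>"""
```

A client such as a genome browser would issue one request of this kind per DAS source and visible region, then draw the parsed features locally instead of asking a server for a rendered image.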

    Requirements of Modern Genome Browsers

    Genome browsers are widely used tools for the visualization of a genome and related data. The demands placed on genome browsers by the size, variety, and complexity of the data produced by modern biotechnology are increasing. These demands are poorly understood and are not documented. Our study establishes and documents a clear set of requirements for genome browsers. The study reviewed all widely used genome browsers, as well as notable research prototypes. This involved a review of the literature, executing typical uses of the genome browsers, program comprehension, reverse engineering, and code analysis. The key outcome of the study is a clear set of requirements in the form of a requirements document that conforms to the IEEE Std 830-1998 standard for Software Requirements Specifications. It contains a domain model of concepts, the functional requirements as use cases, a definition of visualizations as metaphors, glyphs, or icons, a formal specification of the system in Z notation, and a specification of all widely used file formats. Genome browsers share a set of basic features such as display, scroll, zoom, and search. However, they differ in their performance, maturity level and implementation technologies. Our requirements also document the major non-functional requirements. The outcome of our study can be used in several ways: as a guide for future developers of genome browsers; as the basis for future enhancements of features in existing genome browsers; and as motivation for the invention of new algorithms, data structures, or file formats for implementations.