21 research outputs found

    Performance of spliced alignments with GSNAP and STAR.

    No full text
    Alignments were performed with GobyWeb and the GSNAP or STAR alignment plugin on one 50 bp single-end RNA-Seq sample with about 43 million reads.

    Pathogen detection performance.

    No full text
    Pathogen detection took 53 min for the 72 RNA-Seq samples of Pickrell et al. N: number of samples in which GobyWeb identified at least one viral contig from the specified organism. See tag: DNOAOZI.

    GobyWeb user interface menus.

    No full text
    Increasing numbers indicate the typical order in which a user would navigate the interface, from data upload (1) to downloading results or sharing them with others (8). See the Supporting Information for a detailed description of each step.

    Scalable Table Views.

    No full text
    GobyWeb offers web-based table views that scale to tables of results with hundreds of millions of rows. Users can subset a table to keep specific columns, as well as rows that match complex filters on column values. This mechanism lets end-users work with very large tables and download only the subsets of interest, even over slow Internet connections. In this snapshot, the table viewer displays results from a base-level methylation analysis (tag = RQLDONK). The panel “Filtered list of elements” displays the current view of the table. The panel at the bottom lets end-users select which subset of columns to visualize or download. The filters help users identify columns by keyword. Text boxes under each column are used to enter filtering criteria for that column.
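The column subsetting and row filtering described in this caption can be illustrated with a minimal Python sketch. It is not GobyWeb's implementation: the function `filter_table`, the column names, and the sample values are all hypothetical. The rows are streamed one at a time, so the full table never has to be held in memory.

```python
import csv
import io


def filter_table(lines, keep_columns, row_filters):
    """Yield rows restricted to keep_columns, for rows that pass every filter.

    lines: iterable of tab-separated text lines, header first.
    row_filters: maps a column name to a predicate on that column's raw string value.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    for row in reader:
        # A row is kept only if all per-column criteria hold.
        if all(predicate(row[col]) for col, predicate in row_filters.items()):
            yield {col: row[col] for col in keep_columns}


# Made-up example data; these columns are assumptions, not GobyWeb's schema.
table = io.StringIO(
    "chromosome\tposition\tmethylation_rate\tcoverage\n"
    "chr1\t100\t0.92\t30\n"
    "chr1\t250\t0.10\t8\n"
    "chr2\t75\t0.85\t12\n"
)
filters = {
    "methylation_rate": lambda v: float(v) >= 0.8,
    "coverage": lambda v: int(v) >= 10,
}
subset = list(filter_table(table, ["chromosome", "position"], filters))
print(subset)
```

Because the function is a generator, the same filter could be applied to a file handle instead of the in-memory `io.StringIO` used here.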

    Overview of the system architecture.

    No full text
    An installation of GobyWeb relies on three pieces of infrastructure. (a) The web front-end is deployed as a Java web application on one or more application servers. Several servers can be used to scale the application up under heavy usage. (b) Meta-data about samples, alignments, analyses and users are stored persistently in a Database Management System (DBMS). (c) A compute grid is used to process large datasets efficiently. All datasets (reads, alignments, processed results) are stored as large files on local disks directly attached to each compute node and to the web application servers, as well as in a shared network file system. The software automatically transfers data between the shared file system and local storage disks and optimizes these transfers to maximize the overall analysis throughput of the system. The system relies on production-quality software components (Apache web server, Tomcat application server, Oracle/JDBC DBMS, Sun Grid Engine, Linux, and Network File System) that are already available and used in many academic institutions.

    Visualizing spliced RNA-Seq alignments done with GobyWeb and the GSNAP or STAR aligners.

    No full text
    This figure was constructed with the Integrative Genomics Viewer (IGV), which directly supports alignments in Goby format. Alignments in the Goby format are substantially smaller than in BAM format, and can be directly downloaded from GobyWeb for interactive visualization with IGV. The plot provides a visual comparison of spliced alignments generated with the GobyWeb GSNAP and STAR plugins over the LAD1 gene (human).

    Comparison of GobyWeb with other NGS Analysis Systems.

    No full text
    Notes:
    (1) Supports text files.
    (2) Supports any file format.
    (3) Supports Goby compressed file formats.
    (4) Supports BAM.
    (5) Provides a scripting language specialized for pipeline development.
    (6) Pipelines can be constructed by connecting components graphically.
    (7) Pipelines are optimized for analysis of large HTS datasets.
    (8) For example, installs Perl on Windows, but not on other platforms.
    (9) Data often needs to be installed on execution server.
    (10) Relies on remote web services.
    (11) Very few file formats support parallelization.
    (12) Runs analyses on a single server.
    (13) All analyses are fully parallel.
    (14) Deploys to Amazon AWS for Hadoop.
    (15) Some extensions support execution on grid.

    Spreading of aberrant methylation to neighboring probesets in the ABC samples.

    No full text
    (A) A schematic representation of how the genome was divided into blocks of genes to study spreading of altered DNA methylation. (B–C) Analysis of spreading of aberrant methylation within genomic neighborhoods. Loci “i” represent probesets that are significantly hypo- (black) or hyper-methylated (grey) in lymphoma samples compared to normal tissues, and loci “i±j” represent both the (i+j)-th and (i−j)-th neighbors of those probesets. For instance, when we focused on probeset #10 (i.e. i = 10), we analyzed spreading of aberrant methylation at probesets #5, 6, 7, 8, 9, 11, 12, 13, 14 and 15. Panel B displays the change in methylation states while panel C shows the change in IQR (variability between samples).
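The i±j neighbor indexing in this caption can be made concrete with a short Python sketch. The function name `neighbor_probesets` is hypothetical. The maximum offset of 5 is inferred from the probeset #10 example rather than stated in the caption, and the 1-based numbering and boundary handling are assumptions.

```python
def neighbor_probesets(i, max_offset=5, n_probesets=None):
    """Return the indices i-j and i+j for j = 1..max_offset, in ascending order.

    max_offset=5 is inferred from the caption's example, not stated in it.
    n_probesets, if given, is the (assumed) last valid probeset index in the block.
    """
    lower = [i - j for j in range(max_offset, 0, -1)]
    upper = [i + j for j in range(1, max_offset + 1)]
    neighbors = lower + upper
    # Drop indices that fall outside the block (assumed 1-based numbering).
    return [
        k for k in neighbors
        if k >= 1 and (n_probesets is None or k <= n_probesets)
    ]


print(neighbor_probesets(10))  # [5, 6, 7, 8, 9, 11, 12, 13, 14, 15]
```

The focal probeset itself is excluded, which matches the list given in the caption for i = 10.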

    Genome-wide patterns of aberrant methylation.

    No full text
    (A) Graphical explanation of how the distributions of M-scores and IQR are transformed into violin distribution plots to enable more efficient visualization and comparison of intra- and inter-sample variability. (B) Distribution of the methylation score (M-score, left) and inter-quartile ranges (IQR, right) at probesets in centromeric, telomeric, and intermediate regions for normal and diseased tissues. Bar width is proportional to the number of data points, and the colors are the same as in Figure 1A (http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1003137#pgen-1003137-g001). (C) Distributions of M-score (left) and IQR (right) are shown for gene-poor, gene-rich, and intermediate regions.

    The extent of DNA methylation aberration is predictive of patient survival.

    No full text
    (A) Phylogenetic tree, as estimated based on the correlation of group-averaged M-scores. Departure from normal methylation patterns is correlated with disease severity of the lymphoma samples. (B–C) Kaplan-Meier curves for risk groups defined according to their methylation distance score (i.e. distance from normal B-cells), which reflects how different a sample's methylation profile is from that of NBC or NGC, for all DLBCL (GCB and ABC) samples. (B) Multivariate analysis with the International Prognostic Index (IPI) and distance to NBC. (C) Only IPI.