
    Prediction area visualisation on the Small Round Blue Cell Tumors (SRBCT [35]) data, described in the Results section, with respect to the prediction distance.

    <p>From left to right: ‘Maximum distance’, ‘Centroid distance’ and ‘Mahalanobis distance’. Sample prediction area plots from a PLS-DA model applied to a microarray data set with the expression levels of 2,308 genes on 63 samples. Samples are classified into four classes: Burkitt Lymphoma (BL), Ewing Sarcoma (EWS), Neuroblastoma (NB), and Rhabdomyosarcoma (RMS).</p>
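To make the difference between two of the prediction distances concrete, here is a minimal sketch that is not taken from mixOmics. It assumes samples have already been projected onto two components, and it assigns a new sample to the nearest class centroid using either Euclidean (centroid) or Mahalanobis distance. The function names, the toy scores and the use of a pooled within-class covariance are all assumptions made for illustration. The ‘maximum distance’ rule is not covered.

```python
from math import sqrt

def centroids(scores, labels):
    """Mean score vector per class."""
    groups = {}
    for s, lab in zip(scores, labels):
        groups.setdefault(lab, []).append(s)
    return {lab: tuple(sum(c) / len(pts) for c in zip(*pts))
            for lab, pts in groups.items()}

def pooled_cov_2d(scores, labels, cents):
    """Pooled within-class 2x2 covariance (an assumption of this sketch)."""
    sxx = sxy = syy = 0.0
    for (x, y), lab in zip(scores, labels):
        cx, cy = cents[lab]
        sxx += (x - cx) ** 2
        sxy += (x - cx) * (y - cy)
        syy += (y - cy) ** 2
    dof = len(scores) - len(cents)
    return sxx / dof, sxy / dof, syy / dof

def predict(point, cents, cov=None):
    """Nearest centroid: Euclidean if cov is None, Mahalanobis otherwise."""
    def dist(c):
        dx, dy = point[0] - c[0], point[1] - c[1]
        if cov is None:
            return sqrt(dx * dx + dy * dy)
        sxx, sxy, syy = cov
        det = sxx * syy - sxy * sxy
        # d^2 = [dx dy] * inverse(cov) * [dx dy]^T for a 2x2 covariance
        return sqrt((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return min(cents, key=lambda lab: dist(cents[lab]))

# Hypothetical component scores: both classes are spread widely along the
# first component and tightly along the second.
scores = [(-2, 0.1), (0, -0.2), (2, 0.1), (1, 1.1), (3, 0.8), (5, 1.1)]
labels = ["BL", "BL", "BL", "EWS", "EWS", "EWS"]
cents = centroids(scores, labels)
cov = pooled_cov_2d(scores, labels, cents)

new_sample = (2.2, 0.0)
print(predict(new_sample, cents))       # Euclidean rule
print(predict(new_sample, cents, cov))  # Mahalanobis rule
```

On this toy data the two rules disagree for the new sample, because the Mahalanobis rule discounts distance along the high-variance first component. This is the kind of difference a prediction area plot makes visible.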

    Overview of the mixOmics multivariate methods for single and integrative ‘omics supervised analyses.

    <p><i>X</i> denotes a predictor ‘omics data set, and <i>y</i> a categorical outcome response (<i>e.g</i>. healthy <i>vs</i>. sick). Integrative analyses include <i>N</i>-integration with DIABLO (the same <i>N</i> samples are measured on different ‘omics platforms), and <i>P</i>-integration with MINT (the same <i>P</i> ‘omics predictors are measured in several independent studies). Sample plots depicted here use the mixOmics functions (from left to right) plotIndiv, plotArrow and plotIndiv in 3D; variable plots use the mixOmics functions network, cim, plotLoadings, plotVar and circosPlot. The graphical output functions are detailed in Supporting Information <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1005752#pcbi.1005752.s001" target="_blank">S1 Text</a>.</p>

    Illustration of <i>N</i>-integrative supervised analysis with DIABLO.

    <p><b>A</b>: sample plot per data set, <b>B</b>: sample scatterplot from plotDiablo displaying the first component in each data set (upper diagonal plot) and Pearson correlation between each component (lower diagonal plot). <b>C</b>: Clustered Image Map (Euclidean distance, Complete linkage) of the multi-omics signature. Samples are represented in rows, selected features on the first component in columns. <b>D</b>: Circos plot shows the positive (negative) correlation (<i>r</i> > 0.7) between selected features as indicated by the brown (black) links, feature names appear in the quadrants, <b>E</b>: Correlation Circle plot representing each type of selected features, <b>F</b>: relevance network visualisation of the selected features.</p>

    Example of computational time for the data sets presented in the Results section with a MacBook Pro 2013, 2.6 GHz, 16 GB RAM.


    Illustration of a single ‘omics analysis with mixOmics.

    <p><b>A) Unsupervised preliminary analysis with PCA</b>, <b>A1</b>: PCA sample plot, <b>A2</b>: percentage of explained variance per component. <b>B) Supervised analysis with PLS-DA</b>, <b>B1</b>: PLS-DA sample plot with confidence ellipse plots, <b>B2</b>: classification performance per component (overall and BER) for three prediction distances using repeated stratified cross-validation (10×5-fold CV). <b>C) Supervised analysis and feature selection with sparse PLS-DA</b>, <b>C1</b>: sPLS-DA sample plot with confidence ellipse plots, <b>C2</b>: arrow plot representing each sample pointing towards its outcome category, see more details in Supporting Information <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1005752#pcbi.1005752.s001" target="_blank">S1 Text</a>. <b>C3</b>: Clustered Image Map (Euclidean Distance, Complete linkage) where samples are represented in rows and selected features in columns (10, 300 and 30 genes selected on each component respectively), <b>C4</b>: ROC curve and AUC averaged using one-vs-all comparisons.</p>
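Panel B2 reports two error measures, overall and BER (balanced error rate). As a rough illustration of how they differ, the sketch below computes both from true and predicted labels. The helper name `error_rates` and the toy labels are hypothetical, and this is not the mixOmics implementation. BER is taken here as the average of the per-class error rates, which weights small classes as heavily as large ones.

```python
from collections import Counter

def error_rates(y_true, y_pred):
    """Return (overall error rate, balanced error rate)."""
    totals = Counter(y_true)  # number of samples per true class
    errors = Counter(t for t, p in zip(y_true, y_pred) if t != p)
    overall = sum(errors.values()) / len(y_true)
    # BER: mean of the per-class misclassification proportions
    ber = sum(errors[c] / totals[c] for c in totals) / len(totals)
    return overall, ber

# Hypothetical unbalanced example: 8 BL samples, 2 NB samples,
# with one NB sample misclassified as BL.
y_true = ["BL"] * 8 + ["NB"] * 2
y_pred = ["BL"] * 8 + ["BL", "NB"]
overall, ber = error_rates(y_true, y_pred)
print(overall, ber)
```

In this example the overall error rate is low because the large class is predicted perfectly, while the BER is higher because half of the small class is misclassified.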

    Summary of the eighteen multivariate projection-based methods available in mixOmics version 6.0.0 or above for different types of analysis frameworks.

    <p>Note that our block.pls/plsda and sparse variants differ from the approaches of [<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1005752#pcbi.1005752.ref028" target="_blank">28</a>–<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1005752#pcbi.1005752.ref031" target="_blank">31</a>]. The wrappers for rgcca and sgcca are originally from the RGCCA package [<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1005752#pcbi.1005752.ref032" target="_blank">32</a>] but the argument inputs were further improved for mixOmics.</p>

    Simulation results.

    <p>Averaged sensitivity for LMMSDE and LIMMA after 100 simulations. Differential expression between groups and/or time was tested with increasing noise and fold change (FC) levels.</p>

    A Linear Mixed Model Spline Framework for Analysing Time Course ‘Omics’ Data

    <div><p>Time course ‘omics’ experiments are becoming increasingly important to study system-wide dynamic regulation. Despite their high information content, analysis remains challenging. ‘Omics’ technologies capture quantitative measurements on tens of thousands of molecules. Therefore, in a time course ‘omics’ experiment molecules are measured for multiple subjects over multiple time points. This results in a large, high-dimensional dataset, which requires computationally efficient approaches for statistical analysis. Moreover, methods need to be able to handle missing values and various levels of noise. We present a novel, robust and powerful framework to analyze time course ‘omics’ data that consists of three stages: quality assessment and filtering, profile modelling, and analysis. The first step consists of removing molecules for which expression or abundance is highly variable over time. The second step models each molecular expression profile in a linear mixed model framework which takes into account subject-specific variability. The best model is selected through a serial model selection approach and results in dimension reduction of the time course data. The final step includes two types of analysis of the modelled trajectories, namely, clustering analysis to identify groups of correlated profiles over time, and differential expression analysis to identify profiles which differ over time and/or between treatment groups. Through simulation studies we demonstrate the high sensitivity and specificity of our approach for differential expression analysis. We then illustrate how our framework can bring novel insights on two time course ‘omics’ studies in breast cancer and kidney rejection. The methods are publicly available, implemented in the R CRAN package lmms.</p></div>

    Clustering of filter ratios on proteomic datasets.

    <p>Scatterplots of filter ratios <i>R</i><sub><i>T</i></sub> on the x-axis against <i>R</i><sub><i>I</i></sub> on the y-axis for <b>A</b>) the iTraq breast cancer dataset and <b>B</b>) and <b>C</b>) the iTraq kidney rejection dataset for groups Allograft Rejection (AR) and Non-Rejection (NR) respectively. Colors indicate clusters from a 2-cluster model-based clustering, with red squares indicating molecules that cluster as ‘informative’ and will remain in the analysis and blue circles indicating ‘non-informative’ molecules that will be removed prior to analysis.</p>

    Types of models used to summarize profiles.

    <p>The number (proportion) of profiles modelled with each model selected by our proposed LMMS approach. Models are abbreviated as linear (LIN), spline (SPL), subject-specific intercept (SSI), and subject-specific intercept and slope (SSIS). Models were applied to cell line breast cancer data (Cell), <i>Saccharomyces paradoxus</i> evolution data (Yeast), <i>Mus musculus</i> chemotherapy data (Mouse), and <i>Homo sapiens</i> kidney rejection Non-Rejection (NR) data (Human). The row ‘Removed’ indicates the percentage of filtered profiles using the 2-cluster model-based clustering on <i>R</i><sub><i>T</i></sub> and <i>R</i><sub><i>I</i></sub>.</p>