27 research outputs found

MDQC: a new quality assessment method for microarrays based on quality control reports

Author: Balshaw Robert
Cohen Freue Gabriela V.
Hollander Zsuzsanna
Keown Paul
McManus Bruce
McMaster W. Robert
Ng Raymond T.
Scherer Andreas
Shen Enqing
Zamar Ruben H.
Publication date: 02/08/2017

Motivation: The process of producing microarray data involves multiple steps, some of which may suffer from technical problems and seriously damage the quality of the data. Thus, it is essential to identify those arrays with low quality. This article addresses two questions: (1) how to assess the quality of a microarray dataset using the measures provided in quality control (QC) reports; (2) how to identify possible sources of the quality problems. Results: We propose a novel multivariate approach to evaluate the quality of an array that examines the ‘Mahalanobis distance’ of its quality attributes from those of other arrays. Thus, we call it Mahalanobis Distance Quality Control (MDQC) and examine different approaches of this method. MDQC flags problematic arrays based on the idea of outlier detection, i.e. it flags those arrays whose quality attributes jointly depart from those of the bulk of the data. Using two case studies, we show that a multivariate analysis gives substantially richer information than analyzing each parameter of the QC report in isolation. Moreover, once the QC report is produced, our quality assessment method is computationally inexpensive and the results can be easily visualized and interpreted. Finally, we show that computing these distances on subsets of the quality measures in the report may increase the method's ability to detect unusual arrays and helps to identify possible reasons for the quality problems. Availability: The library to implement MDQC will soon be available from Bioconductor. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
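
The outlier-detection idea in this abstract can be illustrated with a minimal sketch. It is not the MDQC library: it assumes a plain leave-one-out scheme in which each array's QC attributes are compared with the classical mean and covariance of the other arrays, and all function names, the toy QC values, and the cutoff are hypothetical choices made for illustration.

```python
from statistics import mean


def _covariance(rows, center):
    """Sample covariance matrix (divisor n - 1) of the rows around center."""
    p, n = len(center), len(rows)
    return [
        [sum((r[i] - center[i]) * (r[j] - center[j]) for r in rows) / (n - 1)
         for j in range(p)]
        for i in range(p)
    ]


def _solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting.

    Raises ZeroDivisionError if the matrix is singular.
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def squared_distances(qc):
    """Squared Mahalanobis distance of each array from all the other arrays."""
    out = []
    for k, row in enumerate(qc):
        others = qc[:k] + qc[k + 1:]          # leave array k out (assumption)
        center = [mean(col) for col in zip(*others)]
        cov = _covariance(others, center)
        diff = [v - c for v, c in zip(row, center)]
        out.append(sum(d * s for d, s in zip(diff, _solve(cov, diff))))
    return out


def flag_arrays(qc, cutoff):
    """Indices of arrays whose squared distance exceeds the cutoff."""
    return [k for k, d2 in enumerate(squared_distances(qc)) if d2 > cutoff]


# Hypothetical QC report: two correlated quality attributes per array.
qc = [
    [1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.9],
    [5.0, 5.1], [6.0, 5.8], [7.0, 7.1],
    [2.0, 6.0],  # each value lies inside its attribute's range, but not jointly
]

# 9.21 is roughly the 99% chi-square quantile for two attributes;
# using it as the cutoff is an assumption of this sketch.
flagged = flag_arrays(qc, cutoff=9.21)
print(flagged)
```

The last array is unremarkable on either attribute alone, so a per-parameter check over the observed ranges would pass it, whereas the joint distance singles it out. A robust estimate of the center and covariance could replace the classical one used here.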

RERO DOC Digital Library

Predicting sepsis severity at first clinical presentation: The role of endotypes and mechanistic signatures

Author: An Andy
Baghela Arjun
Baker Andrew
Baquir Beverlie
Bouma Hjalmar R.
dos Santos Claudia C.
Falsafi Reza
Farmer Susan W.
Freue Gabriela V. Cohen
Hancock Robert E. W.
Hurlburt Andrew
Jimenez-Canizales Carlos Eduardo
Lee Amy H.
Mondragon-Cardona Alvaro
Pena Olga M.
Rivera Juan Diego
Shojaei Maryam
Tang Benjamin
Trahtemberg Uriel
Publication venue: 'Elsevier BV'
Publication date: 01/01/2022

BACKGROUND: Inter-individual variability during sepsis limits appropriate triage of patients. Identifying, at first clinical presentation, gene expression signatures that predict subsequent severity will allow clinicians to identify the most at-risk groups of patients and enable appropriate antibiotic use. METHODS: Blood RNA-Seq and clinical data were collected from 348 patients in four emergency rooms (ER) and one intensive care unit (ICU), and 44 healthy controls. Gene expression profiles were analyzed using machine learning and data mining to identify clinically relevant gene signatures reflecting disease severity, organ dysfunction, mortality, and specific endotypes/mechanisms. FINDINGS: Gene expression signatures were obtained that predicted severity/organ dysfunction and mortality in both ER and ICU patients with accuracy/AUC of 77–80%. Network analysis revealed these signatures formed a coherent biological program, with specific but overlapping mechanisms/pathways. Given the heterogeneity of sepsis, we asked if patients could be sorted into discrete groups with distinct mechanisms (endotypes) and varying severity. Patients with early sepsis could be stratified into five distinct and novel mechanistic endotypes, named Neutrophilic-Suppressive/NPS, Inflammatory/INF, Innate-Host-Defense/IHD, Interferon/IFN, and Adaptive/ADA, each based on ∼200 unique gene expression differences, and distinct pathways/mechanisms (e.g., IL6/STAT3 in NPS). Endotypes had varying overall severity with two severe (NPS/INF) and one relatively benign (ADA) groupings, consistent with reanalysis of previous endotype studies. A 40-gene classification tool (accuracy=96%) and several gene-pairs (accuracy=89–97%) accurately predicted endotype status in both ER and ICU validation cohorts. INTERPRETATION: The severity and endotype signatures indicate that distinct immune signatures precede the onset of severe sepsis and lethality, providing a method to triage early sepsis patients.

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

PubMed Central

Dissertations of the University of Groningen

Can we predict protein from mRNA levels?

Author: B Schwanhäusser
Christopher M. Overall
F Edfors
Gabriela V. Cohen Freue
JJ Li
M Friendly
M Wilhelm
Nikolaus Fortelny
Paul Pavlidis
Y Liu
Publication venue: 'Springer Science and Business Media LLC'

Crossref

PGCA: An algorithm to link protein groups created from MS/MS data

Author: Bruce McManus (128550)
David Kepplinger (4056793)
Derek Smith (399177)
Gabriela V. Cohen Freue (399175)
Mandeep Takhar (399181)
Mayu Sasaki (399179)
Raymond T. Ng (253624)
W. Robert McMaster (219114)
Zsuzsanna Hollander (229895)
Publication date: 01/01/2017

The quantitation of proteins using shotgun proteomics has gained popularity in the last decades, simplifying sample handling procedures, removing extensive protein separation steps and achieving a relatively high throughput readout. The process starts with the digestion of the protein mixture into peptides, which are then separated by liquid chromatography and sequenced by tandem mass spectrometry (MS/MS). At the end of the workflow, recovering the identity of the proteins originally present in the sample is often a difficult and ambiguous process, because more than one protein identifier may match a set of peptides identified from the MS/MS spectra. To address this identification problem, many MS/MS data processing software tools combine all plausible protein identifiers matching a common set of peptides into a protein group. However, this solution introduces new challenges in studies with multiple experimental runs, which can be characterized by three main factors: i) protein groups’ identifiers are local, i.e., they vary run to run, ii) the composition of each group may change across runs, and iii) the supporting evidence of proteins within each group may also change across runs. Since in general there is no conclusive evidence about the absence of proteins in the groups, protein groups need to be linked across different runs in subsequent statistical analyses. We propose an algorithm, called Protein Group Code Algorithm (PGCA), to link groups from multiple experimental runs by forming global protein groups from connected local groups. The algorithm is computationally inexpensive and enables the connection and analysis of lists of protein groups across runs needed in biomarker studies. We illustrate the identification problem and the stability of the PGCA mapping using 65 iTRAQ experimental runs. Further, we use two biomarker studies to show how PGCA enables the discovery of relevant candidate protein group markers with similar but non-identical compositions in different runs.
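
Forming global groups from connected local groups can be sketched as a connected-components problem. The code below is not the PGCA implementation: it assumes that two local groups are connected when they share at least one protein identifier, and the function name, the input layout, and the accessions `P1` to `P6` are hypothetical.

```python
def link_protein_groups(local_groups):
    """Assign a global group code to every (run, local_id) key.

    local_groups maps (run, local_id) to a non-empty set of protein
    identifiers. Local groups that share a protein, directly or through
    a chain of other groups, receive the same global code.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Proteins listed in the same local group belong to one component.
    for proteins in local_groups.values():
        ordered = sorted(proteins)
        find(ordered[0])  # registers single-protein groups as well
        for other in ordered[1:]:
            union(ordered[0], other)

    # Number the components in order of first appearance by sorted key.
    codes = {}
    result = {}
    for key in sorted(local_groups):
        root = find(min(local_groups[key]))
        result[key] = codes.setdefault(root, len(codes) + 1)
    return result


# Hypothetical local groups from three runs; local ids restart in each run.
local_groups = {
    ("run1", 1): {"P1", "P2"},
    ("run1", 2): {"P5"},
    ("run2", 1): {"P2", "P3"},
    ("run2", 2): {"P5", "P6"},
    ("run3", 1): {"P3"},
}

global_codes = link_protein_groups(local_groups)
for key in sorted(global_codes):
    print(key, "->", global_codes[key])
```

In this toy input, local group 1 of run3 contains only `P3` yet lands in the same global group as local group 1 of run1, because local group 1 of run2 connects `P2` and `P3`. Groups with similar but non-identical compositions are thereby compared under one code across runs.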

Directory of Open Access Journals

FigShare

A computational pipeline for the development of multi-marker bio-signature panels and ensemble classifiers

Author: Balshaw Robert F.
Chen Virginia
Cohen Freue Gabriela V.
Günther Oliver P.
Hollander Zsuzsanna
Keown Paul A.
McManus Bruce M.
McMaster W. R.
Ng Raymond T.
Takhar Mandeep
Tebbutt Scott J.
Publication venue: BioMed Central
Publication date: 06/08/2015

Background: Biomarker panels derived separately from genomic and proteomic data and with a variety of computational methods have demonstrated promising classification performance in various diseases. An open question is how to create effective proteo-genomic panels. The framework of ensemble classifiers has been applied successfully in various analytical domains to combine classifiers so that the performance of the ensemble exceeds the performance of individual classifiers. Using blood-based diagnosis of acute renal allograft rejection as a case study, we address the following question in this paper: Can acute rejection classification performance be improved by combining individual genomic and proteomic classifiers in an ensemble? Results: The first part of the paper presents a computational biomarker development pipeline for genomic and proteomic data. The pipeline begins with data acquisition (e.g., from bio-samples to microarray data), quality control, statistical analysis and mining of the data, and finally various forms of validation. The pipeline ensures that the various classifiers to be combined later in an ensemble are diverse and adequate for clinical use. Five mRNA genomic and five proteomic classifiers were developed independently using single time-point blood samples from 11 acute-rejection and 22 non-rejection renal transplant patients. The second part of the paper examines five ensembles ranging in size from two to 10 individual classifiers. Performance of ensembles is characterized by area under the curve (AUC), sensitivity, and specificity, as derived from the probability of acute rejection for individual classifiers in the ensemble in combination with one of two aggregation methods: (1) Average Probability or (2) Vote Threshold. One ensemble demonstrated superior performance and was able to improve sensitivity and AUC beyond the best values observed for any of the individual classifiers in the ensemble, while staying within the range of observed specificity. The Vote Threshold aggregation method achieved improved sensitivity for all five ensembles, but typically at the cost of decreased specificity. Conclusion: Proteo-genomic biomarker ensemble classifiers show promise in the diagnosis of acute renal allograft rejection and can improve classification performance beyond that of individual genomic or proteomic classifiers alone. Validation of our results in an international multicenter study is currently underway.
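
The two aggregation methods named in this abstract can be sketched as follows. The abstract does not give their exact definitions, so this is one plausible reading under stated assumptions: each classifier outputs a probability of acute rejection, a probability cutoff of 0.5 turns a probability into a call, and the Vote Threshold rule calls rejection when at least `min_votes` classifiers vote for it. The function names, the cutoff, `min_votes`, and the example probabilities are all hypothetical.

```python
def average_probability(probs, cutoff=0.5):
    """Average the classifiers' probabilities, then apply the cutoff.

    Returns (average probability, rejection call).
    """
    avg = sum(probs) / len(probs)
    return avg, avg >= cutoff


def vote_threshold(probs, cutoff=0.5, min_votes=1):
    """Count classifiers that individually call rejection.

    Returns (number of votes, rejection call). The ensemble calls
    rejection when at least min_votes classifiers vote for it.
    """
    votes = sum(p >= cutoff for p in probs)
    return votes, votes >= min_votes


# Hypothetical probabilities of acute rejection from four classifiers
# for a single patient.
probs = [0.2, 0.7, 0.4, 0.3]

avg, avg_call = average_probability(probs)
votes, vote_call = vote_threshold(probs, min_votes=1)

print(f"average probability = {avg:.2f}, rejection call = {avg_call}")
print(f"votes = {votes}, rejection call = {vote_call}")
```

With `min_votes=1`, a single confident classifier is enough to call rejection, while the averaged probability stays below the cutoff. That asymmetry is one way a vote rule can gain sensitivity at the cost of specificity, in line with the trade-off the abstract reports.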

University of British Columbia: cIRcle - UBC's Information Repository

Novel Blood-based Transcriptional Biomarker Panels Predict the Late-Phase Asthmatic Response

Author: Balshaw Robert
Boulet Louis-Philippe
Cohen Freue Gabriela V
FitzGerald J Mark
Gauvreau Gail M
Kim Young Woong
O'Byrne Paul M
Shannon Casey P
Singh Amrit
Tebbutt Scott J
Yang Chen Xi
Publication venue: 'American Thoracic Society'
Publication date: 15/02/2018

Crossref

The University of Manchester - Institutional Repository

Sizes of protein groups within experimental runs.

Author: Bruce McManus (128550)
David Kepplinger (4056793)
Derek Smith (399177)
Gabriela V. Cohen Freue (399175)
Mandeep Takhar (399181)
Mayu Sasaki (399179)
Raymond T. Ng (253624)
W. Robert McMaster (219114)
Zsuzsanna Hollander (229895)

Sizes of the local groups identified by Proteome Discoverer within each of the 12 spectral count-based experiments in Dataset B.

FigShare