
35 research outputs found

Candidate long-range regulatory sites acting on the IL17-pathway genes TRAF3IP2 and IL17RA are associated with psoriasis

Author: Dand
Gordon
Grubert
Huffmeier
Joehanes
Kisiel
Langley
Lebwohl
Lek
Mumbach
Nair
Tang
Wenric
Zhu
Publication venue: 'Wiley'
Publication date: 01/11/2018

Crossref

University of Dundee Online Publications

VPS51 biallelic variants cause microcephaly with brain malformations: A confirmatory report

Author: Bours Vincent
Caberg Jean-Hubert
Drunat Séverine
El Ghouzzi Vincent
Hitayezu Janvier
Mutesa Leon
Passemard Sandrine
Uwineza Annette
Verloes Alain
Vial Yoann
Wenric Stephane
Publication venue: 'Elsevier BV'
Publication date: 01/08/2019

International audience

HAL - Normandie Université

Hal-Diderot

Normalization and correction for batch effects via RUV for RNA-seq data: practical implications for Breast Cancer Research

Author: Bours Vincent
Debit Ahmed
Josse Claire
Van Steen Kristel
Wenric Stéphane
Publication date: 01/05/2017

The whole transcriptome contains information about nonsense, missense, silent, in-frame, and frameshift mutations, as observed at the whole-exome level, as well as splicing and (allelic) gene-expression changes that are missed by DNA analysis. One important step in the analysis of gene expression data from RNA-seq is the detection of differential expression (DE). Several methods are available, and the choice is sometimes controversial. A reliable DE analysis that reduces false-positive DE genes and estimates gene expression levels accurately requires a suitable normalization approach, including correction for confounders. Several normalization methods have been proposed to correct for both within-sample and between-sample biases. RUV (Removing Unwanted Variation) is one of them and has the advantage of correcting for batch effects, including potentially unknown unwanted variation in gene expression. In this study, we compare RUV (RUVg using in silico negative control genes) with more commonly used RNA-seq normalization methods, such as DESeq2, edgeR, and UQ (upper quartile), on real-life Illumina paired-end sequencing data for Estrogen-Receptor-Positive (ER+) breast cancer tissues versus matched controls. The set of in silico empirical negative control genes for RUVg was defined as the set of least significant DE genes obtained from a first DE analysis performed before RUVg correction. Box plots of relative log expression (RLE) across samples and PCA plots show that RUVg performs well and stabilizes read counts across samples, with a clear clustering of biological replicates.
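The relative log expression (RLE) diagnostic mentioned in this abstract can be illustrated with a minimal sketch. It assumes the usual definition (each gene's log count minus that gene's median log count across samples); the function names, the pseudocount of 1, and the toy counts are illustrative assumptions, not taken from the study, and this is not the RUVg procedure itself.

```python
import math
from statistics import median


def relative_log_expression(counts, pseudocount=1.0):
    """Compute RLE values per gene.

    counts: dict mapping gene name -> list of read counts (one per sample).
    Returns a dict mapping gene name -> list of RLE values, i.e. the
    log2 count minus the gene's median log2 count across samples.
    The pseudocount (an assumption here) avoids log2(0).
    """
    rle = {}
    for gene, values in counts.items():
        logs = [math.log2(v + pseudocount) for v in values]
        gene_median = median(logs)
        rle[gene] = [x - gene_median for x in logs]
    return rle


def per_sample_medians(rle):
    """Median RLE per sample; values far from 0 suggest unwanted variation."""
    n_samples = len(next(iter(rle.values())))
    return [median(values[j] for values in rle.values())
            for j in range(n_samples)]


if __name__ == "__main__":
    # Hypothetical toy counts: the third sample has roughly doubled depth.
    toy_counts = {
        "geneA": [10, 12, 20],
        "geneB": [100, 110, 200],
        "geneC": [50, 45, 100],
    }
    toy_rle = relative_log_expression(toy_counts)
    print(per_sample_medians(toy_rle))
```

In an RLE box plot, well-normalized samples have distributions centered near zero with similar spread, so a sample whose median sits well above or below zero (like the third toy sample) stands out.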

Open Repository and Bibliography - Liège

Bioinformatics contribution to the analysis of omics data in the clinical, technical, and molecular domains of human cancer

Author: Wenric Stéphane
Publication venue: ULiège - Université de Liège
Publication date: 01/07/2017

Open Repository and Bibliography - Liège

Using Supervised Learning Methods for Gene Selection in RNA-Seq Case-Control Studies

Author: Ruhollah Shemirani
Stephane Wenric
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2018

Whole transcriptome studies typically yield large amounts of data, with expression values for all genes or transcripts of the genome. The search for genes of interest in a particular study setting can thus be a daunting task, usually relying on automated computational methods. Moreover, most biological questions imply that such a search should be performed in a multivariate setting, to take into account the relationships between genes. Differential expression analysis commonly yields large lists of genes deemed significant, even after adjustment for multiple testing, making the subsequent study possibilities extensive. Here, we explore the use of supervised learning methods to rank large ensembles of genes defined by their expression values measured with RNA-Seq in a typical two-class sample set. First, we use one of the variable importance measures generated by the random forests classification algorithm as a metric to rank genes. Second, we define the EPS (extreme pseudo-samples) pipeline, making use of VAEs (Variational Autoencoders) and regressors to extract a ranking of genes while leveraging the feature space of both virtual and comparable samples. We show that, on 12 cancer RNA-Seq datasets ranging from 323 to 1,210 samples, using either a random forests-based gene selection method or the EPS pipeline outperforms differential expression analysis for 9 and 8 out of the 12 datasets, respectively, in terms of identifying subsets of genes associated with survival. These results demonstrate the potential of supervised learning-based gene selection methods in RNA-Seq studies and highlight the need to use such multivariate gene selection methods alongside the widely used differential expression analysis. © 2018 Wenric and Shemirani
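The idea of ranking genes by a random-forest variable importance can be sketched in a heavily simplified form. The code below is not the authors' pipeline: it assumes a forest of depth-1 trees (stumps) grown on bootstrap samples with random gene subsets, and it scores each gene by its mean decrease in Gini impurity. All names (`gini`, `best_stump`, `rank_genes`) and the toy data are hypothetical.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_stump(X, y, rows, features):
    """Best single-threshold split among `features` on the sampled rows.

    Returns (impurity_decrease, feature_index); feature_index is None
    when no split reduces the impurity.
    """
    parent = gini([y[i] for i in rows])
    best = (0.0, None)
    for f in features:
        for t in sorted({X[i][f] for i in rows}):
            left = [y[i] for i in rows if X[i][f] <= t]
            right = [y[i] for i in rows if X[i][f] > t]
            if not left or not right:
                continue
            child = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if parent - child > best[0]:
                best = (parent - child, f)
    return best


def rank_genes(X, y, gene_names, n_trees=200, seed=0):
    """Rank genes by mean decrease in Gini impurity over a forest of stumps.

    X: list of samples, each a list of expression values (one per gene).
    y: list of 0/1 labels (control/case).
    """
    rng = random.Random(seed)
    n, p = len(X), len(gene_names)
    m = max(1, int(p ** 0.5))          # genes tried per tree (sqrt rule)
    importance = [0.0] * p
    for _ in range(n_trees):
        rows = [rng.randrange(n) for _ in range(n)]   # bootstrap sample
        features = rng.sample(range(p), m)            # random gene subset
        gain, f = best_stump(X, y, rows, features)
        if f is not None:
            importance[f] += gain
    order = sorted(range(p), key=lambda j: importance[j], reverse=True)
    return [(gene_names[j], importance[j] / n_trees) for j in order]


if __name__ == "__main__":
    # Hypothetical toy data: one gene separates the classes, three are noise.
    noise = random.Random(1)
    labels = [0] * 10 + [1] * 10
    samples = [[10.0 if lab else 1.0] + [noise.random() for _ in range(3)]
               for lab in labels]
    for name, score in rank_genes(samples, labels, ["G_signal", "G_n1", "G_n2", "G_n3"]):
        print(name, round(score, 4))
```

A full random forest grows deeper trees and offers several importance measures; the stump version is used here only to keep the ranking mechanism visible in a few lines.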

Frontiers - Publisher Connector

Open Repository and Bibliography - Liège

A miRNA expression based diagnostic tool for breast cancer using random forests

Author: Bours Vincent
Freres Pierre
Jerusalem Guy
Josse Claire
Wenric Stéphane
Publication date: 09/12/2013

We developed a novel diagnostic tool for breast cancer using circulating miRNA expression levels as features in a supervised machine learning problem. We showed very good results on an independent validation cohort. Circulating microRNAs in breast cancer

Open Repository and Bibliography - Liège

Table_4_Using Supervised Learning Methods for Gene Selection in RNA-Seq Case-Control Studies.XLSX

Author: Ruhollah Shemirani (5591672)
Stephane Wenric (3538613)
Publication date: 01/01/2018


Frontiers - Publisher Connector

Open Repository and Bibliography - Liège

The Francis Crick Institute