Search CORE

36 research outputs found

A variable selection approach for highly correlated predictors in high-dimensional genomic data

Author: Lévy-Leduc Céline
Ternès Nils
Zhu Wencan
Publication date: 21/07/2020

In genomic studies, identifying biomarkers associated with a variable of interest is a major concern in biomedical research. Regularized approaches are classically used to perform variable selection in high-dimensional linear models. However, these methods can fail in highly correlated settings. We propose a novel variable selection approach called WLasso, taking these correlations into account. It consists in rewriting the initial high-dimensional linear model to remove the correlation between the biomarkers (predictors) and in applying the generalized Lasso criterion. The performance of WLasso is assessed using synthetic data in several scenarios and compared with recent alternative approaches. The results show that when the biomarkers are highly correlated, WLasso outperforms the other approaches in sparse high-dimensional frameworks. The method is also successfully illustrated on publicly available gene expression data in breast cancer. Our method is implemented in the WLasso R package which is available from the Comprehensive R Archive Network

arXiv.org e-Print Archive

HAL Descartes

Hal-Diderot
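
The abstract above describes rewriting the linear model so that the predictors are no longer correlated. A minimal sketch of that whitening step follows, assuming the transformation is a multiplication by the inverse square root of the predictor covariance matrix (a standard way to whiten; whether the WLasso package does exactly this is an assumption). The function name `whiten_design`, the toy equicorrelated covariance, and the low-dimensional sizes are all hypothetical and chosen for illustration. The true covariance is used here, whereas in practice it would have to be estimated, and the generalized Lasso fit itself is not shown.

```python
import numpy as np

def whiten_design(X, Sigma):
    """Return (X @ Sigma^{-1/2}, Sigma^{1/2}) for a symmetric positive-definite Sigma."""
    eigvals, eigvecs = np.linalg.eigh(Sigma)
    Sigma_inv_sqrt = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
    Sigma_sqrt = eigvecs @ np.diag(eigvals ** 0.5) @ eigvecs.T
    return X @ Sigma_inv_sqrt, Sigma_sqrt

rng = np.random.default_rng(0)
n, p, rho = 2000, 5, 0.9  # toy sizes, not a high-dimensional setting

# Assumed toy covariance: unit variances, pairwise correlation rho.
Sigma = np.full((p, p), rho) + (1 - rho) * np.eye(p)
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)

X_tilde, Sigma_sqrt = whiten_design(X, Sigma)

# Largest off-diagonal sample correlation before and after whitening.
max_off_before = np.abs(np.corrcoef(X, rowvar=False) - np.eye(p)).max()
max_off_after = np.abs(np.corrcoef(X_tilde, rowvar=False) - np.eye(p)).max()

# The rewritten model has the same fitted values:
# X @ beta equals X_tilde @ beta_tilde with beta_tilde = Sigma^{1/2} @ beta.
beta = np.array([2.0, 0.0, 0.0, -1.0, 0.0])
beta_tilde = Sigma_sqrt @ beta
same_fit = np.allclose(X @ beta, X_tilde @ beta_tilde)

print(max_off_before, max_off_after, same_fit)
```

Note that a sparse `beta` generally does not give a sparse `beta_tilde`, so a plain Lasso on the whitened design would penalize the wrong vector. This is consistent with the abstract's use of a generalized Lasso criterion rather than the ordinary one.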

Parenting style and children emotion management skills among Chinese children aged 3–6: the chain mediation effect of self-control and peer interactions

Author: Dexian Li
Wencan Li
Xingchen Zhu
Publication venue: Frontiers Media S.A.
Publication date: 01/09/2023

Drawing on ecosystem theory, which is based on the interaction of family environment, individual characteristics, and social adaptation, this study aimed to examine the effects of parenting style on emotion management skills and the mediating roles of self-control and peer interactions among Chinese children aged 3–6 years. Some studies have investigated the relationship between parenting style and emotion management skills. However, research on the underlying mechanisms is still deficient. A sample of 2,303 Chinese children completed the PSDQ-Short Version, the Self-Control Teacher Rating Questionnaire, the Peer Interaction Skills Scale, and the Emotion Management Skills Questionnaire. The results show that: (1) Authoritarian parenting style negatively predicted children’s emotion management skills, self-control, and peer interactions; (2) Authoritative parenting style positively predicted children’s emotion management skills, self-control, and peer interactions; (3) Structural equation models indicated that self-control and peer interactions partially mediated the effects of authoritarian and authoritative parenting styles. The parenting style of Chinese children aged 3–6 years is related to emotion management skills, and self-control and peer interactions have chain mediating effects between parenting style and children’s emotion management skills. These results provide further guidance for the prevention and intervention of emotional and mental health problems in children

Directory of Open Access Journals

Structural Behavior of Thin-Walled Concrete-Filled Steel Tube Used in Cable Tunnel: An Experimental and Numerical Investigation

Author: Bing Qu
Hetao Hou
Lei Chen
Su Ma
Wencan Zhu
Yanhong Liang
Yanjun Jin
Publication venue: Hindawi Limited
Publication date: 01/01/2015

One steel grid and five thin-walled concrete-filled steel tubes (CTST) used as tunnel supports were tested on site to investigate their mechanical behavior. The mechanical influences of thickness, node form, and concrete on the CTST were obtained and compared with their effects on the steel grid. The results indicate that the high anti-deformation capacity of the CTST quickly improved the stability of the surrounding rock. The cementitious grouted sleeve connection exhibited superior flexibility when the CTST was erected and built. Although the deformation of rock and soil in the tunnel kept increasing, the CTST with the new connection type showed good compression resistance. The vault, tube foot, and connections showed larger absolute strain values. Finite element analysis (FEA) was carried out using the ABAQUS program, and the results were validated by comparison with the experimental results. The FE model could serve as a reference for similar projects

Crossref

Directory of Open Access Journals

RNA-Seq reveals the key pathways and genes involved in the light-regulated flavonoids biosynthesis in mango (Mangifera indica L.) peel

Author: Aiping Gao
Bin Shi
Bin Zheng
Chengkun Yang
Hongxia Wu
Kaibing Zhou
Minjie Qian
Songbiao Wang
Wencan Zhu
Publication venue: Frontiers Media SA
Publication date: 01/01/2023

Introduction: Flavonoids are important water-soluble secondary metabolites in plants, and light is one of the most essential environmental factors regulating flavonoid biosynthesis. In a previous study, we found that bagging treatment significantly inhibited the accumulation of flavonols and anthocyanins but promoted proanthocyanidin accumulation in the fruit peel of the mango (Mangifera indica L.) cultivar ‘Sensation’, while the relevant molecular mechanism is still unknown. Methods: In this study, RNA-seq was conducted to identify the key pathways and genes involved in light-regulated flavonoid biosynthesis in mango peel. Results: By weighted gene co-expression network analysis (WGCNA), 16 flavonoid biosynthetic genes were found to be crucial for the biosynthesis of different flavonoid compositions under bagging treatment in mango. The higher expression level of LAR (mango026327) in bagged samples might be the reason why light inhibits proanthocyanidin accumulation in mango peel. MiMYB1, the MYB reported to positively regulate anthocyanin biosynthesis in mango, was also identified by WGCNA in this study. Apart from MYB and bHLH, ERF, WRKY and bZIP were the three most important transcription factors (TFs) involved in light-regulated flavonoid biosynthesis in mango, with both activators and repressors. Surprisingly, two HY5 transcripts, which are usually induced by light, showed higher expression levels in bagged samples. Discussion: Our results provide new insights into the regulatory effect of light on flavonoid biosynthesis in mango fruit peel

Directory of Open Access Journals

Développement de méthodes d'apprentissage statistique pour l'identification de biomarqueurs pronostiques et prédictifs à l'aide de données "-omiques" de grande dimension dans le domaine de la médecine de précision

Author: Zhu Wencan
Publication date: 26/09/2022

With the genomic revolution and the new era of precision medicine, the identification of biomarkers that are informative (i.e. active) for a response (endpoint) is becoming increasingly important in clinical research. These biomarkers are beneficial to better understand the progression of a disease (prognostic biomarkers) and to better identify patients more likely to benefit from a given treatment (predictive biomarkers). Biomarker data (e.g. genomics, transcriptomics, and proteomics) usually have a high-dimensional nature, with the number of measured biomarkers (variables) much larger than the sample size. However, only a fraction of biomarkers is truly active, therefore raising the need for variable selection. Among various statistical learning approaches, regularized methods such as Lasso have become very popular for high-dimensional variable selection due to their statistical and numerical performance. However, their selection consistency is not guaranteed when the biomarkers are highly correlated. Throughout my PhD, several novel regularized approaches were developed to perform variable selection in this challenging context. More precisely, four methods were proposed in different statistical models (linear regression model, ANCOVA-type model, and logistic regression model). The main idea is to remove the correlations by whitening the design matrix. For one of the methods, results of the sign consistency were established under mild conditions. The proposed approaches were evaluated through simulation studies and applications on publicly available datasets. The results suggest that our approaches perform better than the compared methods for selecting prognostic and predictive biomarkers in high-dimensional (correlated) settings. Three of our methods are implemented in the R packages: WLasso, PPLasso, and WLogit, available from the CRAN (Comprehensive R Archive Network)

Theses.fr

Développement de méthodes d'apprentissage statistique pour l'identification de biomarqueurs pronostiques et prédictifs à l'aide de données "-omiques" de grande dimension dans le domaine de la médecine de précision

Author: Zhu Wencan
Publication venue: HAL CCSD
Publication date: 26/09/2022

With the genomic revolution and the new era of precision medicine, the identification of biomarkers that are informative (i.e. active) for a response (endpoint) is becoming increasingly important in clinical research. These biomarkers are beneficial to better understand the progression of a disease (prognostic biomarkers) and to better identify patients more likely to benefit from a given treatment (predictive biomarkers). Biomarker data (e.g. genomics, transcriptomics, and proteomics) usually have a high-dimensional nature, with the number of measured biomarkers (variables) much larger than the sample size. However, only a fraction of biomarkers is truly active, therefore raising the need for variable selection. Among various statistical learning approaches, regularized methods such as Lasso have become very popular for high-dimensional variable selection due to their statistical and numerical performance. However, their selection consistency is not guaranteed when the biomarkers are highly correlated. Throughout my PhD, several novel regularized approaches were developed to perform variable selection in this challenging context. More precisely, four methods were proposed in different statistical models (linear regression model, ANCOVA-type model, and logistic regression model). The main idea is to remove the correlations by whitening the design matrix. For one of the methods, results of the sign consistency were established under mild conditions. The proposed approaches were evaluated through simulation studies and applications on publicly available datasets. The results suggest that our approaches are more performant than compared methods for selecting prognostic and predictive biomarkers in high-dimensional (correlated) settings. 
Three of our methods are implemented in the R packages: WLasso, PPLasso, and WLogit, available from the CRAN (Comprehensive R Archive Network)

Thèses en Ligne

Theses.fr

A variable selection approach for highly correlated predictors in high-dimensional genomic data

Author: Lévy-Leduc Céline
Ternès Nils
Zhu Wencan
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2021

In genomic studies, identifying biomarkers associated with a variable of interest is a major concern in biomedical research. Regularized approaches are classically used to perform variable selection in high-dimensional linear models. However, these methods can fail in highly correlated settings. We propose a novel variable selection approach called WLasso, taking these correlations into account. It consists in rewriting the initial high-dimensional linear model to remove the correlation between the biomarkers (predictors) and in applying the generalized Lasso criterion. The performance of WLasso is assessed using synthetic data in several scenarios and compared with recent alternative approaches. The results show that when the biomarkers are highly correlated, WLasso outperforms the other approaches in sparse high-dimensional frameworks. The method is also illustrated on publicly available gene expression data in breast cancer. Our method is implemented in the WLasso R package which is available from the Comprehensive R Archive Network (CRAN)

HAL Descartes

Identification of prognostic and predictive biomarkers in high-dimensional data with PPLasso

Author: Lévy-Leduc Céline
Ternès Nils
Zhu Wencan
Publication venue: HAL CCSD
Publication date: 07/02/2022

In clinical trials, identification of prognostic and predictive biomarkers is essential to precision medicine. Prognostic biomarkers can be useful for the prevention of the occurrence of the disease, and predictive biomarkers can be used to identify patients with potential benefit from the treatment. Previous research has mainly focused on clinical characteristics, and the use of genomic data in this area has hardly been studied. A new method is required to simultaneously select prognostic and predictive biomarkers in high-dimensional genomic data where biomarkers are highly correlated. We propose a novel approach called PPLasso (Prognostic Predictive Lasso) integrating prognostic and predictive effects into one statistical model. PPLasso also takes into account the correlations between biomarkers that can alter the biomarker selection accuracy. Our method consists in transforming the design matrix to remove the correlations between the biomarkers before applying the generalized Lasso. In a comprehensive numerical evaluation, we show that PPLasso outperforms the traditional Lasso approach on both prognostic and predictive biomarker identification in various scenarios. Finally, our method is applied to publicly available transcriptomic data from clinical trial RV144. Our method is implemented in the PPLasso R package which will soon be available from the Comprehensive R Archive Network (CRAN)

HAL Descartes

STUDY ON THE 3.5-CELL DC-SC PHOTO-INJECTOR

Author: Feng Zhu
Kui Zhao
Shengwen Quan
Wencan Xu
Publication date: 24/04/2020

To obtain a high-quality electron beam for the PKU-ERL-FEL project, a 3.5-cell DC-SC photo-injector was designed and optimized. The Pierce gun and the 3.5-cell superconducting Nb cavity are the DC acceleration section and the RF acceleration section, respectively. A tuner for the whole 3.5-cell superconducting cavity has been designed. The beam parameters of the 3.5-cell DC-SC photo-injector are also presented in this paper. The disadvantages and problems of the 1.5-cell DC-SC photo cathode injector, which was built for principle demonstration, have been overcome in the design of the 3.5-cell DC-SC photo cathode injector

CiteSeerX

Sign Consistency of the Generalized Elastic Net Estimator

Author: Adjakossa Eric,
Lévy-Leduc Céline
Ternès Nils
Zhu Wencan
Publication venue: HAL CCSD
Publication date: 08/06/2021

In this paper, we propose a novel variable selection approach in the framework of high-dimensional linear models where the columns of the design matrix are highly correlated. It consists in rewriting the initial high-dimensional linear model to remove the correlation between the columns of the design matrix and in applying a generalized Elastic Net criterion since it can be seen as an extension of the generalized Lasso. The properties of our approach called gEN (generalized Elastic Net) are investigated both from a theoretical and a numerical point of view. More precisely, we provide a new condition called GIC (Generalized Irrepresentable Condition) which generalizes the EIC (Elastic Net Irrepresentable Condition) of Jia and Yu (2010) under which we prove that our estimator can recover the positions of the null and non-null entries of the coefficients when the sample size tends to infinity. We also assess the performance of our methodology using synthetic data and compare it with alternative approaches. Our numerical experiments show that our approach improves the variable selection performance in many cases

HAL Descartes