
68 research outputs found

FracSim: An R Package to Simulate Multifractional Lévy Motions

Author: Serge Cohen
Sébastien Déjean

In this article a procedure is proposed to simulate fractional fields, which are non-Gaussian counterparts of the fractional Brownian motion. These fields, called real harmonizable (multi)fractional Lévy motions, allow fixing the Hölder exponent at each point. FracSim is an R package developed in the R and C languages. Parallel computers have also been used.

Research Papers in Economics

CCA: An R Package to Extend Canonical Correlation Analysis

Author: Alain Baccini
Ignacio González
Pascal G. P. Martin
Sébastien Déjean

Canonical correlation analysis (CCA) is an exploratory statistical method to highlight correlations between two data sets acquired on the same experimental units. The cancor() function in R (R Development Core Team 2007) performs the core of the computations, but further work was required to provide the user with additional tools to facilitate the interpretation of the results. We implemented an R package, CCA, freely available from the Comprehensive R Archive Network (CRAN, http://CRAN.R-project.org/), to develop numerical and graphical outputs and to enable the user to handle missing values. The CCA package also includes a regularized version of CCA to deal with data sets with more variables than units. Illustrations are given through the analysis of a data set coming from a nutrigenomic study in the mouse.

Research Papers in Economics

Learning to Choose the Best System Configuration in Information Retrieval: the case of repeated queries

Author: Bigot Anthony
Déjean Sébastien
Mothe Josiane
Publication venue: Consortium J.UCS
Publication date: 01/01/2015

This paper presents a method that automatically decides which system configuration should be used to process a query. This method is developed for the case of repeated queries and implements a new kind of meta-system. It is based on a training process: the meta-system learns the best system configuration to use on a per-query basis. After training, the meta-search system knows which configuration should treat a given query. The Learning to Choose method we developed selects the best configurations among many. This selective process rests on data analytics applied to system parameter values and their link with system effectiveness. Moreover, we optimize the parameters on a per-query basis. The training phase uses a limited number of document relevance judgments. When the query is repeated or when an equal-query is submitted to the system, the meta-system automatically knows which parameters it should use to treat the query. This method fits the case of changing collections since what is learnt is the relationship between a query and the best parameters to use to process it, rather than the relationship between a query and documents to retrieve. In this paper, we describe how data analysis can help to select among various configurations the ones that will be useful. The "Learning to Choose" method is presented and evaluated using simulated data from TREC campaigns. We show that system performance increases substantially in terms of precision, specifically for the queries that are difficult or moderately difficult to answer. The other parameters of the method are also studied.

Scientific Publications of the University of Toulouse II Le Mirail

ZENODO

Open Archive Toulouse Archive Ouverte

HAL-INSA Toulouse

ARPHA OAI-PMH Endpoint

ARPHA Preprints

Improvement of variables interpretability in kernel PCA

Author: Briscik Mitja
Dillies Marie-Agnès
Déjean Sébastien
Publication date: 27/03/2023

Kernel methods have proven to be a powerful tool for the integration and analysis of data generated by high-throughput technologies. Kernels offer a nonlinear version of any linear algorithm solely based on dot products. The kernelized version of Principal Component Analysis is a valid nonlinear alternative to tackle the nonlinearity of biological sample spaces. This paper proposes a novel methodology to obtain a data-driven feature importance based on the KPCA representation of the data. The proposed method, kernel PCA Interpretable Gradient (KPCA-IG), provides a data-driven feature importance that is computationally fast and based solely on linear algebra calculations. It has been compared with existing methods on three benchmark datasets. The accuracy obtained using KPCA-IG selected features is equal to or greater than the other methods' average. Also, the computational complexity required demonstrates the high efficiency of the method. An exhaustive literature search has been conducted on the selected genes from a publicly available hepatocellular carcinoma dataset to validate the retained features from a biological point of view. The results once again confirm the appropriateness of the computed ranking. The black-box nature of kernel PCA needs new methods to interpret the original features. Our proposed methodology KPCA-IG proved to be a valid alternative to select influential variables in high-dimensional high-throughput datasets, potentially unravelling new biological and medical biomarkers.

arXiv.org e-Print Archive

Scientific Publications of the University of Toulouse II Le Mirail

HAL-INSA Toulouse

HAL-Pasteur

integrOmics: an R package to unravel relationships between two omics datasets

Author: Déjean Sébastien
González Ignacio
Lê Cao Kim-Anh
Publication venue: Oxford University Press
Publication date: 01/01/2009

Motivation: With the availability of many ‘omics’ data, such as transcriptomics, proteomics or metabolomics, the integrative or joint analysis of multiple datasets from different technology platforms is becoming crucial to unravel the relationships between different biological functional levels. However, the development of such an analysis is a major computational and technical challenge as most approaches suffer from high data dimensionality. New methodologies need to be developed and validated.

Scientific Publications of the University of Toulouse II Le Mirail

PubMed Central

HAL-INSA Toulouse

University of Melbourne Institutional Repository

University of Queensland eSpace

Muscle atrophy phenotype gene expression during spaceflight is linked to a metabolic crosstalk in both the liver and the muscle in mice

Author: Behesti Afshin
da Silveira Willian
Déjean Sébastien
Finch Rebecca
Larose Tricia
Mcstay Gavin
Vitry Geraldine
Wotring Virginia
Publication venue: Elsevier BV

Human expansion in space is hampered by the physiological risks of spaceflight. The muscle and the liver are among the most affected tissues during spaceflight and their relationships in response to space exposure have never been studied. We compared the transcriptome response of liver and quadriceps from mice on the NASA RR1 mission, after 37 days of exposure to spaceflight, using GSEA, ORA, and sparse partial least square-differential analysis. We found that lipid metabolism is the most affected biological process between the two organs. A specific gene cluster expression pattern in the liver strongly correlated with glucose sparing and an energy-saving response affecting high energy demand process gene expression such as DNA repair, autophagy, and translation in the muscle. Our results show that impaired lipid metabolism gene expression in the liver and muscle atrophy gene expression are two paired events during spaceflight, for which dietary changes represent a possible countermeasure.

STORE - Staffordshire Online Repository

Urinary amine and organic acid metabolites evaluated as markers for childhood aggression : the ACTION biomarker study

Author: Bartels Meike
Boomsma Dorret I.
Colins Olivier
Davies Gareth E.
de Zeeuw Eveline L.
Déjean Sébastien
Ehli Erik A.
Fanos Vassilios
Hagenbeek Fiona A.
Hankemeier Thomas
Harms Amy C.
Hottenga Jouke Jan
Kluft Cornelis
Pool René
Roetman Peter J.
Talens Simone
van Beijsterveldt Catharina E. M.
van Dongen Jenny
Vandenbosch Marjolein M. L. J. Z.
Vermeiren Robert R. J. M.
Publication venue: Frontiers Media SA
Publication date: 01/01/2020

Biomarkers are of interest as potential diagnostic and predictive instruments in personalized medicine. We present the first urinary metabolomics biomarker study of childhood aggression. We aim to examine the association of urinary metabolites and neurotransmitter ratios involved in key metabolic and neurotransmitter pathways in a large cohort of twins (N = 1,347) and clinic-referred children (N = 183) with an average age of 9.7 years. This study is part of ACTION (Aggression in Children: Unraveling gene-environment interplay to inform Treatment and InterventiON strategies), in which we developed a standardized protocol for large-scale collection of urine samples in children. Our analytical design consisted of three phases: a discovery phase in twins scoring low or high on aggression (N = 783); a replication phase in twin pairs discordant for aggression (N = 378); and a validation phase in clinical cases and matched twin controls (N = 367). In the discovery phase, 6 biomarkers were significantly associated with childhood aggression, of which the association of O-phosphoserine (beta = 0.36; SE = 0.09; p = 0.004), and gamma-L-glutamyl-L-alanine (beta = 0.32; SE = 0.09; p = 0.01) remained significant after multiple testing. Although non-significant, the directions of effect were congruent between the discovery and replication analyses for six biomarkers and two neurotransmitter ratios and the concentrations of 6 amines differed between low and high aggressive twins. In the validation analyses, the top biomarkers and neurotransmitter ratios, with congruent directions of effect, showed no significant associations with childhood aggression. We find suggestive evidence for associations of childhood aggression with metabolic dysregulation of neurotransmission, oxidative stress, and energy metabolism. Although replication is required, our findings provide starting points to investigate causal and pleiotropic effects of these dysregulations on childhood aggression.

VU Research Portal

Ghent University Academic Bibliography

Leiden University Scholarly Publications

Analysis of the real EADGENE data set: Multivariate approaches and post analysis (Open Access publication)

Springer - Publisher Connector

Analysis of the real EADGENE data set: Comparison of methods and guidelines for data normalisation and selection of differentially expressed genes (Open Access publication)

A large variety of methods has been proposed in the literature for microarray data analysis. The aim of this paper was to present techniques used by the EADGENE (European Animal Disease Genomics Network of Excellence) WP1.4 participants for data quality control, normalisation and statistical methods for the detection of differentially expressed genes in order to provide some more general data analysis guidelines. All the workshop participants were given a real data set obtained in an EADGENE-funded microarray study looking at the gene expression changes following artificial infection with two different mastitis-causing bacteria: Escherichia coli and Staphylococcus aureus. It was reassuring to see that most of the teams found the same main biological results. In fact, most of the differentially expressed genes were found for infection by E. coli between uninfected and 24 h challenged udder quarters. Very little transcriptional variation was observed for the bacterium S. aureus. Lists of differentially expressed genes found by the different research teams were, however, quite dependent on the method used, especially concerning the data quality control step. These analyses also emphasised a biological problem of cross-talk between infected and uninfected quarters which will have to be dealt with for further microarray studies.

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Edinburgh Research Explorer

ProdInra

University of Melbourne Institutional Repository

Hochschulbibliothekszentrum des Landes Nordrhein-Westfalen (hbz)

FracSim: An R Package to Simulate Multifractional Lévy Motions

Author: Serge Cohen
Sébastien Déjean
Publication venue: Foundation for Open Access Statistics
Publication date: 01/01/2005

In this article a procedure is proposed to simulate fractional fields, which are non-Gaussian counterparts of the fractional Brownian motion. These fields, called real harmonizable (multi)fractional Lévy motions, allow fixing the Hölder exponent at each point. FracSim is an R package developed in the R and C languages. Parallel computers have also been used.

Crossref

Directory of Open Access Journals

Journal of Statistical Software