806 research outputs found

Finite mixture regression: A sparse variable selection by model selection for clustering

Author: Devijver Emilie
Publication date: 04/09/2014

We consider a finite mixture of Gaussian regression models for high-dimensional data, where the number of covariates may be much larger than the sample size. We propose to estimate the unknown conditional mixture density by a maximum likelihood estimator restricted to the relevant variables selected by an ℓ1-penalized maximum likelihood estimator. We obtain an oracle inequality satisfied by this estimator with a Jensen-Kullback-Leibler type loss. Our oracle inequality is deduced from a general model selection theorem for maximum likelihood estimators with a random model collection. We can derive the penalty shape of the criterion, which depends on the complexity of the random model collection. Comment: 20 pages. arXiv admin note: text overlap with arXiv:1103.2021 by other author

arXiv.org e-Print Archive

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

L'iconographie de la stèle funéraire de T. Exomnius Mansuetus, praefectus cohortis (pl. II A)

Author: Devijver Hubert
Publication date: 28/02/2011

RERO DOC Digital Library

Uncertain Trees: Dealing with Uncertain Inputs in Regression Trees

Author: Chebre Meriam
Clausel Marianne
Devijver Emilie
Dulac Adrien
Gaussier Eric
Janaqi Stefan
Tami Myriam
Publication date: 18/11/2018

Tree-based ensemble methods, such as Random Forests and Gradient Boosted Trees, have been successfully used for regression in many applications and research studies. Furthermore, these methods have been extended to deal with uncertainty in the output variable, using for example a quantile loss in Random Forests (Meinshausen, 2006). To the best of our knowledge, no extension has yet been provided for dealing with uncertainties in the input variables, even though such uncertainties are common in practical situations. We propose here such an extension by showing how standard regression trees optimizing a quadratic loss can be adapted and learned while taking into account the uncertainties in the inputs. By doing so, one no longer assumes that an observation lies in a single region of the regression tree, but rather that it belongs to each region with a certain probability. Experiments conducted on several data sets illustrate the good behavior of the proposed extension. Comment: 9 page
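
The idea that an observation belongs to each region with a certain probability can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a one-dimensional tree that is already fitted, and it assumes (hypothetically) that the true input is Gaussian around the observed value. The names `region_probabilities`, `soft_predict`, `thresholds`, and `leaf_values` are illustrative only.

```python
import math


def normal_cdf(x, mean, std):
    """CDF of a Gaussian N(mean, std^2) evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def region_probabilities(x_obs, noise_std, thresholds):
    """Probability that the true input falls in each region of a 1-D tree.

    thresholds: sorted split points, defining len(thresholds) + 1 regions.
    Assumption (not from the abstract): true input ~ N(x_obs, noise_std^2).
    """
    edges = [-math.inf] + list(thresholds) + [math.inf]
    cdf = [normal_cdf(e, x_obs, noise_std) for e in edges]
    return [cdf[i + 1] - cdf[i] for i in range(len(edges) - 1)]


def soft_predict(x_obs, noise_std, thresholds, leaf_values):
    """Prediction as the probability-weighted average of the leaf values."""
    probs = region_probabilities(x_obs, noise_std, thresholds)
    return sum(p * v for p, v in zip(probs, leaf_values))
```

With a single split at 0 and leaf values 1 and 3, an observation at exactly 0 is assigned to each region with probability 0.5, so `soft_predict(0.0, 1.0, [0.0], [1.0, 3.0])` averages the two leaves instead of picking one.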

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes

Human and machine perception of biological motion

Author: Aha
Ahlström
Blake
Cutting
Cutting
Cutting
Cédras
Davies
Devijver
Dror
Duda
Duda
Fox
Giese
Hoffman
Hogg
J.N. Carter
Johansson
Johansson
Johansson
Kolen
Kozlowski
Mather
O’Rourke
Pavlova
Pinto
Pollick
R.I. Damper
Rumelhart
Sejnowski
Stevenage
Sumi
Troje
Vijay Laxmi
Wachter
Webb
Yam
Publication date: 01/12/2006

Southampton (e-Prints Soton)

Crossref

Improved model identification for non-linear systems using a random subsampling and multifold modelling (RSMM) approach

Author: Aguirre LA
Billings SA
Brown M
Chen S
Cherkassky V
Devijver PA
H.L. Wei
Hansen LK
Ljung L
Ljung L
Montgomery DC
Murray-Smith R
Pearson RK
S.A. Billings
Shao J
Shao J
Stone M
Tsang KM
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2009

In non-linear system identification, the available observed data are conventionally partitioned into two parts: the training data that are used for model identification and the test data that are used for model performance testing. This sort of 'hold-out' or 'split-sample' data partitioning method is convenient and the associated model identification procedure is in general easy to implement. The resultant model obtained from such a once-partitioned single training dataset, however, may occasionally lack robustness and generalisation to represent future unseen data, because the performance of the identified model may be highly dependent on how the data partition is made. To overcome the drawback of the hold-out data partitioning method, this study presents a new random subsampling and multifold modelling (RSMM) approach to produce less biased or preferably unbiased models. The basic idea and the associated procedure are as follows. First, generate K training datasets (and also K validation datasets), using a K-fold random subsampling method. Secondly, detect significant model terms and identify a common model structure that fits all the K datasets using a newly proposed common model selection approach, called the multiple orthogonal search algorithm. Finally, estimate and refine the model parameters for the identified common-structured model using a multifold parameter estimation method. The proposed method can produce robust models with better generalisation performance.
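
The first step of the procedure (generating K training and K validation datasets by random subsampling) might look like the sketch below. It covers only the data partitioning, not the multiple orthogonal search or the multifold parameter estimation. The function name `random_subsampling_splits`, the `train_fraction` parameter, and the default values are assumptions made for illustration and do not come from the paper.

```python
import random


def random_subsampling_splits(n_samples, k, train_fraction=0.7, seed=0):
    """Return k independent random (train_indices, validation_indices) pairs.

    Each pair partitions range(n_samples); different pairs may overlap,
    which is what distinguishes repeated random subsampling from a single
    hold-out split. train_fraction and seed are illustrative defaults.
    """
    rng = random.Random(seed)
    n_train = int(round(train_fraction * n_samples))
    splits = []
    for _ in range(k):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # fresh random permutation for every split
        train = sorted(indices[:n_train])
        validation = sorted(indices[n_train:])
        splits.append((train, validation))
    return splits
```

A model structure would then be sought that fits all K training sets at once, with each validation set used to check the corresponding fit.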

Crossref

White Rose Research Online

Data mining: a tool for detecting cyclical disturbances in supply networks.

Author: Chan F. T. S.
Chatfield C.
Davis T.
Devijver P. A.
Fayyad U. M.
Forrester J. W.
Han J.
Harding J. A.
Jolliffe I. T.
Kaufman L.
Klösgen W.
Koopmans L. H.
Mason-Jones R.
Monostori L.
Pyle D.
Witten I. H.
Publication venue: 'SAGE Publications'
Publication date: 21/12/2007

Disturbances in supply chains may be either exogenous or endogenous. The ability automatically to detect, diagnose, and distinguish between the causes of disturbances is of prime importance to decision makers in order to avoid uncertainty. The spectral principal component analysis (SPCA) technique has been utilized to distinguish between real and rogue disturbances in a steel supply network. The data set used was collected from four different business units in the network and consists of 43 variables; each is described by 72 data points. The present paper will utilize the same data set to test an alternative approach to SPCA in detecting the disturbances. The new approach employs statistical data pre-processing, clustering, and classification learning techniques to analyse the supply network data. In particular, the incremental k-means clustering and the RULES-6 classification rule-learning algorithms, developed by the present authors’ team, have been applied to identify important patterns in the data set. Results show that the proposed approach has the capability automatically to detect and characterize network-wide cyclical disturbances and generate hypotheses about their root cause.

Crossref

Middlesex University Research Repository

Stable network inference in high-dimensional graphical model using single-linkage

Author: Devijver Emilie
Gallopin Mélina
Molinier Rémi
Publication date: 14/06/2024

Stability, akin to reproducibility, is crucial in statistical analysis. This paper examines the stability of sparse network inference in high-dimensional graphical models, where selected edges should remain consistent across different samples. Our study focuses on the Graphical Lasso and its decomposition into two steps, with the first step involving hierarchical clustering using single linkage. We provide theoretical proof that single linkage is stable, evidenced by controlled distances between two dendrograms inferred from two samples. Practical experiments further illustrate the stability of the Graphical Lasso's various steps, including dendrograms, variable clusters, and final networks. Our results, validated through both theoretical analysis and practical experiments using simulated and real datasets, demonstrate that single linkage is more stable than other methods when a modular structure is present.
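
For readers unfamiliar with single linkage, the sketch below shows a naive agglomerative version on a generic dissimilarity matrix. It is only an illustration of the clustering rule (the distance between two clusters is the smallest pairwise distance between their members); how the paper builds dissimilarities between variables and couples this step with the Graphical Lasso is not reproduced here. The function name `single_linkage` and the input format are assumptions.

```python
def single_linkage(dist):
    """Naive single-linkage clustering on a symmetric dissimilarity matrix.

    dist: list of lists with dist[i][j] the dissimilarity between items i, j.
    Returns the merges in order as (members_a, members_b, height) tuples,
    which is the information a dendrogram encodes.
    """
    clusters = {i: [i] for i in range(len(dist))}
    merges = []
    while len(clusters) > 1:
        best = None
        keys = sorted(clusters)
        for pos, a in enumerate(keys):
            for b in keys[pos + 1:]:
                # Single linkage: closest pair of members across clusters.
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        clusters[a] = clusters[a] + clusters.pop(b)
    return merges
```

For four points on a line at 0, 1, 5, and 6, the two tight pairs merge first at height 1, and the two resulting clusters join at height 4, the gap between 1 and 5.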

arXiv.org e-Print Archive

Emotion Recognition using Wireless Signals

Author: Adib F.
Alelis G.
Devijver P. A.
Droitcour A. D.
Droitcour A. D.
Ekman P.
Fernández-Caballero A.
Hu W.
Jerritta S.
Kahou S. E.
Kaltiokallio O.
Massagram W.
McDuff D.
McKinley S.
Patwari N.
Penzel T.
Postolache O.
Sakamoto T.
Wiens A. D.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/10/2016

This paper demonstrates a new technology that can infer a person's emotions from RF signals reflected off his body. EQ-Radio transmits an RF signal and analyzes its reflections off a person's body to recognize his emotional state (happy, sad, etc.). The key enabler underlying EQ-Radio is a new algorithm for extracting the individual heartbeats from the wireless signal at an accuracy comparable to on-body ECG monitors. The resulting beats are then used to compute emotion-dependent features which feed a machine-learning emotion classifier. We describe the design and implementation of EQ-Radio, and demonstrate through a user study that its emotion recognition accuracy is on par with state-of-the-art emotion recognition systems that require a person to be hooked to an ECG monitor. Keywords: Wireless Signals; Wireless Sensing; Emotion Recognition; Affective Computing; Heart Rate Variability. National Science Foundation (U.S.). United States. Air Forc

DSpace@MIT

Crossref