2,103 research outputs found

Statistical Learning and Kernel Methods in Bioinformatics

Author: Guyon I.
Schölkopf B.
Weston J.
Publication date: 01/01/2003

Personalized Pancreatic Tumor Growth Prediction via Group Learning

Author: C Hogea
CC Chang
H Greenspan
I Guyon
J Yao
KCL Wong
M Morris
O Clatz
Publication date: 01/06/2017

Tumor growth prediction, a highly challenging task, has long been viewed as a mathematical modeling problem, where the tumor growth pattern is personalized based on imaging and clinical data of a target patient. Though mathematical models yield promising results, their prediction accuracy may be limited by the absence of population trend data and personalized clinical characteristics. In this paper, we propose a statistical group learning approach to predict the tumor growth pattern that incorporates both the population trend and personalized data, in order to discover high-level features from multimodal imaging data. A deep convolutional neural network approach is developed to model the voxel-wise spatio-temporal tumor progression. The deep features are combined with the time intervals and the clinical factors to feed a process of feature selection. Our predictive model is pretrained on a group data set and personalized on the target patient data to estimate the future spatio-temporal progression of the patient's tumor. Multimodal imaging data at multiple time points are used in the learning, personalization and inference stages. Our method achieves a Dice coefficient (DSC) of 86.8% ± 3.6% and an RVD of 7.9% ± 5.4% on a pancreatic tumor data set, outperforming the DSC of 84.4% ± 4.0% and RVD of 13.9% ± 9.8% obtained by a previous state-of-the-art model-based method.
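The two metrics quoted in this abstract can be illustrated with a minimal sketch. It assumes binary tumor masks flattened to 0/1 sequences and takes RVD to be the absolute volume difference relative to the observed volume. The abstract does not spell out either definition, so both conventions, the function names and the toy masks are assumptions made for illustration.

```python
def dice_coefficient(pred, truth):
    """Dice overlap between two equally sized binary masks (flat 0/1 sequences)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks are treated as a perfect match (assumed convention).
    return 1.0 if total == 0 else 2.0 * intersection / total


def relative_volume_difference(pred, truth):
    """Absolute volume difference relative to the ground-truth volume (assumed definition)."""
    v_pred = sum(1 for p in pred if p)
    v_truth = sum(1 for t in truth if t)
    if v_truth == 0:
        raise ValueError("ground-truth mask is empty")
    return abs(v_pred - v_truth) / v_truth


# Toy masks: 3 predicted voxels, 4 observed voxels, 2 voxels in common.
predicted = [1, 1, 1, 0, 0, 0]
observed = [0, 1, 1, 1, 1, 0]
dice = dice_coefficient(predicted, observed)          # 2*2 / (3+4)
rvd = relative_volume_difference(predicted, observed)  # |3-4| / 4
```

Higher Dice and lower RVD both indicate a prediction closer to the observed tumor, which is the direction of improvement the abstract reports.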

arXiv.org e-Print Archive

Crossref

Heuristic Search over a Ranking for Feature Selection

Author: E. Xing
H. Almuallim
H. Liu
H. Liu
I. Guyon
I. Guyon
I. Inza
I. Witten
L. Yu
M. Hall
M. Xiong
R. Kohavi
Publication date: 01/01/2005

In this work, we suggest a new feature selection technique that lets us use the wrapper approach to find a well-suited feature set for distinguishing experiment classes in high-dimensional data sets. Our method is based on the relevance and redundancy idea, in the sense that a ranked feature is chosen if additional information is gained by adding it. This heuristic yields considerably better accuracy than the full feature set and than other representative feature selection algorithms on twelve well-known data sets, together with a notable dimensionality reduction.
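The idea of walking down a ranking and keeping a feature only when it adds information can be sketched as follows. This is a hedged illustration, not the authors' algorithm: `select_over_ranking`, the `min_gain` threshold, and the toy `toy_score` evaluator are hypothetical, and a real wrapper would score each candidate subset with a cross-validated classifier.

```python
def select_over_ranking(ranked_features, evaluate, min_gain=0.0):
    """Scan features in ranked order; keep one only if it improves the subset score."""
    selected = []
    best_score = float("-inf")  # the top-ranked feature is therefore always accepted
    for feature in ranked_features:
        candidate = selected + [feature]
        score = evaluate(candidate)
        if score > best_score + min_gain:
            selected = candidate
            best_score = score
    return selected


# Toy stand-in for a wrapper evaluation: each feature "covers" some units of
# information, and a subset is scored by how many distinct units it covers.
INFO = {"f1": {1, 2, 3}, "f2": {2, 3}, "f3": {4}, "f4": {1, 4}}


def toy_score(subset):
    covered = set()
    for name in subset:
        covered |= INFO[name]
    return len(covered)


ranking = ["f1", "f2", "f3", "f4"]  # assumed to be sorted by individual relevance
chosen = select_over_ranking(ranking, toy_score)
```

In this toy run `f2` and `f4` are relevant on their own but redundant given the features already kept, so they are skipped.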

CiteSeerX

Crossref

idUS. Depósito de Investigación Universidad de Sevilla

Digging into acceptor splice site prediction : an iterative feature selection approach

Author: A.I. Blum
A.K. Jain
C. Mathé
D. Mladenić
E. Alpaydin
G.R. Harik
H. Mühlenbein
I. Guyon
I. Guyon
J. Weston
M. Kudo
M. Pertea
P. Larrañaga
R. Kohavi
R.O. Duda
S. Degroeve
T. Joachims
X. Zhang
Y. Saeys
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2004

Feature selection techniques are often used to reduce data dimensionality, increase classification performance, and gain insight into the processes that generated the data. In this paper, we describe an iterative procedure of feature selection and feature construction steps, improving the classification of acceptor splice sites, an important subtask of gene prediction. We show that acceptor prediction can benefit from feature selection, and describe how feature selection techniques can be used to gain new insights in the classification of acceptor sites. This is illustrated by the identification of a new, biologically motivated feature: the AG-scanning feature. The results described in this paper contribute both to the domain of gene prediction, and to research in feature selection techniques, describing a new wrapper-based feature weighting method that aids in knowledge discovery when dealing with complex datasets.

Crossref

Ghent University Academic Bibliography

A feature selection method for air quality forecasting

Author: A. Hyvarinen
A. Sharma
E. Cogliani
E. Parzen
I. Guyon
P. Perez
R.J. May
S. Haykin
T. Slini
T.M. Cover
Publication venue: Springer
Publication date: 01/01/2010

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

A Methodology for the Diagnostic of Aircraft Engine Based on Indicators Aggregation

Author: D. Ruta
E. Côme
H. Peng
I. Guyon
J. Lacaille
L. Breiman
L. Breiman
L. Vasov
M. Basseville
N. Japkowicz
X. Flandrois
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

Aircraft engine manufacturers collect large amounts of engine-related data during flights. These data are used to detect anomalies in the engines in order to help companies optimize their maintenance costs. This article introduces and studies a generic methodology for building automatic detectors of early signs of anomalies in a way that is understandable by the human operators who make the final maintenance decision. The main idea of the method is to generate a very large number of binary indicators based on parametric anomaly scores designed by experts, complemented by simple aggregations of those scores. The best indicators are selected via a classical forward scheme, leading to a much reduced number of indicators that are tuned to a data set. We illustrate the value of the method on simulated data that contain realistic early signs of anomalies. Comment: Proceedings of the 14th Industrial Conference, ICDM 2014, St. Petersburg, Russian Federation (2014)
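A rough sketch of the pipeline described here, under strong simplifying assumptions: indicators are produced by thresholding anomaly scores, selected indicators are aggregated with a logical OR, and subsets are compared by plain accuracy. The abstract specifies none of these choices, and the score names, thresholds and labels below are invented for illustration.

```python
def make_indicators(scores, thresholds):
    """Turn each (score name, threshold) pair into a 0/1 vector over flights."""
    return {
        (name, thr): [1 if v > thr else 0 for v in values]
        for name, values in scores.items()
        for thr in thresholds
    }


def accuracy_of_or(indicators, chosen, labels):
    """Accuracy of flagging a flight when any chosen indicator fires."""
    hits = 0
    for i, label in enumerate(labels):
        fired = int(any(indicators[key][i] for key in chosen))
        hits += int(fired == label)
    return hits / len(labels)


def forward_select(indicators, labels):
    """Greedy forward scheme: add the best remaining indicator until nothing improves."""
    chosen, best = [], accuracy_of_or(indicators, [], labels)
    while True:
        trials = [
            (accuracy_of_or(indicators, chosen + [key], labels), key)
            for key in sorted(indicators) if key not in chosen
        ]
        if not trials:
            return chosen, best
        score, key = max(trials, key=lambda pair: pair[0])
        if score <= best:
            return chosen, best
        chosen.append(key)
        best = score


# Hypothetical anomaly scores for four flights; flights 2 and 3 are anomalous.
scores = {"vibration": [0.1, 0.9, 0.2, 0.3], "temperature": [0.2, 0.1, 0.8, 0.3]}
labels = [0, 1, 1, 0]
indicators = make_indicators(scores, thresholds=[0.5])
chosen, best = forward_select(indicators, labels)
```

Because each selected indicator is a named score with an explicit threshold, the final rule stays readable by an operator, which is the interpretability goal the abstract emphasizes.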

arXiv.org e-Print Archive

Crossref

HAL-Paris1

Is This a Joke? Detecting Humor in Spanish Tweets

Author: A Reyes
A Reyes
C Gruner
I Guyon
J Sjöbergh
M Minsky
R Basili
R Mihalcea
RF Mihalcea
S Attardo
V Raskin
W Ruch
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 28/03/2017

While humor has historically been studied from psychological, cognitive and linguistic standpoints, its study from a computational perspective is an area yet to be explored in Computational Linguistics. There exist some previous works, but a characterization of humor that allows its automatic recognition and generation is far from being specified. In this work we build a crowdsourced corpus of labeled tweets, annotated according to their humor value, letting the annotators subjectively decide which are humorous. A humor classifier for Spanish tweets is assembled based on supervised learning, reaching a precision of 84% and a recall of 69%. Comment: Preprint version, without referra
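As a reminder of what the two reported figures measure, here is a small sketch that computes precision and recall from binary predictions and combines them into an F1 score. The toy labels are invented, and the F1 value is derived arithmetically from the reported 84% and 69%; it is not a figure stated in the abstract.

```python
def precision_recall(predicted, actual):
    """Precision and recall for binary labels, with 1 meaning 'humorous'."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example: 3 tweets flagged as humorous, 4 tweets actually humorous.
predicted = [1, 1, 0, 1, 0, 0]
actual = [1, 0, 0, 1, 1, 1]
p, r = precision_recall(predicted, actual)

# F1 implied by the precision and recall quoted in the abstract.
f1_reported = f1_score(0.84, 0.69)
```

A precision above recall, as reported, means the classifier is more often right when it flags a tweet as humorous than it is complete in finding all humorous tweets.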

arXiv.org e-Print Archive

Crossref

Predicting sentence translation quality using extrinsic and language independent features

Author: AJ Smola
Declan Groves
Ergun Biçici
FJ Och
I Guyon
I Guyon
Josef van Genabith
JS Albrecht
L Specia
L Wasserman
P Koehn
PF Brown
T Hastie
TM Cover
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 02/12/2013

We develop a top performing model for automatic, accurate, and language independent prediction of sentence-level statistical machine translation (SMT) quality with or without looking at the translation outputs. We derive various feature functions measuring the closeness of a given test sentence to the training data and the difficulty of translating the sentence. We describe mono feature functions that are based on statistics of only one side of the parallel training corpora and duo feature functions that incorporate statistics involving both source and target sides of the training data. Overall, we describe novel, language independent, and SMT system extrinsic features for predicting the SMT performance, which also rank high during feature ranking evaluations. We experiment with different learning settings, with or without looking at the translations, which help differentiate the contribution of different feature sets. We apply partial least squares and feature subset selection, both of which improve the results and we present ranking of the top features selected for each learning setting, providing an exhaustive analysis of the extrinsic features used. We show that by just looking at the test source sentences and not using the translation outputs at all, we can achieve better performance than a baseline system using SMT model dependent features that generated the translations. Furthermore, our prediction system is able to achieve the 2nd best performance overall according to the official results of the Quality Estimation Task (QET) challenge when also looking at the translation outputs. Our representation and features achieve the top performance in QET among the models using the SVR learning model.

Crossref

Irish Universities

DCU Online Research Access Service

Orientational instabilities in nematics with weak anchoring under combined action of steady flow and external fields

Author: A. P. Krekhov
E. Dubois-Violette
E. Guyon
I. Janossy
I. Sh. Nasibullayev
L. Kramer
O. S. Tarasov
P. G. de Gennes
P. Manneville
P. Manneville
P. Pieranski
S. Chandrasekhar
V. G. Chigrinov
Publication venue: 'American Physical Society (APS)'
Publication date: 10/03/2005

We study the homogeneous and the spatially periodic instabilities in a nematic liquid crystal layer subjected to steady plane Couette or Poiseuille flow. The initial director orientation is perpendicular to the flow plane. Weak anchoring at the confining plates and the influence of the external electric and/or magnetic field are taken into account. Approximate expressions for the critical shear rate are presented and compared with semi-analytical solutions in case of Couette flow and numerical solutions of the full set of nematodynamic equations for Poiseuille flow. In particular the dependence of the type of instability and the threshold on the azimuthal and the polar anchoring strength and external fields is analysed. Comment: 12 pages, 6 figures

arXiv.org e-Print Archive

Crossref

CERN Document Server

Overcoming Calibration Problems in Pattern Labeling with Pairwise Ratings: Application to Personality Traits

Author: Chen B
Escalera S
Guyon I
Ponce-Lopez V
Shah N
Simon MO
Publication venue: 14th European Conference on Computer Vision (ECCV)
Publication date: 24/11/2016

We address the problem of calibration of workers whose task is to label patterns with continuous variables, which arises for instance in labeling images or videos of humans with continuous traits. Worker bias is particularly difficult to evaluate and correct when many workers contribute just a few labels, a situation arising typically when labeling is crowd-sourced. In the scenario of labeling short videos of people facing a camera with personality traits, we evaluate the feasibility of the pairwise ranking method to alleviate bias problems. Workers are exposed to pairs of videos at a time and must order them by preference. The variable levels are reconstructed by fitting a Bradley-Terry-Luce model with maximum likelihood. This method may, at first sight, seem prohibitively expensive because for N videos, p=N(N−1)/2 pairs must potentially be processed by workers rather than N videos. However, by performing extensive simulations, we determine an empirical law for the scaling of the number of pairs needed as a function of the number of videos in order to achieve a given accuracy of score reconstruction and show that the pairwise method is affordable. We apply the method to the labeling of a large scale dataset of 10,000 videos used in the ChaLearn Apparent Personality Trait challenge.
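The pair count p=N(N−1)/2 and the reconstruction of scores from pairwise preferences can be sketched as below. The abstract only says the Bradley-Terry-Luce model is fitted by maximum likelihood; the fixed-point (minorization-maximization) iteration used here is one standard way to reach that estimate and is an assumption, as are the function names and the toy comparison data.

```python
def pair_count(n):
    """Number of distinct unordered pairs among n items: n(n-1)/2."""
    return n * (n - 1) // 2


def fit_bradley_terry(n_items, comparisons, iterations=200):
    """Estimate Bradley-Terry strengths from (winner, loser) index pairs.

    Uses the classical fixed-point update p_i <- W_i / sum_j n_ij / (p_i + p_j),
    then renormalizes so the strengths sum to 1. Every item should have at
    least one win and one loss for the estimate to be well defined.
    """
    wins = [0] * n_items
    for winner, _ in comparisons:
        wins[winner] += 1
    strengths = [1.0 / n_items] * n_items
    for _ in range(iterations):
        updated = []
        for i in range(n_items):
            denom = 0.0
            for winner, loser in comparisons:
                if i in (winner, loser):
                    other = loser if i == winner else winner
                    denom += 1.0 / (strengths[i] + strengths[other])
            updated.append(wins[i] / denom if denom else strengths[i])
        total = sum(updated)
        strengths = [s / total for s in updated]
    return strengths


# Toy data: video 0 is usually preferred to 1 and 2, and video 1 to video 2.
comparisons = (
    [(0, 1)] * 2 + [(1, 0)]
    + [(1, 2)] * 2 + [(2, 1)]
    + [(0, 2)] * 3 + [(2, 0)]
)
strengths = fit_bradley_terry(3, comparisons)
all_pairs_for_dataset = pair_count(10_000)  # exhaustive pairs for 10,000 videos
```

For the 10,000 videos mentioned in the abstract, exhaustive comparison would require 49,995,000 pairs, which is why the empirical scaling law for the number of pairs actually needed matters.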

UCL Discovery