
    Hybrid Collaborative Filtering with Autoencoders

    Collaborative Filtering aims at exploiting the feedback of users to provide personalised recommendations. Such algorithms look for latent variables in a large sparse matrix of ratings. They can be enhanced by adding side information to tackle the well-known cold start problem. While Neural Networks have had tremendous success in image and speech recognition, they have received less attention in Collaborative Filtering. This is all the more surprising given that Neural Networks are able to discover latent variables in large and heterogeneous datasets. In this paper, we introduce a Collaborative Filtering Neural network architecture (CFN) which computes a non-linear Matrix Factorization from sparse rating inputs and side information. We show experimentally on the MovieLens and Douban datasets that CFN outperforms the state of the art and benefits from side information. We provide an implementation of the algorithm as a reusable plugin for Torch, a popular Neural Network framework.
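
    As a rough sketch of the idea (not the authors' Torch plugin), such an architecture can be written as an autoencoder that reconstructs sparse rating rows, with side information concatenated to the input and the training loss masked to observed entries. The layer sizes and the way side information is injected below are illustrative assumptions, shown here in PyTorch:

```python
import torch
import torch.nn as nn

class RatingAutoencoder(nn.Module):
    """Autoencoder over sparse rating rows; side information is
    concatenated to the input (one illustrative way to inject it)."""
    def __init__(self, n_items, side_dim, hidden=128):
        super().__init__()
        self.encode = nn.Sequential(nn.Linear(n_items + side_dim, hidden), nn.Tanh())
        self.decode = nn.Linear(hidden, n_items)

    def forward(self, ratings, side):
        return self.decode(self.encode(torch.cat([ratings, side], dim=1)))

def masked_mse(pred, target, mask):
    # Only observed ratings (mask == 1) contribute to the loss.
    return ((pred - target) ** 2 * mask).sum() / mask.sum()
```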

    AUC Optimisation and Collaborative Filtering

    In recommendation systems, one is interested in the ranking of the predicted items rather than in other losses such as the mean squared error. Although a variety of ways to evaluate rankings exist in the literature, here we focus on the Area Under the ROC Curve (AUC), as it is widely used and has a strong theoretical underpinning. In practical recommendation, only items at the top of the ranked list are presented to the users. With this in mind, we propose a class of objective functions over matrix factorisations which primarily represent a smooth surrogate for the real AUC and, in a special case, we show how to prioritise the top of the list. The objectives are differentiable and optimised through a carefully designed stochastic gradient-descent-based algorithm which scales linearly with the size of the data. In the special case of square loss, we show how to improve computational complexity by leveraging previously computed measures. To understand the underlying matrix factorisation approaches theoretically, we study both the consistency of the loss functions with respect to AUC and generalisation using Rademacher theory. The resulting generalisation analysis gives strong motivation for the optimisation under study. Finally, we provide computational results on the efficacy of the proposed method using synthetic and real data.
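
    The core ingredient can be sketched as follows, assuming a logistic surrogate: one stochastic gradient ascent step that pushes user u's score for a relevant item i above an irrelevant item j. This coincides with the well-known BPR-style update; the paper's actual objectives refine this idea towards the top of the list and analyse its consistency:

```python
import numpy as np

def sgd_auc_step(U, V, u, i, j, lr=0.01):
    """One SGD ascent step on log(sigmoid(<U_u, V_i - V_j>)), a smooth
    surrogate for user u ranking relevant item i above irrelevant item j."""
    diff = V[i] - V[j]
    s = 1.0 / (1.0 + np.exp(U[u] @ diff))  # = 1 - sigmoid(score difference)
    grad_u = s * diff                      # gradient with respect to U[u]
    V[i] += lr * s * U[u]
    V[j] -= lr * s * U[u]
    U[u] += lr * grad_u
```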

    Bandits Warm-up Cold Recommender Systems

    We address the cold start problem in recommendation systems, assuming that no contextual information is available about either users or items. We consider the case in which we only have access to a set of ratings of items by users. Most existing works consider a batch setting and use cross-validation to tune parameters. The classical method consists in minimizing the root mean square error over a training subset of the ratings, which provides a factorization of the matrix of ratings, interpreted as a latent representation of items and users. Our contribution in this paper is 5-fold. First, we make explicit the issues raised by this kind of batch setting for users or items with very few ratings. Then, we propose an online setting closer to the actual use of recommender systems; this setting is inspired by the bandit framework. The proposed methodology can be used to turn any recommender system dataset (such as Netflix, MovieLens,...) into a sequential dataset. Then, we exhibit a strong and insightful link between contextual bandit algorithms and matrix factorization; this leads us to a new algorithm that tackles the exploration/exploitation dilemma associated with the cold start problem from a strikingly new perspective. Finally, experimental evidence confirms that our algorithm is effective in dealing with the cold start problem on publicly available datasets. Overall, the goal of this paper is to bridge the gap between recommender systems based on matrix factorization and those based on contextual bandits.
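
    To make the sequential setting concrete, here is a hedged sketch of replaying a static ratings log as a bandit problem, with a simple per-item UCB index standing in for the recommender; both the replay protocol and the index are illustrative simplifications, not the paper's algorithm:

```python
import numpy as np

def replay_ucb(log, T, alpha=1.0):
    """Replay a list of (user, item, rating) records sequentially: at each
    step, recommend the item with the highest UCB index and, if the log
    contains a rating for that (user, item) pair, observe it and update."""
    items = sorted({i for _, i, _ in log})
    lookup = {(u, i): r for u, i, r in log}
    users = [u for u, _, _ in log]
    n = {i: 0 for i in items}
    mean = {i: 0.0 for i in items}
    for t in range(1, T + 1):
        u = users[t % len(users)]
        # Optimistic index: untried items get infinite priority.
        ucb = {i: (mean[i] + alpha * np.sqrt(np.log(t) / n[i])) if n[i] else np.inf
               for i in items}
        i = max(ucb, key=ucb.get)
        if (u, i) in lookup:
            n[i] += 1
            mean[i] += (lookup[(u, i)] - mean[i]) / n[i]
    return mean, n
```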

    Canopy gap characteristics, their size-distribution and spatial pattern in a mountainous cool temperate forest of Japan

    Canopy gaps and their characteristic features (e.g. area and shape) influence the availability of nutrients, moisture and light in a forest ecosystem, and consequently affect the regeneration process and species composition in the forest. Most of the earlier research on canopy gaps used field measurement and conventional remote sensing to quantify gaps, and these methods have limitations and accuracy problems. However, developments in Light Detection and Ranging (LiDAR) technology have been effective in overcoming the limitations and challenges associated with conventional remote sensing. The ability of LiDAR to represent the three-dimensional structure of the canopy and the sub-canopy, yielding high-resolution topographic maps and highly accurate estimates of vegetation height, cover and canopy structure, makes it a suitable technology for gap studies. LiDAR-based digital surface models (DSM) and digital elevation models (DEM) were used to quantify the canopy gaps over 5124 ha of the University of Tokyo Chichibu Forests (UTCF), consisting of three forest types: primary, secondary and plantation forest. Disturbance-driven canopy gaps may vary spatially and in their characteristics due to differences in disturbance history, nature, frequency and intensity across forest and land types. Quantifying gap characteristics and studying their variation and size distribution in different forest types and topographies helps to understand gap dynamics and their ecological implications. In this study, a gap was defined as an opening with a maximum height of 2 m and a minimum area threshold of 10 m2. The minimum area threshold, which represents the gap area created by the death of at least a single tree, was determined through a random sampling of 100 tree crowns at UTCF using high-resolution aerial photographs. Gap size distribution was analyzed in different forest types and land types. Spatial autocorrelation of gap occurrence was studied using semivariance analysis and the distance to the nearest gap (DNG) for each individual gap. Gap size frequency distribution in the different forest types was investigated using a power law, and the negative exponent (α), the scaling component of the power-law distribution, was compared between forest types. Altogether, 6179 gaps with areas of 10-11603 m2 were found. The gap size distribution in UTCF was skewed, with a high frequency of smaller gaps and a few large gaps: half of the gaps were smaller than 19 m2 and less than one percent (0.73 %) were larger than 400 m2. Primary forest had a high gap density (1.85 gaps per ha), the shortest mean DNG (22 m) and the second-largest gap-area fraction (0.72 %) after the plantation forest area (0.76 %). Secondary forest had the lowest gap density (1.03 gaps per ha) but a larger mean gap area (43 m2) than primary forest (39 m2). The Kolmogorov–Smirnov test showed differences in gap size distribution between forest types; the largest gaps (>2400 m2) were absent in the secondary forest. Gap size frequency distribution followed a power-law distribution only in the plantation forest area (p>0.1, α=2.27); the scaling parameter in the primary and secondary forest was 2.56 (p=0.01) and 2.20 (p=0.02), respectively. Gap distribution showed spatial autocorrelation in primary and secondary forest at distances up to at least 1300 m. Most of the gaps in the primary forest were concentrated in the valley and middle slope, whereas the upper and middle slopes had the fewest gaps.
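
    The gap-delineation step under the stated definition (canopy height model = DSM - DEM, height of at most 2 m, area of at least 10 m2) can be sketched as follows; the raster resolution cell_size is an assumed parameter:

```python
import numpy as np
from scipy import ndimage

def delineate_gaps(dsm, dem, cell_size=1.0, max_height=2.0, min_area=10.0):
    """Delineate canopy gaps: connected regions of the canopy height model
    (DSM - DEM) no taller than max_height (m), with an area of at least
    min_area (m^2). Returns the gap areas in m^2."""
    chm = dsm - dem                           # canopy height model
    labels, n_regions = ndimage.label(chm <= max_height)
    pixels = np.bincount(labels.ravel())[1:]  # pixel count per region
    areas = pixels * cell_size ** 2
    return areas[areas >= min_area]
```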

    Checking experiments for stream X-machines

    Stream X-machines are a state-based formalism that has associated with it a particular development process in which a system is built from trusted components. Testing thus essentially checks that these components have been combined in a correct manner and that the orders in which they can occur are consistent with the specification. Importantly, there are test generation methods that return a checking experiment: a test that is guaranteed to determine correctness as long as the implementation under test (IUT) is functionally equivalent to an unknown element of a given fault domain Ψ. Previous work has shown how three methods for generating checking experiments from a finite state machine (FSM) can be adapted to testing from a stream X-machine. However, there are many other methods for generating checking experiments from an FSM, and these have a variety of benefits that correspond to different testing scenarios. This paper shows how any method for generating a checking experiment from an FSM can be adapted to generate a checking experiment for testing an implementation against a stream X-machine. This is the case whether we are testing to check that the IUT is functionally equivalent to a specification or we are testing to check that every trace (input/output sequence) of the IUT is also a trace of a nondeterministic specification. Interestingly, this holds even if the fault domain Ψ used is not that traditionally associated with testing from a stream X-machine. The results also apply for both deterministic and nondeterministic implementations.
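
    The conformance check that a checking experiment performs can be illustrated as trace comparison on a Mealy machine; this minimal sketch only shows what passing means, while generating an experiment guaranteed to expose every faulty member of the fault domain Ψ is the problem the paper addresses:

```python
def run(fsm, s0, inputs):
    """Run a Mealy machine given as {(state, input): (output, next_state)}
    and return its output sequence (trace)."""
    state, trace = s0, []
    for x in inputs:
        out, state = fsm[(state, x)]
        trace.append(out)
    return trace

def passes(spec, iut, s0, experiment):
    """The IUT passes a checking experiment (a set of input sequences)
    iff it produces the specified trace on every sequence."""
    return all(run(spec, s0, seq) == run(iut, s0, seq) for seq in experiment)
```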

    Unimodal Mono-Partite Matching in a Bandit Setting

    We tackle a new emerging problem: finding an optimal monopartite matching in a weighted graph. The semi-bandit version, where a full matching is sampled at each iteration, has been addressed by \cite{ADMA}, with an algorithm whose expected regret is $O(\frac{L\log(L)}{\Delta}\log(T))$ with $2L$ players, $T$ iterations and a minimum reward gap $\Delta$. We reduce this bound in two steps. First, as in \cite{GRAB} and \cite{UniRank}, we use the unimodality property of the expected reward on the appropriate graph to design an algorithm with a regret in $O(\frac{L}{\Delta}\log(T))$. Secondly, we show that by moving the focus towards the main question `\emph{Is user $i$ better than user $j$?}' this regret becomes $O(L\frac{\Delta}{\tilde{\Delta}^2}\log(T))$, where $\tilde{\Delta} > \Delta$ derives from a better way of comparing users. Experiments finally show that these theoretical results are corroborated in practice.
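
    The local-search flavour of such unimodal algorithms can be conveyed by a toy scheme: keep a current matching of the 2L players and compare it, through optimistic indices on the observed pair rewards, with matchings obtained by re-pairing two of its pairs. This is an illustrative sketch only, not the paper's algorithm or its regret guarantees; mu is assumed to map every unordered player pair to a Bernoulli mean:

```python
import numpy as np

def neighbours(matching):
    """Matchings reachable by re-pairing two of the current pairs."""
    out = []
    for k in range(len(matching)):
        for l in range(k + 1, len(matching)):
            (a, b), (c, d) = matching[k], matching[l]
            rest = [p for m, p in enumerate(matching) if m not in (k, l)]
            out.append(rest + [tuple(sorted((a, c))), tuple(sorted((b, d)))])
            out.append(rest + [tuple(sorted((a, d))), tuple(sorted((b, c)))])
    return out

def local_ucb_matching(mu, T, rng=np.random.default_rng(0)):
    """Toy semi-bandit local search: play the current matching, observe one
    Bernoulli reward per pair, and move to the neighbour with the best
    optimistic index."""
    players = sorted({x for pair in mu for x in pair})
    cur = [tuple(sorted((players[i], players[i + 1])))
           for i in range(0, len(players), 2)]
    n, s = {}, {}
    for t in range(1, T + 1):
        for p in cur:                          # semi-bandit feedback
            n[p] = n.get(p, 0) + 1
            s[p] = s.get(p, 0.0) + float(rng.random() < mu[p])
        def index(p):
            if p not in n:
                return np.inf                  # optimistic for unseen pairs
            return s[p] / n[p] + np.sqrt(np.log(t) / n[p])
        cur = max(neighbours(cur) + [cur], key=lambda m: sum(index(p) for p in m))
    return cur
```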

    Software Testing with Active Learning in a Graph

    Motivated by Structural Statistical Software Testing (SSST), this paper is interested in sampling the feasible execution paths in the control flow graph of the program being tested. For some complex programs, the fraction of feasible paths becomes tiny, lying in the range $[10^{-10}, 10^{-5}]$. When relying on the uniform sampling of the program paths, SSST is thus hindered by the non-Markovian nature of the "feasible path" concept, due to the long-range dependencies between the program nodes. A divide-and-generate approach relying on an extended Parikh map representation is proposed to address this limitation; experimental validation on real-world and artificial problems demonstrates gains of orders of magnitude compared to the state of the art.
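
    For reference, the Parikh map that the divide-and-generate approach extends records only how many times each symbol occurs along a path, forgetting their order; a minimal sketch:

```python
from collections import Counter

def parikh_map(path, alphabet):
    """Parikh map of a path: for each symbol of the alphabet (e.g. each CFG
    node label), count its occurrences; order along the path is forgotten."""
    counts = Counter(path)
    return tuple(counts[a] for a in alphabet)

# For example, parikh_map("abacab", "abc") == (3, 2, 1).
```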

    Metabolic Functions of Peroxisome Proliferator-Activated Receptor β/δ in Skeletal Muscle

    Peroxisome proliferator-activated receptors (PPARs) are transcription factors that act as lipid sensors and adapt the metabolic rates of various tissues to the concentration of dietary lipids. PPARs are pharmacological targets for the treatment of metabolic disorders. PPARα and PPARγ are activated by hypolipidemic and insulin-sensitizer compounds, such as fibrates and thiazolidinediones. The roles of PPARβ/δ in metabolic regulation remained unclear until recently. Treatment of obese monkeys and rodents with specific PPARβ/δ agonists promoted normalization of metabolic parameters and reduction of adiposity. Recent evidence strongly suggests that some of these beneficial actions are related to activation of fatty acid catabolism in skeletal muscle, and that PPARβ/δ is involved in the adaptive responses of skeletal muscle to environmental changes, such as long-term fasting or physical exercise, by controlling the number of oxidative myofibers. These observations indicate that PPARβ/δ agonists might have therapeutic usefulness in metabolic syndrome by increasing fatty acid consumption in skeletal muscle and reducing obesity.

    Uniform Random Sampling of Traces in Very Large Models

    This paper presents first results on how to perform uniform random walks (where every trace has the same probability to occur) in very large models. The models considered here are described in a succinct way as a set of communicating reactive modules. The method relies upon techniques for counting and drawing words uniformly at random in regular languages. Each module is considered as an automaton defining such a language. It is shown how it is possible to combine local uniform drawings of traces to obtain a globally uniform random sampling, without constructing the global model.
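
    The counting-and-drawing technique the method relies on can be sketched for a single automaton: count, for each state, the accepted completions of each length, then emit symbols with probability proportional to those counts. Combining such local drawings across communicating modules is the paper's contribution; this single-automaton baseline assumes at least one word of the requested length exists:

```python
import random

def uniform_word(delta, start, accept, n, rng=random.Random(0)):
    """Draw a length-n word uniformly among those accepted by the automaton
    delta, a dict (state, symbol) -> state. count[(q, k)] is the number of
    accepted completions of length k from state q."""
    states = {q for q, _ in delta} | set(delta.values())
    count = {(q, 0): int(q in accept) for q in states}
    for k in range(1, n + 1):
        for q in states:
            count[(q, k)] = sum(count[(q2, k - 1)]
                                for (q1, a), q2 in delta.items() if q1 == q)
    word, q = [], start
    for k in range(n, 0, -1):
        moves = [(a, q2, count[(q2, k - 1)])
                 for (q1, a), q2 in delta.items() if q1 == q]
        r = rng.randrange(sum(c for _, _, c in moves))
        for a, q2, c in moves:     # pick a move proportionally to its count
            if r < c:
                break
            r -= c
        word.append(a)
        q = q2
    return "".join(word)
```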

    Online Matrix Completion Through Nuclear Norm Regularisation

    It is the main goal of this paper to propose a novel method to perform matrix completion online. Motivated by a wide variety of applications, ranging from the design of recommender systems to sensor network localization through seismic data reconstruction, we consider the matrix completion problem when entries of the matrix of interest are observed gradually. Precisely, we place ourselves in the situation where the predictive rule should be refined incrementally, rather than recomputed from scratch each time the sample of observed entries increases. The extension of existing matrix completion methods to the sequential prediction context is indeed a major issue in the Big Data era, and yet little addressed in the literature. The algorithm promoted in this article builds upon the Soft Impute approach introduced in Mazumder et al. (2010). The major novelty essentially arises from the use of a randomised technique for both computing and updating the Singular Value Decomposition (SVD) involved in the algorithm. Though of disarming simplicity, the method proposed turns out to be very efficient, while requiring reduced computation. Several numerical experiments based on real datasets illustrate its performance, together with preliminary results giving it a theoretical basis.
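
    A minimal batch sketch of the Soft-Impute iteration the algorithm builds upon, with scikit-learn's randomized SVD standing in for the paper's own randomised computation and update scheme; lam (the singular-value threshold) and rank are illustrative parameters:

```python
import numpy as np
from sklearn.utils.extmath import randomized_svd

def soft_impute_step(X_filled, mask, X_obs, lam, rank):
    """One Soft-Impute iteration (Mazumder et al., 2010): soft-threshold the
    singular values of the current completion, then restore the observed
    entries. The randomized SVD is what keeps each iteration cheap."""
    U, s, Vt = randomized_svd(X_filled, n_components=rank, random_state=0)
    s = np.maximum(s - lam, 0.0)   # nuclear-norm soft-thresholding
    X_new = (U * s) @ Vt           # thresholded low-rank estimate
    X_new[mask] = X_obs[mask]      # keep observed entries fixed
    return X_new
```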