
    DJ-MC: A Reinforcement-Learning Agent for Music Playlist Recommendation

    In recent years, there has been growing focus on the study of automated recommender systems. Music recommendation systems serve as a prominent domain for such works, both from an academic and a commercial perspective. A fundamental aspect of music perception is that music is experienced in temporal context and in sequence. In this work we present DJ-MC, a novel reinforcement-learning framework for music recommendation that does not recommend songs individually but rather song sequences, or playlists, based on a model of preferences for both songs and song transitions. The model is learned online and is uniquely adapted for each listener. To reduce exploration time, DJ-MC exploits user feedback to initialize a model, which it subsequently updates by reinforcement. We evaluate our framework with human participants using both real song and playlist data. Our results indicate that DJ-MC's ability to recommend sequences of songs provides a significant improvement over more straightforward approaches, which do not take transitions into account. (Comment: updated to the most recent and complete version, to be presented at AAMAS 2015; updated author list. In Autonomous Agents and Multiagent Systems (AAMAS) 2015, Istanbul, Turkey, May 2015.)
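    A minimal sketch of the core idea described above, not the authors' code: the reward for playing a song after the previous one combines a learned song-preference term and a learned transition-preference term, both updated online from listener feedback. The class name, the linear feature model, and the transition descriptor are all assumptions for illustration.

```python
import numpy as np

class PlaylistAgent:
    """Hypothetical sketch of a DJ-MC-style playlist recommender."""

    def __init__(self, n_features, learning_rate=0.1):
        self.w_song = np.zeros(n_features)    # preferences over song features
        self.w_trans = np.zeros(n_features)   # preferences over transition features
        self.lr = learning_rate

    def reward(self, song_feats, prev_feats):
        # crude transition descriptor: feature-wise change from the previous song
        transition = np.abs(song_feats - prev_feats)
        return self.w_song @ song_feats + self.w_trans @ transition

    def choose_next(self, prev_feats, candidates):
        # greedy one-step choice among candidate songs (the full framework plans deeper)
        scores = [self.reward(c, prev_feats) for c in candidates]
        return int(np.argmax(scores))

    def update(self, song_feats, prev_feats, feedback):
        # move weights toward the songs and transitions the listener liked
        transition = np.abs(song_feats - prev_feats)
        self.w_song += self.lr * feedback * song_feats
        self.w_trans += self.lr * feedback * transition
```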

    MARGIN: Uncovering Deep Neural Networks using Graph Signal Analysis

    Interpretability has emerged as a crucial aspect of machine learning, aimed at providing insights into the working of complex neural networks. However, existing solutions vary vastly based on the nature of the interpretability task, with each use case requiring substantial time and effort. This paper introduces MARGIN, a simple yet general approach to address a large set of interpretability tasks ranging from identifying prototypes to explaining image predictions. MARGIN exploits ideas rooted in graph signal analysis to determine influential nodes in a graph, which are defined as those nodes that maximally describe a function defined on the graph. By carefully defining task-specific graphs and functions, we demonstrate that MARGIN outperforms existing approaches in a number of disparate interpretability challenges. (Comment: Technical Report.)
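    A rough sketch, under stated assumptions rather than the paper's implementation, of scoring node influence with graph signal analysis: build a graph over samples, define a function f on the nodes, and rank nodes by the local variation of f, here measured with the graph Laplacian. The helper name and the kNN graph construction are illustrative choices.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

def influence_scores(X, f, k=10):
    """X: (n, d) sample features; f: (n,) function defined on the graph nodes."""
    W = kneighbors_graph(X, n_neighbors=k, mode='connectivity').toarray()
    W = np.maximum(W, W.T)            # symmetrize the kNN graph
    L = np.diag(W.sum(axis=1)) - W    # combinatorial graph Laplacian
    return np.abs(L @ f)              # high values = nodes that locally "explain" f

# For example, f could encode class membership when selecting prototypes,
# or a per-sample attribution mass when explaining predictions.
```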

    Algorithm Portfolios for Noisy Optimization

    Noisy optimization is the optimization of objective functions corrupted by noise. A portfolio of solvers is a set of solvers equipped with an algorithm selection tool for distributing the computational power among them. Portfolios are widely and successfully used in combinatorial optimization. In this work, we study portfolios of noisy optimization solvers. We obtain mathematically proven performance guarantees (in the sense that the portfolio performs nearly as well as the best of its solvers) with an ad hoc portfolio algorithm dedicated to noisy optimization. A somewhat surprising result is that it is better to compare solvers with some lag, i.e., to recommend the currently best solver based on its performance earlier in the run. An additional finding is a principled method for distributing the computational power among solvers in the portfolio. (Comment: in Annals of Mathematics and Artificial Intelligence, Springer Verlag, 201)
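    A hedged sketch of the lag idea only, not the paper's exact algorithm: each solver keeps a history of (recommended point, noisy value) pairs, and the portfolio recommends the current point of the solver that looked best some iterations ago, since the most recent noisy estimates are unreliable. The function name and the history layout are assumptions.

```python
def portfolio_recommendation(histories, lag):
    """histories[i]: non-empty list of (recommended_point, noisy_value) per
    iteration for solver i; lower value is better."""
    best_solver, best_value = 0, float('inf')
    for i, hist in enumerate(histories):
        t = max(0, len(hist) - 1 - lag)     # compare solvers at a lagged iteration
        _, value = hist[t]
        if value < best_value:
            best_solver, best_value = i, value
    # return the *current* point of the solver that was best at the lagged time
    return histories[best_solver][-1][0]
```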

    Estimating the Maximum Expected Value: An Analysis of (Nested) Cross Validation and the Maximum Sample Average

    We investigate the accuracy of the two most common estimators for the maximum expected value of a general set of random variables: a generalization of the maximum sample average, and cross validation. No unbiased estimator exists, and we show that it is non-trivial to select a good estimator without knowledge about the distributions of the random variables. We investigate and bound the bias and variance of the aforementioned estimators and prove consistency. The variance of cross validation can be significantly reduced, but not without risking a large bias. The bias and variance of different variants of cross validation are shown to be very problem-dependent, and a wrong choice can lead to very inaccurate estimates.
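    An illustrative sketch, with an assumed setup, of the two estimators of max_i E[X_i] from samples of each variable: the maximum sample average tends to be biased upward because it picks the luckiest mean, while a cross-validation-style estimator selects the argmax on one half of the data and estimates its value on the other half, trading bias for variance.

```python
import numpy as np

def max_sample_average(samples):
    # samples: list of 1-D arrays, one per random variable
    return max(s.mean() for s in samples)

def cross_validation_estimate(samples, rng):
    halves = [rng.permutation(s) for s in samples]
    select = [s[: len(s) // 2].mean() for s in halves]        # choose argmax on fold A
    i_star = int(np.argmax(select))
    return halves[i_star][len(halves[i_star]) // 2 :].mean()  # evaluate on fold B

rng = np.random.default_rng(0)
samples = [rng.normal(0.0, 1.0, size=100) for _ in range(10)]  # true max_i E[X_i] = 0
print(max_sample_average(samples), cross_validation_estimate(samples, rng))
```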

    Data Optimization in Deep Learning: A Survey

    Large-scale, high-quality data are considered an essential factor for the successful application of many deep learning techniques. Meanwhile, numerous real-world deep learning tasks still have to contend with the lack of sufficient amounts of high-quality data. Additionally, issues such as model robustness, fairness, and trustworthiness are closely related to the training data. Consequently, a huge number of studies in the existing literature have focused on the data aspect of deep learning tasks. Typical data optimization techniques include data augmentation, logit perturbation, sample weighting, and data condensation. These techniques usually come from different deep learning subfields, and their theoretical inspirations or heuristic motivations may seem unrelated to each other. This study organizes a wide range of existing data optimization methodologies for deep learning from the previous literature and constructs a comprehensive taxonomy for them. The taxonomy covers multiple split dimensions, with deep sub-taxonomies constructed for each dimension. On the basis of the taxonomy, connections among the extensive data optimization methods for deep learning are built in terms of four aspects. We also discuss several promising and interesting future directions. The constructed taxonomy and the revealed connections are intended to aid the understanding of existing methods and the design of novel data optimization techniques. Furthermore, our aspiration for this survey is to promote data optimization as an independent subdivision of deep learning. A curated, up-to-date list of resources related to data optimization in deep learning is available at \url{https://github.com/YaoRujing/Data-Optimization}.
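    A minimal, hedged sketch of one of the technique families named above, sample weighting; this is a generic illustration rather than any surveyed paper's method. Each training example receives a weight, for example up-weighting rare or hard examples, and the weight scales its contribution to the loss.

```python
import numpy as np

def weighted_cross_entropy(probs, labels, weights):
    """probs: (n, c) predicted class probabilities; labels: (n,) integer classes;
    weights: (n,) per-sample weights, e.g. inverse class frequency."""
    nll = -np.log(probs[np.arange(len(labels)), labels] + 1e-12)
    return float(np.sum(weights * nll) / np.sum(weights))
```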

    Automatic Handling of Imbalanced Datasets for Classification

    Imbalanced data is present in various business areas, and facing it without proper knowledge can have undesired negative consequences. In addition, the most common evaluation metrics in machine learning for measuring the desired solution can be inappropriate and misleading. Multiple combinations of methods have been proposed to handle imbalanced data; however, they often require specialised knowledge to be used correctly. In imbalanced classification, correctly classifying the underrepresented class tends to be more important than the overrepresented class, while also being more challenging and time-consuming. Several approaches in the domains of data resampling and cost-sensitive techniques, ranging from the more accessible to the more advanced, are considered for handling imbalanced data. The application developed delivers recommendations of the most suitable combinations of techniques for the specific dataset imported, by extracting and comparing meta-feature values recorded in a knowledge base. It facilitates effortless classification and automates part of the machine learning pipeline, with results comparable to or better than a state-of-the-art solution and with a much smaller execution time.
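    A hedged illustration of the two technique families the abstract names, data resampling and cost-sensitive learning; these are generic examples, not the developed application's recommendations. Random oversampling duplicates minority-class rows until classes are balanced, and inverse-frequency class weights make misclassifying the minority class more expensive.

```python
import numpy as np

def random_oversample(X, y, rng):
    """Duplicate minority-class rows until every class matches the majority count."""
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    X_out, y_out = [X], [y]
    for cls, cnt in zip(classes, counts):
        if cnt < target:
            idx = rng.choice(np.flatnonzero(y == cls), size=target - cnt, replace=True)
            X_out.append(X[idx])
            y_out.append(y[idx])
    return np.concatenate(X_out), np.concatenate(y_out)

def inverse_frequency_weights(y):
    """Per-sample weights proportional to the inverse frequency of each class."""
    classes, counts = np.unique(y, return_counts=True)
    weight_per_class = {c: len(y) / (len(classes) * n) for c, n in zip(classes, counts)}
    return np.array([weight_per_class[label] for label in y])
```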