71,413 research outputs found

    Inverse Classification for Comparison-based Interpretability in Machine Learning

    Full text link
    In the context of post-hoc interpretability, this paper addresses the task of explaining the prediction of a classifier, considering the case where no information is available, neither on the classifier itself, nor on the processed data (neither the training nor the test data). It proposes an instance-based approach whose principle consists in determining the minimal changes needed to alter a prediction: given a data point whose classification must be explained, the proposed method consists in identifying a close neighbour classified differently, where the closeness definition integrates a sparsity constraint. This principle is implemented using observation generation in the Growing Spheres algorithm. Experimental results on two datasets illustrate the relevance of the proposed approach that can be used to gain knowledge about the classifier.Comment: preprin

    Data-driven Soft Sensors in the Process Industry

    Get PDF
    In the last two decades Soft Sensors established themselves as a valuable alternative to the traditional means for the acquisition of critical process variables, process monitoring and other tasks which are related to process control. This paper discusses characteristics of the process industry data which are critical for the development of data-driven Soft Sensors. These characteristics are common to a large number of process industry fields, like the chemical industry, bioprocess industry, steel industry, etc. The focus of this work is put on the data-driven Soft Sensors because of their growing popularity, already demonstrated usefulness and huge, though yet not completely realised, potential. A comprehensive selection of case studies covering the three most important Soft Sensor application fields, a general introduction to the most popular Soft Sensor modelling techniques as well as a discussion of some open issues in the Soft Sensor development and maintenance and their possible solutions are the main contributions of this work

    Will This Video Go Viral? Explaining and Predicting the Popularity of Youtube Videos

    Full text link
    What makes content go viral? Which videos become popular and why others don't? Such questions have elicited significant attention from both researchers and industry, particularly in the context of online media. A range of models have been recently proposed to explain and predict popularity; however, there is a short supply of practical tools, accessible for regular users, that leverage these theoretical results. HIPie -- an interactive visualization system -- is created to fill this gap, by enabling users to reason about the virality and the popularity of online videos. It retrieves the metadata and the past popularity series of Youtube videos, it employs Hawkes Intensity Process, a state-of-the-art online popularity model for explaining and predicting video popularity, and it presents videos comparatively in a series of interactive plots. This system will help both content consumers and content producers in a range of data-driven inquiries, such as to comparatively analyze videos and channels, to explain and predict future popularity, to identify viral videos, and to estimate response to online promotion.Comment: 4 page

    Using Grouped Linear Prediction and Accelerated Reinforcement Learning for Online Content Caching

    Full text link
    Proactive caching is an effective way to alleviate peak-hour traffic congestion by prefetching popular contents at the wireless network edge. To maximize the caching efficiency requires the knowledge of content popularity profile, which however is often unavailable in advance. In this paper, we first propose a new linear prediction model, named grouped linear model (GLM) to estimate the future content requests based on historical data. Unlike many existing works that assumed the static content popularity profile, our model can adapt to the temporal variation of the content popularity in practical systems due to the arrival of new contents and dynamics of user preference. Based on the predicted content requests, we then propose a reinforcement learning approach with model-free acceleration (RLMA) for online cache replacement by taking into account both the cache hits and replacement cost. This approach accelerates the learning process in non-stationary environment by generating imaginary samples for Q-value updates. Numerical results based on real-world traces show that the proposed prediction and learning based online caching policy outperform all considered existing schemes.Comment: 6 pages, 4 figures, ICC 2018 worksho
    corecore