109 research outputs found
Exploratory Analysis of Highly Heterogeneous Document Collections
We present an effective multifaceted system for exploratory analysis of
highly heterogeneous document collections. Our system is based on intelligently
tagging individual documents in a purely automated fashion and exploiting these
tags in a powerful faceted browsing framework. Tagging strategies employed
include both unsupervised and supervised approaches based on machine learning
and natural language processing. As one of our key tagging strategies, we
introduce the KERA algorithm (Keyword Extraction for Reports and Articles).
KERA extracts topic-representative terms from individual documents in a purely
unsupervised fashion and is shown to be significantly more effective than
state-of-the-art methods. Finally, we evaluate our system in its ability to
help users locate documents pertaining to military critical technologies buried
deep in a large heterogeneous sea of information.
Comment: 9 pages; KDD 2013: 19th ACM SIGKDD Conference on Knowledge Discovery
and Data Minin
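The abstract does not describe how KERA works, so the following is not KERA. As a rough illustration of what unsupervised extraction of topic-representative terms can look like, here is a minimal TF-IDF baseline. The function names and the toy documents are hypothetical and chosen only for this sketch.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_keywords(docs, top_k=3):
    """Return the top_k highest-scoring TF-IDF terms for each document.

    This is a generic baseline, not the KERA algorithm from the abstract.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    results = []
    for tokens in tokenized:
        tf = Counter(tokens)
        scores = {
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in tf.items()
        }
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        results.append(ranked[:top_k])
    return results


# Toy documents, invented for illustration only.
docs = [
    "radar antenna design and radar signal processing",
    "battery chemistry and battery charging cycles",
    "signal processing for antenna arrays",
]
keywords = tfidf_keywords(docs, top_k=1)
print(keywords)
```

Terms that occur often in one document but rarely across the collection score highest, which is why such terms can serve as tags for faceted browsing.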
Deep Reinforcement Learning that Matters
In recent years, significant progress has been made in solving challenging
problems across various domains using deep reinforcement learning (RL).
Reproducing existing work and accurately judging the improvements offered by
novel methods is vital to sustaining this progress. Unfortunately, reproducing
results for state-of-the-art deep RL methods is seldom straightforward. In
particular, non-determinism in standard benchmark environments, combined with
variance intrinsic to the methods, can make reported results tough to
interpret. Without significance metrics and tighter standardization of
experimental reporting, it is difficult to determine whether improvements over
the prior state-of-the-art are meaningful. In this paper, we investigate
challenges posed by reproducibility, proper experimental techniques, and
reporting procedures. We illustrate the variability in reported metrics and
results when comparing against common baselines and suggest guidelines to make
future results in deep RL more reproducible. We aim to spur discussion about
how to ensure continued progress in the field by minimizing wasted effort
stemming from results that are non-reproducible and easily misinterpreted.
Comment: Accepted to the Thirty-Second AAAI Conference on Artificial
Intelligence (AAAI), 201
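The abstract calls for significance metrics when comparing deep RL methods but does not name specific ones. As a sketch under that assumption, the code below compares two sets of per-seed returns using Welch's t statistic and a percentile bootstrap confidence interval for the difference in means. The return values are made up for illustration, and the function names are hypothetical.

```python
import math
import random
import statistics


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va, vb = statistics.variance(a), statistics.variance(b)
    na, nb = len(a), len(b)
    se2 = va / na + vb / nb  # squared standard error of the mean difference
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df


def bootstrap_mean_diff_ci(a, b, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for mean(a) - mean(b)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        resampled_a = [rng.choice(a) for _ in a]
        resampled_b = [rng.choice(b) for _ in b]
        diffs.append(statistics.mean(resampled_a) - statistics.mean(resampled_b))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_resamples)]
    upper = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical final returns from five random seeds per algorithm.
baseline = [3100.0, 2850.0, 3400.0, 2600.0, 3250.0]
proposed = [3300.0, 2900.0, 3700.0, 2500.0, 3500.0]

t_stat, dof = welch_t(proposed, baseline)
ci_low, ci_high = bootstrap_mean_diff_ci(proposed, baseline)
print(f"t = {t_stat:.3f}, df = {dof:.1f}, 95% CI = ({ci_low:.1f}, {ci_high:.1f})")
```

With only five seeds per method, an interval that spans zero signals that the apparent gain in mean return may be no more than seed-to-seed variance.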
Holistic Measures for Evaluating Prediction Models in Smart Grids
The performance of prediction models is often based on "abstract metrics"
that estimate the model's ability to limit residual errors between the observed
and predicted values. However, meaningful evaluation and selection of
prediction models for end-user domains requires holistic and
application-sensitive performance measures. Inspired by energy consumption
prediction models used in the emerging "big data" domain of Smart Power Grids,
we propose a suite of performance measures to rationally compare models along
the dimensions of scale independence, reliability, volatility and cost. We
include both application-independent and application-dependent measures, the latter
parameterized to allow customization by domain experts to fit their scenario.
While our measures are generalizable to other domains, we offer an empirical
analysis using real energy use data for three Smart Grid applications:
planning, customer education and demand response, which are relevant for energy
sustainability. Our results underscore the value of the proposed measures in
offering deeper insight into models' behavior and their impact on real
applications, which benefits both data mining researchers and practitioners.
Comment: 14 Pages, 8 figures, Accepted and to appear in IEEE Transactions on
Knowledge and Data Engineering, 2014. Authors' final version. Copyright
transferred to IEE
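The abstract lists scale independence as one comparison dimension without naming the measures used. As an assumed illustration, the sketch below computes two standard scale-independent error measures, MAPE and CVRMSE, and contrasts them with plain RMSE on the same toy series at two different scales. All data values and function names are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error (depends on the scale of the data)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mape(observed, predicted):
    """Mean absolute percentage error, as a fraction (observed must be nonzero)."""
    n = len(observed)
    return sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / n


def cvrmse(observed, predicted):
    """Coefficient of variation of the RMSE: RMSE divided by the observed mean."""
    return rmse(observed, predicted) / (sum(observed) / len(observed))


# Toy consumption series (invented), e.g. one small and one large consumer.
small_obs = [10.0, 12.0, 8.0, 10.0]
small_pred = [11.0, 11.0, 9.0, 9.0]
large_obs = [100.0 * v for v in small_obs]
large_pred = [100.0 * v for v in small_pred]

print(rmse(small_obs, small_pred), rmse(large_obs, large_pred))
print(mape(small_obs, small_pred), mape(large_obs, large_pred))
print(cvrmse(small_obs, small_pred), cvrmse(large_obs, large_pred))
```

RMSE grows by a factor of 100 with the data, while MAPE and CVRMSE stay the same, which is what makes the latter two usable for comparing models across consumers of very different sizes.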
GP vs GI: if you can't beat them, join them
Genetic Programming (GP) has been criticized for targeting irrelevant problems [12], a criticism that also applies to the wider machine learning community [11], which has become detached from the source of the data it is using to drive the field forward. Recently, however, Genetic Improvement (GI) has provided a fresh perspective on automated programming. In contrast to GP, GI begins with existing software and therefore immediately aims to tackle real software. As evolution is GI's main approach to manipulating programs, this connection with real software should persuade the GP community to confront the issues around what it originally set out to tackle, i.e. evolving real software.
Introduction to the special issue on Machine learning for multiple modalities in interactive systems and robots
This special issue highlights research articles that apply machine learning to robots and other systems that interact with users through more than one modality, such as speech, gestures, and vision. For example, a robot may coordinate its speech with its actions, taking into account (audio-)visual feedback during their execution. Machine learning provides interactive systems with opportunities to improve performance not only of individual components but also of the system as a whole. However, machine learning methods that encompass multiple modalities of an interactive system are still relatively hard to find. The articles in this special issue represent examples that contribute to filling this gap.