Next challenges for adaptive learning systems
Learning from evolving streaming data has become a 'hot' research topic in the last decade, and many adaptive learning algorithms have been developed. This research was stimulated by rapidly growing amounts of industrial, transactional, sensor and other business data that arrives in real time and needs to be mined in real time. Under such circumstances, constant manual adjustment of models is inefficient, and with increasing amounts of data it is becoming infeasible. Nevertheless, adaptive learning models are still rarely employed in business applications in practice. In the light of rapidly growing structurally rich 'big data', a new generation of parallel computing solutions and cloud computing services, as well as recent advances in portable computing devices, this article aims to identify the key research directions to be taken to bring adaptive learning closer to application needs. We identify six forthcoming challenges in designing and building adaptive learning (prediction) systems: making adaptive systems scalable, dealing with realistic data, improving usability and trust, integrating expert knowledge, taking into account various application needs, and moving from adaptive algorithms towards adaptive tools. Those challenges are critical for the evolving stream settings, as the process of model building needs to be fully automated and continuous.
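The fully automated, continuous model building the abstract calls for can be illustrated with a minimal sketch, not any specific algorithm from the article: an online linear learner updated one example at a time, plus a crude windowed-error drift detector that discards the model when the stream's concept shifts. The class name, window size, and threshold below are all illustrative assumptions.

```python
import random

class OnlineLinearModel:
    """Minimal online least-squares learner, updated one example at a time."""
    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

    def update(self, x, y):
        """One SGD step on the squared error; returns the prediction error."""
        err = self.predict(x) - y
        for i, xi in enumerate(x):
            self.w[i] -= self.lr * err * xi
        self.b -= self.lr * err
        return err

# Simulate a stream whose target relation flips midway (abrupt concept drift).
random.seed(0)
model = OnlineLinearModel(n_features=1)
WINDOW, window, resets = 30, [], 0
for t in range(2000):
    x = [random.uniform(-1.0, 1.0)]
    slope = 2.0 if t < 1000 else -2.0   # drift at t = 1000
    y = slope * x[0]
    err = abs(model.update(x, y))
    window.append(err)
    if len(window) > WINDOW:
        window.pop(0)
    # Crude drift signal: mean recent error exceeds a hand-picked threshold.
    if t >= 100 and len(window) == WINDOW and sum(window) / WINDOW > 0.8:
        model = OnlineLinearModel(n_features=1)  # drop the stale model
        window.clear()
        resets += 1

print("final weight:", round(model.w[0], 2))
print("drift-triggered resets:", resets)
```

The detector here is deliberately naive; the point is only that adaptation (detect, reset, relearn) runs with no manual intervention, which is what "fully automated and continuous" model building demands.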
Where are we headed in business analytics? A framework based on a paradigmatic analysis of the history of analytics
The explosion of interest in business analytics (BA) comes with multiple problems. With as many as eleven distinct disciplines teaching analytics, it is not clear which areas of study constitute the BA field. If the information systems (IS) field is to exert a significant influence in analytics, what the IS researcher and practitioner need to focus on has to be made clear. Using a paradigmatic historiographical analysis of the field of analytics, this study provides evidence for the bifurcation of analytics into data science and BA as the founding disciplines of computer science, mathematics and statistics, machine learning, and IS contribute to the analytics movement. The results from this analysis also identify a set of conceptual foundations for BA that takes advantage of the intellectual strengths of the IS field without sacrificing the necessary depth of data science.
VisRuler: Visual Analytics for Extracting Decision Rules from Bagged and Boosted Decision Trees
Bagging and boosting are two popular ensemble methods in machine learning
(ML) that produce many individual decision trees. Due to the inherent ensemble
characteristic of these methods, they typically outperform single decision
trees or other ML models in predictive performance. However, numerous decision
paths are generated for each decision tree, increasing the overall complexity
of the model and hindering its use in domains that require trustworthy and
explainable decisions, such as finance, social care, and health care. Thus, the
interpretability of bagging and boosting algorithms, such as random forest and
adaptive boosting, reduces as the number of decisions rises. In this paper, we
propose a visual analytics tool that aims to assist users in extracting
decisions from such ML models via a thorough visual inspection workflow that
includes selecting a set of robust and diverse models (originating from
different ensemble learning algorithms), choosing important features according
to their global contribution, and deciding which decisions are essential for
global explanation (or locally, for specific cases). The outcome is a final
decision based on the class agreement of several models and the explored manual
decisions exported by users. We evaluated the applicability and effectiveness
of VisRuler via a use case, a usage scenario, and a user study. The evaluation
revealed that most users managed to successfully use our system to explore
decision rules visually, performing the proposed tasks and answering the given
questions in a satisfying way. (Comment: This manuscript is currently under review.)
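The core idea VisRuler builds on, that each tree in a bagged ensemble encodes extractable decision paths and that a final decision can rest on class agreement across models, can be sketched independently of the tool itself. The following toy example is illustrative only: depth-1 "stumps" stand in for full decision trees, a bagged ensemble is trained on bootstrap resamples, each tree's rule is printed as text, and the final prediction is a majority vote.

```python
import random

def train_stump(data):
    """Pick the (feature, threshold, polarity) split with the fewest errors."""
    best = None
    for f in range(len(data[0][0])):
        for x, _ in data:
            thr = x[f]
            for above in (0, 1):  # class predicted when x[f] > thr
                errs = sum(1 for xi, yi in data
                           if (above if xi[f] > thr else 1 - above) != yi)
                if best is None or errs < best[0]:
                    best = (errs, f, thr, above)
    _, f, thr, above = best
    return f, thr, above

def stump_predict(stump, x):
    f, thr, above = stump
    return above if x[f] > thr else 1 - above

# Toy data: class 1 iff feature 0 > 0.5; feature 1 is irrelevant noise.
random.seed(1)
points = [[random.random(), random.random()] for _ in range(40)]
data = [(x, int(x[0] > 0.5)) for x in points]

# Bagging: each stump trains on a bootstrap resample of the data.
ensemble = [train_stump([random.choice(data) for _ in data]) for _ in range(5)]

# Extract each tree's decision path as a human-readable rule.
for f, thr, above in ensemble:
    print(f"IF x[{f}] > {thr:.3f} THEN class {above} ELSE class {1 - above}")

# Final decision by class agreement (majority vote) across the ensemble.
def vote(x):
    preds = [stump_predict(s, x) for s in ensemble]
    return max(set(preds), key=preds.count)

print("vote([0.9, 0.1]) ->", vote([0.9, 0.1]))
```

With real-depth trees each leaf corresponds to a conjunction of such conditions rather than a single one, which is exactly why the number of rules grows quickly and visual support for selecting the essential ones becomes useful.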
Near-Optimal Algorithms for Differentially-Private Principal Components
Principal components analysis (PCA) is a standard tool for identifying good
low-dimensional approximations to data in high dimension. Many data sets of
interest contain private or sensitive information about individuals. Algorithms
which operate on such data should be sensitive to the privacy risks in
publishing their outputs. Differential privacy is a framework for developing
tradeoffs between privacy and the utility of these outputs. In this paper we
investigate the theory and empirical performance of differentially private
approximations to PCA and propose a new method which explicitly optimizes the
utility of the output. We show that the sample complexity of the proposed
method differs from the existing procedure in the scaling with the data
dimension, and that our method is nearly optimal in terms of this scaling. We
furthermore illustrate our results, showing that on real data there is a large
performance gap between the existing method and our method. (Comment: 37 pages, 8 figures; final version to appear in the Journal of Machine Learning Research; preliminary version was at NIPS 201)
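A simple way to see the privacy/utility tradeoff for PCA is the input-perturbation baseline: add symmetric Gaussian noise to the data's second-moment matrix before taking the leading eigenvector. This is a common baseline, not the near-optimal method the paper proposes, and the noise scale `sigma` below is left as a free parameter rather than calibrated to a formal (epsilon, delta) guarantee.

```python
import math
import random

def top_eigenvector(mat, iters=200):
    """Power iteration for the leading eigenvector of a symmetric matrix."""
    d = len(mat)
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(mat[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def private_top_component(data, sigma):
    """Noisy second-moment matrix -> leading eigenvector (input perturbation)."""
    n, d = len(data), len(data[0])
    # Uncentered covariance (second-moment) matrix of the data.
    cov = [[sum(x[i] * x[j] for x in data) / n for j in range(d)]
           for i in range(d)]
    # Symmetrized Gaussian noise; sigma would be calibrated from the privacy
    # budget in a real deployment -- here it is just a free parameter.
    noise = [[random.gauss(0.0, sigma) for _ in range(d)] for _ in range(d)]
    noisy = [[cov[i][j] + (noise[i][j] + noise[j][i]) / 2 for j in range(d)]
             for i in range(d)]
    return top_eigenvector(noisy)

# Synthetic data concentrated along the direction (1, 1) / sqrt(2).
random.seed(2)
data = [[t + random.gauss(0, 0.1), t + random.gauss(0, 0.1)]
        for t in (random.uniform(-1.0, 1.0) for _ in range(500))]

v = private_top_component(data, sigma=0.01)
print("private leading direction:", [round(x, 2) for x in v])
```

When `sigma` is small relative to the eigengap the recovered direction is close to the true principal component; increasing `sigma` (stronger privacy) degrades it, which is the utility loss the paper's method is designed to minimize.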
Computational support for academic peer review: a perspective from artificial intelligence
New tools tackle an age-old practice.