Next challenges for adaptive learning systems
Learning from evolving streaming data has become a 'hot' research topic in the last decade, and many adaptive learning algorithms have been developed. This research was stimulated by rapidly growing amounts of industrial, transactional, sensor and other business data that arrives in real time and needs to be mined in real time. Under such circumstances, constant manual adjustment of models is inefficient, and with increasing amounts of data it is becoming infeasible. Nevertheless, adaptive learning models are still rarely employed in business applications in practice. In the light of rapidly growing structurally rich 'big data', a new generation of parallel computing solutions and cloud computing services, as well as recent advances in portable computing devices, this article aims to identify the key research directions to be taken to bring adaptive learning closer to application needs. We identify six forthcoming challenges in designing and building adaptive learning (prediction) systems: making adaptive systems scalable, dealing with realistic data, improving usability and trust, integrating expert knowledge, taking into account various application needs, and moving from adaptive algorithms towards adaptive tools. Those challenges are critical for the evolving stream settings, as the process of model building needs to be fully automated and continuous.
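The fully automated, continuous model building the abstract calls for can be illustrated with a minimal sketch, not any specific algorithm from the article: an online linear learner updated one example at a time, plus a crude windowed-error drift detector that discards the model when the stream's concept shifts. The class name, window size, and threshold below are all illustrative assumptions.

```python
import random

class OnlineLinearModel:
    """Minimal online least-squares learner, updated one example at a time."""
    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

    def update(self, x, y):
        """One SGD step on the squared error; returns the prediction error."""
        err = self.predict(x) - y
        for i, xi in enumerate(x):
            self.w[i] -= self.lr * err * xi
        self.b -= self.lr * err
        return err

# Simulate a stream whose target relation flips midway (abrupt concept drift).
random.seed(0)
model = OnlineLinearModel(n_features=1)
WINDOW, window, resets = 30, [], 0
for t in range(2000):
    x = [random.uniform(-1.0, 1.0)]
    slope = 2.0 if t < 1000 else -2.0   # drift at t = 1000
    y = slope * x[0]
    err = abs(model.update(x, y))
    window.append(err)
    if len(window) > WINDOW:
        window.pop(0)
    # Crude drift signal: mean recent error exceeds a hand-picked threshold.
    if t >= 100 and len(window) == WINDOW and sum(window) / WINDOW > 0.8:
        model = OnlineLinearModel(n_features=1)  # drop the stale model
        window.clear()
        resets += 1

print("final weight:", round(model.w[0], 2))
print("drift-triggered resets:", resets)
```

The detector here is deliberately naive; the point is only that adaptation (detect, reset, relearn) runs with no manual intervention, which is what "fully automated and continuous" model building demands.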
Where are we headed in business analytics? A framework based on a paradigmatic analysis of the history of analytics
The explosion of interest in business analytics (BA) comes with multiple problems. With as many as eleven distinct disciplines teaching analytics, it is not clear which areas of study constitute the BA field. If the information systems (IS) field is to exert a significant influence in analytics, what the IS researcher and practitioner need to focus on has to be made clear. Using a paradigmatic historiographical analysis of the field of analytics, this study provides evidence for the bifurcation of analytics into data science and BA as the founding disciplines of computer science, mathematics and statistics, machine learning, and IS contribute to the analytics movement. The results from this analysis also identify a set of conceptual foundations for BA that takes advantage of the intellectual strengths of the IS field without sacrificing the necessary depth of data science.
VisRuler: Visual Analytics for Extracting Decision Rules from Bagged and Boosted Decision Trees
Bagging and boosting are two popular ensemble methods in machine learning
(ML) that produce many individual decision trees. Due to the inherent ensemble
characteristic of these methods, they typically outperform single decision
trees or other ML models in predictive performance. However, numerous decision
paths are generated for each decision tree, increasing the overall complexity
of the model and hindering its use in domains that require trustworthy and
explainable decisions, such as finance, social care, and health care. Thus, the
interpretability of bagging and boosting algorithms, such as random forest and
adaptive boosting, reduces as the number of decisions rises. In this paper, we
propose a visual analytics tool that aims to assist users in extracting
decisions from such ML models via a thorough visual inspection workflow that
includes selecting a set of robust and diverse models (originating from
different ensemble learning algorithms), choosing important features according
to their global contribution, and deciding which decisions are essential for
global explanation (or locally, for specific cases). The outcome is a final
decision based on the class agreement of several models and the explored manual
decisions exported by users. We evaluated the applicability and effectiveness
of VisRuler via a use case, a usage scenario, and a user study. The evaluation
revealed that most users managed to successfully use our system to explore
decision rules visually, performing the proposed tasks and answering the given
questions in a satisfying way. (Comment: This manuscript is currently under review.)
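The core idea VisRuler builds on, that each tree in a bagged ensemble encodes extractable decision paths and that a final decision can rest on class agreement across models, can be sketched independently of the tool itself. The following toy example is illustrative only: depth-1 "stumps" stand in for full decision trees, a bagged ensemble is trained on bootstrap resamples, each tree's rule is printed as text, and the final prediction is a majority vote.

```python
import random

def train_stump(data):
    """Pick the (feature, threshold, polarity) split with the fewest errors."""
    best = None
    for f in range(len(data[0][0])):
        for x, _ in data:
            thr = x[f]
            for above in (0, 1):  # class predicted when x[f] > thr
                errs = sum(1 for xi, yi in data
                           if (above if xi[f] > thr else 1 - above) != yi)
                if best is None or errs < best[0]:
                    best = (errs, f, thr, above)
    _, f, thr, above = best
    return f, thr, above

def stump_predict(stump, x):
    f, thr, above = stump
    return above if x[f] > thr else 1 - above

# Toy data: class 1 iff feature 0 > 0.5; feature 1 is irrelevant noise.
random.seed(1)
points = [[random.random(), random.random()] for _ in range(40)]
data = [(x, int(x[0] > 0.5)) for x in points]

# Bagging: each stump trains on a bootstrap resample of the data.
ensemble = [train_stump([random.choice(data) for _ in data]) for _ in range(5)]

# Extract each tree's decision path as a human-readable rule.
for f, thr, above in ensemble:
    print(f"IF x[{f}] > {thr:.3f} THEN class {above} ELSE class {1 - above}")

# Final decision by class agreement (majority vote) across the ensemble.
def vote(x):
    preds = [stump_predict(s, x) for s in ensemble]
    return max(set(preds), key=preds.count)

print("vote([0.9, 0.1]) ->", vote([0.9, 0.1]))
```

With real-depth trees each leaf corresponds to a conjunction of such conditions rather than a single one, which is exactly why the number of rules grows quickly and visual support for selecting the essential ones becomes useful.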
Near-Optimal Algorithms for Differentially-Private Principal Components
Principal components analysis (PCA) is a standard tool for identifying good
low-dimensional approximations to data in high dimension. Many data sets of
interest contain private or sensitive information about individuals. Algorithms
which operate on such data should be sensitive to the privacy risks in
publishing their outputs. Differential privacy is a framework for developing
tradeoffs between privacy and the utility of these outputs. In this paper we
investigate the theory and empirical performance of differentially private
approximations to PCA and propose a new method which explicitly optimizes the
utility of the output. We show that the sample complexity of the proposed
method differs from the existing procedure in the scaling with the data
dimension, and that our method is nearly optimal in terms of this scaling. We
furthermore illustrate our results, showing that on real data there is a large
performance gap between the existing method and our method. (Comment: 37 pages, 8 figures; final version to appear in the Journal of Machine Learning Research; preliminary version was at NIPS 201)
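A simple way to see the privacy/utility tradeoff for PCA is the input-perturbation baseline: add symmetric Gaussian noise to the data's second-moment matrix before taking the leading eigenvector. This is a common baseline, not the near-optimal method the paper proposes, and the noise scale `sigma` below is left as a free parameter rather than calibrated to a formal (epsilon, delta) guarantee.

```python
import math
import random

def top_eigenvector(mat, iters=200):
    """Power iteration for the leading eigenvector of a symmetric matrix."""
    d = len(mat)
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(mat[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def private_top_component(data, sigma):
    """Noisy second-moment matrix -> leading eigenvector (input perturbation)."""
    n, d = len(data), len(data[0])
    # Uncentered covariance (second-moment) matrix of the data.
    cov = [[sum(x[i] * x[j] for x in data) / n for j in range(d)]
           for i in range(d)]
    # Symmetrized Gaussian noise; sigma would be calibrated from the privacy
    # budget in a real deployment -- here it is just a free parameter.
    noise = [[random.gauss(0.0, sigma) for _ in range(d)] for _ in range(d)]
    noisy = [[cov[i][j] + (noise[i][j] + noise[j][i]) / 2 for j in range(d)]
             for i in range(d)]
    return top_eigenvector(noisy)

# Synthetic data concentrated along the direction (1, 1) / sqrt(2).
random.seed(2)
data = [[t + random.gauss(0, 0.1), t + random.gauss(0, 0.1)]
        for t in (random.uniform(-1.0, 1.0) for _ in range(500))]

v = private_top_component(data, sigma=0.01)
print("private leading direction:", [round(x, 2) for x in v])
```

When `sigma` is small relative to the eigengap the recovered direction is close to the true principal component; increasing `sigma` (stronger privacy) degrades it, which is the utility loss the paper's method is designed to minimize.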
Computational support for academic peer review: a perspective from artificial intelligence
New tools tackle an age-old practice.