1,190 research outputs found

    Data Mining and Decision Support: An Integrative Approach

    Get PDF

    Next challenges for adaptive learning systems

    Get PDF
    Learning from evolving streaming data has become a 'hot' research topic in the last decade and many adaptive learning algorithms have been developed. This research was stimulated by rapidly growing amounts of industrial, transactional, sensor and other business data that arrives in real time and needs to be mined in real time. Under such circumstances, constant manual adjustment of models is in-efficient and with increasing amounts of data is becoming infeasible. Nevertheless, adaptive learning models are still rarely employed in business applications in practice. In the light of rapidly growing structurally rich 'big data', new generation of parallel computing solutions and cloud computing services as well as recent advances in portable computing devices, this article aims to identify the current key research directions to be taken to bring the adaptive learning closer to application needs. We identify six forthcoming challenges in designing and building adaptive learning (pre-diction) systems: making adaptive systems scalable, dealing with realistic data, improving usability and trust, integrat-ing expert knowledge, taking into account various application needs, and moving from adaptive algorithms towards adaptive tools. Those challenges are critical for the evolving stream settings, as the process of model building needs to be fully automated and continuous.</jats:p

    Where are we headed in business analytics? A framework based on a paradigmatic analysis of the history of analytics

    Get PDF
    The explosion of interest in business analytics (BA) comes with multiple problems. With as many as eleven distinct disciplines teaching analytics, it is not clear which areas of study constitute the BA field. If the information systems (IS) field is to exert a significant influence in analytics, what the IS researcher and practitioner need to focus on has to be made clear. Using a paradigmatic historiographical analysis of the field of analytics this study provides evidence for the bifurcation of analytics into data science and BA as founding disciplines of computer science, mathematics and statistics, machine learning and IS contribute to the analytics movement. The results from this analysis also identify a set of conceptual foundations for BA that takes advantage of both the intellectual strengths of the IS field without sacrificing the necessary depth of data science

    VisRuler: Visual Analytics for Extracting Decision Rules from Bagged and Boosted Decision Trees

    Full text link
    Bagging and boosting are two popular ensemble methods in machine learning (ML) that produce many individual decision trees. Due to the inherent ensemble characteristic of these methods, they typically outperform single decision trees or other ML models in predictive performance. However, numerous decision paths are generated for each decision tree, increasing the overall complexity of the model and hindering its use in domains that require trustworthy and explainable decisions, such as finance, social care, and health care. Thus, the interpretability of bagging and boosting algorithms, such as random forest and adaptive boosting, reduces as the number of decisions rises. In this paper, we propose a visual analytics tool that aims to assist users in extracting decisions from such ML models via a thorough visual inspection workflow that includes selecting a set of robust and diverse models (originating from different ensemble learning algorithms), choosing important features according to their global contribution, and deciding which decisions are essential for global explanation (or locally, for specific cases). The outcome is a final decision based on the class agreement of several models and the explored manual decisions exported by users. We evaluated the applicability and effectiveness of VisRuler via a use case, a usage scenario, and a user study. The evaluation revealed that most users managed to successfully use our system to explore decision rules visually, performing the proposed tasks and answering the given questions in a satisfying way.Comment: This manuscript is currently under revie

    Near-Optimal Algorithms for Differentially-Private Principal Components

    Full text link
    Principal components analysis (PCA) is a standard tool for identifying good low-dimensional approximations to data in high dimension. Many data sets of interest contain private or sensitive information about individuals. Algorithms which operate on such data should be sensitive to the privacy risks in publishing their outputs. Differential privacy is a framework for developing tradeoffs between privacy and the utility of these outputs. In this paper we investigate the theory and empirical performance of differentially private approximations to PCA and propose a new method which explicitly optimizes the utility of the output. We show that the sample complexity of the proposed method differs from the existing procedure in the scaling with the data dimension, and that our method is nearly optimal in terms of this scaling. We furthermore illustrate our results, showing that on real data there is a large performance gap between the existing method and our method.Comment: 37 pages, 8 figures; final version to appear in the Journal of Machine Learning Research, preliminary version was at NIPS 201

    Computational support for academic peer review:a perspective from artificial intelligence

    Get PDF
    New tools tackle an age-old practice.</jats:p
    • …
    corecore