
    A recommender system for process discovery

    Over the last decade, several algorithms for process discovery and process conformance have been proposed. Still, it is well accepted that there is no dominant algorithm in either discipline, so it is often difficult to apply them successfully. Most of these algorithms require near-expert knowledge to be applied satisfactorily. In this paper, we present a recommender system that uses portfolio-based algorithm selection strategies to address two problems: finding the best discovery algorithm for the data at hand, and bridging the gap between general users and process mining algorithms. Experiments performed with the developed tool demonstrate the usefulness of the approach for a variety of instances. Peer Reviewed. Postprint (author’s final draft

    Exploring the generalizability of visual search strategies

    When searching our visual environment, we often have multiple strategies available. For example, when looking for apples on a supermarket shelf, you can look for red things, round things, or you can just search serially through all items. How do we choose a strategy? Recent research on this question has revealed substantial variation across individuals in attentional control strategies when approaching visual search tasks, and the strategies have been found to be reliable within subjects. However, strategies on one visual search task have failed to generalize across different paradigms that assess various components of strategy use (Clarke et al., 2018). Thus, evidence for whether strategies generalize beyond a single paradigm remains scarce. While previous tests of generalizability used paradigms that vary in many ways, we focused on a single strategy component that could be preserved across tasks, with several other changes. In two experiments, we assessed the correlation between individuals' strategies in the Standard Adaptive Choice Visual Search (Standard ACVS; Irons & Leber, 2018) and a novel modified visual search task, the Spatial ACVS. In the Standard ACVS, participants seeking to perform optimally have to enumerate subsets of differently colored squares and identify the smaller subset to choose a target from. Similarly, in the Spatial ACVS, participants seeking optimal performance have to enumerate spatially separate subsets of squares (one on the left and one on the right side of the display), choosing the target in the smaller subset. Participants finished both tasks in the same order in one experimental session. Results showed a positive correlation in optimal target choices between the two tasks (r = .38), indicating similar strategy usage. Future studies can focus on which strategy components are more likely to generalize across tasks and whether an individual's strategy can generalize to tasks that combine several strategy components. The ultimate goal is to fully understand how people choose their attentional control strategies in unconstrained, real-life environments. NSF BCS-1632296. No embargo. Academic Major: Psycholog

    A Multi-objective Exploratory Procedure for Regression Model Selection

    Variable selection is recognized as one of the most critical steps in statistical modeling. The problems encountered in engineering and social sciences are commonly characterized by an over-abundance of explanatory variables, non-linearities and unknown interdependencies between the regressors. An added difficulty is that the analysts may have little or no prior knowledge of the relative importance of the variables. To provide a robust method for model selection, this paper introduces the Multi-objective Genetic Algorithm for Variable Selection (MOGA-VS), which provides the user with an optimal set of regression models for a given data set. The algorithm treats the regression problem as a two-objective task and explores the Pareto-optimal (best subset) models by preferring models that have fewer regression coefficients and better goodness of fit. The model exploration can be performed based on in-sample or generalization error minimization. Model selection is performed in two steps. First, we generate the frontier of Pareto-optimal regression models by eliminating the dominated models without any user intervention. Second, a decision-making process is executed which allows the user to choose the most preferred model using visualisations and simple metrics. The method has been evaluated on a recently published real dataset on Communities and Crime within the United States. Comment: in Journal of Computational and Graphical Statistics, Vol. 24, Iss. 1, 201
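
The first step described in this abstract, eliminating dominated models to obtain the Pareto frontier, can be illustrated with a minimal sketch. This is not the MOGA-VS implementation: the genetic search is replaced here by a brute-force filter over an already enumerated list of candidates, and the `Candidate` type, the model names, and the error values are hypothetical.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """A fitted regression model summarized by two objectives (hypothetical type)."""
    name: str
    n_coefficients: int  # model size, lower is better
    error: float         # in-sample or validation error, lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if `a` is no worse than `b` on both objectives and strictly better on one."""
    no_worse = a.n_coefficients <= b.n_coefficients and a.error <= b.error
    strictly_better = a.n_coefficients < b.n_coefficients or a.error < b.error
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    front = [c for c in candidates
             if not any(dominates(other, c) for other in candidates)]
    return sorted(front, key=lambda c: c.n_coefficients)


# Made-up candidates for illustration only.
candidates = [
    Candidate("m1", 2, 0.40),
    Candidate("m2", 3, 0.30),
    Candidate("m3", 3, 0.35),  # dominated by m2 (same size, higher error)
    Candidate("m4", 5, 0.30),  # dominated by m2 (same error, more coefficients)
    Candidate("m5", 6, 0.22),
]
front_names = [c.name for c in pareto_front(candidates)]
print(front_names)
```

The surviving models trade size against fit, which is the set a user would then inspect in the second, decision-making step.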

    Automatic Curriculum Learning For Deep RL: A Short Survey

    Automatic Curriculum Learning (ACL) has become a cornerstone of recent successes in Deep Reinforcement Learning (DRL). These methods shape the learning trajectories of agents by challenging them with tasks adapted to their capacities. In recent years, they have been used to improve sample efficiency and asymptotic performance, to organize exploration, to encourage generalization or to solve sparse reward problems, among others. The ambition of this work is dual: 1) to present a compact and accessible introduction to the Automatic Curriculum Learning literature and 2) to draw a bigger picture of the current state of the art in ACL to encourage the cross-breeding of existing concepts and the emergence of new ideas. Comment: Accepted at IJCAI202

    Bag-Level Aggregation for Multiple Instance Active Learning in Instance Classification Problems

    A growing number of applications, e.g. video surveillance and medical image analysis, require training recognition systems from large amounts of weakly annotated data, while some targeted interactions with a domain expert are allowed to improve the training process. In such cases, active learning (AL) can reduce labeling costs for training a classifier by querying the expert to provide the labels of the most informative instances. This paper focuses on AL methods for instance classification problems in multiple instance learning (MIL), where data is arranged into sets, called bags, that are weakly labeled. Most AL methods focus on single instance learning problems. These methods are not suitable for MIL problems because they cannot account for the bag structure of data. In this paper, new methods for bag-level aggregation of instance informativeness are proposed for multiple instance active learning (MIAL). The aggregated informativeness method identifies the most informative instances based on classifier uncertainty, and queries bags incorporating the most information. The other proposed method, called cluster-based aggregative sampling, clusters data hierarchically in the instance space. The informativeness of instances is assessed by considering bag labels, inferred instance labels, and the proportion of labels that remain to be discovered in clusters. Both proposed methods significantly outperform reference methods in extensive experiments using benchmark data from several application domains. Results indicate that using an appropriate strategy to address MIAL problems yields a significant reduction in the number of queries needed to achieve the same level of performance as single instance AL methods.
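
The idea of scoring instances by classifier uncertainty and then querying the bag that carries the most information can be sketched roughly as follows. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: the binary uncertainty measure, the top-k sum used as the aggregation rule, the function names, and the example probabilities.

```python
from typing import Dict, List


def instance_uncertainty(p_positive: float) -> float:
    """Uncertainty of a binary prediction: 1.0 at p = 0.5, 0.0 at p = 0 or p = 1."""
    return 1.0 - abs(2.0 * p_positive - 1.0)


def bag_informativeness(probs: List[float], top_k: int = 2) -> float:
    """Sum the top_k instance uncertainties in a bag (assumed aggregation rule)."""
    scores = sorted((instance_uncertainty(p) for p in probs), reverse=True)
    return sum(scores[:top_k])


def select_bag(bags: Dict[str, List[float]], top_k: int = 2) -> str:
    """Return the id of the bag with the highest aggregated informativeness."""
    return max(bags, key=lambda bag_id: bag_informativeness(bags[bag_id], top_k))


# Hypothetical predicted positive-class probabilities for the instances in each bag.
bags = {
    "bag_a": [0.95, 0.90, 0.05],  # classifier is confident about every instance
    "bag_b": [0.55, 0.45, 0.99],  # two instances near the decision boundary
    "bag_c": [0.50, 0.98, 0.02],  # one maximally uncertain instance
}
chosen = select_bag(bags)
print(chosen)
```

With these numbers the bag containing two borderline instances is queried first, which shows how a bag-level score can differ from simply picking the single most uncertain instance.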

    Evolving Ensemble Fuzzy Classifier

    The concept of ensemble learning offers a promising avenue for learning from data streams in complex environments because it addresses the bias-variance dilemma better than its single-model counterpart and features a reconfigurable structure, which is well suited to the given context. While various extensions of ensemble learning for mining non-stationary data streams can be found in the literature, most of them are built on a static base classifier and revisit preceding samples in the sliding window for a retraining step. This feature causes computationally prohibitive complexity and is not flexible enough to cope with rapidly changing environments. Their complexity is often demanding because they involve a large collection of offline classifiers, owing to the absence of structural complexity reduction mechanisms and the lack of an online feature selection mechanism. A novel evolving ensemble classifier, namely Parsimonious Ensemble (pENsemble), is proposed in this paper. pENsemble differs from existing architectures in that it is built upon an evolving classifier for data streams, termed Parsimonious Classifier (pClass). pENsemble is equipped with an ensemble pruning mechanism, which estimates a localized generalization error of a base classifier. A dynamic online feature selection scenario is integrated into pENsemble. This method allows for dynamic selection and deselection of input features on the fly. pENsemble adopts a dynamic ensemble structure to output a final classification decision, and it features a novel drift detection scenario to grow the ensemble structure. The efficacy of pENsemble has been demonstrated through rigorous numerical studies with dynamic and evolving data streams, where it delivers the most encouraging performance in attaining a tradeoff between accuracy and complexity. Comment: this paper has been published by IEEE Transactions on Fuzzy System