31,850 research outputs found
A recommender system for process discovery
Over the last decade, several algorithms for process discovery and process conformance have been proposed. Still, it is well accepted that no algorithm dominates in either of these two disciplines, so it is often difficult to apply them successfully. Most of these algorithms require near-expert knowledge to be applied satisfactorily. In this paper, we present a recommender system that uses portfolio-based algorithm selection strategies to address two problems: finding the best discovery algorithm for the data at hand, and bridging the gap between general users and process mining algorithms. Experiments performed with the developed tool demonstrate the usefulness of the approach for a variety of instances. Peer Reviewed. Postprint (author’s final draft)
Exploring the generalizability of visual search strategies
When searching our visual environment, we often have multiple strategies available. For example, when looking for apples on a supermarket shelf, you can look for red things, round things, or you can just search serially through all items. How do we choose a strategy? Recent research on this question has revealed substantial variation across individuals in attentional control strategies when approaching visual search tasks, and the strategies have been found to be reliable within subjects. However, strategies on one visual search task have failed to generalize across different paradigms that assess various components of strategy use (Clarke et al., 2018). Thus, evidence for whether strategies generalize beyond a single paradigm remains scarce. While previous tests of generalizability used paradigms that vary in many ways, we focused on a single strategy component that could be preserved across tasks while several other aspects of the task changed. In two experiments, we assessed the correlation between individuals' strategies in the Standard Adaptive Choice Visual Search (Standard ACVS; Irons & Leber, 2018) and a novel, modified visual search task, the Spatial ACVS. In the Standard ACVS, participants seeking to perform optimally have to enumerate subsets of differently colored squares and identify the smaller subset to choose a target from. Similarly, in the Spatial ACVS, participants seeking optimal performance have to enumerate spatially separate subsets of squares (one on the left and one on the right side of the display), choosing the target in the smaller subset. Participants completed both tasks in the same order in one experimental session. Results showed a positive correlation in optimal target choices between the two tasks (r = .38), indicating similar strategy usage. Future studies can focus on which strategy components are more likely to generalize across tasks and whether an individual's strategy can generalize to tasks that combine several strategy components.
The ultimate goal is to fully understand how people choose their attentional control strategies in unconstrained, real-life environments. NSF BCS-1632296. No embargo. Academic Major: Psycholog
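The reported between-task correlation can be illustrated with a minimal sketch, assuming it is a Pearson correlation over each participant's proportion of optimal target choices (the abstract does not name the coefficient). The function name `pearson_r` and the per-participant values below are hypothetical and are not data from the study.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations, and sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # Undefined (division by zero) if either sample is constant.
    return cov / sqrt(var_x * var_y)


# Hypothetical proportions of optimal choices, one value per participant.
standard_acvs = [0.35, 0.50, 0.62, 0.71, 0.80, 0.44]
spatial_acvs = [0.40, 0.38, 0.66, 0.59, 0.75, 0.61]

r = pearson_r(standard_acvs, spatial_acvs)
print(f"r = {r:.2f}")
```

A positive value means that participants who chose optimally more often in one task also tended to do so in the other.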
A Multi-objective Exploratory Procedure for Regression Model Selection
Variable selection is recognized as one of the most critical steps in
statistical modeling. The problems encountered in engineering and social
sciences are commonly characterized by over-abundance of explanatory variables,
non-linearities and unknown interdependencies between the regressors. An added
difficulty is that the analysts may have little or no prior knowledge on the
relative importance of the variables. To provide a robust method for model
selection, this paper introduces the Multi-objective Genetic Algorithm for
Variable Selection (MOGA-VS) that provides the user with an optimal set of
regression models for a given data-set. The algorithm considers the regression
problem as a two-objective task, and explores the Pareto-optimal (best subset)
models by preferring those models that have fewer
regression coefficients and better goodness of fit. The model exploration can
be performed based on in-sample or generalization error minimization. The model
selection is proposed to be performed in two steps. First, we generate the
frontier of Pareto-optimal regression models by eliminating the dominated
models without any user intervention. Second, a decision making process is
executed which allows the user to choose the most preferred model using
visualisations and simple metrics. The method has been evaluated on a recently
published real dataset on Communities and Crime within the United States.
Comment: in Journal of Computational and Graphical Statistics, Vol. 24, Iss.
1, 201
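The first step described in this abstract, removing dominated models to obtain the Pareto-optimal frontier, can be sketched as follows. This is only an illustration of Pareto filtering over two objectives (number of coefficients and an error measure, both minimized); it is not the MOGA-VS genetic algorithm itself, and the names `Model`, `dominates`, `pareto_front` and the candidate values are hypothetical.

```python
from typing import List, Tuple

# (name, number of regression coefficients, error); both objectives are minimized.
Model = Tuple[str, int, float]


def dominates(a: Model, b: Model) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(models: List[Model]) -> List[Model]:
    """Keep only the models that no other model dominates."""
    return [m for m in models if not any(dominates(o, m) for o in models)]


# Hypothetical candidate models.
candidates = [
    ("m1", 1, 0.90),
    ("m2", 2, 0.60),
    ("m3", 3, 0.65),  # dominated by m2: more coefficients and larger error
    ("m4", 4, 0.40),
    ("m5", 4, 0.55),  # dominated by m4: same size, larger error
]

front = pareto_front(candidates)
print([name for name, _, _ in front])
```

The surviving models are the ones a user would then compare in the second, decision-making step.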
Automatic Curriculum Learning For Deep RL: A Short Survey
Automatic Curriculum Learning (ACL) has become a cornerstone of recent
successes in Deep Reinforcement Learning (DRL). These methods shape the learning
trajectories of agents by challenging them with tasks adapted to their
capacities. In recent years, they have been used to improve sample efficiency
and asymptotic performance, to organize exploration, to encourage
generalization or to solve sparse reward problems, among others. The ambition
of this work is dual: 1) to present a compact and accessible introduction to
the Automatic Curriculum Learning literature and 2) to draw a bigger picture of
the current state of the art in ACL to encourage the cross-breeding of existing
concepts and the emergence of new ideas. Comment: Accepted at IJCAI202
Bag-Level Aggregation for Multiple Instance Active Learning in Instance Classification Problems
A growing number of applications, e.g. video surveillance and medical image
analysis, require training recognition systems from large amounts of weakly
annotated data while some targeted interactions with a domain expert are
allowed to improve the training process. In such cases, active learning (AL)
can reduce labeling costs for training a classifier by querying the expert to
provide the labels of the most informative instances. This paper focuses on AL
methods for instance classification problems in multiple instance learning
(MIL), where data is arranged into sets, called bags, that are weakly labeled.
Most AL methods focus on single instance learning problems. These methods are
not suitable for MIL problems because they cannot account for the bag structure
of data. In this paper, new methods for bag-level aggregation of instance
informativeness are proposed for multiple instance active learning (MIAL). The
aggregated informativeness method identifies the most informative
instances based on classifier uncertainty, and queries bags incorporating the
most information. The other proposed method, called cluster-based
aggregative sampling, clusters data hierarchically in the instance space. The
informativeness of instances is assessed by considering bag labels, inferred
instance labels, and the proportion of labels that remain to be discovered in
clusters. Both proposed methods significantly outperform reference methods in
extensive experiments using benchmark data from several application domains.
Results indicate that using an appropriate strategy to address MIAL problems
yields a significant reduction in the number of queries needed to achieve the
same level of performance as single instance AL methods.
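The idea of aggregating instance informativeness at the bag level can be illustrated with a small sketch. The abstract does not give the scoring or aggregation formulas, so two assumptions are made here: instance uncertainty is measured as closeness of the predicted positive-class probability to 0.5, and a bag's score is the sum of its instances' uncertainties. The function names and the probabilities are hypothetical.

```python
from typing import Dict, List


def uncertainty(p: float) -> float:
    """Assumed uncertainty score: 1 when p = 0.5, 0 when p is 0 or 1."""
    return 1.0 - abs(2.0 * p - 1.0)


def most_informative_bag(bags: Dict[str, List[float]]) -> str:
    """Return the bag whose summed instance uncertainty is largest."""
    scores = {
        name: sum(uncertainty(p) for p in probs) for name, probs in bags.items()
    }
    return max(scores, key=scores.get)


# Hypothetical classifier outputs: positive-class probability per instance.
bags = {
    "bag_a": [0.95, 0.90, 0.05],  # confident predictions
    "bag_b": [0.55, 0.45, 0.50],  # predictions near the decision boundary
    "bag_c": [0.99, 0.60],
}

chosen = most_informative_bag(bags)
print(chosen)
```

Summation favours large bags; averaging or taking the maximum are equally plausible aggregation choices under these assumptions.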
Evolving Ensemble Fuzzy Classifier
The concept of ensemble learning offers a promising avenue in learning from
data streams under complex environments because it addresses the bias and
variance dilemma better than its single model counterpart and features a
reconfigurable structure, which is well suited to the given context. While
various extensions of ensemble learning for mining non-stationary data streams
can be found in the literature, most of them are crafted under a static base
classifier and revisit preceding samples in the sliding window for a
retraining step. This feature causes computationally prohibitive complexity and
is not flexible enough to cope with rapidly changing environments. Their
complexity is often demanding because they involve a large collection of
offline classifiers, owing to the absence of structural complexity reduction
mechanisms and the lack of an online feature selection mechanism. A novel evolving
ensemble classifier, namely Parsimonious Ensemble (pENsemble), is proposed in
this paper. pENsemble differs from existing architectures in that it
is built upon an evolving classifier from data streams, termed Parsimonious
Classifier (pClass). pENsemble is equipped with an ensemble pruning mechanism,
which estimates a localized generalization error of a base classifier. A
dynamic online feature selection scenario is integrated into the pENsemble.
This method allows for dynamic selection and deselection of input features on
the fly. pENsemble adopts a dynamic ensemble structure to output a final
classification decision where it features a novel drift detection scenario to
grow the ensemble structure. The efficacy of the pENsemble has been
demonstrated through rigorous numerical studies with dynamic and evolving data
streams where it delivers the most encouraging performance in attaining a
tradeoff between accuracy and complexity. Comment: this paper has been published by IEEE Transactions on Fuzzy System