Simple stopping criteria for information theoretic feature selection
Feature selection aims to select the smallest feature subset that yields the
minimum generalization error. In the rich literature on feature selection,
information theory-based approaches seek a subset of features such that the
mutual information between the selected features and the class labels is
maximized. Despite the simplicity of this objective, there still remain several
open problems in optimization. These include, for example, the automatic
determination of the optimal subset size (i.e., the number of features) or a
stopping criterion if the greedy searching strategy is adopted. In this paper,
we suggest two stopping criteria by just monitoring the conditional mutual
information (CMI) among groups of variables. Using the recently developed
multivariate matrix-based Renyi's \alpha-entropy functional, which can be
directly estimated from data samples, we show that the CMI among groups of
variables can be easily computed without any decomposition or approximation,
hence making our criteria easy to implement and to integrate seamlessly into any
existing information theoretic feature selection method with a greedy search
strategy. Comment: Paper published in the journal Entropy.
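The stopping idea above can be sketched with a plain plug-in MI estimator for discrete features; this is an illustrative stand-in, not the paper's matrix-based Rényi \alpha-entropy functional, and all names (`mutual_info`, `greedy_forward_select`, the `eps` tolerance) are hypothetical:

```python
import numpy as np

def mutual_info(x, y):
    """Plug-in mutual information estimate for discrete variables (nats)."""
    mi = 0.0
    for xv in np.unique(x):
        for yv in np.unique(y):
            pxy = np.mean((x == xv) & (y == yv))
            if pxy > 0:
                mi += pxy * np.log(pxy / (np.mean(x == xv) * np.mean(y == yv)))
    return mi

def greedy_forward_select(X, y, eps=0.01):
    """Greedily add the feature with the largest MI gain; stop when the
    gain (a proxy for the conditional MI given the selected set) falls
    below eps -- the monitoring-based stopping criterion."""
    n, d = X.shape
    selected, remaining = [], list(range(d))
    joint = np.zeros(n, dtype=int)   # joint coding of the selected features
    best_mi = 0.0
    while remaining:
        gains = []
        for j in remaining:
            cand = joint * (X[:, j].max() + 1) + X[:, j]
            gains.append(mutual_info(cand, y) - best_mi)
        k = int(np.argmax(gains))
        if gains[k] < eps:           # negligible CMI: halt the search
            break
        j = remaining.pop(k)
        selected.append(j)
        joint = joint * (X[:, j].max() + 1) + X[:, j]
        best_mi += gains[k]
    return selected
```

Because the selected features are re-encoded as a single joint variable, no chain-rule decomposition or low-order approximation of the CMI is needed, mirroring the point made in the abstract.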
Resampling methods for parameter-free and robust feature selection with mutual information
Combining the mutual information criterion with a forward feature selection
strategy offers a good trade-off between optimality of the selected feature
subset and computation time. However, it requires setting the parameter(s) of
the mutual information estimator and determining when to halt the forward
procedure. These two choices are difficult to make because, as the
dimensionality of the subset increases, the estimation of the mutual
information becomes less and less reliable. This paper proposes to use
resampling methods, K-fold cross-validation and the permutation test, to
address both issues. The resampling methods bring information about the
variance of the estimator, information which can then be used to automatically
set the parameter and to calculate a threshold to stop the forward procedure.
The procedure is illustrated on a synthetic dataset as well as on real-world
examples.
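The permutation-test side of this idea can be sketched as follows: shuffling the labels destroys any real dependence, so the high quantile of the resulting null MI values gives a data-driven threshold below which the forward procedure should halt. The function names and the `mi` estimator here are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def mi(x, y):
    """Plug-in mutual information for discrete variables (nats)."""
    m = 0.0
    for a in np.unique(x):
        for b in np.unique(y):
            p = np.mean((x == a) & (y == b))
            if p > 0:
                m += p * np.log(p / (np.mean(x == a) * np.mean(y == b)))
    return m

def permutation_threshold(x, y, mi_fn, n_perm=200, alpha=0.05, seed=0):
    """Estimate the (1 - alpha) quantile of MI between x and permuted
    copies of y; MI scores below this value are indistinguishable from
    chance, so they signal when to stop the forward search."""
    rng = np.random.default_rng(seed)
    null = np.array([mi_fn(x, rng.permutation(y)) for _ in range(n_perm)])
    return np.quantile(null, 1 - alpha)
```

The same resampling machinery also exposes the variance of the estimator across folds or permutations, which is what the abstract uses to tune the estimator's parameter automatically.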
Perfect Information vs Random Investigation: Safety Guidelines for a Consumer in the Jungle of Product Differentiation
We present a graph-theoretic model of consumer choice, where final decisions
are shown to be influenced by information and knowledge, in the form of
individual awareness, discriminating ability, and perception of market
structure. Building upon Hotelling's distance-based differentiation idea,
we describe the behavioral experience of several prototypes of consumers, who
walk a hypothetical cognitive path in an attempt to maximize their
satisfaction. Our simulations show that even consumers endowed with a small
amount of information and knowledge may reach a very high level of utility. On
the other hand, complete ignorance negatively affects the whole consumption
process. In addition, rather unexpectedly, a random walk on the graph turns out
to be a winning strategy below a minimal threshold of information and
knowledge. Comment: 27 pages, 12 figures.
Massively-Parallel Feature Selection for Big Data
We present the Parallel, Forward-Backward with Pruning (PFBP) algorithm for
feature selection (FS) in Big Data settings (high dimensionality and/or sample
size). To tackle the challenges of Big Data FS, PFBP partitions the data matrix
both in terms of rows (samples, training examples) and columns (features). By
employing the concepts of p-values of conditional independence tests and
meta-analysis techniques, PFBP manages to rely only on computations local to a
partition while minimizing communication costs. Then, it employs
powerful and safe (asymptotically sound) heuristics to make early, approximate
decisions, such as Early Dropping of features from consideration in subsequent
iterations, Early Stopping of consideration of features within the same
iteration, or Early Return of the winner in each iteration. PFBP provides
asymptotic guarantees of optimality for data distributions faithfully
representable by a causal network (Bayesian network or maximal ancestral
graph). Our empirical analysis confirms a super-linear speedup of the algorithm
with increasing sample size, linear scalability with respect to the number of
features and processing cores, while dominating other competitive algorithms in
its class.
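The Early Dropping heuristic can be illustrated in miniature: test each candidate feature, permanently drop those whose p-value exceeds a significance level so later iterations never reconsider them. This single-machine sketch uses a permutation p-value in place of PFBP's partitioned conditional independence tests and meta-analysis combination, and every name in it is an assumption for illustration:

```python
import numpy as np

def perm_pvalue(x, y, stat, n_perm=100, seed=0):
    """Permutation p-value for the null hypothesis that x and y are
    independent, using stat as the test statistic."""
    rng = np.random.default_rng(seed)
    obs = stat(x, y)
    null = [stat(x, rng.permutation(y)) for _ in range(n_perm)]
    return (1 + sum(s >= obs for s in null)) / (n_perm + 1)

def early_dropping_filter(X, y, stat, alpha=0.05):
    """Single-pass Early Dropping sketch: drop every feature whose
    p-value exceeds alpha; survivors are returned ranked by observed
    association and would feed the next (conditional) iteration,
    which is omitted here."""
    obs = {j: stat(X[:, j], y) for j in range(X.shape[1])}
    pvals = {j: perm_pvalue(X[:, j], y, stat) for j in range(X.shape[1])}
    survivors = [j for j, p in pvals.items() if p <= alpha]
    return sorted(survivors, key=lambda j: -obs[j])
```

In the full algorithm the dropped set shrinks the work done in every subsequent iteration, which is one source of the super-linear speedup reported in the abstract.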
Optimal Structuring of Assessment Processes in Competition Law: A Survey of Theoretical Approaches
In competition law, the problem of the optimal design of institutional and procedural rules concerns assessment processes of the pro- and anticompetitiveness of business behaviors. This is well recognized in the discussion about the relative merits of different assessment principles such as the rule of reason and per se rules. Supported by modern industrial organization research, which applies a more differentiated analysis to the welfare effects of different business behaviors, a full-scale case-by-case assessment seems to be the prevailing idea. Even though the discussion mainly focuses on extreme solutions, different theoretical approaches do exist, which provide important determinants and allow for a sound analysis of appropriate legal directives and investigation procedures from a "Law and Economics" perspective. Integrating and examining them in light of various constellations results in differentiated solutions of optimally structured assessment processes.
Keywords: Law Enforcement, Competition Law, Competition Policy, Antitrust Law, Antitrust Policy, Decision-Making
Basics of Feature Selection and Statistical Learning for High Energy Physics
This document introduces the basics of data preparation, feature selection, and
statistical learning for high energy physics tasks. The emphasis is on feature
selection by principal component analysis, information gain and significance
measures for features. As examples for basic statistical learning algorithms,
the maximum a posteriori and maximum likelihood classifiers are shown.
Furthermore, a simple rule based classification as a means for automated cut
finding is introduced. Finally, two toolboxes for the application of statistical
learning techniques are introduced. Comment: 12 pages, 8 figures. Part of the
proceedings of the Track 'Computational Intelligence for HEP Data Analysis' at iCSC 200
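The automated cut finding mentioned above reduces, in its simplest form, to scanning thresholds on one feature and keeping the cut with the largest information gain (entropy reduction). This is a generic sketch of that idea, not the document's toolbox code; `best_cut` and its signature are illustrative:

```python
import numpy as np

def entropy(labels):
    """Shannon entropy of a label array, in bits."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def best_cut(feature, labels):
    """Scan candidate thresholds and return the cut t maximizing the
    information gain of splitting the labels at feature <= t."""
    best_t, best_gain = None, 0.0
    h = entropy(labels)
    for t in np.unique(feature)[:-1]:       # max value cannot split
        left = labels[feature <= t]
        right = labels[feature > t]
        gain = h - (len(left) * entropy(left)
                    + len(right) * entropy(right)) / len(labels)
        if gain > best_gain:
            best_t, best_gain = t, gain
    return best_t, best_gain
```

Ranking features by the gain of their best cut is also one way to realize the information-gain feature selection the abstract refers to.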