On PAC-Bayesian Bounds for Random Forests
Existing guarantees in terms of rigorous upper bounds on the generalization
error for the original random forest algorithm, one of the most frequently used
machine learning methods, are unsatisfying. We discuss and evaluate various
PAC-Bayesian approaches to derive such bounds. The bounds do not require
additional hold-out data, because the out-of-bag samples from the bagging in
the training process can be exploited. A random forest predicts by taking a
majority vote of an ensemble of decision trees. The first approach is to bound
the error of the vote by twice the error of the corresponding Gibbs classifier
(classifying with a single member of the ensemble selected at random). However,
this approach does not take into account the effect of averaging out of errors
of individual classifiers when taking the majority vote. This effect provides a
significant boost in performance when the errors are independent or negatively
correlated, but when the correlations are strong the advantage from taking the
majority vote is small. The second approach based on PAC-Bayesian C-bounds
takes dependencies between ensemble members into account, but it requires
estimating correlations between the errors of the individual classifiers. When
the correlations are high or the estimation is poor, the bounds degrade. In our
experiments, we compute generalization bounds for random forests on various
benchmark data sets. Because the individual decision trees already perform
well, their predictions are highly correlated and the C-bounds do not lead to
satisfactory results. For the same reason, the bounds based on the analysis of
Gibbs classifiers are typically superior and often reasonably tight. Bounds
based on a validation set, which come at the cost of a smaller training set,
gave better performance guarantees but worse performance in most experiments.
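The factor-two relation between the majority vote and the Gibbs classifier described in this abstract can be illustrated empirically. The following is a minimal sketch, not the paper's method: it assumes a hypothetical array `preds` of per-tree predictions on a labelled sample, weights all trees uniformly, and ignores both the out-of-bag bookkeeping and the PAC-Bayesian confidence term that a real bound would need. The function name `gibbs_and_vote_error` is illustrative only.

```python
import numpy as np

def gibbs_and_vote_error(preds, y):
    """Empirical Gibbs error and a pessimistic majority-vote error.

    preds: array of shape (n_trees, n_samples) with predicted labels
    y:     array of shape (n_samples,) with true labels
    Assumes uniform weights over trees (an assumption of this sketch).
    """
    preds = np.asarray(preds)
    y = np.asarray(y)
    errors = preds != y                    # (n_trees, n_samples) boolean
    gibbs = errors.mean()                  # error of a randomly drawn tree
    # The vote can only err where at least half of the trees err;
    # counting ties as errors keeps this estimate pessimistic.
    vote = (errors.mean(axis=0) >= 0.5).mean()
    return gibbs, vote

# Toy data: three trees, four samples (values are made up for illustration).
preds = [[1, 0, 1, 1],
         [1, 1, 0, 1],
         [0, 0, 1, 1]]
y = [1, 0, 1, 1]
gibbs, vote = gibbs_and_vote_error(preds, y)
print(gibbs, vote, vote <= 2 * gibbs)
```

On any sample the inequality `vote <= 2 * gibbs` holds by Markov's inequality, since the vote errs only where the fraction of erring trees is at least one half. When tree errors are strongly correlated, the vote error stays close to the Gibbs error, which is the situation the abstract reports for random forests.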
Generalized Off-Policy Actor-Critic
We propose a new objective, the counterfactual objective, unifying existing
objectives for off-policy policy gradient algorithms in the continuing
reinforcement learning (RL) setting. Compared to the commonly used excursion
objective, which can be misleading about the performance of the target policy
when deployed, our new objective better predicts such performance. We prove the
Generalized Off-Policy Policy Gradient Theorem to compute the policy gradient
of the counterfactual objective and use an emphatic approach to get an unbiased
sample from this policy gradient, yielding the Generalized Off-Policy
Actor-Critic (Geoff-PAC) algorithm. We demonstrate the merits of Geoff-PAC over
existing algorithms in Mujoco robot simulation tasks, the first empirical
success of emphatic algorithms in prevailing deep RL benchmarks.
Comment: NeurIPS 201
Exploring the Antecedents of Potential Absorptive Capacity and its Impact on Innovation Performance.
This paper builds upon the theoretical framework developed by Zahra and George [Absorptive capacity: a review, reconceptualization, and extension. Academy of Management Review 2002;27:185–203] to empirically explore the antecedents of potential absorptive capacity (PAC), i.e. the ability to identify and assimilate external knowledge flows. Based on a sample of 2464 innovative Spanish firms, we find evidence that R&D cooperation, external knowledge acquisition and experience with knowledge search are key antecedents of a firm’s PAC. Also, during periods of important internal reshaping, when there are significant changes in strategy, design of the organization and marketing, firms exert more effort to accumulate PAC. Finally, we find that PAC is a source of competitive advantage in innovation, especially in the presence of efficient internal knowledge flows that help reduce the distance between potential and realized capacity.
Keywords: Knowledge management; Innovation; Absorptive capacity
Achieving non-discrimination in prediction
Discrimination-aware classification is receiving increasing attention in
data science fields. The pre-process methods for constructing a
discrimination-free classifier first remove discrimination from the training
data, and then learn the classifier from the cleaned data. However, they lack a
theoretical guarantee for the potential discrimination when the classifier is
deployed for prediction. In this paper, we fill this gap by mathematically
bounding the probability of the discrimination in prediction being within a
given interval in terms of the training data and classifier. We adopt the
causal model for modeling the data generation mechanism, and formally define
discrimination in population, in a dataset, and in prediction. We obtain two
important theoretical results: (1) the discrimination in prediction can still
exist even if the discrimination in the training data is completely removed;
and (2) not all pre-process methods can ensure non-discrimination in prediction
even though they can achieve non-discrimination in the modified training data.
Based on the results, we develop a two-phase framework for constructing a
discrimination-free classifier with a theoretical guarantee. The experiments
demonstrate the theoretical results and show the effectiveness of our two-phase
framework.
A Survey of Monte Carlo Tree Search Methods
Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and nongame domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.