Evaluation Evaluation: a Monte Carlo study
Over the last decade there has been increasing concern
about the biases embodied in traditional evaluation methods for
Natural Language Processing/Learning, particularly methods
borrowed from Information Retrieval. Without knowledge of the
Bias and Prevalence of the contingency being tested, or equivalently
the expectation due to chance, the simple conditional probabilities
Recall, Precision and Accuracy are not meaningful as evaluation
measures, either individually or in combinations such as F-factor.
The existence of bias in NLP measures leads to the ‘improvement’
of systems by increasing their bias, such as the practice of improving
tagging and parsing scores by using the most common value (e.g. water
is always a Noun) rather than attempting to discover the correct
one. The measures Cohen Kappa and Powers Informedness are
discussed as unbiased alternatives to Recall and related to the
psychologically significant measure DeltaP.
In this paper we will analyze both biased and unbiased measures
theoretically, characterizing the precise relationship between all
these measures as well as evaluating the evaluation measures
themselves empirically using a Monte Carlo simulation.
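The bias effect this abstract describes can be sketched with a small Monte Carlo simulation (a hypothetical toy illustration, not the paper's own experiment; the class skew and error rates are made-up parameters): a degenerate tagger that always predicts the majority class scores high Accuracy, while Informedness correctly scores it at chance level.

```python
import random

def evaluate(gold, pred):
    # 2x2 contingency counts for the positive class
    tp = sum(1 for g, p in zip(gold, pred) if g and p)
    fp = sum(1 for g, p in zip(gold, pred) if not g and p)
    fn = sum(1 for g, p in zip(gold, pred) if g and not p)
    tn = sum(1 for g, p in zip(gold, pred) if not g and not p)
    accuracy = (tp + tn) / len(gold)
    recall = tp / (tp + fn) if tp + fn else 0.0      # true positive rate
    inv_recall = tn / (tn + fp) if tn + fp else 0.0  # true negative rate
    informedness = recall + inv_recall - 1           # Powers' Informedness
    return accuracy, informedness

random.seed(0)
# Skewed gold standard: ~90% of instances belong to the majority class
gold = [random.random() < 0.9 for _ in range(10_000)]

# Maximally biased system: always guess the majority class
majority = [True] * len(gold)
# Imperfect but genuinely informed system: copies the gold label 60% of
# the time and otherwise predicts the minority class
informed = [g if random.random() < 0.6 else False for g in gold]

acc_m, inf_m = evaluate(gold, majority)
acc_i, inf_i = evaluate(gold, informed)
print(f"majority guesser: accuracy={acc_m:.3f}  informedness={inf_m:.3f}")
print(f"informed system : accuracy={acc_i:.3f}  informedness={inf_i:.3f}")
```

Under Accuracy the biased guesser appears far better (roughly 0.90 versus 0.64), yet its Informedness is exactly 0: the abstract's point that the simple conditional probabilities reward bias rather than information.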
Evaluation: from Precision, Recall and F-measure to ROC, Informedness, Markedness and Correlation
Commonly used evaluation measures including Recall, Precision, F-Measure and Rand Accuracy are
biased and should not be used without clear understanding of the biases, and corresponding identification of chance
or base case levels of the statistic. Using these measures, a system that performs worse in the objective sense of
Informedness can appear to perform better under any of these commonly used measures. We discuss several
concepts and measures that reflect the probability that a prediction is informed versus chance (Informedness), and
introduce Markedness as a dual measure for the probability that a prediction is marked versus chance. Finally, we
demonstrate elegant connections between the concepts of Informedness, Markedness, Correlation and Significance
as well as their intuitive relationships with Recall and Precision, and outline the extension from the dichotomous case
to the general multi-class case.
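The connection between Informedness, Markedness and Correlation mentioned in the abstract can be checked numerically for the dichotomous case (a minimal sketch using the standard 2x2 definitions; the counts below are invented for illustration): Informedness = Recall + Inverse Recall − 1, Markedness = Precision + Inverse Precision − 1, and the Matthews/Pearson correlation is their geometric mean.

```python
from math import sqrt

def dichotomous_measures(tp, fp, fn, tn):
    """Chance-corrected measures for a 2x2 contingency table."""
    recall = tp / (tp + fn)          # true positive rate
    inv_recall = tn / (tn + fp)      # true negative rate
    precision = tp / (tp + fp)
    inv_precision = tn / (tn + fn)
    informedness = recall + inv_recall - 1        # a.k.a. DeltaP'
    markedness = precision + inv_precision - 1    # a.k.a. DeltaP
    # Matthews correlation coefficient (phi coefficient of the table)
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return informedness, markedness, mcc

inf, mark, mcc = dichotomous_measures(tp=40, fp=10, fn=20, tn=30)
print(f"informedness={inf:.4f} markedness={mark:.4f} mcc={mcc:.4f}")
# In the dichotomous case, Correlation^2 = Informedness * Markedness
assert abs(mcc**2 - inf * mark) < 1e-12
```

The identity holds for any 2x2 table, since both Informedness and Markedness share the numerator tp·tn − fp·fn and their denominators multiply to the square of the MCC denominator.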
AdaBook and MultiBook: adaptive boosting with chance correction
There has been considerable interest in boosting and bagging, including the combination of the adaptive
techniques of AdaBoost with the random selection with replacement techniques of Bagging. At the same
time there has been a revisiting of the way we evaluate, with chance-corrected measures like Kappa,
Informedness, Correlation or ROC AUC being advocated. This leads to the question of whether learning
algorithms can do better by optimizing an appropriate chance corrected measure. Indeed, it is possible for a
weak learner to optimize Accuracy to the detriment of the more realistic chance-corrected measures, and
when this happens the booster can give up too early. This phenomenon is known to occur with conventional
Accuracy-based AdaBoost, and the MultiBoost algorithm has been developed to overcome such problems
using restart techniques based on bagging. This paper thus complements the theoretical work showing the
necessity of using chance-corrected measures for evaluation, with empirical work showing how use of a
chance-corrected measure can improve boosting. We show that the early surrender problem occurs in
MultiBoost too, in multiclass situations, so that chance-corrected AdaBook and MultiBook can beat standard
MultiBoost or AdaBoost, and we further identify which chance-corrected measures to use when.
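The early-surrender failure mode described above can be sketched in a few lines of Python (a hypothetical toy illustration of the mechanism, not the AdaBook/MultiBook algorithms themselves; the skew and learner are made up): on imbalanced data, an accuracy-optimizing weak learner that degenerates to the majority class has high Accuracy but zero Informedness, and standard AdaBoost re-weighting drives its weighted error to 0.5 after a single round, at which point boosting stops.

```python
import math

# Skewed two-class training sample: 90 majority (+1), 10 minority (-1)
labels = [1] * 90 + [-1] * 10
weights = [1 / len(labels)] * len(labels)

def majority_stump(weights, labels):
    """Accuracy-optimizing 'weak' learner with no discriminative power:
    it predicts whichever class carries more of the current weight."""
    pos = sum(w for w, y in zip(weights, labels) if y == 1)
    return 1 if pos >= sum(weights) / 2 else -1

for t in range(5):
    pred = majority_stump(weights, labels)
    eps = sum(w for w, y in zip(weights, labels) if y != pred)  # weighted error
    if eps >= 0.5 - 1e-9:   # no better than chance: AdaBoost gives up
        print(f"round {t}: weighted error {eps:.3f} -> booster surrenders")
        break
    alpha = 0.5 * math.log((1 - eps) / eps)
    print(f"round {t}: predicts {pred:+d}, error {eps:.3f}, alpha {alpha:.3f}")
    # Standard AdaBoost re-weighting: up-weight mistakes, then renormalize
    weights = [w * math.exp(-alpha * y * pred) for w, y in zip(weights, labels)]
    z = sum(weights)
    weights = [w / z for w in weights]
```

After one round the minority class holds half the weight, the same constant prediction has weighted error 0.5, and boosting halts; a learner optimizing a chance-corrected measure would instead be forced to find a genuinely informed split.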
Market Bubbles and Wasteful Avoidance: Tax and Regulatory Constraints on Short Sales
Although short sales make an important contribution to financial markets, these transactions face legal constraints that do not govern long positions. In evaluating these constraints, other commentators, who are virtually all economists, have not focused rigorously enough on the precise contours of current law. Some short sale constraints are mischaracterized, while others are omitted entirely. Likewise, the existing literature neglects many strategies by which well-advised investors circumvent these constraints; this avoidance may reduce the impact of short sale constraints on market prices, but may contribute to social waste in other ways. To fill these gaps in the literature, this paper offers a careful look at current law and draws three conclusions. First, short sales play a valuable role in the financial markets; while there may be plausible reasons to regulate short sales -- most notably, concerns about market manipulation and panics -- current law is very poorly tailored to these goals. Second, investor self-help can ease some of the harm from this poor tailoring, but at a cost. Third, relatively straightforward reforms can eliminate the need for self-help while accommodating legitimate regulatory goals. In making these points, we focus primarily on a burden that other commentators have neglected: profits from short sales generally are ineligible for the reduced tax rate on long-term capital gains, even if the short sale is in place for more than one year.
Keywords: short sales, momentum traders, value investors
Evolvability and redundancy in shared grammar evolution
Los Alamitos, C
- …