    A survey of cost-sensitive decision tree induction algorithms

    The past decade has seen significant interest in the problem of inducing decision trees that take account of the costs of misclassification and the costs of acquiring the features used for decision making. This survey identifies over 50 algorithms, including approaches that are direct adaptations of accuracy-based methods, use genetic algorithms, use anytime methods, and utilize boosting and bagging. The survey brings together these different studies and novel approaches to cost-sensitive decision tree learning, provides a useful taxonomy and a historical timeline of how the field has developed, and should provide a useful reference point for future research in this field.

    Design and enhanced evaluation of a robust anaphor resolution algorithm

    Syntactic coindexing restrictions are by now known to be of central importance to practical anaphor resolution approaches. Since, in particular due to structural ambiguity, the assumption of the availability of a unique syntactic reading proves to be unrealistic, robust anaphor resolution relies on techniques to overcome this deficiency. This paper describes the ROSANA approach, which generalizes the verification of coindexing restrictions in order to make it applicable to the deficient syntactic descriptions that are provided by a robust state-of-the-art parser. By a formal evaluation on two corpora that differ with respect to text genre and domain, it is shown that ROSANA achieves high-quality robust coreference resolution. Moreover, by an in-depth analysis, it is proven that the robust implementation of syntactic disjoint reference is nearly optimal. The study reveals that, compared with approaches that rely on shallow preprocessing, the largely nonheuristic disjoint reference algorithmization opens up the possibility for a slight improvement. Furthermore, it is shown that more significant gains are to be expected elsewhere, particularly from a text-genre-specific choice of preference strategies. The performance study of the ROSANA system crucially rests on an enhanced evaluation methodology for coreference resolution systems, the development of which constitutes the second major contribution of the paper. As a supplement to the model-theoretic scoring scheme that was developed for the Message Understanding Conference (MUC) evaluations, additional evaluation measures are defined that, on the one hand, support the developer of anaphor resolution systems, and, on the other hand, shed light on application aspects of pronoun interpretation.

    Analyzing the Determinants of the Matching Public School Teachers to Jobs: Estimating Compensating Differentials in Imperfect Labor Markets

    Although there is growing recognition of the contribution of teachers to students' educational outcomes, there are large gaps in our understanding of how teacher labor markets function. Most research on teacher labor markets uses models developed for the private sector. However, markets for public school teachers differ in fundamental ways from those in the private sector. Collective bargaining and public decision-making processes set teacher salaries. Thus it is unlikely that wages adjust quickly to equilibrate the supply and demand for worker and job attributes. The objective of this paper is to develop and estimate a model that more accurately characterizes the institutional features of teacher labor markets. The approach is based on a game-theoretic two-sided matching model, and the estimation strategy employs the method of simulated moments. With this combination, we are able to estimate how factors affect the choices of individual teachers and hiring authorities, as well as how these choices interact to determine the equilibrium allocation of teachers across jobs. Even though this paper focuses on worker-job matching within teacher labor markets, many of the issues raised and the empirical framework employed are relevant in other settings where wages are set administratively or, more generally, do not clear the pertinent markets for job and worker attributes.