
    Finding Statistically Significant Interactions between Continuous Features

    The search for higher-order feature interactions that are statistically significantly associated with a class variable is of high relevance in fields such as Genetics or Healthcare, but the combinatorial explosion of the candidate space makes this problem extremely challenging in terms of computational efficiency and proper correction for multiple testing. While recent progress has been made regarding this challenge for binary features, we here present the first solution for continuous features. We propose an algorithm which overcomes the combinatorial explosion of the search space of higher-order interactions by deriving a lower bound on the p-value for each interaction, which enables us to massively prune interactions that can never reach significance and to thereby gain more statistical power. In our experiments, our approach efficiently detects all significant interactions in a variety of synthetic and real-world datasets. Comment: 13 pages, 5 figures, 2 tables, accepted to the 28th International Joint Conference on Artificial Intelligence (IJCAI 2019)

    A consistent nonparametric bootstrap test of exogeneity

    This paper proposes a novel way of testing exogeneity of an explanatory variable without any parametric assumptions in the presence of a "conditional" instrumental variable. A testable implication is derived that if an explanatory variable is endogenous, the conditional distribution of the outcome given the endogenous variable is not independent of its instrumental variable(s). The test rejects the null hypothesis with probability one if the explanatory variable is endogenous and it detects alternatives converging to the null at a rate n^{-1/2}. We propose a consistent nonparametric bootstrap test to implement this testable implication. We show that the proposed bootstrap test can be asymptotically justified in the sense that it produces asymptotically correct size under the null of exogeneity, and it has unit power asymptotically. Our nonparametric test can be applied to the cases in which the outcome is generated by an additively non-separable structural relation or in which the outcome is discrete, which has not been studied in the literature.

    Class Size and Sorting in Market Equilibrium: Theory and Evidence

    This paper examines how schools choose class size and how households sort in response to those choices. Focusing on the highly liberalized Chilean education market, we develop a model in which schools are heterogeneous in an underlying productivity parameter, class size is a component of school quality, households are heterogeneous in income and hence willingness to pay for school quality, and schools are subject to a class-size cap. The model offers an explanation for two distinct empirical patterns observed among private schools that accept government vouchers: (i) There is an inverted-U relationship between class size and household income in equilibrium, which will tend to bias cross-sectional estimates of the effect of class size on student performance. (ii) Some schools at the class size cap adjust prices (or enrollments) to avoid adding another classroom, which produces stacking at enrollments that are multiples of the class size cap. This generates discontinuities in the relationship between enrollment and household characteristics at those points, violating the assumptions underlying regression-discontinuity (RD) research designs. This result suggests that caution is warranted in applying the RD approach in settings in which parents have substantial school choice and schools are free to set prices and influence their enrollments.
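The stacking described in (ii) follows from simple cap arithmetic, which can be sketched as below. This is a minimal illustration and not the paper's model: the cap value of 45 and the function names `classes_needed` and `average_class_size` are assumptions chosen for the example. It shows that enrolling one student beyond a multiple of the cap forces an extra classroom and sharply lowers average class size, which is the incentive for schools to hold enrollment at those multiples.

```python
import math


def classes_needed(enrollment: int, cap: int) -> int:
    """Fewest classrooms that keep every class at or below the cap."""
    if enrollment < 1 or cap < 1:
        raise ValueError("enrollment and cap must be positive integers")
    return math.ceil(enrollment / cap)


def average_class_size(enrollment: int, cap: int) -> float:
    """Average class size when students are spread over the fewest classrooms."""
    return enrollment / classes_needed(enrollment, cap)


CAP = 45  # illustrative cap, an assumption for this sketch

# Average class size drops abruptly just past each multiple of the cap.
for enrollment in (44, 45, 46, 90, 91):
    n_classes = classes_needed(enrollment, CAP)
    avg = average_class_size(enrollment, CAP)
    print(f"enrollment={enrollment} classes={n_classes} average={avg:.1f}")
```

A school sitting exactly at 45 or 90 students has the largest permitted classes and the lowest per-student classroom cost, so households just on either side of those thresholds need not be comparable.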

    Class size effects: evidence using a new estimation technique

    This paper estimates the marginal effect of class size on the educational attainment of high school students. We control for the potential endogeneity of class size in two ways: a conventional instrumental variable approach, based on changes in cohort size, and an alternative method in which identification is based on restrictions on higher moments. The data are drawn from the Program for International Student Assessment (PISA), collected in 2003 for the United States and the United Kingdom. Using either method, or the two in conjunction, leads to the conclusion that increases in class size lead to improvements in students' mathematics scores. Only the results for the United Kingdom are statistically significant. Keywords: class sizes, educational production

    Centralised order books versus hybrid order books: a paired comparison of trading costs on NSC (Euronext Paris) and SETS (London Stock Exchange).

    This article compares the cost of trading large capitalisation equities on the hybrid order-driven segment of the London Stock Exchange and the centralised electronic order book of Euronext. Using samples of stocks matched according to economic sector, free float capitalisation, and trading volume, our study shows that transaction costs are lower on the centralised order book than on the hybrid order book. The presence of dealers outside the electronic order book favours the frequency of large trades, but is associated with higher execution costs for all other trades and higher adverse selection and inventory costs inside the order book. Keywords: Centralised markets; Fragmentation; Hybrid market; Order Books; Spread components; Transaction cost

    Testing Identifying Assumptions in Fuzzy Regression Discontinuity Designs
