4,530 research outputs found
Maximally selected chi-square statistics and umbrella orderings
Binary outcomes that depend on an ordinal predictor in a non-monotonic way are common in medical data analysis. Such patterns can be addressed in terms of cutpoints: for example, one looks for two cutpoints that define an interval in the range of the ordinal predictor for which the probability of a positive outcome is particularly high (or low). A chi-square test may then be performed to compare the proportions of positive outcomes inside and outside this interval. However, if the two cutpoints are chosen to maximize the chi-square statistic, referring the resulting statistic to the standard chi-square distribution is inappropriate. The p-value must instead be corrected for multiple comparisons by using the distribution of the maximally selected chi-square statistic rather than the nominal chi-square distribution. Here, we derive the exact distribution of the chi-square statistic obtained with the optimal two cutpoints. We suggest a combinatorial computation method and illustrate our approach with a simulation study and an application to varicella data.
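The selection step described in this abstract can be sketched as a brute-force scan over all pairs of cutpoints. This is only an illustration of how the maximally selected statistic arises, not the exact-distribution method the paper derives; the function names and the per-level count layout (`positives`, `totals`) are assumptions made for the example.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table: an empty row or column
    return n * (a * d - b * c) ** 2 / denom


def max_selected_chi_square(positives, totals):
    """Scan every interval lo..hi (inclusive) of ordinal levels.

    positives[i] and totals[i] are the positive count and sample size at
    level i (hypothetical input layout). Returns (best statistic, (lo, hi)).
    """
    k = len(totals)
    pos_all, n_all = sum(positives), sum(totals)
    best_stat, best_interval = 0.0, None
    for lo in range(k):
        pos_in = n_in = 0
        for hi in range(lo, k):
            pos_in += positives[hi]
            n_in += totals[hi]
            if n_in == n_all:
                continue  # interval covers all observations: nothing to compare
            pos_out = pos_all - pos_in
            neg_out = (n_all - n_in) - pos_out
            stat = chi_square_2x2(pos_in, n_in - pos_in, pos_out, neg_out)
            if stat > best_stat:
                best_stat, best_interval = stat, (lo, hi)
    return best_stat, best_interval


stat, interval = max_selected_chi_square([1, 1, 8, 9, 1], [10, 10, 10, 10, 10])
```

Because `stat` is the maximum over many candidate intervals, comparing it with the nominal chi-square distribution with one degree of freedom would overstate significance, which is the problem the abstract addresses.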
Computing in unipotent and reductive algebraic groups
The unipotent groups are an important class of algebraic groups. We show that
techniques used to compute with finitely generated nilpotent groups carry over
to unipotent groups. We concentrate particularly on the maximal unipotent
subgroup of a split reductive group and show how this improves computation in
the reductive group itself. Comment: 22 pages
Learning Topic Models and Latent Bayesian Networks Under Expansion Constraints
Unsupervised estimation of latent variable models is a fundamental problem
central to numerous applications of machine learning and statistics. This work
presents a principled approach for estimating broad classes of such models,
including probabilistic topic models and latent linear Bayesian networks, using
only second-order observed moments. The sufficient conditions for
identifiability of these models are primarily based on weak expansion
constraints on the topic-word matrix, for topic models, and on the directed
acyclic graph, for Bayesian networks. Because no assumptions are made on the
distribution among the latent variables, the approach can handle arbitrary
correlations among the topics or latent factors. In addition, a tractable
learning method via optimization is proposed and studied in numerical
experiments. Comment: 38 pages, 6 figures, 2 tables, applications in topic models and
Bayesian networks are studied. Simulation section is added
Experimental evaluation of preprocessing algorithms for constraint satisfaction problems
This paper presents an experimental evaluation of two orthogonal schemes for preprocessing constraint satisfaction problems (CSPs). The first of these schemes involves a class of local consistency techniques that includes directional arc consistency, directional path consistency, and adaptive consistency. The other scheme concerns the prearrangement of variables in a linear order to facilitate an efficient search. In the first series of experiments, we evaluated the effect of each of the local consistency techniques on backtracking and its common enhancement, backjumping. Surprisingly, although adaptive consistency has the best worst-case complexity bounds, we have found that it exhibits the worst performance unless the constraint graph is very sparse. Directional arc consistency (followed by either backjumping or backtracking) and backjumping (without any preprocessing) outperformed all other techniques; moreover, the former dominated the latter in computationally intensive situations. The second series of experiments suggests that maximum cardinality and minimum width are the best pre-ordering (i.e., static ordering) strategies, while dynamic search rearrangement is superior to all the pre-orderings studied.
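As a rough illustration of one of the combinations named in this abstract, the sketch below applies directional arc consistency along a fixed variable ordering and then runs plain chronological backtracking. It is a minimal sketch, not the paper's experimental code: the data layout (binary constraints stored as predicates keyed by `(earlier, later)` variable pairs) and all function names are assumptions.

```python
def revise(domains, constraints, xi, xj):
    """Keep only the values of xi that have a supporting value in xj's domain."""
    allowed = constraints[(xi, xj)]
    domains[xi] = {a for a in domains[xi]
                   if any(allowed(a, b) for b in domains[xj])}


def directional_arc_consistency(order, domains, constraints):
    """Process variables from last to first, pruning each earlier neighbour."""
    domains = {v: set(d) for v, d in domains.items()}  # work on a copy
    for j in range(len(order) - 1, 0, -1):
        xj = order[j]
        for xi in order[:j]:
            if (xi, xj) in constraints:
                revise(domains, constraints, xi, xj)
    return domains


def backtrack(order, domains, constraints, assignment=None):
    """Chronological backtracking that assigns variables in the given order."""
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(order):
        return dict(assignment)
    var = order[len(assignment)]
    for value in sorted(domains[var]):
        consistent = all(constraints[(prev, var)](assignment[prev], value)
                         for prev in assignment if (prev, var) in constraints)
        if consistent:
            assignment[var] = value
            result = backtrack(order, domains, constraints, assignment)
            if result is not None:
                return result
            del assignment[var]
    return None


# Toy problem (hypothetical): x < y < z over the domain {1, 2, 3}.
order = ["x", "y", "z"]
domains = {v: {1, 2, 3} for v in order}
constraints = {("x", "y"): lambda a, b: a < b,
               ("y", "z"): lambda a, b: a < b}
pruned = directional_arc_consistency(order, domains, constraints)
solution = backtrack(order, pruned, constraints)
```

On this toy problem the preprocessing pass shrinks the domains of the earlier variables before search starts, so backtracking has fewer values to try for them.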
Exact Algorithms for Maximum Clique: a computational study
We investigate a number of recently reported exact algorithms for the maximum
clique problem (MCQ, MCR, MCS, BBMC). The program code used is presented and
critiqued showing how small changes in implementation can have a drastic effect
on performance. The computational study demonstrates how problem features and
hardware platforms influence algorithm behaviour. The minimum width order
(smallest-last) is investigated. MCS is broken into its constituent parts,
and we discover that one of these parts degrades performance. It is shown that
the standard procedure used for rescaling published results is unsafe. Comment: 40 pages, 14 figures, 10 tables, 12 short java program listings, code
available to download at
http://www.dcs.gla.ac.uk/~pat/maxClique/distribution
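For orientation, a basic branch-and-bound search for a maximum clique can be sketched as follows. This is deliberately not MCQ, MCR, MCS, or BBMC: those algorithms use colouring-based bounds, whereas this sketch uses only the trivial size bound. The adjacency-dictionary input (with no self-loops) and the function name are assumptions for the example.

```python
def max_clique(adj):
    """Return one maximum clique of an undirected graph.

    adj maps each vertex to the set of its neighbours (assumed: no self-loops).
    """
    best = []

    def expand(clique, candidates):
        nonlocal best
        if len(clique) > len(best):
            best = list(clique)
        for v in list(candidates):
            # Trivial bound: even taking every remaining candidate cannot win.
            if len(clique) + len(candidates) <= len(best):
                return
            clique.append(v)
            expand(clique, [u for u in candidates if u in adj[v]])
            clique.pop()
            candidates.remove(v)  # v has been fully explored at this level

    # Static initial ordering by non-increasing degree (a simple heuristic).
    order = sorted(adj, key=lambda v: len(adj[v]), reverse=True)
    expand([], order)
    return best


graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
clique = max_clique(graph)
```

Small implementation choices, such as how the candidate list is stored and ordered, are exactly the kind of detail the abstract reports as having a large effect on performance.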