A measure of association (correlation) in nominal data (contingency tables), using determinants
Nominal data currently lack a correlation coefficient, such as has already been defined for real data. A measure is possible using the determinant, with the useful interpretation that the determinant gives the ratio between volumes. With M an m × n contingency table and n ≤ m, the suggested measure is r = Sqrt[det[A'A]] with A = Normalized[M]. With M an n1 × n2 × ... × nk contingency matrix, we can construct a matrix of pairwise correlations R so that the overall correlation is f[R]. An option is to use f[R] = Sqrt[1 - det[R]]. However, for both nominal and cardinal data the advisable choice for such a function f is to take the maximal multiple correlation within R.
Keywords: association; correlation; contingency table; volume ratio; determinant; nonparametric methods; nominal data; nominal scale; categorical data; Fisher’s exact test; odds ratio; tetrachoric correlation coefficient; phi; Cramer’s V; Pearson; contingency coefficient; uncertainty coefficient; Theil’s U; eta; meta-analysis; Simpson’s paradox; causality; statistical independence
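The two determinant formulas in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not say what Normalized[M] does, so the sketch assumes each column of M is scaled to unit Euclidean length, and the function names `determinant_association` and `overall_from_pairwise` are hypothetical, not taken from the paper.

```python
import numpy as np


def determinant_association(table):
    """Return sqrt(det(A'A)) for a two-way contingency table.

    ASSUMPTION: A is the table with every column scaled to unit
    Euclidean length. The abstract only writes A = Normalized[M],
    so the paper's normalization may differ.
    """
    M = np.asarray(table, dtype=float)
    if M.shape[1] > M.shape[0]:
        M = M.T  # the abstract requires n <= m (columns <= rows)
    norms = np.linalg.norm(M, axis=0)
    if np.any(norms == 0):
        raise ValueError("every column needs at least one nonzero count")
    A = M / norms
    gram = A.T @ A  # unit diagonal, so 0 <= det(gram) <= 1
    # Clip tiny negative values caused by floating-point rounding.
    return float(np.sqrt(max(np.linalg.det(gram), 0.0)))


def overall_from_pairwise(R):
    """Return f[R] = sqrt(1 - det(R)) for a matrix of pairwise correlations."""
    R = np.asarray(R, dtype=float)
    return float(np.sqrt(max(1.0 - np.linalg.det(R), 0.0)))


if __name__ == "__main__":
    print(determinant_association([[10, 0], [0, 20]]))  # diagonal table
    print(determinant_association([[2, 4], [3, 6]]))    # proportional columns
    print(overall_from_pairwise(np.eye(3)))             # no pairwise correlation
```

Under the assumed column scaling, a diagonal table gives a value of 1 and a table with proportional columns (statistical independence) gives a value near 0, which matches the volume-ratio reading of the determinant.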
Markov bases and subbases for bounded contingency tables
In this paper we study the computation of Markov bases for contingency tables
whose cell entries have an upper bound. In general a Markov basis for unbounded
contingency tables under a certain model differs from a Markov basis for bounded
tables. Rapallo (2007) applied Lawrence lifting to compute a Markov basis for
contingency tables whose cell entries are bounded. However, in the process, one
has to compute the universal Gröbner basis of the ideal associated with the
design matrix for a model which is, in general, larger than any reduced
Gröbner basis. Thus, this is also infeasible in small- and medium-sized
problems. In this paper we focus on bounded two-way contingency tables under
the independence model and show that if the bounds on the cells are positive,
i.e., they are not structural zeros, the set of basic moves of all 2 × 2
minors connects all tables with given margins. We end this paper with an open
problem: given that the margins are positive, find a necessary and sufficient
condition on the set of structural zeros under which the set of basic moves of
all 2 × 2 minors connects all incomplete contingency tables with given margins.
Comment: 22 pages. It will appear in the Annals of the Institute of
Statistical Mathematics
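To make the notion of a basic move concrete, here is a minimal sketch, not the paper's algorithm. It applies one move supported on a 2 × 2 minor (add 1 on one diagonal of the minor, subtract 1 on the other) and rejects the result if any cell leaves the range from 0 to its upper bound. The function name `apply_basic_move` and its argument layout are hypothetical.

```python
import numpy as np


def apply_basic_move(table, bounds, rows, cols, sign=1):
    """Apply one basic move on the 2 x 2 minor given by `rows` and `cols`.

    Returns the new table, or None when the move would push a cell
    below 0 or above its upper bound. Row and column sums never change.
    """
    T = np.array(table, dtype=int)
    U = np.asarray(bounds, dtype=int)
    (i, k), (j, l) = rows, cols
    if i == k or j == l:
        raise ValueError("a minor needs two distinct rows and two distinct columns")
    if sign not in (1, -1):
        raise ValueError("sign must be +1 or -1")
    move = np.zeros_like(T)
    move[i, j] = move[k, l] = sign    # +sign on one diagonal of the minor
    move[i, l] = move[k, j] = -sign   # -sign on the other diagonal
    candidate = T + move
    if np.any(candidate < 0) or np.any(candidate > U):
        return None  # move leaves the set of bounded tables
    return candidate


if __name__ == "__main__":
    start = [[1, 2], [3, 0]]
    upper = [[3, 3], [3, 3]]
    print(apply_basic_move(start, upper, rows=(0, 1), cols=(0, 1), sign=1))
    print(apply_basic_move(start, upper, rows=(0, 1), cols=(0, 1), sign=-1))
```

A random walk that repeatedly proposes such moves and keeps only the accepted ones stays within the bounded tables sharing the starting margins. Whether it can reach every such table is the connectivity question the abstract addresses.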
Scalable Bayesian nonparametric measures for exploring pairwise dependence via Dirichlet Process Mixtures
In this article we propose novel Bayesian nonparametric methods using
Dirichlet Process Mixture (DPM) models for detecting pairwise dependence
between random variables while accounting for uncertainty in the form of the
underlying distributions. A key criterion is that the procedures should scale to
large data sets. In this regard we find that the formal calculation of the
Bayes factor for a dependent-vs.-independent DPM joint probability measure is
not feasible computationally. To address this we present Bayesian diagnostic
measures for characterising evidence against a "null model" of pairwise
independence. In simulation studies, as well as for a real data analysis, we
show that our approach provides a useful tool for the exploratory nonparametric
Bayesian analysis of large multivariate data sets.
Geometry of diagonal-effect models for contingency tables
In this work we study several types of diagonal-effect models for two-way
contingency tables in the framework of Algebraic Statistics. We use both toric
models and mixture models to encode the different behavior of the diagonal
cells. We compute the invariants of these models and we explore their
geometrical structure.
Comment: 20 pages