1,184 research outputs found

Multiple Correspondence Analysis & the Multilogit Bilinear Model

Authors: Fithian William, Josse Julie
Publication date: 01/01/2016

Multiple Correspondence Analysis (MCA) is a dimension reduction method which plays a large role in the analysis of tables with categorical nominal variables, such as survey data. Though it is usually motivated and derived using geometric considerations, we prove that it in fact amounts to a single proximal Newton step of a natural bilinear exponential family model for categorical data: the multinomial logit bilinear model. We compare and contrast the behavior of MCA with that of the model on simulations and discuss new insights on the properties of both exploratory multivariate methods and their cognate models. One main conclusion is that we can recommend approximating the multilogit model parameters using MCA. Indeed, estimating the parameters of the model is not a trivial task, whereas MCA has the great advantage of being easily solved by singular value decomposition and of being scalable to large data.
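The remark that MCA is "easily solved by singular value decomposition" can be illustrated with a minimal Python sketch. It assumes the textbook formulation of MCA as correspondence analysis of the one-hot indicator matrix; it is not taken from the paper and does not fit the multilogit bilinear model. The function names `one_hot` and `mca` are hypothetical.

```python
import numpy as np

def one_hot(codes):
    """Stack one-hot blocks, one block per integer-coded categorical column."""
    blocks = []
    for j in range(codes.shape[1]):
        levels = np.unique(codes[:, j])  # only levels that actually occur
        blocks.append((codes[:, j][:, None] == levels[None, :]).astype(float))
    return np.hstack(blocks)

def mca(codes, n_components=2):
    """Textbook MCA: correspondence analysis of the indicator matrix via SVD."""
    Z = one_hot(codes)
    P = Z / Z.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    # Standardized residuals D_r^{-1/2} (P - r c^T) D_c^{-1/2}
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    k = n_components
    row_coords = (U[:, :k] * s[:k]) / np.sqrt(r)[:, None]    # principal coordinates
    col_coords = (Vt[:k].T * s[:k]) / np.sqrt(c)[:, None]
    return row_coords, col_coords, s[:k] ** 2                # principal inertias
```

The only heavy step is one SVD of a matrix with as many columns as there are categories, which is what makes the method cheap compared with iterative likelihood fitting.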

arXiv.org e-Print Archive

Portail HAL UNIV-RENNES

Congruence obstructions to pseudomodularity of Fricke groups

Author: Fithian David
Publication date: 31/07/2007

A pseudomodular group is a finite coarea nonarithmetic Fuchsian group whose cusp set is exactly $\mathbb{P}^1(\mathbb{Q})$. Long and Reid constructed finitely many of these by considering Fricke groups, i.e., those that uniformize one-cusped tori. We prove that a zonal Fricke group with rational cusps is pseudomodular if and only if its cusp set is dense in the finite adeles of $\mathbb{Q}$. We then deduce that infinitely many such Fricke groups are not pseudomodular. Comment: 4 pages

arXiv.org e-Print Archive

Comptes Rendus Mathématique

Numérisation de Documents Anciens Mathématiques

DELEGATION FROM THE EUROPEAN PARLIAMENT for the relations with the UNITED STATES CONGRESS. London 11-13 July 1977. WORKING DOCUMENT on Nuclear Proliferation and Power: Choices for the Future. 9 June 1977

Author: Fithian Floyd
Publication date: 01/01/1977

Archive of European Integration

Local case-control sampling: Efficient subsampling in imbalanced data sets

Authors: Fithian William, Hastie Trevor
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2014

For classification problems with significant class imbalance, subsampling can reduce computational costs at the price of inflated variance in estimating model parameters. We propose a method for subsampling efficiently for logistic regression by adjusting the class balance locally in feature space via an accept-reject scheme. Our method generalizes standard case-control sampling, using a pilot estimate to preferentially select examples whose responses are conditionally rare given their features. The biased subsampling is corrected by a post-hoc analytic adjustment to the parameters. The method is simple and requires one parallelizable scan over the full data set. Standard case-control sampling is inconsistent under model misspecification for the population risk-minimizing coefficients $\theta^*$. By contrast, our estimator is consistent for $\theta^*$ provided that the pilot estimate is. Moreover, under correct specification and with a consistent, independent pilot estimate, our estimator has exactly twice the asymptotic variance of the full-sample MLE, even if the selected subsample comprises a minuscule fraction of the full data set, as happens when the original data are severely imbalanced. The factor of two improves to $1+\frac{1}{c}$ if we multiply the baseline acceptance probabilities by $c>1$ (and weight points with acceptance probability greater than 1), taking roughly $\frac{1+c}{2}$ times as many data points into the subsample. Experiments on simulated and real data show that our method can substantially outperform standard case-control subsampling. Comment: Published at http://dx.doi.org/10.1214/14-AOS1220 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
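The accept-reject scheme and the post-hoc correction described in the abstract can be sketched in Python as follows. The abstract gives no formulas, so two details here are assumptions to be checked against the published paper: each point is accepted with probability |y - p̃(x)|, where p̃ is the pilot model's fitted probability, and the correction adds the pilot coefficients back to the subsample fit. The function names are hypothetical, and the sketch covers only the baseline case (no multiplier c).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, n_iter=25, ridge=1e-8):
    """Logistic-regression MLE by Newton's method (X must already hold an intercept column)."""
    theta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ theta)
        grad = X.T @ (y - p)
        # Tiny ridge term only guards against a numerically singular Hessian.
        hess = (X * (p * (1.0 - p))[:, None]).T @ X + ridge * np.eye(X.shape[1])
        theta = theta + np.linalg.solve(hess, grad)
    return theta

def local_case_control(X, y, theta_pilot, rng):
    """One pass over the data: accept-reject subsample, fit, then correct."""
    p_pilot = sigmoid(X @ theta_pilot)
    # Assumed acceptance rule: keep points whose label is "surprising" to the pilot.
    accept_prob = np.abs(y - p_pilot)
    keep = rng.random(len(y)) < accept_prob
    theta_sub = fit_logistic(X[keep], y[keep])
    # Assumed correction: the subsample log-odds are shifted by the pilot's,
    # so adding the pilot coefficients back undoes the sampling bias.
    return theta_sub + theta_pilot, keep
```

In use, `theta_pilot` would come from a separate, smaller fit (for instance on a uniform subsample), in keeping with the abstract's requirement of an independent pilot estimate.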

arXiv.org e-Print Archive

CiteSeerX