    Variable selection using MM algorithms

    Variable selection is fundamental to high-dimensional statistical modeling. Many variable selection techniques may be implemented by maximum penalized likelihood using various penalty functions. Optimizing the penalized likelihood function is often challenging because it may be nondifferentiable and/or nonconcave. This article proposes a new class of algorithms for finding a maximizer of the penalized likelihood for a broad class of penalty functions. These algorithms operate by perturbing the penalty function slightly to render it differentiable, then optimizing this differentiable function using a minorize-maximize (MM) algorithm. MM algorithms are useful extensions of the well-known class of EM algorithms, a fact that allows us to analyze the local and global convergence of the proposed algorithm using some of the techniques employed for EM algorithms. In particular, we prove that when our MM algorithms converge, they must converge to a desirable point; we also discuss conditions under which this convergence may be guaranteed. We exploit the Newton-Raphson-like aspect of these algorithms to propose a sandwich estimator for the standard errors of the estimators. Our method performs well in numerical tests. Comment: Published at http://dx.doi.org/10.1214/009053605000000200 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org
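
    The perturb-then-MM idea described above can be sketched on the simplest possible case. This is an illustration under stated assumptions, not the authors' implementation: it assumes a single observation y with a scalar parameter theta, a squared-error loss in place of a general negative log-likelihood (so the problem is written as a minimization), and an L1 penalty lam*|theta|. The perturbation used here, lam*(|theta| - eps*log(1 + |theta|/eps)), and the quadratic surrogate are one standard choice; the function names and the default eps are hypothetical.

```python
import math


def perturbed_l1_objective(theta, y, lam, eps):
    """Squared-error loss plus the smoothed (perturbed) L1 penalty."""
    a = abs(theta)
    return 0.5 * (y - theta) ** 2 + lam * (a - eps * math.log1p(a / eps))


def mm_update(theta, y, lam, eps):
    """One MM step: minimize the quadratic surrogate built at the current theta.

    The surrogate replaces the penalty by weight * theta**2 / 2 (plus a constant),
    which lies above the perturbed penalty and touches it at the current theta.
    """
    weight = lam / (eps + abs(theta))
    return y / (1.0 + weight)


def mm_fit(y, lam, eps=1e-8, tol=1e-12, max_iter=1000):
    """Iterate MM steps from the unpenalized estimate theta = y."""
    theta = y
    history = [perturbed_l1_objective(theta, y, lam, eps)]
    for _ in range(max_iter):
        new_theta = mm_update(theta, y, lam, eps)
        history.append(perturbed_l1_objective(new_theta, y, lam, eps))
        converged = abs(new_theta - theta) < tol
        theta = new_theta
        if converged:
            break
    return theta, history


if __name__ == "__main__":
    for y in (3.0, 0.5, -3.0):
        theta, history = mm_fit(y, lam=1.0)
        print(y, theta, len(history) - 1)
```

    In this toy setting the iterates approach the soft-thresholding solution as eps shrinks: a large observation is shrunk toward zero by roughly lam, while a small one is driven to within about eps of zero. The recorded objective values never increase, which is the descent property that makes MM convergence arguments resemble those for EM.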

    Inference for mixtures of symmetric distributions

    This article discusses the problem of estimation of parameters in finite mixtures when the mixture components are assumed to be symmetric and to come from the same location family. We refer to these mixtures as semi-parametric because no additional assumptions other than symmetry are made regarding the parametric form of the component distributions. Because the class of symmetric distributions is so broad, identifiability of parameters is a major issue in these mixtures. We develop a notion of identifiability of finite mixture models, which we call k-identifiability, where k denotes the number of components in the mixture. We give sufficient conditions for k-identifiability of location mixtures of symmetric components when k=2 or 3. We propose a novel distance-based method for estimating the (location and mixing) parameters from a k-identifiable model and establish the strong consistency and asymptotic normality of the estimator. In the specific case of L_2-distance, we show that our estimator generalizes the Hodges--Lehmann estimator. We discuss the numerical implementation of these procedures, along with an empirical estimate of the component distribution, in the two-component case. In comparisons with maximum likelihood estimation assuming normal components, our method produces somewhat higher standard error estimates in the case where the components are truly normal, but dramatically outperforms the normal method when the components are heavy-tailed. Comment: Published at http://dx.doi.org/10.1214/009053606000001118 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org
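
    For readers unfamiliar with the Hodges-Lehmann estimator mentioned above, the following sketch shows the classical one-sample version only: the median of all pairwise averages (Walsh averages) of the data. It does not implement the article's distance-based mixture estimator; the function name is hypothetical.

```python
from itertools import combinations_with_replacement
from statistics import median


def hodges_lehmann(sample):
    """One-sample Hodges-Lehmann location estimate.

    Computes the median of the Walsh averages (x_i + x_j) / 2 over all
    pairs with i <= j, including each point paired with itself.
    """
    walsh = [(a + b) / 2.0 for a, b in combinations_with_replacement(sample, 2)]
    return median(walsh)


if __name__ == "__main__":
    data = [1, 2, 3, 4, 100]
    print(hodges_lehmann(data))
```

    For the sample 1, 2, 3, 4, 100 the estimate is 3.0, whereas the sample mean is 22; this robustness to a heavy tail is the property the abstract's comparison with normal-theory maximum likelihood relies on.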

    Synoptic/planetary-scale interactions and blocking over the North Atlantic Ocean

    The central theme of this project has been the diagnosis of blocking anticyclogenesis and the corresponding interactions with synoptic-scale circulations. To that end an extensive investigation of the dynamics and energetics of a major blocking anticyclone and two upstream cyclones, all of which occurred over the North Atlantic Ocean and the United States in January 1979, was undertaken. Data for the study were provided by Goddard Laboratory for Atmospheres (GLA) 4° latitude by 5° longitude FGGE analyses. The methodology has primarily focused on the diagnosis of circulation forcing mechanisms using the modified forms (referred to as the extended forms) of the height tendency and Zwack-Okossi equations developed by our research group. Calculations use routine second-order finite differencing, with boundary layer friction, sensible heating, and latent heat release represented as parameterized quantities. Of particular interest are the latent heat release estimates, which combine convectional parameterized values with estimates derived from satellite IR data. The latter were obtained using an algorithm derived by Dr. Franklin R. Robertson of NASA's Marshall Space Flight Center. Results are contained in project reports, theses and publications identified in previous review summaries and reports, and publications listed at the end of this summary. Significant accomplishments in the past year are presented

    mixtools: An R Package for Analyzing Mixture Models

    The mixtools package for R provides a set of functions for analyzing a variety of finite mixture models. These functions include both traditional methods, such as EM algorithms for univariate and multivariate normal mixtures, and newer methods that reflect some recent research in finite mixture models. In the latter category, mixtools provides algorithms for estimating parameters in a wide range of different mixture-of-regression contexts, in multinomial mixtures such as those arising from discretizing continuous multivariate data, in nonparametric situations where the multivariate component densities are completely unspecified, and in semiparametric situations such as a univariate location mixture of symmetric but otherwise unspecified densities. Many of the algorithms of the mixtools package are EM algorithms or are based on EM-like ideas, so this article includes an overview of EM algorithms for finite mixture models.
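
    As a rough illustration of the EM idea for a univariate normal mixture, here is a minimal two-component sketch. It is written in Python for illustration and does not reproduce the mixtools API or its defaults; the function names, starting values, and stopping rule are assumptions made for this example.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def em_two_normals(data, max_iter=500, tol=1e-8):
    """EM for lam * N(mu[0], sigma[0]^2) + (1 - lam) * N(mu[1], sigma[1]^2)."""
    n = len(data)
    mean = sum(data) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in data) / n)
    # Crude starting values (an assumption): extremes for means, pooled spread.
    lam, mu, sigma = 0.5, [min(data), max(data)], [sd, sd]
    loglik_path = []
    for _ in range(max_iter):
        # E-step: posterior probability that each point came from component 0.
        resp, loglik = [], 0.0
        for x in data:
            p0 = lam * normal_pdf(x, mu[0], sigma[0])
            p1 = (1.0 - lam) * normal_pdf(x, mu[1], sigma[1])
            resp.append(p0 / (p0 + p1))
            loglik += math.log(p0 + p1)
        loglik_path.append(loglik)
        if len(loglik_path) > 1 and loglik - loglik_path[-2] < tol:
            break
        # M-step: weighted proportions, means, and standard deviations.
        n0 = sum(resp)
        n1 = n - n0
        lam = n0 / n
        mu = [sum(r * x for r, x in zip(resp, data)) / n0,
              sum((1.0 - r) * x for r, x in zip(resp, data)) / n1]
        sigma = [math.sqrt(sum(r * (x - mu[0]) ** 2 for r, x in zip(resp, data)) / n0),
                 math.sqrt(sum((1.0 - r) * (x - mu[1]) ** 2 for r, x in zip(resp, data)) / n1)]
    return lam, mu, sigma, loglik_path


if __name__ == "__main__":
    random.seed(0)
    sample = ([random.gauss(0.0, 1.0) for _ in range(200)]
              + [random.gauss(5.0, 1.0) for _ in range(200)])
    print(em_two_normals(sample)[:3])
```

    The log-likelihood recorded at each E-step should never decrease, which is the defining property of EM and a useful check when adapting the sketch to other component families.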

    Factor structure of the Gotland Scale of male depression in two samples of men with prostate cancer: Implications for treating male depression

    Up to a quarter of all prostate cancer (PCa) patients suffer from clinically significant depression, but treatments are inconsistent and short-lived in their efficacy. One possible reason could be that 'male depression' is not adequately diagnosed by the criteria for major depressive disorder (MDD) used in many clinical settings. In response to this limitation, the Gotland Scale of Male Depression (GSMD) was developed to identify the extra symptoms of MDD in men. Although the factor structure of the GSMD has been reported in non-PCa samples, it has not been determined for this group of men. Two samples of PCa patients were recruited, 191 from Australia and 138 from the United Kingdom, and all patients received the GSMD individually, plus a background questionnaire. Two-factor solutions were identified for each of the two samples. The Australian sample was characterized by changes in emotional and somatic function, followed by depressed mood. The U.K. sample exhibited the same two-factor solution but in reverse order of weighting. Targeted treatments for depression in PCa patients may benefit from identification of the loadings that individual patients have on these two GSMD factors, so that specific clinical profiles and treatment needs may be based on this information about their depression.

    EPR spectroscopy of iron- and nickel-doped [ZnAl]-layered double hydroxides: modeling active sites in heterogeneous water oxidation catalysts

    Iron-doped nickel layered double hydroxides (LDHs) are among the most active heterogeneous water oxidation catalysts. Due to inter-spin interactions, however, the high density of magnetic centers results in line-broadening in magnetic resonance spectra. As a result, gaining atomic-level insight into the catalytic mechanism via electron paramagnetic resonance (EPR) is not generally possible. To circumvent spin-spin broadening, iron and nickel atoms were doped into non-magnetic [ZnAl]-LDH materials and the coordination environments of the isolated Fe(III) and Ni(II) sites were characterized. Multifrequency EPR spectroscopy identified two distinct Fe(III) sites (S = 5/2) in [Fe:ZnAl]-LDH. Changes in zero field splitting (ZFS) were induced by dehydration of the material, revealing that one of the Fe(III) sites is solvent-exposed (i.e. at an edge, corner, or defect site). These solvent-exposed sites feature an axial ZFS of 0.21 cm⁻¹ when hydrated. The ZFS increases dramatically in magnitude upon dehydration (to -1.5 cm⁻¹), owing to lower symmetry and a decrease in the coordination number of iron. The other (“inert”) Fe(III) site maintains an axial ZFS of 0.19-0.20 cm⁻¹ under both hydrated and dehydrated conditions. We observed a similar effect in [Ni:ZnAl]-LDH materials; notably, Ni(II) (S = 1) atoms displayed a single, small ZFS (±0.30 cm⁻¹) in hydrated material, whereas two distinct Ni(II) ZFS values (±0.30 and ±1.1 cm⁻¹) were observed in the dehydrated samples. Although the magnetically-dilute materials were not active catalysts, the identification of model sites in which the coordination environments of iron and nickel were particularly labile (e.g. by simple vacuum drying) is an important step towards identifying sites in which the coordination number may drop spontaneously in water, a probable mechanism of water oxidation in functional materials.
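
    To give a sense of what the quoted axial ZFS values imply, the sketch below computes zero-field level spacings from the standard axial spin Hamiltonian H = D[S_z² - S(S+1)/3]. It assumes a purely axial system (rhombic term E = 0), which the abstract does not state; the D values are taken from the abstract and the function names are hypothetical.

```python
from fractions import Fraction


def axial_zfs_levels(two_s, d):
    """Return {|m|: energy} for H = D * (S_z^2 - S(S+1)/3), with S = two_s / 2.

    Energies carry the units of d (here cm^-1). Levels with +m and -m are
    degenerate at zero field, so only |m| is listed.
    """
    s = Fraction(two_s, 2)
    offset = s * (s + 1) / 3
    levels = {}
    m = s
    while m >= 0:
        levels[m] = d * float(m * m - offset)
        m -= 1
    return levels


def zero_field_gaps(two_s, d):
    """Spacings between adjacent zero-field levels, lowest energy first."""
    energies = sorted(axial_zfs_levels(two_s, d).values())
    return [hi - lo for lo, hi in zip(energies, energies[1:])]


if __name__ == "__main__":
    print(zero_field_gaps(5, 0.21))   # Fe(III), S = 5/2, hydrated exposed site
    print(zero_field_gaps(5, -1.5))   # Fe(III), S = 5/2, after dehydration
    print(zero_field_gaps(2, 0.30))   # Ni(II), S = 1, hydrated
```

    Under this assumption an S = 5/2 center has three doublets separated by 2|D| and 4|D|, so D = 0.21 cm⁻¹ gives spacings of 0.42 and 0.84 cm⁻¹, while an S = 1 center has a single spacing of |D|.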