    Bayesian analysis for mixtures of discrete distributions with a non-parametric component

    Bayesian finite mixture modelling is a flexible parametric modelling approach for classification and density fitting. Many areas of application require distinguishing a signal from a noise component. In practice, it is often difficult to justify a specific distribution for the signal component; therefore, the signal distribution is usually further modelled via a mixture of distributions. However, modelling the signal as a mixture of distributions is computationally non-trivial because it is difficult to justify the exact number of components to use and because of the label switching problem. This paper proposes the use of a non-parametric distribution to model the signal component. We consider the case of discrete data and show how this new methodology leads to more accurate parameter estimation and a smaller false non-discovery rate. Moreover, it does not incur the label switching problem. We show an application of the method to data generated by ChIP-sequencing experiments.
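To make the signal-versus-noise classification concrete, here is a minimal sketch of the posterior probability that a count belongs to the signal in a two-component mixture. Everything specific in it is an assumption for illustration only: the Poisson noise model, the mixing weight, the hypothetical `signal_pmf` table standing in for a non-parametric signal distribution, and all function names. The abstract does not state the paper's actual model or estimation procedure.

```python
import math


def poisson_pmf(k, rate):
    """Probability of observing count k under a Poisson distribution."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def posterior_signal_probability(count, signal_weight, noise_rate, signal_pmf):
    """Bayes' rule for P(signal | count) in a signal/noise mixture.

    signal_pmf is a plain table {count: probability}; counts missing
    from the table are treated as having zero signal probability.
    """
    signal = signal_weight * signal_pmf.get(count, 0.0)
    noise = (1.0 - signal_weight) * poisson_pmf(count, noise_rate)
    return signal / (signal + noise)


# Hypothetical values chosen only for illustration.
signal_pmf = {5: 0.2, 6: 0.3, 7: 0.3, 8: 0.2}
p_low = posterior_signal_probability(1, 0.1, 1.0, signal_pmf)
p_high = posterior_signal_probability(7, 0.1, 1.0, signal_pmf)
```

Because the signal is described by a single table rather than by several interchangeable parametric components, there are no component labels to permute in this sketch.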

    Equianalytic and equisingular families of curves on surfaces

    We consider flat families of reduced curves on a smooth surface S such that each member C has the same number of singularities of fixed singularity types and the corresponding (locally closed) subscheme H of the Hilbert scheme of S. We are mainly concerned with analytic resp. topological singularity types and give a sufficient condition for the smoothness of H (at C). Our results for S=P^2 seem to be quite sharp for families of curves of small degree d. Comment: LaTeX v 2.0

    Bayesian solutions to the label switching problem

    The label switching problem, the unidentifiability of the permutation of clusters or more generally latent variables, makes interpretation of results computed with MCMC sampling difficult. We introduce a fully Bayesian treatment of the permutations which performs better than alternatives. The method can be used to compute summaries of the posterior samples even for nonparametric Bayesian methods, for which no good solutions exist so far. Although the method is approximate in this case, the results are very promising. The summaries are intuitively appealing: a summarized cluster is defined as a set of points for which the likelihood of being in the same cluster is maximized.
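The unidentifiability described above can be checked numerically: permuting the component labels of a mixture leaves its likelihood unchanged. The sketch below shows this for a two-component Gaussian mixture. It illustrates only the problem, not the Bayesian treatment the abstract proposes, and the data, parameter values, and function names are all made up for the example.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_log_likelihood(data, weights, means, sds):
    """Log-likelihood of data under a finite Gaussian mixture."""
    total = 0.0
    for x in data:
        density = sum(
            w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)
        )
        total += math.log(density)
    return total


# Made-up observations with two visible groups.
data = [-2.1, -1.9, -2.4, 3.0, 3.2, 2.7]

# Same mixture, with component labels 1 and 2 swapped in the second call.
ll_original = mixture_log_likelihood(data, [0.3, 0.7], [-2.0, 3.0], [1.0, 0.5])
ll_swapped = mixture_log_likelihood(data, [0.7, 0.3], [3.0, -2.0], [0.5, 1.0])

labels_are_unidentifiable = math.isclose(ll_original, ll_swapped)
```

Since both labelings fit the data equally well, an MCMC sampler can move between them, and per-component averages over the posterior samples then mix the two components together.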

    An Adaptive Interacting Wang-Landau Algorithm for Automatic Density Exploration

    While statisticians are well-accustomed to performing exploratory analysis in the modeling stage of an analysis, the notion of conducting preliminary general-purpose exploratory analysis in the Monte Carlo stage (or more generally, the model-fitting stage) of an analysis is an area which we feel deserves much further attention. Towards this aim, this paper proposes a general-purpose algorithm for automatic density exploration. The proposed exploration algorithm combines and expands upon components from various adaptive Markov chain Monte Carlo methods, with the Wang-Landau algorithm at its heart. Additionally, the algorithm is run on interacting parallel chains, a feature which both decreases computational cost and stabilizes the algorithm, improving its ability to explore the density. Performance is studied in several applications. Through a Bayesian variable selection example, the authors demonstrate the convergence gains obtained with interacting chains. The ability of the algorithm's adaptive proposal to induce mode-jumping is illustrated through a trimodal density and a Bayesian mixture modeling application. Lastly, through a 2D Ising model, the authors demonstrate the ability of the algorithm to overcome the high correlations encountered in spatial models. Comment: 33 pages, 20 figures (the supplementary materials are included as appendices)

    Analysis of ChIP-seq data via Bayesian finite mixture models with a non-parametric component

    In large discrete data sets that require classification into signal and noise components, the distribution of the signal is often very bumpy and does not follow a standard distribution. Therefore the signal distribution is further modelled as a mixture of component distributions. However, when the signal component is modelled as a mixture of distributions, we are faced with the challenges of justifying the number of components and the label switching problem (caused by multimodality of the likelihood function). To circumvent these challenges, we propose a non-parametric structure for the signal component. This new method yields more precise estimates and better classifications. We demonstrate the efficacy of the methodology using a ChIP-sequencing data set.

    Gender gaps in education

    This chapter reviews the growing body of research in economics which concentrates on the education gender gap and its evolution, over time and across countries. The survey first focuses on gender differentials in the historical period that roughly goes from 1850 to the 1940s and documents the deep determinants of the early phase of female education expansion, including preindustrial conditions, religion, and family and kinship patterns. Next, the survey describes the stylized facts of contemporaneous gender gaps in education, from the 1950s to the present day, accounting for several alternative measures of attainment and achievement and for geographic and temporal differentiations. The determinants of the gaps are then summarized, while keeping a strong emphasis on a historical perspective and disentangling factors related to the labor market, family formation, psychological elements, and societal cultural norms. A discussion follows of the implications of the education gender gap for multiple realms, from economic growth to family life, taking into account the potential for reverse causation. Special attention is devoted to the persistence of gender gaps in the STEM and economics fields.