974 research outputs found

    Comparison of Mixture and Classification Maximum Likelihood Approaches in Poisson Regression Models

    Get PDF
    In this work, we propose to compare two algorithms to compute maximum likelihood estimators of the parameters of a mixture Poisson regression models. To estimate these parameters, we may use the EM algorithm in a mixture approach or the CEM algorithm in a classification approach. The comparison of the two procedures was done through a simulation study of the performance of these approaches on simulated data sets in a target number of iterations. Simulation results show that the CEM algorithm is a good alternative to the EM algorithm for fitting Poisson mixture regression models, having the advantage of converging more quickly

    Latent class cluster analysis

    Get PDF

    Machine Learning and Integrative Analysis of Biomedical Big Data.

    Get PDF
    Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges as well as exacerbates the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues

    CLADAG 2021 BOOK OF ABSTRACTS AND SHORT PAPERS

    Get PDF
    The book collects the short papers presented at the 13th Scientific Meeting of the Classification and Data Analysis Group (CLADAG) of the Italian Statistical Society (SIS). The meeting has been organized by the Department of Statistics, Computer Science and Applications of the University of Florence, under the auspices of the Italian Statistical Society and the International Federation of Classification Societies (IFCS). CLADAG is a member of the IFCS, a federation of national, regional, and linguistically-based classification societies. It is a non-profit, non-political scientific organization, whose aims are to further classification research

    Untangling hotel industry’s inefficiency: An SFA approach applied to a renowned Portuguese hotel chain

    Get PDF
    The present paper explores the technical efficiency of four hotels from Teixeira Duarte Group - a renowned Portuguese hotel chain. An efficiency ranking is established from these four hotel units located in Portugal using Stochastic Frontier Analysis. This methodology allows to discriminate between measurement error and systematic inefficiencies in the estimation process enabling to investigate the main inefficiency causes. Several suggestions concerning efficiency improvement are undertaken for each hotel studied.info:eu-repo/semantics/publishedVersio

    Model based clustering of multinomial count data

    Full text link
    We consider the problem of inferring an unknown number of clusters in replicated multinomial data. Under a model based clustering point of view, this task can be treated by estimating finite mixtures of multinomial distributions with or without covariates. Both Maximum Likelihood (ML) as well as Bayesian estimation are taken into account. Under a Maximum Likelihood approach, we provide an Expectation--Maximization (EM) algorithm which exploits a careful initialization procedure combined with a ridge--stabilized implementation of the Newton--Raphson method in the M--step. Under a Bayesian setup, a stochastic gradient Markov chain Monte Carlo (MCMC) algorithm embedded within a prior parallel tempering scheme is devised. The number of clusters is selected according to the Integrated Completed Likelihood criterion in the ML approach and estimating the number of non-empty components in overfitting mixture models in the Bayesian case. Our method is illustrated in simulated data and applied to two real datasets. An R package is available at https://github.com/mqbssppe/multinomialLogitMix.Comment: to appear in ADA

    ISBIS 2016: Meeting on Statistics in Business and Industry

    Get PDF
    This Book includes the abstracts of the talks presented at the 2016 International Symposium on Business and Industrial Statistics, held at Barcelona, June 8-10, 2016, hosted at the Universitat Politècnica de Catalunya - Barcelona TECH, by the Department of Statistics and Operations Research. The location of the meeting was at ETSEIB Building (Escola Tecnica Superior d'Enginyeria Industrial) at Avda Diagonal 647. The meeting organizers celebrated the continued success of ISBIS and ENBIS society, and the meeting draw together the international community of statisticians, both academics and industry professionals, who share the goal of making statistics the foundation for decision making in business and related applications. The Scientific Program Committee was constituted by: David Banks, Duke University Amílcar Oliveira, DCeT - Universidade Aberta and CEAUL Teresa A. Oliveira, DCeT - Universidade Aberta and CEAUL Nalini Ravishankar, University of Connecticut Xavier Tort Martorell, Universitat Politécnica de Catalunya, Barcelona TECH Martina Vandebroek, KU Leuven Vincenzo Esposito Vinzi, ESSEC Business Schoo
    corecore