Scalable Inference of Customer Similarities from Interactions Data using Dirichlet Processes

Authors: Bonfrer André; Braun Michael
Publication venue: 'Institute for Operations Research and the Management Sciences (INFORMS)'
Publication date: 01/05/2009

Under the sociological theory of homophily, people who are similar to one another are more likely to interact with one another. Marketers often have access to data on interactions among customers from which, with homophily as a guiding principle, inferences could be made about the underlying similarities. However, larger networks face a quadratic explosion in the number of potential interactions that need to be modeled. This scalability problem renders probability models of social interactions computationally infeasible for all but the smallest networks. In this paper we develop a probabilistic framework for modeling customer interactions that is both grounded in the theory of homophily and flexible enough to account for random variation in who interacts with whom. In particular, we present a novel Bayesian nonparametric approach, using Dirichlet processes, to moderate the scalability problems that marketing researchers encounter when working with networked data. We find that this framework is a powerful way to draw insights into latent similarities of customers, and we discuss how marketers can apply these insights to segmentation and targeting activities.

arXiv.org e-Print Archive

Southern Methodist University

CiteSeerX

DSpace@MIT

Deakin Research Online

Research Papers in Economics

The Australian National University

SMU Digital Repository

University of Queensland eSpace

Advanced Methods in Comparative Politics: Modeling Without Conditional Independence

Author: Carlson David George
Publication venue: Washington University Open Scholarship
Publication date: 15/05/2018

One of the most significant assumptions we invoke when making quantitative inferences is the conditional independence between observations. There are, however, many situations when we may doubt this independence. For instance, two seemingly distinct data-generating processes may in fact share unobserved relations. Time-series and cross-sectional studies are also plagued by a lack of independence. If we ignore this common violation of our fundamental modeling assumptions, we may draw improper conclusions from our data. This dissertation introduces two methods to the political science literature: a zero-inflated multivariate ordered probit and Gaussian process regression for time-series cross-sectional analyses. This latter model is then applied to demonstrate that executives in Latin America enjoy increased public support following ideological moderation, but executives are less willing to moderate during election years. These effects, however, are conditional on the extremity of the executive. The dissertation as a whole contributes both methodologically and theoretically to the field.

Washington University St. Louis: Open Scholarship

Untangling hotel industry’s inefficiency: An SFA approach applied to a renowned Portuguese hotel chain

Authors: Ferreira N.; Oliveira M.
Publication venue: CFE and CMStatistics networks
Publication date: 01/01/2015

The present paper explores the technical efficiency of four hotels from Teixeira Duarte Group, a renowned Portuguese hotel chain. An efficiency ranking is established for these four hotel units located in Portugal using Stochastic Frontier Analysis. This methodology makes it possible to discriminate between measurement error and systematic inefficiencies in the estimation process, which enables investigation of the main causes of inefficiency. Several suggestions for improving efficiency are offered for each hotel studied.

Repositório Institucional do ISCTE-IUL

Onset of an outline map to get a hold on the wildwood of clustering methods

Authors: Hennig Christian; Kiers Henk A. L.; Van Mechelen Iven
Publication date: 26/04/2023

The domain of cluster analysis is a meeting point for a very rich multidisciplinary encounter, with cluster-analytic methods being studied and developed in discrete mathematics, numerical analysis, statistics, data analysis and data science, and computer science (including machine learning, data mining, and knowledge discovery), to name but a few. The other side of the coin, however, is that the domain suffers from a major accessibility problem as well as from the fact that it is rife with division across many pretty isolated islands. As a way out, the present paper offers an outline map for the clustering domain as a whole, which takes the form of an overarching conceptual framework and a common language. With this framework we wish to contribute to structuring the domain, to characterizing methods that have often been developed and studied in quite different contexts, to identifying links between them, and to introducing a frame of reference for optimally setting up cluster analyses in data-analytic practice. Comment: 33 pages, 4 figures

arXiv.org e-Print Archive

An overview of clustering methods with guidelines for application in mental health research

Author: Gao Caroline X.
Publication venue: Universidad de Granada
Publication date: 27/05/2023

Cluster analyses have been widely used in mental health research to decompose inter-individual heterogeneity by identifying more homogeneous subgroups of individuals. However, despite advances in new algorithms and increasing popularity, there is little guidance on model choice, analytical framework and reporting requirements. In this paper, we aimed to address this gap by introducing the philosophy, design, advantages/disadvantages and implementation of major algorithms that are particularly relevant in mental health research. Extensions of basic models, such as kernel methods, deep learning, semi-supervised clustering, and clustering ensembles, are subsequently introduced. How to choose algorithms to address common issues, as well as methods for pre-clustering data processing, clustering evaluation and validation, are then discussed. Importantly, we also provide general guidance on clustering workflow and reporting requirements. To facilitate the implementation of different algorithms, we provide information on R functions and libraries.

Repositorio Institucional Universidad de Granada