838 research outputs found
Scalable Inference of Customer Similarities from Interactions Data using Dirichlet Processes
Under the sociological theory of homophily, people who are similar to one
another are more likely to interact with one another. Marketers often have
access to data on interactions among customers from which, with homophily as a
guiding principle, inferences could be made about the underlying similarities.
However, larger networks face a quadratic explosion in the number of potential
interactions that need to be modeled. This scalability problem renders
probability models of social interactions computationally infeasible for all
but the smallest networks. In this paper we develop a probabilistic framework
for modeling customer interactions that is both grounded in the theory of
homophily, and is flexible enough to account for random variation in who
interacts with whom. In particular, we present a novel Bayesian nonparametric
approach, using Dirichlet processes, to moderate the scalability problems that
marketing researchers encounter when working with networked data. We find that
this framework is a powerful way to draw insights into latent similarities of
customers, and we discuss how marketers can apply these insights to
segmentation and targeting activities.
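As an illustration of why Dirichlet processes help with scale, the sketch below samples customer groupings from a Chinese restaurant process, the standard sequential view of a Dirichlet process prior. It is not the paper's model, which the abstract does not specify; the function name `crp_assignments` and the concentration value `alpha=2.0` are assumptions chosen for the example. The point is that the number of latent groups is not fixed in advance and tends to grow slowly with the number of customers.

```python
import random

def crp_assignments(n_customers, alpha, seed=0):
    """Sample cluster labels from a Chinese restaurant process.

    alpha (> 0) is the concentration parameter: larger values make
    new clusters more likely. Hypothetical helper for illustration.
    """
    rng = random.Random(seed)
    labels = []   # labels[i] = cluster index of customer i
    counts = []   # counts[k] = number of customers in cluster k
    for _ in range(n_customers):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)      # open a new cluster
        else:
            counts[k] += 1        # join an existing cluster
        labels.append(k)
    return labels, counts

labels, counts = crp_assignments(n_customers=1000, alpha=2.0)
print(len(counts), "clusters for", len(labels), "customers")
```

Modeling similarity at the level of a modest number of latent groups, rather than every customer pair, is one way such a prior can moderate the quadratic growth in potential interactions.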
Advanced Methods in Comparative Politics: Modeling Without Conditional Independence
One of the most significant assumptions we invoke when making quantitative inferences is the conditional independence between observations. There are, however, many situations when we may doubt this independence. For instance, two seemingly distinct data-generating processes may in fact share unobserved relations. Time-series and cross-sectional studies are also plagued by a lack of independence. If we ignore this common violation of our fundamental modeling assumptions, we may draw improper conclusions from our data. This dissertation introduces two methods to the political science literature: a zero-inflated multivariate ordered probit and Gaussian process regression for time-series cross-sectional analyses. This latter model is then applied to demonstrate that executives in Latin America enjoy increased public support following ideological moderation, but executives are less willing to moderate during election years. These effects, however, are conditional on the extremity of the executive. The dissertation as a whole contributes both methodologically and theoretically to the field.
Untangling hotel industry’s inefficiency: An SFA approach applied to a renowned Portuguese hotel chain
The present paper explores the technical efficiency of four hotels from the Teixeira Duarte Group, a renowned Portuguese hotel chain. An efficiency ranking is established for these four hotel units, all located in Portugal, using Stochastic Frontier Analysis. This methodology makes it possible to discriminate between measurement error and systematic inefficiency in the estimation process, which in turn allows the main causes of inefficiency to be investigated. Several suggestions for improving efficiency are made for each hotel studied.
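For readers unfamiliar with the method, the separation of noise from inefficiency can be written in the textbook composed-error form of a stochastic production frontier. This is a generic sketch; the abstract does not state the paper's functional form or distributional assumptions, so the half-normal choice for $u_i$ and the log-output interpretation are assumptions.

```latex
\begin{aligned}
\ln y_i &= \mathbf{x}_i^{\top}\boldsymbol{\beta} + v_i - u_i, \\
v_i &\sim \mathcal{N}(0, \sigma_v^2) \quad \text{(two-sided measurement error)}, \\
u_i &\sim \mathcal{N}^{+}(0, \sigma_u^2), \; u_i \ge 0 \quad \text{(one-sided inefficiency)}, \\
\mathrm{TE}_i &= \exp(-u_i) \in (0, 1].
\end{aligned}
```

Here $y_i$ is the output of unit $i$, $\mathbf{x}_i$ its (logged) inputs, and $\mathrm{TE}_i$ its technical efficiency; units can be ranked by their estimated $\mathrm{TE}_i$.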
Onset of an outline map to get a hold on the wildwood of clustering methods
The domain of cluster analysis is a meeting point for a very rich
multidisciplinary encounter, with cluster-analytic methods being studied and
developed in discrete mathematics, numerical analysis, statistics, data
analysis and data science, and computer science (including machine learning,
data mining, and knowledge discovery), to name but a few. The other side of the
coin, however, is that the domain suffers from a major accessibility problem as
well as from the fact that it is rife with division across many pretty isolated
islands. As a way out, the present paper offers an outline map for the
clustering domain as a whole, which takes the form of an overarching conceptual
framework and a common language. With this framework we wish to contribute to
structuring the domain, to characterizing methods that have often been
developed and studied in quite different contexts, to identifying links between
them, and to introducing a frame of reference for optimally setting up cluster
analyses in data-analytic practice. Comment: 33 pages, 4 figures
An overview of clustering methods with guidelines for application in mental health research
Cluster analyses have been widely used in mental health research to decompose inter-individual heterogeneity
by identifying more homogeneous subgroups of individuals. However, despite advances in new algorithms and
increasing popularity, there is little guidance on model choice, analytical framework and reporting requirements.
In this paper, we aimed to address this gap by introducing the philosophy, design, advantages/disadvantages and
implementation of major algorithms that are particularly relevant in mental health research. Extensions of basic
models, such as kernel methods, deep learning, semi-supervised clustering, and clustering ensembles are subsequently
introduced. How to choose algorithms to address common issues as well as methods for pre-clustering
data processing, clustering evaluation and validation are then discussed. Importantly, we also provide general
guidance on clustering workflow and reporting requirements. To facilitate the implementation of different algorithms,
we provide information on R functions and libraries.