Benchmarking in cluster analysis: A white paper
To achieve scientific progress in terms of building a cumulative body of
knowledge, careful attention to benchmarking is of the utmost importance. This
means that proposals of new methods of data pre-processing, new data-analytic
techniques, and new methods of output post-processing, should be extensively
and carefully compared with existing alternatives, and that existing methods
should be subjected to neutral comparison studies. To date, benchmarking and
recommendations for benchmarking have been frequently seen in the context of
supervised learning. Unfortunately, there has been a dearth of guidelines for
benchmarking in an unsupervised setting, with the area of clustering as an
important subdomain. To address this problem, we discuss the theoretical and
conceptual underpinnings of benchmarking in the field of cluster analysis by
means of simulated as well as empirical data. We then address the
practicalities of answering benchmarking questions in clustering and make
foundational recommendations.
Modeling Heterogeneous Peer Assortment Effects Using Finite Mixture Exponential Random Graph Models
This article develops a class of models called sender/receiver finite mixture exponential random graph models (SRFM-ERGMs). This class of models extends the existing exponential random graph modeling framework to allow analysts to model unobserved heterogeneity in the effects of nodal covariates and network features without a block structure. An empirical example regarding substance use among adolescents is presented. Simulations across a variety of conditions are used to evaluate the performance of this technique. We conclude that unobserved heterogeneity in effects of nodal covariates can be a major cause of misfit in network models, and the SRFM-ERGM approach can alleviate this misfit. Implications for the analysis of social networks in psychological science are discussed.
The analysis of bridging constructs with hierarchical clustering methods: An application to identity
When analyzing psychometric surveys, some design and sample size limitations challenge existing
approaches. Hierarchical clustering, with its graphics (heat maps, dendrograms, means plots), provides
a nonparametric method for analyzing factorially-designed survey data and small-sample data. In the
present study, we demonstrated the advantages of using hierarchical clustering (HC) for the analysis
of non-higher-order measures, comparing the results of HC against those of exploratory factor analysis (EFA).
As a factorially-designed survey, we used the Identity Labels and Life Contexts Questionnaire (ILLCQ), a
novel measure to assess identity as a bridging construct for the intersection of identity domains and life
contexts. Results suggest that, when used to validate factorially-designed measures, HC and its graphics
are more stable and consistent than EFA.
A Modified Approach to Fitting Relative Importance Networks
Relative importance networks have played an important role in network psychometrics because they enable researchers to examine directional relationships among items. Most researchers have estimated the network edge weights using a well-established measure of general dominance for multiple regression, and we recommend continuation of this practice. However, we recommend a modified approach that uses best-subsets regression as a preceding step to select an appropriate subset of predictors for each item. The benefits of this modified approach include: (1) a principled approach to edge selection for the relative importance network that reduces overfitting, (2) greater explained variation for scale items in comparison to the cutting of complete graphs, (3) the possibility of a signed network, if desired, and (4) potential generalization to logistic regression in the case of binary measurements. We describe and demonstrate the proposed approach and discuss its strengths and limitations.
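As a rough illustration of the general dominance measure mentioned in this abstract, the sketch below computes general dominance weights for one outcome item from a correlation matrix. It is not the authors' implementation: the function names (`solve`, `r_squared`, `general_dominance`) and the example correlations are hypothetical, and the best-subsets selection step is omitted. It assumes the standard definition, in which a predictor's weight is its average gain in R² over all subsets of the other predictors, averaged first within each subset size and then across sizes.

```python
from itertools import combinations


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def r_squared(subset, rxx, rxy):
    """R^2 of the outcome regressed on the predictors in `subset` (standardized variables)."""
    if not subset:
        return 0.0
    a = [[rxx[i][j] for j in subset] for i in subset]
    b = [rxy[i] for i in subset]
    beta = solve(a, b)  # standardized regression coefficients
    return sum(bi * ri for bi, ri in zip(beta, b))


def general_dominance(rxx, rxy):
    """General dominance weight of each predictor.

    rxx: predictor correlation matrix; rxy: predictor-outcome correlations.
    """
    p = len(rxy)
    weights = []
    for i in range(p):
        others = [j for j in range(p) if j != i]
        level_means = []
        for k in range(p):  # subset sizes 0 .. p-1
            gains = [
                r_squared(list(s) + [i], rxx, rxy) - r_squared(list(s), rxx, rxy)
                for s in combinations(others, k)
            ]
            level_means.append(sum(gains) / len(gains))
        weights.append(sum(level_means) / p)
    return weights


# Hypothetical example: two predictors correlated 0.5 with each other.
weights = general_dominance([[1.0, 0.5], [0.5, 1.0]], [0.6, 0.4])
print([round(w, 4) for w in weights])
```

The weights sum to the R² of the full model, which is what allows them to be read as shares of explained variation. In a relative importance network, a calculation of this kind would be repeated with each item in turn as the outcome, with the resulting weights serving as directed edge weights into that item.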