Structural Equation Modeling and simultaneous clustering through the Partial Least Squares algorithm
The identification of different homogeneous groups of observations and their
appropriate analysis in PLS-SEM has become a critical issue in many application
fields. Usually, both SEM and PLS-SEM assume the homogeneity of all
units on which the model is estimated, and the segmentation approaches found in
the literature consist in estimating separate models for each segment of
statistical units, where the segments are obtained by assigning the units to
groups defined a priori. However, these approaches are not fully acceptable
because no causal structure among the variables is postulated. In other words,
a modeling approach should be used, where the obtained clusters are homogeneous
with respect to the structural causal relationships. In this paper, a new
methodology for simultaneous non-hierarchical clustering and PLS-SEM is
proposed. This methodology is motivated by the fact that the sequential
approach of first applying SEM or PLS-SEM and then a clustering algorithm
such as K-means to the latent scores of the SEM/PLS-SEM may fail to find the
correct clustering structure existing in the data. A simulation study and an
application on real data are included to evaluate the performance of the
proposed methodology.
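For orientation, the sequential approach that this abstract argues against can be sketched as below. This is a minimal illustration under stated assumptions, not the method proposed in the paper: an equal-weight composite of standardized indicators stands in for PLS-SEM latent scores, and the function names (`composite_scores`, `kmeans_1d`) are hypothetical.

```python
from statistics import mean, pstdev


def standardize(column):
    """Center and scale one indicator (population standard deviation)."""
    m, s = mean(column), pstdev(column)
    return [(x - m) / s if s > 0 else 0.0 for x in column]


def composite_scores(rows):
    """Step 1: one score per unit.

    Assumption: an equal-weight mean of standardized indicators is used
    here as a stand-in for latent scores estimated by PLS-SEM.
    """
    columns = [standardize(list(col)) for col in zip(*rows)]
    return [mean(values) for values in zip(*columns)]


def kmeans_1d(scores, k, iters=100):
    """Step 2: plain K-means on the scores, with a deterministic start."""
    lo, hi = min(scores), max(scores)
    # Spread the initial centers evenly between the smallest and largest score.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(scores)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: (s - centers[j]) ** 2)
                  for s in scores]
        new_centers = []
        for j in range(k):
            members = [s for s, lab in zip(scores, labels) if lab == j]
            # Keep the old center if a cluster happens to be empty.
            new_centers.append(mean(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Toy data: six units, two indicators, two visibly separated groups.
rows = [[1.0, 2.0], [1.1, 2.1], [0.9, 1.9],
        [5.0, 6.0], [5.1, 6.2], [4.9, 5.8]]
scores = composite_scores(rows)
labels, centers = kmeans_1d(scores, k=2)
```

Because the clustering step only sees the scores, any group structure that lives in the structural relationships rather than in the score levels is invisible to it, which is the weakness the abstract points to.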
Using the theory of planned behaviour as a process evaluation tool in randomised trials of knowledge translation strategies : A case study from UK primary care
Peer reviewed. Publisher PD
Regional development assessment using parametric and non-parametric ranking methods: A comparative analysis of Slovenia and Croatia
In this paper we describe several regional development-assessment methods and subsequently apply them in a comparative analysis of the development level of Slovenian and Croatian municipalities. The aim is to compare the performance and suitability of several parametric and non-parametric ranking methods and to develop a suitable multivariate methodological framework for distinguishing the development level of particular territorial units. However, the usefulness and appropriateness of various multivariate techniques for regional development assessment is generally questionable, and there is no clear consensus about how to carry out such analysis. The two main methodological approaches are based on parametric and non-parametric methods. In the former, an explicit econometric model containing theory-implied causal and possibly simultaneous relationships is estimated using likelihood-based methods and formally assessed in terms of goodness of fit and other test statistics, subsequently allowing for estimation of the development level on a metric scale. In the latter, territorial units or regions are essentially classified into clusters or groups differing in development level, but no formal inferential methods are applied to confirm the validity of the model or to establish the difference in development level on a metric scale. The possible advantages of the first approach lie in the existence of formal testing and evaluation procedures and in producing interval ranks of the analysed units, while its disadvantages are a lack of robustness, often unrealistic distributional assumptions, and possible invalidity of the theoretically implied causal relationships. In this paper we consider a parametric, inferential approach based on maximum likelihood estimation of the linear structural equation model with latent variables for metric-scale development ranking, and a non-parametric approach based on cluster analysis for development grouping.
Our analysis is based on ten regional development variables, such as income per capita, population density, and age index, which are similarly collected and generally compatible for both analysed countries. Within the parametric approach, a simultaneous equation econometric model is estimated and latent scores are computed for each underlying latent development variable, where three latent constructs are postulated corresponding to economic, structural and demographic development dimensions. In the non-parametric approach, a combination of Ward's hierarchical method and the K-means clustering procedure is applied to classify the territorial units. We apply both methodological frameworks to Slovenian and Croatian municipality data and assess their regional development level. We further compare the performance of both methods and show the degree to which their results are compatible. Finally, we propose a unified framework based on both parametric and non-parametric methods, where clustering techniques are performed both on the original development indicators and on the computed latent scores from the structural equation model, and compare these results with the results from each of the two methods applied separately. We show that a combined parametric/non-parametric approach is superior to each approach applied individually and propose a methodological framework capable of estimating the development level of territorial units or regions on a metric scale, while at the same time preserving the robustness of the non-parametric techniques.
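The abstract does not say how Ward's method and K-means are combined. One common pairing, assumed here purely for illustration, is to cut a Ward hierarchy at k clusters and use the resulting centroids as K-means starting points. The sketch below uses a naive pure-Python Ward agglomeration and made-up toy points; all function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def ward_clusters(points, k):
    """Naive Ward agglomeration: repeatedly merge the pair of clusters whose
    merge gives the smallest increase in within-cluster sum of squares."""
    clusters = [[tuple(p)] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a, b = clusters[i], clusters[j]
                # Ward merge cost: |A||B| / (|A| + |B|) * ||c_A - c_B||^2
                cost = (len(a) * len(b) / (len(a) + len(b))
                        * sq_dist(centroid(a), centroid(b)))
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def kmeans(points, centers, iters=100):
    """Standard K-means iterations starting from the given centers."""
    points = [tuple(p) for p in points]
    centers = list(centers)
    labels = []
    for _ in range(iters):
        labels = [min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Keep the old center if a cluster happens to be empty.
            new_centers.append(centroid(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def ward_then_kmeans(points, k):
    """Assumed combination: Ward centroids seed the K-means refinement."""
    seeds = [centroid(cluster) for cluster in ward_clusters(points, k)]
    return kmeans(points, seeds)


# Toy data standing in for territorial units described by two indicators.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = ward_then_kmeans(points, 2)
```

Seeding K-means from a hierarchical solution removes the dependence on random starting points, while the K-means pass lets units move between the groups fixed by the hierarchy.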
Network Cosmology
Prediction and control of the dynamics of complex networks is a central
problem in network science. Structural and dynamical similarities of different
real networks suggest that some universal laws might accurately describe the
dynamics of these networks, although the nature and common origin of such laws
remain elusive. Here we show that the causal network representing the
large-scale structure of spacetime in our accelerating universe is a power-law
graph with strong clustering, similar to many complex networks such as the
Internet, social, or biological networks. We prove that this structural
similarity is a consequence of the asymptotic equivalence between the
large-scale growth dynamics of complex networks and causal networks. This
equivalence suggests that unexpectedly similar laws govern the dynamics of
complex networks and spacetime in the universe, with implications for network
science and cosmology.
Detection of regulator genes and eQTLs in gene networks
Genetic differences between individuals associated with quantitative phenotypic
traits, including disease states, are usually found in non-coding genomic
regions. These genetic variants are often also associated with differences in
expression levels of nearby genes (they are "expression quantitative trait
loci" or eQTLs for short) and presumably play a gene regulatory role, affecting
the status of molecular networks of interacting genes, proteins and
metabolites. Computational systems biology approaches to reconstruct causal
gene networks from large-scale omics data have therefore become essential to
understand the structure of networks controlled by eQTLs together with other
regulatory genes, and to generate detailed hypotheses about the molecular
mechanisms that lead from genotype to phenotype. Here we review the main
analytical methods and software to identify eQTLs and their associated genes,
to reconstruct co-expression networks and modules, to reconstruct causal
Bayesian gene and module networks, and to validate predicted networks in
silico.
Comment: minor revision with typos corrected; review article; 24 pages, 2 figures
The Importance of Being Clustered: Uncluttering the Trends of Statistics from 1970 to 2015
In this paper we retrace the recent history of statistics by analyzing all
the papers published in five prestigious statistical journals since 1970,
namely: Annals of Statistics, Biometrika, Journal of the American Statistical
Association, Journal of the Royal Statistical Society, series B and Statistical
Science. The aim is to construct a kind of "taxonomy" of the statistical papers
by organizing and clustering them into main themes. In this sense, being
identified in a cluster means being important enough to be uncluttered in the
vast and interconnected world of statistical research. Since the main
statistical research topics are naturally born, evolve, or die over time, we will
also develop a dynamic clustering strategy, where a group in a time period is
allowed to migrate or to merge into different groups in the following one.
Results show that statistics is a very dynamic and evolving science, stimulated
by the rise of new research questions and types of data.