    Construction of an Immigrant Integration Composite Indicator through the Partial Least Squares Structural Equation Model K-Means

    Integration is a multidimensional process, which can take place in different ways and at different times in relation to each of the single economic, social, cultural, and political dimensions. Hence, it is important both to examine every single dimension and to build composite indexes that include all dimensions simultaneously, in order to obtain a complete description of a complex phenomenon and to convey a coherent set of information. In this paper, we aim to build an immigrant integration composite indicator (IICI) that measures the different aspects of integration, such as employment, education, social inclusion, and active citizenship, and on the basis of which territorial areas such as European regions can be simultaneously classified. For this application, we use data on immigration collected in 274 European regions from the European Social Survey (ESS), Round 8.

    Combinatorial Bounds and Characterizations of Splitting Authentication Codes

    We present several generalizations of results for splitting authentication codes by studying the aspect of multi-fold security. As the two primary results, we prove a combinatorial lower bound on the number of encoding rules and a combinatorial characterization of optimal splitting authentication codes that are multi-fold secure against spoofing attacks. The characterization is based on a new type of combinatorial designs, which we introduce and for which basic necessary conditions are given regarding their existence. Comment: 13 pages; to appear in "Cryptography and Communications".

    Lassoing and corraling rooted phylogenetic trees

    The construction of a dendrogram on a set of individuals is a key component of a genome-wide association study. However, even with modern sequencing technologies, the distances on the individuals required for the construction of such a structure may not always be reliable, making it tempting to exclude them from an analysis. This, in turn, results in an input set for dendrogram construction that consists of only partial distance information, which raises the following fundamental question: for what subset of its leaf set can we uniquely reconstruct the dendrogram from the distances that it induces on that subset? By formalizing a dendrogram in terms of an edge-weighted, rooted phylogenetic tree on a pre-given finite set X with |X|>2 whose edge-weighting is equidistant, and a set of partial distances on X in terms of a set L of 2-subsets of X, we investigate this problem in terms of when such a tree is lassoed, that is, uniquely determined by the elements in L. For this we consider four different formalizations of the idea of "uniquely determining", giving rise to four distinct types of lassos. We present characterizations for all of them in terms of the child-edge graphs of the interior vertices of such a tree. Our characterizations imply in particular that if the tree in question is binary, then all four types of lasso must coincide.

    Recognizing Treelike k-Dissimilarities

    A k-dissimilarity D on a finite set X, |X| >= k, is a map from the set of size k subsets of X to the real numbers. Such maps naturally arise from edge-weighted trees T with leaf-set X: Given a subset Y of X of size k, D(Y) is defined to be the total length of the smallest subtree of T with leaf-set Y. In case k = 2, it is well-known that 2-dissimilarities arising in this way can be characterized by the so-called "4-point condition". However, in case k > 2 Pachter and Speyer recently posed the following question: Given an arbitrary k-dissimilarity, how do we test whether this map comes from a tree? In this paper, we provide an answer to this question, showing that for k >= 3 a k-dissimilarity on a set X arises from a tree if and only if its restriction to every 2k-element subset of X arises from some tree, and that 2k is the least possible subset size to ensure that this is the case. As a corollary, we show that there exists a polynomial-time algorithm to determine when a k-dissimilarity arises from a tree. We also give a 6-point condition for determining when a 3-dissimilarity arises from a tree, that is similar to the aforementioned 4-point condition. Comment: 18 pages, 4 figures.
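The "4-point condition" mentioned in this abstract for the case k = 2 can be illustrated with a short sketch. It is not taken from the paper: the helper names `make_distance` and `satisfies_four_point` are hypothetical, and the check covers only quadruples of distinct points, a simplification of the usual statement. The condition says that, among the three pairwise sums d(x,y)+d(z,w), d(x,z)+d(y,w), and d(x,w)+d(y,z), the two largest must be equal.

```python
from itertools import combinations


def make_distance(pairs):
    """Turn {(x, y): value} into a symmetric lookup function d(x, y)."""
    table = {frozenset(p): v for p, v in pairs.items()}
    return lambda x, y: 0.0 if x == y else table[frozenset((x, y))]


def satisfies_four_point(d, points, tol=1e-9):
    """Check the 4-point condition on every quadruple of distinct points."""
    for x, y, z, w in combinations(points, 4):
        sums = sorted([d(x, y) + d(z, w),
                       d(x, z) + d(y, w),
                       d(x, w) + d(y, z)])
        # The two largest of the three sums must coincide (up to tol).
        if sums[2] - sums[1] > tol:
            return False
    return True


# Illustrative data (an assumption): leaves a, b hang off one interior
# vertex with edge lengths 1 and 2, leaves c, d hang off another with
# lengths 3 and 4, and the two interior vertices are joined by an edge
# of length 5.
tree_like = make_distance({
    ("a", "b"): 3, ("c", "d"): 7,
    ("a", "c"): 9, ("a", "d"): 10,
    ("b", "c"): 10, ("b", "d"): 11,
})
print(satisfies_four_point(tree_like, "abcd"))
```

Checking all quadruples takes time proportional to |X|^4, which is fine for small examples like this one.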

    A survey on feature weighting based K-Means algorithms

    This is a pre-copyedited, author-produced PDF of an article accepted for publication in Journal of Classification [de Amorim, R. C., 'A survey on feature weighting based K-Means algorithms', Journal of Classification, Vol. 33(2): 210-242, August 25, 2016]. Subject to embargo. Embargo end date: 25 August 2017. The final publication is available at Springer via http://dx.doi.org/10.1007/s00357-016-9208-4 © Classification Society of North America 2016.

    In a real-world data set there is always the possibility, rather high in our opinion, that different features may have different degrees of relevance. Most machine learning algorithms deal with this fact by either selecting or deselecting features in the data preprocessing phase. However, we maintain that even among relevant features there may be different degrees of relevance, and this should be taken into account during the clustering process. With over 50 years of history, K-Means is arguably the most popular partitional clustering algorithm there is. The first K-Means based clustering algorithm to compute feature weights was designed just over 30 years ago. Various such algorithms have been designed since but there has not been, to our knowledge, a survey integrating empirical evidence of cluster recovery ability, common flaws, and possible directions for future research. This paper elaborates on the concept of feature weighting and addresses these issues by critically analysing some of the most popular, or innovative, feature weighting mechanisms based on K-Means. Peer reviewed. Final Accepted Version.

    A simulated annealing methodology for clusterwise linear regression

    In many regression applications, users are often faced with difficulties due to nonlinear relationships, heterogeneous subjects, or time series which are best represented by splines. In such applications, two or more regression functions are often necessary to best summarize the underlying structure of the data. Unfortunately, in most cases, it is not known a priori which subset of observations should be approximated with which specific regression function. This paper presents a methodology which simultaneously clusters observations into a preset number of groups and estimates the corresponding regression functions' coefficients, all to optimize a common objective function. We describe the problem and discuss related procedures. A new simulated annealing-based methodology is described as well as program options to accommodate overlapping or nonoverlapping clustering, replications per subject, univariate or multivariate dependent variables, and constraints imposed on cluster membership. Extensive Monte Carlo analyses are reported which investigate the overall performance of the methodology. A consumer psychology application is provided concerning a conjoint analysis investigation of consumer satisfaction determinants. Finally, other applications and extensions of the methodology are discussed. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/45745/1/11336_2005_Article_BF02296405.pd

    E-commerce transactions in a virtual environment: Virtual transactions

    E-commerce is a fundamental method of doing business, such that for a firm to say it is trading at all in the modern market-place it must have some element of on-line presence. Coupled with this is the explosion of the "population" of Massively Multiplayer On-line Role Playing Games and other shared virtual environments. Many suggest this will lead to a further dimension of commerce: virtual commerce. We discuss here the issues, current roadblocks and present state of an e-commerce transaction carried out completely within a virtual environment; a virtual transaction. Although technically such transactions are in a sense trivial, they raise many other issues in complex ways thus making V-transactions a highly interesting cross-disciplinary issue. We also discuss the social, ethical and regulatory implications for the virtual communities in these environments of such v-transactions, how their implementation affects the nature and management of a virtual environment, and how they represent a fundamental merging of the real and virtual worlds for the purpose of commerce. We highlight the minimal set of features a v-transaction capable virtual environment requires and suggest a model of how in the medium term they could be carried out via a methodology we call click-through, and that the developers of such environments will need to take on the multi-modal behavior of their users, as well as elements of the economic and political sciences in order to fully realize the commercial potential of the v-transaction. © 2012 Springer Science+Business Media, LLC

    A stochastic multidimensional scaling procedure for the empirical determination of convex indifference curves for preference/choice analysis

    The vast majority of existing multidimensional scaling (MDS) procedures devised for the analysis of paired comparison preference/choice judgments are typically based on either scalar product (i.e., vector) or unfolding (i.e., ideal-point) models. Such methods tend to ignore many of the essential components of microeconomic theory including convex indifference curves, constrained utility maximization, demand functions, et cetera. This paper presents a new stochastic MDS procedure called MICROSCALE that attempts to operationalize many of these traditional microeconomic concepts. First, we briefly review several existing MDS models that operate on paired comparisons data, noting the particular nature of the utility functions implied by each class of models. These utility assumptions are then directly contrasted to those of microeconomic theory. The new maximum likelihood based procedure, MICROSCALE, is presented, as well as the technical details of the estimation procedure. The results of a Monte Carlo analysis investigating the performance of the algorithm as a number of model, data, and error factors are experimentally manipulated are provided. Finally, an illustration in consumer psychology concerning a convenience sample of thirty consumers providing paired comparisons judgments for some fourteen brands of over-the-counter analgesics is discussed. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/45748/1/11336_2005_Article_BF02294463.pd

    A spatial interaction model for deriving joint space maps of bundle compositions and market segments from pick-any/J data: An application to new product options

    We propose an approach for deriving joint space maps of bundle compositions and market segments from three-way (e.g., consumers x product options/benefits/features x usage situations/scenarios/time periods) pick-any/J data. The proposed latent structure multidimensional scaling procedure simultaneously extracts market segment and product option positions in a joint space map such that the closer a product option is to a particular segment, the higher the likelihood of its being chosen by that segment. A segment-level threshold parameter is estimated that spatially delineates the bundle of product options that are predicted to be chosen by each segment. Estimates of the probability of each consumer belonging to the derived segments are simultaneously obtained. Explicit treatment of product and consumer characteristics is allowed via optional model reparameterizations of the product option locations and segment memberships. We illustrate the use of the proposed approach using an actual commercial application involving pick-any/J data gathered by a major hi-tech firm for some 23 advanced technological options for new automobiles. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/47207/1/11002_2004_Article_BF00434905.pd