1,172 research outputs found
Canonical analysis based on scatter matrices.
In this paper, the influence functions and limiting distributions of the canonical correlations and coefficients based on affine equivariant scatter matrices are developed for elliptically symmetric distributions. General formulas for limiting variances and covariances of the canonical correlations and canonical vectors based on scatter matrices are obtained. The use of so-called shape matrices in canonical analysis is also investigated. The scatter and shape matrices based on the affine equivariant Sign Covariance Matrix, as well as Tyler's shape matrix, are considered in more detail. Their finite sample and limiting efficiencies are compared to those of the Minimum Covariance Determinant estimator and S-estimates through theoretical and simulation studies. The theory is illustrated by an example.
Keywords: Canonical correlations; Canonical variables; Canonical vectors; Covariance; Covariance determinant estimator; Determinant estimator; Distribution; Efficiency; Estimator; Functions; Influence function; Matrix; Scatter; Shape matrix; Sign covariance matrix; Simulation; Studies; Theory; Tyler's estimate
Evaluation of RSL history as a tool for assistance in the development and evaluation of computer vision algorithms
A revision of Recognition Strategy Language (RSL), a domain-specific language for pattern recognition algorithm development, is in development. This language provides several tools for pattern recognition algorithm implementation and analysis, including composition of operations and a detailed history of those operations and their results. This research focuses on that history and shows that for some problems it provides an improvement over traditional methods of gathering information. When designing a pattern recognition algorithm, bookkeeping code in the form of copious logging and tracing code must be written and analyzed in order to test the effectiveness of procedures and parameters. The amount of data grows when dealing with video streams; new organization and searching tools need to be designed in order to manage the large volume of data. General purpose languages have techniques like Aspect Oriented Programming intended to address this problem, but a general approach is limited because it does not provide tools that are useful to only one problem domain. By incorporating support for this bookkeeping work directly into the language, RSL provides an improvement over the general approach in both development time and ability to evaluate the algorithm being designed for some problems. The utility of RSL is tested by evaluating the implementation process of a computer vision algorithm for recognizing American Sign Language (ASL). RSL history is examined in terms of its use in the development and evaluation stages of the algorithm, and the usefulness of the history is stated based on the benefit seen at each stage. RSL is found to be valuable for a portion of the algorithm involving distinct steps that provide opportunity for comparison. RSL was less beneficial for the dynamic programming portion of the algorithm. 
Compromises were made for performance reasons while implementing the dynamic programming solution and the inspection at every step of what amounts to a brute-force search was less informative. We suggest that this investigation could be continued by testing with a larger data set and by comparing this ASL recognition algorithm with another
Weighted Mahalanobis Distance for Hyper-Ellipsoidal Clustering
Cluster analysis is widely used in many applications, ranging from image and speech coding to pattern recognition. A new method that uses the weighted Mahalanobis distance (WMD) via the covariance matrix of the individual clusters as the basis for grouping is presented in this thesis. In this algorithm, the Mahalanobis distance is used as a measure of similarity between the samples in each cluster. This thesis discusses some difficulties associated with using the Mahalanobis distance in clustering. The proposed method provides solutions to these problems. The new algorithm is an approximation to the well-known expectation maximization (EM) procedure used to find the maximum likelihood estimates in a Gaussian mixture model. Unlike the EM procedure, WMD eliminates the requirement of having initial parameters such as the cluster means and variances, as it starts from the raw data set. Properties of the new clustering method are presented by examining the clustering quality for codebooks designed with the proposed method and competing methods on a variety of data sets. The competing methods are the Linde-Buzo-Gray (LBG) algorithm and the Fuzzy c-means (FCM) algorithm, both of which use the Euclidean distance. The neural network for hyperellipsoidal clustering (HEC), which uses the Mahalanobis distance, is also studied and compared to the WMD method and the other techniques. The new method provides better results than the competing methods. Thus, this method becomes another useful tool for use in clustering
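To make the distance measure in this abstract concrete, the following is a minimal sketch of the plain (unweighted) squared Mahalanobis distance in two dimensions, used to assign a point to the nearest cluster. It is not the thesis's WMD algorithm, whose weighting scheme is not described in the abstract. The function names `mahalanobis_sq_2d` and `assign`, and the two example clusters, are assumptions made up for illustration.

```python
def mahalanobis_sq_2d(x, mean, cov):
    """Squared Mahalanobis distance of a 2-D point from a cluster.

    cov is a 2x2 covariance matrix given as ((a, b), (c, d)).
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # The inverse of [[a, b], [c, d]] is (1 / det) * [[d, -b], [-c, a]].
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


def assign(x, clusters):
    """Index of the cluster with the smallest squared Mahalanobis distance.

    clusters is a list of (mean, cov) pairs.
    """
    return min(range(len(clusters)),
               key=lambda k: mahalanobis_sq_2d(x, *clusters[k]))


# Hypothetical clusters: one elongated along the x-axis, one circular.
clusters = [
    ((0.0, 0.0), ((4.0, 0.0), (0.0, 1.0))),
    ((3.0, 0.0), ((1.0, 0.0), (0.0, 1.0))),
]
point = (1.8, 0.0)
print(assign(point, clusters))  # 0
```

The point is closer to the second cluster's mean in Euclidean terms (1.2 versus 1.8), yet it is assigned to the first cluster because that cluster's variance along the x-axis is larger. This is the hyper-ellipsoidal behaviour that Euclidean methods such as LBG and FCM cannot capture.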
JMASM19: A SPSS Matrix For Determining Effect Sizes From Three Categories: r And Functions Of r, Differences Between Proportions, And Standardized Differences Between Means
The program is intended to provide editors, manuscript reviewers, students, and researchers with an SPSS matrix to determine an array of effect sizes not reported or the correctness of those reported, such as r-related indices, r-related squared indices, and measures of association, when the only data provided in the manuscript or article are the n, M, and SD (and sometimes proportions and t and F (1) values) for two-group designs. This program can create an internal matrix table to assist researchers in determining the size of an effect for commonly utilized r-related, mean difference, and difference in proportions indices when engaging in correlational and/or meta-analytic analyses
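As an illustration of the kind of calculation described, the sketch below recovers a standardized mean difference (Cohen's d with a pooled standard deviation) and an r-related index from only the n, M, and SD of two groups. It uses standard textbook formulas in Python and is not the SPSS program itself. The function names and the example values are assumptions.

```python
import math


def cohens_d(n1, m1, sd1, n2, m2, sd2):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var)


def d_to_r(d, n1, n2):
    """Convert d to a point-biserial r, correcting for unequal group sizes."""
    a = (n1 + n2) ** 2 / (n1 * n2)  # equals 4 when the groups are equal in size
    return d / math.sqrt(d ** 2 + a)


# Made-up summary statistics for two groups of 10.
d = cohens_d(10, 12.0, 2.0, 10, 10.0, 2.0)
r = d_to_r(d, 10, 10)
print(d)              # 1.0
print(round(r, 4))    # 0.4472
print(round(r ** 2, 4))  # 0.2, the corresponding r-squared index
```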
Identifying Examinees Who Possess Distinct and Reliable Subscores When Added Value is Lacking for the Total Sample
Research has demonstrated that although subdomain information may provide no added value beyond the total score, in some contexts such information is of utility to particular demographic subgroups (Sinharay & Haberman, 2014). However, it is argued that the utility of reporting subscores for an individual should not be based on one’s manifest characteristics (e.g., gender or ethnicity), but rather on individual needs for diagnostic information, which is driven by multidimensionality in subdomain scores. To improve the validity of diagnostic information, this study proposed the use of Mahalanobis Distance and HT indices to assess whether an individual’s data significantly departs from unidimensionality. Those examinees that were found to differ significantly were then assessed separately for subscore added value via Haberman’s (2008) procedure. To this end, simulation analyses were conducted to evaluate Type I error, power, and recovery of subscore added value classifications for various levels of subdomain test lengths, subdomain inter-correlations, and proportions of multidimensionality in the total sample. Results demonstrated that the HT index possessed around 100% power across all conditions, while maintaining Type I error below 5%, which led to nearly perfect recovery of subscore added value classifications. In contrast, the power rates for Mahalanobis Distance were much lower ranging from 13% to 61% with Type I errors maintained at the nominal level of 5%. Although the power rates were below the desired criterion of 80%, the cases identified as aberrant using this method were found to have greater variability between subdomain scores, increased reliability, and lower observed subdomain correlations when compared to the generated data. As a result, outlier cases were found to have subscore added value for nearly 100% of cases across conditions even when the generated multidimensional data did not possess subscore added value. 
These results were cross-validated using a large-scale high-stakes test in which the Mahalanobis Distance measure was found to identify 6.57% of 8,803 test-takers that possessed subscores with added-value who otherwise would have been masked by the unidimensionality of the total sample. Overall, this study suggests that the Mahalanobis Distance measure shows some promise in identifying examinees with multidimensional score profiles
Culture, Cultural Distance and Cultural Intelligence : A Multilevel Hierarchical Linear Model Analysis of Contextual Business Cultural Intelligence Quotient Antecedents
Master's thesis Business Administration BE501 - University of Agder 2019
Purpose – The purpose of our master thesis is to investigate contextual antecedents to Cultural Intelligence development. In particular, we assess the ability of cultural distance to predict Business Cultural Intelligence Quotient (BCIQ) scores.
Design / methodology / approach – Based on our literature review, we hypothesize that cultural distance has a significant positive effect on BCIQ. We split this hypothesis into three sub-hypotheses and measure cultural distance in three ways: having at least one foreign parent, the Mahalanobis cultural distance, and the delta of each of GLOBE’s practices dimensions, expressed as the difference in birth and residence country scores. Because we have variables at both the individual and country level, we use a multilevel Hierarchical Linear Model to run our analysis on a sample of 3474 individuals from 54 home and 45 host countries.
Findings – In general, we found support for our overarching hypothesis; nevertheless, cultural distance impacts BCIQ in complex ways. On one hand, having a multicultural background has a negative effect on BCIQ; on the other hand, Mahalanobis distance has a positive but weak effect on BCIQ. Furthermore, of the nine GLOBE delta practices, only the Future Orientation dimension positively affects BCIQ, whereas the Uncertainty Avoidance and Institutional Collectivism dimensions show a negative impact on BCIQ development. These intricate results are congruent with previous studies. We discuss them in light of Social Learning Theory, the nature of cultural distance, and empirical studies that confirm contextual characteristics of cultures.
Originality / value – We present two main contributions to International Business. Firstly, we map business cultural intelligence quotient globally with our BCIQ Index40; secondly, we employ environmental antecedents, e.g. cultural distance, to explain BCIQ variation among countries.
Keywords: Business Cultural Intelligence Quotient, cultural distance, Mahalanobis distance, multicultural background, GLOBE, cultural intelligence, C
Template Matching: Matched Spatial Filters and Beyond
Template matching by means of cross-correlation is common practice in pattern recognition. However, its sensitivity to deformations of the pattern and the broad and unsharp peaks it produces are significant drawbacks. This paper reviews some results on how these shortcomings can be removed. Several techniques (Matched Spatial Filters, Synthetic Discriminant Functions, Principal Components Projections and Reconstruction Residuals) are reviewed and compared on a common task: locating eyes in a database of faces. New variants are also proposed and compared: least squares Discriminant Functions and the combined use of projections on eigenfunctions and the corresponding reconstruction residuals. Finally, approximation networks are introduced in an attempt to improve filter design by the introduction of nonlinearity
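For readers unfamiliar with the baseline this abstract starts from, here is a minimal sketch of template matching by normalized cross-correlation on a tiny grayscale image stored as nested lists. It illustrates only the plain cross-correlation approach whose drawbacks the paper discusses, not any of the reviewed filters. The function names and the toy image are assumptions.

```python
import math


def ncc(patch, template):
    """Normalized cross-correlation of two equally sized flat lists."""
    n = len(template)
    mean_p = sum(patch) / n
    mean_t = sum(template) / n
    num = sum((p - mean_p) * (t - mean_t) for p, t in zip(patch, template))
    den = math.sqrt(sum((p - mean_p) ** 2 for p in patch)
                    * sum((t - mean_t) ** 2 for t in template))
    # A constant patch or template has no variance; score it as 0.
    return num / den if den else 0.0


def match_template(image, template):
    """Return (row, col, score) for the best-matching top-left corner."""
    th, tw = len(template), len(template[0])
    flat_t = [v for row in template for v in row]
    best = (0, 0, -2.0)  # scores lie in [-1, 1], so -2 is below any score
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [image[r + i][c + j] for i in range(th) for j in range(tw)]
            score = ncc(patch, flat_t)
            if score > best[2]:
                best = (r, c, score)
    return best


# Toy 4x4 image containing the 2x2 template at row 1, column 1.
image = [
    [0, 0, 0, 0],
    [0, 1, 2, 0],
    [0, 3, 4, 0],
    [0, 0, 0, 0],
]
template = [[1, 2], [3, 4]]
row, col, score = match_template(image, template)
print(row, col)  # 1 1
```

Neighbouring positions also receive fairly high scores in this toy case, a small-scale view of the broad peaks the abstract lists as a drawback.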
How Time Preferences Differ: Evidence from 45 Countries
We present results from the first large-scale international survey on time discounting, conducted in 45 countries. Cross-country variation cannot simply be explained by economic variables such as interest rates or inflation. In particular, we find strong evidence for cultural differences, as measured by the Hofstede cultural dimensions. For example, high levels of Uncertainty Avoidance or Individualism are both associated with strong hyperbolic discounting. Moreover, as an application of our data, we find evidence for an impact of time preferences on the capability of technological innovations in a country and on environmental protection.
Keywords: Time preferences; Intertemporal decision; Endogenous preference; Cross-cultural comparison
Routine Clustering of Mobile Sensor Data Facilitates Psychotic Relapse Prediction in Schizophrenia Patients
We aim to develop clustering models to obtain behavioral representations from
continuous multimodal mobile sensing data towards relapse prediction tasks. The
identified clusters could represent different routine behavioral trends related
to daily living of patients as well as atypical behavioral trends associated
with impending relapse.
We used the mobile sensing data obtained in the CrossCheck project for our
analysis. Continuous data from six different mobile sensing-based modalities
(e.g. ambient light, sound/conversation, acceleration etc.) obtained from a
total of 63 schizophrenia patients, each monitored for up to a year, were used
for the clustering models and relapse prediction evaluation. Two clustering
models, Gaussian Mixture Model (GMM) and Partition Around Medoids (PAM), were
used to obtain behavioral representations from the mobile sensing data. The
features obtained from the clustering models were used to train and evaluate a
personalized relapse prediction model using Balanced Random Forest. The
personalization was done by identifying optimal features for a given patient
based on a personalization subset consisting of other patients who are of
similar age.
The clusters identified using the GMM and PAM models were found to represent
different behavioral patterns (such as sedentary days, or active days with
little communication). Significant changes near the
relapse periods were seen in the obtained behavioral representation features
from the clustering models. The clustering model based features, together with
other features characterizing the mobile sensing data, resulted in an F2 score
of 0.24 for the relapse prediction task in a leave-one-patient-out evaluation
setting. This obtained F2 score is significantly higher than a random
classification baseline with an average F2 score of 0.042
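
For reference, the F2 score reported above is the F-beta score with beta = 2, which weights recall more heavily than precision. The sketch below computes it from confusion counts using the standard definition. The function name and the example counts are made up for illustration and are not taken from the study.

```python
def f_beta(tp, fp, fn, beta=2.0):
    """F-beta score from true positives, false positives, false negatives.

    Equivalent to (1 + beta^2) * P * R / (beta^2 * P + R), with precision
    P = tp / (tp + fp) and recall R = tp / (tp + fn).
    """
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


# Hypothetical counts: precision 0.2, recall 0.5.
print(round(f_beta(2, 8, 2), 4))            # 0.3846 (F2)
print(round(f_beta(2, 8, 2, beta=1.0), 4))  # 0.2857 (F1, for comparison)
```

With recall higher than precision, F2 exceeds F1, which reflects why F2 suits tasks such as relapse prediction where missed positives are costlier than false alarms.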