159 research outputs found

    Machine learning approaches for determining effective seeds for k-means algorithm

    In this study, I investigate and experiment with two-stage clustering procedures (hybrid models), both in simulated environments, where conditions such as collinearity and cluster structure are controlled, and in real-life problems, where conditions are not controlled. The first hybrid model (NK) integrates a neural network (NN) with the k-means algorithm (KM): the NN screens seeds and passes them to KM. The second hybrid (GK) uses a genetic algorithm (GA) instead of the neural network. Both the NN and the GA used in this study are in their simplest possible forms. In the simulated data sets, I investigate two things: comparative clustering performance, and the effects of five factors (scale, sample size, density, number of clusters, and number of variables) on the five clustering approaches (KM, NN, NK, GA, GK). Density, number of clusters, and dimension influence the clustering performance of all five approaches. KM, NK, and GK classify well when all clusters contain a similar number of observations, with NK and GK performing better than KM. NN performs well when one cluster contains more observations than any other cluster. The two hybrid models perform at least as well as KM, even though the environments favor KM: the most crucial information, the true number of clusters, is provided to KM only. In addition, the cluster structures are simple: the clusters are well separated; the variances and cluster sizes are uniform; the correlations between pairs of variables and collinearity problems are not significant; and the observations are normally distributed. The real-life data comprise three problems with a known natural cluster structure and one problem with an unknown natural cluster structure. Overall results indicate that GK performs better than KM, while NK is the worst performing of the five approaches. The two machine learning approaches generate better results than KM in an environment that does not favor KM. GK has been shown to be the best or among the best both in the simulated environment and in real-life situations. Furthermore, GK detects firms with promising financial prospects, such as acquisition targets and firms with a “buy” recommendation, better than all other approaches.
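
    The two-stage idea described above (a screening stage chooses seeds, then k-means refines them) can be sketched as follows. This is a minimal illustration only: the abstract does not specify the NN or GA used, so the screening stage here is a plainly simpler stand-in that draws random candidate seed sets and keeps the one with the lowest sum of squared errors. All function names (`screen_seeds`, `kmeans`, `sse`) and the `n_candidates` parameter are assumptions made for the sketch.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]


def sse(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def screen_seeds(points, k, n_candidates=20, rng=None):
    """Stage 1 (stand-in for the NN/GA screening): try several random
    seed sets and keep the one with the lowest SSE."""
    rng = rng or random.Random(0)
    candidates = [rng.sample(points, k) for _ in range(n_candidates)]
    return min(candidates, key=lambda seeds: sse(points, seeds))


def kmeans(points, seeds, max_iter=100):
    """Stage 2: standard Lloyd iterations started from the given seeds."""
    centers = [tuple(s) for s in seeds]
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # New center is the coordinate-wise mean of the members.
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(old)  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)
```

    A call such as `kmeans(points, screen_seeds(points, k))` chains the two stages; replacing `screen_seeds` with a GA- or NN-based selector would give the GK or NK arrangement in outline.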

    A Comparative Investigation of K-means and Partition Around Medoid Methods of Clustering - a Case Study with Acute Lymphoblastic Leukemia Data

    Clustering methods are important tools in data mining. The main challenges of clustering are selecting a suitable method for a given data set and estimating the number of clusters in it, especially in the unsupervised setting. In this paper, two important partitioning clustering methods, K-means and Partition Around Medoid (PAM), are compared, and a dedicated index is used for each to estimate the number of clusters. Several internal validation and stability measures are also used to compare the performance of the two methods on B-cell and T-cell data. For B-cells, K-means performs better than PAM by the Connectivity, Dunn, Silhouette, APN, ADM, and FOM indices, while PAM performs better than K-means by the AD index. For T-cells, PAM performs better than K-means by the Connectivity index, while K-means performs better than PAM by the Dunn, Silhouette, APN, AD, ADM, and FOM indices.
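
    As an illustration of how one of the internal validation measures named above is computed, the following is a minimal sketch of the mean silhouette width for a given partition. It assumes Euclidean distance and the usual convention that a point in a singleton cluster scores 0; the function name `silhouette` is chosen for the sketch and is not taken from the paper.

```python
import math


def silhouette(points, labels):
    """Mean silhouette width of a partition (Euclidean distance)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    widths = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            widths.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the distance from p to itself is 0, so it does not affect the sum)
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != lab)
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return sum(widths) / len(widths)
```

    Given two labelings of the same data, for example one from K-means and one from PAM, the labeling with the higher mean silhouette width is the one preferred by this index.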

    Development of an Optimal Replenishment Policy for Human Capital Inventory

    A unique approach is developed for evaluating Human Capital (workforce) requirements. With this approach, new ways of measuring personnel availability are proposed to ensure that an organization remains ready to provide timely, relevant, and accurate products and services in support of its strategic objectives over its planning horizon. The analysis and methodology were developed as an alternative to existing studies for determining appropriate hiring and attrition rates and for maintaining personnel at levels of effectiveness that support existing and future missions. The contribution of this research is a prescribed method for the strategic analyst to incorporate a personnel and cost simulation model within the framework of Human Resources Human Capital forecasting, which can be used to project personnel requirements and evaluate workforce sustainment, at least cost, through time. This will allow personnel managers to evaluate multiple resource strategies, present and future, while maintaining near “perfect” hiring and attrition policies to support the organization's future Human Capital assets.

    The Strategic Orientation of Skilled Nursing Facilities

    Since the early 2000s, skilled nursing facilities (SNFs) have operated in an environment made uncertain by changes in health care policy, growth in substitutes for nursing care, and increasing demand for services. To better understand how SNFs are strategically positioning themselves to survive and thrive, this study develops a taxonomy of strategic groups of SNFs. The conceptual framework is grounded in Strategic Management Theory, and the classification of SNFs is based on scope-of-business decisions, including length of stay, complexity of patients, and referral networks with hospitals. Two-step, hierarchical cluster analysis finds six strategy groups of SNFs: Post-Acute Care Focus – Wide Network, Private Pay Focus – Narrow Network, High Acuity Care Focus – Wide Network, Intermediate Care Focus – Wide Network, Long-Stay Care Focus – Narrow Network, and Long-Stay Complex Care Focus – Narrow Network. Support is found for a structure-performance link between membership in a particular strategy group and financial and quality performance. A longitudinal analysis finds stability in the structure of the groups, but fluidity of movement from one strategy group to another. A comparison of strategy groups with those in prior studies suggests that changes in reimbursement policies and industry trends align with shifts in strategy. This study contributes to the understanding of how SNFs adjust strategically to environmental uncertainty and provides a unique assessment of the relational dynamics of referrals to SNFs from hospitals. A better understanding of the industry structure can benefit managers as they make strategic decisions and help policymakers better target funding and policy changes to improve patient outcomes.


    The main objective of this study is to empirically test a fourth-order hierarchical model of experiential value in an online book and CD setting. In addition, we provide empirical evidence for the role of hedonic and utilitarian value components in creating attitudinal and behavioral loyalty. Finally, we develop an online customer typology based on the underlying value sources. Using a sample of 190 visitors to online book and CD retailers, we applied PLS to test a third- and fourth-order hierarchical model of experiential value, emphasizing a hedonic (intrinsic) and a utilitarian (extrinsic) value component and the existence of the holistic concept of experiential value. Our results demonstrate that experiential value consists of the third-order components hedonic (intrinsic) and utilitarian (extrinsic) value. Both value aspects affect attitudinal loyalty, which ultimately leads to behavioral loyalty; behavioral loyalty is also directly affected by utilitarian value. Finally, a nonhierarchical (k-means) cluster analysis identified four segments of online visitors: hedonists, utilitarians, active negativists, and reactive positivists.

    A Study on the Development of a Sound Quality Index for Quantifying Sporty Vehicle Engine Sound and on Methods for Improving Its Accuracy

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Mechanical and Aerospace Engineering, August 2020. 강연준.

    Developments in vehicle technology and accompanying improvements in NVH performance have led to increased consumer demand for high sound quality, such as a sporty engine sound. As sporty sound is subjective, this thesis sought to express its meaning quantitatively and to develop a model that accommodates differences in individuals' tastes. The thesis tackles two main issues. The first is to assess the efficiency of factor analysis when it is used to develop a sound quality index of sportiness. The second is to further improve the accuracy of the sound quality index and to refine the definition of sportiness by adding K-means cluster analysis. Chapters 2 and 3 present the initial procedure for developing the sportiness index. They describe how the interior operating sound of 4 different vehicles was recorded under wide open throttle acceleration and how 13 evaluation samples were produced using parametric band-pass filtering. Acoustic and psychoacoustic parameters of the samples were calculated, and preferences for sportiness were identified through jury testing. The jury test was carried out by 23 evaluators, and a semantic differential method was used to find adjectives that could explain the concept of, and preference for, sportiness. The sportiness index was developed using factor analysis and multiple linear regression analysis between the calculated values and the jury test results. The index was then validated by examining the correlation coefficient on a new sample group. The necessity of factor analysis for developing the sportiness index was also established. In Chapter 4, after K-means clustering, factor analysis and multiple linear regression analysis were repeated to develop a model reflecting the differences in evaluators' tastes for each group. The improved index was also retested using new evaluators and new samples, demonstrating its reliability through the high correlation observed in the validation studies. This sound quality evaluation index is useful for producing highly accurate results and for reflecting the opinions of groups expressing a variety of commonalities.

    Abstract in Korean (translated): With advances in vehicle development technology, the NVH performance of vehicles has improved considerably, and as a result consumer demand continues to grow for sound quality, that is, for pleasant sound, rather than for noise reduction. A sporty engine sound falls into this category; it is a subjective concept, since the image it evokes differs from person to person and tastes in sound vary. This study was therefore conducted to find an objective meaning of the concept through sound quality research, to express it quantitatively, and to find a method that can accommodate differences in taste. The thesis focuses on two main points. The first is to confirm the efficiency of factor analysis by using it in developing a sound quality index of sportiness; the second is to further improve the accuracy of the index and to make the meaning of sportiness more concrete by adding K-means cluster analysis. Accordingly, in Chapters 2 and 3, the engine sound of four mass-produced vehicles was recorded under wide open throttle conditions, and 13 samples were produced by modifying the recorded signals with a parametric band-pass filter. Psychoacoustic parameters of the samples were calculated, and preferences for sportiness were identified through a listening evaluation. Twenty-three evaluators took part in the listening evaluation, and the semantic differential method was used to find the preference for sportiness and the adjectives that describe sportiness well. By applying these results to factor analysis, the characteristics of sportiness that people commonly perceive were expressed as two factors, and multiple linear regression analysis on the evaluation results was used to develop a sportiness quantification index expressed in terms of the relevant sound quality parameters. The validity of the index was confirmed by checking the correlation coefficient on a new sample group. The necessity of factor analysis was also discussed by comparing the regression results obtained with and without it. In Chapter 4, based on the differences in evaluators' inclinations toward sportiness, K-means cluster analysis was used and factor analysis and multiple linear regression analysis were repeated to develop a regression equation suited to each group. To secure the reliability of the developed index, it was retested with new evaluators, and its reliability was demonstrated by the high correlation coefficient. In conclusion, the sound quality evaluation index developed in this study is a useful index that, in defining sportiness objectively, can also reflect the opinions of groups showing other commonalities and that yields highly accurate results.

    Contents: Chapter 1 Introduction. Chapter 2 Sound Quality Evaluation of Vehicle Engine Sportiness: 2.1 Introduction; 2.2 Sound recording and objective evaluation of engine sound (2.2.1 Recording of interior sound; 2.2.2 Production of sound samples; 2.2.3 Calculation of objective acoustic and psychoacoustic parameters: sound pressure level, loudness, sharpness, roughness, tonality); 2.3 Subjective evaluation of sound quality (2.3.1 Semantic differential method and pre-test; 2.3.2 Jury testing). Chapter 3 Development of Evaluation Index of Sporty Engine Sound: Using Factor Analysis: 3.1 Introduction; 3.2 Factor analysis; 3.3 Regression analysis (3.3.1 Multiple linear regression; 3.3.2 Development of a sound quality index for sportiness); 3.4 Validation; 3.5 Summary. Chapter 4 New Approach to Development of Evaluation Index of Sporty Engine Sound: Using K-means Cluster Analysis: 4.1 Introduction; 4.2 Statistical analysis (4.2.1 K-means cluster analysis; 4.2.2 Factor analysis after K-means clustering; 4.2.3 Regression analysis after K-means clustering); 4.3 Validation; 4.4 Summary. Chapter 5 Conclusions. References. Appendix. Abstract in Korean.

    Soft clustering using real-world data for the identification of multimorbidity patterns in an elderly population: Cross-sectional study in a Mediterranean population

    The aim of this study was to identify, with soft clustering methods, multimorbidity patterns in the electronic health records of a population aged ≥65 years, and to analyse such patterns in accordance with the different prevalence cut-off points applied. Fuzzy cluster analysis allows individuals to be linked simultaneously to multiple clusters and is more consistent with clinical experience than other approaches frequently found in the literature.
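
    To make concrete what it means for an individual to be linked to several clusters at once, the following minimal sketch computes membership degrees using the standard fuzzy c-means membership formula. This is an assumption for illustration: the abstract does not say which fuzzy algorithm or parameters the study used, and the function name `fuzzy_memberships` and the default fuzzifier `m=2.0` are chosen for the sketch. Only the membership step is shown, not a full clustering run.

```python
import math


def fuzzy_memberships(point, centers, m=2.0):
    """Membership degree of one point in each cluster; the degrees sum to 1.

    Uses u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), where d_i is the
    Euclidean distance from the point to center i and m > 1 is the fuzzifier.
    """
    if m <= 1.0:
        raise ValueError("the fuzzifier m must be greater than 1")
    dists = [math.dist(point, c) for c in centers]
    # A point that coincides with one or more centers belongs fully to them.
    zeros = [i for i, d in enumerate(dists) if d == 0.0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0
                for i in range(len(centers))]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists)
            for d_i in dists]
```

    A point midway between two centers receives a membership of 0.5 in each, whereas a hard method such as k-means would have to assign it entirely to one cluster.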