17 research outputs found

    Linear, Deterministic, and Order-Invariant Initialization Methods for the K-Means Clustering Algorithm

    Over the past five decades, k-means has become the clustering algorithm of choice in many application domains primarily due to its simplicity, time/space efficiency, and invariance to the ordering of the data points. Unfortunately, the algorithm's sensitivity to the initial selection of the cluster centers remains its most serious drawback. Numerous initialization methods have been proposed to address this drawback. Many of these methods, however, have time complexity superlinear in the number of data points, which makes them impractical for large data sets. On the other hand, linear methods are often random and/or sensitive to the ordering of the data points. These methods are generally unreliable in that the quality of their results is unpredictable. Therefore, it is common practice to perform multiple runs of such methods and take the output of the run that produces the best results. Such a practice, however, greatly increases the computational requirements of the otherwise highly efficient k-means algorithm. In this chapter, we investigate the empirical performance of six linear, deterministic (non-random), and order-invariant k-means initialization methods on a large and diverse collection of data sets from the UCI Machine Learning Repository. The results demonstrate that two relatively unknown hierarchical initialization methods due to Su and Dy outperform the remaining four methods with respect to two objective effectiveness criteria. In addition, a recent method due to Erisoglu et al. performs surprisingly poorly. Comment: 21 pages, 2 figures, 5 tables, Partitional Clustering Algorithms (Springer, 2014). arXiv admin note: substantial text overlap with arXiv:1304.7465, arXiv:1209.196

    Global variation in anastomosis and end colostomy formation following left-sided colorectal resection

    Background: End colostomy rates following colorectal resection vary across institutions in high-income settings, being influenced by patient, disease, surgeon and system factors. This study aimed to assess global variation in end colostomy rates after left-sided colorectal resection. Methods: This study comprised an analysis of GlobalSurg-1 and -2 international, prospective, observational cohort studies (2014, 2016), including consecutive adult patients undergoing elective or emergency left-sided colorectal resection within discrete 2-week windows. Countries were grouped into high-, middle- and low-income tertiles according to the United Nations Human Development Index (HDI). Factors associated with colostomy formation versus primary anastomosis were explored using a multilevel, multivariable logistic regression model. Results: In total, 1635 patients from 242 hospitals in 57 countries undergoing left-sided colorectal resection were included: 113 (6·9 per cent) from low-HDI, 254 (15·5 per cent) from middle-HDI and 1268 (77·6 per cent) from high-HDI countries. There was a higher proportion of patients with perforated disease (57·5, 40·9 and 35·4 per cent; P < 0·001) and subsequent use of end colostomy (52·2, 24·8 and 18·9 per cent; P < 0·001) in low- compared with middle- and high-HDI settings. The association with colostomy use in low-HDI settings persisted (odds ratio (OR) 3·20, 95 per cent c.i. 1·35 to 7·57; P = 0·008) after risk adjustment for malignant disease (OR 2·34, 1·65 to 3·32; P < 0·001), emergency surgery (OR 4·08, 2·73 to 6·10; P < 0·001), time to operation at least 48 h (OR 1·99, 1·28 to 3·09; P = 0·002) and disease perforation (OR 4·00, 2·81 to 5·69; P < 0·001). Conclusion: Global differences existed in the proportion of patients receiving end stomas after left-sided colorectal resection based on income, which went beyond case mix alone.

    A Survey of File Organizations and Performance
