59 research outputs found

    Impact of contamination on training and test error rates in statistical clustering

    The k-means algorithm is one of the most common nonhierarchical clustering methods. It constructs clusters so as to minimize the within-cluster sum of squared distances. However, like most estimators defined through objective functions that depend on global sums of squares, the k-means procedure is not robust to atypical observations in the data. Alternative techniques have thus been introduced in the literature, e.g. the k-medoids method. The k-means and k-medoids methodologies are particular cases of the generalized k-means procedure. This paper focuses on the error rate these clustering procedures achieve when the data are expected to follow a mixture distribution. Two different definitions of the error rate are considered, depending on the data at hand. It is shown that contamination may make one of these two error rates decrease even under optimal models. The consequences are highlighted by comparing the influence functions and breakdown points of these error rates.
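The lack of robustness mentioned in the abstract above can be seen in a small one-cluster, one-dimensional illustration. This is only a sketch under stated assumptions, not the procedure studied in the paper: the function names `mean_center` and `medoid_center` and the toy data are hypothetical, and the medoid here uses absolute distance as its dissimilarity.

```python
def mean_center(points):
    # The center minimizing the sum of squared distances is the arithmetic mean.
    return sum(points) / len(points)


def medoid_center(points):
    # Assumed dissimilarity: absolute distance. The medoid is the data point
    # with the smallest total distance to all points in the cluster.
    return min(points, key=lambda c: sum(abs(p - c) for p in points))


clean = [1.0, 2.0, 3.0, 4.0, 5.0]
contaminated = clean + [100.0]  # one atypical observation (toy data)

mean_shift = abs(mean_center(contaminated) - mean_center(clean))
medoid_shift = abs(medoid_center(contaminated) - medoid_center(clean))
print(mean_shift, medoid_shift)
```

A single outlier pulls the squared-distance center far from the bulk of the data, while the medoid stays among the original observations.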

    The Effect of the Moral Reasoning Method on Instilling the Character of Nationalism in Elementary School Students in Thematic Learning

    This study aims to determine the effect of the moral reasoning method on instilling the character of nationalism in elementary school students in thematic learning. The study is a quasi-experiment with a nonequivalent control group design. The subjects were fifth-grade students at SD N Ngebel Kasihan. SD N Ngebel has two classes: class V A served as the control group and used the storytelling method, and class V B served as the experimental group and used the moral reasoning method. Data were collected through observation and interviews. The data were analyzed with a t-test at a significance level of 0.05. The results show a significant difference between instilling the character of nationalism with the moral reasoning method and with the storytelling method. The difference appears in all sub-characters of nationalism: the value of Belief in the One and Only God (t-test result 0.155), Just and Civilized Humanity (t-test result 0.129), the Unity of Indonesia (t-test result 0.405), Democracy Guided by the Inner Wisdom in the Unanimity Arising out of Deliberations among Representatives (t-test result 0.529), and Social Justice for All the People of Indonesia (t-test result 0.608).

    Robust principal component analysis based on trimming around affine subspaces

    Principal Component Analysis (PCA) is a widely used technique for reducing the dimensionality of multivariate data. The principal component subspace is defined as the affine subspace of a given dimension d giving the best fit to the data. PCA suffers from a well-known lack of robustness. As a robust alternative, one can resort to an approach based on impartial trimming and search for the best subsample containing a proportion 1 − α of the observations, with 0 < α < 1, and the best d-dimensional affine subspace fitting this subsample, yielding the trimmed principal component subspace. A population version is given, and the existence of solutions to both the sample and population problems is proven. Moreover, under mild conditions, the solutions of the sample problem are consistent for the solutions of the population problem. The robustness of the method is studied by proving qualitative robustness, computing the breakdown point, and deriving the influence functions. Furthermore, asymptotic efficiencies at the normal model are derived, and finite-sample efficiencies are studied by means of a simulation study.
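The sample problem described in the abstract above (best subsample of proportion 1 − α together with the best-fitting d-dimensional affine subspace) can be sketched by brute force for a tiny two-dimensional data set with d = 1. This is an illustration only and not the authors' algorithm: exhaustive search over subsets is feasible only for very small samples, and the names `line_fit_residual` and `trimmed_line_fit`, the rounding of the subsample size with `ceil`, and the toy data are all assumptions.

```python
from itertools import combinations
from math import ceil, sqrt


def line_fit_residual(points):
    """Sum of squared orthogonal distances to the best-fitting line in 2-D."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    # This residual equals the smaller eigenvalue of the 2x2 scatter matrix
    # [[sxx, sxy], [sxy, syy]].
    return (sxx + syy) / 2 - sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)


def trimmed_line_fit(points, alpha):
    # Assumed convention: keep h = ceil((1 - alpha) * n) observations.
    h = ceil((1 - alpha) * len(points))
    # Exhaustive search: every subsample of size h is fitted and compared.
    best = min(
        combinations(range(len(points)), h),
        key=lambda idx: line_fit_residual([points[i] for i in idx]),
    )
    return best, line_fit_residual([points[i] for i in best])


# Toy data: ten points on the line y = x plus two atypical observations.
data = [(float(i), float(i)) for i in range(10)] + [(0.0, 20.0), (1.0, 25.0)]
kept, residual = trimmed_line_fit(data, alpha=0.2)
print(kept, residual)
```

With these toy data the retained subsample is the ten collinear points, so the two atypical observations are the ones trimmed away.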