    Improved point center algorithm for K-Means clustering to increase software defect prediction

    K-means is a widely used clustering algorithm that is easy to apply. However, it is sensitive to its randomly chosen initial centroids and therefore may not produce optimal results. This research aimed to improve the performance of k-means by applying a proposed algorithm called point center. The proposed algorithm replaces the random initial centroid values in k-means and was then applied to predict errors in software defect modules. The point center algorithm determines the initial centroid values for k-means optimization; the selection of X and Y variables then determines the cluster center members. Ten datasets were used for testing, nine of which were software defect prediction datasets. The proposed point center algorithm showed the lowest errors. It also improved the k-means algorithm’s performance by an average of 12.82% in cluster errors compared with simple k-means using randomly obtained centroid values. The findings contribute to the development of clustering models for handling data, such as predicting software defect modules more accurately.
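
The sensitivity to initial centroids described in this abstract can be illustrated with a minimal sketch. It is not the paper's point center algorithm, which the abstract does not specify in detail. The function names (`kmeans`, `sq_dist`), the toy one-dimensional data, and both sets of starting centroids are assumptions made purely for illustration.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Index of the nearest centroid for each point (ties go to the lowest index)."""
    return [
        min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
        for p in points
    ]


def kmeans(points, init_centroids, max_iter=100):
    """Plain Lloyd-style k-means started from caller-supplied centroids.

    Returns (centroids, labels, sse), where sse is the sum of squared
    distances from each point to its assigned centroid.
    """
    centroids = [tuple(c) for c in init_centroids]
    for _ in range(max_iter):
        labels = assign(points, centroids)
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                # Coordinate-wise mean of the cluster members.
                new_centroids.append(
                    tuple(sum(x) / len(members) for x in zip(*members))
                )
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(old)
        if new_centroids == centroids:
            break
        centroids = new_centroids
    labels = assign(points, centroids)
    sse = sum(sq_dist(p, centroids[l]) for p, l in zip(points, labels))
    return centroids, labels, sse


# Hypothetical toy data: three well-separated groups on a line.
points = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,),
          (20.0,), (21.0,), (22.0,)]

# An unlucky start: all three centroids fall inside the first group.
bad_init = [(0.0,), (1.0,), (2.0,)]
# A well-spread start: one centroid per group (chosen by hand here).
good_init = [(1.0,), (11.0,), (21.0,)]

_, _, sse_bad = kmeans(points, bad_init)
_, _, sse_good = kmeans(points, good_init)
print(sse_bad, sse_good)  # 154.5 6.0
```

On this toy data the unlucky start converges to a solution that merges two of the groups, while the well-spread start recovers all three. Any deterministic initialization scheme aims to avoid the first outcome.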

    Classification of solar variability using k-means method for the evaluation of solar photovoltaic systems performance

    This paper presents a classification of solar tilt irradiance using the k-means clustering method, and an evaluation of the impact of different solar variabilities on monocrystalline and thin-film photovoltaic (PV) systems. The variability index and clearness index were used to quantify five years of solar datasets to assist in clustering solar variabilities. The elbow method was used to validate the k-clustering for solar variabilities. Because the solar datasets were compact, the Silhouette Coefficient and Gap Statistic were also used to validate the k-cluster numbers. The PV performance was evaluated using the generated power, energy, and performance ratio for solar datasets from 2015 and 2019. An equal number of samples was taken from each PV system to analyse the average calculated values. The results showed that the elbow method was inaccurate for clustering solar variabilities: it showed only a weak elbow at K2, which was unsuitable for grouping them. However, the k-means validation methods detected K3, K4, and K5 as the best k-cluster numbers. Among them, K4 was suitable for separating four types of solar variability, namely overcast, moderate, mixed (clear/mild), and high variability. Based on the average performance values of the monocrystalline and thin-film PV systems for 2015 compared to 2019, similar degradation values were detected, especially for the performance ratio (0.77) under overcast conditions. The thin-film system showed degraded generated power and energy under the moderate type. The degraded generated power and performance ratio for the monocrystalline system were due to the high passing clouds under the mixed and high variability types.
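
As a rough illustration of how a silhouette coefficient can be used to compare candidate cluster counts, the sketch below scores two hand-made labelings of the same points. It uses the standard silhouette definition, s = (b - a) / max(a, b), where a is the mean distance to the other members of a point's own cluster and b is the smallest mean distance to any other cluster. The data, the labelings, and the function name `silhouette_score` are hypothetical and are not taken from the paper's solar datasets.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points.

    Singleton clusters score 0 by convention.
    """
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            scores.append(0.0)
            continue
        # The distance from p to itself is 0, so dividing by len(own) - 1
        # gives the mean distance to the *other* members of its cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for m, other in clusters.items()
            if m != l
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: three well-separated groups on a line.
points = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,),
          (20.0,), (21.0,), (22.0,)]

# Two candidate groupings: k = 3 keeps the groups apart,
# k = 2 merges the first two groups into one cluster.
labels_k3 = [0, 0, 0, 1, 1, 1, 2, 2, 2]
labels_k2 = [0, 0, 0, 0, 0, 0, 1, 1, 1]

score_k3 = silhouette_score(points, labels_k3)
score_k2 = silhouette_score(points, labels_k2)
print(round(score_k3, 3), round(score_k2, 3))
```

A higher mean silhouette indicates tighter, better-separated clusters, so on this toy data the three-cluster labeling scores higher than the two-cluster one. In practice the labelings would come from running k-means once for each candidate k.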