
916 research outputs found

Medoid-based clustering using ant colony optimization

Author: Camacho D.
Menéndez H.
Otero F.
Publication venue: Springer
Publication date: 01/01/2016

The application of ACO-based algorithms in data mining has been growing over the last few years, and several supervised and unsupervised learning algorithms have been developed using this bio-inspired approach. Most recent works about unsupervised learning have focused on clustering, showing the potential of ACO-based techniques. However, there are still clustering areas that are almost unexplored using these techniques, such as medoid-based clustering. Medoid-based clustering methods are helpful, compared to classical centroid-based techniques, when centroids cannot be easily defined. This paper proposes two medoid-based ACO clustering algorithms, where the only information needed is the distance between data: one algorithm that uses an ACO procedure to determine an optimal medoid set (METACOC algorithm) and another algorithm that uses an automatic selection of the number of clusters (METACOC-K algorithm). The proposed algorithms are compared against classical clustering approaches using synthetic and real-world datasets.
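To illustrate what "the only information needed is the distance between data" means in practice, here is a minimal sketch of medoid-based clustering on a precomputed distance matrix. It uses a plain alternating (assign, then update) k-medoids heuristic, not the ACO-based METACOC procedure described in the abstract; the function name `k_medoids`, its parameters, and the toy data are assumptions made for this example only.

```python
import random


def k_medoids(dist, k, max_iter=100, seed=0):
    """Alternating k-medoids on a precomputed distance matrix (list of lists).

    Illustrative heuristic only; this is NOT the ACO-based METACOC algorithm.
    """
    n = len(dist)
    rng = random.Random(seed)
    medoids = rng.sample(range(n), k)  # random initial medoids (indices)

    def assign(current):
        # Each point joins the cluster of its nearest medoid.
        return [min(range(k), key=lambda c: dist[i][current[c]]) for i in range(n)]

    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                # Keep the old medoid if a cluster ends up empty.
                new_medoids.append(medoids[c])
                continue
            # New medoid: the member with the smallest total distance to its cluster.
            new_medoids.append(min(members, key=lambda m: sum(dist[m][j] for j in members)))
        if new_medoids == medoids:
            break  # converged: medoids no longer change
        medoids = new_medoids
    return medoids, assign(medoids)


# Toy data (hypothetical): two well-separated groups on a line.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in points] for a in points]
medoids, labels = k_medoids(dist, k=2)
print(sorted(medoids), labels)
```

Because only `dist` is consulted, the same routine works for any objects for which a pairwise dissimilarity can be defined, which is the situation where centroids are hard to define.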

Middlesex University Research Repository

Capuchin Search Particle Swarm Optimization (CS-PSO) based Optimized Approach to Improve the QoS Provisioning in Cloud Computing Environment

Author: Gupta Bhumika
Gupta Manila
Singh Devendra
Publication venue: Auricle Global Society of Education and Research
Publication date: 11/06/2023

This paper introduces a method for improving resource allocation in cloud computing environments under QoS constraints. Resource allocation typically affects the quality of service (QoS) of cloud providers, and QoS constraints such as response time, throughput, waiting time, and makespan are key factors to take into account. The approach uses Capuchin Search Particle Swarm Optimization (CS-PSO) to optimize resource allocation while taking QoS constraints into account. Its objectives include throughput, response time, makespan, waiting time, and resource utilization. The method partitions the resources using the K-medoids clustering scheme. During clustering, tasks are divided into two clusters, and the resource allocation method is refined to obtain the optimal configuration. The experimental setup uses Java and the GWA-T-12 Bitbrains dataset for simulation. The extreme value optimization problem of the multivariable function is addressed using the improved algorithm. The simulation findings indicate that the baseline (Cloud Particle Swarm Optimization, CPSO) algorithm did not reach convergence within 500 generations in the reported runs. The comparative analysis shows that the developed model outperforms the state-of-the-art approaches. Overall, the approach provides an effective procedure for improving resource allocation in cloud computing environments and can be applied to a variety of resource allocation problems, such as virtual machine configuration and task scheduling.
As a result, the Capuchin Search Particle Swarm Optimization algorithm (CS-PSO) supports the success of the optimization through simple algebra, fast convergence speed, high efficiency, and good population diversity.

International Journal on Recent and Innovation Trends in Computing and Communication

A statistical learning based approach for parameter fine-tuning of metaheuristics

Author: Calvet Liñán Laura
Juan Pérez Ángel Alejandro
Ries Jana
Serrat Piè Carles
Publication date: 01/01/2016

Metaheuristics are approximation methods used to solve combinatorial optimization problems. Their performance usually depends on a set of parameters that need to be adjusted. The selection of appropriate parameter values causes a loss of efficiency, as it requires time, and advanced analytical and problem-specific skills. This paper provides an overview of the principal approaches to tackle the Parameter Setting Problem, focusing on the statistical procedures employed so far by the scientific community. In addition, a novel methodology is proposed, which is tested using an already existing algorithm for solving the Multi-Depot Vehicle Routing Problem. Peer Reviewed. Postprint (published version)

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

The Oberta in open access

Medoid-based clustering using ant colony optimization

Author: A Hamdi
AP Dempster
C Fraley
D Martens
David Camacho
E Hruschka
E Keogh
F França
Fernando E. B. Otero
H Menéndez
HD Menéndez
Héctor D. Menéndez
J Demšar
J Handl
L Cao
L Hubert
M Dorigo
OM Jafar
PJ Rousseeuw
R Tibshirani
T Liao
X Wang
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016

Crossref

Springer - Publisher Connector

UCL Discovery

Middlesex University Research Repository

Kent Academic Repository

Design of Cluster-Randomized Trials with Cross-Cluster Interference

Author: Leung Michael P.
Publication date: 19/11/2023

Cluster-randomized trials often involve units that are irregularly distributed in space without well-separated communities. In these settings, cluster construction is a critical aspect of the design due to the potential for cross-cluster interference. The existing literature relies on partial interference models, which take clusters as given and assume no cross-cluster interference. We relax this assumption by allowing interference to decay with geographic distance between units. This induces a bias-variance trade-off: constructing fewer, larger clusters reduces bias due to interference but increases variance. We propose new estimators that exclude units most potentially impacted by cross-cluster interference and show that this substantially reduces asymptotic bias relative to conventional difference-in-means estimators. We provide formal justification for a new design that chooses the number of clusters to balance the asymptotic bias and variance of our estimators and uses unsupervised learning to automate cluster construction.

arXiv.org e-Print Archive

Knowledge Extraction From PV Power Generation With Deep Learning Autoencoder and Clustering-Based Algorithms

Author: Brenna M.
Longo M.
Miraftabzadeh S.
Publication date: 01/01/2023

The unpredictable nature of photovoltaic solar power generation, caused by changing weather conditions, creates challenges for grid operators as they work to balance supply and demand. As solar power continues to become a larger part of the energy mix, managing this intermittency will be increasingly important. This paper focuses on identifying daily photovoltaic power production patterns to gain new knowledge of the generation patterns throughout the year based on unsupervised learning algorithms. The proposed data-driven model aims to extract typical daily photovoltaic power generation patterns by transforming the high-dimensional temporal features of the daily PV power output into a lower-dimensional latent feature space, which is learned by a deep learning autoencoder. Subsequently, the Partitioning Around Medoids (PAM) clustering algorithm is employed to identify the six distinct dominant patterns. Finally, a new algorithm is proposed to reconstruct these patterns in their original subspace. The proposed model is applied to two distinct datasets for further analysis. The results indicate that four of the identified patterns in both datasets exhibit high correlation (over 95%) and temporal trends. These patterns correspond to distinct weather conditions, such as entirely sunny, mostly sunny, cloudy, and negligible power generation days, which were observed during approximately 61% of the analyzed period. These typical patterns can be expected to be observed in other locations as well. Identified PV power generation patterns can improve forecasting models, optimize energy management systems, and aid in implementing energy storage or demand response programs and scheduling efficiently.

Archivio istituzionale della ricerca - Politecnico di Milano

Medoid Silhouette clustering with automatic cluster number selection

Author: Lenssen Lars
Schubert Erich
Publication date: 07/09/2023

The evaluation of clustering results is difficult and highly dependent on the evaluated data set and the perspective of the beholder. There are many different clustering quality measures, which try to provide a general measure to validate clustering results. A very popular measure is the Silhouette. We discuss the efficient medoid-based variant of the Silhouette, perform a theoretical analysis of its properties, provide two fast versions for the direct optimization, and discuss its use for choosing the optimal number of clusters. We combine ideas from the original Silhouette with the well-known PAM algorithm and its latest improvement, FasterPAM. One of the versions guarantees equal results to the original variant and provides a run-time speedup of $O(k^2)$. In experiments on real data with 30000 samples and $k=100$, we observed a $10464\times$ speedup compared to the original PAMMEDSIL algorithm. Additionally, we provide a variant to choose the optimal number of clusters directly. Comment: arXiv admin note: substantial text overlap with arXiv:2209.1255
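As a rough illustration of a medoid-based Silhouette, the sketch below scores a set of medoids by averaging 1 - d1/d2 over all points, where d1 and d2 are a point's distances to its nearest and second-nearest medoid. This follows the commonly stated form of the medoid Silhouette and is not the optimized algorithm from the paper. The function name `medoid_silhouette`, the treatment of d2 = 0 as a score of 1, and the toy data are assumptions of this example.

```python
def medoid_silhouette(dist, medoids):
    """Average medoid-based Silhouette for a precomputed distance matrix.

    Per-point score: 1 - d1/d2, with d1 and d2 the distances to the nearest
    and second-nearest medoid. Treating d2 == 0 as a score of 1 is an
    assumption made for this sketch.
    """
    if len(medoids) < 2:
        raise ValueError("at least two medoids are required")
    total = 0.0
    for row in dist:
        # Two smallest distances from this point to any medoid.
        d1, d2 = sorted(row[m] for m in medoids)[:2]
        total += 1.0 if d2 == 0 else 1.0 - d1 / d2
    return total / len(dist)


# Toy data (hypothetical): two well-separated groups on a line.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in points] for a in points]

good = medoid_silhouette(dist, [1, 4])  # one medoid per group
bad = medoid_silhouette(dist, [0, 1])   # both medoids in the same group
print(good > bad)
```

Comparing this score across candidate medoid sets of different sizes is one simple way to pick a number of clusters, at the cost of evaluating each candidate separately.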

arXiv.org e-Print Archive