
    “Cooperation Greedy Monkey Algorithm”: A parallel algorithm for solving the strongly correlated class of the 0-1 knapsack problem

    This work presents the parallelization of the Cooperation Greedy Monkey Algorithm and its parameter tuning for solving the 0-1 Knapsack Problem (0-1 KP). The problems solved are taken from the specialized literature, up to the instances established by Pisinger: the uncorrelated, the weakly correlated, and the strongly correlated classes. The solution capability of the algorithm is extended to instances whose knapsack capacity is 25% or 50% of the sum of the item weights, and not only the 75% for which the algorithm was originally designed. A master-slave model was used for the parallel implementation on a cluster of 5 servers. The results are encouraging, and in some cases the optimal solution is obtained.
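
    The abstract describes a master-slave layout rather than specific code. As a rough, hypothetical illustration (not the authors' Cooperation Greedy Monkey Algorithm), the sketch below shows how such a scheme can be arranged in Python: a master process builds a 0-1 knapsack instance whose capacity is a chosen fraction (25%, 50%, or 75%) of the total item weight and dispatches independent randomized greedy searches to 5 workers, keeping the best result. All names and parameters are illustrative assumptions.

```python
# Illustrative master-slave sketch for the 0-1 knapsack (not the paper's algorithm).
import random
from multiprocessing import Pool

def make_instance(n=100, capacity_ratio=0.5, seed=0):
    """Build a strongly correlated instance (profit = weight + constant, illustrative)."""
    rng = random.Random(seed)
    weights = [rng.randint(1, 1000) for _ in range(n)]
    profits = [w + 100 for w in weights]
    capacity = int(capacity_ratio * sum(weights))   # 25%, 50% or 75% of the weight sum
    return weights, profits, capacity

def randomized_greedy(args):
    """One worker: greedy by profit/weight ratio with a small random perturbation."""
    weights, profits, capacity, seed = args
    rng = random.Random(seed)
    order = sorted(range(len(weights)),
                   key=lambda i: profits[i] / weights[i] + rng.random() * 1e-3,
                   reverse=True)
    total_w = total_p = 0
    for i in order:
        if total_w + weights[i] <= capacity:
            total_w += weights[i]
            total_p += profits[i]
    return total_p

if __name__ == "__main__":
    weights, profits, capacity = make_instance(capacity_ratio=0.5)
    tasks = [(weights, profits, capacity, s) for s in range(5)]  # one task per "slave"
    with Pool(processes=5) as pool:        # master process coordinates 5 workers
        best = max(pool.map(randomized_greedy, tasks))
    print("best profit found:", best)
```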

    A new heuristic strategy for the Bin Packing problem

    The Bin Packing Problem (BPP) is NP-hard, so exact methods for solving BPP instances require a large number of variables and long execution times. This paper proposes a new heuristic strategy for solving BPP instances that guarantees optimal solutions. The proposed strategy includes a new exact model based on flow arcs, in which the number of variables is reduced by pre-assigning objects to bins. Additionally, the approach includes a heuristic that, by preprocessing the instance, reduces its size and thereby the search space of the solution algorithm. To validate the proposed approach, experiments were performed using the test sets hard28, 53nirup, bin1data, uniform, triplets, and subsets of other instances, all of them well known in the state of the art. The results show that with this approach it is possible to find the optimal solution for all test instances. Moreover, the execution time was reduced with respect to that reported for the flow-arc model: the reductions were 19.7% and 43% for the 53nirup and hard28 sets, respectively.
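
    As a hedged illustration of the kind of preprocessing described (not the paper's exact heuristic or arc-flow model), the sketch below pairs items whose sizes sum exactly to the bin capacity and removes them as closed bins, a classic optimality-preserving reduction since one item of such a pair is always at least half the capacity, and then computes a first-fit-decreasing upper bound on the remaining instance. Function names are assumptions.

```python
# Illustrative BPP preprocessing sketch (not the paper's model).
def reduce_exact_pairs(items, capacity):
    """Remove pairs of items that exactly fill a bin; return (remaining_items, closed_bins)."""
    remaining = sorted(items, reverse=True)
    kept, closed = [], 0
    i, j = 0, len(remaining) - 1
    while i < j:
        s = remaining[i] + remaining[j]
        if s == capacity:            # exact pair: close one bin
            closed += 1
            i += 1
            j -= 1
        elif s > capacity:           # largest item cannot pair exactly with anything left
            kept.append(remaining[i])
            i += 1
        else:                        # smallest item cannot pair exactly with anything left
            kept.append(remaining[j])
            j -= 1
    if i == j:
        kept.append(remaining[i])
    return kept, closed

def first_fit_decreasing(items, capacity):
    """Upper bound on the number of bins via first-fit decreasing."""
    loads = []
    for item in sorted(items, reverse=True):
        for k, load in enumerate(loads):
            if load + item <= capacity:
                loads[k] += item
                break
        else:
            loads.append(item)
    return len(loads)

# Example: shrink the instance first, then bound the rest.
rest, closed = reduce_exact_pairs([70, 60, 40, 30, 25], capacity=100)
print(closed + first_fit_decreasing(rest, capacity=100))
```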

    A Multi-Branch-and-Bound Binary Parallel Algorithm to Solve the Knapsack Problem 0–1 in a Multicore Cluster

    This paper presents a process based on subsets of the item set, where elements are fixed or removed to form different binary branch-and-bound (BB) trees, which in turn are used to build a parallel algorithm called “multi-BB”. Both the sequential and the parallel algorithms compute the exact solution of the 0–1 knapsack problem. The sequential algorithm solves instances published by other researchers (including those proposed by Pisinger) for the less complex (uncorrelated) class and some problems of the moderately complex (weakly correlated) class. The parallel algorithm, running on a cluster of multicore processors, solves the weakly correlated problems that the sequential algorithm cannot. The multi-branch-and-bound algorithms achieved parallel efficiencies of approximately 75%, although in some cases a superlinear speedup was obtained.
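
    As a minimal sketch of the underlying sequential technique (not the authors' multi-BB implementation), the code below implements a depth-first branch-and-bound for the 0-1 knapsack with the classic fractional-relaxation upper bound; the parallel idea sketched in the abstract can be imitated by fixing the first few items to 0 or 1, which yields independent subtrees that separate processes can explore. All names are illustrative.

```python
# Sequential branch-and-bound for the 0-1 knapsack (illustrative sketch).
def knapsack_bb(weights, profits, capacity):
    n = len(weights)
    order = sorted(range(n), key=lambda i: profits[i] / weights[i], reverse=True)
    w = [weights[i] for i in order]
    p = [profits[i] for i in order]
    best = 0

    def upper_bound(k, cap_left, profit):
        # Fractional (linear-relaxation) bound over the remaining items.
        bound = profit
        for i in range(k, n):
            if w[i] <= cap_left:
                cap_left -= w[i]
                bound += p[i]
            else:
                bound += p[i] * cap_left / w[i]
                break
        return bound

    def dfs(k, cap_left, profit):
        nonlocal best
        if k == n or cap_left == 0:
            best = max(best, profit)
            return
        if upper_bound(k, cap_left, profit) <= best:
            return                                # prune: cannot beat the incumbent
        if w[k] <= cap_left:                      # branch 1: take item k
            dfs(k + 1, cap_left - w[k], profit + p[k])
        dfs(k + 1, cap_left, profit)              # branch 0: skip item k
    dfs(0, capacity, 0)
    return best

# Example: knapsack_bb([2, 3, 4, 5], [3, 4, 5, 6], 5) returns 7.
```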

    Application of Data Science for Cluster Analysis of COVID-19 Mortality According to Sociodemographic Factors at Municipal Level in Mexico

    Mexico is among the five countries with the largest number of reported deaths from COVID-19, and the mortality rates associated with infection are heterogeneous across the country due to structural population factors. This study analyzes clusters related to COVID-19 mortality rates at the municipal level in Mexico from a Data Science perspective. A new application is presented that uses a hybrid machine-learning algorithm to generate clusters of municipalities with similar values of sociodemographic indicators and mortality rates. To provide a systematic framework, we applied an extension of the International Business Machines Corporation (IBM) methodology called Batch Foundation Methodology for Data Science (FMDS). The study used 1,086,743 death certificates corresponding to the year 2020, among other official data. The analysis identified two key indicators related to municipal COVID-19 mortality: population density and the percentage of the population living in poverty. Based on these indicators, 16 municipality clusters were determined. Among the main results, clusters with high mortality rates had high population density and low poverty levels, whereas clusters with low density and high poverty levels had low mortality rates. Finally, the patterns found, expressed as municipality clusters with similar characteristics, can support decision making by health authorities on disease prevention and control, reinforcing public health measures and optimizing resource distribution to reduce hospitalizations and mortality.
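
    Purely as a hypothetical illustration of this kind of analysis (the paper uses a hybrid machine-learning algorithm within the FMDS methodology, not the plain K-Means shown here), the sketch below standardizes the two key indicators identified above and groups municipalities into 16 clusters; the file name and column names are assumptions.

```python
# Hypothetical clustering sketch on the two key indicators (illustrative only).
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

df = pd.read_csv("municipal_indicators_2020.csv")          # hypothetical input file
X = StandardScaler().fit_transform(
    df[["population_density", "poverty_rate"]].to_numpy()  # hypothetical columns
)
df["cluster"] = KMeans(n_clusters=16, n_init=10, random_state=0).fit_predict(X)

# Compare the COVID-19 mortality rate across clusters (hypothetical column).
print(df.groupby("cluster")["covid_mortality_rate"].mean().sort_values())
```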

    Hybrid Fuzzy C-Means Clustering Algorithm Oriented to Big Data Realms

    A hybrid variant of the Fuzzy C-Means and K-Means algorithms is proposed to cluster large datasets such as those found in Big Data settings. The Fuzzy C-Means algorithm is sensitive to the initial values of the membership matrix, so a suitable initial configuration of this matrix can accelerate convergence. In this sense, a new approach called Hybrid OK-Means Fuzzy C-Means (HOFCM) is proposed, which optimizes the initial values of the membership matrix. The approach consists of three steps: (a) generate a set of n solutions of a dataset x by applying a variant of the K-Means algorithm; (b) select the best solution as the basis for generating the optimized membership matrix; and (c) solve the dataset x with Fuzzy C-Means. Experimental results with four real datasets and one synthetic dataset show that HOFCM reduces execution time by up to 93.94% compared with the average time of standard Fuzzy C-Means, while solution quality decreased by 2.51% in the worst case.
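
    A minimal sketch of the three-step idea, assuming standard K-Means and Fuzzy C-Means updates (this is not the authors' HOFCM code, and parameter names are assumptions): several K-Means-style runs are performed, the best one seeds the membership matrix, and Fuzzy C-Means refines it.

```python
# Illustrative three-step hybrid initialization for Fuzzy C-Means.
import numpy as np

def fcm(X, U, m=2.0, iters=50):
    """Standard Fuzzy C-Means updates starting from a membership matrix U (n x c)."""
    for _ in range(iters):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]               # weighted centers
        d = np.linalg.norm(X[:, None, :] - centers[None], axis=2) + 1e-10
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)                     # membership update
    return centers, U

def hofcm_like(X, c=4, runs=10, seed=0):
    rng = np.random.default_rng(seed)
    best_inertia, best_labels = np.inf, None
    for _ in range(runs):                           # (a) several K-Means-style runs
        centers = X[rng.choice(len(X), size=c, replace=False)]
        for _ in range(20):
            labels = np.argmin(np.linalg.norm(X[:, None] - centers[None], axis=2), axis=1)
            centers = np.array([X[labels == k].mean(axis=0) if np.any(labels == k)
                                else centers[k] for k in range(c)])
        inertia = ((X - centers[labels]) ** 2).sum()
        if inertia < best_inertia:                  # (b) keep the best hard clustering
            best_inertia, best_labels = inertia, labels
    U0 = np.full((len(X), c), 0.1 / (c - 1))        # (c) seed memberships from labels
    U0[np.arange(len(X)), best_labels] = 0.9
    return fcm(X, U0)
```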

    POFCM: A Parallel Fuzzy Clustering Algorithm for Large Datasets

    Clustering algorithms have proven to be a useful tool for extracting knowledge and supporting decision making when processing large volumes of data. Hard and fuzzy clustering algorithms have been used successfully to identify patterns and trends in many areas, such as finance, healthcare, and marketing. However, their solution time grows significantly as the size of the datasets increases, which can make their use unfeasible. In this sense, parallel processing has proven to be an efficient alternative for reducing solution time, although it is well established that parallelizing an algorithm requires redesigning it to exploit the hardware resources of the target platform. In this article, we propose a new parallel OpenMP implementation of the Hybrid OK-Means Fuzzy C-Means (HOFCM) algorithm, an efficient variant of Fuzzy C-Means; an advantage of using OpenMP is its scalability. Its efficiency is compared against the HOFCM algorithm. Experimental results on large real and synthetic datasets show that our implementation tends to solve instances with a large number of clusters and dimensions more efficiently, and it shows excellent results in terms of speedup and parallel efficiency. Our main contribution is a fuzzy clustering algorithm for large datasets that is scalable and not limited to a specific domain.
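
    The paper parallelizes the algorithm with OpenMP in a compiled language; as a hedged analogue of the same loop-level idea, the Python sketch below splits the data into chunks, computes each chunk's partial sums for a Fuzzy C-Means center update in a separate process, and then reduces them, much like an OpenMP parallel-for with a reduction. Names and the number of workers are assumptions.

```python
# Illustrative data-parallel center update for Fuzzy C-Means (not the paper's OpenMP code).
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def partial_center_sums(args):
    """Return (sum of u^m * x, sum of u^m) for one chunk of points."""
    X_chunk, centers, m = args
    d = np.linalg.norm(X_chunk[:, None, :] - centers[None], axis=2) + 1e-10
    inv = d ** (-2.0 / (m - 1.0))
    U = inv / inv.sum(axis=1, keepdims=True)        # memberships for this chunk
    Um = U ** m
    return Um.T @ X_chunk, Um.sum(axis=0)

def parallel_center_update(X, centers, m=2.0, workers=4):
    chunks = np.array_split(X, workers)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(partial_center_sums, [(ch, centers, m) for ch in chunks]))
    num = sum(p[0] for p in parts)                  # reduce the partial sums
    den = sum(p[1] for p in parts)
    return num / den[:, None]                       # updated cluster centers
```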