1,063 research outputs found

    Evaluating an automated procedure of machine learning parameter tuning for software effort estimation

    Software effort estimation requires accurate prediction models. Machine learning algorithms have been used to create more accurate estimation models; however, these algorithms are sensitive to factors such as the choice of hyper-parameters. To reduce this sensitivity, automated approaches for hyper-parameter tuning have recently been investigated. There is a need for further research on the effectiveness of such approaches in the context of software effort estimation. These evaluations could help understand which hyper-parameter settings can be adjusted to improve model accuracy, and in which specific contexts tuning can benefit model performance. The goal of this work is to develop an automated procedure for machine learning hyper-parameter tuning in the context of software effort estimation. The automated procedure builds and evaluates software effort estimation models to determine the most accurate evaluation schemes. The methodology followed in this work consists of first performing a systematic mapping study to characterize existing hyper-parameter tuning approaches in software effort estimation, then developing a procedure to automate the evaluation of hyper-parameter tuning, and finally conducting controlled quasi-experiments to evaluate the automated procedure. From the systematic literature mapping we found that the effort estimation literature has favored the use of grid search. The results obtained in our quasi-experiments demonstrate that fast, less exhaustive tuners are viable replacements for grid search: randomly evaluating 60 hyper-parameter configurations can be as good as grid search, and several state-of-the-art tuners were more effective than this random search in only 6% of the evaluated dataset-model combinations. We endorse random search, genetic algorithms, FLASH, differential evolution, tabu search, and harmony search as effective tuners.
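    The 60-evaluation budget mentioned above can be illustrated with the minimal sketch below; it is not the thesis' automated procedure. It compares exhaustive grid search with a 60-configuration random search for tuning a support vector regressor, using scikit-learn's GridSearchCV and RandomizedSearchCV on synthetic data (real studies would use effort datasets such as Desharnais or ISBSG, an assumption on my part, not part of the abstract), with mean absolute error as the accuracy measure.

```python
# Minimal sketch (not the thesis' procedure): exhaustive grid search versus a
# 60-evaluation random search for tuning an SVR effort-estimation model.
# The data below is synthetic and stands in for a real effort dataset.
from scipy.stats import loguniform, uniform
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.svm import SVR

X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=0)

# Exhaustive grid: every combination is evaluated (4 * 3 * 4 = 48 settings).
grid = GridSearchCV(
    SVR(),
    {"C": [0.1, 1, 10, 100], "epsilon": [0.01, 0.1, 1.0],
     "gamma": [0.001, 0.01, 0.1, 1.0]},
    scoring="neg_mean_absolute_error", cv=5)
grid.fit(X, y)

# Random search capped at 60 sampled configurations, the budget the abstract
# reports as competitive with grid search.
rand = RandomizedSearchCV(
    SVR(),
    {"C": loguniform(1e-1, 1e3), "epsilon": uniform(0.01, 1.0),
     "gamma": loguniform(1e-3, 1e0)},
    n_iter=60, scoring="neg_mean_absolute_error", cv=5, random_state=0)
rand.fit(X, y)

print("grid search MAE:       ", -grid.best_score_)
print("random search (60) MAE:", -rand.best_score_)
```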

    A Novel Grouping Harmony Search Algorithm for Clustering Problems

    The problem of partitioning a data set into disjoint groups or clusters of related items plays a key role in data analytics, in particular when information retrieval becomes crucial for further data analysis. In this context, clustering approaches aim at obtaining a good partition of the data based on multiple criteria. One of the most challenging aspects of clustering techniques is the inference of the optimal number of clusters. In this regard, a number of clustering methods from the literature assume that the number of clusters is known a priori and subsequently assign instances to clusters based on distance, density or any other criterion. This paper proposes to override any prior assumption on the number of clusters or groups in the data at hand by hybridizing the grouping encoding strategy and the Harmony Search (HS) algorithm. The resulting hybrid approach optimally infers the number of clusters by means of the tailored design of the HS operators, which estimate this important structural clustering parameter as an implicit byproduct of the instance-to-cluster mapping performed by the algorithm. Apart from inferring the optimal number of clusters, simulation results verify that the proposed scheme achieves a better performance than other naïve clustering techniques in synthetic scenarios and widely known data repositories.
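    The sketch below is a rough illustration of the grouping idea, not of the paper's tailored operators: a candidate solution is encoded as an instance-to-group assignment vector, new harmonies are improvised with memory consideration and pitch adjustment, and the number of clusters is read off the distinct labels of the best harmony. The synthetic data, group upper bound, and silhouette-based fitness are assumptions made for the example.

```python
# Toy grouping Harmony Search sketch (illustrative only, not the paper's scheme).
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
X, _ = make_blobs(n_samples=150, centers=3, random_state=0)
n, max_k = len(X), 8               # upper bound on groups, not a fixed cluster count
HMS, HMCR, PAR, iters = 20, 0.9, 0.3, 500

def fitness(assign):
    # Silhouette score needs at least two distinct groups.
    if len(set(assign)) < 2:
        return -1.0
    return silhouette_score(X, assign)

# Harmony memory: random instance-to-group assignment vectors.
memory = [rng.integers(0, max_k, size=n) for _ in range(HMS)]
scores = [fitness(h) for h in memory]

for _ in range(iters):
    new = np.empty(n, dtype=int)
    for i in range(n):
        if rng.random() < HMCR:                    # memory consideration
            new[i] = memory[rng.integers(HMS)][i]
            if rng.random() < PAR:                 # pitch adjustment: shift to a neighbouring group
                new[i] = (new[i] + rng.integers(-1, 2)) % max_k
        else:                                      # random re-assignment
            new[i] = rng.integers(0, max_k)
    s = fitness(new)
    worst = int(np.argmin(scores))
    if s > scores[worst]:                          # replace the worst harmony
        memory[worst], scores[worst] = new, s

best = memory[int(np.argmax(scores))]
print("inferred number of clusters:", len(set(best.tolist())))
```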

    Applications of Artificial Intelligence in Power Systems

    Artificial intelligence tools, which are fast, robust, and adaptive, can overcome the drawbacks of traditional solutions to several power systems problems. In this work, applications of AI techniques have been studied for solving two important problems in power systems. The first problem is static security evaluation (SSE). The objective of SSE is to identify contingencies in the planning and operation of power systems. Conventional numerical solutions are time-consuming, computationally expensive, and unsuitable for online applications. SSE may be posed as a binary-classification, multi-class classification, or regression problem. In this work, a multi-class support vector machine (multi-SVM) is combined with several evolutionary computation algorithms, including particle swarm optimization (PSO), differential evolution, ant colony optimization for the continuous domain, and harmony search, to solve the SSE. Moreover, support vector regression is combined with a modified PSO, featuring a proposed modification of the inertia weight, to solve the SSE. The classification accuracy, the speed of training, and the final cost of using power equipment depend heavily on the selected input features; in this dissertation, multi-objective PSO has been used to perform this feature selection. Furthermore, a multi-classifier voting scheme is proposed to obtain the final test output. The classifiers participating in the voting scheme include multi-SVMs with different types of kernels and random forests with an adaptive number of trees. In short, the development and performance of different machine learning tools combined with evolutionary computation techniques have been studied to solve the online SSE. The performance of the proposed techniques is tested on several benchmark systems, namely the IEEE 9-bus, 14-bus, 39-bus, 57-bus, 118-bus, and 300-bus power systems. The second problem is the non-convex, nonlinear, and non-differentiable economic dispatch (ED) problem. The purpose of solving the ED is to improve the cost-effectiveness of power generation. To solve the ED with multi-fuel options, prohibited operating zones, valve-point effects, and transmission line losses, genetic algorithm (GA) variant-based methods, such as breeder GA, fast navigating GA, twin removal GA, kite GA, and United GA, are used. The IEEE 6-unit, 10-unit, and 15-unit systems are used to study the efficiency of the algorithms.
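    The dissertation pairs several evolutionary algorithms with SVM-based models; the sketch below shows only the simplest such pairing as an assumption-laden illustration: a plain PSO with a fixed inertia weight (not the modified inertia weight proposed in the work) tuning the C and gamma hyper-parameters of a single SVM classifier, with synthetic data standing in for power-system operating points.

```python
# Illustrative PSO tuning of SVM hyper-parameters (not the dissertation's method).
# Particles encode (log10 C, log10 gamma); fitness is cross-validated accuracy.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X, y = make_classification(n_samples=300, n_features=10, random_state=1)

def fitness(pos):
    C, gamma = 10.0 ** pos[0], 10.0 ** pos[1]
    return cross_val_score(SVC(C=C, gamma=gamma), X, y, cv=3).mean()

n_particles, iters = 10, 20
lo, hi = np.array([-2.0, -4.0]), np.array([3.0, 1.0])   # search bounds in log10 space
pos = rng.uniform(lo, hi, size=(n_particles, 2))
vel = np.zeros_like(pos)
pbest, pbest_val = pos.copy(), np.array([fitness(p) for p in pos])
gbest = pbest[np.argmax(pbest_val)]

w, c1, c2 = 0.7, 1.5, 1.5            # fixed inertia weight and acceleration coefficients
for _ in range(iters):
    r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, lo, hi)
    vals = np.array([fitness(p) for p in pos])
    improved = vals > pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
    gbest = pbest[np.argmax(pbest_val)]

print("best (C, gamma):", 10.0 ** gbest, "cross-validated accuracy:", pbest_val.max())
```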

    Computational intelligence approaches for energy load forecasting in smart energy management grids: state of the art, future challenges, and research directions

    Energy management systems are designed to monitor, optimize, and control the smart grid energy market. Demand-side management, considered an essential part of the energy management system, can enable utility market operators to make better management decisions for energy trading between consumers and the operator. In this system, a priori knowledge about the energy load pattern can help reshape the load and cut the energy demand curve, thus allowing better management and distribution of energy in smart grid energy systems. Designing a computationally intelligent load forecasting (ILF) system is often a primary goal of energy demand management. This study explores the state of the art of computationally intelligent (i.e., machine learning) methods applied to load forecasting, in terms of their classification and evaluation, for sustainable operation of the overall energy management system. More than 50 research papers on the subject identified in the existing literature are classified into two categories: single and hybrid computational intelligence (CI)-based load forecasting techniques. The advantages and disadvantages of each individual technique are also discussed to place it in the perspective of energy management research. The identified methods are further investigated through a qualitative analysis based on prediction accuracy, which confirms the dominance of hybrid forecasting methods, typically built around metaheuristic optimization techniques, over single-model approaches. Based on this extensive survey, the review predicts a continued expansion of the literature on different CI approaches and their optimization with both heuristic and metaheuristic methods for energy load forecasting, and on their potential utilization in real-time smart energy management grids to address future challenges in energy demand management.

    Applied Metaheuristic Computing

    For decades, Applied Metaheuristic Computing (AMC) has been a prevailing optimization technique for tackling perplexing engineering and business problems, such as scheduling, routing, ordering, bin packing, assignment, and facility layout planning, among others. This is partly because classic exact methods are constrained by prior assumptions, and partly because heuristics are problem-dependent and lack generalization. AMC, on the contrary, guides the course of low-level heuristics to search beyond the local optima that impair the capability of traditional computation methods. This topic series has collected quality papers proposing cutting-edge methodology and innovative applications which drive the advances of AMC.

    PSO-CALBA: Particle Swarm Optimization Based Content-Aware Load Balancing Algorithm in Cloud Computing Environment

    Cloud computing provides hosted services (i.e., servers, storage, bandwidth, and software) over the internet. The key benefits of cloud computing are scalability, efficiency, and cost reduction. A key challenge in cloud computing is the even distribution of workload across numerous heterogeneous servers. Several cloud scheduling and load-balancing techniques have been proposed in the literature, including heuristic-based, meta-heuristic-based, and hybrid algorithms. However, most current cloud scheduling and load-balancing schemes are not content-aware (i.e., they do not consider the content type of user tasks). Studies in the literature show that taking the content type of tasks into account can significantly improve the balanced distribution of workload. In this paper, a novel hybrid approach named Particle Swarm Optimization based Content-Aware Load Balancing Algorithm (PSO-CALBA) is proposed. The PSO-CALBA scheduling scheme combines machine learning, which classifies tasks by file content type, with a meta-heuristic algorithm. An SVM classifier is used to classify users' tasks into content types such as video, audio, image, and text, and a Particle Swarm Optimization (PSO) based meta-heuristic algorithm is used to map users' tasks onto cloud resources. The proposed approach has been implemented and evaluated using the well-known CloudSim simulation toolkit and compared with ACOFTF and DFTF. The study shows significant improvement in terms of makespan and degree of imbalance (DI).
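    For context on the two reported metrics, the short sketch below computes makespan and one common definition of the degree of imbalance for a hypothetical task-to-VM mapping. The task lengths, VM speeds, mapping, and DI formula are illustrative assumptions, and the SVM content-type classification step of PSO-CALBA is omitted.

```python
# Illustrative makespan and degree-of-imbalance (DI) computation for a candidate
# task-to-VM mapping; all values are made up for the example.
import numpy as np

task_len = np.array([400., 250., 800., 120., 600., 300.])   # hypothetical task lengths (MI)
vm_speed = np.array([1000., 1500.])                          # hypothetical VM speeds (MIPS)
assign   = np.array([0, 1, 0, 1, 1, 0])                      # task -> VM mapping to evaluate

loads = np.zeros(len(vm_speed))
for t, v in enumerate(assign):
    loads[v] += task_len[t] / vm_speed[v]                    # execution time task t adds to VM v

makespan = loads.max()                                        # completion time of the busiest VM
di = (loads.max() - loads.min()) / loads.mean()               # one common DI definition (assumed)
print("per-VM completion times:", loads)
print("makespan:", makespan, "degree of imbalance:", di)
```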