    Discovering hierarchical decision rules with evolutive algorithms in supervised learning

    This paper describes a new approach, HIDER (HIerarchical DEcision Rules), for learning rules in continuous and discrete domains based on evolutive algorithms. The algorithm produces a hierarchical set of rules, that is, the rules must be applied in a specific order. With this policy, the number of rules may be reduced because one rule can lie inside another. The evolutive algorithm uses both real and binary coding for the individuals of the population and introduces several new genetic operators. In addition, this paper discusses the capability of learning systems based on an evolutive algorithm to reduce both the number of rules and the number of attributes involved in the rule set. We have tested our system on real data from the UCI repository. The results of a 10-fold cross-validation are compared to those of C4.5 and show an important improvement. Funding: Comisión Interministerial de Ciencia y Tecnología TIC99-035
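
The idea of a hierarchical rule set, where rules are tried in a fixed order and one rule's region may sit inside another's, can be illustrated with a minimal sketch. The interval-based `Rule` class, the `classify` helper, the attribute names and the default label are all assumptions made for illustration; the abstract does not specify HIDER's actual rule representation.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Rule:
    # Hypothetical representation: one closed interval [low, high] per
    # continuous attribute; attributes not listed are unconstrained.
    intervals: Dict[str, Tuple[float, float]]
    label: str

    def covers(self, example: Dict[str, float]) -> bool:
        return all(lo <= example[a] <= hi for a, (lo, hi) in self.intervals.items())


def classify(rules: List[Rule], example: Dict[str, float], default: str) -> str:
    # Rules are tried strictly in order; the first rule that covers the
    # example decides the class, so earlier rules take priority.
    for rule in rules:
        if rule.covers(example):
            return rule.label
    return default


# The inner region (class "B") lies entirely inside the outer one (class "A").
# Because the inner rule is checked first, two rules are enough; an unordered
# rule set would need several "A" rules to carve around the "B" region.
rules = [
    Rule({"x": (4.0, 6.0), "y": (4.0, 6.0)}, "B"),
    Rule({"x": (0.0, 10.0), "y": (0.0, 10.0)}, "A"),
]

print(classify(rules, {"x": 5.0, "y": 5.0}, default="C"))   # inside the inner region
print(classify(rules, {"x": 1.0, "y": 1.0}, default="C"))   # only the outer rule covers it
print(classify(rules, {"x": 11.0, "y": 0.0}, default="C"))  # no rule covers it
```

Swapping the two rules would make the outer rule fire for every point in the inner region, which is why the order is part of the model.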

    Non-parametric Nearest Neighbor with Local Adaptation

    The k-Nearest Neighbor algorithm (k-NN) uses a classification criterion that depends on the parameter k. Usually, the value of this parameter must be determined by the user. In this paper we present an algorithm based on the NN technique that does not take the value of k from the user. Our approach evaluates the values of k that classify the training examples correctly and selects the one that classifies the most examples. As the user does not take part in the selection of the parameter k, the algorithm is non-parametric. With this heuristic, we propose a simple variation of the k-NN algorithm that is robust to noise in the data. The experiments, summarized in the last section, show that the error rate decreases in comparison with the k-NN technique when the best k for each database has been obtained beforehand.
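
One plausible reading of this heuristic is sketched below: each candidate k is scored by how many training examples it classifies correctly under leave-one-out, and the best-scoring k is kept. The function names, the candidate list, the tie-breaking and the toy data are assumptions for illustration; the paper's actual procedure, including its local adaptation step, may differ.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k, skip=None):
    """Majority vote among the k nearest training points.

    `skip` excludes one training index, which allows leave-one-out scoring.
    """
    neighbours = sorted(
        (dist(x, query), i) for i, (x, _) in enumerate(train) if i != skip
    )[:k]
    votes = Counter(train[i][1] for _, i in neighbours)
    return votes.most_common(1)[0][0]


def select_k(train, candidates):
    """Return the candidate k that classifies the most training examples
    correctly when each example is held out in turn (first best wins ties)."""
    best_k, best_hits = None, -1
    for k in candidates:
        hits = sum(
            knn_predict(train, x, k, skip=i) == y for i, (x, y) in enumerate(train)
        )
        if hits > best_hits:
            best_k, best_hits = k, hits
    return best_k


# Toy 1-D data: two clusters plus one noisy "b" label inside the "a" cluster.
train = [
    ((0.0,), "a"), ((1.0,), "a"), ((2.0,), "a"), ((3.0,), "a"),
    ((1.5,), "b"),  # noisy example
    ((10.0,), "b"), ((11.0,), "b"), ((12.0,), "b"), ((13.0,), "b"),
]

k = select_k(train, candidates=[1, 3, 5])
print(k, knn_predict(train, (1.4,), k))
```

On this toy data, k = 1 lets the noisy point mislead its neighbours, while a larger k outvotes it, so the selection step picks the larger value without user input.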

    Software Process Dynamics: Modeling, Simulation and Improvement

    The aim of this chapter is to introduce the reader to the dynamics of the software process, the ways to represent and formalize it, and how it can be integrated with other techniques to facilitate, among other things, process improvement. To achieve this goal, different approaches to software process modeling and simulation are introduced and their pros and cons analyzed. Then, continuous modeling is used as the modeling approach to build software process models that work in the qualitative and quantitative fields, assessing the decision-making process and the software process improvement arena. The chapter also describes the integration of this approach with current process assessment models (such as CMM) and with static and algorithmic models (such as the traditional models used in the estimation process), as well as the design of a metrics collection program that is triggered by the actual process of model building. Funding: Comisión Interministerial de Ciencia y Tecnología (CICYT) TIN2004-06689-C03-0

    Facing online challenges using learning classifier systems

    Recent advances in machine learning have fostered the design of competent algorithms that are able to learn and extract novel and useful information from data. Recently, some of these techniques have been successfully applied to solve real-world problems in distinct technological, medical, scientific and industrial areas; problems that could not be handled by traditional engineering methods of analysis, either because of their inherent complexity or because of the huge volumes of data involved. Due to the initial success of these pioneers, current machine learning systems are facing more difficult problems that hamper the learning process of such algorithms, which has raised practitioners' interest in designing systems that can tackle real-world problems scalably and efficiently. One of the most appealing machine learning paradigms is Learning Classifier Systems (LCSs), and more specifically Michigan-style LCSs, an open framework that combines an apportionment-of-credit mechanism with a knowledge discovery technique inspired by biological processes to evolve their internal knowledge. In this regard, LCSs mimic human experts by making use of rule lists to choose the best action for a given problem situation, acquiring their knowledge through experience. 
LCSs have been applied with relative success to a wide set of real-world problems such as cancer prediction or business support systems, among many others. Furthermore, in some of these areas LCSs have demonstrated learning capacities that exceed those of human experts for that particular task. The purpose of this thesis is to explore the online learning nature of Michigan-style LCSs for mining large amounts of data in the form of continuous, high-speed and time-changing streams of information. Most often, extracting knowledge from these data is key to gaining a better understanding of the processes that the data describe. Learning from these data poses new challenges to traditional machine learning techniques, which are not typically designed to deal with data in which concepts and noise levels may vary over time. This thesis takes the eXtended Classifier System (XCS), the most studied Michigan-style LCS and one of the most competent machine learning algorithms, as its starting point. The challenges addressed in this thesis are twofold: the first is building a competent supervised system, based on the Michigan-style LCS framework, that learns from data streams and reacts quickly to concept changes and noisy inputs. As many scientific and industrial applications generate vast amounts of unlabelled data, the second challenge is to apply the lessons learned from the first to the design of unsupervised Michigan-style LCSs that handle online problems without assuming any a priori structure in the input data.

    GenoMus: Representing Procedural Musical Structures with an Encoded Functional Grammar Optimized for Metaprogramming and Machine Learning

    We present GenoMus, a new model for artificial musical creativity based on a procedural approach, able to represent the compositional techniques behind a musical score. This model aims to build a framework for automatic creativity that is easily adaptable to other domains beyond music. The core of GenoMus is a functional grammar designed to cover a wide range of styles, integrating traditional and contemporary composing techniques. In its encoded form, both composing methods and music scores are represented as one-dimensional arrays of normalized values. The decoded form of the GenoMus grammar, on the other hand, is human-readable, allowing for manual editing and the implementation of user-defined processes. Musical procedures (genotypes) are functional trees, able to generate musical scores (phenotypes). Each subprocess uses the same generic functional structure, regardless of the time scale, polyphonic structure, or traditional or algorithmic process being employed. Some works produced with the algorithm have already been published. This highly homogeneous and modular approach simplifies metaprogramming and maximizes the search space. Its abstract and compact representation of musical knowledge as pure numeric arrays is optimized for the application of different machine learning paradigms. Funding: FEDER/Junta de Andalucia A.TIC.244.UGR20; Spanish Government; European Commission PID2021-125537NA-I0
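
To make the relation between a readable functional tree and a flat array of normalized values concrete, here is a toy round-trip sketch. It is not the GenoMus encoding: the function table, the tag values and the two-slot token layout are all hypothetical choices made only to show that such a mapping can be lossless.

```python
# Hypothetical function table; the real GenoMus grammar is far richer.
FUNCTIONS = ["seq", "note"]

# Hypothetical tags: every token occupies two slots, [tag, value].
OPEN, LEAF, CLOSE = 1.0, 0.5, 0.0


def encode(tree):
    """Flatten a nested tuple such as ("note", 0.5, 0.25) into a list of
    floats in [0, 1]. Leaves must already be normalized floats."""
    if isinstance(tree, float):
        return [LEAF, tree]
    name, *args = tree
    flat = [OPEN, FUNCTIONS.index(name) / len(FUNCTIONS)]
    for arg in args:
        flat += encode(arg)
    return flat + [CLOSE, 0.0]


def decode(flat):
    """Rebuild the nested tuple from the flat array produced by encode()."""
    def parse(pos):
        tag, value = flat[pos], flat[pos + 1]
        if tag == LEAF:
            return value, pos + 2
        name = FUNCTIONS[round(value * len(FUNCTIONS))]
        args, pos = [], pos + 2
        while flat[pos] != CLOSE:
            arg, pos = parse(pos)
            args.append(arg)
        return (name, *args), pos + 2  # skip the two-slot CLOSE token

    tree, _ = parse(0)
    return tree


# A readable "genotype": a sequence of two notes with normalized parameters.
genotype = ("seq", ("note", 0.5, 0.25), ("note", 0.75, 0.25))
encoded = encode(genotype)
print(encoded)
print(decode(encoded) == genotype)
```

Because every function call uses the same open/arguments/close shape, the same encoder handles any nesting depth, which mirrors the abstract's point about a single generic functional structure.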

    AFRANCI: multi-layer architecture for cognitive agents

    Doctoral thesis. Electrical and Computer Engineering. Faculdade de Engenharia, Universidade do Porto. 201

    Critical Computation: Digital Automata and General Artificial Thinking

    Since the 1980s, computational systems of information processing have not only included deductive methods of decision, whereby results are already implicated in their premises, but have crucially shifted towards an adaptive practice of learning from data: an inductive method of retrieving information from the environment and establishing general premises. This shift in logical methods of decision-making does not simply concern technical apparatuses, but is a symptom of a transformation in logical thinking activated with and through machines. This article discusses the pioneering work of Katherine Hayles, whose study of the cybernetic and computational infrastructures of our culture particularly clarifies this epistemological transformation of thinking in relation to machines.

    A Survey on Data Mining Techniques Applied to Energy Time Series Forecasting

    Data mining has become an essential tool during the last decade to analyze large sets of data. The variety of techniques it includes and the successful results obtained in many application fields make this family of approaches powerful and widely used. In particular, this work explores the application of these techniques to time series forecasting. Although classical statistical methods provide reasonably good results, data mining techniques outperform them. Hence, this work faces two main challenges: (i) to provide a compact mathematical formulation of the most commonly used techniques; (ii) to review the latest works on time series forecasting and, as a case study, those related to electricity price and demand markets. Funding: Ministerio de Economía y Competitividad TIN2014-55894-C2-R; Junta de Andalucía P12-TIC-1728; Universidad Pablo de Olavide APPB81309