    Grammar-based evolutionary approach for automated workflow composition with domain-specific operators and ensemble diversity

    The process of extracting valuable and novel insights from raw data involves a series of complex steps. In the realm of Automated Machine Learning (AutoML), a significant research focus is on automating aspects of this process, specifically tasks like selecting algorithms and optimising their hyper-parameters. A particularly challenging task in AutoML is automatic workflow composition (AWC). AWC aims to identify the most effective sequence of data preprocessing and ML algorithms, coupled with their best hyper-parameters, for a specific dataset. However, existing AWC methods are limited in how many algorithms they can combine within a workflow and in the ways they can combine them. Addressing this gap, this paper introduces EvoFlow, a grammar-based evolutionary approach for AWC. EvoFlow enhances the flexibility in designing workflow structures, empowering practitioners to select algorithms that best fit their specific requirements. EvoFlow stands out by integrating two innovative features. First, it employs a suite of genetic operators, designed specifically for AWC, to optimise both the structure of workflows and their hyper-parameters. Second, it implements a novel updating mechanism that enriches the variety of predictions made by different workflows. Promoting this diversity helps prevent the algorithm from overfitting. With this aim, EvoFlow builds an ensemble whose workflows differ in their misclassified instances. To evaluate EvoFlow's effectiveness, we carried out empirical validation using a set of classification benchmarks. We begin with an ablation study to demonstrate the enhanced performance attributable to EvoFlow's unique components. Then, we compare EvoFlow with other AWC approaches, encompassing both evolutionary and non-evolutionary techniques. Our findings show that EvoFlow's specialised genetic operators and updating mechanism substantially outperform current leading methods[..] Comment: 32 pages, 7 figures, 6 tables, journal paper
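    As a rough illustration of the diversity idea in the abstract above (an ensemble whose workflows differ in their misclassified instances), the following sketch measures how far apart two workflows' error sets are. It is not EvoFlow's actual updating mechanism: the function names `misclassified` and `error_diversity`, the use of Jaccard distance, and the toy labels are all assumptions made for illustration.

```python
from typing import Sequence, Set


def misclassified(y_true: Sequence[int], y_pred: Sequence[int]) -> Set[int]:
    """Return the indices of the instances a workflow gets wrong."""
    return {i for i, (t, p) in enumerate(zip(y_true, y_pred)) if t != p}


def error_diversity(errors_a: Set[int], errors_b: Set[int]) -> float:
    """Jaccard distance between two error sets (a hypothetical diversity measure).

    0.0 means both workflows fail on exactly the same instances;
    1.0 means their errors do not overlap at all.
    """
    union = errors_a | errors_b
    if not union:  # neither workflow makes a mistake
        return 0.0
    return 1.0 - len(errors_a & errors_b) / len(union)


# Toy labels and predictions, invented for this example.
y_true = [0, 1, 1, 0, 1, 0]
pred_a = [0, 1, 0, 0, 1, 1]  # wrong on instances 2 and 5
pred_b = [1, 1, 0, 0, 1, 0]  # wrong on instances 0 and 2

err_a = misclassified(y_true, pred_a)
err_b = misclassified(y_true, pred_b)
diversity = error_diversity(err_a, err_b)
```

    Under this assumed measure, an ensemble would favour pairs of workflows with a high distance, since their mistakes are less likely to coincide when predictions are combined.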

    Reducing gaps in quantitative association rules: A genetic programming free-parameter algorithm

    The extraction of useful information for decision making is a challenge in many different domains. Association rule mining is one of the most important techniques in this field, discovering relationships of interest among patterns. Although the mining of association rules is an area of great interest for many researchers, the search for well-grouped continuous values remains a challenge: the goal is to discover rules whose patterns do not cover unnecessary ranges of values. Existing algorithms for mining association rules in continuous domains are mainly based on a non-deterministic search and require a high number of parameters to be optimised. These parameters hinder the mining process, and data mining experts who want to use the algorithms must first understand how they work. We therefore present a grammar-guided genetic programming algorithm that does not require as many parameters as other existing approaches and enables the discovery of quantitative association rules comprising small-size gaps. The algorithm is verified over a varied set of data, comparing the results to other association rule mining algorithms from several paradigms. Additionally, some resulting rules from different paradigms are analysed, demonstrating the effectiveness of our model for reducing gaps in numerical features.
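    To make the notion of a quantitative association rule concrete, the sketch below evaluates a rule whose conditions are numeric intervals, using the standard definitions of support and confidence. It does not reproduce the paper's algorithm or its gap measure; the interval representation, the function names and the toy records are assumptions for illustration only.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]   # closed interval [low, high]
Condition = Dict[str, Interval]  # attribute name -> required interval
Record = Dict[str, float]


def satisfies(record: Record, condition: Condition) -> bool:
    """True if every attribute in the condition falls inside its interval."""
    return all(low <= record[attr] <= high
               for attr, (low, high) in condition.items())


def support_confidence(records: List[Record],
                       antecedent: Condition,
                       consequent: Condition) -> Tuple[float, float]:
    """Support and confidence of the rule antecedent -> consequent."""
    n_ante = sum(satisfies(r, antecedent) for r in records)
    n_both = sum(satisfies(r, antecedent) and satisfies(r, consequent)
                 for r in records)
    support = n_both / len(records)
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence


# Toy data, invented for this example.
records = [
    {"age": 25, "income": 30},
    {"age": 32, "income": 45},
    {"age": 38, "income": 50},
    {"age": 51, "income": 48},
    {"age": 60, "income": 20},
]
antecedent = {"age": (30, 40)}     # age in [30, 40]
consequent = {"income": (40, 55)}  # income in [40, 55]
support, confidence = support_confidence(records, antecedent, consequent)
```

    In this toy data the interval [30, 40] contains no ages between 32 and 38, which is the kind of unnecessary range that a tighter interval such as [32, 38] would avoid without changing the rule's support.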

    Artificial intelligence to automate the systematic review of scientific literature

    Artificial intelligence (AI) has acquired notable relevance in modern computing as it effectively solves complex tasks traditionally done by humans. AI provides methods to represent and infer knowledge, efficiently manipulate texts and learn from vast amounts of data. These characteristics are applicable in many activities that humans find laborious or repetitive, as is the case of the analysis of scientific literature. Manually preparing and writing a systematic literature review (SLR) takes considerable time and effort, since it requires planning a strategy, conducting the literature search and analysis, and reporting the findings. Depending on the area under study, the number of papers retrieved can be in the hundreds or thousands, meaning that filtering the relevant ones and extracting the key information becomes a costly and error-prone process. However, some of the tasks involved are repetitive and, therefore, subject to automation by means of AI. In this paper, we present a survey of AI techniques proposed in the last 15 years to help researchers conduct systematic analyses of scientific literature. We describe the tasks currently supported, the types of algorithms applied, and available tools proposed in 34 primary studies. This survey also provides a historical perspective of the evolution of the field and the role that humans can play in an increasingly automated SLR process. Comment: 25 pages, 3 figures, 1 table, journal paper

    GEML: A Grammar-based Evolutionary Machine Learning Approach for Design-Pattern Detection

    Design patterns (DPs) are recognised as a good practice in software development. However, the lack of appropriate documentation often hampers traceability, and their benefits are blurred among thousands of lines of code. Automatic methods for DP detection have become relevant but are usually based on the rigid analysis of either software metrics or specific properties of the source code. We propose GEML, a novel detection approach based on evolutionary machine learning using software properties of diverse nature. Firstly, GEML makes use of an evolutionary algorithm to extract the characteristics that best describe the DP, formulated in terms of human-readable rules whose syntax conforms to a context-free grammar. Secondly, a rule-based classifier is built to predict whether new code contains a hidden DP implementation. GEML has been validated over five DPs taken from a public repository recurrently adopted by machine learning studies. Then, we increase this number up to 15 diverse DPs, showing its effectiveness and robustness in terms of detection capability. An initial parameter study served to tune a parameter setup whose performance guarantees the general applicability of this approach without the need to adjust complex parameters to a specific pattern. Finally, a demonstration tool is also provided. Comment: 27 pages, 18 tables, 10 figures, journal paper

    La interiorización de valores y la convivencia escolar.

    This research addresses the internalization of values and school coexistence. The project seeks relevant information about the problem that the deficient practice of values affects school coexistence. Its general objective is to diagnose the internalization of values and school coexistence through the application of different strategies, workshops and talks that foster good social and behavioral relationships among 5th-grade students at Unidad Educativa "Mariscal Antonio José de Sucre". For this purpose, a scientific, analytical and statistical research methodology was used to gather valuable information from various bibliographical sources to support the stated problem; instruments and techniques were also used to collect information and prepare statistical data, which made it possible to analyze the need for and feasibility of the proposal. Values need to be strengthened as fundamental axes of good school coexistence, since values are currently disappearing and the formation of the human being is not developing normally. This brings conflicts within the social context, values being one of the main demands of the human being, which serve as ideal support for the normal development of people and their relationships with others.
    Given that values shape open-minded people focused on complying with and enforcing norms that establish the normal development of individuals, the behavioral model is taken as a reference. This model focuses on behavior modification during childhood and adolescence and places values as fundamental pillars in the formation of the human being, so that students adopt appropriate behaviors that they will put into practice throughout their formation as people of integrity, able to relate to their peers and solve the problems that arise in this new society. Universidad Técnica de Cotopaxi

    Expert knowledge versus sampling in species distribution modelling

    Point records of fauna, affected by the temporal dynamism that shapes nature, cannot represent the real and complete distribution of a species. Experts infer species distributions from their knowledge of the relationship between species and their environment, although this knowledge is subject to the fuzzy and subjective way in which the human mind constructs thought. The favourability function (FF) allowed us to compare models fitted to point field records with models fitted to the locations occupied by a species according to experts. We measured these results for amphibian species that are: 1) threatened, 2) ubiquitous and 3) neither threatened nor ubiquitous. A unified cartography was generated from both sources of knowledge for all the species analysed. This modelling based on fuzzy thinking, more in keeping with nature, allowed the comparison of information on the distribution of the amphibians of Uruguay from field records and from expert knowledge. The result helped to predict the most favourable territories for finding them. Generalist (ubiquitous) species were better explained by the models built from observed records, despite their incomplete nature, whereas threatened species were better explained by expert knowledge. These findings highlight the importance of including both field observations and expert knowledge in conservation planning. Agencia Nacional de Investigación e Innovación (ANII); Comisión Académica de Posgrado (CAP) Universidad de la República, Uruguay; Plan Andaluz de Investigación, Desarrollo e Innovación (PAIDI) RNM-262, Junta de Andalucía; Universidad de Málaga (UMA). Funding for open access charge: Universidad de Málaga / CBU

    Evolutionary composition of QoS-aware web services: a many-objective perspective

    Web service based applications often invoke services provided by third parties in their workflow. The Quality of Service (QoS) provided by the invoked supplier can be expressed in terms of the Service Level Agreement specifying the values contracted for particular aspects like cost or throughput, among others. In this scenario, intelligent systems can help the engineer scrutinise the service market in order to select the candidates that best fit the expected composition, focusing on different QoS aspects. This search problem, also known as QoS-aware web service composition, is characterised by the presence of many diverse QoS properties to be simultaneously optimised from a multi-objective perspective. Nevertheless, as the number of QoS properties considered during the design phase increases and a larger number of decision factors come into play, it becomes more difficult to find the most suitable candidate solutions, so more sophisticated techniques are required to explore and return diverse, competitive alternatives. With this aim, this paper explores the suitability of many-objective evolutionary algorithms for addressing the binding problem of web services on the basis of a real-world benchmark with 9 QoS properties. A complete comparative study demonstrates that these techniques, never before applied to this problem, can achieve a better trade-off between all the QoS properties, or even promote specific QoS properties while keeping high values for the rest. In addition, this search process can be performed within a reasonable computational cost, enabling its adoption by intelligent and decision-support systems in the field of service oriented computation. Junta de Andalucía P12-TIC-1867; Ministerio de Economía y Competitividad TIN2012-32273; Junta de Andalucía TIC-5906; Ministerio de Economía y Competitividad TIN2014-55252-P; Ministerio de Economía y Competitividad TIN2015-71841-REDT; Ministerio de Educación, Cultura y Deportes FPU13/0146
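    The trade-off between QoS properties mentioned in the abstract above is usually expressed through Pareto dominance. The sketch below filters a set of candidate compositions down to the non-dominated ones. It is a generic illustration, not one of the many-objective algorithms compared in the paper; the three objectives, the assumption that all of them are minimised, and the candidate values are hypothetical.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and strictly
    better on at least one (all objectives are minimised)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated(solutions: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep the solutions that no other solution dominates."""
    return [s for s in solutions
            if not any(dominates(other, s) for other in solutions if other is not s)]


# Hypothetical QoS vectors: (cost, response time, failure rate).
candidates = [
    (10.0, 0.5, 0.01),
    (12.0, 0.4, 0.02),
    (11.0, 0.6, 0.03),  # dominated by the first candidate
    (9.0, 0.9, 0.05),
]
front = non_dominated(candidates)
```

    As the number of objectives grows, fewer candidates dominate one another and almost every solution ends up in the front, which is one reason many-objective problems call for more specialised selection schemes than this plain filter.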

    Draft genome sequence of Cupriavidus UYMMa02A, a novel beta-rhizobium species

    We present the draft genome of Cupriavidus UYMMa02A, a rhizobium strain isolated from root nodules of Mimosa magentea. The assembly has approximately 8.1 million bp with an average G+C content of 64.1%. Symbiotic and metal-resistance genes were identified. The study of this genome will contribute to the understanding of rhizobial evolution.
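    For readers unfamiliar with the G+C figure quoted above, it is the percentage of bases in a sequence that are guanine or cytosine. The sketch below computes it for a short string; the function name `gc_content` and the toy sequence are assumptions for illustration and are unrelated to the actual assembly.

```python
def gc_content(sequence: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:  # empty input or only ambiguous symbols such as N
        return 0.0
    gc = sum(b in "GC" for b in bases)
    return 100.0 * gc / len(bases)


# Made-up sequence: 8 of its 10 bases are G or C.
toy_sequence = "GGCCATGCGC"
toy_gc = gc_content(toy_sequence)
```

    Ambiguous symbols are skipped here so that gaps in a draft assembly do not dilute the percentage; other conventions exist, and this choice is an assumption of the sketch.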