17 research outputs found

    Bio-inspired computation for big data fusion, storage, processing, learning and visualization: state of the art and future directions

    This overview centers on research achievements that have recently emerged from the confluence of Big Data technologies and bio-inspired computation. Many reasons can be identified for the profitable synergy between these two paradigms, all rooted in the adaptability, intelligence and robustness that biologically inspired principles can provide to technologies aimed at managing, retrieving, fusing and processing Big Data efficiently. We delve into this research field by first analyzing in depth the existing literature, with a focus on advances reported in the last few years. This prior literature analysis is complemented by an identification of the new trends and open challenges in Big Data that remain unsolved to date, and that can be effectively addressed by bio-inspired algorithms. As a second contribution, this work elaborates on how bio-inspired algorithms need to be adapted for use in a Big Data context, in which data fusion becomes crucial as a preliminary step to allow processing and mining several, potentially heterogeneous data sources. This analysis allows exploring and comparing the scope and efficiency of existing approaches across different problems and domains, with the purpose of identifying new potential applications and research niches. Finally, this survey highlights open issues that remain unsolved to date in this research avenue, alongside recommendations for future research. This work has received funding support from the Basque Government (Eusko Jaurlaritza) through the Consolidated Research Group MATHMODE (IT1294-19), EMAITEK and ELKARTEK programs. D. Camacho also acknowledges support from the Spanish Ministry of Science and Education under PID2020-117263GB-100 grant (FightDIS), the Comunidad Autonoma de Madrid under S2018/TCS-4566 grant (CYNAMON), and the CHIST ERA 2017 BDSI PACMEL Project (PCI2019-103623, Spain)

    Evolving machine learning and deep learning models using evolutionary algorithms

    Despite their great success in data mining, machine learning (ML) and deep learning (DL) models still face substantial obstacles when tackling real-life challenges, such as feature selection, initialization sensitivity, and hyperparameter optimization. The prevalence of these obstacles has severely constrained conventional machine learning and deep learning methods from fulfilling their potential. In this research, three evolving machine learning models and one evolving deep learning model are proposed to eliminate the above bottlenecks, i.e. by improving model initialization, enhancing feature representation, and optimizing model configuration, respectively, through hybridization between advanced evolutionary algorithms and conventional ML and DL methods. Specifically, two Firefly Algorithm based evolutionary clustering models are proposed to optimize cluster centroids in K-means and overcome initialization sensitivity as well as local stagnation. Secondly, a Particle Swarm Optimization based evolving feature selection model is developed for automatic identification of the most effective feature subset and reduction of feature dimensionality for tackling classification problems. Lastly, a Grey Wolf Optimizer based evolving Convolutional Neural Network-Long Short-Term Memory method is devised for automatic generation of the optimal topological and learning configurations for Convolutional Neural Network-Long Short-Term Memory networks to undertake multivariate time series prediction problems. Moreover, a variety of tailored search strategies are proposed to eliminate the intrinsic limitations embedded in the search mechanisms of the three employed evolutionary algorithms, i.e. the dominance of the global best signal in Particle Swarm Optimization, the constraint of the diagonal movement in Firefly Algorithm, and the acute contraction of search territory in Grey Wolf Optimizer, respectively. The remedial strategies include diversification of guiding signals, adaptive nonlinear search parameters, hybrid position updating mechanisms, and enhancement of population leaders. As such, the enhanced Particle Swarm Optimization, Firefly Algorithm, and Grey Wolf Optimizer variants are more likely to attain global optimality on complex search landscapes embedded in data mining problems, owing to the elevated search diversity and the improved trade-offs between exploration and exploitation

    Scheduling Problems

    Scheduling is defined as the process of assigning operations to resources over time to optimize a criterion. Scheduling problems comprise both a set of resources and a set of consumers. As such, managing scheduling problems involves managing the use of resources by several consumers. This book presents some new applications and trends related to task and data scheduling. In particular, chapters focus on data science, big data, high-performance computing, and Cloud computing environments. In addition, this book presents novel algorithms and literature reviews that will guide current and new researchers who work with load balancing, scheduling, and allocation problems

    Bio-inspired optimization in integrated river basin management

    Water resources worldwide are facing severe challenges in terms of quality and quantity. It is essential to conserve, manage, and optimize water resources and their quality through integrated water resources management (IWRM). IWRM is an interdisciplinary field that works on multiple levels to maximize the socio-economic and ecological benefits of water resources. Since this is directly influenced by the river’s ecological health, the point of interest should start at the basin-level. The main objective of this study is to evaluate the application of bio-inspired optimization techniques in integrated river basin management (IRBM). This study demonstrates the application of versatile, flexible and yet simple metaheuristic bio-inspired algorithms in IRBM. In a novel approach, bio-inspired optimization algorithms Ant Colony Optimization (ACO) and Particle Swarm Optimization (PSO) are used to spatially distribute mitigation measures within a basin to reduce long-term annual mean total nitrogen (TN) concentration at the outlet of the basin. The Upper Fuhse river basin developed in the hydrological model, Hydrological Predictions for the Environment (HYPE), is used as a case study. ACO and PSO are coupled with the HYPE model to distribute a set of measures and compute the resulting TN reduction. The algorithms spatially distribute nine crop and subbasin-level mitigation measures under four categories. Both algorithms can successfully yield a discrete combination of measures to reduce long-term annual mean TN concentration. They achieved an 18.65% reduction, and their performance was on par with each other. This study has established the applicability of these bio-inspired optimization algorithms in successfully distributing the TN mitigation measures within the river basin. Stakeholder involvement is a crucial aspect of IRBM. 
It ensures that researchers and policymakers are aware of the ground reality through large amounts of information collected from the stakeholders. Including stakeholders in policy planning and decision-making legitimizes the decisions and eases their implementation. Therefore, a socio-hydrological framework is developed and tested in the Larqui river basin, Chile, based on a field survey to explore the conditions under which the farmers would implement or extend the width of vegetative filter strips (VFS) to prevent soil erosion. The framework consists of a behavioral social model (extended Theory of Planned Behavior, TPB) and an agent-based model (ABM, developed in NetLogo) coupled with the results from the vegetative filter model (Vegetative Filter Strip Modeling System, VFSMOD-W). The results showed that the ABM corroborates the survey results and that the farmers are willing to extend the width of VFS as long as their utility stays positive. This framework can be used to develop tailor-made policies for river basins based on the conditions of the river basins and the stakeholders' requirements to motivate them to adopt sustainable practices. It is vital to assess whether the proposed management plans achieve the expected results for the river basin and whether the stakeholders will accept and implement them. The assessment via simulation tools ensures effective implementation and realization of the target stipulated by the decision-makers. In this regard, this dissertation introduces the application of bio-inspired optimization techniques in the field of IRBM. The successful discrete combinatorial optimization in terms of the spatial distribution of mitigation measures by ACO and PSO and the novel socio-hydrological framework using ABM demonstrate the strength and diverse applicability of bio-inspired optimization algorithms

    Modélisation formelle des systèmes de détection d'intrusions (Formal modeling of intrusion detection systems)

    The cybersecurity ecosystem is constantly evolving in the number, diversity, and complexity of attacks. As a result, detection tools become ineffective against certain attacks. Intrusion detection systems (IDS) generally fall into three types: anomaly-based detection, signature-based detection, and hybrid detection. Anomaly-based detection relies on characterizing the usual behavior of the system, typically in a statistical manner. It can detect known or unknown attacks, but it also generates a very large number of false positives. Signature-based detection detects known attacks by defining rules that describe the known behavior of an attacker. It requires good knowledge of attacker behavior. Hybrid detection relies on several detection methods, including the two above. It has the advantage of being more precise during detection. Tools such as Snort and Zeek offer low-level languages for expressing attack-recognition rules. Because the number of potential attacks is very large, these rule bases quickly become hard to manage and maintain. Moreover, expressing stateful rules to recognize a sequence of events is particularly arduous. In this thesis, we propose a stateful approach based on algebraic state-transition diagrams (ASTDs) to identify complex attacks. ASTDs allow a graphical and modular representation of a specification, which facilitates the maintenance and understanding of rules. We extend the ASTD notation with new features to represent complex attacks. Next, we specify several attacks with the extended notation and run the resulting specifications on event streams using an interpreter to identify attacks. We also evaluate the performance of the interpreter against industrial tools such as Snort and Zeek. Then, we build a compiler to generate executable code from an ASTD specification, able to efficiently identify sequences of events

    Feature Selection for Document Classification : Case Study of Meta-heuristic Intelligence and Traditional Approaches

    Doctor of Philosophy (Computer Engineering), 2020. Nowadays, the culture of accessing news around the world has changed from paper to electronic formats, and the rate of publication of newspapers and magazines on websites has increased dramatically. Meanwhile, text feature selection for automatic document classification (ADC) is becoming a big challenge because of the unstructured nature of text features, which is called the “multi-dimension feature problem”. Although various powerful schemes for text feature selection are continuously being developed, there still exists a research gap for the “optimization of feature selection problem (OFSP)”, which seeks the globally optimal features. The capacity of meta-heuristic intelligence in the knowledge discovery process (KDP) also plays a critical role in overcoming the NP-hard OFSP by providing effective performance and efficient computation time. Therefore, a meta-heuristic based approach to feature selection optimization is proposed in this research to search for the globally optimal features for ADC. In this thesis, a case study of meta-heuristic intelligence and traditional approaches for feature selection optimization in document classification is carried out. It includes eleven meta-heuristic algorithms, namely Ant Colony search, Artificial Bee Colony search, Bat search, Cuckoo search, Evolutionary search, Elephant search, Firefly search, Flower search, Genetic search, Rhinoceros search, and Wolf search, for searching the optimal feature subset for document classification. Then, the results of the proposed model are compared with three traditional search algorithms, namely Best First search (BFS), Greedy Stepwise (GS), and Ranker search (RS). In addition, a data mining framework is applied. It involves data preprocessing, feature engineering, building the learning model, and evaluating the performance of the proposed meta-heuristic intelligence-based feature selection using various performance and computational complexity evaluation schemes. In data preprocessing, tokenization, stop-word handling, stemming and lemmatizing, and normalization are applied. In the feature engineering process, n-gram TF-IDF feature extraction is used to build the feature vector, and both filter and wrapper approaches are applied to observe different cases. In addition, three different classifiers, namely J48, Naïve Bayes, and Support Vector Machine, are used for building the document classification model. According to the results, the proposed system can dramatically reduce the number of selected features that can deteriorate learning model performance. In addition, the selected global feature subsets can yield better performance than traditional search according to the single objective function of the proposed model

    Swarm Intelligence

    Swarm Intelligence has emerged as one of the most studied artificial intelligence branches during the last decade, constituting the fastest growing stream in the bio-inspired computation community. A clear trend can be deduced analyzing some of the most renowned scientific databases available, showing that the interest aroused by this branch has increased at a notable pace in the last years. This book describes the prominent theories and recent developments of Swarm Intelligence methods, and their application in all fields covered by engineering. This book unleashes a great opportunity for researchers, lecturers, and practitioners interested in Swarm Intelligence, optimization problems, and artificial intelligence

    Data-Intensive Computing in Smart Microgrids

    Microgrids have recently emerged as the building block of a smart grid, combining distributed renewable energy sources, energy storage devices, and load management in order to improve power system reliability, enhance sustainable development, and reduce carbon emissions. At the same time, rapid advancements in sensor and metering technologies, wireless and network communication, as well as cloud and fog computing are leading to the collection and accumulation of large amounts of data (e.g., device status data, energy generation data, consumption data). The application of big data analysis techniques (e.g., forecasting, classification, clustering) on such data can optimize the power generation and operation in real time by accurately predicting electricity demands, discovering electricity consumption patterns, and developing dynamic pricing mechanisms. An efficient and intelligent analysis of the data will enable smart microgrids to detect and recover from failures quickly, respond to electricity demand swiftly, supply more reliable and economical energy, and enable customers to have more control over their energy use. Overall, data-intensive analytics can provide effective and efficient decision support for all of the producers, operators, customers, and regulators in smart microgrids, in order to achieve holistic smart energy management, including energy generation, transmission, distribution, and demand-side management. This book contains an assortment of relevant novel research contributions that provide real-world applications of data-intensive analytics in smart grids and contribute to the dissemination of new ideas in this area

    Applied Metaheuristic Computing

    For decades, Applied Metaheuristic Computing (AMC) has been a prevailing optimization technique for tackling perplexing engineering and business problems, such as scheduling, routing, ordering, bin packing, assignment, facility layout planning, among others. This is partly because the classic exact methods are constrained with prior assumptions, and partly due to the heuristics being problem-dependent and lacking generalization. AMC, on the contrary, guides the course of low-level heuristics to search beyond the local optimality, which impairs the capability of traditional computation methods. This topic series has collected quality papers proposing cutting-edge methodology and innovative applications which drive the advances of AMC

    Applied Metaheuristic Computing

    For decades, Applied Metaheuristic Computing (AMC) has been a prevailing optimization technique for tackling perplexing engineering and business problems, such as scheduling, routing, ordering, bin packing, assignment, facility layout planning, among others. This is partly because the classic exact methods are constrained with prior assumptions, and partly due to the heuristics being problem-dependent and lacking generalization. AMC, on the contrary, guides the course of low-level heuristics to search beyond the local optimality, which impairs the capability of traditional computation methods. This topic series has collected quality papers proposing cutting-edge methodology and innovative applications which drive the advances of AMC