10,758 research outputs found

    Modular lifelong machine learning

    Deep learning has drastically improved the state-of-the-art in many important fields, including computer vision and natural language processing (LeCun et al., 2015). However, it is expensive to train a deep neural network on a machine learning problem. The overall training cost further increases when one wants to solve additional problems. Lifelong machine learning (LML) develops algorithms that aim to efficiently learn to solve a sequence of problems, which become available one at a time. New problems are solved with fewer resources by transferring previously learned knowledge. At the same time, an LML algorithm needs to retain good performance on all encountered problems, thus avoiding catastrophic forgetting. Current approaches do not possess all the desired properties of an LML algorithm. First, they primarily focus on preventing catastrophic forgetting (Diaz-Rodriguez et al., 2018; Delange et al., 2021). As a result, they neglect some knowledge transfer properties. Furthermore, they assume that all problems in a sequence share the same input space. Finally, scaling these methods to a large sequence of problems remains a challenge. Modular approaches to deep learning decompose a deep neural network into sub-networks, referred to as modules. Each module can then be trained to perform an atomic transformation, specialised in processing a distinct subset of inputs. This modular approach to storing knowledge makes it easy to only reuse the subset of modules which are useful for the task at hand. This thesis introduces a line of research which demonstrates the merits of a modular approach to lifelong machine learning, and its ability to address the aforementioned shortcomings of other methods. Compared to previous work, we show that a modular approach can be used to achieve more LML properties than previously demonstrated. Furthermore, we develop tools which allow modular LML algorithms to scale in order to retain said properties on longer sequences of problems.
First, we introduce HOUDINI, a neurosymbolic framework for modular LML. HOUDINI represents modular deep neural networks as functional programs and accumulates a library of pre-trained modules over a sequence of problems. Given a new problem, we use program synthesis to select a suitable neural architecture, as well as a high-performing combination of pre-trained and new modules. We show that our approach has most of the properties desired from an LML algorithm. Notably, it can perform forward transfer, avoid negative transfer and prevent catastrophic forgetting, even across problems with disparate input domains and problems which require different neural architectures. Second, we produce a modular LML algorithm which retains the properties of HOUDINI but can also scale to longer sequences of problems. To this end, we fix the choice of a neural architecture and introduce a probabilistic search framework, PICLE, for searching through different module combinations. To apply PICLE, we introduce two probabilistic models over neural modules which allow us to efficiently identify promising module combinations. Third, we phrase the search over module combinations in modular LML as black-box optimisation, which allows one to make use of methods from the setting of hyperparameter optimisation (HPO). We then develop a new HPO method which marries a multi-fidelity approach with model-based optimisation. We demonstrate that this leads to improvement in anytime performance in the HPO setting and discuss how this can in turn be used to augment modular LML methods. Overall, this thesis identifies a number of important LML properties, which have not all been attained in past methods, and presents an LML algorithm which can achieve all of them, apart from backward transfer.

    The role of artificial intelligence-driven soft sensors in advanced sustainable process industries: a critical review

    With the predicted depletion of natural resources and alarming environmental issues, sustainable development has become a popular as well as a much-needed concept in modern process industries. Hence, manufacturers are quite keen on adopting novel process monitoring techniques to enhance product quality and process efficiency while minimizing possible adverse environmental impacts. Hardware sensors are employed in process industries to aid process monitoring and control, but they are associated with many limitations such as disturbances to the process flow, measurement delays, frequent need for maintenance, and high capital costs. As a result, soft sensors have become an attractive alternative for predicting quality-related parameters that are ‘hard-to-measure’ using hardware sensors. Due to their promising features over hardware counterparts, they have been employed across different process industries. This article attempts to explore the state-of-the-art artificial intelligence (AI)-driven soft sensors designed for process industries and their role in achieving the goal of sustainable development. First, a general introduction is given to soft sensors, their applications in different process industries, and their significance in achieving sustainable development goals. AI-based soft sensing algorithms are then introduced. Next, a discussion on how AI-driven soft sensors contribute toward different sustainable manufacturing strategies of process industries is provided. This is followed by a critical review of the most recent state-of-the-art AI-based soft sensors reported in the literature. Here, the use of powerful AI-based algorithms for addressing the limitations of traditional algorithms that restrict soft sensor performance is discussed.
Finally, the challenges and limitations associated with current soft sensor design, application, and maintenance are discussed, with possible future directions for designing more intelligent soft sensing technologies to cater to future industrial needs.

    A Machine Learning based Empirical Evaluation of Cyber Threat Actors High Level Attack Patterns over Low level Attack Patterns in Attributing Attacks

    Cyber threat attribution is the process of identifying the actor of an attack incident in cyberspace. Accurate and timely threat attribution plays an important role in deterring future attacks by applying appropriate and timely defense mechanisms. Manual analysis of attack patterns gathered by honeypot deployments, intrusion detection systems, firewalls, and via trace-back procedures is still the preferred method of security analysts for cyber threat attribution. Such attack patterns are low-level Indicators of Compromise (IOC). They represent Tactics, Techniques, Procedures (TTP), and software tools used by the adversaries in their campaigns. The adversaries rarely re-use them. They can also be manipulated, resulting in false and unfair attribution. To empirically evaluate and compare the effectiveness of both kinds of IOC, there are two problems that need to be addressed. The first problem is that in recent research works, the ineffectiveness of low-level IOC for cyber threat attribution has been discussed intuitively. An empirical evaluation of the effectiveness of low-level IOC based on a real-world dataset is missing. The second problem is that the available dataset for high-level IOC has a single instance for each predictive class label, so it cannot be used directly for training machine learning models. To address these problems in this research work, we empirically evaluate the effectiveness of low-level IOC based on a real-world dataset that is specifically built for comparative analysis with high-level IOC. The experimental results show that the high-level IOC trained models effectively attribute cyberattacks with an accuracy of 95%, compared to the low-level IOC trained models, where accuracy is 40%. Comment: 20 pages

    PU GNN: Chargeback Fraud Detection in P2E MMORPGs via Graph Attention Networks with Imbalanced PU Labels

    The recent advent of play-to-earn (P2E) systems in massively multiplayer online role-playing games (MMORPGs) has made in-game goods interchangeable with real-world values more than ever before. The goods in the P2E MMORPGs can be directly exchanged with cryptocurrencies such as Bitcoin, Ethereum, or Klaytn via blockchain networks. Unlike traditional in-game goods, once they have been written to the blockchains, P2E goods cannot be restored by the game operation teams even in cases of chargeback fraud such as payment fraud, cancellation, or refund. To tackle the problem, we propose a novel chargeback fraud prediction method, PU GNN, which leverages graph attention networks with PU loss to capture both the players' in-game behavior and P2E token transaction patterns. With the adoption of modified GraphSMOTE, the proposed model handles the imbalanced distribution of labels in chargeback fraud datasets. Experiments conducted on three real-world P2E MMORPG datasets demonstrate that PU GNN achieves superior performance over previously suggested methods. Comment: Under Review, Industry Track

    Analysis and Design of Detection for Liver Cancer using Particle Swarm Optimization and Decision Tree

    Liver cancer is a major cause of death all over the world. According to the WHO (World Health Organization), 9.6 million people die of cancer worldwide every year. It is the eighth leading cause of death in women and the fifth in men, as reported by the American Cancer Society. The death rate due to cancer is projected to increase by 45 percent between 2008 and 2030. The most common cancers are lung, breast, liver, and colorectal. Approximately 782,000 people die of liver cancer each year. The most effective way to reduce the death rate from liver cancer is to treat the disease at an early stage. Early treatment depends on early diagnosis, which in turn depends on reliable diagnostic methods. CT imaging is one of the most common and important techniques, and it serves as an imaging tool for evaluating patients with suspected liver cancer. The diagnosis of liver cancer has historically been made manually by a skilled radiologist, who relied on their expertise and personal judgement to reach a conclusion. The main objective of this paper is to develop automatic methods based on a machine learning approach for accurate detection of liver cancer in order to help radiologists in clinical practice. The paper's primary contribution is to the process of liver cancer lesion classification and automatic detection for clinical diagnosis. For the purpose of detecting liver cancer lesions, the best approaches based on PSO and DPSO are presented. With the help of the C4.5 decision tree classifier, wavelet-based statistical and morphological features were extracted and classified.

    Modelling, Monitoring, Control and Optimization for Complex Industrial Processes

    This reprint includes 22 research papers and an editorial, collected from the Special Issue "Modelling, Monitoring, Control and Optimization for Complex Industrial Processes", highlighting recent research advances and emerging research directions in complex industrial processes. This reprint aims to promote the research field and benefit the readers from both academic communities and industrial sectors

    Designing similarity functions

    The concept of similarity is important in many areas of cognitive science, computer science, and statistics. In machine learning, functions that measure similarity between two instances form the core of instance-based classifiers. Past similarity measures have been primarily based on simple Euclidean distance. As machine learning has matured, it has become obvious that a simple numeric instance representation is insufficient for most domains. Similarity functions for symbolic attributes have been developed, and simple methods for combining these functions with numeric similarity functions were devised. This sequence of events has revealed three important issues, which this thesis addresses. The first issue is concerned with combining multiple measures of similarity. There is no equivalence between units of numeric similarity and units of symbolic similarity. Existing similarity functions for numeric and symbolic attributes have no common foundation, and so various schemes have been devised to avoid biasing the overall similarity towards one type of attribute. The similarity function design framework proposed by this thesis produces probability distributions that describe the likelihood of transforming between two attribute values. Because common units of probability are employed, similarities may be combined using standard methods. It is empirically shown that the resulting similarity functions treat different attribute types coherently. The second issue relates to the instance representation itself. The current choice of numeric and symbolic attribute types is insufficient for many domains, in which more complicated representations are required. For example, a domain may require varying numbers of features, or features with structural information. 
The framework proposed by this thesis is sufficiently general to permit virtually any type of instance representation: all that is required is that a set of basic transformations that operate on the instances be defined. To illustrate the framework’s applicability to different instance representations, several example similarity functions are developed. The third, and perhaps most important, issue concerns the ability to incorporate domain knowledge within similarity functions. Domain information plays an important part in choosing an instance representation. However, even given an adequate instance representation, domain information is often lost. For example, numeric features that are modulo (such as the time of day) can be perfectly represented as a numeric attribute, but simple linear similarity functions ignore the modulo nature of the attribute. Similarly, symbolic attributes may have inter-symbol relationships that should be captured in the similarity function. The design framework proposed by this thesis allows domain information to be captured in the similarity function, both in the transformation model and in the probability assigned to basic transformations. Empirical results indicate that such domain information improves classifier performance, particularly when training data is limited.
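
The time-of-day example in the abstract above can be made concrete with a small sketch. This is not the thesis's transformation-based framework. It is only a minimal illustration, with hypothetical function names, of how a plain linear distance misjudges a cyclic attribute, while a distance that wraps around the period does not. The 24-hour period and the linear mapping from distance to similarity are assumptions made for the example.

```python
def linear_distance(a: float, b: float) -> float:
    """Plain numeric distance; ignores any cyclic structure."""
    return abs(a - b)


def circular_distance(a: float, b: float, period: float = 24.0) -> float:
    """Shortest distance between two values on a cycle of the given period."""
    d = abs(a - b) % period
    return min(d, period - d)


def similarity(distance: float, max_distance: float) -> float:
    """Map a distance in [0, max_distance] to a similarity in [0, 1] (assumed linear)."""
    return 1.0 - distance / max_distance


# 23:00 and 01:00 are two hours apart on a clock, not twenty-two.
late, early = 23.0, 1.0
linear_sim = similarity(linear_distance(late, early), max_distance=24.0)
circular_sim = similarity(circular_distance(late, early), max_distance=12.0)
print(f"linear: {linear_sim:.3f}, circular: {circular_sim:.3f}")
```

With these assumptions, the circular version rates 23:00 and 01:00 as far more similar than the linear one does, which is the kind of domain knowledge the abstract says is otherwise lost.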

    A Reinforcement Learning-assisted Genetic Programming Algorithm for Team Formation Problem Considering Person-Job Matching

    An efficient team is essential for a company to successfully complete new projects. To solve the team formation problem considering person-job matching (TFP-PJM), a 0-1 integer programming model is constructed, which considers the effects of both person-job matching and team members' willingness to communicate on team efficiency, with the person-job matching score calculated using intuitionistic fuzzy numbers. Then, a reinforcement learning-assisted genetic programming algorithm (RL-GP) is proposed to enhance the quality of solutions. The RL-GP adopts ensemble population strategies. Before the population evolution at each generation, the agent selects one of four population search modes according to the information obtained, thus realizing a sound balance of exploration and exploitation. In addition, surrogate models are used in the algorithm to evaluate the formation plans generated by individuals, which speeds up the algorithm learning process. Afterward, a series of comparison experiments are conducted to verify the overall performance of RL-GP and the effectiveness of the improved strategies within the algorithm. The hyper-heuristic rules obtained through efficient learning can be utilized as decision-making aids when forming project teams. This study reveals the advantages of reinforcement learning methods, ensemble strategies, and the surrogate model applied to the GP framework. The diversity and intelligent selection of search patterns, along with fast adaptation evaluation, are distinct features that enable RL-GP to be deployed in real-world enterprise environments. Comment: 16 pages

    Variational Quantum Time Evolution without the Quantum Geometric Tensor

    The real- and imaginary-time evolution of quantum states are powerful tools in physics and chemistry to investigate quantum dynamics, prepare ground states or calculate thermodynamic observables. They also find applications in wider fields such as quantum machine learning or optimization. On near-term devices, variational quantum time evolution is a promising candidate for these tasks, as the required circuit model can be tailored to trade off available device capabilities and approximation accuracy. However, even if the circuits can be reliably executed, variational quantum time evolution algorithms quickly become infeasible for relevant system sizes. They require the calculation of the Quantum Geometric Tensor and its complexity scales quadratically with the number of parameters in the circuit. In this work, we propose a solution to this scaling problem by leveraging a dual formulation that circumvents the explicit evaluation of the Quantum Geometric Tensor. We demonstrate our algorithm for the time evolution of the Heisenberg Hamiltonian and show that it accurately reproduces the system dynamics at a fraction of the cost of standard variational quantum time evolution algorithms. As an application, we calculate thermodynamic observables with the QMETTS algorithm

    Machine Learning Research Trends in Africa: A 30 Years Overview with Bibliometric Analysis Review

    In this paper, a critical bibliometric analysis study is conducted, coupled with an extensive literature survey on recent developments and associated applications in machine learning research with a perspective on Africa. The presented bibliometric analysis study consists of 2761 machine learning-related documents, of which 98% were articles with at least 482 citations published in 903 journals during the past 30 years. Furthermore, the collated documents were retrieved from the Science Citation Index EXPANDED, comprising research publications from 54 African countries between 1993 and 2021. The bibliometric study shows the visualization of the current landscape and future trends in machine learning research and its application to facilitate future collaborative research and knowledge exchange among authors from different research institutions scattered across the African continent