Modular lifelong machine learning
Deep learning has drastically improved the state of the art in many important fields, including computer vision and natural language processing (LeCun et al., 2015). However, it is expensive to train a deep neural network on a machine learning problem, and the overall training cost increases further when one wants to solve additional problems. Lifelong machine learning (LML) develops algorithms that aim to efficiently learn to solve a sequence of problems which become available one at a time. New problems are solved with fewer resources by transferring previously learned knowledge. At the same time, an LML algorithm needs to retain good performance on all encountered problems, thus avoiding catastrophic forgetting. Current approaches do not possess all the desired properties of an LML algorithm. First, they primarily focus on preventing catastrophic forgetting (Diaz-Rodriguez et al., 2018; Delange et al., 2021) and, as a result, neglect some knowledge-transfer properties. Furthermore, they assume that all problems in a sequence share the same input space. Finally, scaling these methods to long sequences of problems remains a challenge.
Modular approaches to deep learning decompose a deep neural network into sub-networks, referred to as modules. Each module can then be trained to perform an atomic transformation, specialised in processing a distinct subset of inputs. This modular approach to storing knowledge makes it easy to only reuse the subset of modules which are useful for the task at hand.
This thesis introduces a line of research which demonstrates the merits of a modular approach to lifelong machine learning and its ability to address the aforementioned shortcomings of other methods. Compared to previous work, we show that a modular approach can achieve more of the desired LML properties than previously demonstrated. Furthermore, we develop tools which allow modular LML algorithms to scale, so that they retain these properties on longer sequences of problems.
First, we introduce HOUDINI, a neurosymbolic framework for modular LML. HOUDINI represents modular deep neural networks as functional programs and accumulates a library of pre-trained modules over a sequence of problems. Given a new problem, we use program synthesis to select a suitable neural architecture, as well as a high-performing combination of pre-trained and new modules. We show that our approach has most of the properties desired from an LML algorithm. Notably, it can perform forward transfer, avoid negative transfer and prevent catastrophic forgetting, even across problems with disparate input domains and problems which require different neural architectures.
Second, we produce a modular LML algorithm which retains the properties of HOUDINI but can also scale to longer sequences of problems. To this end, we fix the choice of neural architecture and introduce a probabilistic search framework, PICLE, for searching through different module combinations. To apply PICLE, we introduce two probabilistic models over neural modules which allow us to efficiently identify promising module combinations.
Third, we phrase the search over module combinations in modular LML as black-box optimisation, which allows one to make use of methods from the setting of hyperparameter optimisation (HPO). We then develop a new HPO method which marries a multi-fidelity approach with model-based optimisation. We demonstrate that this leads to improvement in anytime performance in the HPO setting and discuss how this can in turn be used to augment modular LML methods.
Overall, this thesis identifies a number of important LML properties, not all of which have been attained by past methods, and presents an LML algorithm which achieves all of them apart from backward transfer.
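The module-selection idea running through the thesis can be sketched in miniature: maintain a library of reusable modules and score candidate compositions on held-out data for a new problem. Everything below (the module names, toy data, and exhaustive depth-limited search) is a hypothetical stand-in for the program synthesis over neural modules in the actual work.

```python
# Illustrative sketch: choose a combination of pre-trained "modules" for a
# new problem by scoring candidate compositions on validation data.
from itertools import product

# A "module" here is just a function; a real system uses neural sub-networks.
library = {
    "scale": lambda x: [v * 2 for v in x],
    "shift": lambda x: [v + 1 for v in x],
    "neg":   lambda x: [-v for v in x],
}

def compose(mods):
    def run(x):
        for m in mods:
            x = m(x)
        return x
    return run

def validation_score(program, pairs):
    # Negative squared error on held-out (input, target) pairs.
    err = 0.0
    for x, y in pairs:
        err += sum((o - t) ** 2 for o, t in zip(program(x), y))
    return -err

# Held-out pairs for the new problem: the target mapping is scale-then-shift.
pairs = [([1.0, 2.0], [3.0, 5.0]), ([0.0, -1.0], [1.0, -1.0])]

best = max(
    (combo for depth in (1, 2) for combo in product(library, repeat=depth)),
    key=lambda combo: validation_score(compose([library[n] for n in combo]), pairs),
)
print(best)  # ('scale', 'shift')
```

Exhaustive search over compositions is exponential in depth, which is exactly why the thesis turns to program synthesis and, later, probabilistic search to make module selection scale.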
The role of artificial intelligence-driven soft sensors in advanced sustainable process industries: a critical review
With the predicted depletion of natural resources and alarming environmental issues, sustainable development has become a popular as well as a much-needed concept in modern process industries. Hence, manufacturers are quite keen on adopting novel process monitoring techniques to enhance product quality and process efficiency while minimizing possible adverse environmental impacts. Hardware sensors are employed in process industries to aid process monitoring and control, but they are associated with many limitations such as disturbances to the process flow, measurement delays, frequent need for maintenance, and high capital costs. As a result, soft sensors have become an attractive alternative for predicting quality-related parameters that are ‘hard-to-measure’ using hardware sensors. Due to their promising features over hardware counterparts, they have been employed across different process industries. This article attempts to explore the state-of-the-art artificial intelligence (AI)-driven soft sensors designed for process industries and their role in achieving the goal of sustainable development. First, a general introduction is given to soft sensors, their applications in different process industries, and their significance in achieving sustainable development goals. AI-based soft sensing algorithms are then introduced. Next, a discussion is provided on how AI-driven soft sensors contribute toward different sustainable manufacturing strategies of process industries. This is followed by a critical review of the most recent state-of-the-art AI-based soft sensors reported in the literature. Here, the use of powerful AI-based algorithms for addressing the limitations of traditional algorithms that restrict soft sensor performance is discussed.
Finally, the challenges and limitations associated with current soft sensor design, application, and maintenance are discussed, along with possible future directions for designing more intelligent and smart soft sensing technologies to cater to future industrial needs.
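As a concrete illustration of the soft-sensor idea, the sketch below infers a 'hard-to-measure' quality variable from a cheap online measurement with ordinary least squares. The variable names and data are invented for illustration; industrial soft sensors typically use many inputs and richer models such as PLS or neural networks.

```python
# A minimal "soft sensor": predict a slow lab-measured quality variable
# from a fast, cheap process measurement via ordinary least squares.

def fit_linear(xs, ys):
    """Ordinary least squares for y ≈ a*x + b."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    a = cov / var
    return a, my - a * mx

# Hypothetical data: reactor temperature readings (available continuously)
# paired with lab-measured product purity (available with a long delay).
temps  = [350.0, 355.0, 360.0, 365.0, 370.0]
purity = [0.90, 0.92, 0.94, 0.96, 0.98]

a, b = fit_linear(temps, purity)
# The fitted model now estimates purity online, between lab analyses.
print(round(a * 362.5 + b, 3))  # 0.95
```

The value of such a model is exactly the role the review describes: replacing delayed, maintenance-heavy hardware or lab measurements with a continuously available inferred estimate.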
A Machine Learning-based Empirical Evaluation of Cyber Threat Actors' High-Level Attack Patterns over Low-Level Attack Patterns in Attributing Attacks
Cyber threat attribution is the process of identifying the actor behind an attack incident in cyberspace. Accurate and timely threat attribution plays an important role in deterring future attacks by enabling appropriate and timely defense mechanisms. Manual analysis of attack patterns gathered by honeypot deployments, intrusion detection systems, firewalls, and trace-back procedures is still the preferred method of security analysts for cyber threat attribution. Such attack patterns are low-level Indicators of Compromise (IOC). They represent the Tactics, Techniques, and Procedures (TTPs) and software tools used by adversaries in their campaigns. Adversaries rarely re-use them, and they can also be manipulated, resulting in false and unfair attribution. To empirically evaluate and compare the effectiveness of both kinds of IOC, two problems need to be addressed. The first problem is that recent research has discussed the ineffectiveness of low-level IOC for cyber threat attribution only intuitively; an empirical measure of their effectiveness based on a real-world dataset is missing. The second problem is that the available dataset for high-level IOC has a single instance for each predictive class label and therefore cannot be used directly for training machine learning models. To address these problems, in this research work we empirically evaluate the effectiveness of low-level IOC on a real-world dataset that is specifically built for comparative analysis with high-level IOC. The experimental results show that models trained on high-level IOC attribute cyberattacks with an accuracy of 95%, compared to 40% for models trained on low-level IOC.
PU GNN: Chargeback Fraud Detection in P2E MMORPGs via Graph Attention Networks with Imbalanced PU Labels
The recent advent of play-to-earn (P2E) systems in massively multiplayer online role-playing games (MMORPGs) has made in-game goods interchangeable with real-world value more than ever before. Goods in P2E MMORPGs can be directly exchanged for cryptocurrencies such as Bitcoin, Ethereum, or Klaytn via blockchain networks. Unlike traditional in-game goods, once they have been written to the blockchain, P2E goods cannot be restored by the game operation teams, even in cases of chargeback fraud such as payment fraud, cancellation, or refund. To tackle this problem, we propose a novel chargeback fraud prediction method, PU GNN, which leverages graph attention networks with a PU loss to capture both the players' in-game behavior and P2E token transaction patterns. With the adoption of a modified GraphSMOTE, the proposed model handles the imbalanced distribution of labels in chargeback fraud datasets. Experiments conducted on three real-world P2E MMORPG datasets demonstrate that PU GNN achieves superior performance over previously suggested methods.
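The PU (positive-unlabeled) loss mentioned above is typically built on a risk estimator that treats unlabeled data as a mixture of positives and negatives. The sketch below implements a non-negative PU risk in the style of Kiryo et al. (2017); the toy scores, the class prior, and the sigmoid surrogate loss are illustrative assumptions, and the graph attention network itself is out of scope here.

```python
# Sketch of a non-negative PU risk estimator, the kind of loss PU GNN
# builds on: known positives plus unlabeled data, with a class prior.
import math

def sigmoid_loss(score, label):
    # label is +1 or -1; a smooth surrogate for the 0-1 loss.
    return 1.0 / (1.0 + math.exp(label * score))

def nn_pu_risk(pos_scores, unl_scores, prior):
    """Non-negative PU risk: prior * R_p^+ + max(0, R_u^- - prior * R_p^-)."""
    r_p_pos = sum(sigmoid_loss(s, +1) for s in pos_scores) / len(pos_scores)
    r_p_neg = sum(sigmoid_loss(s, -1) for s in pos_scores) / len(pos_scores)
    r_u_neg = sum(sigmoid_loss(s, -1) for s in unl_scores) / len(unl_scores)
    # Clamp the estimated negative risk at zero to avoid the overfitting
    # pathologies of the unbiased PU estimator.
    return prior * r_p_pos + max(0.0, r_u_neg - prior * r_p_neg)

pos = [2.0, 1.5, 3.0]         # classifier scores on known-fraud players
unl = [-1.0, 0.2, -2.0, 0.5]  # scores on unlabeled players
print(nn_pu_risk(pos, unl, prior=0.1))
```

In the chargeback setting this matches the data reality: confirmed fraud cases are positively labeled, while the remaining players are unlabeled rather than verified-benign.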
Analysis and Design of Detection for Liver Cancer using Particle Swarm Optimization and Decision Tree
Liver cancer is a major cause of death worldwide. According to the WHO (World Health Organization), 9.6 million people die of cancer each year. It is the eighth leading cause of death in women and the fifth in men, as reported by the American Cancer Society. The number of deaths due to cancer is projected to increase by 45 percent between 2008 and 2030. The most common cancers are lung, breast, liver, and colorectal; approximately 782,000 people die of liver cancer each year. The most effective way to reduce deaths from liver cancer is to treat the disease at an early stage. Early treatment depends on early diagnosis, which in turn depends on reliable diagnostic methods. CT imaging is one of the most common and important techniques, serving as an imaging tool for evaluating patients with suspected liver cancer. The diagnosis of liver cancer has historically been made manually by a skilled radiologist, relying on expertise and personal judgement to reach a conclusion. The main objective of this paper is to develop automatic methods based on machine learning for accurate detection of liver cancer, in order to help radiologists in clinical practice. The paper's primary contribution is to liver cancer lesion classification and automatic detection for clinical diagnosis. For detecting liver cancer lesions, approaches based on PSO and DPSO are presented. Wavelet-based statistical and morphological features were extracted and classified with the C4.5 decision tree classifier.
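To make the PSO component concrete, here is a minimal particle swarm optimizer in its generic form, minimizing a toy objective rather than a lesion-detection criterion. The swarm size, inertia, and attraction coefficients are illustrative choices, not the paper's settings.

```python
# Minimal particle swarm optimization (PSO): particles track personal and
# global best positions and are pulled toward both on every iteration.
import random

def pso(objective, dim=2, n_particles=20, iters=100, seed=0):
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # personal bests
    gbest = min(pbest, key=objective)[:]        # global best
    w, c1, c2 = 0.7, 1.5, 1.5                   # inertia, cognitive, social
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            if objective(pos[i]) < objective(pbest[i]):
                pbest[i] = pos[i][:]
                if objective(pbest[i]) < objective(gbest):
                    gbest = pbest[i][:]
    return gbest

sphere = lambda p: sum(x * x for x in p)
best = pso(sphere)
print(best)  # converges toward [0, 0]
```

In the paper's setting the objective would instead score candidate lesion regions or feature subsets; the swarm mechanics stay the same.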
Modelling, Monitoring, Control and Optimization for Complex Industrial Processes
This reprint includes 22 research papers and an editorial, collected from the Special Issue "Modelling, Monitoring, Control and Optimization for Complex Industrial Processes", highlighting recent research advances and emerging research directions in complex industrial processes. This reprint aims to promote the research field and benefit readers from both academic communities and industrial sectors.
Designing similarity functions
The concept of similarity is important in many areas of cognitive science, computer science, and statistics. In machine learning, functions that measure similarity between two instances form the core of instance-based classifiers. Past similarity measures have been primarily based on simple Euclidean distance. As machine learning has matured, it has become obvious that a simple numeric instance representation is insufficient for most domains. Similarity functions for symbolic attributes have been developed, and simple methods for combining these functions with numeric similarity functions were devised. This sequence of events has revealed three important issues, which this thesis addresses.
The first issue is concerned with combining multiple measures of similarity. There is no equivalence between units of numeric similarity and units of symbolic similarity. Existing similarity functions for numeric and symbolic attributes have no common foundation, and so various schemes have been devised to avoid biasing the overall similarity towards one type of attribute. The similarity function design framework proposed by this thesis produces probability distributions that describe the likelihood of transforming between two attribute values. Because common units of probability are employed, similarities may be combined using standard methods. It is empirically shown that the resulting similarity functions treat different attribute types coherently.
The second issue relates to the instance representation itself. The current choice of numeric and symbolic attribute types is insufficient for many domains, in which more complicated representations are required. For example, a domain may require varying numbers of features, or features with structural information. The framework proposed by this thesis is sufficiently general to permit virtually any type of instance representation: all that is required is that a set of basic transformations that operate on the instances be defined. To illustrate the framework’s applicability to different instance representations, several example similarity functions are developed.
The third, and perhaps most important, issue concerns the ability to incorporate domain knowledge within similarity functions. Domain information plays an important part in choosing an instance representation. However, even given an adequate instance representation, domain information is often lost. For example, numeric features that are modulo (such as the time of day) can be perfectly represented as a numeric attribute, but simple linear similarity functions ignore the modulo nature of the attribute. Similarly, symbolic attributes may have inter-symbol relationships that should be captured in the similarity function. The design framework proposed by this thesis allows domain information to be captured in the similarity function, both in the transformation model and in the probability assigned to basic transformations. Empirical results indicate that such domain information improves classifier performance, particularly when training data is limited.
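The framework's central idea, putting every per-attribute similarity in common probabilistic units so they can be combined by standard methods, can be sketched as follows. The particular decay functions, the symbol confusion table, and the product combination rule are hypothetical illustrative choices; note how the modulo-aware time similarity captures exactly the kind of domain knowledge the abstract describes.

```python
# Sketch: per-attribute similarities expressed as transformation
# probabilities, so numeric and symbolic attributes combine coherently.
import math

def numeric_sim(a, b, scale=1.0):
    # Probability-like weight for transforming a into b, decaying with distance.
    return math.exp(-abs(a - b) / scale)

def symbolic_sim(a, b, confusion):
    # Probability of transforming symbol a into symbol b.
    return confusion.get((a, b), 1.0 if a == b else 0.0)

def time_sim(a, b, period=24.0, scale=4.0):
    # Modulo-aware similarity: 23h and 1h are 2 hours apart, not 22.
    d = abs(a - b) % period
    return math.exp(-min(d, period - d) / scale)

def instance_sim(x, y, confusion):
    # Common probabilistic units allow a simple product combination.
    return (numeric_sim(x["temp"], y["temp"], scale=5.0)
            * symbolic_sim(x["color"], y["color"], confusion)
            * time_sim(x["hour"], y["hour"]))

confusion = {("red", "orange"): 0.5, ("orange", "red"): 0.5}
a = {"temp": 20.0, "color": "red", "hour": 23.0}
b = {"temp": 22.0, "color": "orange", "hour": 1.0}
print(instance_sim(a, b, confusion))
```

Because every factor lives on the same probability-like scale, no ad hoc weighting is needed to stop one attribute type from dominating the others, which is the coherence property the first issue is about.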
A Reinforcement Learning-assisted Genetic Programming Algorithm for Team Formation Problem Considering Person-Job Matching
An efficient team is essential for a company to successfully complete new projects. To solve the team formation problem considering person-job matching (TFP-PJM), a 0-1 integer programming model is constructed which considers the effects of both person-job matching and team members' willingness to communicate on team efficiency, with the person-job matching score calculated using intuitionistic fuzzy numbers. Then, a reinforcement learning-assisted genetic programming algorithm (RL-GP) is proposed to enhance the quality of solutions. RL-GP adopts ensemble population strategies: before the population evolves at each generation, the agent selects one of four population search modes according to the information obtained, thus realizing a sound balance of exploration and exploitation. In addition, surrogate models are used in the algorithm to evaluate the formation plans generated by individuals, which speeds up the algorithm's learning process. A series of comparison experiments is then conducted to verify the overall performance of RL-GP and the effectiveness of the improved strategies within the algorithm. The hyper-heuristic rules obtained through efficient learning can be utilized as decision-making aids when forming project teams. This study reveals the advantages of applying reinforcement learning methods, ensemble strategies, and surrogate models to the GP framework. The diversity and intelligent selection of search patterns, along with fast adaptation evaluation, are distinct features that enable RL-GP to be deployed in real-world enterprise environments.
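The agent's choice among four population search modes can be illustrated with a simple epsilon-greedy bandit, where the reward is the fitness improvement a mode yields. The mode names, reward distributions, and epsilon value below are hypothetical; RL-GP's actual state and reward design is richer.

```python
# Epsilon-greedy selection among search modes, with rewards taken as
# fitness improvement (simulated here with a fixed random seed).
import random

def select_mode(values, rng, eps=0.2):
    modes = list(values)
    if rng.random() < eps:
        return rng.choice(modes)                # explore
    return max(modes, key=lambda m: values[m])  # exploit

def update(values, counts, mode, reward):
    counts[mode] += 1
    values[mode] += (reward - values[mode]) / counts[mode]  # running mean

rng = random.Random(0)
modes = ["global", "local", "hybrid", "restart"]
values = {m: 0.0 for m in modes}
counts = {m: 0 for m in modes}

# Pretend the "hybrid" mode tends to yield the largest fitness improvement.
for _ in range(200):
    m = select_mode(values, rng)
    reward = rng.gauss(1.0 if m == "hybrid" else 0.3, 0.1)
    update(values, counts, m, reward)

print(max(values, key=values.get))
```

The epsilon term keeps every mode occasionally sampled, which is the exploration-exploitation balance the abstract attributes to the agent.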
Variational Quantum Time Evolution without the Quantum Geometric Tensor
The real- and imaginary-time evolution of quantum states are powerful tools in physics and chemistry to investigate quantum dynamics, prepare ground states or calculate thermodynamic observables. They also find applications in wider fields such as quantum machine learning or optimization. On near-term devices, variational quantum time evolution is a promising candidate for these tasks, as the required circuit model can be tailored to trade off available device capabilities and approximation accuracy. However, even if the circuits can be reliably executed, variational quantum time evolution algorithms quickly become infeasible for relevant system sizes: they require the calculation of the Quantum Geometric Tensor, whose complexity scales quadratically with the number of parameters in the circuit. In this work, we propose a solution to this scaling problem by leveraging a dual formulation that circumvents the explicit evaluation of the Quantum Geometric Tensor. We demonstrate our algorithm for the time evolution of the Heisenberg Hamiltonian and show that it accurately reproduces the system dynamics at a fraction of the cost of standard variational quantum time evolution algorithms. As an application, we calculate thermodynamic observables with the QMETTS algorithm.
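For context on the quadratic bottleneck, variational time evolution solves a linear system whose matrix is the Quantum Geometric Tensor. In one standard formulation (McLachlan's variational principle for imaginary-time evolution; the notation here is generic, not taken from this paper), the tensor and the equations of motion for the circuit parameters read:

```latex
g_{ij} = \operatorname{Re}\!\left(
  \langle \partial_i \psi(\theta) \mid \partial_j \psi(\theta) \rangle
  - \langle \partial_i \psi(\theta) \mid \psi(\theta) \rangle
    \langle \psi(\theta) \mid \partial_j \psi(\theta) \rangle
\right),
\qquad
\sum_j g_{ij}\, \dot{\theta}_j
  = -\operatorname{Re}\, \langle \partial_i \psi(\theta) \mid H \mid \psi(\theta) \rangle .
```

Estimating every entry $g_{ij}$ requires on the order of $d^2$ circuit evaluations for $d$ parameters, which is the scaling the dual formulation avoids.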
Machine Learning Research Trends in Africa: A 30 Years Overview with Bibliometric Analysis Review
In this paper, a critical bibliometric analysis is conducted, coupled with an extensive literature survey of recent developments and associated applications in machine learning research from an African perspective. The bibliometric analysis covers 2761 machine learning-related documents, of which 98% were articles with at least 482 citations, published in 903 journals during the past 30 years. The documents were retrieved from the Science Citation Index EXPANDED and comprise research publications from 54 African countries between 1993 and 2021. The study visualizes the current landscape and future trends in machine learning research and its applications, in order to facilitate future collaborative research and knowledge exchange among authors from research institutions scattered across the African continent.