4 research outputs found
Multi-Task Kernel Null-Space for One-Class Classification
One-class kernel spectral regression (OC-KSR), the regression-based
formulation of the kernel null-space approach, has been found to be an effective
Fisher-criterion-based methodology for one-class classification (OCC),
achieving state-of-the-art performance while providing relatively high
robustness against data corruption. This work extends
the OC-KSR methodology to a multi-task setting where multiple one-class
problems share information for improved performance. By viewing the multi-task
structure learning problem as one of compositional function learning, first,
the OC-KSR method is extended to learn multiple tasks' structure
\textit{linearly} by posing it as an instantiation of the separable kernel
learning problem in a vector-valued reproducing kernel Hilbert space where an
output kernel encodes tasks' structure while another kernel captures input
similarities. Next, a non-linear structure learning mechanism is proposed which
captures multiple tasks' relationships \textit{non-linearly} via an output
kernel. The non-linear structure learning method is then extended to a sparse
setting where different tasks compete in an output composition mechanism,
leading to a sparse non-linear structure among multiple problems. Through
extensive experiments on different data sets, the merits of the proposed
multi-task kernel null-space techniques are verified against the baseline as
well as other existing multi-task one-class learning techniques.
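As a rough illustration of the linear structure-learning variant (notation assumed here rather than taken from the work): in a vector-valued reproducing kernel Hilbert space with a separable kernel, the joint kernel over input-task pairs factorises as
$$ K\big((x,s),(x',t)\big) \;=\; k(x,x')\,B_{st}, $$
where $k$ is the input kernel and $B$ is the output kernel encoding inter-task relationships; the predictor for task $t$ then takes the form $f_t(x)=\sum_i k(x,x_i)\,(B\,c_i)_t$, so the tasks' structure enters only through $B$.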
Anomaly Detection with Domain Adaptation
We study the problem of semi-supervised anomaly detection with domain
adaptation. Given a set of normal data from a source domain and a limited
number of normal examples from a target domain, the goal is to have a
well-performing anomaly detector in the target domain. To solve this problem,
we propose Invariant Representation Anomaly Detection (IRAD), which first
learns to extract a domain-invariant representation. The extraction is achieved
by an across-domain encoder trained jointly with source-specific encoders and
generators via adversarial learning. An anomaly detector is then trained using
the learnt representations. We evaluate IRAD extensively on digit image
datasets (MNIST, USPS, and SVHN) and the Office-Home object recognition dataset.
Experimental results show that IRAD outperforms baseline models by a wide
margin across different datasets. We derive a theoretical lower bound on the
joint error, which explains the performance decay from overtraining, as well as
an upper bound on the generalization error.
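As a loose sketch of the kind of objective such adversarial invariant-representation learning implies (the notation is assumed, not the paper's): with an across-domain encoder $E$, a source-specific encoder $E_s$, a generator $G$, and a domain discriminator $D$ over source data $\mathcal{S}$ and target data $\mathcal{T}$,
$$ \min_{E,E_s,G}\;\max_{D}\;\; \mathbb{E}_{x\sim\mathcal{S}}\big\|x-G\big(E(x),E_s(x)\big)\big\|^2 \;+\; \lambda\Big(\mathbb{E}_{x\sim\mathcal{S}}\log D\big(E(x)\big)+\mathbb{E}_{x'\sim\mathcal{T}}\log\big(1-D\big(E(x')\big)\big)\Big), $$
after which a one-class detector can be fitted on the invariant representations $E(\cdot)$ of the normal training data.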
Meta-Learning for Relative Density-Ratio Estimation
The ratio of two probability densities, called a density-ratio, is a vital
quantity in machine learning. In particular, a relative density-ratio, which is
a bounded extension of the density-ratio, has received much attention due to
its stability and has been used in various applications such as outlier
detection and dataset comparison. Existing methods for (relative) density-ratio
estimation (DRE) require many instances from both densities. However,
sufficient instances are often unavailable in practice. In this paper, we
propose a meta-learning method for relative DRE, which estimates the relative
density-ratio from a few instances by using knowledge in related datasets.
Specifically, given two datasets that consist of a few instances, our model
extracts the datasets' information by using neural networks and uses it to
obtain instance embeddings appropriate for relative DRE. We model the
relative density-ratio by a linear model on the embedding space, whose global
optimum can be obtained in closed form. The closed-form
solution enables fast and effective adaptation to a few instances, and its
differentiability enables us to train our model such that the expected test
error for relative DRE can be explicitly minimized after adapting to a few
instances. We empirically demonstrate the effectiveness of the proposed method
by using three problems: relative DRE, dataset comparison, and outlier
detection.
Comment: 17 pages
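For context, the relative density-ratio and the closed-form fit admitted by a linear model on a learnt embedding $\phi(\cdot)$ can be sketched as follows (RuLSIF-style notation, assumed rather than quoted from the paper): with samples $\{x^p_i\}_{i=1}^{n_p}\sim p$ and $\{x^q_j\}_{j=1}^{n_q}\sim q$,
$$ r_\alpha(x)=\frac{p(x)}{\alpha\,p(x)+(1-\alpha)\,q(x)}\;\le\;\frac{1}{\alpha}, \qquad \hat r_\alpha(x)=\hat\theta^\top\phi(x), $$
$$ \hat\theta=\Big(\alpha\,\widehat H_p+(1-\alpha)\,\widehat H_q+\lambda I\Big)^{-1}\widehat h_p, \qquad \widehat H_p=\tfrac{1}{n_p}\sum_i\phi(x^p_i)\,\phi(x^p_i)^\top,\;\; \widehat H_q=\tfrac{1}{n_q}\sum_j\phi(x^q_j)\,\phi(x^q_j)^\top,\;\; \widehat h_p=\tfrac{1}{n_p}\sum_i\phi(x^p_i). $$
Because $\hat\theta$ is an explicit, differentiable function of the embeddings, gradients can flow through this adaptation step during meta-training.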
On-edge Multi-task Transfer Learning: Model and Practice with Data-driven Task Allocation
Data scarcity is a common problem on edge devices, for which transfer
learning is a widely suggested remedy. Nevertheless, transfer learning
imposes a heavy computational burden on resource-constrained edge devices.
Existing task allocation works usually assume all submitted tasks are equally
important, leading to inefficient resource allocation at a task level when
directly applied in Multi-task Transfer Learning (MTL). To address these
issues, we first reveal that it is crucial to measure the impact of tasks on
overall decision performance improvement and quantify \emph{task importance}.
We then show that task allocation with task importance for MTL (TATIM) is a
variant of the NP-complete knapsack problem, and that the complicated computation
needed to solve it must be conducted repeatedly under varying contexts.
To solve TATIM with high computational efficiency, we propose a Data-driven
Cooperative Task Allocation (DCTA) approach. Finally, we evaluate the
performance of DCTA not only through a trace-driven simulation, but also through a new
comprehensive real-world AIOps case study that bridges model and practice via a
new architecture and main-component design within the AIOps system. Extensive
experiments show that our DCTA reduces processing time by a factor of 3.24 and saves
48.4\% of energy consumption compared with the state of the art when solving
TATIM.
Comment: 15 pages, published in IEEE Transactions on Parallel and Distributed Systems, vol. 31, no. 6, June 2020
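As a rough sketch of the knapsack-style structure behind TATIM (symbols assumed, not taken from the paper): with task importances $v_i$, per-task resource costs $c_i$, and an edge resource budget $C$, each allocation round amounts to
$$ \max_{z\in\{0,1\}^n}\;\sum_{i=1}^{n} v_i\,z_i \quad \text{s.t.}\quad \sum_{i=1}^{n} c_i\,z_i \le C, $$
which would have to be re-solved whenever the importances, costs, or budget change; DCTA targets exactly this repeated, context-dependent computation.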