391 research outputs found
Robust Federated Training via Collaborative Machine Teaching using Trusted Instances
Federated learning performs distributed model training using local data
hosted by agents. It shares only model parameter updates for iterative
aggregation at the server. Although it is privacy-preserving by design,
federated learning is vulnerable to noise corruption of local agents, as
demonstrated in the previous study on adversarial data poisoning threat against
federated learning systems. Even a single noise-corrupted agent can bias the
model training. In our work, we propose a collaborative and privacy-preserving
machine teaching paradigm with multiple distributed teachers, to improve
robustness of the federated training process against local data corruption. We
assume that each local agent (teacher) has the resources to verify a small
portion of trusted instances, which may not by itself be adequate for
learning. In the proposed collaborative machine teaching method, these trusted
instances guide the distributed agents to jointly select a compact yet
informative training subset from the data they host. Simultaneously, the
agents learn to apply changes of limited magnitude to the selected data
instances, in order to improve the test performance of the federally
trained model despite the training data corruption. Experiments on toy and
real data demonstrate that our approach can identify training set bugs
effectively and suggest appropriate changes to the labels. Our algorithm is a
step toward trustworthy machine learning.
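The claim that a single noise-corrupted agent can bias training can be illustrated with a toy server-side aggregation step. This is a minimal sketch that assumes plain unweighted averaging of parameter vectors; the function name `federated_average` and all numbers are hypothetical and are not taken from the paper.

```python
from typing import List


def federated_average(updates: List[List[float]]) -> List[float]:
    """Coordinate-wise mean of the parameter vectors sent by the agents."""
    n_agents = len(updates)
    return [sum(coords) / n_agents for coords in zip(*updates)]


# Four honest agents report parameters close to (1.0, 1.0) (made-up values).
honest = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.0, 1.0]]
clean_model = federated_average(honest)

# A single corrupted agent reports an outlying vector and shifts the mean.
corrupted_model = federated_average(honest + [[11.0, -9.0]])

print(clean_model, corrupted_model)
```

With unweighted averaging, one outlier among five agents moves every coordinate of the aggregate by a fifth of its deviation, which is the kind of sensitivity the abstract aims to reduce.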
Graph Embedding with Rich Information through Heterogeneous Network
Graph embedding has attracted increasing attention due to its critical
application in social network analysis. Most existing algorithms for graph
embedding rely only on the topology information and fail to use the copious
information in nodes as well as edges. As a result, their performance for many
tasks may not be satisfactory. In this paper, we propose a novel and general
framework of representation learning for graphs with rich text information
through constructing a bipartite heterogeneous network. Specifically, we design
a biased random walk to explore the constructed heterogeneous network with the
notion of flexible neighborhood. The efficacy of our method is demonstrated by
extensive comparison experiments with several baselines on various datasets. It
improves the Micro-F1 and Macro-F1 of node classification by 10% and 7% on the Cora
dataset.
Comment: 9 pages, 7 figures, 4 tables
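For readers unfamiliar with the two reported metrics, the sketch below computes Micro-F1 and Macro-F1 for single-label node classification using their standard definitions. The function names and the toy labels are hypothetical and unrelated to the Cora experiments.

```python
from typing import Hashable, Sequence, Tuple


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); taken as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(
    y_true: Sequence[Hashable], y_pred: Sequence[Hashable]
) -> Tuple[float, float]:
    """Return (micro_f1, macro_f1) for single-label multi-class predictions."""
    labels = set(y_true) | set(y_pred)
    if not labels:
        raise ValueError("empty input")
    per_class = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        per_class.append((tp, fp, fn))
    # Micro-F1 pools the counts over all classes before computing F1.
    micro = f1_from_counts(
        sum(c[0] for c in per_class),
        sum(c[1] for c in per_class),
        sum(c[2] for c in per_class),
    )
    # Macro-F1 averages the per-class F1 scores, weighting classes equally.
    macro = sum(f1_from_counts(*c) for c in per_class) / len(per_class)
    return micro, macro


# Toy example: class 1 is rare and always misclassified.
y_true = [0, 0, 0, 0, 1, 2]
y_pred = [0, 0, 0, 0, 0, 2]
micro, macro = micro_macro_f1(y_true, y_pred)
print(micro, macro)
```

Because Macro-F1 weights every class equally, it drops sharply when a rare class is missed, whereas Micro-F1 is dominated by the frequent classes.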
Coarse Grained Exponential Variational Autoencoders
Variational autoencoders (VAE) often use Gaussian or categorical distributions to
model the inference process. This puts a limit on variational learning because
this simplified assumption does not match the true posterior distribution,
which is usually much more sophisticated. To break this limitation and apply
arbitrary parametric distributions during inference, this paper derives a
\emph{semi-continuous} latent representation, which approximates a continuous
density up to a prescribed precision, and is much easier to analyze than its
continuous counterpart because it is fundamentally discrete. We showcase the
proposition by applying polynomial exponential family distributions as the
posterior, which are universal probability density function generators. Our
experimental results show consistent improvements over commonly used VAE
models.
Weakly-paired Cross-Modal Hashing
Hashing has been widely adopted for large-scale data retrieval in many
domains, due to its low storage cost and high retrieval speed. Existing
cross-modal hashing methods optimistically assume that the correspondence
between training samples across modalities is readily available. This
assumption is unrealistic in practical applications. In addition, these methods
generally require the same number of samples across different modalities, which
restricts their flexibility. We propose a flexible cross-modal hashing approach
(FlexCMH) to learn effective hashing codes from weakly-paired data, whose
correspondence across modalities is partially (or even totally) unknown.
FlexCMH first introduces a clustering-based matching strategy to explore the
local structure of each cluster, and thus to find the potential correspondence
between clusters (and samples therein) across modalities. To reduce the impact
of an incomplete correspondence, it jointly optimizes in a unified objective
function the potential correspondence, the cross-modal hashing functions
derived from the correspondence, and a hashing quantitative loss. An
alternating optimization technique is also proposed to coordinate the
correspondence and hash functions, and to reinforce the reciprocal effects of
the two objectives. Experiments on public multi-modal datasets show that
FlexCMH achieves significantly better results than state-of-the-art methods,
and it indeed offers a high degree of flexibility for practical cross-modal
hashing tasks.
Multi-View Multiple Clustering
Multiple clustering aims at exploring alternative clusterings to organize the
data into meaningful groups from different perspectives. Existing multiple
clustering algorithms are designed for single-view data. We assume that the
individuality and commonality of multi-view data can be leveraged to generate
high-quality and diverse clusterings. To this end, we propose a novel
multi-view multiple clustering (MVMC) algorithm. MVMC first adapts multi-view
self-representation learning to explore the individuality encoding matrices and
the shared commonality matrix of multi-view data. It additionally reduces the
redundancy (i.e., enhancing the individuality) among the matrices using the
Hilbert-Schmidt Independence Criterion (HSIC), and collects shared information
by forcing the shared matrix to be smooth across all views. It then uses matrix
factorization on the individual matrices, along with the shared matrix, to
generate diverse clusterings of high quality. We further extend multiple
co-clustering to multi-view data and propose a solution called multi-view
multiple co-clustering (MVMCC). Our empirical study shows that MVMC (MVMCC) can
exploit multi-view data to generate multiple high-quality and diverse
clusterings (co-clusterings), with superior performance to the state-of-the-art
methods.
Comment: 7 pages, 5 figures, uses ijcai19.st
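For reference, the Hilbert-Schmidt Independence Criterion between two representations is commonly estimated from their $n \times n$ kernel (Gram) matrices as shown below. This is the standard biased empirical estimator and may differ from the exact form used in MVMC; the symbols $K$, $L$, $H$, and $n$ are introduced here for illustration only.

```latex
\[
\mathrm{HSIC}(K, L) = \frac{1}{(n-1)^{2}}\,\operatorname{tr}(K H L H),
\qquad
H = I_n - \frac{1}{n}\,\mathbf{1}_n \mathbf{1}_n^{\top}.
\]
```

A value near zero indicates weak statistical dependence between the two representations, which is why penalizing HSIC between matrices encourages them to be less redundant.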
Addressing Class-Imbalance Problem in Personalized Ranking
Pairwise ranking models have been widely used to address recommendation
problems. The basic idea is to learn the rank of users' preferred items through
separating items into \emph{positive} samples if user-item interactions exist,
and \emph{negative} samples otherwise. Due to the limited number of observable
interactions, pairwise ranking models face serious \emph{class-imbalance}
issues. Our theoretical analysis shows that current sampling-based methods
cause a vertex-level imbalance problem, which drives the norm of learned item
embeddings toward infinity after a certain number of training iterations, and
consequently results in vanishing gradients and affects the model inference
results. We thus propose an efficient \emph{\underline{Vi}tal
\underline{N}egative \underline{S}ampler} (VINS) to alleviate the
class-imbalance issue for pairwise ranking models, in particular for deep
learning models optimized by gradient methods. The core of VINS is a biased
sampler with a rejection probability that tends to accept a negative candidate
with a larger degree weight than the given positive item. Evaluation results on
several real datasets demonstrate that the proposed sampling method speeds up
the training procedure by 30\% to 50\% for ranking models ranging from shallow to
deep, while maintaining and even improving the quality of ranking results in
top-N item recommendation.
Comment: Preprint
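The idea of a degree-biased sampler with a rejection step can be sketched as follows. This is not the VINS algorithm itself: the acceptance rule `min(1, degree[candidate] / degree[positive])`, the uniform proposal, the `max_tries` fallback, and all names are assumptions chosen only to illustrate a sampler that favors negatives whose degree is at least that of the positive item.

```python
import random
from typing import Dict, Hashable, Set


def sample_negative(
    pos_item: Hashable,
    user_items: Set[Hashable],
    degree: Dict[Hashable, int],
    rng: random.Random,
    max_tries: int = 32,
) -> Hashable:
    """Draw a negative item, preferring items at least as popular as pos_item.

    degree[i] is the number of observed interactions of item i; the positive
    item is assumed to have degree >= 1.
    """
    candidates = [i for i in degree if i not in user_items]
    if not candidates:
        raise ValueError("user has interacted with every item")
    cand = candidates[0]
    for _ in range(max_tries):
        cand = rng.choice(candidates)  # uniform proposal over negatives
        accept_prob = min(1.0, degree[cand] / degree[pos_item])
        if rng.random() < accept_prob:  # rejection step (assumed rule)
            return cand
    return cand  # fall back to the last proposal after max_tries rejections


# Hypothetical item degrees; the user has only interacted with item "a".
degree = {"a": 10, "b": 100, "c": 1}
rng = random.Random(0)
draws = [sample_negative("a", {"a"}, degree, rng) for _ in range(2000)]
counts = {item: draws.count(item) for item in set(draws)}
print(counts)
```

Under these assumptions, the popular item "b" is always accepted when proposed, while the unpopular item "c" is accepted only one time in ten, so most sampled negatives have a higher degree than the positive item.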
GESF: A Universal Discriminative Mapping Mechanism for Graph Representation Learning
Graph embedding is a central problem in social network analysis and many
other applications, aiming to learn the vector representation for each node.
While most existing approaches need to specify the neighborhood and the
dependence form on the neighborhood, which may significantly degrade the
flexibility of representation, we propose a novel graph node embedding method
(namely GESF) via the set function technique. Our method can 1) learn an
arbitrary form of representation function from neighborhood, 2) automatically
decide the significance of neighbors at different distances, and 3) be applied
to heterogeneous graph embedding, which may contain multiple types of nodes.
A theoretical guarantee for the representation capability of our method is
proved for general homogeneous and heterogeneous graphs, and evaluation results
on benchmark data sets show that the proposed GESF outperforms the
state-of-the-art approaches on producing node vectors for classification tasks.
Comment: 18 pages
Risk Convergence of Centered Kernel Ridge Regression with Large Dimensional Data
This paper carries out a large dimensional analysis of a variation of kernel
ridge regression that we call \emph{centered kernel ridge regression} (CKRR),
also known in the literature as kernel ridge regression with offset. This
modified technique is obtained by accounting for the bias in the regression
problem, resulting in the standard kernel ridge regression but with \emph{centered}
kernels. The analysis is carried out under the assumption that the data is
drawn from a Gaussian distribution and heavily relies on tools from random
matrix theory (RMT). Under the regime in which the data dimension and the
training size grow infinitely large with fixed ratio and under some mild
assumptions controlling the data statistics, we show that both the empirical
and the prediction risks converge to deterministic quantities that describe
in closed form the performance of CKRR in terms of the data statistics
and dimensions. Inspired by this theoretical result, we subsequently build a
consistent estimator of the prediction risk based on the training data, which
allows the design parameters to be tuned optimally. A key insight of the proposed
analysis is the fact that asymptotically a large class of kernels achieves the
same minimum prediction risk. This insight is validated with both synthetic and
real data.
Comment: Submitted to IEEE Transactions on Signal Processing
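As a reminder of what kernel centering means, the usual empirical centering of an $n \times n$ kernel matrix $K$ is written out below. This is the textbook construction and may differ in detail from the centering analyzed in the paper; the symbols $K_c$, $P$, and $\mathbf{1}_n$ are introduced here for illustration.

```latex
\[
K_c = P K P, \qquad P = I_n - \frac{1}{n}\,\mathbf{1}_n \mathbf{1}_n^{\top},
\]
\[
(K_c)_{ij} = K_{ij}
  - \frac{1}{n}\sum_{k=1}^{n} K_{kj}
  - \frac{1}{n}\sum_{l=1}^{n} K_{il}
  + \frac{1}{n^{2}}\sum_{k=1}^{n}\sum_{l=1}^{n} K_{kl}.
\]
```

Centering removes the mean of the feature vectors in the implicit feature space, which corresponds to fitting an offset (bias) term alongside the kernel expansion.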
Tracking Influential Nodes in Time-Decaying Dynamic Interaction Networks
Identifying influential nodes that can jointly trigger the maximum influence
spread in networks is a fundamental problem in many applications such as viral
marketing, online advertising, and disease control. Most existing studies
assume that social influence is static and they fail to capture the dynamics of
influence in reality. In this work, we address the dynamic influence challenge
by designing efficient streaming methods that can identify influential nodes
from highly dynamic node interaction streams. We first propose a general
time-decaying dynamic interaction network (TDN) model to model node interaction
streams with the ability to smoothly discard outdated data. Based on the TDN
model, we design three algorithms, i.e., SieveADN, BasicReduction, and
HistApprox. SieveADN efficiently identifies influential nodes from a special kind of
TDN. BasicReduction uses SieveADN as a basic building block to
identify influential nodes from general TDNs. HistApprox significantly improves
the efficiency of BasicReduction. More importantly, we theoretically show that
all three algorithms enjoy constant factor approximation guarantees.
Experiments conducted on various real interaction datasets demonstrate that our
approach finds near-optimal solutions with speed at least to times
faster than baseline methods.
Comment: 14 pages, 15 figures
ActiveHNE: Active Heterogeneous Network Embedding
Heterogeneous network embedding (HNE) is a challenging task due to the
diverse node types and/or diverse relationships between nodes. Existing HNE
methods are typically unsupervised. To maximize the benefit of utilizing the
rare and valuable supervised information in HNEs, we develop a novel Active
Heterogeneous Network Embedding (ActiveHNE) framework, which includes two
components: Discriminative Heterogeneous Network Embedding (DHNE) and Active
Query in Heterogeneous Networks (AQHN). In DHNE, we introduce a novel
semi-supervised heterogeneous network embedding method based on a graph
convolutional neural network. In AQHN, we first introduce three active
selection strategies based on uncertainty and representativeness, and then
derive a batch selection method that assembles these strategies using a
multi-armed bandit mechanism. ActiveHNE aims at improving the performance of
HNE by feeding the most valuable supervision obtained by AQHN into DHNE.
Experiments on public datasets demonstrate the effectiveness of ActiveHNE and
its advantage in reducing the query cost.
Comment: Accepted to IJCAI201