Multilabel Consensus Classification
In the era of big data, a large amount of noisy and incomplete data can be
collected from multiple sources for prediction tasks. Combining multiple models
or data sources helps to counteract the effects of low data quality and the
bias of any single model or data source, and thus can improve the robustness
and the performance of predictive models. Out of privacy, storage and bandwidth
considerations, in certain circumstances one has to combine the predictions
from multiple models or data sources to obtain the final predictions without
accessing the raw data. Consensus-based prediction combination algorithms are
effective for such situations. However, current research on prediction
combination focuses on the single label setting, where an instance can have one
and only one label. In practice, however, data are often multilabeled, so
that more than one label has to be predicted at the same time. Directly
applying existing prediction combination methods to multilabel settings
can lead to degraded performance. In this paper, we address the challenges
of combining predictions from multiple multilabel classifiers and propose two
novel algorithms, MLCM-r (MultiLabel Consensus Maximization for ranking) and
MLCM-a (MLCM for microAUC). These algorithms can capture the label correlations
that are common in multilabel classification and optimize the corresponding
performance metrics. Experimental results on popular multilabel classification
tasks verify the theoretical analysis and the effectiveness of the proposed
methods.
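As a point of reference for the setting described above, the following is a minimal sketch of combining multilabel score matrices from several models without access to the raw data, and of scoring the result with micro-averaged AUC. It is a plain averaging baseline, not MLCM-r or MLCM-a; the function names `average_predictions` and `micro_auc` and the toy scores are assumptions made for illustration.

```python
def average_predictions(model_scores):
    """Average per-label scores across models (a naive baseline, not MLCM).

    model_scores: list of score matrices, one per model, each shaped
    [n_instances][n_labels].
    """
    n_models = len(model_scores)
    n_instances = len(model_scores[0])
    n_labels = len(model_scores[0][0])
    return [
        [sum(m[i][j] for m in model_scores) / n_models for j in range(n_labels)]
        for i in range(n_instances)
    ]


def micro_auc(y_true, scores):
    """Micro-averaged AUC: pool every (instance, label) pair, then compute
    the fraction of (positive, negative) pairs ranked correctly (ties = 0.5)."""
    pooled = [
        (s, t)
        for row_s, row_t in zip(scores, y_true)
        for s, t in zip(row_s, row_t)
    ]
    pos = [s for s, t in pooled if t == 1]
    neg = [s for s, t in pooled if t == 0]
    if not pos or not neg:
        raise ValueError("micro-AUC needs at least one positive and one negative pair")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 2 instances, 3 labels, 2 models.
y_true = [[1, 0, 1], [0, 1, 0]]
model_a = [[0.9, 0.5, 0.6], [0.1, 0.4, 0.3]]
model_b = [[0.7, 0.4, 0.8], [0.3, 0.8, 0.1]]
combined = average_predictions([model_a, model_b])
```

Because averaging treats each label independently, it ignores the label correlations that the abstract identifies as the main difficulty in the multilabel case.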
Dynamic aspiration based on Win-Stay-Lose-Learn rule in Spatial Prisoner's Dilemma Game
The prisoner's dilemma game is the most commonly used model in spatial
evolutionary game theory and is considered a paradigm for portraying competition
among selfish individuals. In recent years, Win-Stay-Lose-Learn, a strategy
updating rule based on aspiration, has been shown to be an effective model for
promoting cooperation in the spatial prisoner's dilemma game, which has drawn
considerable attention to aspiration. However, many studies assume that an
individual's aspiration is fixed, which is inconsistent with recent results from
psychology. In this paper, according to Expected Value Theory and Achievement
Motivation Theory, we propose a dynamic aspiration model based on the
Win-Stay-Lose-Learn rule in which an individual's aspiration is driven by its
payoff. We find that dynamic aspiration has a significant impact on the
evolution process, and that different initial aspirations lead to different
outcomes, called Stable Coexistence under Low Aspiration, Dependent Coexistence
under Moderate Aspiration, and Defection Explosion under High Aspiration,
respectively. Furthermore, we analyze in depth the local structures that allow
cooperators to persist or defectors to expand, as well as the evolution
process under different parameters, including strategy and aspiration. As a
result, we identify the intrinsic structures that lead to defectors' expansion
and cooperators' survival in the different evolution processes, which gives a
clearer understanding of the evolution. Compared with the fixed
aspiration model, dynamic aspiration offers a more satisfactory explanation
of the laws of population evolution and can promote a deeper understanding of
the prisoner's dilemma.
Comment: 17 pages, 13 figures
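The update logic described in this abstract can be sketched for a single player as follows. The abstract does not give the payoff matrix, the learning step, or the aspiration dynamics, so everything concrete here is an assumption: the weak prisoner's dilemma payoffs (R = 1, T = b, S = P = 0), unconditional copying of a random neighbor when dissatisfied, and the hypothetical `update_aspiration` rule that moves aspiration toward the latest payoff.

```python
import random


def payoff(strategy, neighbor_strategies, b):
    """Total payoff against all neighbors in a weak prisoner's dilemma
    (assumed parameterization: R = 1, T = b, S = P = 0).
    Strategies are 'C' (cooperate) or 'D' (defect)."""
    n_cooperators = sum(1 for s in neighbor_strategies if s == 'C')
    return n_cooperators * (1.0 if strategy == 'C' else b)


def wsll_update(strategy, total_payoff, aspiration, neighbor_strategies, rng):
    """Win-stay: keep the strategy if the payoff meets the aspiration.
    Lose-learn: otherwise copy a randomly chosen neighbor (a simplification;
    the learning step may be probabilistic in the actual model)."""
    if total_payoff >= aspiration:
        return strategy
    return rng.choice(neighbor_strategies)


def update_aspiration(aspiration, total_payoff, rate):
    """Hypothetical dynamic-aspiration rule: move the aspiration a fraction
    `rate` of the way toward the payoff just received."""
    return aspiration + rate * (total_payoff - aspiration)


# One step for a cooperator with four lattice neighbors (toy values).
rng = random.Random(0)
neighbors = ['C', 'D', 'C', 'D']
p = payoff('C', neighbors, b=1.5)
next_strategy = wsll_update('C', p, aspiration=1.0,
                            neighbor_strategies=neighbors, rng=rng)
next_aspiration = update_aspiration(1.0, p, rate=0.5)
```

Passing the random generator in explicitly keeps a simulation reproducible; a full lattice model would apply these three functions to every site in each round.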
Large-Scale Multi-Label Learning with Incomplete Label Assignments
Multi-label learning deals with classification problems in which each
instance can be assigned multiple labels simultaneously. Conventional
multi-label learning approaches mainly focus on exploiting label correlations.
It is usually assumed, explicitly or implicitly, that the label sets for
training instances are fully labeled without any missing labels. However, in
many real-world multi-label datasets, the label assignments for training
instances can be incomplete. Some ground-truth labels can be missed by the
labeler and left out of the label set. This problem is especially typical when
the number of instances is very large and the labeling cost is very high, which
makes it almost impossible to get a fully labeled training set. In this paper, we study
the problem of large-scale multi-label learning with incomplete label
assignments. We propose an approach, called MPU, based upon positive and
unlabeled stochastic gradient descent and stacked models. Unlike prior work,
our method can effectively and efficiently account for missing labels and label
correlations simultaneously, and it scales well, with time complexity linear
in the size of the data. Extensive experiments on two real-world
multi-label datasets show that our MPU model consistently outperforms other
commonly used baselines.
Efficient Link Prediction in Continuous-Time Dynamic Networks using Optimal Transmission and Metropolis Hastings Sampling
Efficient link prediction in continuous-time dynamic networks is a
challenging problem that has attracted much research attention in recent years.
A widely used approach to dynamic network link prediction is to extract the
local structure of the target link through temporal random walk on the network
and learn node features using a coding model. However, this approach often
assumes that candidate temporal neighbors follow certain types of
distributions, which may be inappropriate for real-world networks and can
therefore incur information loss. To address this limitation, we propose a
framework for continuous-time dynamic networks based on Optimal Transmission (OT) and
Metropolis Hastings (MH) sampling (COM). Specifically, we use optimal
transmission theory to calculate the Wasserstein distance between the current
node and the time-valid candidate neighbors to minimize information loss in
node information propagation. Additionally, we employ the MH algorithm to
obtain higher-order structural relationships in the vicinity of the target
link, as it is a Markov Chain Monte Carlo method and can flexibly simulate
target distributions with complex patterns. We demonstrate the effectiveness of
our proposed method through experiments on eight datasets from different
fields.
Comment: 11 pages, 7 figures
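To illustrate why Metropolis Hastings sampling can follow a target distribution without assuming its shape, here is a minimal sketch of a generic sampler over a discrete set of candidates. It is not the COM framework: the unnormalized `weights` stand in for whatever neighbor scores a model would supply, and the function name and uniform proposal are assumptions.

```python
import random


def metropolis_hastings(weights, n_samples, rng, burn_in=1000):
    """Draw indices with probability proportional to `weights` (all > 0).

    Uses a symmetric uniform proposal, so the acceptance probability
    reduces to min(1, weights[proposal] / weights[current]). Only weight
    ratios are needed; the normalizing constant never is.
    """
    n = len(weights)
    current = rng.randrange(n)
    samples = []
    for step in range(burn_in + n_samples):
        proposal = rng.randrange(n)
        accept_prob = min(1.0, weights[proposal] / weights[current])
        if rng.random() < accept_prob:
            current = proposal
        if step >= burn_in:
            samples.append(current)
    return samples


# Toy target over three candidate neighbors (hypothetical scores).
samples = metropolis_hastings([1.0, 2.0, 7.0], 20000, random.Random(0))
frequency_of_last = samples.count(2) / len(samples)
```

With enough samples, the empirical frequency of each index approaches its normalized weight, here roughly 0.1, 0.2, and 0.7.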
Explaining Latent Factor Models for Recommendation with Influence Functions
Latent factor models (LFMs) such as matrix factorization achieve
state-of-the-art performance among various Collaborative Filtering (CF)
approaches for recommendation. Despite the high recommendation accuracy of
LFMs, a critical issue to be resolved is the lack of explainability. Extensive
efforts have been made in the literature to incorporate explainability into
LFMs. However, existing approaches either rely on auxiliary information that
may not be available in practice, or fail to provide easy-to-understand
explanations. In this paper, we propose a fast influence analysis method named
FIA, which equips LFMs with explicit neighbor-style explanations using the
technique of influence functions, which stems from robust statistics. We first
describe how to apply influence functions to LFMs to deliver neighbor-style
explanations. Then we develop a novel influence computation algorithm for
matrix factorization with high efficiency. We further extend it to the more
general neural collaborative filtering and introduce an approximation algorithm
to accelerate influence analysis over neural network models. Experimental
results on real datasets demonstrate the correctness, efficiency and usefulness
of our proposed method.
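The general idea of an influence function applied to matrix factorization can be sketched as follows. This is not the FIA algorithm: it assumes two-dimensional latent factors, item factors held fixed, and a ridge-regularized squared loss for a single user's vector, and all function names are hypothetical. Under those assumptions, the estimated effect of removing one training rating on a prediction is the target item's factors dotted with H^{-1} times that rating's loss gradient, where H is the Hessian of the user's objective.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def solve2(H, b):
    """Solve the 2x2 linear system H x = b by Cramer's rule."""
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    return [(b[0] * H[1][1] - H[0][1] * b[1]) / det,
            (H[0][0] * b[1] - H[1][0] * b[0]) / det]


def fit_user(items, ratings, lam):
    """Minimize sum_j 0.5*(r_j - p.q_j)^2 + 0.5*lam*|p|^2 over the user
    vector p, with item factors q_j fixed. Returns (p, Hessian)."""
    H = [[lam, 0.0], [0.0, lam]]
    b = [0.0, 0.0]
    for q, r in zip(items, ratings):
        for i in range(2):
            b[i] += r * q[i]
            for k in range(2):
                H[i][k] += q[i] * q[k]
    return solve2(H, b), H


def influence_on_prediction(items, ratings, lam, j, target):
    """First-order estimate of the change in the predicted score for
    `target` if training rating j were removed."""
    p, H = fit_user(items, ratings, lam)
    residual = ratings[j] - dot(p, items[j])
    grad = [-residual * x for x in items[j]]  # gradient of the j-th loss term
    delta_p = solve2(H, grad)  # removal = upweighting by -1: delta_p ~ H^{-1} grad
    return dot(target, delta_p)


# Toy data (hypothetical): three rated items and one target item.
items = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ratings = [5.0, 1.0, 4.0]
target = [1.0, 0.5]
estimate = influence_on_prediction(items, ratings, 0.1, 2, target)
```

Ranking a user's past ratings by the magnitude of such estimates gives a list of the interactions that most affected a recommendation, which is the kind of neighbor-style explanation the abstract refers to.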