Independent Asymmetric Embedding for Cascade Prediction on Social Networks
Predicting information diffusion on social networks has great
practical significance in marketing and public opinion control. Cascade
prediction aims to predict the individuals who will potentially repost the
message on the social network. One class of methods either exploits
demographic, structural, and temporal features for prediction, or explicitly
relies on particular information diffusion models. The other class of models is
fully data-driven and does not require a global network structure. Thus, many
diffusion prediction models based on network embedding have been proposed. These
models embed the users into the latent space using their cascade information,
but do not consider the interference among users when embedding. In
this paper, we propose an independent asymmetric embedding method to learn
social embedding for cascade prediction. Different from existing methods, our
method embeds each individual into one latent influence space and multiple
latent susceptibility spaces. Furthermore, our method captures the
co-occurrence regularities of user combinations in cascades to improve
computational efficiency. The results of extensive experiments conducted on
real-world datasets verify both the predictive accuracy and cost-effectiveness
of our approach.
Learning user-specific latent influence and susceptibility from information cascades
Predicting cascade dynamics has important implications for understanding
information propagation and launching viral marketing. Previous works mainly
adopt a pair-wise approach, modeling the propagation probability between pairs of
users using n^2 independent parameters for n users. Consequently, these models
suffer from severe overfitting, especially for pairs of users without
direct interactions, limiting their prediction accuracy. Here we propose to
model the cascade dynamics by learning two low-dimensional user-specific
vectors from observed cascades, capturing their influence and susceptibility
respectively. This model requires far fewer parameters and can thus combat
overfitting. Moreover, this model can naturally capture
context-dependent factors like cumulative effect in information propagation.
Extensive experiments on a synthetic dataset and a large-scale microblogging
dataset demonstrate that this model outperforms the existing pair-wise models
at predicting cascade dynamics, cascade size, and "who will be retweeted".
Comment: from the 29th AAAI Conference on Artificial Intelligence (AAAI-2015)
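The parameter saving described in this abstract can be illustrated with a short sketch. The counts follow directly from the text (n^2 pairwise parameters versus two low-dimensional vectors per user). The sigmoid-of-dot-product score is only an assumption made here for illustration, since the abstract does not state the functional form, and all function names are hypothetical.

```python
import math


def pairwise_param_count(n_users: int) -> int:
    """One independent propagation probability per ordered pair of users."""
    return n_users * n_users


def latent_param_count(n_users: int, dim: int) -> int:
    """Two dim-dimensional vectors (influence, susceptibility) per user."""
    return 2 * n_users * dim


def propagation_score(influence_u, susceptibility_v):
    """Assumed link function: sigmoid of the dot product of the sender's
    influence vector and the receiver's susceptibility vector.
    The abstract does not specify this form."""
    dot = sum(a * b for a, b in zip(influence_u, susceptibility_v))
    return 1.0 / (1.0 + math.exp(-dot))


# Toy comparison for 10,000 users and 16-dimensional vectors.
n, d = 10_000, 16
print(pairwise_param_count(n), latent_param_count(n, d))
```

With these toy sizes the pairwise model needs 100,000,000 parameters, while the latent model needs 320,000. The latent model also yields a score for pairs of users that never interacted directly.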
Machine learning in the real world with multiple objectives
Machine learning (ML) is ubiquitous in real-world applications. Existing ML systems are typically based on optimizing a single quality metric such as prediction accuracy. Such metrics do not fully align with the design constraints encountered in practice, such as computation, latency, fairness, and acquisition costs. In this thesis, we develop ML methods for optimizing prediction accuracy while accounting for these real-world constraints. In particular, we introduce multi-objective learning in two different setups: resource-efficient prediction and algorithmic fairness in language models.
First, we focus on decreasing the test-time computational costs of prediction systems. Budget constraints arise in many machine learning problems. Computational costs limit the use of many models on small devices such as IoT devices or mobile phones, and increase energy consumption in cloud computing. We design systems that allow on-the-fly modification of the prediction model for each input sample. These sample-adaptive systems leverage the wide variability in sample complexity: we learn policies that select cheap models for low-complexity instances and use descriptive models only for complex ones. We use a multi-objective approach that minimizes the system cost while preserving predictive accuracy. We demonstrate significant speed-ups in the fields of computer vision, structured prediction, natural language processing, and deep learning.
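The sample-adaptive idea in the paragraph above can be sketched as a two-stage predictor. This is a minimal illustration, not the thesis's method: the thesis learns its selection policies, whereas this sketch substitutes a fixed confidence threshold. The toy models and all names are hypothetical.

```python
from typing import Callable, Sequence, Tuple

# A model maps a feature vector to (label, confidence in [0, 1]).
Model = Callable[[Sequence[float]], Tuple[int, float]]


def adaptive_predict(x: Sequence[float], cheap_model: Model,
                     costly_model: Model, threshold: float = 0.9):
    """Use the cheap model when it is confident, else fall back to the
    costly model. The fixed threshold is an assumed stand-in for a
    learned selection policy."""
    label, confidence = cheap_model(x)
    if confidence >= threshold:
        return label, "cheap"
    label, _ = costly_model(x)
    return label, "costly"


def toy_cheap_model(x):
    # Looks only at the first feature; confidence grows with its magnitude.
    return (1 if x[0] >= 0 else 0), min(1.0, abs(x[0]))


def toy_costly_model(x):
    # Looks at every feature (stands in for a more expensive model).
    return (1 if sum(x) >= 0 else 0), 1.0


print(adaptive_predict([2.0, -5.0], toy_cheap_model, toy_costly_model))
print(adaptive_predict([0.1, -5.0], toy_cheap_model, toy_costly_model))
```

Raising `threshold` sends more samples to the costly model. This is the cost versus accuracy trade-off that a multi-objective formulation would balance.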
In the context of fairness, we first demonstrate that a naive application of ML methods runs the risk of amplifying social biases present in data. This danger is particularly acute for methods based on word embeddings, which are gaining importance in many natural language processing applications of ML. We show that word embeddings trained on Google News articles exhibit female/male gender stereotypes. We demonstrate that, geometrically, gender bias is captured by unique directions in the word embedding vector space. We formulate an empirical risk objective with fairness constraints to remove stereotypes from embeddings while maintaining desired associations. Using crowd-worker evaluation as well as standard benchmarks, we empirically demonstrate that our algorithms significantly reduce gender bias in embeddings while preserving their useful properties, such as the ability to cluster related concepts.
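The geometric claim above, that bias is captured by a direction in the embedding space, can be illustrated with a plain projection. This sketch is an assumption-laden simplification: it removes the component of a vector along one direction, which is not the constrained empirical risk objective the abstract describes. The three-dimensional vectors are made-up toy values, not real embeddings, and all names are hypothetical.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]


def bias_direction(vec_a, vec_b):
    """Unit vector along the difference of two paired word vectors."""
    return normalize([a - b for a, b in zip(vec_a, vec_b)])


def remove_component(v, direction):
    """Subtract the projection of v onto a unit-length direction."""
    proj = sum(a * d for a, d in zip(v, direction))
    return [a - proj * d for a, d in zip(v, direction)]


# Toy 3-d vectors (illustrative values only).
she = [1.0, 0.2, 0.0]
he = [-1.0, 0.2, 0.0]
word = [0.5, 0.3, 0.7]

g = bias_direction(she, he)
neutral = remove_component(word, g)
print(neutral)
```

After the projection, `neutral` has no component along `g`, while its other coordinates are unchanged.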
Node Embedding over Temporal Graphs
In this work, we present a method for node embedding in temporal graphs. We
propose an algorithm that learns the evolution of a temporal graph's nodes and
edges over time and incorporates these dynamics in a temporal node embedding
framework for different graph prediction tasks. We present a joint loss
function that creates a temporal embedding of a node by learning to combine its
historical temporal embeddings, such that it is optimized for a given task (e.g.,
link prediction). The algorithm is initialized using static node embeddings,
which are then aligned over the representations of a node at different time
points, and eventually adapted for the given task in a joint optimization. We
evaluate the effectiveness of our approach over a variety of temporal graphs
for the two fundamental tasks of temporal link prediction and multi-label node
classification, comparing to competitive baselines and algorithmic
alternatives. Our algorithm shows performance improvements across many of the
datasets and baselines and is found particularly effective for graphs that are
less cohesive, with a lower clustering coefficient.
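The step of combining a node's historical embeddings can be sketched as a weighted sum. This is only an illustration under stated assumptions: the abstract does not say how the combination is parameterized, so the softmax weighting over per-time-step scores is a choice made here. In a real system those scores would be learned jointly with the task loss. All names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_history(history, scores):
    """Weighted sum of a node's per-time-step embeddings.

    history: list of embeddings (one per time step, already aligned)
    scores:  one score per time step (assumed to be learned elsewhere)
    """
    weights = softmax(scores)
    dim = len(history[0])
    return [sum(w * emb[i] for w, emb in zip(weights, history))
            for i in range(dim)]


# Two time steps with equal scores give the plain average.
print(combine_history([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]))
```

Giving a later time step a higher score shifts the combined embedding toward the node's more recent representation.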