The Complexity of Finding Effectors
The NP-hard EFFECTORS problem on directed graphs is motivated by applications
in network mining, particularly concerning the analysis of probabilistic
information-propagation processes in social networks. In the corresponding
model the arcs carry probabilities and there is a probabilistic diffusion
process activating nodes by neighboring activated nodes with probabilities as
specified by the arcs. The point is to explain a given network activation state
as well as possible by using a minimum number of "effector nodes"; these are
selected before the activation process starts.
We correct, complement, and extend previous work from the data mining
community by a more thorough computational complexity analysis of EFFECTORS,
identifying both tractable and intractable cases. To this end, we also exploit
a parameterization measuring the "degree of randomness" (the number of "really"
probabilistic arcs) which might prove useful for analyzing other probabilistic
network diffusion problems as well.
Comment: 28 pages
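The diffusion model described in this abstract can be illustrated with a small simulation. The sketch below assumes an independent-cascade-style process, in which each newly activated node gets one chance to activate each out-neighbor with the probability on the arc. It also assumes that "explaining" a target state means minimizing the expected number of nodes whose state differs from it. The function names and the Monte Carlo estimate are illustrative assumptions, not the authors' algorithm.

```python
import random

def simulate_cascade(arcs, effectors, rng):
    """Run one probabilistic diffusion starting from the effector nodes.

    arcs maps a node to a list of (neighbor, probability) pairs.
    Each newly activated node tries once to activate each out-neighbor.
    """
    active = set(effectors)
    frontier = list(effectors)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p in arcs.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active

def activation_probabilities(arcs, nodes, effectors, runs=2000, seed=0):
    """Estimate each node's activation probability by repeated simulation."""
    rng = random.Random(seed)
    counts = {v: 0 for v in nodes}
    for _ in range(runs):
        for v in simulate_cascade(arcs, effectors, rng):
            counts[v] += 1
    return {v: counts[v] / runs for v in nodes}

def expected_mismatch(probs, target_active):
    """Expected number of nodes whose state differs from the target state."""
    return sum((1 - p) if v in target_active else p for v, p in probs.items())
```

Under these assumptions, a candidate effector set is scored by calling `activation_probabilities` and passing the result to `expected_mismatch` together with the observed set of active nodes; a lower score means a better explanation.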
FreePSI: an alignment-free approach to estimating exon-inclusion ratios without a reference transcriptome.
Alternative splicing plays an important role in many cellular processes of eukaryotic organisms. The exon-inclusion ratio, also known as percent spliced in, is often regarded as one of the most effective measures of alternative splicing events. The existing methods for estimating exon-inclusion ratios at the genome scale all require a reference transcriptome. In this paper, we propose an alignment-free method, FreePSI, to perform genome-wide estimation of exon-inclusion ratios from RNA-Seq data without relying on the guidance of a reference transcriptome. It uses a novel probabilistic generative model based on k-mer profiles to quantify the exon-inclusion ratios at the genome scale, and it solves the model with an efficient expectation-maximization algorithm based on a divide-and-conquer strategy and an ultrafast conjugate gradient projection descent method. We compare FreePSI with the existing methods on simulated and real RNA-Seq data in terms of both accuracy and efficiency and show that it is able to achieve very good performance even though a reference transcriptome is not provided. Our results suggest that FreePSI may have important applications in performing alternative splicing analysis for organisms that do not have quality reference transcriptomes. FreePSI is implemented in C++ and freely available to the public on GitHub.
Model-Based Method for Social Network Clustering
We propose a simple mixed membership model for social network clustering in
this note. A flexible function is adopted to measure affinities among a set of
entities in a social network. The model not only allows each entity in the
network to possess more than one membership, but also provides accurate
statistical inference about network structure. We estimate the membership
parameters by using an MCMC algorithm. We evaluate the performance of the
proposed algorithm by applying our model to two empirical social network data
sets, the Zachary club data and the bottlenose dolphin network data. We also
conduct some numerical studies on different types of simulated networks to
assess the effectiveness of our algorithm. Finally, we briefly give some
concluding remarks and discuss future work.
Flow-based Influence Graph Visual Summarization
Visually mining a large influence graph is appealing yet challenging. People
are amazed by pictures of newscasting graphs on Twitter and engaged by hidden
citation networks in academia, yet are often troubled by the poor
readability of the underlying visualization. Existing summarization methods
enhance the graph visualization with blocked views, but have an adverse effect on
the latent influence structure. How can we visually summarize a large graph to
maximize influence flows? In particular, how can we illustrate the impact of an
individual node through the summarization? Can we maintain the appealing graph
metaphor while preserving both the overall influence pattern and fine
readability?
To answer these questions, we first formally define the influence graph
summarization problem. Second, we propose an end-to-end framework to solve the
new problem. Our method can not only highlight the flow-based influence
patterns in the visual summarization, but also inherently support rich graph
attributes. Last, we present a theoretical analysis and report our experimental
results. Both lines of evidence demonstrate that our framework can effectively
approximate the proposed influence graph summarization objective while
outperforming previous methods in a typical scenario of visually mining
academic citation networks.
Comment: to appear in IEEE International Conference on Data Mining (ICDM),
Shen Zhen, China, December 201
Exploiting Anonymity in Approximate Linear Programming: Scaling to Large Multiagent MDPs (Extended Version)
Many exact and approximate solution methods for Markov Decision Processes
(MDPs) attempt to exploit structure in the problem and are based on
factorization of the value function. Multiagent settings in particular, however,
are known to suffer from an exponential increase in value component sizes as
interactions become denser, meaning that approximation architectures are
restricted in the problem sizes and types they can handle. We present an
approach to mitigate this limitation for certain types of multiagent systems,
exploiting a property that can be thought of as "anonymous influence" in the
factored MDP. Anonymous influence summarizes joint variable effects efficiently
whenever the explicit representation of variable identity in the problem can be
avoided. We show how representational benefits from anonymity translate into
computational efficiencies, both for general variable elimination in a factor
graph and, in particular, for the approximate linear programming solution to
factored MDPs. The latter allows linear programming to scale to factored MDPs
that were previously unsolvable. Our results are shown for the control of a
stochastic disease process over a densely connected graph with 50 nodes and 25
agents.
Comment: Extended version of AAAI 2016 paper
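The representational benefit of anonymity mentioned in this abstract can be illustrated with a toy example. The sketch below assumes a hypothetical `infection_risk` function in a disease-control setting that depends only on how many of a node's neighbors are infected, not on which ones. That assumption lets a table over all joint neighbor assignments collapse into a table indexed by count. It is a simplified illustration of the idea, not the paper's algorithm.

```python
from itertools import product

def infection_risk(num_infected_neighbors, beta=0.2):
    """Hypothetical risk model: each infected neighbor transmits independently."""
    return 1.0 - (1.0 - beta) ** num_infected_neighbors

def explicit_table(n, beta=0.2):
    """One entry per joint assignment of n binary neighbor variables (2**n rows)."""
    return {bits: infection_risk(sum(bits), beta)
            for bits in product((0, 1), repeat=n)}

def count_table(n, beta=0.2):
    """One entry per possible count of infected neighbors (n + 1 rows)."""
    return [infection_risk(k, beta) for k in range(n + 1)]
```

For 10 neighbors the explicit table has 1024 entries while the count-based table has 11, and both return the same value for every assignment.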