Detecting Blackholes and Volcanoes in Directed Networks
In this paper, we formulate a novel problem for finding blackhole and volcano
patterns in a large directed graph. Specifically, a blackhole pattern is a
group of nodes such that there are only inlinks
to this group from the rest of the nodes in the graph. In contrast, a volcano pattern
is a group which only has outlinks to the rest of the nodes in the graph. Both
patterns can be observed in the real world. For instance, in a trading network, a
blackhole pattern may represent a group of traders who are manipulating the
market. In the paper, we first prove that the blackhole mining problem is a
dual problem of finding volcanoes. Therefore, we focus on finding the blackhole
patterns. Along this line, we design two pruning schemes to guide the blackhole
finding process. In the first pruning scheme, we strategically prune the search
space based on a set of pattern-size-independent pruning rules and develop an
iBlackhole algorithm. The second pruning scheme follows a divide-and-conquer
strategy to further exploit the pruning results from the first pruning scheme.
Indeed, a target directed graph can be divided into several disconnected
subgraphs by the first pruning scheme, and thus the blackhole finding can be
conducted in each disconnected subgraph rather than in a large graph. Based on
these two pruning schemes, we also develop an iBlackhole-DC algorithm. Finally,
experimental results on real-world data show that the iBlackhole-DC algorithm
can be several orders of magnitude faster than the iBlackhole algorithm, which
has a huge computational advantage over a brute-force method. Comment: 18 pages
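The definitions in this abstract can be illustrated with a minimal Python sketch. It checks only the condition stated above (no link leaves the group for a blackhole, no link enters it for a volcano) and does not reproduce the iBlackhole or iBlackhole-DC pruning schemes. The function names, the edge-list representation, and the toy graph are assumptions made for illustration; any further requirements the paper may place on a pattern (for example connectivity or a minimum size) are not modelled.

```python
from typing import Hashable, Iterable, List, Tuple

Edge = Tuple[Hashable, Hashable]  # (source, target); hypothetical representation


def is_blackhole(edges: Iterable[Edge], group: Iterable[Hashable]) -> bool:
    """Return True if no edge goes from inside `group` to outside it."""
    members = set(group)
    if not members:
        return False
    # A blackhole may receive links from outside but must never emit one.
    return all(not (u in members and v not in members) for u, v in edges)


def is_volcano(edges: Iterable[Edge], group: Iterable[Hashable]) -> bool:
    """Return True if no edge goes from outside `group` to inside it.

    Uses the duality mentioned in the abstract: a volcano in the graph
    is a blackhole in the graph with every edge reversed.
    """
    reversed_edges: List[Edge] = [(v, u) for u, v in edges]
    return is_blackhole(reversed_edges, group)


# Toy graph (made up): s feeds a and b, which feed the pair {x, y}.
toy_edges: List[Edge] = [
    ("s", "a"), ("s", "b"),
    ("a", "x"), ("b", "x"),
    ("x", "y"), ("y", "x"),
]

if __name__ == "__main__":
    print(is_blackhole(toy_edges, {"x", "y"}))
    print(is_volcano(toy_edges, {"s"}))
    print(is_blackhole(toy_edges, {"a"}))
```

Checking one candidate group is linear in the number of edges; the difficulty the abstract addresses is that the number of candidate groups grows exponentially, which is what the pruning schemes are meant to control.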
Longest Common Separable Pattern between Permutations
In this article, we study the problem of finding the longest common separable
pattern between several permutations. We give a polynomial-time algorithm when
the number of input permutations is fixed and show that the problem is NP-hard
for an arbitrary number of input permutations even if these permutations are
separable. On the other hand, we show that the NP-hard problem of finding the
longest common pattern between two permutations cannot be approximated better
than within a ratio of $\sqrt{Opt}$ (where $Opt$ is the size of an optimal
solution) when taking common patterns belonging to pattern-avoiding classes of
permutations. Comment: 15 pages
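As a rough illustration of the notions used in this abstract, the Python sketch below tests pattern containment by brute force and uses the standard characterisation of separable permutations as those avoiding 2413 and 3142. It is not the polynomial-time algorithm announced in the abstract: the running time is exponential in the pattern length, and all function names are hypothetical.

```python
from itertools import combinations
from typing import Iterable, Sequence, Tuple


def _shape(seq: Sequence[int]) -> Tuple[int, ...]:
    """Relative order of the (distinct) values in seq, as 0-based ranks."""
    order = sorted(seq)
    return tuple(order.index(x) for x in seq)


def contains_pattern(perm: Sequence[int], pattern: Sequence[int]) -> bool:
    """True if some subsequence of perm is order-isomorphic to pattern."""
    target = _shape(pattern)
    # combinations() keeps elements in their original left-to-right order.
    return any(_shape(sub) == target for sub in combinations(perm, len(pattern)))


def is_separable(perm: Sequence[int]) -> bool:
    """Separable permutations are exactly those avoiding 2413 and 3142."""
    return not contains_pattern(perm, (2, 4, 1, 3)) and not contains_pattern(
        perm, (3, 1, 4, 2)
    )


def is_common_separable_pattern(
    pattern: Sequence[int], perms: Iterable[Sequence[int]]
) -> bool:
    """True if pattern is separable and occurs in every input permutation."""
    return is_separable(pattern) and all(contains_pattern(p, pattern) for p in perms)


if __name__ == "__main__":
    print(is_separable((2, 4, 1, 3)))
    print(is_common_separable_pattern((1, 2), [(2, 1, 3), (3, 1, 2)]))
```

The longest common separable pattern is then the longest `pattern` for which `is_common_separable_pattern` holds; enumerating candidates this way is only practical for very small inputs.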
Manifold embedding for curve registration
We focus on the problem of finding a good representative of a sample of
random curves warped from a common pattern f. We first prove that such a
problem can be moved onto a manifold framework. Then, we propose an estimation
of the common pattern f based on an approximated geodesic distance on a
suitable manifold. We then compare the proposed method to more classical
methods.
Effective pattern discovery for text mining
Many data mining techniques have been proposed for mining useful patterns in text documents. However, how to effectively use and update discovered patterns is still an open research issue, especially in the domain of text mining. Since most existing text mining methods adopt term-based approaches, they all suffer from the problems of polysemy and synonymy. Over the years, people have often held the hypothesis that pattern-based (or phrase-based) approaches should perform better than term-based ones, but many experiments did not support this hypothesis. This paper presents an innovative technique, effective pattern discovery, which includes the processes of pattern deploying and pattern evolving, to improve the effectiveness of using and updating discovered patterns for finding relevant and interesting information. Substantial experiments on the RCV1 data collection and TREC topics demonstrate that the proposed solution achieves encouraging performance.
