4,035 research outputs found
Mining Frequent Neighborhood Patterns in Large Labeled Graphs
Over the years, frequent subgraphs have been an important class of target
patterns in the pattern mining literature, where most works deal with
databases holding a number of graph transactions, e.g., chemical structures of
compounds. These methods rely heavily on the downward-closure property (DCP) of
the support measure to ensure efficient pruning of the candidate patterns.
When switching to the emerging scenario of single-graph databases such as
the Google Knowledge Graph and the Facebook social graph, the traditional support
measure turns out to be trivial (either 0 or 1). Unfortunately, to the best of our
knowledge, all attempts to redefine a single-graph support have resulted in measures
that either lose the DCP or are no longer semantically intuitive.
This paper targets pattern mining in the single-graph setting. We resolve
the "DCP-intuitiveness" dilemma by shifting the mining target from frequent
subgraphs to frequent neighborhoods. A neighborhood is a specific topological
pattern in which a vertex is embedded, and the pattern is frequent if it is shared
by a large portion (above a given threshold) of the vertices. We show that the new
patterns not only maintain the DCP but also carry semantics as significant as those
of subgraph patterns. Experiments on real-life datasets demonstrate the feasibility of
our algorithms on relatively large graphs, as well as their capability of mining
interesting knowledge not discovered by prior works.
Comment: 9 pages
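The abstract gives no pseudocode, but the DCP-based pruning it relies on can be illustrated. Below is a minimal, hypothetical Python sketch that mines frequent 1-hop neighborhood patterns, simplified to sets of neighbor labels rather than the paper's full topological patterns; all function names and the support definition (fraction of vertices whose neighborhood contains the pattern) are assumptions for illustration only.

```python
# Hypothetical sketch (not the paper's algorithm): Apriori-style mining of
# frequent 1-hop neighborhood patterns, simplified to sets of neighbor
# labels. The paper's neighborhoods are richer topological patterns; this
# only shows how the downward-closure property (DCP) prunes candidates when
# support = "fraction of vertices whose neighborhood contains the pattern".
from itertools import combinations

def neighbor_label_sets(graph, labels):
    """For each vertex, the set of labels seen among its neighbors."""
    return {v: frozenset(labels[u] for u in nbrs) for v, nbrs in graph.items()}

def support(pattern, nbr_labels, n):
    """Fraction of vertices whose neighborhood covers every label in pattern."""
    return sum(1 for s in nbr_labels.values() if pattern <= s) / n

def mine_frequent_neighborhoods(graph, labels, min_sup):
    n = len(graph)
    nbr = neighbor_label_sets(graph, labels)
    # Level 1: frequent single labels.
    freq = {frozenset([l]) for l in set(labels.values())
            if support(frozenset([l]), nbr, n) >= min_sup}
    result, k = set(freq), 1
    while freq:
        # Join size-k patterns into size-(k+1) candidates.
        cands = {a | b for a, b in combinations(freq, 2) if len(a | b) == k + 1}
        # DCP pruning: every size-k subset of a candidate must itself be frequent.
        cands = {c for c in cands
                 if all(frozenset(s) in freq for s in combinations(c, k))}
        freq = {c for c in cands if support(c, nbr, n) >= min_sup}
        result |= freq
        k += 1
    return result

# Toy labeled graph: adjacency lists plus vertex labels.
g = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
lab = {0: "A", 1: "B", 2: "A", 3: "B"}
print(mine_frequent_neighborhoods(g, lab, min_sup=0.5))
```

Note how the DCP does the work here: a size-(k+1) candidate is tested against the data only if all of its size-k sub-patterns were already found frequent.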
Image classification by visual bag-of-words refinement and reduction
This paper presents a new framework for visual bag-of-words (BOW) refinement
and reduction to overcome the drawbacks of the visual BOW model,
which has been widely used for image classification. Although very influential
in the literature, the traditional visual BOW model has two distinct drawbacks.
Firstly, for efficiency purposes, the visual vocabulary is commonly constructed
by directly clustering the low-level visual feature vectors extracted from
local keypoints, without considering the high-level semantics of images. That
is, the visual BOW model still suffers from the semantic gap, and thus may lead
to significant performance degradation in more challenging tasks (e.g. social
image classification). Secondly, thousands of visual words are typically
generated to obtain better performance on a relatively large image dataset; due
to such a large vocabulary size, the subsequent image classification can take a
prohibitive amount of time. To overcome the first drawback, we develop a graph-based
method for visual BOW refinement that exploits the tags of social images (easy to
access, although noisy). Moreover, for efficient image classification, we further
reduce the refined visual BOW model to a much smaller size through semantic
spectral clustering. Extensive experimental results show the promising performance
of the proposed framework for visual BOW refinement and reduction.
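As a rough illustration of the reduction step only, here is a hedged sketch of vocabulary reduction via spectral clustering (using scikit-learn). The word-word affinity matrix below is a random placeholder, whereas the paper derives the semantics from noisy social tags; every name and size here is illustrative.

```python
# A hedged sketch of BOW reduction: cluster visual words with spectral
# clustering over a word-word affinity matrix, then pool each image's
# histogram over the merged words. The random affinity is a placeholder
# standing in for tag-derived semantic similarity.
import numpy as np
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(0)
n_words, n_images, n_reduced = 200, 50, 20

# Placeholder semantic affinity between visual words (symmetric, nonnegative).
A = rng.random((n_words, n_words))
A = (A + A.T) / 2
np.fill_diagonal(A, 1.0)

# Group visual words into a much smaller set of "semantic" words.
clusters = SpectralClustering(n_clusters=n_reduced, affinity="precomputed",
                              random_state=0).fit_predict(A)

# Original BOW histograms (images x words), e.g. from keypoint quantization.
H = rng.poisson(1.0, size=(n_images, n_words)).astype(float)

# Reduce: sum the counts of all words assigned to the same cluster.
M = np.zeros((n_words, n_reduced))
M[np.arange(n_words), clusters] = 1.0   # word -> cluster indicator matrix
H_reduced = H @ M                       # shape: (n_images, n_reduced)
print(H_reduced.shape)                  # (50, 20)
```

The design point the abstract makes is visible in the last two lines: once words are merged, every downstream classifier operates on a histogram an order of magnitude smaller.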
Exact Single-Source SimRank Computation on Large Graphs
SimRank is a popular measure for evaluating node-to-node similarities
based on the graph topology. In recent years, single-source and top-k SimRank
queries have received increasing attention due to their applications in web
mining, social network analysis, and spam detection. However, a fundamental
obstacle in studying SimRank has been the lack of ground truths. The only exact
algorithm, the Power Method, is computationally infeasible on graphs with more than
10^6 nodes. Consequently, no existing work has evaluated the actual
trade-offs between query time and accuracy on large real-world graphs. In this
paper, we present ExactSim, the first algorithm that computes the exact
single-source and top-k SimRank results on large graphs. With high
probability, this algorithm produces ground truths with a rigorous theoretical
guarantee. We conduct extensive experiments on real-world datasets to
demonstrate the efficiency of ExactSim. The results show that ExactSim provides
the ground truth for any single-source SimRank query with a precision of up to 7
decimal places within a reasonable query time.
Comment: ACM SIGMOD 2020
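For context, the Power Method baseline mentioned above follows directly from the SimRank recurrence S = c * W^T S W with unit diagonal, where W is the column-normalized adjacency matrix. The sketch below is the generic textbook formulation on a toy graph, not ExactSim itself; the decay c = 0.6 and the iteration count are common illustrative choices.

```python
# Textbook Power Method for all-pairs SimRank on a toy directed graph. The
# O(n^2) similarity matrix it maintains is exactly why this baseline breaks
# down on large graphs, which is the gap the abstract says ExactSim closes.
import numpy as np

def simrank_power_method(adj, c=0.6, iters=20):
    """adj[i, j] = 1 iff there is a directed edge i -> j."""
    indeg = adj.sum(axis=0)
    # W[i, j] = 1/indeg(j) if i -> j; columns of zero in-degree stay all-zero.
    W = np.divide(adj, indeg, out=np.zeros_like(adj, dtype=float),
                  where=indeg > 0)
    S = np.eye(adj.shape[0])
    for _ in range(iters):            # error shrinks geometrically with iters
        S = c * (W.T @ S @ W)
        np.fill_diagonal(S, 1.0)      # SimRank of a node with itself is 1
    return S

# Toy directed graph: 0 -> 2, 1 -> 2, 2 -> 3.
A = np.zeros((4, 4))
A[0, 2] = A[1, 2] = A[2, 3] = 1.0
S = simrank_power_method(A)
print(S[0])   # a "single-source" result here is just one row of S
```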
Multi-Modal Multi-Scale Deep Learning for Large-Scale Image Annotation
Image annotation aims to annotate a given image with a variable number of
class labels corresponding to diverse visual concepts. In this paper, we
address two main issues in large-scale image annotation: 1) how to learn a rich
feature representation suitable for predicting a diverse set of visual concepts
ranging from objects and scenes to abstract concepts; and 2) how to annotate an image
with the optimal number of class labels. To address the first issue, we propose
a novel multi-scale deep model for extracting rich and discriminative features
capable of representing a wide range of visual concepts. Specifically, we propose
a novel two-branch deep neural network architecture comprising a very
deep main network branch and a companion feature fusion branch designed
to fuse the multi-scale features computed from the main branch. The deep
model is also made multi-modal by taking noisy user-provided tags as model
input to complement the image input. For tackling the second issue, we
introduce a label quantity prediction auxiliary task to the main label
prediction task to explicitly estimate the optimal label number for a given
image. Extensive experiments are carried out on two large-scale image
annotation benchmark datasets and the results show that our method
significantly outperforms the state of the art.
Comment: Submitted to IEEE TIP
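A hedged PyTorch sketch of the two-branch, multi-modal idea follows; the backbone, layer sizes, and head designs are stand-ins rather than the paper's architecture, and the auxiliary label-quantity head here simply regresses the number of labels to keep.

```python
# Illustrative sketch only: a main feature branch producing multi-scale
# features, a companion branch fusing them, a noisy-tag input for the
# multi-modal side, and an auxiliary head predicting the label count.
import torch
import torch.nn as nn

class MultiModalMultiScale(nn.Module):
    def __init__(self, n_labels=1000, n_tags=5000, dim=256):
        super().__init__()
        # Main branch: stand-in for a very deep CNN exposing three scales.
        self.stage1 = nn.Sequential(nn.Conv2d(3, dim, 3, 2, 1), nn.ReLU())
        self.stage2 = nn.Sequential(nn.Conv2d(dim, dim, 3, 2, 1), nn.ReLU())
        self.stage3 = nn.Sequential(nn.Conv2d(dim, dim, 3, 2, 1), nn.ReLU())
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Companion branch: fuses multi-scale features from the main branch.
        self.fuse = nn.Sequential(nn.Linear(3 * dim, dim), nn.ReLU())
        # Noisy user tags enter as a bag-of-tags vector (multi-modal input).
        self.tag_proj = nn.Sequential(nn.Linear(n_tags, dim), nn.ReLU())
        self.label_head = nn.Linear(2 * dim, n_labels)   # main task
        self.count_head = nn.Linear(2 * dim, 1)          # label-quantity task

    def forward(self, image, tags):
        f1 = self.stage1(image)
        f2 = self.stage2(f1)
        f3 = self.stage3(f2)
        feats = [self.pool(f).flatten(1) for f in (f1, f2, f3)]
        visual = self.fuse(torch.cat(feats, dim=1))
        joint = torch.cat([visual, self.tag_proj(tags)], dim=1)
        return self.label_head(joint), self.count_head(joint)

model = MultiModalMultiScale()
logits, count = model(torch.randn(2, 3, 64, 64), torch.rand(2, 5000))
# At inference, one could keep the top-round(count) scoring labels per image.
```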
Counterfactual Cross-modality Reasoning for Weakly Supervised Video Moment Localization
Video moment localization aims to retrieve the target segment of an untrimmed
video according to the natural language query. Weakly supervised methods have gained
attention recently, as the precise temporal location of the target segment is
not always available. However, one of the greatest challenges for
weakly supervised methods lies in the mismatch between the video and
language induced by the coarse temporal annotations. To refine the
vision-language alignment, recent works contrast the cross-modality
similarities driven by reconstructing masked queries between positive and
negative video proposals. However, the reconstruction may be influenced by the
latent spurious correlation between the unmasked and the masked parts, which
distorts the restoring process and further degrades the efficacy of contrastive
learning since the masked words are not completely reconstructed from the
cross-modality knowledge. In this paper, we discover and mitigate this spurious
correlation through a novel counterfactual cross-modality reasoning
method. Specifically, we first formulate query reconstruction as an aggregated
causal effect of cross-modality and query knowledge. Then by introducing
counterfactual cross-modality knowledge into this aggregation, the spurious
impact of the unmasked part contributing to the reconstruction is explicitly
modeled. Finally, by suppressing the unimodal effect of the masked query, we can
rectify the reconstructions of video proposals to perform reasonable
contrastive learning. Extensive experimental evaluations demonstrate the
effectiveness of our proposed method. The code is available at
https://github.com/sLdZ0306/CCR.
Comment: Accepted by ACM MM 2023
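The counterfactual subtraction can be sketched generically: compute the reconstruction logits once with the real video input and once with a counterfactual placeholder input, then subtract, so that whatever the unmasked query alone explains is suppressed. Everything below (the Reconstructor module, feature sizes, the placeholder) is a hypothetical illustration, not the paper's model.

```python
# Hedged sketch of counterfactual debiasing: masked-word reconstruction
# logits are computed with real cross-modal (video) input and with that
# input replaced by a learned counterfactual placeholder; the difference
# removes the unimodal shortcut of the unmasked query.
import torch
import torch.nn as nn

class Reconstructor(nn.Module):
    """Predicts masked query words from video features + unmasked words."""
    def __init__(self, vocab=1000, dim=128):
        super().__init__()
        self.video_proj = nn.Linear(dim, dim)
        self.query_proj = nn.Linear(dim, dim)
        self.out = nn.Linear(dim, vocab)

    def forward(self, video_feat, query_feat):
        return self.out(torch.relu(self.video_proj(video_feat) +
                                   self.query_proj(query_feat)))

model = Reconstructor()
# Learned stand-in for "no cross-modal knowledge" (counterfactual input).
cf_video = nn.Parameter(torch.zeros(1, 128))

video_feat = torch.randn(4, 128)   # pooled proposal features (4 proposals)
query_feat = torch.randn(4, 128)   # pooled unmasked-query features

total_effect = model(video_feat, query_feat)                  # video + query
unimodal_effect = model(cf_video.expand(4, -1), query_feat)   # query only
# Debiased logits used to contrast positive vs. negative proposals.
debiased = total_effect - unimodal_effect
print(debiased.shape)   # (4, 1000)
```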
Approximating Single-Source Personalized PageRank with Absolute Error Guarantees
Personalized PageRank (PPR) is an extensively studied and applied node proximity measure in graphs. For a pair of nodes s and t on a graph G = (V,E), the PPR value π(s,t) is defined as the probability that an α-discounted random walk from s terminates at t, where the walk terminates with probability α at each step. We study the classic Single-Source PPR query, which asks for PPR approximations from a given source node s to all nodes in the graph. Specifically, we aim to provide approximations with absolute error guarantees, ensuring that the resultant PPR estimates π̂(s,t) satisfy max_{t ∈ V} |π̂(s,t)-π(s,t)| ≤ ε for a given error bound ε. We propose an algorithm that achieves this with high probability, with an expected running time of
- Õ(√m/ε) for directed graphs, where m = |E|;
- Õ(√{d_max}/ε) for undirected graphs, where d_max is the maximum node degree in the graph;
- Õ(n^{γ-1/2}/ε) for power-law graphs, where n = |V| and γ ∈ (1/2,1) is the extent of the power law.

These sublinear bounds improve upon existing results. We also study the case when degree-normalized absolute error guarantees are desired, requiring max_{t ∈ V} |π̂(s,t)/d(t)-π(s,t)/d(t)| ≤ ε_d for a given error bound ε_d, where the graph is undirected and d(t) is the degree of node t. We give an algorithm that provides this error guarantee with high probability, achieving an expected complexity of Õ(√{∑_{t ∈ V} π(s,t)/d(t)}/ε_d). This improves over the previously known O(1/ε_d) complexity.
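The α-discounted walk definition above translates directly into a naive Monte Carlo estimator, sketched below for illustration. The function name and the dangling-node handling (terminate in place) are assumptions, and this plain sampling is far from the paper's sublinear techniques.

```python
# Naive Monte Carlo sketch of the PPR definition: run alpha-discounted
# random walks from s and record where each walk terminates; pi(s, t) is
# estimated by the fraction of walks ending at t.
import random
from collections import Counter

def monte_carlo_ppr(graph, s, alpha=0.2, walks=100_000):
    """Estimate pi(s, t) for all t as termination frequencies of walks."""
    ends = Counter()
    for _ in range(walks):
        v = s
        while random.random() > alpha:   # each step terminates w.p. alpha
            nbrs = graph.get(v)
            if not nbrs:                 # dangling node: terminate in place
                break
            v = random.choice(nbrs)
        ends[v] += 1
    return {t: c / walks for t, c in ends.items()}

# Toy directed graph as adjacency lists.
g = {0: [1, 2], 1: [2], 2: [0], 3: [0]}
print(sorted(monte_carlo_ppr(g, s=0).items()))
```

By a standard Hoeffding-plus-union-bound argument, on the order of log(n)/ε² walks already bound the maximum absolute error by ε with high probability, which shows why the absolute-error target is attainable at all; the abstract's contribution is reaching it at sublinear rates.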