Fast Detection of Community Structures using Graph Traversal in Social Networks
Finding community structures in social networks is considered to be a
challenging task as many of the proposed algorithms are computationally
expensive and do not scale well for large graphs. Most of the community
detection algorithms proposed to date are unsuitable for applications that
would require detection of communities in real-time, especially for massive
networks. The Louvain method, which uses modularity maximization to detect
clusters, is usually considered to be one of the fastest community detection
algorithms even without any provable bound on its running time. We propose a
novel graph traversal-based community detection framework, which not only runs
faster than the Louvain method but also generates clusters of better quality
for most of the benchmark datasets. We show that our algorithms run in O(|V| +
|E|) time to create an initial cover before using modularity maximization to
get the final cover.
Keywords: community detection; Influenced Neighbor Score; brokers; community
nodes; communities
Comment: 29 pages, 9 tables, and 13 figures. Accepted in "Knowledge and
Information Systems", 201
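The modularity objective mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' traversal framework or the Louvain method; it only scores a given partition using the standard Newman modularity for an undirected, unweighted graph without self-loops or duplicate edges. The function name `modularity` and the input layout (an edge list plus a node-to-community mapping) are assumptions made for illustration.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity Q of a partition (hypothetical helper).

    edges: list of (u, v) pairs, undirected, no self-loops or duplicates.
    community_of: dict mapping each node to a community label.
    """
    m = len(edges)
    degree = defaultdict(int)
    internal = defaultdict(int)  # edges with both endpoints in one community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1

    total_degree = defaultdict(int)  # sum of node degrees per community
    for node, d in degree.items():
        total_degree[community_of[node]] += d

    # Q = sum over communities of (internal edge share - expected share)
    return sum(
        internal[c] / m - (total_degree[c] / (2 * m)) ** 2
        for c in total_degree
    )


# Two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
merged = {n: "a" for n in range(6)}
q_split = modularity(edges, split)
q_merged = modularity(edges, merged)
```

The scoring pass touches each edge and each node once, so evaluating a candidate cover this way costs O(|V| + |E|); splitting the two triangles scores higher than merging them into one community.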
Track, then Decide: Category-Agnostic Vision-based Multi-Object Tracking
The most common paradigm for vision-based multi-object tracking is
tracking-by-detection, due to the availability of reliable detectors for
several important object categories such as cars and pedestrians. However,
future mobile systems will need a capability to cope with rich human-made
environments, in which obtaining detectors for every possible object category
would be infeasible. In this paper, we propose a model-free multi-object
tracking approach that uses a category-agnostic image segmentation method to
track objects. We present an efficient segmentation mask-based tracker which
associates pixel-precise masks reported by the segmentation. Our approach can
utilize semantic information whenever it is available for classifying objects
at the track level, while retaining the capability to track generic unknown
objects in the absence of such information. We demonstrate experimentally that
our approach achieves performance comparable to state-of-the-art
tracking-by-detection methods for popular object categories such as cars and
pedestrians. Additionally, we show that the proposed method can discover and
robustly track a large variety of other objects.
Comment: ICRA'18 submissio
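One common way to associate pixel-precise masks across frames is greedy matching by intersection-over-union (IoU). The sketch below is an illustration under that assumption and is not taken from the paper: the names `mask_iou` and `associate`, the representation of a mask as a set of (row, col) pixels, and the `min_iou` threshold of 0.3 are all hypothetical.

```python
def mask_iou(a, b):
    """IoU of two masks given as sets of (row, col) pixels."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def associate(tracks, masks, min_iou=0.3):
    """Greedy one-to-one matching of existing tracks to new masks.

    tracks: dict of integer track id -> last known mask (set of pixels).
    masks: list of masks reported for the current frame.
    Returns (matched, unmatched): matched maps track id -> mask index,
    unmatched lists mask indices that could start new tracks.
    """
    pairs = sorted(
        ((mask_iou(t, m), tid, mi)
         for tid, t in tracks.items()
         for mi, m in enumerate(masks)),
        reverse=True,  # best overlaps first
    )
    matched, used_masks = {}, set()
    for iou, tid, mi in pairs:
        if iou < min_iou:
            break  # remaining pairs overlap too little
        if tid in matched or mi in used_masks:
            continue
        matched[tid] = mi
        used_masks.add(mi)
    unmatched = [mi for mi in range(len(masks)) if mi not in used_masks]
    return matched, unmatched


tracks = {0: {(0, 0), (0, 1)}, 1: {(5, 5), (5, 6)}}
masks = [{(5, 5), (5, 6), (5, 7)}, {(0, 0), (0, 1)}, {(9, 9)}]
matched, unmatched = associate(tracks, masks)
```

Because the matching uses only mask overlap, it needs no object category; a class label, when available, could be attached to a track afterwards.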
A Bayesian Approach to Discovering Truth from Conflicting Sources for Data Integration
In practical data integration systems, it is common for the data sources
being integrated to provide conflicting information about the same entity.
Consequently, a major challenge for data integration is to derive the most
complete and accurate integrated records from diverse and sometimes conflicting
sources. We term this challenge the truth finding problem. We observe that some
sources are generally more reliable than others, and therefore a good model of
source quality is the key to solving the truth finding problem. In this work,
we propose a probabilistic graphical model that can automatically infer true
records and source quality without any supervision. In contrast to previous
methods, our principled approach leverages a generative process of two types of
errors (false positive and false negative) by modeling two different aspects of
source quality. In so doing, ours is also the first approach designed to merge
multi-valued attribute types. Our method is scalable, due to an efficient
sampling-based inference algorithm that needs very few iterations in practice
and enjoys linear time complexity, with an even faster incremental variant.
Experiments on two real world datasets show that our new method outperforms
existing state-of-the-art approaches to the truth finding problem.
Comment: VLDB201
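The two aspects of source quality tied to false positives and false negatives can be illustrated with sensitivity and specificity. The sketch below is not the paper's graphical model or its inference algorithm: it assumes the true facts are already known, whereas the paper infers them without supervision. The function name `source_quality` and the encoding of a fact as an (entity, value) pair are assumptions.

```python
def source_quality(claims, truth):
    """Per-source (sensitivity, specificity) against known true facts.

    claims: dict of source name -> set of (entity, value) facts it asserts.
    truth: set of (entity, value) facts that are actually true.
    Candidate facts are everything any source claims, plus the truth.
    """
    candidates = set().union(*claims.values()) | truth
    quality = {}
    for source, claimed in claims.items():
        tp = len(claimed & truth)               # true facts asserted
        fp = len(claimed - truth)               # false facts asserted
        fn = len(truth - claimed)               # true facts omitted
        tn = len(candidates - claimed - truth)  # false facts not asserted
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        quality[source] = (sensitivity, specificity)
    return quality


# A multi-valued attribute: the entity "film" has two true values.
claims = {
    "A": {("film", "x"), ("film", "y")},
    "B": {("film", "x"), ("film", "z")},
}
truth = {("film", "x"), ("film", "y")}
quality = source_quality(claims, truth)
```

Keeping the two measures separate matters for multi-valued attributes: a source can omit true values (low sensitivity) without ever asserting a false one (high specificity), and a single accuracy score would hide that difference.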
Fast Approximate k-Means via Cluster Closures
k-means, a simple and effective clustering algorithm, is one of the most
widely used algorithms in the multimedia and computer vision communities. Traditional
k-means is an iterative algorithm: in each iteration new cluster centers are
computed and each data point is re-assigned to its nearest center. The cluster
re-assignment step becomes prohibitively expensive when the numbers of data
points and cluster centers are large.
In this paper, we propose a novel approximate k-means algorithm to greatly
reduce the computational complexity in the assignment step. Our approach is
motivated by the observation that most active points changing their cluster
assignments at each iteration are located on or near cluster boundaries. The
idea is to efficiently identify those active points by pre-assembling the data
into groups of neighboring points using multiple random spatial partition
trees, and to use the neighborhood information to construct a closure for each
cluster, in such a way that only a small number of cluster candidates need to be
considered when assigning a data point to its nearest cluster. Using complexity
analysis, image data clustering, and applications to image retrieval, we show
that our approach outperforms state-of-the-art approximate k-means
algorithms in terms of clustering quality and efficiency.
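The cost saving in the assignment step can be sketched by comparing a full nearest-center search with one restricted to a short candidate list per point. This is only an illustration: how the paper builds its candidates (random spatial partition trees and cluster closures) is not shown, and the `candidates` argument, along with the function names, is a hypothetical stand-in supplied by hand.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign_full(points, centers):
    """Standard assignment: compare every point with every center."""
    return [
        min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
        for p in points
    ]


def assign_restricted(points, centers, candidates):
    """Approximate assignment: compare point i only with candidates[i].

    candidates[i] is a non-empty list of center indices (hypothetical
    input; the paper derives it from neighborhood information).
    """
    return [
        min(candidates[i], key=lambda c: sq_dist(p, centers[c]))
        for i, p in enumerate(points)
    ]


points = [(0, 0), (1, 0), (9, 0), (10, 0)]
centers = [(0, 0), (10, 0), (100, 0)]
candidates = [[0], [0, 1], [0, 1], [1]]  # far-away center 2 is never checked
full = assign_full(points, centers)
restricted = assign_restricted(points, centers, candidates)
```

The full search costs one distance computation per point-center pair, while the restricted search costs one per candidate; the result matches the exact assignment whenever each point's true nearest center is in its candidate list.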