158 research outputs found
Tree-based Coarsening and Partitioning of Complex Networks
Many applications produce massive complex networks whose analysis would
benefit from parallel processing. Parallel algorithms, in turn, often require a
suitable network partition. For solving optimization tasks such as graph
partitioning on large networks, multilevel methods are preferred in practice.
Yet, complex networks pose challenges to established multilevel algorithms, in
particular to their coarsening phase.
One way to specify a (recursive) coarsening of a graph is to rate its edges
and then contract the edges as prioritized by the rating. In this paper we (i)
define weights for the edges of a network that express the edges' importance
for connectivity, (ii) compute a minimum weight spanning tree with
respect to these weights, and (iii) rate the network edges based on the
conductance values of the tree's fundamental cuts. To this end, we also (iv)
develop the first optimal linear-time algorithm to compute the conductance
values of \emph{all} fundamental cuts of a given spanning tree. We integrate
the new edge rating into a leading multilevel graph partitioner and equip the
latter with a new greedy postprocessing for optimizing the maximum
communication volume (MCV). Experiments on bipartitioning frequently used
benchmark networks show that the postprocessing already reduces MCV by 11.3%.
Our new edge rating further reduces MCV by 10.3% compared to the previously
best rating, with the postprocessing in place for both ratings. In total, with a
modest increase in running time, our new approach reduces the MCV of complex
network partitions by 20.4%.
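Steps (iii) and (iv) above can be sketched naively: for each spanning-tree edge, remove it, flood one side of the tree, and measure the conductance of the graph edges crossing the induced cut. The sketch below is a quadratic-time illustration in the unweighted setting (function names and that simplification are our own); the paper's contribution is computing all these values in optimal linear time.

```python
# Naive sketch: conductance of every fundamental cut of a spanning tree.
# O(n*m) overall; the paper achieves optimal linear time for all cuts.
from collections import defaultdict

def conductance_ratings(edges, tree_edges):
    """edges: (u, v) pairs of the graph G; tree_edges: subset forming a
    spanning tree T. Returns {tree edge: conductance of its fundamental cut}."""
    adj = defaultdict(list)
    for u, v in tree_edges:
        adj[u].append(v)
        adj[v].append(u)
    deg = defaultdict(int)                  # degrees in G (node volumes)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    total_vol = sum(deg.values())
    ratings = {}
    for a, b in tree_edges:
        side = {a}                          # component of `a` in T minus (a, b)
        stack = [a]
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if {x, y} == {a, b}:
                    continue                # skip the removed tree edge
                if y not in side:
                    side.add(y)
                    stack.append(y)
        cut = sum(1 for u, v in edges if (u in side) != (v in side))
        vol = sum(deg[x] for x in side)
        ratings[(a, b)] = cut / min(vol, total_vol - vol)
    return ratings
```

On a 4-cycle with a path spanning tree, the middle tree edge has conductance 0.5 and the end edges 1.0; lower conductance marks a better-separated, more balanced cut.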
A Graph Theoretic Approach for Object Shape Representation in Compositional Hierarchies Using a Hybrid Generative-Descriptive Model
A graph theoretic approach is proposed for object shape representation in a
hierarchical compositional architecture called Compositional Hierarchy of Parts
(CHOP). In the proposed approach, vocabulary learning is performed using a
hybrid generative-descriptive model. First, statistical relationships between
parts are learned using a Minimum Conditional Entropy Clustering algorithm.
Then, selection of descriptive parts is defined as a frequent subgraph
discovery problem, and solved using a Minimum Description Length (MDL)
principle. Finally, part compositions are constructed by compressing the
internal data representation with discovered substructures. Shape
representation and computational complexity properties of the proposed approach
and algorithms are examined using six benchmark two-dimensional shape image
datasets. Experiments show that CHOP can employ part shareability and indexing
mechanisms for fast inference of part compositions using learned shape
vocabularies. Additionally, CHOP provides better shape retrieval performance
than the state-of-the-art shape retrieval methods.
Comment: Paper: 17 pages. 13th European Conference on Computer Vision (ECCV 2014), Zurich, Switzerland, September 6-12, 2014, Proceedings, Part III, pp. 566-581. Supplementary material can be downloaded from http://link.springer.com/content/esm/chp:10.1007/978-3-319-10578-9_37/file/MediaObjects/978-3-319-10578-9_37_MOESM1_ESM.pd
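The MDL-based part selection described above can be illustrated with a SUBDUE-style compression score: a substructure is valuable when encoding it once, plus the graph with its occurrences collapsed, is cheaper than encoding the graph directly. The bit-count encoding below is a hypothetical stand-in for illustration, not CHOP's actual code model.

```python
import math

def description_length(num_vertices, num_edges):
    """Crude stand-in for an MDL graph encoding: bits to list vertices plus
    bits to list edge endpoints (hypothetical encoding, for illustration)."""
    if num_vertices == 0:
        return 0.0
    vbits = num_vertices * max(1.0, math.log2(num_vertices))
    ebits = num_edges * 2 * max(1.0, math.log2(num_vertices))
    return vbits + ebits

def mdl_value(graph_v, graph_e, sub_v, sub_e, occurrences):
    """SUBDUE-style value of a substructure S in a graph G:
    DL(G) / (DL(S) + DL(G|S)). Compressing replaces each occurrence of S
    with a single placeholder vertex (internal edges removed, simplified)."""
    dl_g = description_length(graph_v, graph_e)
    dl_s = description_length(sub_v, sub_e)
    comp_v = graph_v - occurrences * (sub_v - 1)
    comp_e = graph_e - occurrences * sub_e
    dl_rest = description_length(comp_v, comp_e)
    return dl_g / (dl_s + dl_rest)
```

More occurrences of a frequent subgraph yield a shorter compressed description and hence a higher value, which is why frequency and descriptiveness are coupled in the selection.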
Free-hand sketch synthesis with deformable stroke models
We present a generative model which can automatically summarize the stroke
composition of free-hand sketches of a given category. When our model is fit to
a collection of sketches with similar poses, it discovers and learns the
structure and appearance of a set of coherent parts, with each part represented
by a group of strokes. It represents both consistent (topology) as well as
diverse aspects (structure and appearance variations) of each sketch category.
Key to the success of our model are important insights learned from a
comprehensive study performed on human stroke data. By fitting this model to
images, we are able to synthesize visually similar and pleasant free-hand
sketches.
ImageNet Auto-Annotation with Segmentation Propagation
ImageNet is a large-scale hierarchical database of object classes with millions of images. We propose to automatically populate it with pixelwise object-background segmentations, by leveraging existing manual annotations in the form of class labels and bounding-boxes. The key idea is to recursively exploit images segmented so far to guide the segmentation of new images. At each stage this propagation process expands into the images which are easiest to segment at that point in time, e.g. by moving to the classes semantically most related to those segmented so far. The propagation of segmentation occurs both (a) at the image level, by transferring existing segmentations to estimate the probability of a pixel being foreground, and (b) at the class level, by jointly segmenting images of the same class and by importing the appearance models of classes that are already segmented. Through experiments on 577 classes and 500k images we show that our technique (i) annotates a wide range of classes with accurate segmentations; (ii) effectively exploits the hierarchical structure of ImageNet; (iii) scales efficiently, especially when implemented on superpixels; (iv) outperforms a baseline GrabCut (Rother et al. 2004) initialized on the image center, as well as segmentation transfer from a fixed source pool run independently on each target image (Kuettel and Ferrari 2012). Moreover, our method also delivers state-of-the-art results on the recent iCoseg dataset for co-segmentation.
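The image-level transfer step (a) can be sketched as a per-pixel vote over segmentations carried across from already-segmented images; the alignment of masks to the target image is omitted here, and the function name is our own.

```python
def foreground_probability(transferred_masks):
    """Estimate per-pixel foreground probability for a target image by
    averaging binary segmentation masks transferred from already-segmented
    source images (masks assumed pre-aligned to the target; a simplified
    stand-in for the paper's image-level transfer step)."""
    h, w = len(transferred_masks[0]), len(transferred_masks[0][0])
    prob = [[0.0] * w for _ in range(h)]
    for mask in transferred_masks:
        for y in range(h):
            for x in range(w):
                prob[y][x] += mask[y][x]
    n = len(transferred_masks)
    return [[p / n for p in row] for row in prob]
```

The resulting probability map can then seed a GrabCut-style foreground/background model, in contrast to the center-initialized GrabCut baseline the paper compares against.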
Evaluating Multimedia Features and Fusion for Example-Based Event Detection
Multimedia event detection (MED) is a challenging problem because of the heterogeneous content and variable quality found in large collections of Internet videos. To study the value of multimedia features and fusion for representing and learning events from a set of example video clips, we created SESAME, a system for video SEarch with Speed and Accuracy for Multimedia Events. SESAME includes multiple bag-of-words event classifiers based on single data types: low-level visual, motion, and audio features; high-level semantic visual concepts; and automatic speech recognition. Event detection performance was evaluated for each event classifier. The performance of low-level visual and motion features was improved by the use of difference coding. The accuracy of the visual concepts was nearly as strong as that of the low-level visual features. Experiments with a number of fusion methods for combining the event detection scores from these classifiers revealed that simple fusion methods, such as arithmetic mean, perform as well as or better than other, more complex fusion methods. SESAME's performance in the 2012 TRECVID MED evaluation was one of the best reported.
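The finding that arithmetic-mean fusion is competitive is easy to state concretely: normalize each classifier's scores to a common scale, then average per video. The min-max normalization used below is a common convention and an assumption here, not necessarily SESAME's exact scheme.

```python
def fuse_scores(score_lists):
    """Late fusion by arithmetic mean. Each inner list holds one
    classifier's detection scores over the same set of videos; scores are
    min-max normalized first so the classifiers' scales are comparable."""
    normed = []
    for scores in score_lists:
        lo, hi = min(scores), max(scores)
        span = (hi - lo) or 1.0             # guard against constant scores
        normed.append([(s - lo) / span for s in scores])
    n = len(normed)
    return [sum(col) / n for col in zip(*normed)]
```

Despite its simplicity, this kind of mean fusion is a strong baseline whenever the individual classifiers are roughly comparably reliable.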
Object Detection Through Exploration With A Foveated Visual Field
We present a foveated object detector (FOD) as a biologically-inspired
alternative to the sliding window (SW) approach which is the dominant method of
search in computer vision object detection. Similar to the human visual system,
the FOD has higher resolution at the fovea and lower resolution at the visual
periphery. Consequently, more computational resources are allocated at the
fovea and relatively fewer at the periphery. The FOD processes the entire
scene, uses retino-specific object detection classifiers to guide eye
movements, aligns its fovea with regions of interest in the input image and
integrates observations across multiple fixations. Our approach combines modern
object detectors from computer vision with a recent model of peripheral pooling
regions found at the V1 layer of the human visual system. We assessed various
eye movement strategies on the PASCAL VOC 2007 dataset and show that the FOD
performs on par with the SW detector while bringing significant computational
cost savings.
Comment: An extended version of this manuscript was published in PLOS Computational Biology (October 2017) at https://doi.org/10.1371/journal.pcbi.100574
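The resolution falloff described above can be modeled with pooling regions whose diameter grows linearly with eccentricity, in the spirit of the V1 pooling model the FOD adopts; the function name and constants below are illustrative, not the paper's fitted values.

```python
def pooling_region_size(x, y, fovea_x, fovea_y, scaling=0.25, min_size=1.0):
    """Diameter of a V1-style pooling region at image location (x, y):
    grows linearly with eccentricity (distance from the fovea), so fewer,
    coarser features are pooled in the periphery while the fovea keeps
    full resolution. Constants are illustrative assumptions."""
    ecc = ((x - fovea_x) ** 2 + (y - fovea_y) ** 2) ** 0.5
    return max(min_size, scaling * ecc)
```

Because region size, and hence computation per region, is roughly constant, the total cost of covering the image grows much more slowly than with a uniform-resolution sliding window.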
Occlusion and Motion Reasoning for Long-Term Tracking
Object tracking is a recurring problem in computer vision. Tracking-by-detection approaches, in particular Struck (Hare et al., 2011), have been shown to be competitive in recent evaluations. However, such approaches fail in the presence of long-term occlusions as well as severe viewpoint changes of the object. In this paper we propose a principled way to combine occlusion and motion reasoning with a tracking-by-detection approach. Occlusion and motion reasoning is based on state-of-the-art long-term trajectories which are labeled as object or background tracks with an energy-based formulation. The overlap between labeled tracks and detected regions allows us to identify occlusions. The motion changes of the object between consecutive frames can be estimated robustly from the geometric relation between object trajectories. If this geometric change is significant, an additional detector is trained. Experimental results show that our tracker obtains state-of-the-art results and handles occlusion and viewpoint changes better than competing tracking methods.
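The occlusion cue from track/detection overlap can be reduced to a simple sketch: count how many of the object-labeled trajectory points fall inside the detected region in the current frame, and treat a low fraction as evidence of occlusion. The function name and the threshold-free form are our own simplification.

```python
def occlusion_ratio(object_track_points, box):
    """Fraction of object-labeled trajectory points (at the current frame)
    falling inside the detected region `box` = (x1, y1, x2, y2). A low
    fraction suggests the object is occluded: its tracks persist but the
    detector's region no longer covers them (simplified overlap cue)."""
    x1, y1, x2, y2 = box
    inside = sum(1 for px, py in object_track_points
                 if x1 <= px <= x2 and y1 <= py <= y2)
    return inside / len(object_track_points)
```

A tracker can declare an occlusion when this ratio drops below a threshold and then rely on the motion model until the detector re-fires.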
Transferring Neural Representations for Low-dimensional Indexing of Maya Hieroglyphic Art
We analyze the performance of deep neural architectures for extracting shape representations of binary images, and for generating low-dimensional representations of them. In particular, we focus on indexing binary images exhibiting compounds of Maya hieroglyphic signs, referred to as glyph-blocks, which constitute a very challenging art dataset given their visual complexity and large stylistic variety. More precisely, we demonstrate empirically that intermediate outputs of convolutional neural networks can be used as representations for complex shapes, even when their parameters are trained on gray-scale images, and that these representations can be more robust than traditional handcrafted features. We also show that it is possible to compress such representations down to only three dimensions without harming much of their discriminative structure, such that effective visualization of Maya hieroglyphs can be rendered for subsequent epigraphic analysis.
Spatio-Temporal Object Detection Proposals
Spatio-temporal detection of actions and events in video is a challenging problem. Besides the difficulties related to recognition, a major challenge for detection in video is the size of the search space defined by spatio-temporal tubes formed by sequences of bounding boxes along the frames. Recently, methods that generate unsupervised detection proposals have proven to be very effective for object detection in still images. These methods open the possibility to use strong but computationally expensive features, since only a relatively small number of detection hypotheses need to be assessed. In this paper we make two contributions towards exploiting detection proposals for spatio-temporal detection problems. First, we extend a recent 2D object proposal method to produce spatio-temporal proposals by a randomized supervoxel merging process. We introduce spatial, temporal, and spatio-temporal pairwise supervoxel features that are used to guide the merging process. Second, we propose a new efficient supervoxel method. We experimentally evaluate our detection proposals, in combination with our new supervoxel method as well as existing ones. This evaluation shows that our supervoxels lead to more accurate proposals when compared to using existing state-of-the-art supervoxel methods.
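The randomized supervoxel-merging process can be sketched with a union-find over affinity-weighted edges, where each contraction records one candidate grouping as a proposal; the affinity perturbation and data layout below are our own assumptions, not the paper's exact procedure.

```python
import random

def randomized_merge(num_voxels, edges, num_proposals, seed=0):
    """Randomized hierarchical merging sketch. `edges` is a list of
    (affinity, u, v) between neighboring supervoxels; higher affinity
    merges earlier, with random perturbation so repeated runs yield
    diverse merge orders. Each merge records the current grouping as one
    spatio-temporal proposal (simplified stand-in for the paper's method)."""
    rng = random.Random(seed)
    parent = list(range(num_voxels))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    # Sort by perturbed affinity: strong links merge first, noise adds variety.
    order = sorted(edges, key=lambda e: -(e[0] * rng.uniform(0.5, 1.5)))
    proposals = []
    for _aff, u, v in order:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            proposals.append([find(i) for i in range(num_voxels)])
        if len(proposals) >= num_proposals:
            break
    return proposals
```

Running the procedure with different seeds produces a diverse pool of tube hypotheses at several granularities, which is exactly what a proposal-based detector then scores with stronger features.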