Local Edge Betweenness based Label Propagation for Community Detection in Complex Networks
Nowadays, identifying community structures in complex networks is an important step in extracting useful information from networks. The label propagation algorithm (LPA), with its near-linear time complexity, is one of the most popular methods for detecting community structures, yet its uncertainty and randomness are a drawback. Combining LPA with other community detection metrics can improve its accuracy and reduce its instability. With this in mind, in this paper we use edge betweenness centrality to improve LPA's performance. Because calculating edge betweenness centrality over the whole network is expensive, we use local edge betweenness as an alternative metric and present LPA-LEB (Label Propagation Algorithm with Local Edge Betweenness). Experimental results on both real-world and benchmark networks show that LPA-LEB possesses higher accuracy and stability than LPA when detecting community structures in networks.
A Data-Driven Approach for Tag Refinement and Localization in Web Videos
Tagging of visual content is becoming more and more widespread as web-based services and social networks have popularized tagging functionalities among their users. These user-generated tags are used to ease browsing and exploration of media collections, e.g. using tag clouds, or to retrieve multimedia content. However, not all media are equally tagged by users. With current systems it is easy to tag a single photo, and even tagging a part of a photo, like a face, has become common on sites like Flickr and Facebook. On the other hand, tagging a video sequence is more complicated and time consuming, so users tend to tag only the overall content of a video. In this paper we present a method for automatic video annotation that increases the number of tags originally provided by users and localizes them temporally, associating tags to keyframes. Our approach exploits collective knowledge embedded in user-generated tags and web sources, and the visual similarity of keyframes to images uploaded to social sites like YouTube and Flickr, as well as to web sources like Google and Bing. Given a keyframe, our method selects on the fly, from these visual sources, the training exemplars that should be most relevant for the test sample, and proceeds to transfer labels across similar images. Compared to existing video tagging approaches that require training classifiers for each tag, our system has few parameters, is easy to implement, and can deal with an open-vocabulary scenario. We demonstrate the approach on tag refinement and localization on DUT-WEBV, a large dataset of web videos, and show state-of-the-art results.
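As a rough illustration of the label-transfer step, the sketch below scores candidate tags for a single keyframe by similarity-weighted voting over retrieved exemplar images. Feature extraction, retrieval from the web sources, and the exact scoring rule are abstracted away; all names and parameters here are illustrative assumptions, not the paper's implementation.

```python
# Sketch: nearest-neighbor tag transfer for one keyframe via similarity-weighted
# voting. Features are assumed precomputed by some image descriptor.
import numpy as np


def transfer_tags(keyframe_feat, exemplar_feats, exemplar_tags, k=10):
    """keyframe_feat: (d,); exemplar_feats: (n, d); exemplar_tags: n tag lists."""
    # Cosine similarity between the keyframe and every exemplar image.
    a = keyframe_feat / np.linalg.norm(keyframe_feat)
    B = exemplar_feats / np.linalg.norm(exemplar_feats, axis=1, keepdims=True)
    sims = B @ a

    # Keep only the k most similar exemplars ("training set built on the fly").
    top = np.argsort(sims)[::-1][:k]

    # Each selected exemplar votes for its tags, weighted by its similarity.
    scores = {}
    for i in top:
        for tag in exemplar_tags[i]:
            scores[tag] = scores.get(tag, 0.0) + float(sims[i])
    return sorted(scores.items(), key=lambda kv: -kv[1])


# Toy usage with random features and made-up tags.
rng = np.random.default_rng(0)
feats = rng.normal(size=(50, 128))
tags = [["beach", "sea"] if i % 2 else ["city"] for i in range(50)]
print(transfer_tags(feats[0], feats, tags, k=5)[:3])
```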
Joint Inference in Weakly-Annotated Image Datasets via Dense Correspondence
We present a principled framework for inferring pixel labels in weakly-annotated image datasets. Most previous example-based approaches to computer vision rely on a large corpus of densely labeled images. However, for large, modern image datasets, such labels are expensive to obtain and are often unavailable. We establish a large-scale graphical model spanning all labeled and unlabeled images, then solve it to infer pixel labels jointly for all images in the dataset while enforcing consistent annotations over similar visual patterns. This model requires significantly less labeled data and helps resolve ambiguities by propagating inferred annotations from images with stronger local visual evidence to images with weaker local evidence. We apply the proposed framework to two computer vision problems: image annotation with semantic segmentation, and object discovery and co-segmentation (segmenting multiple images containing a common object). Extensive numerical evaluations and comparisons show that our method consistently outperforms the state of the art in automatic annotation and semantic labeling, while requiring significantly less labeled data. In contrast to previous co-segmentation techniques, our method discovers and segments objects well even in the presence of substantial amounts of noise images (images not containing the common object), as is typical for datasets collected from Internet search.
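The framework above is a large-scale graphical model over dense pixel correspondences; as a much-simplified illustration of the underlying idea of propagating annotations from strong to weak evidence, the sketch below runs plain semi-supervised label propagation (the classic harmonic-function iteration) on an image-level similarity graph. It is a stand-in for, not a reproduction of, the paper's model.

```python
# Simplified stand-in: semi-supervised label propagation on a similarity graph.
# The paper's model is a pixel-level graphical model over dense correspondences;
# this image-level diffusion only illustrates "strong evidence flows to weak".
import numpy as np


def propagate_labels(W, Y, labeled_mask, n_iter=100):
    """W: (n, n) symmetric nonnegative similarities; Y: (n, c) one-hot labels
    (zero rows for unlabeled items); labeled_mask: (n,) bool."""
    P = W / np.maximum(W.sum(axis=1, keepdims=True), 1e-12)  # row-stochastic
    F = Y.astype(float).copy()
    for _ in range(n_iter):
        F = P @ F                              # diffuse label evidence
        F[labeled_mask] = Y[labeled_mask]      # clamp annotated items
    return F.argmax(axis=1)


# Toy usage: two clusters, one labeled point each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(2, 0.3, (20, 2))])
W = np.exp(-((X[:, None] - X[None, :]) ** 2).sum(-1))       # RBF similarities
Y = np.zeros((40, 2)); Y[0, 0] = Y[39, 1] = 1.0
mask = np.zeros(40, bool); mask[[0, 39]] = True
print(propagate_labels(W, Y, mask))
```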
Similarity modeling for machine learning
Similarity is the extent to which two objects resemble each other. Modeling similarity is an important topic for both machine learning and computer vision. In this dissertation, we first propose a discriminative similarity learning method, then introduce two novel sparse similarity modeling methods for high-dimensional data from the perspectives of manifold learning and subspace learning. Our sparse similarity modeling methods learn a sparse similarity and consequently generate a sparse graph over the data. The generated sparse graph leads to superior performance in clustering and semi-supervised learning, compared to existing sparse-graph-based methods such as the ℓ1-graph and Sparse Subspace Clustering (SSC).
More concretely, our discriminative similarity learning method adopts a novel pairwise clustering framework by bridging the gap between clustering and multi-class classification. This pairwise clustering framework learns an unsupervised nonparametric classifier from each data partition, and searches for the optimal partition of the data by minimizing the generalization error of the learned classifiers associated with the data partitions.
As for our sparse similarity modeling methods, we propose a novel regularized ℓ1-graph (Rℓ1-graph) to improve the ℓ1-graph from the perspective of manifold learning. Our Rℓ1-graph generates a sparse graph that is aligned to the manifold structure of the data for better clustering performance. From the perspective of learning the subspace structures of high-dimensional data, we propose the ℓ0-graph, which generates a subspace-consistent sparse graph for clustering and semi-supervised learning. A subspace-consistent sparse graph is a sparse graph in which a data point is connected only to other data points lying in the same subspace; the representative method Sparse Subspace Clustering (SSC) provably generates a subspace-consistent sparse graph under certain assumptions on the subspaces and the data, e.g. independent/disjoint subspaces and subspace incoherence/affinity. In contrast, our ℓ0-graph can generate a subspace-consistent sparse graph for arbitrary distinct underlying subspaces under far less restrictive assumptions, i.e. only i.i.d. random data generation according to an arbitrary continuous distribution. Extensive experimental results on various data sets demonstrate the superiority of the ℓ0-graph compared to other methods, including SSC, for both clustering and semi-supervised learning.
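For context, the sketch below shows the baseline sparse-graph construction that the ℓ1-graph and SSC share, and that the Rℓ1-graph and ℓ0-graph refine: code each point over the remaining data with an ℓ1 penalty, then turn the coefficients into a symmetric affinity for spectral clustering. The manifold regularizer of the Rℓ1-graph and the ℓ0 constraint of the ℓ0-graph are deliberately omitted here, and all parameters are illustrative.

```python
# Sketch of the baseline l1-graph/SSC-style construction: sparse-code each point
# over the other points, symmetrize |coefficients| into an affinity, and cluster.
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.linear_model import Lasso


def sparse_similarity_graph(X, alpha=0.05):
    """X: (n, d) data. Returns an (n, n) symmetric sparse affinity matrix."""
    n = X.shape[0]
    C = np.zeros((n, n))
    for i in range(n):
        idx = [j for j in range(n) if j != i]
        # Represent x_i as a sparse linear combination of the other points.
        lasso = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)
        lasso.fit(X[idx].T, X[i])
        C[i, idx] = lasso.coef_
    A = np.abs(C)
    return 0.5 * (A + A.T)             # symmetrize the affinities


# Toy example: points drawn from two 2-dimensional subspaces of R^10.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(size=(30, 2)) @ rng.normal(size=(2, 10)),
               rng.normal(size=(30, 2)) @ rng.normal(size=(2, 10))])
A = sparse_similarity_graph(X)
labels = SpectralClustering(n_clusters=2, affinity="precomputed",
                            random_state=0).fit_predict(A)
print(labels)
```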
The proposed sparse similarity modeling methods require sparse coding using the entire data set as the dictionary, which can be inefficient, especially for large-scale data. To overcome this challenge, we propose Support Regularized Sparse Coding (SRSC), where a compact dictionary is learned. The data similarity induced by the support-regularized sparse codes leads to compelling clustering performance. Moreover, a feed-forward neural network, termed Deep-SRSC, is designed as a fast encoder to approximate the codes generated by SRSC, further improving the efficiency of SRSC.
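The "fast encoder" idea behind Deep-SRSC can be glimpsed in a single unrolled ISTA step: a linear map followed by soft-thresholding that approximates iterative sparse coding in one feed-forward pass. Deep-SRSC itself is a trained multi-layer network; the one-step version below is only an assumed, minimal illustration.

```python
# Sketch: approximate sparse coding with one feed-forward ISTA step (the flavor
# of a fast encoder). Deep-SRSC is a trained network; this is illustrative only.
import numpy as np


def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def fast_sparse_codes(X, D, alpha=0.1):
    """One ISTA step from zero initialization for min 0.5*||x - D^T c||^2 + alpha*||c||_1.
    X: (n, d) data; D: (k, d) compact learned dictionary with k << n atoms."""
    L = np.linalg.norm(D @ D.T, 2)     # Lipschitz constant of the gradient
    return soft_threshold(X @ D.T / L, alpha / L)


# Toy usage with a random unit-norm dictionary.
rng = np.random.default_rng(0)
D = rng.normal(size=(32, 64)); D /= np.linalg.norm(D, axis=1, keepdims=True)
X = rng.normal(size=(10, 64))
print(fast_sparse_codes(X, D).shape)   # (10, 32) sparse-ish codes
```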
Visual Recognition with Deep Nearest Centroids
We devise deep nearest centroids (DNC), a conceptually elegant yet
surprisingly effective network for large-scale visual recognition, by
revisiting Nearest Centroids, one of the most classic and simple classifiers.
Current deep models learn the classifier in a fully parametric manner, ignoring
the latent data structure and lacking simplicity and explainability. DNC
instead conducts nonparametric, case-based reasoning; it utilizes sub-centroids
of training samples to describe class distributions and clearly explains the
classification as the proximity of test data and the class sub-centroids in the
feature space. Due to the distance-based nature, the network output
dimensionality is flexible, and all the learnable parameters are only for data
embedding. That means all the knowledge learnt for ImageNet classification can
be completely transferred for pixel recognition learning, under the
"pre-training and fine-tuning" paradigm. Apart from its nested simplicity and
intuitive decision-making mechanism, DNC can even possess ad-hoc explainability
when the sub-centroids are selected as actual training images that humans can
view and inspect. Compared with parametric counterparts, DNC performs better on image classification (CIFAR-10, ImageNet) and greatly boosts pixel recognition (ADE20K, Cityscapes), with improved transparency and fewer learnable parameters, using various network architectures (ResNet, Swin) and segmentation models (FCN, DeepLabV3, Swin). We feel this work brings fundamental insights into related fields.
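A minimal sketch of the distance-based decision rule follows: embeddings are abstracted as precomputed features, and each class is summarized by a few sub-centroids against which test points are matched. Using k-means to pick the sub-centroids is an illustrative assumption, not necessarily the paper's within-class clustering.

```python
# Sketch: nearest sub-centroid classification over fixed embeddings, in the
# spirit of DNC. The embedding network is abstracted away; k-means per class
# is an illustrative choice for finding sub-centroids.
import numpy as np
from sklearn.cluster import KMeans


def fit_sub_centroids(feats, labels, per_class=3, seed=0):
    """Cluster each class's training embeddings into `per_class` sub-centroids."""
    cents, cent_labels = [], []
    for c in np.unique(labels):
        km = KMeans(n_clusters=per_class, n_init=10, random_state=seed)
        km.fit(feats[labels == c])
        cents.append(km.cluster_centers_)
        cent_labels += [c] * per_class
    return np.vstack(cents), np.array(cent_labels)


def predict(feats, cents, cent_labels):
    """Assign each test embedding the class of its nearest sub-centroid."""
    d2 = ((feats[:, None, :] - cents[None, :, :]) ** 2).sum(-1)
    return cent_labels[d2.argmin(axis=1)]


# Toy usage: two well-separated Gaussian classes in a 16-d "feature space".
rng = np.random.default_rng(0)
train = np.vstack([rng.normal(0, 1, (60, 16)), rng.normal(3, 1, (60, 16))])
y = np.array([0] * 60 + [1] * 60)
C, Cy = fit_sub_centroids(train, y)
print(predict(train[:5], C, Cy), predict(train[-5:], C, Cy))
```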