The Emerging Trends of Multi-Label Learning
Exabytes of data are generated daily by humans, leading to the growing need
for new efforts in dealing with the grand challenges for multi-label learning
brought by big data. For example, extreme multi-label classification is an
active and rapidly growing research area that deals with classification tasks
with an extremely large number of classes or labels, and using massive data
with limited supervision to build a multi-label classification model has become
valuable for practical applications. Beyond these, there are tremendous
efforts on how to harness the strong learning capability of deep learning to
better capture the label dependencies in multi-label learning, which is the key
for deep learning to address real-world classification tasks. However, there
has been a lack of systematic studies that focus explicitly on
analyzing the emerging trends and new challenges of multi-label learning in the
era of big data. It is imperative to call for a comprehensive survey to fulfill
this mission and delineate future research directions and new applications.
Comment: Accepted to TPAMI 202
Unsupervised Person Re-identification by Soft Multilabel Learning
Although unsupervised person re-identification (RE-ID) has drawn increasing
research attention due to its potential to address the scalability problem of
supervised RE-ID models, it is very challenging to learn discriminative
information in the absence of pairwise labels across disjoint camera views. To
overcome this problem, we propose a deep model for the soft multilabel learning
for unsupervised RE-ID. The idea is to learn a soft multilabel (real-valued
label likelihood vector) for each unlabeled person by comparing (and
representing) the unlabeled person with a set of known reference persons from
an auxiliary domain. We propose the soft multilabel-guided hard negative mining
to learn a discriminative embedding for the unlabeled target domain by
exploring the similarity consistency of the visual features and the soft
multilabels of unlabeled target pairs. Since most target pairs are cross-view
pairs, we develop the cross-view consistent soft multilabel learning to achieve
the learning goal that the soft multilabels are consistently good across
different camera views. To enable efficient soft multilabel learning, we
introduce the reference agent learning to represent each reference person by a
reference agent in a joint embedding. We evaluate our unified deep model on
Market-1501 and DukeMTMC-reID. Our model outperforms the state-of-the-art
unsupervised RE-ID methods by clear margins. Code is available at
https://github.com/KovenYu/MAR.
Comment: CVPR19, ora
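The idea of a soft multilabel as a real-valued likelihood vector over reference persons can be illustrated with a minimal sketch. It assumes, purely for illustration, that the comparison is an inner product between feature vectors followed by a softmax, and that the agreement of two soft multilabels is the sum of their element-wise minima. The function names `soft_multilabel` and `label_agreement` are hypothetical and are not taken from the authors' code.

```python
import math


def soft_multilabel(feature, reference_agents):
    """Return a likelihood vector over reference persons for one feature vector.

    Assumption: similarity is a plain inner product and the likelihoods are
    obtained with a softmax; the abstract does not specify either choice.
    """
    sims = [sum(f * a for f, a in zip(feature, agent)) for agent in reference_agents]
    # Numerically stable softmax: subtract the maximum before exponentiating.
    peak = max(sims)
    exps = [math.exp(s - peak) for s in sims]
    total = sum(exps)
    return [e / total for e in exps]


def label_agreement(y1, y2):
    """Agreement of two soft multilabels as the sum of element-wise minima.

    For two probability vectors this lies in [0, 1] and equals 1 only when
    the vectors are identical. This measure is an illustrative assumption.
    """
    return sum(min(a, b) for a, b in zip(y1, y2))


# Two hypothetical reference agents in a 2-D embedding.
agents = [[1.0, 0.0], [0.0, 1.0]]
y_a = soft_multilabel([1.0, 0.0], agents)
y_b = soft_multilabel([0.0, 1.0], agents)
```

Under these assumptions, a pair of unlabeled images whose visual features are similar but whose `label_agreement` is low would be a candidate hard negative, which is the kind of pair the soft multilabel-guided mining is described as exploiting.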
Pairwise Instance Relation Augmentation for Long-tailed Multi-label Text Classification
Multi-label text classification (MLTC) is one of the key tasks in natural
language processing. It aims to assign multiple target labels to one document.
Due to the uneven popularity of labels, the number of documents per label
follows a long-tailed distribution in most cases. It is much more challenging
to learn classifiers for data-scarce tail labels than for data-rich head
labels. The main reason is that head labels usually have sufficient
information, e.g., a large intra-class diversity, while tail labels do not. In
response, we propose a Pairwise Instance Relation Augmentation Network (PIRAN)
to augment tail-label documents to balance tail labels and head labels.
PIRAN consists of a relation collector and an instance generator. The former
aims to extract the document pairwise relations from head labels. Taking these
relations as perturbations, the latter tries to generate new document instances
in high-level feature space around the limited given tail-label instances.
Meanwhile, two regularizers (diversity and consistency) are designed to
constrain the generation process. The consistency regularizer encourages the
variance of tail labels to be close to that of head labels and further balances the
whole dataset. The diversity regularizer ensures that the generated instances
are diverse and avoids generating redundant instances. Extensive
experimental results on three benchmark datasets demonstrate that PIRAN
consistently outperforms the SOTA methods and dramatically improves the
performance on tail labels.
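The two-stage idea (collect pairwise relations from head labels, then use them as perturbations around tail-label instances) can be sketched in a heavily simplified form. The sketch below assumes that a "relation" is just the difference between two head-label feature vectors and that generation is a scaled addition; PIRAN itself is described as a learned network with regularizers, none of which is modeled here. The names `collect_relations` and `augment_tail` and the parameters `per_instance` and `scale` are hypothetical.

```python
import random


def collect_relations(head_features):
    """Pairwise differences between head-label instances (assumed 'relations')."""
    relations = []
    for i, a in enumerate(head_features):
        for j, b in enumerate(head_features):
            if i != j:
                relations.append([x - y for x, y in zip(a, b)])
    return relations


def augment_tail(tail_features, relations, per_instance=2, scale=0.5, seed=0):
    """Generate new instances around each tail-label instance.

    Each new instance is the tail instance plus a randomly chosen head
    relation, scaled down so that it stays near the original instance.
    """
    rng = random.Random(seed)
    generated = []
    for tail in tail_features:
        for _ in range(per_instance):
            relation = rng.choice(relations)
            generated.append([x + scale * d for x, d in zip(tail, relation)])
    return generated


# Three hypothetical head-label instances and one tail-label instance.
head = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
relations = collect_relations(head)
new_tail = augment_tail([[5.0, 5.0]], relations, per_instance=3)
```

The point of the sketch is only that the spread observed among data-rich head labels is transferred to the neighborhood of the few tail-label instances.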
Noisy multi-label semi-supervised dimensionality reduction
Noisy labeled data represent a rich source of information that is often
easily accessible and cheap to obtain, but label noise can also have many
negative consequences if not accounted for. How to fully utilize noisy labels
has been studied extensively within the framework of standard supervised
machine learning over a period of several decades. However, very little
research has been conducted on solving the challenge posed by noisy labels in
non-standard settings. This includes situations where only a fraction of the
samples are labeled (semi-supervised) and each high-dimensional sample is
associated with multiple labels. In this work, we present a novel
semi-supervised and multi-label dimensionality reduction method that
effectively utilizes information from both noisy multi-labels and unlabeled
data. With the proposed Noisy multi-label semi-supervised dimensionality
reduction (NMLSDR) method, the noisy multi-labels are denoised and unlabeled
data are labeled simultaneously via a specially designed label propagation
algorithm. NMLSDR then learns a projection matrix for reducing the
dimensionality by maximizing the dependence between the enlarged and denoised
multi-label space and the features in the projected space. Extensive
experiments on synthetic data, benchmark datasets, as well as a real-world case
study, demonstrate the effectiveness of the proposed algorithm and show that it
outperforms state-of-the-art multi-label feature extraction algorithms.
Comment: 38 page
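To make the label propagation step concrete, here is a minimal sketch of generic graph-based label propagation for multi-label data. It is not the specially designed NMLSDR algorithm, whose details the abstract does not give; it swaps in the standard iterative scheme where each sample's label scores are a blend of its neighbors' scores and its own initial (possibly noisy) labels. The function name `propagate_labels` and the parameters `alpha` and `iterations` are assumptions.

```python
def propagate_labels(weights, labels, alpha=0.5, iterations=50):
    """Spread multi-label scores over a similarity graph.

    weights: n x n list of non-negative affinities between samples.
    labels:  n x k list of initial label indicators (all zeros if unlabeled).
    alpha:   how much each sample trusts its neighbors versus its own labels.
    """
    n = len(weights)
    k = len(labels[0])
    # Row-normalise affinities so each sample averages over its neighbors.
    trans = []
    for row in weights:
        total = sum(row)
        trans.append([w / total if total > 0 else 0.0 for w in row])
    scores = [list(row) for row in labels]
    for _ in range(iterations):
        updated = []
        for i in range(n):
            new_row = []
            for c in range(k):
                spread = sum(trans[i][j] * scores[j][c] for j in range(n))
                new_row.append(alpha * spread + (1 - alpha) * labels[i][c])
            updated.append(new_row)
        scores = updated
    return scores


# Chain of three samples: the middle one is unlabeled.
weights = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
labels = [[1, 0], [0, 0], [0, 1]]
scores = propagate_labels(weights, labels)
```

In this toy chain the unlabeled middle sample ends up with equal, non-zero scores for both labels, since it sits between one sample of each kind. A noisy label would be softened in the same way when its neighbors disagree with it.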
Data Mining
The availability of big data due to computerization and automation has generated an urgent need for new techniques to analyze and convert big data into useful information and knowledge. Data mining is a promising and leading-edge technology for mining large volumes of data, looking for hidden information, and aiding knowledge discovery. It can be used for characterization, classification, discrimination, anomaly detection, association, clustering, trend or evolution prediction, and much more in fields such as science, medicine, economics, engineering, computers, and even business analytics. This book presents basic concepts, ideas, and research in data mining.
Deep Active Learning Explored Across Diverse Label Spaces
Deep learning architectures have been widely explored in computer vision and have
demonstrated commendable performance in a variety of applications. A fundamental challenge
in training deep networks is the requirement of large amounts of labeled training
data. While gathering large quantities of unlabeled data is cheap and easy, annotating
the data is an expensive process in terms of time, labor and human expertise.
Thus, developing algorithms that minimize the human effort in training deep models
is of immense practical importance. Active learning algorithms automatically identify
salient and exemplar samples from large amounts of unlabeled data and can add
maximal information to supervised learning models, thereby reducing the human annotation
effort in training machine learning models. The goal of this dissertation is to
fuse ideas from deep learning and active learning and design novel deep active learning
algorithms. The proposed learning methodologies explore diverse label spaces to
solve different computer vision applications. Three major contributions have emerged
from this work: (i) a deep active framework for multi-class image classification, (ii)
a deep active model with and without label correlation for multi-label image classification
and (iii) a deep active paradigm for regression. Extensive empirical studies
on a variety of multi-class, multi-label and regression vision datasets corroborate the
potential of the proposed methods for real-world applications. Additional contributions
include: (i) a multimodal emotion database consisting of recordings of facial
expressions, body gestures, vocal expressions and physiological signals of actors enacting
various emotions, (ii) four multimodal deep belief network models and (iii)
an in-depth analysis of the effect of transfer of multimodal emotion features between
source and target networks on classification accuracy and training time. These related
contributions help comprehend the challenges involved in training deep learning
models and motivate the main goal of this dissertation.
Dissertation/Thesis, Doctoral Dissertation, Electrical Engineering 201
Graph Analysis and Applications in Clustering and Content-based Image Retrieval
About 300 years ago, when studying the Seven Bridges of Königsberg problem - a famous problem concerning paths on graphs - the great mathematician Leonhard Euler said, “This question is very banal, but seems to me worthy of attention”. Since then, graph theory and graph analysis have not only become one of the most important branches of mathematics, but have also found an enormous range of important applications in many other areas. A graph is a mathematical model that abstracts entities and the relationships between them as nodes and edges. Many types of interactions between entities can be modeled by graphs, for example, social interactions between people, communications between entities in computer networks, and relations between biological species. Many other types of data that do not appear to be graphs can be converted into graphs by certain operations, for example, the k-nearest neighborhood graph built from pixels in an image.
Cluster structure is a common phenomenon in many real-world graphs, for example, social networks. Finding the clusters in a large graph is important for understanding the underlying relationships between the nodes. Graph clustering is a technique that partitions nodes into clusters such that connections among nodes in a cluster are dense and connections between nodes in different clusters are sparse. Various approaches have been proposed to solve graph clustering problems. A common approach is to optimize a predefined clustering metric using different optimization methods. However, most of these optimization problems are NP-hard due to the discrete set-up of hard clustering. These optimization problems can be relaxed, and a sub-optimal solution can be found. A different approach is to apply data clustering algorithms to graph clustering problems. With this approach, one must first find appropriate features for each node that represent the local structure of the graph. The Limited Random Walk algorithm uses the random walk procedure to explore the graph and extracts efficient features for the nodes. It follows the embarrassingly parallel paradigm and can therefore process large graph data efficiently using modern high-performance computing facilities. This thesis gives the details of this algorithm and analyzes its stability issues.
Based on the study of the cluster structures in a graph, we define the authenticity score of an edge as the difference between the actual and the expected number of edges that connect the two groups of the neighboring nodes of the two end nodes. The authenticity score can be used in many important applications, such as graph clustering, outlier detection, and graph data preprocessing. In particular, a data clustering algorithm that uses the authenticity scores on a mutual k-nearest neighborhood graph achieves more reliable and superior performance compared to other popular algorithms. This thesis also proves theoretically that this algorithm can asymptotically achieve complete recovery of the ground truth of graphs that were generated by a stochastic r-block model.
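The definition of the authenticity score can be illustrated with a small sketch. The abstract does not say how the expected number of edges is computed, so the code below assumes a degree-based null model in which the expected number of edges between nodes a and b is d_a * d_b / (2m). It also assumes that each end node is excluded from the other's neighbor group and that ordered pairs are counted, so an edge between two common neighbors is counted twice. All function names are hypothetical.

```python
def build_graph(edges):
    """Build an undirected adjacency map (node -> set of neighbors)."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def authenticity_score(adj, u, v):
    """Actual minus expected number of edges between the neighbor groups of u and v.

    Assumption: the expectation uses a degree-based null model,
    d_a * d_b / (2m) for every pair (a, b); the thesis may define it differently.
    """
    two_m = sum(len(nbrs) for nbrs in adj.values())  # twice the edge count
    group_u = adj[u] - {v}
    group_v = adj[v] - {u}
    actual = 0
    expected = 0.0
    for a in group_u:
        for b in group_v:
            if a == b:
                continue  # a common neighbor cannot form an edge with itself
            if b in adj[a]:
                actual += 1
            expected += len(adj[a]) * len(adj[b]) / two_m
    return actual - expected


# Two 4-node cliques (0-3 and 4-7) joined by a single bridge edge (3, 4).
clique_a = [(i, j) for i in range(4) for j in range(i + 1, 4)]
clique_b = [(i, j) for i in range(4, 8) for j in range(i + 1, 8)]
graph = build_graph(clique_a + clique_b + [(3, 4)])
inside = authenticity_score(graph, 0, 1)
bridge = authenticity_score(graph, 3, 4)
```

Under these assumptions the edge inside a clique receives a positive score and the bridge a negative one, which matches the intuition that low-scoring edges are the ones to cut when clustering.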
Content-based image retrieval (CBIR) is an important application in computer vision, media information retrieval, and data mining. Given a query image, a CBIR system ranks the images in a large image database by their “similarities” to the query image. However, because the definition of “similarity” is ambiguous, it is very difficult for a CBIR system to select the optimal feature set and ranking algorithm to satisfy the purpose of the query. Graph technologies have been used to improve the performance of CBIR systems in various ways. In this thesis, a novel method is proposed to construct a visual-semantic graph, a graph where nodes represent semantic concepts and edges represent visual associations between concepts. The constructed visual-semantic graph not only helps the user to locate the target images quickly but also helps answer questions related to the query image. Experiments show that the effort of locating the target image is reduced by 25% with the help of visual-semantic graphs.
Graph analysis will continue to play an important role in future data analysis. In particular, the visual-semantic graph that captures important and interesting visual associations between the concepts is worthy of further attention.