205 research outputs found

Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective

Author: Li Wanqing
Ogunbona Philip
Xu Dong
Zhang Jing
Publication date: 01/01/2019

This paper takes a problem-oriented perspective and presents a comprehensive review of transfer learning methods, both shallow and deep, for cross-dataset visual recognition. Specifically, it categorises cross-dataset recognition into seventeen problems based on a set of carefully chosen data and label attributes. Such a problem-oriented taxonomy has allowed us to examine how different transfer learning approaches tackle each problem and how well each problem has been researched to date. The comprehensive problem-oriented review of the advances in transfer learning with respect to these problems has revealed not only the challenges in transfer learning for visual recognition, but also the problems (e.g. eight of the seventeen) that have been scarcely studied. This survey presents not only an up-to-date technical review for researchers, but also a systematic approach and a reference for a machine learning practitioner to categorise a real problem and to look up a possible solution accordingly.

arXiv.org e-Print Archive

Research Online

Multi-Label Dimensionality Reduction

Publication date: 01/01/2011

Multi-label learning, which deals with data associated with multiple labels simultaneously, is ubiquitous in real-world applications. To overcome the curse of dimensionality in multi-label learning, in this thesis I study multi-label dimensionality reduction, which extracts a small number of features by removing the irrelevant, redundant, and noisy information while considering the correlation among different labels in multi-label learning. Specifically, I propose Hypergraph Spectral Learning (HSL) to perform dimensionality reduction for multi-label data by exploiting correlations among different labels using a hypergraph. The regularization effect on the classical dimensionality reduction algorithm known as Canonical Correlation Analysis (CCA) is elucidated in this thesis. The relationship between CCA and Orthonormalized Partial Least Squares (OPLS) is also investigated. To perform dimensionality reduction efficiently for large-scale problems, two efficient implementations are proposed for a class of dimensionality reduction algorithms, including canonical correlation analysis, orthonormalized partial least squares, linear discriminant analysis, and hypergraph spectral learning. The first approach is a direct least squares approach which allows the use of different regularization penalties, but is applicable under a certain assumption; the second one is a two-stage approach which can be applied in the regularization setting without any assumption. Furthermore, an online implementation for the same class of dimensionality reduction algorithms is proposed when the data comes sequentially. A Matlab toolbox for multi-label dimensionality reduction has been developed and released. The proposed algorithms have been applied successfully in the Drosophila gene expression pattern image annotation. The experimental results on some benchmark data sets in multi-label learning also demonstrate the effectiveness and efficiency of the proposed algorithms.

Dissertation/Thesis, Ph.D. Computer Science 201

ASU Digital Repository

The Emerging Trends of Multi-Label Learning

Author: Liu Weiwei
Shen Xiaobo
Tsang Ivor W.
Wang Haobo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 17/11/2021

Exabytes of data are generated daily by humans, leading to a growing need for new efforts to deal with the grand challenges that big data brings to multi-label learning. For example, extreme multi-label classification is an active and rapidly growing research area that deals with classification tasks with an extremely large number of classes or labels, and utilizing massive data with limited supervision to build a multi-label classification model is becoming valuable for practical applications. Besides these, there are tremendous efforts on how to harvest the strong learning capability of deep learning to better capture the label dependencies in multi-label learning, which is the key for deep learning to address real-world classification tasks. However, there has been a lack of systematic studies that focus explicitly on analyzing the emerging trends and new challenges of multi-label learning in the era of big data. It is imperative to call for a comprehensive survey to fulfill this mission and delineate future research directions and new applications.

Comment: Accepted to TPAMI 202

arXiv.org e-Print Archive

OPUS - University of Technology Sydney

Image Structured Annotation Based on Deep Neural Network Natural Language Processing

Author: Hua Jing
Jia Jing
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 31/08/2024

The image structuring process was divided into three main stages: model training, model prediction, and report structuring. In the report structuring stage, based on the feature annotation sequence, this paper associated the text sequence with the corresponding table structure and stored the text sequence in the corresponding back-end database. On dataset 1, the accuracy with the visual information submodel removed was 30%, and the accuracy with the semantic information submodel removed was 50%. The scheme proposed in this paper aims to better perform automatic image annotation and meet the requirements of image annotation in the era of Big Data.

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

Multimedia question answering

Author: NIE LIQIANG
Publication date: 04/07/2013

Ph.D., DOCTOR OF PHILOSOPHY

ScholarBank@NUS

DATA-DRIVEN APPROACH TO IMAGE CLASSIFICATION

Author: NarasimhaMurthy Venkatesh
Publication venue: ScholarWorks@UMass Amherst
Publication date: 02/07/2019

Image classification has been a core topic in the computer vision community. Its recent success with convolutional neural network (CNN) algorithms has led to various real-world applications such as large-scale management of photos and videos on cloud and social media platforms, image-based search for online retailers, self-driving cars, building robots and healthcare. Image classification can be broadly categorized into binary, multi-class and multi-label classification problems. Binary classification involves assigning one of two class labels to an instance. In a multi-class classification problem, an instance should be categorized into one of more than two classes. Multi-label classification is a generalized version of the multi-class classification problem where each image is assigned multiple labels as opposed to a single label. In this work, we first present various methods that take advantage of deep representations (fully connected layer of a CNN pre-trained on the ImageNet dataset) and yield better performance on multi-label classification when compared to methods that use over a dozen conventional visual features. Following the success of deep representations, we intend to build a generic end-to-end deep learning framework to address all three problem categories of image classification. However, there are still no well-established guidelines (in terms of choosing the number of layers to go deeper, the number of kernels and the size, the type of regularizer, the choice of non-linear function, etc.) to build an efficient deep neural network, and often network architecture design is specific to a problem or dataset. Hence, we present some initial efforts in building a computational framework called Deep Decision Network (DDN) which is completely data-driven. DDN is a tree-like structure built stage-wise. During the learning phase, starting from the root network node, DDN automatically builds a network that splits the data into disjoint clusters of classes which would be handled by the subsequent expert networks. This results in a tree-like structured network driven by the data. The proposed approach provides an insight into the data by identifying the groups of classes that are hard to classify and require more attention when compared to others. This feature is crucial for people trying to solve the problem with little or no domain knowledge, especially for applications in the medical domain. Initially, we evaluate DDN on a binary classification problem and later extend it to more challenging multi-class and multi-label classification problems. The extension of DDN to multi-class and multi-label classification involves some changes, but they still operate under the same underlying principle. In all three cases, the proposed approach is tested for its recognition performance and scalability on publicly available datasets, providing comparison to other methods.

ScholarWorks@UMass Amherst

Socializing the Semantic Gap: A Comparative Survey on Image Tag Assignment, Refinement and Retrieval

Author: Ballan Lamberto
Bertini Marco
Del Bimbo Alberto
Li Xirong
Snoek Cees G. M.
Uricchio Tiberio
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2016

Where previous reviews on content-based image retrieval emphasize what can be seen in an image to bridge the semantic gap, this survey considers what people tag about an image. A comprehensive treatise of three closely linked problems, i.e., image tag assignment, refinement, and tag-based image retrieval, is presented. While existing works vary in terms of their targeted tasks and methodology, they rely on the key functionality of tag relevance, i.e., estimating the relevance of a specific tag with respect to the visual content of a given image and its social context. By analyzing what information a specific method exploits to construct its tag relevance function and how such information is exploited, this paper introduces a taxonomy to structure the growing literature, understand the ingredients of the main works, clarify their connections and differences, and recognize their merits and limitations. For a head-to-head comparison between state-of-the-art methods, a new experimental protocol is presented, with training sets containing 10k, 100k and 1m images and an evaluation on three test sets, contributed by various research groups. Eleven representative works are implemented and evaluated. Putting all this together, the survey aims to provide an overview of the past and foster progress for the near future.

Comment: to appear in ACM Computing Survey

arXiv.org e-Print Archive

Crossref

Florence Research

Archivio istituzionale della ricerca - Università di Macerata

International Migration, Integration and Social Cohesion online publications

UvA-DARE

Archivio istituzionale della ricerca - Università di Padova