Multinomial latent logistic regression
University of Technology Sydney. Faculty of Engineering and Information Technology. We are entering the era of big data. The rapid growth of data gives rise to more complicated research objectives, for which it is important to exploit the superior discriminative power of explicitly designed feature representations. However, training models based on these features usually requires detailed human annotations, which is becoming intractable due to the exponential growth of data scale.
A possible solution to this problem is to employ a restricted form of training data, while treating the missing annotations as latent variables and performing latent variable inference during training. This solution is termed weakly supervised learning, and it usually relies on the development of latent variable models. In this dissertation, we propose a novel latent variable model, multinomial latent logistic regression (MLLR), and present a set of applications of the proposed model in weakly supervised scenarios, which also cover multiple practical issues in real-world applications.
We first derive the proposed MLLR in Chapter 3, together with a theoretical analysis covering its concavity and convexity properties, optimization methods, and a comparison with existing latent variable models on structured outputs. Our key discovery is that, by performing “maximization” over latent variables and “averaging” over output labels, MLLR is particularly effective when the latent variables have a large set of possible values or no well-defined graphical structure exists, and when probabilistic analysis is preferred on the output predictions. Based on this, the following three chapters discuss the application of MLLR to a variety of weakly supervised learning tasks.
In Chapter 4, we study the application of MLLR to a novel task, architectural style classification. A unique property of this task is that rich inter-class relationships between the classes being recognized make it difficult to describe a building using “hard” assignments of styles. MLLR is therefore expected to be particularly effective, owing to its ability to produce probabilistic analysis of output predictions in weakly supervised scenarios. Experiments are conducted on a new self-collected dataset, where several interesting discoveries on architectural styles are presented together with the traditional classification task.
In Chapter 5, we study the application of MLLR to an extreme case of weakly supervised learning for fine-grained visual categorization. The core challenge here is that the inter-class variance between subordinate categories is very limited, sometimes even lower than the intra-class variance. In addition, because their objective function is non-convex, latent variable models including MLLR are usually very sensitive to initialization. To address these problems, we propose a novel multi-task co-localization strategy that provides a warm start for MLLR and in turn takes advantage of the small inter-class variance between subordinate categories by regarding them as related tasks. Experimental results on several benchmarks demonstrate the effectiveness of the proposed method, which achieves results comparable to those of recent methods that use stronger supervision.
In Chapter 6, we aim to further facilitate and scale weakly supervised learning via a novel knowledge transfer strategy, which introduces detailed domain knowledge from sophisticated methods trained on strongly supervised datasets. The proposed strategy is shown to be applicable at a much larger, web scale, especially because the transferred domain knowledge makes it possible to remove noise. A generalized MLLR is proposed to solve this problem using a combination of strongly and weakly supervised training data.
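The abstract above describes MLLR as performing “maximization” over latent variables and “averaging” over output labels. The following is a minimal sketch of one way to read that description, not the dissertation's actual formulation: it assumes a linear score for each class and latent value, takes the maximum score over latent values per class, and normalizes the resulting class scores with a softmax. The function name `mllr_probabilities`, the weight layout, and the toy inputs are all hypothetical.

```python
import math

def mllr_probabilities(weights, latent_features):
    """Class probabilities under an assumed max-then-softmax form.

    weights: list of K weight vectors, one per class (hypothetical layout).
    latent_features: list of H feature vectors, one per candidate latent value.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    # "Maximization" over latent variables: keep the best-scoring
    # latent value for each class.
    class_scores = [max(dot(w, f) for f in latent_features) for w in weights]

    # "Averaging" over output labels: a numerically stable softmax
    # turns the per-class scores into a probability distribution.
    shift = max(class_scores)
    exps = [math.exp(s - shift) for s in class_scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy example: three classes, two candidate latent values (made-up numbers).
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
latent_features = [[0.5, 2.0], [1.5, -0.5]]
probs = mllr_probabilities(weights, latent_features)
```

Under this reading, the output is a full distribution over labels rather than a single “hard” assignment, which matches the abstract's emphasis on probabilistic analysis of output predictions.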
Sparse Multinomial Logistic Regression via Approximate Message Passing
For the problem of multi-class linear classification and feature selection,
we propose approximate message passing approaches to sparse multinomial
logistic regression (MLR). First, we propose two algorithms based on the Hybrid
Generalized Approximate Message Passing (HyGAMP) framework: one finds the
maximum a posteriori (MAP) linear classifier and the other finds an
approximation of the test-error-rate minimizing linear classifier. Then we
design computationally simplified variants of these two algorithms. Next, we
detail methods to tune the hyperparameters of their assumed statistical models
using Stein's unbiased risk estimate (SURE) and expectation-maximization (EM),
respectively. Finally, using both synthetic and real-world datasets, we
demonstrate improved error-rate and runtime performance relative to existing
state-of-the-art approaches to sparse MLR.
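To make the problem setting concrete, here is a small sketch of sparse multinomial logistic regression itself. It does not implement HyGAMP or any message passing algorithm from the abstract; it swaps in plain proximal gradient descent (ISTA) with an L1 penalty, which is an assumed choice of sparsity-promoting regularizer. The function names, hyperparameter values, and toy data are hypothetical.

```python
import math

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def soft_threshold(v, t):
    # Proximal operator of the L1 penalty: shrink towards zero by t.
    return math.copysign(max(abs(v) - t, 0.0), v)

def fit_sparse_mlr(X, y, n_classes, lam=0.1, step=0.5, n_iter=200):
    """L1-regularized multinomial logistic regression via ISTA (assumed method)."""
    n, d = len(X), len(X[0])
    W = [[0.0] * d for _ in range(n_classes)]
    for _ in range(n_iter):
        # Gradient of the average multinomial log-loss.
        grad = [[0.0] * d for _ in range(n_classes)]
        for x, label in zip(X, y):
            p = softmax([sum(wj * xj for wj, xj in zip(w, x)) for w in W])
            for k in range(n_classes):
                err = p[k] - (1.0 if k == label else 0.0)
                for j in range(d):
                    grad[k][j] += err * x[j] / n
        # Gradient step followed by soft-thresholding, which sets
        # small weights exactly to zero (feature selection).
        W = [[soft_threshold(W[k][j] - step * grad[k][j], step * lam)
              for j in range(d)] for k in range(n_classes)]
    return W

# Toy data: features 0 and 1 determine the class; feature 2 is uninformative.
X = [[1.0, 0.0, 0.1], [0.0, 1.0, -0.1], [1.0, 0.0, -0.1], [0.0, 1.0, 0.1]]
y = [0, 1, 0, 1]
W = fit_sparse_mlr(X, y, n_classes=2)
```

In this toy setup the weights on the uninformative third feature stay at zero while the weights on the informative features grow, which is the joint classification and feature selection behaviour the abstract refers to.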