1,046 research outputs found
Efficient Image Gallery Representations at Scale Through Multi-Task Learning
Image galleries provide a rich source of diverse information about a product
which can be leveraged across many recommendation and retrieval applications.
We study the problem of building a universal image gallery encoder through
multi-task learning (MTL) approach and demonstrate that it is indeed a
practical way to achieve generalizability of learned representations to new
downstream tasks. Additionally, we analyze the relative predictive performance
of MTL-trained solutions against optimal and substantially more expensive
solutions, and find signals that MTL can be a useful mechanism to address
sparsity in low-resource binary tasks.Comment: Proceedings of the 43rd International ACM SIGIR Conference on
Research and Development in Information Retrieva
PredNet and Predictive Coding: A Critical Review
PredNet, a deep predictive coding network developed by Lotter et al.,
combines a biologically inspired architecture based on the propagation of
prediction error with self-supervised representation learning in video. While
the architecture has drawn a lot of attention and various extensions of the
model exist, there is a lack of a critical analysis. We fill in the gap by
evaluating PredNet both as an implementation of the predictive coding theory
and as a self-supervised video prediction model using a challenging video
action classification dataset. We design an extended model to test if
conditioning future frame predictions on the action class of the video improves
the model performance. We show that PredNet does not yet completely follow the
principles of predictive coding. The proposed top-down conditioning leads to a
performance gain on synthetic data, but does not scale up to the more complex
real-world action classification dataset. Our analysis is aimed at guiding
future research on similar architectures based on the predictive coding theory
Shared Roots: Regularizing Deep Neural Networks through Multitask Learning
In this paper, we propose to regularize deep neural nets with a new type of multitask learning where the auxiliary task is formed by agglomerating classes into super-classes. As such, it is possible to jointly train the network on the class-based classification problem AND super-class based classification problem. We study this in settings where the training set is small and show that , concurrently with a regularization scheme of randomly reinitializing weights in deeper layers, this leads to competitive results on the ImageNet and Caltech-256 datasets and state-of-the-art results on CIFAR-100
Text Classification
There is an abundance of text data in this world but most of it is raw. We need to extract information from this data to make use of it. One way to extract this information from raw text is to apply informative labels drawn from a pre-defined fixed set i.e. Text Classification. In this thesis, we focus on the general problem of text classification, and work towards solving challenges associated to binary/multi-class/multi-label classification. More specifically, we deal with the problem of (i) Zero-shot labels during testing; (ii) Active learning for text screening; (iii) Multi-label classification under low supervision; (iv) Structured label space; (v) Classifying pairs of words in raw text i.e. Relation Extraction. For (i), we use a zero-shot classification model that utilizes independently learned semantic embeddings. Regarding (ii), we propose a novel active learning algorithm that reduces problem of bias in naive active learning algorithms. For (iii), we propose neural candidate-selector architecture that starts from a set of high-recall candidate labels to obtain high-precision predictions. In the case of (iv), we proposed an attention based neural tree decoder that recursively decodes an abstract into the ontology tree. For (v), we propose using second-order relations that are derived by explicitly connecting pairs of words via context token(s) for improved relation extraction. We use a wide variety of both traditional and deep machine learning tools. More specifically, we used traditional machine learning models like multi-valued linear regression and logistic regression for (i, ii), deep convolutional neural networks for (iii), recurrent neural networks for (iv) and transformer networks for (v)
- …