Associative embeddings for large-scale knowledge transfer with self-assessment
We propose a method for knowledge transfer between semantically related
classes in ImageNet. By transferring knowledge from the images that have
bounding-box annotations to the others, our method is capable of automatically
populating ImageNet with many more bounding-boxes and even pixel-level
segmentations. The underlying assumption that objects from semantically related
classes look alike is formalized in our novel Associative Embedding (AE)
representation. AE recovers the latent low-dimensional space of appearance
variations among image windows. The dimensions of AE space tend to correspond
to aspects of window appearance (e.g. side view, close up, background). We
model the overlap of a window with an object using Gaussian Processes (GP)
regression, which spreads annotation smoothly through AE space. The
probabilistic nature of GP allows our method to perform self-assessment, i.e.
assigning a quality estimate to its own output. It enables trading off the
amount of returned annotations for their quality. A large-scale experiment on
219 classes and 0.5 million images demonstrates that our method outperforms
state-of-the-art methods and baselines for both object localization and
segmentation. Using self-assessment we can automatically return bounding-box
annotations for 30% of all images with high localization accuracy (i.e. 73% average overlap with ground-truth).
Comment: A final CVPR version with a correction in (1). IEEE Computer Vision and Pattern Recognition, 201
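The self-assessment mechanism described above is easy to illustrate: a GP regressor's predictive variance can serve as a per-window quality estimate, and thresholding it trades the amount of returned annotations for their quality. Below is a minimal sketch on synthetic data, assuming an AE-style low-dimensional embedding is already available; the overlap values, kernel, and thresholds are illustrative, not the authors' actual setup.

```python
# Minimal sketch of GP-based self-assessment on synthetic data.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)

# Synthetic stand-in for windows embedded in a low-dimensional AE space.
X_train = rng.normal(size=(200, 4))          # annotated windows (AE coords)
y_train = rng.uniform(0.0, 1.0, size=200)    # their overlap with ground truth
X_new = rng.normal(size=(1000, 4))           # windows from unannotated images

gp = GaussianProcessRegressor(kernel=RBF(1.0) + WhiteKernel(0.1),
                              normalize_y=True)
gp.fit(X_train, y_train)

# Predictive mean = estimated overlap; predictive std = self-assessed quality.
mean, std = gp.predict(X_new, return_std=True)

# Trade quantity for quality: keep only confident, high-overlap predictions.
confident = (std < 0.15) & (mean > 0.7)
print(f"returned {confident.mean():.0%} of windows as high-quality annotations")
```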
Where are you talking about? Advances and Challenges of Geographic Analysis of Text with Application to Disease Monitoring
The Natural Language Processing task we focus on in this thesis is Geoparsing. Geoparsing is the process of extraction and grounding of toponyms (place names). Consider this sentence: "The victims of the Spanish earthquake off the coast of Malaga were of American and Mexican origin." Four toponyms will be extracted (called Geotagging) and grounded to their geographic coordinates (called Toponym Resolution). However, our research goes further than any previous work by showing how to distinguish the literal place(s) of the event (Spain, Malaga) from other linguistic types/uses such as nationalities (Mexican, American), improving downstream task accuracy. We consolidate and extend the Standard Evaluation Framework, discuss key research problems, then present concrete solutions in order to advance each stage of geoparsing. For geotagging, as well as training a SOTA neural Location-NER tagger, we simplify Metonymy Resolution with a novel minimalist feature extraction combined with an LSTM-based classifier, matching SOTA results. For toponym resolution, we deploy the latest deep learning methods to achieve SOTA performance by augmenting neural models with hitherto unused geographic features called Map Vectors. With each research project, we provide high-quality datasets and system prototypes, further building resources in this field. We then show how these geoparsing advances coupled with our proposed Intra-Document Analysis can be used to associate news articles with locations in order to monitor the spread of public health threats. To this end, we evaluate our research contributions with production data from a real-time downstream application to improve geolocation of news events for disease monitoring. The data was made available to us by the Joint Research Centre (JRC), which operates one such system called MediSys that processes incoming news articles in order to monitor threats to public health and make these available to a variety of governmental, business and non-profit organisations. We also discuss steps towards an end-to-end, automated news monitoring system and make actionable recommendations for future work. In summary, the thesis aims are twofold: (1) Generate original geoparsing research aimed at advancing each stage of the pipeline by addressing pertinent challenges with concrete solutions and actionable proposals. (2) Demonstrate how this research can be applied to news event monitoring to increase the efficacy of existing biosurveillance systems, e.g. the European Commission's MediSys.
I was generously funded by DREAM CDT, which was itself funded by NERC of UKRI.
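As a rough illustration of the two-stage pipeline the thesis describes (geotagging, then toponym resolution), here is a minimal sketch using an off-the-shelf spaCy NER model and a toy gazetteer. The real work trains a neural Location-NER tagger and resolves toponyms with Map Vectors; the GAZETTEER dict and the NORP handling below are hypothetical stand-ins.

```python
# Minimal two-stage geoparsing sketch: NER geotagging + gazetteer resolution.
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes this model is installed

# Toy gazetteer mapping toponyms to (lat, lon); real systems use GeoNames.
GAZETTEER = {"Spain": (40.46, -3.75), "Malaga": (36.72, -4.42)}

def geoparse(text):
    doc = nlp(text)
    results = []
    for ent in doc.ents:
        if ent.label_ in ("GPE", "LOC"):            # geotagging
            coords = GAZETTEER.get(ent.text)        # toponym resolution
            results.append((ent.text, coords))
        elif ent.label_ == "NORP":
            # Nationalities ("Mexican", "American"): non-literal uses the
            # thesis distinguishes from literal event locations.
            results.append((ent.text, None))
    return results

print(geoparse("The victims of the Spanish earthquake off the coast of "
               "Malaga were of American and Mexican origin."))
```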
Transfer learning through greedy subset selection
We study the binary transfer learning problem, focusing on how to select sources from a large pool and how to combine them to yield good performance on a target task. In particular, we consider the transfer learning setting where one does not have direct access to the source data, but rather employs the source hypotheses trained from them. Building on the literature on the best subset selection problem, we propose an efficient algorithm that selects relevant source hypotheses and feature dimensions simultaneously. On three computer vision datasets we achieve state-of-the-art results, substantially outperforming transfer learning and popular feature selection baselines in a small-sample setting. We also theoretically prove that, under reasonable assumptions on the source hypotheses, our algorithm can learn effectively from few examples.
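To make the greedy subset-selection idea concrete, here is a minimal sketch of a forward-selection loop over source hypotheses, assuming only their predictions on target validation data are available (no source data). It omits the paper's simultaneous feature-dimension selection and its theoretical machinery.

```python
# Minimal greedy forward selection of source hypotheses on synthetic data.
import numpy as np

def greedy_select(source_preds, y_val, k):
    """source_preds: (n_sources, n_val) signed scores of each source
    hypothesis on target validation data; y_val: labels in {-1, +1}."""
    selected, combined = [], np.zeros(len(y_val))
    for _ in range(k):
        best, best_acc = None, -1.0
        for s in range(len(source_preds)):
            if s in selected:
                continue
            # Accuracy of the current combination plus candidate source s.
            acc = np.mean(np.sign(combined + source_preds[s]) == y_val)
            if acc > best_acc:
                best, best_acc = s, acc
        selected.append(best)
        combined += source_preds[best]
    return selected

rng = np.random.default_rng(0)
y = rng.choice([-1, 1], size=50)
preds = rng.normal(size=(20, 50)) + 0.5 * y   # some sources weakly informative
print(greedy_select(preds, y, k=3))
```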
Analogical Reasoning: An Algorithm Comparison for Natural Language Processing
There is a continual push to make Artificial Intelligence (AI) as human-like as possible; however, this is a difficult task. A significant limitation is the inability of AI to learn beyond its current comprehension. Analogical reasoning (AR), whereby learning by analogy occurs, has been proposed as one method to achieve this goal. Current AR models have their roots in symbolist, connectionist, or hybrid approaches, which indicate how analogies are evaluated. No current studies have compared psychologically-inspired and natural language processing (NLP)-produced algorithms to one another; this study compares seven AR algorithms from both realms on multiple-choice word-based analogy problems. Assessment is based on selection of the correct answer, "correctness," and their similarity score prediction compared to the "ideal" score, which is defined as the "goodness" metric. Psychologically-based models have an advantage based on our metrics; however, there is not a clear one-size-fits-all algorithm for all AR problems.
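As a concrete example of the NLP-produced side of this comparison, the sketch below solves a multiple-choice analogy by vector offset and ranks candidates by cosine similarity, a score playing the role of the "goodness" metric above. The embeddings are random stand-ins; the seven compared algorithms are not reproduced here.

```python
# Minimal vector-offset analogy solver: "a is to b as c is to ?".
import numpy as np

rng = np.random.default_rng(0)
VOCAB = ["king", "queen", "man", "woman", "apple", "car"]
EMB = {w: rng.normal(size=50) for w in VOCAB}  # toy embeddings

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def solve_analogy(a, b, c, choices):
    target = EMB[b] - EMB[a] + EMB[c]          # offset: b - a + c
    # Similarity of each candidate to the target acts as its score.
    scores = {w: cosine(target, EMB[w]) for w in choices}
    return max(scores, key=scores.get), scores

answer, scores = solve_analogy("man", "king", "woman",
                               ["queen", "apple", "car"])
print(answer, scores)
```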
Scalable Greedy Algorithms for Transfer Learning
In this paper we consider the binary transfer learning problem, focusing on how to select and combine sources from a large pool to yield good performance on a target task. Constraining our scenario to the real world, we do not assume direct access to the source data, but rather employ the source hypotheses trained from them. We propose an efficient algorithm that selects relevant source hypotheses and feature dimensions simultaneously, building on the literature on the best subset selection problem. Our algorithm achieves state-of-the-art results on three computer vision datasets, substantially outperforming both transfer learning and popular feature selection baselines in a small-sample setting. We also present a randomized variant that achieves the same results with a computational cost independent of the number of source hypotheses and feature dimensions. Finally, we theoretically prove that, under reasonable assumptions on the source hypotheses, our algorithm can learn effectively from few examples.
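The randomized variant mentioned above can be sketched as follows: at each greedy step, only a fixed-size random sample of the remaining sources is scored, so the per-step cost does not grow with the pool size. This is an illustrative reading of the idea, not the paper's exact algorithm or its guarantees.

```python
# Randomized greedy selection: score a constant-size sample per step.
import numpy as np

def randomized_greedy_select(source_preds, y_val, k, sample_size=8, seed=0):
    rng = np.random.default_rng(seed)
    selected, combined = [], np.zeros(len(y_val))
    for _ in range(k):
        pool = [s for s in range(len(source_preds)) if s not in selected]
        candidates = rng.choice(pool, size=min(sample_size, len(pool)),
                                replace=False)
        # Only the sampled candidates are scored: constant work per step.
        accs = [np.mean(np.sign(combined + source_preds[s]) == y_val)
                for s in candidates]
        best = candidates[int(np.argmax(accs))]
        selected.append(best)
        combined += source_preds[best]
    return selected

rng = np.random.default_rng(1)
y = rng.choice([-1, 1], size=50)
preds = rng.normal(size=(100, 50)) + 0.5 * y
print(randomized_greedy_select(preds, y, k=3))
```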
Information theory-based compositional distributional semantics
In the context of text representation, Compositional Distributional Semantics models aim to fuse the Distributional Hypothesis and the Principle of Compositionality. Text embedding is based on co-occurrence distributions, and the representations are in turn combined by compositional functions that take the text structure into account. However, the theoretical basis of compositional functions is still an open issue. In this article we define and study the notion of Information Theory-based Compositional Distributional Semantics (ICDS): (i) we first establish formal properties for embedding, composition, and similarity functions based on Shannon's Information Theory; (ii) we analyze the existing approaches under this prism, checking whether or not they comply with the established desirable properties; (iii) we propose two parameterizable composition and similarity functions that generalize traditional approaches while fulfilling the formal properties; and finally (iv) we perform an empirical study on several textual similarity datasets that include sentences with high and low lexical overlap, and on the similarity between words and their descriptions. Our theoretical analysis and empirical results show that fulfilling the formal properties positively affects the accuracy of text representation models in terms of correspondence (isometry) between the embedding and meaning spaces.
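One simple instance of an information theory-based composition function, in the spirit of (iii) above, weights each word vector by its Shannon information content -log p(w) before summing, so frequent, uninformative words contribute little to the composed representation. The sketch below uses toy embeddings and unigram probabilities; the paper's parameterizable functions are more general.

```python
# Minimal information-weighted composition and similarity on toy data.
import numpy as np

rng = np.random.default_rng(0)
EMB = {w: rng.normal(size=50) for w in
       ["the", "cat", "sat", "on", "mat"]}          # toy embeddings
FREQ = {"the": 0.05, "on": 0.02, "cat": 1e-4,
        "sat": 5e-4, "mat": 1e-4}                   # toy unigram probabilities

def compose(words):
    # Weight each vector by its information content -log p(w), then sum.
    return sum(-np.log(FREQ[w]) * EMB[w] for w in words)

def similarity(s1, s2):
    u, v = compose(s1), compose(s2)
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

print(similarity(["the", "cat", "sat"], ["cat", "on", "mat"]))
```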
- âŠ