Ask Me Anything: Free-form Visual Question Answering Based on Knowledge from External Sources
We propose a method for visual question answering which combines an internal
representation of the content of an image with information extracted from a
general knowledge base to answer a broad range of image-based questions. This
allows more complex questions to be answered using the predominant neural
network-based approach than has previously been possible. It particularly
allows questions to be asked about the contents of an image, even when the
image itself does not contain the whole answer. The method constructs a textual
representation of the semantic content of an image, and merges it with textual
information sourced from a knowledge base, to develop a deeper understanding of
the scene viewed. Priming a recurrent neural network with this combined
information, and the submitted question, leads to a very flexible visual
question answering approach. We are specifically able to answer questions posed
in natural language that refer to information not contained in the image. We
demonstrate the effectiveness of our model on two publicly available datasets,
Toronto COCO-QA and MS COCO-VQA, and show that it produces the best reported
results in both cases. Comment: Accepted to IEEE Conf. Computer Vision and Pattern Recognition
Learning to rank using privileged information
Many computer vision problems have an asymmetric distribution of information between training and test time. In this work, we study the case where we are given additional information about the training data which, however, will not be available at test time. This situation is called learning using privileged information (LUPI). We introduce two maximum-margin techniques that are able to make use of this additional source of information, and we show that the framework is applicable to several scenarios that have been studied in computer vision before. Experiments with attributes, bounding boxes, image tags and rationales as additional information in object classification show promising results.
Presenting the networked home: a content analysis of promotion material of Ambient Intelligence applications
Ambient Intelligence (AmI) for the home uses information and communication technologies to make users' everyday life more comfortable. AmI is still in its developmental phase and is headed towards the first stages of diffusion. Characteristics of AmI design can be observed, among others, in the promotion material of initial producers. A literature study revealed that AmI was originally envisioned with a central role for the user and the convenience that AmI offers them, and with attention paid to critical policy issues such as privacy and a potential loss of freedom. A content analysis of current promotion material of several high-tech companies revealed that these original ideas are not all reflected in the material. The attributes used most in the promotion material were "connectedness", "control", "easiness" and "personalization". An analysis of the pictures in the promotion material showed that almost half of the pictures contained no humans but appliances. These results only partly correspond to the original vision of AmI, since the emphasis is now on technology. The results represent a serious problem, since both users and critical policy issues are underexposed in the current promotion material.
"A space for myself to go:" Early patterns in Small YA spaces
While young adults (teenagers) are routinely recognized as constituting nearly 25 percent of the nation's public library users, the vast majority of libraries devote more space and design attention to restrooms than to young people. Worse, there are currently no consistent or established metrics, no evaluation criteria, few conceptual standards of best practices, and little consistency in the methods by which we collect empirical evidence about young adult (YA) spaces. This study is the first systematic attempt to both collect and analyze empirical data on libraries' recent trend toward providing greater spatial equity for YA library service.
Learning Multimodal Latent Attributes
The rapid development of social media sharing has created a huge demand for automatic media classification and annotation techniques. Attribute learning has emerged as a promising paradigm for bridging the semantic gap and addressing data sparsity via transferring attribute knowledge in object recognition and relatively simple action classification. In this paper, we address the task of attribute learning for understanding multimedia data with sparse and incomplete labels. In particular we focus on videos of social group activities, which are particularly challenging and topical examples of this task because of their multi-modal content and complex and unstructured nature relative to the density of annotations. To solve this problem, we (1) introduce a concept of semi-latent attribute space, expressing user-defined and latent attributes in a unified framework, and (2) propose a novel scalable probabilistic topic model for learning multi-modal semi-latent attributes, which dramatically reduces requirements for an exhaustive accurate attribute ontology and expensive annotation effort. We show that our framework is able to exploit latent attributes to outperform contemporary approaches for addressing a variety of realistic multimedia sparse data learning tasks including: multi-task learning, learning with label noise, N-shot transfer learning and importantly zero-shot learning.
Designing Fair Ranking Schemes
Items from a database are often ranked based on a combination of multiple
criteria. A user may have the flexibility to accept combinations that weigh
these criteria differently, within limits. On the other hand, this choice of
weights can greatly affect the fairness of the produced ranking. In this paper,
we develop a system that helps users choose criterion weights that lead to
greater fairness.
We consider ranking functions that compute the score of each item as a
weighted sum of (numeric) attribute values, and then sort items on their score.
Each ranking function can be expressed as a vector of weights, or as a point in
a multi-dimensional space. For a broad range of fairness criteria, we show how
to efficiently identify regions in this space that satisfy these criteria.
Using this identification method, our system is able to tell users whether
their proposed ranking function satisfies the desired fairness criteria and, if
it does not, to suggest the smallest modification that does. We develop
user-controllable approximation and indexing techniques that are applied
during preprocessing, and support sub-second response times during the online
phase. Our extensive experiments on real datasets demonstrate that our methods
are able to find solutions that satisfy fairness criteria effectively and
efficiently.
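
The ranking model described in this abstract (a weighted sum of numeric attributes, followed by a sort on score) can be illustrated with a minimal sketch. This is not the authors' system: the function names, the toy data, and the fairness criterion (a minimum share of protected items in the top k) are assumptions chosen only to show how the choice of weights can change whether a ranking meets a criterion.

```python
from typing import List, Sequence


def rank_items(items: Sequence[Sequence[float]], weights: Sequence[float]) -> List[int]:
    """Return item indices sorted by descending weighted-sum score."""
    scores = [sum(w * x for w, x in zip(weights, attrs)) for attrs in items]
    return sorted(range(len(items)), key=lambda i: -scores[i])


def satisfies_top_k_share(ranking: Sequence[int], protected: Sequence[bool],
                          k: int, min_share: float) -> bool:
    """Hypothetical criterion: at least min_share of the top-k items are protected."""
    top = ranking[:k]
    return sum(protected[i] for i in top) / k >= min_share


# Toy data (assumed): each item has two numeric attributes.
items = [(0.9, 0.1), (0.7, 0.2), (0.4, 0.9), (0.1, 0.6)]
protected = [False, False, True, True]

# Two candidate weight vectors, i.e. two points in the weight space.
ranking_a = rank_items(items, (1.0, 0.0))
ranking_b = rank_items(items, (0.5, 0.5))

print(ranking_a, satisfies_top_k_share(ranking_a, protected, k=2, min_share=0.5))
print(ranking_b, satisfies_top_k_share(ranking_b, protected, k=2, min_share=0.5))
```

In this toy setup, weighting only the first attribute fails the assumed criterion while equal weights satisfy it. The abstract's contribution is finding such satisfying regions of the weight space efficiently; the sketch only checks individual weight vectors.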