Seeing What You're Told: Sentence-Guided Activity Recognition In Video
We present a system that demonstrates how the compositional structure of
events, in concert with the compositional structure of language, can interplay
with the underlying focusing mechanisms in video action recognition, thereby
providing a medium, not only for top-down and bottom-up integration, but also
for multi-modal integration between vision and language. We show how the roles
played by participants (nouns), their characteristics (adjectives), the actions
performed (verbs), the manner of such actions (adverbs), and changing spatial
relations between participants (prepositions) in the form of whole sentential
descriptions mediated by a grammar, guide the activity-recognition process.
Further, the utility and expressiveness of our framework are demonstrated by
performing three separate tasks in the domain of multi-activity videos:
sentence-guided focus of attention, generation of sentential descriptions of
video, and query-based video search, simply by leveraging the framework in
different manners.
Comment: To appear in CVPR 201
Digital Image Access & Retrieval
The 33rd Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation, with the bulk of the conference focusing on indexing and retrieval.
published or submitted for publication
CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines
Based on the information provided by European projects and national initiatives related to multimedia search, as well as by domain experts who participated in the CHORUS Think-Tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and a socio-economic perspective.
The technical perspective includes an up-to-date view of content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines.
From a socio-economic perspective, we take inventory of the impact and legal consequences of these technical advances and point out future directions of research.
On Quantifying Qualitative Geospatial Data: A Probabilistic Approach
Living in the era of data deluge, we have witnessed a web content explosion,
largely due to the massive availability of User-Generated Content (UGC). In
this work, we specifically consider the problem of geospatial information
extraction and representation, where one can exploit diverse sources of
information (such as image and audio data, text data, etc.), going beyond
traditional volunteered geographic information. Our ambition is to include
available narrative information in an effort to better explain geospatial
relationships: with spatial reasoning being a basic form of human cognition,
narratives expressing such experiences typically contain qualitative spatial
data, i.e., spatial objects and spatial relationships.
To this end, we formulate a quantitative approach for the representation of
qualitative spatial relations extracted from UGC in the form of texts. The
proposed method quantifies such relations based on multiple text observations.
Such observations provide distance and orientation features which are utilized
by a greedy Expectation Maximization-based (EM) algorithm to infer a
probability distribution over predefined spatial relationships; the latter
represent the quantified relationships under user-defined probabilistic
assumptions. We evaluate the applicability and quality of the proposed approach
using real UGC data originating from an actual travel blog text corpus. To
verify the quality of the result, we generate grid-based maps visualizing the
spatial extent of the various relations.
Language engineering - a champion for European culture
Language is key to culture. It is a direct cultural medium as well as a means of recording and providing access to non-lingual elements of culture. Language is also fundamental to a sense of cultural identity. For this reason, it is vital, in a changing Europe, that we preserve the multi-lingual character of our society in order to move successfully towards closer co-operation at a political, economic, and social level.
Language engineering is the application of knowledge of language to the development of computer software which can recognise, understand, interpret, and generate human language in all its forms.
The paper provides a high level view of the ‘state of the art’ in language engineering and indicates ways in which it will have a profound impact on our culture in the future. It shows how advances in language engineering are an important aid in maintaining cultural diversity in a multi-lingual European society, while enabling the development of social cohesion across cultural and national divides. It addresses issues raised by the prospect of the Multi-lingual Information Society, including education, human communication with technology and information management, as well as aspects of digital cities such as tele-presence in digital libraries, virtual art galleries and electronic museums. The paper raises the issue of language as a factor in cultural domination, showing the contribution that language engineering can make towards countering it.
The paper also raises a number of controversial issues concerning the likely benefits arising from the ways in which language is likely to influence the culture of Europe.
Hybrid Bayesian Eigenobjects: Combining Linear Subspace and Deep Network Methods for 3D Robot Vision
We introduce Hybrid Bayesian Eigenobjects (HBEOs), a novel representation for
3D objects designed to allow a robot to jointly estimate the pose, class, and
full 3D geometry of a novel object observed from a single viewpoint in a single
practical framework. By combining both linear subspace methods and deep
convolutional prediction, HBEOs efficiently learn nonlinear object
representations without directly regressing into high-dimensional space. HBEOs
also remove the onerous and generally impractical necessity of input data
voxelization prior to inference. We experimentally evaluate the suitability of
HBEOs to the challenging task of joint pose, class, and shape inference on
novel objects and show that, compared to preceding work, HBEOs offer
dramatically improved performance in all three tasks along with several orders
of magnitude faster runtime performance.
Comment: To appear in the International Conference on Intelligent Robots (IROS) - Madrid, 201
CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap
After addressing the state of the art during the first year of CHORUS and establishing the existing landscape in
multimedia search engines, we identified and analyzed gaps within the European research effort during our second year.
In this period we focused on three directions: technological issues, user-centred issues and use cases, and
socio-economic and legal aspects. These were assessed by two central studies: first, a concerted vision of the functional
breakdown of a generic multimedia search engine, and second, representative use-case descriptions with the related
discussion of requirements for technological challenges. Both studies were carried out in cooperation and consultation with the
community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our
Think-Tank, presentations at international conferences, and surveys addressed to EU project coordinators as well as
national initiative coordinators. Based on the feedback obtained, we identified two types of gaps: core
technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research
challenges but have an impact on innovation progress. New socio-economic trends are presented, as well as emerging legal
challenges.