Analysing imperfect temporal information in GIS using the Triangular Model
Rough sets and fuzzy sets are two frequently used approaches for modelling and reasoning about imperfect time intervals. In this paper, we focus on imperfect time intervals that can be modelled by rough sets and use an innovative graphic model, the triangular model (TM), to represent this kind of imperfect time interval. This work shows that TM is potentially advantageous in visualizing and querying imperfect time intervals, and that its analytical power can be better exploited when it is implemented in a computer application with graphical user interfaces and interactive functions. Moreover, a probabilistic framework is proposed to handle the uncertainty issues in temporal queries. We use a case study to illustrate how the unique insights gained by TM can assist a geographical information system for exploratory spatio-temporal analysis.
Contextual Media Retrieval Using Natural Language Queries
The widespread integration of cameras in hand-held and head-worn devices as
well as the ability to share content online enables a large and diverse visual
capture of the world that millions of users build up collectively every day. We
envision these images as well as associated meta information, such as GPS
coordinates and timestamps, to form a collective visual memory that can be
queried while automatically taking the ever-changing context of mobile users
into account. As a first step towards this vision, in this work we present
Xplore-M-Ego: a novel media retrieval system that allows users to query a
dynamic database of images and videos using spatio-temporal natural language
queries. We evaluate our system using a new dataset of real user queries as
well as through a usability study. One key finding is that there is a
considerable amount of inter-user variability, for example in the resolution of
spatial relations in natural language utterances. We show that our retrieval
system can cope with this variability using personalisation through an online
learning-based retrieval formulation.
Comment: 8 pages, 9 figures, 1 table
Crowdsourcing in Computer Vision
Computer vision systems require large amounts of manually annotated data to
properly learn challenging visual concepts. Crowdsourcing platforms offer an
inexpensive method to capture human knowledge and understanding, for a vast
number of visual perception tasks. In this survey, we describe the types of
annotations computer vision researchers have collected using crowdsourcing, and
how they have ensured that this data is of high quality while annotation effort
is minimized. We begin by discussing data collection on both classic (e.g.,
object recognition) and recent (e.g., visual story-telling) vision tasks. We
then summarize key design decisions for creating effective data collection
interfaces and workflows, and present strategies for intelligently selecting
the most important data instances to annotate. Finally, we conclude with some
thoughts on the future of crowdsourcing in computer vision.
Comment: A 69-page meta review of the field, Foundations and Trends in
Computer Graphics and Vision, 201
Time indeterminacy and spatio-temporal building transformations: an approach for architectural heritage understanding
Nowadays most digital reconstructions in architecture and archeology describe building heritage as a whole of static and unchangeable entities. However, historical sites can have a rich and complex history, sometimes full of evolutions, sometimes only partially known by means of documentary sources. Various aspects condition the analysis and the interpretation of cultural heritage. First of all, buildings are not inexorably constant in time: creation, destruction, union, division, annexation, partial demolition and change of function are the transformations that buildings can undergo over time. Moreover, other, sometimes contradictory, factors can condition the knowledge about a historical site, such as historical sources and uncertainty. On one hand, historical documentation concerning past states can be heterogeneous, dubious, incomplete and even contradictory. On the other hand, uncertainty is prevalent in cultural heritage in various forms: sometimes it is impossible to define the dating period, sometimes the building's original shape, or even its spatial position. This paper proposes a modeling approach for the geometrical representation of buildings, taking into account the kind of transformations and the notion of temporal indetermination.
Digital Image Access & Retrieval
The 33rd Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation, with the bulk of the conference focusing on indexing and retrieval.
published or submitted for publication
PlaNet - Photo Geolocation with Convolutional Neural Networks
Is it possible to build a system to determine the location where a photo was
taken using just its pixels? In general, the problem seems exceptionally
difficult: it is trivial to construct situations where no location can be
inferred. Yet images often contain informative cues such as landmarks, weather
patterns, vegetation, road markings, and architectural details, which in
combination may allow one to determine an approximate location and occasionally
an exact location. Websites such as GeoGuessr and View from your Window suggest
that humans are relatively good at integrating these cues to geolocate images,
especially en masse. In computer vision, the photo geolocation problem is
usually approached using image retrieval methods. In contrast, we pose the
problem as one of classification by subdividing the surface of the earth into
thousands of multi-scale geographic cells, and train a deep network using
millions of geotagged images. While previous approaches only recognize
landmarks or perform approximate matching using global image descriptors, our
model is able to use and integrate multiple visible cues. We show that the
resulting model, called PlaNet, outperforms previous approaches and even
attains superhuman levels of accuracy in some cases. Moreover, we extend our
model to photo albums by combining it with a long short-term memory (LSTM)
architecture. By learning to exploit temporal coherence to geolocate uncertain
photos, we demonstrate that this model achieves a 50% performance improvement
over the single-image model.