3,145 research outputs found
Grounding semantics in robots for Visual Question Answering
In this thesis I describe an operational implementation of an object detection and description system that is incorporated into an end-to-end Visual Question Answering system, and I evaluate it on two visual question answering datasets for compositional language and elementary visual reasoning.
Multi-agent simulation: new approaches to exploring space-time dynamics in GIS
As part of the long-term quest to develop more disaggregate, temporally dynamic models of spatial behaviour, micro-simulation has evolved to the point where the actions of many individuals can be computed. These multi-agent systems/simulation (MAS) models are a consequence of much better micro data, more powerful and user-friendly computer environments often based on parallel processing, and the generally recognised need in spatial science for modelling temporal process. In this paper, we develop a series of multi-agent models which operate in cellular space. These demonstrate the well-known principle that local action can give rise to global pattern but also how such pattern emerges as the consequence of positive feedback and learned behaviour. We first summarise the way cellular representation is important in adding new process functionality to GIS, and the way this is effected through ideas from cellular automata (CA) modelling. We then outline the key ideas of multi-agent simulation and this sets the scene for three applications to problems involving the use of agents to explore geographic space. We first illustrate how agents can be programmed to search route networks, finding shortest routes in ad hoc as well as structured ways equivalent to the operation of the Bellman-Dijkstra algorithm. We then demonstrate how the agent-based approach can be used to simulate the dynamics of water flow, implying that such models can be used to effectively model the evolution of river systems.
Finally we show how agents can detect the geometric properties of space, generating powerful results that are not possible using conventional geometry, and we illustrate these ideas by computing the visual fields or isovists associated with different viewpoints within the Tate Gallery. Our forays into MAS are all based on developing reactive agent models with minimal interaction and we conclude with suggestions for how these models might incorporate cognition, planning, and stronger positive feedbacks between agents.
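The abstract above says the structured route search is equivalent to the Bellman-Dijkstra algorithm. As a point of reference only, the following is a minimal sketch of the textbook Dijkstra shortest-path search, not the paper's agent-based implementation. The function name `shortest_route`, the adjacency-list graph format, and the example network are assumptions made for illustration.

```python
import heapq


def shortest_route(graph, source, target):
    """Textbook Dijkstra search (illustrative; not the paper's agent model).

    graph: dict mapping node -> list of (neighbour, cost) pairs,
           with non-negative costs (hypothetical format).
    Returns (total_cost, path); (inf, []) if target is unreachable.
    """
    dist = {source: 0.0}   # best known cost to each node
    prev = {}              # predecessor on the best known route
    heap = [(0.0, source)]
    visited = set()

    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue       # stale heap entry
        visited.add(node)
        if node == target:
            break
        for neighbour, cost in graph.get(node, []):
            new_d = d + cost
            if new_d < dist.get(neighbour, float("inf")):
                dist[neighbour] = new_d
                prev[neighbour] = node
                heapq.heappush(heap, (new_d, neighbour))

    if target not in dist:
        return float("inf"), []

    # Walk predecessors back from the target to recover the route.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[target], path


# Hypothetical four-node route network.
network = {
    "A": [("B", 1), ("C", 4)],
    "B": [("C", 2), ("D", 5)],
    "C": [("D", 1)],
    "D": [],
}
cost, route = shortest_route(network, "A", "D")
```

An agent-based search of the kind the abstract describes would replace the central priority queue with many local walkers, but the routes found by the structured variant should agree with this reference computation.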
Motion Invariance in Visual Environments
The puzzle of computer vision might find new challenging solutions when we
realize that most successful methods are working at image level, which is
remarkably more difficult than processing visual streams directly, just as
happens in nature. In this paper, we claim that processing visual streams naturally
leads to the formulation of the motion invariance principle, which enables the
construction of a new theory of visual learning based on convolutional
features. The theory addresses a number of intriguing questions that arise in
natural vision, and offers a well-posed computational scheme for the discovery
of convolutional filters over the retina. They are driven by the Euler-Lagrange
differential equations derived from the principle of least cognitive action,
which parallels the laws of mechanics. Unlike traditional convolutional networks,
which need massive supervision, the proposed theory offers a truly new scenario
in which feature learning takes place by unsupervised processing of video
signals. An experimental report of the theory is presented where we show that
features extracted under motion invariance yield an improvement that can be
assessed by measuring information-based indexes. Comment: arXiv admin note: substantial text overlap with arXiv:1801.0711
Accelerating Cooperative Planning for Automated Vehicles with Learned Heuristics and Monte Carlo Tree Search
Efficient driving in urban traffic scenarios requires foresight. The
observation of other traffic participants and the inference of their possible
next actions depending on one's own action is considered cooperative prediction
and planning. Humans are well equipped with the capability to predict the
actions of multiple interacting traffic participants and plan accordingly,
without the need to directly communicate with others. Prior work has shown that
it is possible to achieve effective cooperative planning without the need for
explicit communication. However, the search space for cooperative plans is so
large that most of the computational budget is spent on exploring the search
space in unpromising regions that are far away from the solution. To accelerate
the planning process, we combined learned heuristics with a cooperative
planning method to guide the search towards regions with promising actions,
yielding better solutions at lower computational costs.
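The abstract above does not state how the learned heuristic enters the tree search. One common way to bias Monte Carlo Tree Search towards promising actions is a PUCT-style selection rule, in which a learned prior probability scales the exploration bonus. The sketch below assumes that rule. The function name `puct_select`, the dictionary layout, the constant `c_puct`, and the action names are all hypothetical and not taken from the paper.

```python
import math


def puct_select(children, c_puct=1.5):
    """Pick the child action with the highest PUCT score.

    children: dict mapping action -> {"prior": float, "visits": int,
              "value_sum": float} (hypothetical layout). "prior" stands
              in for the output of a learned heuristic.
    """
    total_visits = sum(child["visits"] for child in children.values())

    def score(child):
        # Mean value so far; unvisited children start at 0.
        if child["visits"] > 0:
            q = child["value_sum"] / child["visits"]
        else:
            q = 0.0
        # Exploration bonus, scaled by the learned prior.
        u = c_puct * child["prior"] * math.sqrt(total_visits + 1) / (1 + child["visits"])
        return q + u

    return max(children, key=lambda action: score(children[action]))


# Before any simulation, the prior alone decides.
fresh = {
    "keep_lane": {"prior": 0.7, "visits": 0, "value_sum": 0.0},
    "change_left": {"prior": 0.2, "visits": 0, "value_sum": 0.0},
    "brake": {"prior": 0.1, "visits": 0, "value_sum": 0.0},
}
first_choice = puct_select(fresh)

# After keep_lane has been tried often with poor returns, the search moves on.
explored = {
    "keep_lane": {"prior": 0.7, "visits": 10, "value_sum": -5.0},
    "change_left": {"prior": 0.2, "visits": 0, "value_sum": 0.0},
    "brake": {"prior": 0.1, "visits": 0, "value_sum": 0.0},
}
later_choice = puct_select(explored)
```

With a rule of this kind, the prior concentrates early simulations on actions the heuristic considers promising, while the visit-count term still lets the search recover when the heuristic is wrong.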