Fidelity metrics for virtual environment simulations based on spatial memory awareness states
This paper describes a methodology based on human judgments of memory awareness
states for assessing the simulation fidelity of a virtual environment (VE) in relation
to its real scene counterpart. To demonstrate the distinction between task
performance-based approaches and additional human evaluation of cognitive awareness
states, a photorealistic VE was created. The resulting scenes, displayed on a head-mounted
display (HMD) with or without head tracking, or on a desktop monitor, were
then compared to the real-world task situation they represented, investigating spatial
memory after exposure. Participants described how they completed their spatial
recollections by selecting one of four choices of awareness states after retrieval in
an initial test and a retention test a week after exposure to the environment. These
reflected the level of visual mental imagery involved during retrieval, the familiarity
of the recollection and also included guesses, even if informed. Experimental results
revealed variations in the distribution of participants’ awareness states across conditions
while, in certain cases, task performance failed to reveal any. Experimental
conditions that incorporated head tracking were not associated with visually induced
recollections. Generally, simulation of task performance does not necessarily
lead to simulation of the awareness states involved when completing a memory
task. The general premise of this research focuses on how tasks are achieved,
rather than only on what is achieved. The extent to which judgments of human
memory recall, memory awareness states, and presence in the physical and VE are
similar provides a fidelity metric of the simulation in question.
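The fidelity metric described above compares how awareness states distribute across conditions, rather than comparing only task scores. A minimal sketch of such a comparison, using a pure-Python chi-square statistic over the four awareness-state categories (the state labels and all counts below are illustrative, not the paper's data):

```python
# Illustrative sketch: comparing distributions of the four memory awareness
# states between two conditions as a fidelity signal. All counts are
# hypothetical examples, not results from the study.

def chi_square_stat(observed_a, observed_b):
    """Chi-square statistic for two condition-wise count vectors over the
    same awareness-state categories."""
    total_a, total_b = sum(observed_a), sum(observed_b)
    stat = 0.0
    for a, b in zip(observed_a, observed_b):
        col = a + b
        exp_a = col * total_a / (total_a + total_b)  # expected count, condition A
        exp_b = col * total_b / (total_a + total_b)  # expected count, condition B
        stat += (a - exp_a) ** 2 / exp_a + (b - exp_b) ** 2 / exp_b
    return stat

# Hypothetical awareness-state counts per condition, ordered as:
# [visual imagery, familiarity, weak familiarity, informed guess]
real_scene = [14, 9, 4, 3]
ve_hmd = [8, 11, 6, 5]
print(round(chi_square_stat(real_scene, ve_hmd), 3))
```

A large statistic would indicate that the VE condition shifts how recollections are achieved even when raw task performance looks identical, which is the paper's central point.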
Facetwise Mesh Refinement for Multi-View Stereo
Mesh refinement is a fundamental step for accurate Multi-View Stereo. It
modifies the geometry of an initial manifold mesh to minimize the photometric
error induced in a set of camera pairs. This initial mesh is usually the output
of volumetric 3D reconstruction based on min-cut over Delaunay Triangulations.
Such methods produce a significant number of non-manifold vertices; therefore,
they require a vertex split step to explicitly repair them. In this paper, we
extend this method to preemptively fix the non-manifold vertices by reasoning
directly on the Delaunay Triangulation and avoid most vertex splits. The main
contribution of this paper addresses the problem of choosing the camera pairs
adopted by the refinement process. We treat the problem as a mesh labeling
process, where each label corresponds to a camera pair. Unlike
state-of-the-art methods, which use each camera pair to refine all the visible
parts of the mesh, we choose, for each facet, the best pair that enforces both
the overall visibility and coverage. The refinement step is then applied to each
facet using only the selected camera pair. This facetwise refinement helps the
process be applied as evenly as possible. Comment: Accepted as Oral ICPR202
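The per-facet labelling idea can be sketched as a greedy assignment: each facet takes the camera pair with the best visibility score, while a usage penalty keeps the chosen pairs spread evenly. This is only an illustrative stand-in, assuming toy visibility fractions; the paper formulates the choice as a proper mesh-labelling problem, not this greedy rule.

```python
# Hypothetical sketch of facetwise camera-pair labelling: pick, per facet,
# the pair with the best visibility, penalising pairs already used often so
# that coverage stays even. Scores and the penalty weight are illustrative.

from collections import Counter

def label_facets(visibility, coverage_weight=0.1):
    """visibility: per-facet dict mapping camera pair -> fraction of the
    facet seen by that pair. Returns one pair label per facet."""
    usage = Counter()
    labels = []
    for facet_scores in visibility:
        # Trade off raw visibility against how often a pair was already chosen.
        best = max(facet_scores,
                   key=lambda p: facet_scores[p] - coverage_weight * usage[p])
        usage[best] += 1
        labels.append(best)
    return labels

vis = [
    {("c0", "c1"): 0.9, ("c1", "c2"): 0.8},
    {("c0", "c1"): 0.85, ("c1", "c2"): 0.8},
    {("c0", "c1"): 0.82, ("c1", "c2"): 0.8},
]
print(label_facets(vis))
```

Without the usage penalty every facet would pick ("c0", "c1"); the coverage term pushes the second facet onto the other pair, which is the evenness the abstract refers to.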
Deconstructing the stereotypes: building mutual respect
Through a combination of a detailed literature review and a structured online survey, the study seeks to establish the extent of interdisciplinary attitudes among built environment students at Kingston University, whilst building a picture not only of the stereotypes held amongst and between disciplines, but also of the fundamental root of such perceptions.
Visual Place Recognition for Autonomous Robots
Autonomous robotics has been the subject of great interest within the research community over the past few decades. Its applications are widespread, ranging from health-care to manufacturing, goods transportation to home deliveries, site maintenance to construction, planetary exploration to rescue operations, and many others, including but not limited to agriculture, defence, commerce, leisure and extreme environments. At the core of robot autonomy lies the problem of localisation, i.e., knowing where it is; within the robotics community, this problem is termed place recognition. Place recognition using only visual input is termed Visual Place Recognition (VPR) and refers to the ability of an autonomous system to recall a previously visited place using only visual input, under changing viewpoint, illumination and seasonal conditions, and given computational and storage constraints.
This thesis is a collection of 4 inter-linked, mutually relevant but branching-out topics within VPR: 1) What makes a place/image worthy for VPR?, 2) How to define a state-of-the-art in VPR?, 3) Do VPR techniques designed for ground-based platforms extend to aerial platforms? and 4) Can a handcrafted VPR technique outperform deep-learning-based VPR techniques? Each of these questions is addressed in a dedicated, peer-reviewed chapter of this thesis, and the author attempts to answer them to the best of his abilities.
The worthiness of a place essentially refers to the salience and distinctiveness of the content in the image of this place. This salience is modelled as a framework, namely memorable-maps, comprising three conjoint criteria: 1) Human-memorability of an image, 2) Staticity and 3) Information content. Because a large number of VPR techniques have been proposed over the past 10-15 years, and due to the variation of employed VPR datasets and metrics for evaluation, the correct state-of-the-art remains ambiguous. The author levels this playing field by deploying 10 contemporary techniques on a common platform, using the most challenging VPR datasets to provide a holistic performance comparison. This platform is then extended to aerial place recognition datasets to answer the 3rd question above. Finally, the author designs a novel, handcrafted, compute-efficient and training-free VPR technique that outperforms state-of-the-art VPR techniques on 5 different VPR datasets.
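At its core, the VPR task described above reduces to matching a query image descriptor against a database of reference-place descriptors. A minimal sketch using cosine similarity (the place names and toy descriptor vectors below are hypothetical; real techniques would compute descriptors from images):

```python
# Minimal sketch of the core VPR matching loop: find the reference place
# whose descriptor is most similar to the query's. Descriptors here are toy
# 3-vectors, purely for illustration.

import math

def cosine(u, v):
    """Cosine similarity between two descriptor vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def recognise_place(query, reference_map):
    """Return the reference place whose descriptor best matches the query."""
    return max(reference_map, key=lambda place: cosine(query, reference_map[place]))

reference_map = {
    "corridor": [0.9, 0.1, 0.0],
    "lab":      [0.1, 0.8, 0.3],
    "entrance": [0.0, 0.2, 0.9],
}
print(recognise_place([0.2, 0.7, 0.4], reference_map))
```

The hard parts of VPR, which the thesis studies, are making such descriptors invariant to viewpoint, illumination and seasonal change while staying within compute and storage budgets.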
Systems and Methods for Behavior Detection Using 3D Tracking and Machine Learning
Systems and methods for performing behavioral detection using three-dimensional tracking
and machine learning in accordance with various embodiments of the invention are
disclosed. One embodiment of the invention involves a classification application that
directs a microprocessor to: identify at least a primary subject interacting with a secondary
subject within a sequence of frames of image data including depth information; determine
poses of the subjects; extract a set of parameters describing the poses and movement of at
least the primary and secondary subjects; and detect a social behavior performed by at least
the primary subject and involving at least the secondary subject, using a classifier
trained to discriminate between a plurality of social behaviors based upon the set of
parameters describing poses and movement.
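The pipeline in this claim can be sketched end to end: per-frame positions of the two subjects are reduced to a parameter vector, which a classifier maps to a behavior label. In the sketch below, the parameters (mean inter-subject distance, mean closing speed) and the threshold rule standing in for the trained classifier are all hypothetical illustrations, not the patent's actual features or model.

```python
# Hypothetical sketch of the claimed pipeline: tracked subject positions ->
# pose/movement parameters -> behavior label. A simple rule plays the role
# of the trained classifier; thresholds and labels are illustrative.

import math

def pose_parameters(track_a, track_b):
    """track_*: per-frame (x, y) positions for the primary and secondary
    subjects. Returns (mean inter-subject distance, mean closing speed)."""
    dists = [math.dist(p, q) for p, q in zip(track_a, track_b)]
    closing = [d0 - d1 for d0, d1 in zip(dists, dists[1:])]  # positive = approaching
    return sum(dists) / len(dists), sum(closing) / len(closing)

def classify(params, approach_threshold=0.5):
    """Stand-in for a trained classifier over the parameter vector."""
    mean_dist, closing_speed = params
    if closing_speed > approach_threshold:
        return "approach"
    return "avoid" if mean_dist > 5.0 else "neutral"

track_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]  # primary subject moving right
track_b = [(6.0, 0.0), (5.0, 0.0), (4.0, 0.0)]  # secondary subject moving left
print(classify(pose_parameters(track_a, track_b)))
```

In the actual system the parameter set would come from full 3D poses with depth information, and the classifier would be trained to discriminate among many social behaviors rather than applying fixed thresholds.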
V-FUSE: Volumetric Depth Map Fusion with Long-Range Constraints
We introduce a learning-based depth map fusion framework that accepts a set
of depth and confidence maps generated by a Multi-View Stereo (MVS) algorithm
as input and improves them. This is accomplished by integrating volumetric
visibility constraints that encode long-range surface relationships across
different views into an end-to-end trainable architecture. We also introduce a
depth search window estimation sub-network trained jointly with the larger
fusion sub-network to reduce the depth hypothesis search space along each ray.
Our method learns to model depth consensus and violations of visibility
constraints directly from the data; effectively removing the necessity of
fine-tuning fusion parameters. Extensive experiments on MVS datasets show
substantial improvements in the accuracy of the output fused depth and
confidence maps. Comment: ICCV 202
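The depth-consensus idea that V-FUSE learns end-to-end can be illustrated with its classical counterpart: per-pixel hypotheses from several views that agree within a tolerance are confidence-weighted and averaged, and outliers violating the consensus are dropped. The tolerance and the toy depth/confidence values below are illustrative assumptions, not the paper's learned behavior.

```python
# Illustrative classical analogue of learned depth fusion (not the paper's
# network): for one pixel, keep the largest set of mutually consistent depth
# hypotheses across views and fuse them with confidence weights.

def fuse_depths(hypotheses, tol=0.05):
    """hypotheses: list of (depth, confidence) for one pixel across views.
    Returns (fused depth, number of supporting views)."""
    best = None
    for d_ref, _ in hypotheses:
        # Views whose depth agrees with this reference within a relative tolerance.
        support = [(d, c) for d, c in hypotheses if abs(d - d_ref) <= tol * d_ref]
        if best is None or len(support) > len(best):
            best = support
    total_c = sum(c for _, c in best)
    fused = sum(d * c for d, c in best) / total_c  # confidence-weighted mean
    return fused, len(best)

# Three views agree near depth 2.0; the fourth violates the consensus.
print(fuse_depths([(2.00, 0.9), (2.04, 0.8), (1.98, 0.7), (3.2, 0.6)]))
```

V-FUSE's contribution is to replace hand-tuned parameters such as `tol` with constraints learned from data, and to narrow the per-ray depth search window with a jointly trained sub-network.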