
    An investigation of visual cues used to create and support frames of reference and visual search tasks in desktop virtual environments

    Visual depth cues are combined to produce the essential depth and dimensionality of Desktop Virtual Environments (DVEs). This study discusses DVEs in terms of the visual depth cues that create and support the perception of frames of reference and the accomplishment of visual search tasks. This paper presents the results of an investigation that identifies the effects of experimental stimulus positions and of the visual depth cues luminance, texture, relative height and motion parallax on precise depth judgements made within a DVE. Results indicate that stimulus positions significantly affect precise depth judgements, that texture is significantly effective only under certain conditions, and that motion parallax, in line with previous results, remains inconclusive for determining depth judgement accuracy in egocentrically viewed DVEs. Results also show that exocentric views, incorporating relative height and motion parallax cues, are effective for precise depth judgements made in DVEs. These results help us understand how certain visual depth cues support the perception of frames of reference and precise depth judgements, suggesting that the visual depth cues employed to create frames of reference in DVEs may influence how effectively precise depth judgements are undertaken.

    Olfactory cue use by three-spined sticklebacks foraging in turbid water: prey detection or prey location?

    Foraging, when senses are limited to olfaction, is composed of two distinct stages: the detection of prey and the location of prey. While specialist olfactory foragers are able to locate prey using olfactory cues alone, this may not be the case for foragers that rely primarily on vision. Visual predators in aquatic systems may be faced with poor visual conditions such as natural or human-induced turbidity. The ability of visual predators to compensate for poor visual conditions by using other senses is not well understood, although it is widely accepted that primarily visual fish, such as three-spined sticklebacks, Gasterosteus aculeatus, can detect and use olfactory cues for a range of purposes. We investigated the ability of sticklebacks to detect the presence of prey and to locate prey precisely, using olfaction, in clear and turbid water (two turbidity levels). When provided with only a visual cue, or only an olfactory cue, sticklebacks showed a similar ability to detect prey, but a combination of these cues improved their performance. In open-arena foraging trials, a dispersed olfactory cue added to the water (masking cues from the prey) improved foraging success, contrary to our expectations, whereas activity levels and swimming speed did not change as a result of olfactory cue availability. We suggest that olfaction allows visual predators to detect rather than locate prey, and that olfactory cues have an appetitive effect, enhancing motivation to forage.

    Direct and generative retrieval of autobiographical memories: the roles of visual imagery and executive processes

    Two experiments used a dual-task methodology to investigate the role of visual imagery and executive resources in the retrieval of specific autobiographical memories. In Experiment 1, dynamic visual noise led to a reduction in the number of specific memories retrieved in response to both high and low imageability cues, but did not affect retrieval times. In Experiment 2, irrelevant pictures reduced the number of specific memories, but only in response to low imageability cues. Irrelevant pictures also increased response times to both high and low imageability cues. The findings are in line with previous work suggesting that disrupting executive resources may impair generative, but not direct, retrieval of autobiographical memories. In contrast, visual distractor tasks appear to impair access to specific autobiographical memories via both the direct and generative retrieval routes, thereby highlighting the potential role of visual imagery in both pathways.

    DeepVoting: A Robust and Explainable Deep Network for Semantic Part Detection under Partial Occlusion

    In this paper, we study the task of detecting semantic parts of an object, e.g., a wheel of a car, under partial occlusion. We propose that all models should be trained without seeing occlusions while being able to transfer the learned knowledge to deal with occlusions. This setting alleviates the difficulty of collecting an exponentially large dataset to cover occlusion patterns and is more essential. In this scenario, proposal-based deep networks, like the RCNN series, often produce unsatisfactory results, because both the proposal extraction and classification stages may be confused by the irrelevant occluders. To address this, [25] proposed a voting mechanism that combines multiple local visual cues to detect semantic parts. The semantic parts can still be detected even though some visual cues are missing due to occlusion. However, this method is manually designed and thus hard to optimize in an end-to-end manner. In this paper, we present DeepVoting, which incorporates the robustness shown by [25] into a deep network, so that the whole pipeline can be jointly optimized. Specifically, it adds two layers after the intermediate features of a deep network, e.g., the pool-4 layer of VGGNet. The first layer extracts the evidence of local visual cues, and the second layer performs a voting mechanism by utilizing the spatial relationship between visual cues and semantic parts. We also propose an improved version, DeepVoting+, which learns visual cues from context outside objects. In experiments, DeepVoting achieves significantly better performance than several baseline methods, including Faster-RCNN, for semantic part detection under occlusion. In addition, DeepVoting enjoys explainability, as the detection results can be diagnosed by looking up the voting cues.
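    As a concrete illustration of the two added layers described above, here is a minimal, hedged sketch (not the authors' released code), assuming a PyTorch/torchvision setup; the backbone truncation point follows the pool-4 description in the abstract, while the number of visual cues, number of semantic parts, and voting kernel size are illustrative placeholders.

```python
# Minimal sketch of the DeepVoting idea, not the authors' implementation.
# Assumptions: PyTorch/torchvision; 64 visual cues, 10 semantic parts, and a
# 15x15 voting kernel are placeholder choices, not values from the paper.
import torch
import torch.nn as nn
from torchvision.models import vgg16

class DeepVotingSketch(nn.Module):
    def __init__(self, num_cues: int = 64, num_parts: int = 10):
        super().__init__()
        # VGG16 backbone truncated at the pool-4 layer (feature index 23).
        self.backbone = vgg16(weights=None).features[:24]
        # Layer 1: extract evidence of local visual cues from pool-4 features.
        self.cue_layer = nn.Conv2d(512, num_cues, kernel_size=1)
        # Layer 2: "vote" for semantic parts using a large spatial kernel that
        # encodes the spatial relationship between cues and parts.
        self.vote_layer = nn.Conv2d(num_cues, num_parts, kernel_size=15, padding=7)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.backbone(x)                  # pool-4 feature map
        cues = torch.relu(self.cue_layer(feats))  # local cue evidence
        return self.vote_layer(cues)              # one voting heatmap per part

# A 224x224 image yields a coarse 14x14 voting map for each semantic part.
heatmaps = DeepVotingSketch()(torch.randn(1, 3, 224, 224))
print(heatmaps.shape)  # torch.Size([1, 10, 14, 14])
```

    Because each detection in this formulation is a sum of spatially localized votes, inspecting which cue locations contributed most to a part's heatmap is what gives the approach the explainability mentioned in the abstract.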

    Solving Visual Madlibs with Multiple Cues

    This paper focuses on answering fill-in-the-blank style multiple choice questions from the Visual Madlibs dataset. Previous approaches to Visual Question Answering (VQA) have mainly used generic image features from networks trained on the ImageNet dataset, despite the wide scope of questions. In contrast, our approach employs features derived from networks trained for the specialized tasks of scene classification, person activity prediction, and person and object attribute prediction. We also present a method for selecting sub-regions of an image that are relevant for evaluating the appropriateness of a putative answer. Visual features are computed both from the whole image and from local regions, while sentences are mapped to a common space using a simple normalized canonical correlation analysis (CCA) model. Our results show a significant improvement over the previous state of the art, and indicate that answering different question types benefits from examining a variety of image cues and carefully choosing informative image sub-regions.
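    The matching step described above can be illustrated with a small, hedged sketch: paired image and sentence features are projected into a common space with CCA, and each candidate answer is scored by cosine similarity to the image. It assumes scikit-learn's standard CCA as a stand-in for the paper's normalized CCA, and all feature dimensions and data below are random placeholders rather than the paper's actual features.

```python
# Toy sketch of CCA-based answer scoring, not the paper's pipeline.
# Assumptions: scikit-learn's CCA stands in for normalized CCA; features are random.
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
img_feats = rng.normal(size=(200, 128))   # image/region features (placeholder)
txt_feats = rng.normal(size=(200, 100))   # sentence embeddings (placeholder)

cca = CCA(n_components=16)
cca.fit(img_feats, txt_feats)             # learn the shared space from paired data

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def score_candidates(image_vec, candidate_vecs):
    """Project one image and its candidate answers, then score each candidate."""
    img_rep = np.repeat(image_vec[None, :], len(candidate_vecs), axis=0)
    img_c, cand_c = cca.transform(img_rep, candidate_vecs)
    return [cosine(img_c[0], c) for c in cand_c]

# Example: choose the best of four multiple-choice answers for one image.
scores = score_candidates(rng.normal(size=128), rng.normal(size=(4, 100)))
print(int(np.argmax(scores)), scores)
```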

    Audio-visual speech perception: a developmental ERP investigation

    Being able to see a talking face confers a considerable advantage for speech perception in adulthood. However, behavioural data currently suggest that children fail to make full use of these available visual speech cues until age 8 or 9. This is particularly surprising given the potential utility of multiple informational cues during language learning. We therefore explored this at the neural level. The event-related potential (ERP) technique has been used to assess the mechanisms of audio-visual speech perception in adults, with visual cues reliably modulating auditory ERP responses to speech. Previous work has shown congruence-dependent shortening of auditory N1/P2 latency and congruence-independent attenuation of amplitude in the presence of auditory and visual speech signals, compared to auditory signals alone. The aim of this study was to chart the development of these well-established modulatory effects over mid-to-late childhood. Experiment 1 employed an adult sample to validate a child-friendly stimulus set and paradigm by replicating previously observed effects of N1/P2 amplitude and latency modulation by visual speech cues; it also revealed greater attenuation of component amplitude for incongruent audio-visual stimuli, pointing to a new interpretation of the amplitude modulation effect. Experiment 2 used the same paradigm to map cross-sectional developmental change in these ERP responses between 6 and 11 years of age. The effect of amplitude modulation by visual cues emerged over development, while the effect of latency modulation was stable over the child sample. These data suggest that auditory ERP modulation by visual speech represents separable underlying cognitive processes, some of which show earlier maturation than others over the course of development.

    Know2Look: Commonsense Knowledge for Visual Search

    With the rise in popularity of social media, images accompanied by contextual text form a huge section of the web. However, search and retrieval of documents are still largely dependent on textual cues alone. Although visual cues have started to gain focus, imperfections in object/scene detection do not lead to significantly improved results. We hypothesize that the use of background commonsense knowledge on query terms can significantly aid the retrieval of documents with associated images. To this end, we deploy three different modalities (text, visual cues, and commonsense knowledge pertaining to the query) as a recipe for efficient search and retrieval.

    Timing and correction of stepping movements with a virtual reality avatar

    Research into the ability to coordinate one’s movements with external cues has focussed on the use of simple rhythmic auditory and visual stimuli, or on interpersonal coordination with another person. Coordinating movements with a virtual avatar has not been explored in the context of responses to temporal cues. To determine whether cueing of movements using a virtual avatar is effective, people’s ability to coordinate accurately with the stimuli needs to be investigated. Here we focus on temporal cues, as timing studies show that visual cues can be difficult to follow in a timing context. Real stepping movements were mapped onto an avatar using motion capture data. Healthy participants were then motion captured whilst stepping in time with the avatar’s movements, as viewed through a virtual reality headset. The timing of one of the avatar’s step cycles was accelerated or decelerated by 15% to create a temporal perturbation, which participants would need to correct for in order to remain in time. Step onset times of participants relative to the corresponding step onsets of the avatar were used to measure the timing errors (asynchronies) between them. Participants completed either a visual-only condition or an auditory-visual condition with footstep sounds included, at two stepping tempos (Fast: 400 ms interval; Slow: 800 ms interval). Participants’ asynchronies exhibited slow drift in the visual-only condition, but became stable in the auditory-visual condition. Moreover, we observed a clear corrective response to the phase perturbation at both the fast and slow tempos in the auditory-visual condition. We conclude that an avatar’s movements can be used to influence a person’s own motion, but should include relevant auditory cues congruent with the movement to ensure a suitable level of entrainment is achieved. This approach has applications in physiotherapy, where virtual avatars present an opportunity to provide guidance that assists patients in adhering to prescribed exercises.
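    To make the asynchrony measure concrete, the following is a small, hedged sketch (not the study's analysis code): avatar step onsets are generated at the slow 800 ms tempo with one interval lengthened by 15%, and asynchronies are computed as participant step onsets minus the corresponding avatar step onsets. All participant data here are made up for illustration.

```python
# Hedged sketch of the asynchrony measure: participant step onset minus the
# corresponding avatar step onset, around a 15% perturbation of one interval.
# All numbers are illustrative, not data from the study.
import numpy as np

interval = 0.8                              # slow tempo: 800 ms between avatar steps
n_steps = 12
intervals = np.full(n_steps, interval)
intervals[6] *= 1.15                        # decelerate one step cycle by 15%
avatar_onsets = np.concatenate(([0.0], np.cumsum(intervals)))

# Made-up participant onsets: small jitter throughout, then a transient timing
# error at the perturbed step that is corrected over the following steps.
rng = np.random.default_rng(1)
participant_onsets = avatar_onsets + rng.normal(0.0, 0.01, size=avatar_onsets.size)
participant_onsets[7:10] -= np.array([0.10, 0.05, 0.02])

asynchronies = participant_onsets - avatar_onsets   # negative = ahead of the avatar
for step, asyn in enumerate(asynchronies):
    print(f"step {step:2d}: asynchrony = {asyn * 1000:+6.1f} ms")
```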

    Words, Numbers and Visual Heuristics in Web Surveys: Is there a Hierarchy of Importance?

    In interpreting questions, respondents extract meaning from how the information in a questionnaire is shaped, spaced, and shaded. This makes it important to pay close attention to the arrangement of visual information on a questionnaire. Respondents follow simple heuristics in interpreting the visual features of questions. We carried out five experiments to investigate how visual heuristics affected answers to survey questions, varying verbal, numerical, and other visual cues such as color. In some instances the use of words helps overcome visual layout effects. In at least one instance, a fundamental difference in visual layout (violating the 'left and top means first' heuristic) influenced answers on top of word labels. This suggests that both visual and verbal languages are important, yet sometimes one can override the other. To reduce the effect of visual cues, it is better to use fully labeled scales in survey questions.
    Keywords: questionnaire design; layout; visual language; response effects; visual cues