
    "Mental Rotation" by Optimizing Transforming Distance

    The human visual system is able to recognize objects despite transformations that can drastically alter their appearance. To this end, much effort has been devoted to the invariance properties of recognition systems. Invariance can be engineered (e.g. convolutional nets), or learned from data explicitly (e.g. temporal coherence) or implicitly (e.g. by data augmentation). One idea that has not, to date, been explored is the integration of latent variables that permit a search over a learned space of transformations. Motivated by evidence that people mentally simulate spatial transformations while comparing examples, so-called "mental rotation", we propose a transforming distance: a trained relational model actively transforms pairs of examples so that they are maximally similar in some feature space yet respect the learned transformational constraints. We apply our method to nearest-neighbour problems on the Toronto Face Database and NORB.
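
A minimal numpy sketch of the idea, with the paper's learned relational model replaced by a fixed, hand-picked transformation family (the four 90-degree in-plane rotations, an assumption made purely for illustration): the distance between two examples is taken under the transformation that makes them maximally similar.

```python
import numpy as np

def transforming_distance(a, b):
    # Actively transform b over a small candidate family (here: the four
    # 90-degree rotations) and keep the transformation that makes the
    # pair maximally similar, i.e. minimizes the Euclidean distance.
    return min(np.linalg.norm(a - np.rot90(b, k)) for k in range(4))

def nearest_neighbour(query, gallery):
    # 1-NN retrieval under the transforming distance.
    return min(range(len(gallery)),
               key=lambda i: transforming_distance(query, gallery[i]))
```

In the paper the transformation space is learned, and the search over it is performed by the relational model rather than by exhaustive enumeration.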

    Would Motor-Imagery based BCI user training benefit from more women experimenters?

    Mental Imagery based Brain-Computer Interfaces (MI-BCI) are a means of controlling digital technologies by performing MI tasks alone. Throughout MI-BCI use, human supervision (e.g., by an experimenter or caregiver) plays a central role: beyond providing emotional and social feedback, these people present BCIs to users and ensure users' smooth progress with BCI use. Yet very little is known about the influence experimenters might have on the results obtained. Such an influence is to be expected, as social and emotional feedback have been shown to influence MI-BCI performance. Furthermore, literature from different fields has shown an experimenter effect, and specifically an effect of experimenter gender, on experimental outcomes. We assessed the impact of the interaction between experimenter and participant gender on MI-BCI performance and progress throughout a session. Our results revealed an interaction between participant gender, experimenter gender, and progress over runs, suggesting that women experimenters may positively influence participants' progress compared to men experimenters.

    Compensating for Large In-Plane Rotations in Natural Images

    Rotation invariance has been studied in the computer vision community primarily in the context of small in-plane rotations. This is usually achieved by building invariant image features. However, the problem of achieving invariance for large rotation angles remains largely unexplored. In this work, we tackle this problem by directly compensating for large rotations, as opposed to building invariant features. This is inspired by the neuroscientific concept of mental rotation, which humans use to compare pairs of rotated objects. Our contributions here are three-fold. First, we train a Convolutional Neural Network (CNN) to detect image rotations. We find that generic CNN architectures are not suitable for this purpose. To this end, we introduce a convolutional template layer, which learns representations for canonical 'unrotated' images. Second, we use Bayesian Optimization to quickly sift through a large number of candidate images to find the canonical 'unrotated' image. Third, we use this method to achieve robustness to large angles in an image retrieval scenario. Our method is task-agnostic, and can be used as a pre-processing step in any computer vision system. Comment: Accepted at the Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP) 201
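
The compensation idea can be illustrated with a toy stand-in (this is not the paper's method: the learned convolutional template and the Bayesian Optimization search are replaced here by a fixed template and an exhaustive 90-degree grid): an image is de-rotated by picking the candidate rotation that best matches the canonical template.

```python
import numpy as np

def compensate_rotation(image, template):
    # Score each candidate rotation by its match to the canonical
    # 'unrotated' template, then return the de-rotated image so that
    # downstream systems never see a large rotation.
    best_k = min(range(4),
                 key=lambda k: np.linalg.norm(np.rot90(image, k) - template))
    return np.rot90(image, best_k)
```

Because the compensation happens before any task-specific processing, the step is task-agnostic, matching the pre-processing role described in the abstract.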

    ShapeCodes: Self-Supervised Feature Learning by Lifting Views to Viewgrids

    We introduce an unsupervised feature learning approach that embeds 3D shape information into a single-view image representation. The main idea is a self-supervised training objective that, given only a single 2D image, requires all unseen views of the object to be predictable from learned features. We implement this idea as an encoder-decoder convolutional neural network. The network maps an input image of an unknown category and unknown viewpoint to a latent space, from which a deconvolutional decoder can best "lift" the image to its complete viewgrid showing the object from all viewing angles. Our class-agnostic training procedure encourages the representation to capture fundamental shape primitives and semantic regularities in a data-driven manner, without manual semantic labels. Our results on two widely-used shape datasets show 1) our approach successfully learns to perform "mental rotation" even for objects unseen during training, and 2) the learned latent space is a powerful representation for object recognition, outperforming several existing unsupervised feature learning methods. Comment: To appear at ECCV 201
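
The lifting objective can be sketched with a purely linear stand-in for the paper's encoder-decoder CNN (all sizes and linear maps below are illustrative assumptions): a single view is encoded into a latent code from which every viewpoint in the viewgrid must be decodable.

```python
import numpy as np

rng = np.random.default_rng(0)
D, Z, V = 64, 16, 8     # pixels per (flattened) view, latent dim, views in grid

W_enc = rng.normal(scale=0.1, size=(Z, D))     # encoder: view -> latent code
W_dec = rng.normal(scale=0.1, size=(V, D, Z))  # decoder: one head per viewpoint

def lift_to_viewgrid(view):
    z = W_enc @ view                           # latent code for one 2D view
    return np.stack([W_dec[v] @ z for v in range(V)])  # predicted V-view grid

def viewgrid_loss(view, target_grid):
    # Self-supervised objective: all unseen views of the object must be
    # predictable from the features of the single input view.
    return np.mean((lift_to_viewgrid(view) - target_grid) ** 2)
```

Training minimizes `viewgrid_loss` over many (view, viewgrid) pairs; the latent code is then reused as a shape-aware feature for recognition.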

    Engineers’ abilities influence spatial perspective changing

    In this paper we studied the effect of engineering expertise on directional judgments. We asked two groups of people, engineers and non-engineers, to observe and memorize five maps, each including a four-point path, for 30 seconds. The path was then removed and the participants had to provide two directional judgments: aligned (the imagined perspective on the task was the same as the one just learned) and counter-aligned (the imagined perspective on the task was rotated by 180°). Our results showed that engineers are equally able to perform aligned and counter-aligned directional judgments; the alignment effect due to the distance from the learning perspective was, in fact, shown only by non-engineers. Results are discussed in terms of both engineering learning expertise and a specific predisposition.

    The Immersive Mental Rotations Test: Evaluating Spatial Ability in Virtual Reality

    Advancements in extended reality (XR) have inspired new uses and users of advanced visualization interfaces, transforming how geospatial data are visualized and consumed by enabling interactive 3D geospatial data experiences. Conventional metrics (e.g., the mental rotations test (MRT)) are often used to assess and predict the appropriateness of these visualizations without accounting for the effect the interface has on those metrics. We developed the Immersive MRT (IMRT) to evaluate the impact that virtual reality (VR) based visualizations and 3D virtual environments have on mental rotation performance. Consistent with previous work, the results of our pilot study suggest that mental rotation tasks are performed more accurately and rapidly with stereo 3D stimuli than with 2D images of those stimuli.

    Gated networks: an inventory

    Gated networks are networks that contain gating connections, in which the outputs of at least two neurons are multiplied. Initially, gated networks were used to learn relationships between two input sources, such as pixels from two images. More recently, they have been applied to learning activity recognition or multi-modal representations. The aims of this paper are threefold: 1) to explain the basic computations in gated networks to the non-expert, while adopting a standpoint that insists on their symmetric nature; 2) to serve as a quick reference guide to the recent literature, by providing an inventory of applications of these networks, as well as recent extensions to the basic architecture; and 3) to suggest future research directions and applications. Comment: Unpublished manuscript, 17 pages
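
The basic gated computation can be written in a few lines of numpy (a generic sketch of a factored gating layer; the projection matrices U, V, W are illustrative assumptions): the two inputs are projected onto shared factors, multiplied element-wise, and projected to the output, so the layer responds to the relationship between its inputs rather than to either input alone.

```python
import numpy as np

def gated_layer(x, y, U, V, W):
    # Factored gating connection: element-wise product of the two factor
    # projections, followed by an output projection. Swapping the roles
    # of x and y (with U and V exchanged accordingly) leaves the output
    # unchanged, reflecting the symmetric nature of the computation.
    return W @ ((U @ x) * (V @ y))
```

The element-wise product is the defining multiplication of a gating connection; stacking such layers, or replacing the product with richer factorizations, gives the extensions inventoried in the paper.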