
    Coupled Ensembles of Neural Networks

    We investigate in this paper the architecture of deep convolutional networks. Building on existing state-of-the-art models, we propose a reconfiguration of the model parameters into several parallel branches at the global network level, with each branch being a standalone CNN. We show that this arrangement is an efficient way to significantly reduce the number of parameters without losing performance, or to significantly improve performance with the same number of parameters. The use of branches brings an additional form of regularization. In addition to the split into parallel branches, we propose a tighter coupling of these branches by placing the "fuse (averaging) layer" before the Log-Likelihood and SoftMax layers during training. This gives another significant performance improvement, the tighter coupling favouring the learning of better representations, even at the level of the individual branches. We refer to this branched architecture as "coupled ensembles". The approach is very generic and can be applied with almost any DCNN architecture. With coupled ensembles of DenseNet-BC and a parameter budget of 25M, we obtain error rates of 2.92%, 15.68% and 1.50% respectively on the CIFAR-10, CIFAR-100 and SVHN tasks. For the same budget, DenseNet-BC has error rates of 3.46%, 17.18% and 1.80% respectively. With ensembles of coupled ensembles of DenseNet-BC networks, with 50M total parameters, we obtain error rates of 2.72%, 15.13% and 1.42% respectively on these tasks.
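    A minimal PyTorch sketch of the coupled-ensemble idea described above (not the authors' code): several parallel branches whose per-branch log-probabilities are averaged by a "fuse" layer before the negative log-likelihood loss. The branch constructor, branch count and class count below are illustrative placeholders.

```python
# Minimal sketch (not the authors' implementation) of "coupled ensembles":
# parallel branches fused by averaging log-probabilities before the NLL loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CoupledEnsemble(nn.Module):
    def __init__(self, make_branch, num_branches=4, num_classes=10):
        super().__init__()
        # Each branch is a standalone CNN; make_branch() is a user-supplied constructor.
        self.branches = nn.ModuleList(make_branch(num_classes) for _ in range(num_branches))

    def forward(self, x):
        # Per-branch log-probabilities, shape (B, num_classes) each.
        log_probs = [F.log_softmax(branch(x), dim=1) for branch in self.branches]
        # "Fuse (averaging) layer": average branch log-probabilities before the
        # log-likelihood loss, coupling the branches during training.
        return torch.stack(log_probs, dim=0).mean(dim=0)

if __name__ == "__main__":
    # Tiny stand-in branch, only to make the sketch runnable.
    make_cnn = lambda n: nn.Sequential(nn.Conv2d(3, 8, 3, padding=1),
                                       nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, n))
    model = CoupledEnsemble(make_cnn, num_branches=4, num_classes=10)
    x = torch.randn(2, 3, 32, 32)
    out = model(x)                      # fused log-probabilities, shape (2, 10)
    loss = F.nll_loss(out, torch.tensor([1, 7]))
    print(out.shape, float(loss))
```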

    The Quest for Stereoscopic Movement: Was the First Film ever in 3-D?

    Denis Pellerin, London Stereoscopic Company

    Different types of sounds influence gaze differently in videos

    This paper presents an analysis of the effect of different types of sounds on visual gaze when a person is looking freely at videos, which could help predict eye position. To test the effect of sound, an audio-visual experiment was designed with two groups of participants, in audio-visual (AV) and visual-only (V) conditions. Using statistical tools, we analyzed the difference between the eye positions of participants in the AV and V conditions. We observed that the effect of sound depends on the type of sound, and that the classes containing human voice (i.e. speech, singer, human noise and singers) have the greatest effect. Furthermore, the results on the distance between the sound source and the eye positions of the AV group suggest that only particular types of sound attract eye position towards the sound source. Finally, an analysis of fixation durations in the AV and V conditions showed that participants in the AV condition moved their eyes more frequently than those in the V condition.
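    As an illustration of the kind of comparison described above (not the paper's actual analysis), the sketch below measures how close eye positions land to a labelled sound-source location in hypothetical AV and V groups and tests the difference. All data and values are synthetic stand-ins.

```python
# Illustrative sketch: distance between eye positions and a sound-source location,
# compared between audio-visual (AV) and visual-only (V) viewing groups.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
eye_av = rng.normal(loc=[320, 240], scale=40, size=(50, 2))   # hypothetical AV-group positions (px)
eye_v = rng.normal(loc=[300, 260], scale=60, size=(50, 2))    # hypothetical V-group positions (px)
source = np.array([330, 235])                                 # hypothetical sound-source location (px)

d_av = np.linalg.norm(eye_av - source, axis=1)                # per-sample distances to the source
d_v = np.linalg.norm(eye_v - source, axis=1)

# A two-sample test of the distances indicates whether sound pulls gaze toward the source.
t, p = stats.ttest_ind(d_av, d_v, equal_var=False)
print(f"mean distance AV={d_av.mean():.1f}px, V={d_v.mean():.1f}px, p={p:.3f}")
```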

    Influence of number, location and size of faces on gaze in video

    Many studies have reported a preference for faces and the influence of faces on gaze, most of them in static images and a few in videos. In this paper, we study the influence of faces in complex free-viewing videos, with respect to the number, location and size of the faces. This knowledge could be used to enrich a face pathway in a visual saliency model. We used eye fixation data from an eye movement experiment, hand-labeled all the faces in the videos watched, and compared the labeled face regions against the eye fixations. We observed that fixations fall close to, or inside, the face regions. We found that 50% of the fixations landed directly on face regions that occupy less than 10% of the entire visual scene. Moreover, fixation durations on videos with faces are longer than on videos without faces, and longer than fixation durations on static images with faces. Finally, we analyzed the three influencing factors (Eccentricity, Area, Closeness) with linear regression models. For one face, the combined E+A model is slightly better than the E model and better than the A model. For two faces, the three variables (E, A, C) are tightly coupled and the E+A+C model had the highest score.
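    The sketch below illustrates, on synthetic data, the kind of linear-model comparison mentioned above (E vs E+A vs E+A+C). It is not the authors' analysis code; all variables, coefficients and the target score are invented for illustration.

```python
# Illustrative sketch: compare linear models that predict a per-face fixation score
# from Eccentricity (E), Area (A) and Closeness (C).
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n = 200
E = rng.uniform(0, 15, n)        # eccentricity of the face from screen centre (synthetic)
A = rng.uniform(0.5, 10, n)      # face area as a fraction of the frame (synthetic)
C = rng.uniform(0, 1, n)         # closeness between faces (synthetic)
score = 1.0 - 0.05 * E + 0.03 * A + 0.2 * C + rng.normal(0, 0.1, n)  # synthetic target

def r2(predictors):
    """R^2 of a linear regression on the given predictor set."""
    X = np.column_stack(predictors)
    return LinearRegression().fit(X, score).score(X, score)

# Compare single-variable and combined models, as in the E, E+A and E+A+C comparison.
print("E    :", round(r2([E]), 3))
print("E+A  :", round(r2([E, A]), 3))
print("E+A+C:", round(r2([E, A, C]), 3))
```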

    Les lucarnes de l'infini

    " Des milliers d'yeux avides se penchaient sur les trous du stéréoscope comme sur les lucarnes de l'infini. " Charles Baudelaire, Salon de 1859. Sujette aux caprices de la mode, gênée depuis l'origine dans son essor par l'intermédiaire optique nécessaire à sa production, défendue malgré cela par une poignée d'irréductibles éparpillés de par le monde, la stéréoscopie semble connaître actuellement un regain d'intérêt tant dans les médias que chez les artistes contemporains et les gens d..

    Does color influence eye movements while exploring videos?

    Although visual attention studies consider color as one of the most important features in guiding visual attention, few studies have investigated how color influences eye movements while viewing natural scenes without any particular task. To better understand the visual features that drive attention, the aim of this paper was to quantify the influence of color on eye movements when viewing dynamic natural scenes. The influence of color was investigated by comparing the eye positions of several observers, recorded with an eye tracker while they viewed video stimuli in two conditions: color and grayscale. The comparison used the dispersion between the eye positions of observers, the number of attractive regions measured with a clustering method applied to the eye positions, and a comparison of eye positions to the predictions of a saliency model. The mean amplitude of saccades and the mean duration of fixations were compared as well. Globally, only a slight influence of color on eye movements was measured; just the number of attractive regions was slightly higher for color stimuli than for grayscale stimuli. Moreover, a luminance-based saliency model predicts eye positions for color stimuli as efficiently as for grayscale stimuli.
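    One of the metrics mentioned above, the dispersion between observers' eye positions, can be illustrated with a short sketch (not the paper's code). The eye-position data below are synthetic stand-ins for one video frame in each viewing condition.

```python
# Illustrative sketch: inter-observer dispersion of eye positions, one of the metrics
# used to compare the color and grayscale viewing conditions.
import numpy as np
from scipy.spatial.distance import pdist

def dispersion(eye_xy):
    """Mean pairwise Euclidean distance between observers' eye positions (N, 2) on one frame."""
    return pdist(eye_xy).mean()

rng = np.random.default_rng(2)
frame_color = rng.normal([400, 300], 50, size=(15, 2))   # hypothetical color-condition positions
frame_gray = rng.normal([400, 300], 55, size=(15, 2))    # hypothetical grayscale-condition positions

print("dispersion (color)    :", round(dispersion(frame_color), 1))
print("dispersion (grayscale):", round(dispersion(frame_gray), 1))
```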

    Multi-layer Dictionary Learning for Image Classification

    This paper presents a multi-layer dictionary learning method for classification tasks. The goal of the proposed multi-layer framework is to apply supervised dictionary learning locally on raw images in order to learn local features. The method starts by building a sparse representation at the patch level and relies on a hierarchy of learned dictionaries to output a global sparse representation for the whole image. It uses a succession of sparse coding and pooling steps to find an efficient representation of the data for classification. The method has been tested on a classification task with good results.
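    A minimal sketch of the patch-level sparse coding + pooling idea described above, using scikit-learn's unsupervised dictionary learning as a stand-in for the paper's supervised, multi-layer method; the patch size, dictionary size and max-pooling choice are illustrative assumptions.

```python
# Sketch: learn a dictionary on image patches, sparse-code every patch, then pool
# the codes into a single descriptor for the whole image (single-layer stand-in).
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning
from sklearn.feature_extraction.image import extract_patches_2d

rng = np.random.default_rng(3)
images = rng.random((20, 32, 32))                 # stand-in grayscale images

# Collect 8x8 patches from all images and learn a small dictionary.
patches = np.concatenate([extract_patches_2d(img, (8, 8), max_patches=50, random_state=0)
                          for img in images]).reshape(-1, 64)
dico = MiniBatchDictionaryLearning(n_components=32, transform_algorithm="omp",
                                   transform_n_nonzero_coefs=5, random_state=0).fit(patches)

def encode(img):
    """Sparse-code the image's patches and max-pool the absolute codes into one vector."""
    p = extract_patches_2d(img, (8, 8)).reshape(-1, 64)
    codes = dico.transform(p)                     # (n_patches, 32) sparse codes
    return np.abs(codes).max(axis=0)              # pooled global representation

features = np.stack([encode(img) for img in images])   # ready for a linear classifier
print(features.shape)                                   # (20, 32)
```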

    Color Information in a Model of Saliency

    Bottom-up saliency models have been developed to predict the location of gaze according to low-level features of visual scenes, such as intensity, color, frequency and motion. In this paper, we investigate the contribution of color features to bottom-up saliency. We incorporated a chrominance pathway into a luminance-based model (Marat et al.) and evaluated the performance of the model with and without the chrominance pathway. We also added an efficient multi-GPU implementation of the chrominance pathway to the parallel implementation of the luminance-based model proposed by Rahman et al., preserving real-time operation. Results show that color information improves the performance of the saliency model in predicting eye positions.
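    A toy sketch of adding a chrominance pathway to a luminance-based saliency map; this is not the Marat et al. model or its GPU implementation. The centre-surround contrast used for each pathway, the opponent-color channels and the fusion by simple averaging are stand-in assumptions.

```python
# Sketch: luminance pathway + chrominance pathway, each reduced to a
# centre-surround (difference-of-Gaussians) contrast map, fused by averaging.
import numpy as np
from scipy.ndimage import gaussian_filter

def conspicuity(channel, sigma_c=2, sigma_s=8):
    """Centre-surround contrast map, normalised to [0, 1]."""
    dog = np.abs(gaussian_filter(channel, sigma_c) - gaussian_filter(channel, sigma_s))
    span = dog.max() - dog.min()
    return (dog - dog.min()) / (span + 1e-8)

def saliency(rgb):
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    lum = 0.299 * r + 0.587 * g + 0.114 * b      # luminance pathway input
    rg, by = r - g, b - (r + g) / 2              # simple red-green / blue-yellow chrominance
    lum_map = conspicuity(lum)
    chrom_map = (conspicuity(rg) + conspicuity(by)) / 2
    return (lum_map + chrom_map) / 2             # fused master saliency map

frame = np.random.default_rng(4).random((240, 320, 3))   # stand-in video frame
print(saliency(frame).shape)                              # (240, 320)
```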