
    The Conditional Lucas & Kanade Algorithm

    The Lucas & Kanade (LK) algorithm is the method of choice for efficient dense image and object alignment. The approach is efficient as it models the connection between appearance and geometric displacement through a linear relationship that assumes independence across pixel coordinates. A drawback of the approach, however, is its generative nature. Specifically, its performance is tightly coupled with how well the linear model can synthesize appearance from geometric displacement, even though the alignment task itself is the inverse problem. In this paper, we present a new approach, referred to as the Conditional LK algorithm, which: (i) directly learns linear models that predict geometric displacement as a function of appearance, and (ii) employs a novel strategy for ensuring that the generative pixel independence assumption can still be taken advantage of. We demonstrate that our approach exhibits superior performance to classical generative forms of the LK algorithm. Furthermore, we demonstrate comparable performance to state-of-the-art methods such as the Supervised Descent Method with substantially fewer training examples, as well as the unique ability to "swap" geometric warp functions without having to retrain from scratch. Finally, from a theoretical perspective, our approach hints at possible redundancies that exist in current state-of-the-art methods for alignment that could be leveraged in vision systems of the future. Comment: 17 pages, 11 figures
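
    As a toy illustration of the conditional idea, the sketch below learns a linear map from appearance differences directly to a one-dimensional translation, instead of synthesizing appearance from displacement. The signal, warp, and training range are invented for illustration; this is not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: a 1-D "image" (a sampled sinusoid) and a scalar translation
# warp. The conditional idea: regress the displacement directly from the
# appearance difference, rather than modeling appearance from displacement.
x = np.linspace(0, 2 * np.pi, 64)
template = np.sin(x)

def warp_image(dx):
    # Appearance of the signal translated by dx (hypothetical toy warp).
    return np.sin(x + dx)

# Training pairs: random displacements and the resulting appearance
# differences relative to the template.
dxs = rng.uniform(-0.3, 0.3, size=200)
A = np.stack([warp_image(d) - template for d in dxs])   # (200, 64)

# Learn the conditional linear map R: appearance difference -> displacement.
R, *_ = np.linalg.lstsq(A, dxs, rcond=None)

# Estimate an unseen displacement in one shot.
true_dx = 0.2
est_dx = (warp_image(true_dx) - template) @ R
```

    For small displacements the learned map recovers the translation to within a few percent, which is the discriminative analogue of one LK update step.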

    Synchronization in Complex Systems Following the Decision Based Queuing Process: The Rhythmic Applause as a Test Case

    Living communities can be considered complex systems, and thus a fertile ground for studies of their statistics and dynamics. In this study we revisit the case of rhythmic applause by utilizing the model proposed by Vázquez et al. [A. Vázquez et al., Phys. Rev. E 73, 036127 (2006)], augmented with two opposing driving forces, namely Individuality and Companionship. To that end, after performing computer simulations with a large number of oscillators, we propose an explanation for the following open questions: (a) why synchronization occurs suddenly, and (b) why synchronization is observed when the clapping period T_c satisfies 1.5 T_s < T_c < 2.0 T_s (T_s is the mean self-period of the spectators) and is lost after a time. Moreover, based on the model, a weak preferential attachment principle is proposed which can produce complex networks obeying a power law in the distribution of the number of edges per node, with exponent greater than 3. Comment: 16 pages, 5 figures
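
    The abrupt onset of synchronization can be illustrated with a generic mean-field phase-oscillator (Kuramoto-type) simulation, where coupling plays the role of Companionship and the spread of natural frequencies the role of Individuality. This is a standard sketch of the phenomenon, not the decision-based queuing model of the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Population of phase oscillators with heterogeneous natural frequencies.
n = 200
omega = rng.normal(1.0, 0.1, n)            # "individual" clapping frequencies
theta0 = rng.uniform(0, 2 * np.pi, n)      # random initial phases

def order_parameter(theta):
    # |mean of e^{i theta}|: ~0 = incoherent, ~1 = fully synchronized.
    return np.abs(np.exp(1j * theta).mean())

def simulate(coupling, theta0, steps=2000, dt=0.05):
    # Euler integration of the mean-field (Kuramoto) dynamics.
    theta = theta0.copy()
    for _ in range(steps):
        mean_field = np.exp(1j * theta).mean()
        r, psi = np.abs(mean_field), np.angle(mean_field)
        theta += dt * (omega + coupling * r * np.sin(psi - theta))
    return order_parameter(theta)

weak = simulate(coupling=0.05, theta0=theta0)    # below critical coupling
strong = simulate(coupling=1.0, theta0=theta0)   # above critical coupling
```

    Below a critical coupling the order parameter stays near its incoherent finite-size level, and above it the population locks almost completely, which is why synchronization appears to switch on suddenly.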

    Phase-based video motion processing

    We introduce a technique to manipulate small movements in videos based on an analysis of motion in complex-valued image pyramids. Phase variations of the coefficients of a complex-valued steerable pyramid over time correspond to motion, and can be temporally processed and amplified to reveal imperceptible motions, or attenuated to remove distracting changes. This processing does not involve the computation of optical flow, and in comparison to the previous Eulerian Video Magnification method it supports larger amplification factors and is significantly less sensitive to noise. These improved capabilities broaden the set of applications for motion processing in videos. We demonstrate the advantages of this approach on synthetic and natural video sequences, and explore applications in scientific analysis, visualization and video enhancement. Funding: Shell Research; United States Defense Advanced Research Projects Agency (Soldier Centric Imaging via Computational Cameras); National Science Foundation (U.S.) (CGV-1111415); Cognex Corporation; Microsoft Research (PhD Fellowship); American Society for Engineering Education (National Defense Science and Engineering Graduate Fellowship)
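
    A one-dimensional analogue of the core idea: a small shift of a sinusoid appears as a phase change of its analytic signal, and scaling that phase change magnifies the motion without computing optical flow. The steerable pyramid is replaced here by a single FFT-based analytic signal, and all parameters are illustrative.

```python
import numpy as np

# Two "frames" of a 1-D signal differing by a tiny sub-sample shift.
t = np.arange(256)
freq = 16 / 256                      # exact FFT bin, so no leakage
tiny_shift = 0.1                     # sub-sample displacement to magnify
frame0 = np.cos(2 * np.pi * freq * t)
frame1 = np.cos(2 * np.pi * freq * (t - tiny_shift))

def analytic(x):
    # Analytic signal via FFT (zero out negative frequencies).
    X = np.fft.fft(x)
    h = np.zeros(len(x))
    h[0] = 1
    h[1:len(x) // 2] = 2
    h[len(x) // 2] = 1
    return np.fft.ifft(X * h)

phase0 = np.angle(analytic(frame0))
phase1 = np.angle(analytic(frame1))
dphase = np.angle(np.exp(1j * (phase1 - phase0)))   # wrapped phase difference

# Amplify the phase variation and resynthesize the frame.
alpha = 10.0
magnified = np.real(np.abs(analytic(frame0)) *
                    np.exp(1j * (phase0 + alpha * dphase)))

# The magnified frame behaves like a shift of alpha * tiny_shift samples.
expected = np.cos(2 * np.pi * freq * (t - alpha * tiny_shift))
```

    For a pure sinusoid the amplified phase exactly reproduces a 10x larger shift; the paper's contribution is doing this robustly per scale and orientation in a steerable pyramid.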

    A point process framework for modeling electrical stimulation of the auditory nerve

    Model-based studies of auditory nerve responses to electrical stimulation can provide insight into the functioning of cochlear implants. Ideally, these studies can identify limitations in sound processing strategies and lead to improved methods for providing sound information to cochlear implant users. To accomplish this, models must accurately describe auditory nerve spiking while avoiding excessive complexity that would preclude large-scale simulations of populations of auditory nerve fibers and obscure insight into the mechanisms that influence neural encoding of sound information. In this spirit, we develop a point process model of the auditory nerve that provides a compact and accurate description of neural responses to electric stimulation. Inspired by the framework of generalized linear models, the proposed model consists of a cascade of linear and nonlinear stages. We show how each of these stages can be associated with biophysical mechanisms and related to models of neuronal dynamics. Moreover, we derive a semi-analytical procedure that uniquely determines each parameter in the model on the basis of fundamental statistics from recordings of single fiber responses to electric stimulation, including threshold, relative spread, jitter, and chronaxie. The model also accounts for refractory and summation effects that influence the responses of auditory nerve fibers to high pulse rate stimulation. Throughout, we compare model predictions to published physiological data and explain differences in auditory nerve responses to high and low pulse rate stimulation. We close by performing an ideal observer analysis of simulated spike trains in response to sinusoidally amplitude modulated stimuli and find that carrier pulse rate does not affect modulation detection thresholds. Comment: 1 title page, 27 manuscript pages, 14 figures, 1 table, 1 appendix
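
    The cascade structure described above can be sketched as a linear-nonlinear-Bernoulli model with an absolute refractory period: pulse train, exponential summation kernel, sigmoidal probability of firing, stochastic spiking. The kernel, threshold, and spread values below are illustrative placeholders, not the parameters fitted in the paper.

```python
import numpy as np

rng = np.random.default_rng(2)

# Stimulus: an electric pulse train in 0.1 ms bins.
dt = 1e-4
dur = 0.5
n_bins = int(dur / dt)
pulse_rate = 1000.0                           # pulses per second
step = int(round(1 / (pulse_rate * dt)))
stim = np.zeros(n_bins)
stim[::step] = 1.0

# Linear stage: exponential kernel models charge summation across pulses.
tau = 5e-4
kernel = np.exp(-np.arange(0, 5 * tau, dt) / tau)
drive = np.convolve(stim, kernel)[:n_bins]

# Nonlinear stage: sigmoid maps drive to a per-bin spike probability
# (threshold and spread are illustrative, loosely analogous to the
# threshold / relative-spread statistics mentioned in the abstract).
def p_spike(v, threshold=1.2, spread=0.1):
    return 1.0 / (1.0 + np.exp(-(v - threshold) / spread))

# Spike generation with an absolute refractory period of 1 ms.
refrac_bins = int(1e-3 / dt)
spikes = np.zeros(n_bins, dtype=bool)
last = -refrac_bins - 1
for i in range(n_bins):
    if i - last > refrac_bins and rng.random() < p_spike(drive[i]):
        spikes[i] = True
        last = i

rate = spikes.sum() / dur                     # mean firing rate (spikes/s)
```

    Even this toy cascade reproduces two qualitative features the paper relies on: summation (drive accumulates across closely spaced pulses) and refractoriness (inter-spike intervals never fall below the refractory period).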

    Culture shapes how we look at faces

    Background: Face processing, amongst many basic visual skills, is thought to be invariant across all humans. From as early as 1965, studies of eye movements have consistently revealed a systematic triangular sequence of fixations over the eyes and the mouth, suggesting that faces elicit a universal, biologically-determined information extraction pattern. Methodology/Principal Findings: Here we monitored the eye movements of Western Caucasian and East Asian observers while they learned, recognized, and categorized by race Western Caucasian and East Asian faces. Western Caucasian observers reproduced a scattered triangular pattern of fixations for faces of both races and across tasks. Contrary to intuition, East Asian observers focused more on the central region of the face. Conclusions/Significance: These results demonstrate that face processing can no longer be considered as arising from a universal series of perceptual events. The strategy employed to extract visual information from faces differs across cultures.

    Local biases drive, but do not determine, the perception of illusory trajectories

    When a dot moves horizontally across a set of tilted lines of alternating orientations, the dot appears to be moving up and down along its trajectory. This perceptual phenomenon, known as the slalom illusion, reveals a mismatch between the veridical motion signals and the subjective percept of the motion trajectory, which has not been comprehensively explained. In the present study, we investigated the empirical boundaries of the slalom illusion using psychophysical methods. The phenomenon was found to occur both under conditions of smooth pursuit eye movements and constant fixation, and to be consistently amplified by intermittently occluding the dot trajectory. When the motion direction of the dot was not constant, however, the stimulus display did not elicit the expected illusory percept. These findings confirm that a local bias towards perpendicularity at the intersection points between the dot trajectory and the tilted lines causes the illusion, but also highlight that higher-level cortical processes are involved in interpreting and amplifying the biased local motion signals into a global illusion of trajectory perception.

    Longer fixation duration while viewing face images

    The spatio-temporal properties of saccadic eye movements can be influenced by the cognitive demand and the characteristics of the observed scene. Probably due to its crucial role in social communication, it is argued that face perception may involve different cognitive processes compared with non-face object or scene perception. In this study, we investigated whether and how face and natural scene images can influence the patterns of visuomotor activity. We recorded monkeys’ saccadic eye movements as they freely viewed monkey face and natural scene images. The face and natural scene images attracted a similar number of fixations, but viewing of faces was accompanied by longer fixations compared with natural scenes. These longer fixations were dependent on the context of facial features. The duration of fixations directed at facial contours decreased when the face images were scrambled, and increased at the later stage of normal face viewing. The results suggest that face and natural scene images can generate different patterns of visuomotor activity. The extra fixation duration on faces may be correlated with the detailed analysis of facial features.

    Optical and near-infrared observations of the GRB020405 afterglow

    (Abridged) We report on observations of the optical and NIR afterglow of GRB020405. Ground-based optical observations started about 1 day after the GRB and spanned a period of ~10 days; archival HST data extended the coverage up to 70 days after the GRB. We report the first detection of the afterglow in NIR bands. The detection of emission lines in the optical spectrum indicates that the GRB is located at z = 0.691. Absorptions are also detected at z = 0.691 and at z = 0.472. The latter system is likely caused by clouds in a galaxy located 2 arcsec southwest of the GRB host. Hence, for the first time, the galaxy responsible for an intervening absorption system in the spectrum of a GRB afterglow is identified. Optical and NIR photometry indicates that the decay in all bands follows a single power law of index alpha = 1.54. The late-epoch VLT and HST points lie above the extrapolation of this power law, so that a plateau is apparent in the VRIJ light curves at 10-20 days after the GRB. The light curves at epochs later than day ~20 after the GRB are consistent with a power-law decay with index alphaprime = 1.85. We suggest that this deviation can be modeled with a SN having the same temporal profile as SN2002ap, but 1.3 mag brighter at peak, and located at the GRB redshift. Alternatively, a shock re-energization may be responsible for the rebrightening. A polarimetric R-band measurement shows that the afterglow is polarized, with P = 1.5 % and theta = 172 degrees. Optical-NIR spectral flux distributions show a change of slope across the J band which we interpret as due to the presence of nu_c. The analysis of the multiwavelength spectrum within the fireball model suggests that a population of relativistic electrons produces the optical-NIR emission via synchrotron in an adiabatically expanding blastwave, and the X-rays via IC. Comment: 17 pages, 10 figures, 4 tables, accepted for publication in A&A, main journal
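
    As a worked example of the quoted decay indices: for a flux power law F(t) proportional to t^-alpha, the magnitude fades by Delta m = 2.5 alpha log10(t2/t1) between epochs t1 and t2. The epochs below are illustrative, and the supernova bump and plateau are not modeled here.

```python
import numpy as np

def fade_mag(alpha, t1, t2):
    # Magnitude increase of a power-law afterglow between epochs t1, t2:
    # Delta m = -2.5 log10(F(t2)/F(t1)) = 2.5 * alpha * log10(t2/t1).
    return 2.5 * alpha * np.log10(t2 / t1)

# Early decay (alpha = 1.54): fading over one decade in time (day 1 -> 10).
early_fade = fade_mag(1.54, 1.0, 10.0)      # 3.85 mag per decade

# Post-break decay (alpha' = 1.85): e.g. a doubling in time, day 20 -> 40.
late_fade = fade_mag(1.85, 20.0, 40.0)      # about 1.39 mag
```

    The steeper post-break index thus corresponds to roughly 4.6 mag of fading per decade in time, versus 3.85 mag for the early slope.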

    Action Recognition with a Bio--Inspired Feedforward Motion Processing Model: The Richness of Center-Surround Interactions

    Here we show that reproducing the functional properties of MT cells with various center-surround interactions enriches motion representation and improves the action recognition performance. To do so, we propose a simplified bio-inspired model of the motion pathway in primates: It is a feedforward model restricted to V1-MT cortical layers, cortical cells cover the visual space with a foveated structure, and more importantly, we reproduce some of the richness of center-surround interactions of MT cells. Interestingly, as observed in neurophysiology, our MT cells not only behave like simple velocity detectors, but also respond to several kinds of motion contrasts. Results show that this diversity of motion representation at the MT level is a major advantage for an action recognition task. Defining motion maps as our feature vectors, we used a standard classification method on the Weizmann database: We obtained an average recognition rate of 98.9%, which is superior to the recent results by Jhuang et al. (2007). These promising results encourage us to further develop bio-inspired models incorporating other brain mechanisms and cortical layers in order to deal with more complex videos.
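
    A minimal sketch of a center-surround motion-contrast unit: the response is the mean center motion energy minus the mean surround energy, so uniform motion cancels while local motion contrast drives the cell. Sizes and weights are illustrative, not the model's MT parameters.

```python
import numpy as np

def center_surround_response(motion_map, i, j, c_rad=1, s_rad=3, w=1.0):
    # Center: small square patch; surround: annulus around it.
    center = motion_map[i - c_rad:i + c_rad + 1, j - c_rad:j + c_rad + 1]
    surround = motion_map[i - s_rad:i + s_rad + 1, j - s_rad:j + s_rad + 1]
    surround_mean = (surround.sum() - center.sum()) / (surround.size - center.size)
    return center.mean() - w * surround_mean

# Uniform motion energy: center and surround cancel.
uniform = np.ones((9, 9))
# Motion contrast: a small moving patch on a static background.
contrast = np.zeros((9, 9))
contrast[3:6, 3:6] = 1.0

r_uniform = center_surround_response(uniform, 4, 4)
r_contrast = center_surround_response(contrast, 4, 4)
```

    Such units are silent for full-field motion but respond strongly to an object moving against its neighbourhood, which is the kind of motion-contrast selectivity the abstract argues is useful for action recognition.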

    Natural images from the birthplace of the human eye

    Here we introduce a database of calibrated natural images publicly available through an easy-to-use web interface. Using a Nikon D70 digital SLR camera, we acquired about 5000 six-megapixel images of the Okavango Delta of Botswana, a tropical savanna habitat similar to where the human eye is thought to have evolved. Some sequences of images were captured unsystematically while following a baboon troop, while others were designed to vary a single parameter such as aperture, object distance, time of day or position on the horizon. Images are available in the raw RGB format and in grayscale. Images are also available in units relevant to the physiology of human cone photoreceptors, where pixel values represent the expected number of photoisomerizations per second for cones sensitive to long (L), medium (M) and short (S) wavelengths. This database is distributed under a Creative Commons Attribution-Noncommercial Unported license to facilitate research in computer vision, psychophysics of perception, and visual neuroscience. Comment: Submitted to PLoS ONE
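
    The cone-referenced format can be sketched as a 3x3 linear map from linear RGB to L, M, S responses, followed by a scale to photoisomerization rates. The matrix and scale below are hypothetical placeholders, not the database's actual calibration, which should be taken from its documentation.

```python
import numpy as np

# Hypothetical RGB -> LMS mixing matrix (placeholder values, rows L, M, S).
rgb_to_lms = np.array([
    [0.31, 0.62, 0.05],
    [0.16, 0.72, 0.11],
    [0.02, 0.13, 0.94],
])
scale = 1e4   # illustrative photoisomerizations/s at full-scale signal

def cone_image(rgb):
    # rgb: (H, W, 3) linear values in [0, 1] -> (H, W, 3) LMS rates.
    return np.einsum('ij,hwj->hwi', rgb_to_lms, rgb) * scale

# A tiny uniform mid-gray test image.
img = np.full((2, 2, 3), 0.5)
lms = cone_image(img)
```

    The point of the format is that downstream models can work directly in units a cone "sees", without each user redoing the camera calibration.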