
    Spatial and temporal background modelling of non-stationary visual scenes

    PhD thesis. The prevalence of electronic imaging systems in everyday life has become increasingly apparent in recent years. Applications are to be found in medical scanning, automated manufacture, and perhaps most significantly, surveillance. Metropolitan areas, shopping malls, and road traffic management all employ and benefit from an unprecedented quantity of video cameras for monitoring purposes. But the high cost and limited effectiveness of employing humans as the final link in the monitoring chain has driven scientists to seek solutions based on machine vision techniques. Whilst the field of machine vision has enjoyed consistent rapid development in the last 20 years, some of the most fundamental issues still remain to be solved in a satisfactory manner. Central to a great many vision applications is the concept of segmentation, and in particular, most practical systems perform background subtraction as one of the first stages of video processing. This involves separation of ‘interesting foreground’ from the less informative but persistent background. But the definition of what is ‘interesting’ is somewhat subjective, and liable to be application specific. Furthermore, the background may be interpreted as including the visual appearance of normal activity of any agents present in the scene, human or otherwise. Thus a background model might be called upon to absorb lighting changes, moving trees and foliage, or normal traffic flow and pedestrian activity, in order to effect what might be termed in ‘biologically-inspired’ vision as pre-attentive selection. This challenge is one of the Holy Grails of the computer vision field, and consequently the subject has received considerable attention. 
This thesis sets out to address some of the limitations of contemporary methods of background segmentation by investigating methods of inducing local mutual support amongst pixels in three starkly contrasting paradigms: (1) locality in the spatial domain, (2) locality in the short-term time domain, and (3) locality in the domain of cyclic repetition frequency. Conventional per pixel models, such as those based on Gaussian Mixture Models, offer no spatial support between adjacent pixels at all. At the other extreme, eigenspace models impose a structure in which every image pixel bears the same relation to every other pixel. But Markov Random Fields permit definition of arbitrary local cliques by construction of a suitable graph, and are used here to facilitate a novel structure capable of exploiting probabilistic local co-occurrence of adjacent Local Binary Patterns. The result is a method exhibiting strong sensitivity to multiple learned local pattern hypotheses, whilst relying solely on monochrome image data. Many background models enforce temporal consistency constraints on a pixel in an attempt to confirm background membership before it is accepted as part of the model, and typically some control over this process is exercised by a learning rate parameter. But in busy scenes, a true background pixel may be visible for a relatively small fraction of the time and in a temporally fragmented fashion, thus hindering such background acquisition. However, support in terms of temporal locality may still be achieved by using Combinatorial Optimization to derive short-term background estimates which induce a similar consistency, but are considerably more robust to disturbance. A novel technique is presented here in which the short-term estimates act as ‘pre-filtered’ data from which a far more compact eigen-background may be constructed. Many scenes entail elements exhibiting repetitive periodic behaviour. 
Some road junctions employing traffic signals are among these, yet little is to be found amongst the literature regarding the explicit modelling of such periodic processes in a scene. Previous work focussing on gait recognition has demonstrated approaches based on recurrence of self-similarity by which local periodicity may be identified. The present work harnesses and extends this method in order to characterize scenes displaying multiple distinct periodicities by building a spatio-temporal model. The model may then be used to highlight abnormality in scene activity. Furthermore, a Phase Locked Loop technique with a novel phase detector is detailed, enabling such a model to maintain correct synchronization with scene activity in spite of noise and drift of periodicity. This thesis contends that these three approaches are all manifestations of the same broad underlying concept: local support in each of the space, time and frequency domains, and furthermore, that the support can be harnessed practically, as will be demonstrated experimentally
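The abstract above refers to conventional per-pixel background models whose adaptation is controlled by a learning rate parameter. As a rough illustration only, the following is a minimal sketch of that generic baseline (a running-average background with thresholded differencing), not the Markov Random Field, Combinatorial Optimization, or Phase Locked Loop methods of the thesis. The function names, the `learning_rate` default, and the `threshold` default are all assumptions made for the example.

```python
def update_background(background, frame, learning_rate=0.05):
    """Blend the new frame into the background, pixel by pixel.

    Both arguments are lists of rows of grey-level values.
    A small learning_rate (an assumed default) means slow adaptation.
    """
    return [
        [(1.0 - learning_rate) * b + learning_rate * f
         for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def foreground_mask(background, frame, threshold=25.0):
    """Mark a pixel as foreground when it differs from the
    background estimate by more than an assumed threshold."""
    return [
        [abs(f - b) > threshold for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


if __name__ == "__main__":
    background = [[100.0, 100.0], [100.0, 100.0]]
    frame = [[100.0, 200.0], [100.0, 100.0]]
    print(foreground_mask(background, frame))
    print(update_background(background, frame, learning_rate=0.5))
```

Each pixel here is updated independently of its neighbours, which is the lack of spatial support that the abstract identifies as a limitation of per-pixel models.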

    Interaction and integration of visual and noise impacts of motorways

    This study aimed to achieve a better understanding of the visual and noise impacts of motorways and their integrated impact on the environmental quality via an aural-visual interaction approach, to contribute to more reliable and efficient assessments of the impacts. The study was based on perceptual experiments involving human participants using computer-visualised scenes and edited audio recordings as experimental stimuli. Factors related to road project characteristics and existing landscape characters that potentially influence the perceived visual impact of motorways were first investigated without considering the impact from moving traffic. An online preference survey was conducted for this part of the study. The results showed substantial visual impact from motorways, especially in more natural landscapes, and a significant increase in the impact by opaque noise barriers. Map-based predictors were identified and a regression model was developed to predict and map the perceived visual impact in GIS. The second part of the study investigated the effects of traffic condition, distance to road and background landscape on the perceived visual impact of motorway traffic, and the contribution of traffic noise to the perceived visual impact. A laboratory experiment was carried out where experimental scenarios were presented to participants both with and without sound. The results showed significant visual impact from motorway traffic which was higher in the natural landscape than in the residential counterpart, increased by traffic volume and decreased by distance. Noise increased the perceived visual impact by a largely constant level despite changes in noise level and other factors. 
With findings on visual impact from the above studies and knowledge on noise impact from current literature, the third part of this study, with a second laboratory experiment, investigated the perceived integrated impact of visual intrusion and noise of motorways, and explored the predictability of the impact by noise exposure indices. The results showed that traffic volume expressed by noise emission level was the most influential factor, followed by distance and background landscape. A regression model using noise level at receiver position and type of background landscape as predictors was developed, explaining about a quarter of the variation in the perceived impact. Concerning the acoustical and visual effects of noise barriers found on perceived environmental quality, the fourth part of the study focused on mitigation of the integrated visual and noise impact by noise barriers. A third laboratory experiment was conducted and the results showed that noise barriers always had either a beneficial or an insignificant effect in mitigating integrated impact, and the effect was largely similar to that of a tree belt. Generally, barriers varying in size and transparency did not differ much in their performance, but there seems to be some difference by barrier size at different distances. Lastly, using the above findings of this study, impact mappings were demonstrated as a possible prototype of more advanced tools to assist visual and noise impact assessment

    Chinese Ink-and-Brush Painting with Film Lighting Aesthetics in 3D Computer Graphics

    This thesis explores the topic of recreating Chinese ink-and-brush painting in 3D computer graphics and introducing film lighting aesthetics into the result. The method is primarily based on non-photorealistic shader development and digital compositing. The goal of this research is to study how to bring the visual aesthetics of Chinese ink-and-brush painting into 3D computer graphics as well as explore the artistic possibility of using film lighting principles in Chinese painting for visual storytelling by using 3D computer graphics. In this research, we use the Jiangnan water country paintings by renowned contemporary Chinese artist Yang Ming-Yi as our primary visual reference. An analysis of the paintings is performed to study the visual characteristics of Yang's paintings. These include how the artist expresses shading, forms, shadow, reflection and compositing principles, which will be used as the guidelines for recreating the painting in computer graphics. 3D meshes are used to represent the subjects in the painting like houses, boats and water. Then procedural non-photorealistic shaders are developed and applied on 3D meshes to give the models an ink-look. Additionally, different types of 3D data are organized and rendered into different layers, which include shading, depth, and geometric information. Those layers are then composited together by using 2D image processing algorithms with custom artistic controls to achieve a more natural-looking ink-painting result. As a result, a short animation of Chinese ink-and-brush painting in 3D computer graphics will be created in which the same environment is rendered with different lighting designs to demonstrate the artistic intention

    Dimensions of Motion: Monocular Prediction through Flow Subspaces

    We introduce a way to learn to estimate a scene representation from a single image by predicting a low-dimensional subspace of optical flow for each training example, which encompasses the variety of possible camera and object movement. Supervision is provided by a novel loss which measures the distance between this predicted flow subspace and an observed optical flow. This provides a new approach to learning scene representation tasks, such as monocular depth prediction or instance segmentation, in an unsupervised fashion using in-the-wild input videos without requiring camera poses, intrinsics, or an explicit multi-view stereo step. We evaluate our method in multiple settings, including an indoor depth prediction task where it achieves comparable performance to recent methods trained with more supervision.
    Comment: Project page at https://dimensions-of-motion.github.io
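The loss described in this abstract measures a distance between a predicted flow subspace and an observed flow. The abstract does not give its exact form, so the following is only a sketch of one natural reading: the Euclidean norm of the part of the observed flow that lies outside the span of the predicted basis. The function name `subspace_distance`, the flattened-vector representation, and the use of Gram-Schmidt orthogonalisation are assumptions for illustration, not the authors' implementation.

```python
import math


def subspace_distance(basis, flow, eps=1e-12):
    """Distance from `flow` to the span of the vectors in `basis`.

    `basis` is a list of k vectors and `flow` a single vector, each a
    flat list of floats (for example, a flattened flow field).
    """
    # Gram-Schmidt: build an orthonormal basis for the predicted subspace.
    ortho = []
    for vec in basis:
        w = list(vec)
        for q in ortho:
            c = sum(a * b for a, b in zip(w, q))
            w = [a - c * b for a, b in zip(w, q)]
        norm = math.sqrt(sum(a * a for a in w))
        if norm > eps:  # skip vectors already in the span
            ortho.append([a / norm for a in w])

    # Remove the component of the observed flow inside the subspace.
    residual = list(flow)
    for q in ortho:
        c = sum(a * b for a, b in zip(residual, q))
        residual = [a - c * b for a, b in zip(residual, q)]
    return math.sqrt(sum(a * a for a in residual))


if __name__ == "__main__":
    basis = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
    print(subspace_distance(basis, [2.0, 3.0, 0.0]))  # flow inside the span
    print(subspace_distance(basis, [0.0, 0.0, 5.0]))  # flow orthogonal to it
```

Under this reading, an observed flow that can be written as a combination of the predicted basis flows incurs no penalty, which is what lets a low-dimensional subspace cover a range of possible camera and object motions.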

    Felt_space infrastructure: Hyper vigilant spatiality to valence the visceral dimension

    Felt_space infrastructure: Hypervigilant spatiality to valence the visceral dimension. This thesis evolves perception as a hypothesis to reframe architectural praxis negotiated through agent-situation interaction. The research questions the geometric principles of architectural ordination to originate the ‘felt_space infrastructure’, a relational system of measurement concerned with the role of perception in mediating sensory space and the cognised environment. The methodological model for this research fuses perception and environmental stimuli, into a consistent generative process that penetrates the inner essence of space, to reveal the visceral parameter. These concepts are applied to develop a ‘coefficient of affordance’ typology, ‘hypervigilant’ tool set, and ‘cognitive_tope’ design methodology. Thus, by extending the architectural platform to consider perception as a design parameter, the thesis interprets the ‘inference schema’ as an instructional model to coordinate the acquisition of spatial reality through tensional and counter-tensional feedback dynamics. Three site-responsive case studies are used to advance the thesis. The first case study is descriptive and develops a typology of situated cognition to extend the ‘granularity’ of perceptual sensitisation (i.e. a fine-grained means of perceiving space). The second project is relational and questions how mapping can coordinate perceptual, cognitive and associative attention, as a ‘multi-webbed vector field’ comprised of attractors and deformations within a viewer-centred gravitational space. The third case study is causal, and demonstrates how a transactional-biased schema can generate, amplify and attenuate perceptual misalignment, thus triggering a visceral niche. 
The significance of the research is that it progresses generative perception as an additional variable for spatial practice, and promotes transactional methodologies to gain enhanced modes of spatial acuity to extend the repertoire of architectural practice

    A Narrative in One Scene

    Filmmakers are visual storytellers, so it is important to understand basic film theory as well as the elements of a narrative, such as voice, look, and feel. It is just as important for filmmakers to understand how film theory and the elements of a narrative work together to effectively convey stories to the people viewing the film. In this thesis, I researched basic film theory and analyzed three personally influential movies and directors: Charlie Chaplin’s Modern Times (1936), Juan Antonio Bayona’s The Impossible (2012), and Akira Kurosawa’s The Seven Samurai (1954). I chose one technique from each of the films, synthesized them into an original scene, and described the process of creating the scene

    The Virtual Worlds of Cinema Visual Effects, Simulation, and the Aesthetics of Cinematic Immersion

    This thesis develops a phenomenology of immersive cinematic spectatorship. During an immersive experience in the cinema, the images, sounds, events, emotions, and characters that form a fictional diegesis become so compelling that our conscious experience of the real world is displaced by a virtual world. Theorists and audiences have long recognized cinema’s ability to momentarily substitute for the lived experience of reality, but it remains an under-theorized aspect of cinematic spectatorship. The first aim of this thesis is therefore to examine these immersive responses to cinema from three perspectives – the formal, the technological, and the neuroscientific – to describe the exact mechanisms through which a spectator’s immersion in a cinematic world is achieved. A second aim is to examine the historical development of the technologies of visual simulation that are used to create these immersive diegetic worlds. My analysis shows a consistent increase in the vividness and transparency of simulative technologies, two factors that are crucial determinants in a spectator’s immersion. In contrast to the cultural anxiety that often surrounds immersive responses to simulative technologies, I examine immersive spectatorship as an aesthetic phenomenon that is central to our engagement with cinema. The ubiquity of narrative – written, verbal, cinematic – shows that the ability to achieve immersion is a fundamental property of the human mind found in cultures diverse in both time and place. This thesis is thus an attempt to illuminate this unique human ability and examine the technologies that allow it to flourish

    Efficiently mapping high-performance early vision algorithms onto multicore embedded platforms

    The combination of low-cost imaging chips and high-performance, multicore, embedded processors heralds a new era in portable vision systems. Early vision algorithms have the potential for highly data-parallel, integer execution. However, an implementation must operate within the constraints of embedded systems, including low clock rate, low-power operation and limited memory. This dissertation explores new approaches to adapt novel pixel-based vision algorithms for tomorrow's multicore embedded processors. It presents:
    - An adaptive, multimodal background modeling technique called Multimodal Mean that achieves high accuracy and frame rate performance with limited memory and a slow-clock, energy-efficient, integer processing core.
    - A new workload partitioning technique to optimize the execution of early vision algorithms on multi-core systems.
    - A novel data transfer technique called cat-tail DMA that provides globally-ordered, non-blocking data transfers on a multicore system.
    By using efficient data representations, Multimodal Mean provides comparable accuracy to the widely used Mixture of Gaussians (MoG) multimodal method. However, it achieves a 6.2x improvement in performance and uses 18% less storage than MoG when executing on a representative embedded platform. When this algorithm is adapted to a multicore execution environment, the new workload partitioning technique demonstrates an improvement in execution times of 25% with only a 125 ms system reaction time. It also reduced the overall number of data transfers by 50%. Finally, the cat-tail buffering technique reduces the data-transfer latency between execution cores and main memory by 32.8% over the baseline technique when executing Multimodal Mean. 
This technique concurrently performs data transfers with code execution on individual cores, while maintaining global ordering through low-overhead scheduling to prevent collisions.
Ph.D. Committee Chair: Wills, Scott; Committee Co-Chair: Wills, Linda; Committee Member: Bader, David; Committee Member: Davis, Jeff; Committee Member: Hamblen, James; Committee Member: Lanterman, Aaro

    Feature-rich distance-based terrain synthesis

    This thesis describes a novel terrain synthesis method based on distances in a weighted graph. The method begins with a regular lattice with arbitrary edge weights; heights are determined by path cost from a set of generator nodes. The shapes of individual terrain features, such as mountains, hills, and craters, are specified by a monotonically decreasing profile describing the cross-sectional shape of a feature, while the locations of features in the terrain are specified by placing the generators. Pathing places ridges whose initial locations have a dendritic shape. The method is robust and easy to control, making it possible to create pareidolia effects. It can produce a wide range of realistic synthetic terrains such as mountain ranges, craters, faults, cinder cones, and hills. The algorithm incorporates random graph edge weights, permits the inclusion of multiple topography profiles, and allows precise control over placement of terrain features and their heights. These properties all allow the artist to create highly heterogeneous terrains that compare quite favorably to existing methods
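The core idea in this abstract (heights derived from shortest-path cost to generator nodes on a randomly weighted lattice, shaped by a monotonically decreasing profile) can be sketched as follows. This is a hedged illustration, not the thesis's algorithm: the 4-connected lattice, the uniform weight range, the linear `cone_profile`, and all function names are assumptions chosen for the example.

```python
import heapq
import random


def lattice_distances(width, height, generators, seed=0):
    """Multi-source Dijkstra on a 4-connected lattice.

    Edge weights are random (assumed uniform in [0.5, 1.5]) and drawn
    lazily, once per undirected edge. Returns {(x, y): path cost}.
    """
    rng = random.Random(seed)
    weights = {}

    def edge_weight(a, b):
        key = (a, b) if a < b else (b, a)
        if key not in weights:
            weights[key] = rng.uniform(0.5, 1.5)
        return weights[key]

    dist = {g: 0.0 for g in generators}
    heap = [(0.0, g) for g in generators]
    heapq.heapify(heap)
    while heap:
        d, (x, y) = heapq.heappop(heap)
        if d > dist[(x, y)]:
            continue  # stale queue entry
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < width and 0 <= ny < height:
                nd = d + edge_weight((x, y), (nx, ny))
                if nd < dist.get((nx, ny), float("inf")):
                    dist[(nx, ny)] = nd
                    heapq.heappush(heap, (nd, (nx, ny)))
    return dist


def cone_profile(d, peak=10.0, slope=1.0):
    """An assumed monotonically decreasing profile: a linear cone."""
    return max(0.0, peak - slope * d)


def heights_from_distances(dist, profile):
    """Map each node's path cost to a terrain height."""
    return {node: profile(d) for node, d in dist.items()}


if __name__ == "__main__":
    dist = lattice_distances(8, 6, [(2, 2), (6, 4)], seed=1)
    terrain = heights_from_distances(dist, cone_profile)
    for y in range(6):
        print(" ".join(f"{terrain[(x, y)]:5.1f}" for x in range(8)))
```

Because the edge weights are random, the shortest-path tree branches irregularly away from each generator, which gives a plausible reading of the dendritic ridge placement the abstract mentions; swapping `cone_profile` for another decreasing function changes the cross-sectional shape of each feature.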