6 research outputs found

    Multiple View Geometry For Video Analysis And Post-production

    Get PDF
    Multiple view geometry is the foundation of an important class of computer vision techniques for simultaneous recovery of camera motion and scene structure from a set of images. There are numerous important applications in this area. Examples include video post-production, scene reconstruction, registration, surveillance, tracking, and segmentation. In video post-production, which is the topic being addressed in this dissertation, computer analysis of the motion of the camera can replace the currently used manual methods for correctly aligning an artificially inserted object in a scene. However, existing single view methods typically require multiple vanishing points, and therefore would fail when only one vanishing point is available. In addition, current multiple view techniques, making use of either epipolar geometry or trifocal tensor, do not exploit fully the properties of constant or known camera motion. Finally, there does not exist a general solution to the problem of synchronization of N video sequences of distinct general scenes captured by cameras undergoing similar ego-motions, which is the necessary step for video post-production among different input videos. This dissertation proposes several advancements that overcome these limitations. These advancements are used to develop an efficient framework for video analysis and post-production in multiple cameras. In the first part of the dissertation, the novel inter-image constraints are introduced that are particularly useful for scenes where minimal information is available. This result extends the current state-of-the-art in single view geometry techniques to situations where only one vanishing point is available. The property of constant or known camera motion is also described in this dissertation for applications such as calibration of a network of cameras in video surveillance systems, and Euclidean reconstruction from turn-table image sequences in the presence of zoom and focus. We then propose a new framework for the estimation and alignment of camera motions, including both simple (panning, tracking and zooming) and complex (e.g. hand-held) camera motions. Accuracy of these results is demonstrated by applying our approach to video post-production applications such as video cut-and-paste and shadow synthesis. As realistic image-based rendering problems, these applications require extreme accuracy in the estimation of camera geometry, the position and the orientation of the light source, and the photometric properties of the resulting cast shadows. In each case, the theoretical results are fully supported and illustrated by both numerical simulations and thorough experimentation on real data

    Weakly-supervised Semantic Segmentation with Regularized Loss Hyperparameter Search

    Get PDF
    Weakly supervised segmentation signi cantly reduces user annotation e ort. Recently, regularized loss was proposed for single object class segmentation under image-level weak supervision. Regularized loss consists of several components. Each component, if used in isolation, would lead to some trivial solution. However, a weighted combination of the loss components introduces a balance between the individual biases. The weight of each component in regularized loss is controlled by a hyperparameter. We propose an approach that searches for regularized loss hyperparameters. The main idea is to set the most important regularized loss component to a high weight while ensuring the other loss components are set to weights just su ciently high to prevent the trivial solution favoured by the most important component. Our approach results in a signi cantly improved performance over prior work with xed hyperparameters and improves the state of the art in salient and semantic image level supervised segmentation. In addition to image level weak supervision, we propose a new approach for semantic segmentation with weak supervision using bounding box annotations. Our new approach to weak supervision from bounding boxes also makes use of hyperparameter search regularized loss. Previous work on weak supervision from bounding boxes constructs pseudo-ground truth by segmenting each box into the object and the background for each box independently from all the other boxes in the dataset. We argue that the collection of boxes for the same class naturally provides a dataset from which we can learn the appearance of that object class. Learning a good appearance model, in turn, leads to a better segmentation of each individual box. Thus for each class, we propose to train a segmentation CNN as from the dataset consisting of the bounding boxes for that class using our proposed single object approach. After we train these single-class CNNs, we apply them back to the training bounding boxes to obtain object/background segmentations and merge them to construct pseudo-ground truth. The obtained pseudo-ground truth is used for training a standard segmentation CNN. We improve the state of the art on Pascal VOC 2012 benchmark in bounding box weak supervision setting

    Combining omnidirectional vision with polarization vision for robot navigation

    Get PDF
    La polarisation est le phénomène qui décrit les orientations des oscillations des ondes lumineuses qui sont limitées en direction. La lumière polarisée est largement utilisée dans le règne animal,à partir de la recherche de nourriture, la défense et la communication et la navigation. Le chapitre (1) aborde brièvement certains aspects importants de la polarisation et explique notre problématique de recherche. Nous visons à utiliser un capteur polarimétrique-catadioptrique car il existe de nombreuses applications qui peuvent bénéficier d'une telle combinaison en vision par ordinateur et en robotique, en particulier pour l'estimation d'attitude et les applications de navigation. Le chapitre (2) couvre essentiellement l'état de l'art de l'estimation d'attitude basée sur la vision.Quand la lumière non-polarisée du soleil pénètre dans l'atmosphère, l'air entraine une diffusion de Rayleigh, et la lumière devient partiellement linéairement polarisée. Le chapitre (3) présente les motifs de polarisation de la lumière naturelle et couvre l'état de l'art des méthodes d'acquisition des motifs de polarisation de la lumière naturelle utilisant des capteurs omnidirectionnels (par exemple fisheye et capteurs catadioptriques). Nous expliquons également les caractéristiques de polarisation de la lumière naturelle et donnons une nouvelle dérivation théorique de son angle de polarisation.Notre objectif est d'obtenir une vue omnidirectionnelle à 360 associée aux caractéristiques de polarisation. Pour ce faire, ce travail est basé sur des capteurs catadioptriques qui sont composées de surfaces réfléchissantes et de lentilles. Généralement, la surface réfléchissante est métallique et donc l'état de polarisation de la lumière incidente, qui est le plus souvent partiellement linéairement polarisée, est modifiée pour être polarisée elliptiquement après réflexion. A partir de la mesure de l'état de polarisation de la lumière réfléchie, nous voulons obtenir l'état de polarisation incident. Le chapitre (4) propose une nouvelle méthode pour mesurer les paramètres de polarisation de la lumière en utilisant un capteur catadioptrique. La possibilité de mesurer le vecteur de Stokes du rayon incident est démontré à partir de trois composants du vecteur de Stokes du rayon réfléchi sur les quatre existants.Lorsque les motifs de polarisation incidents sont disponibles, les angles zénithal et azimutal du soleil peuvent être directement estimés à l'aide de ces modèles. Le chapitre (5) traite de l'orientation et de la navigation de robot basées sur la polarisation et différents algorithmes sont proposés pour estimer ces angles dans ce chapitre. A notre connaissance, l'angle zénithal du soleil est pour la première fois estimé dans ce travail à partir des schémas de polarisation incidents. Nous proposons également d'estimer l'orientation d'un véhicule à partir de ces motifs de polarisation.Enfin, le travail est conclu et les possibles perspectives de recherche sont discutées dans le chapitre (6). D'autres exemples de schémas de polarisation de la lumière naturelle, leur calibrage et des applications sont proposées en annexe (B).Notre travail pourrait ouvrir un accès au monde de la vision polarimétrique omnidirectionnelle en plus des approches conventionnelles. Cela inclut l'orientation bio-inspirée des robots, des applications de navigation, ou bien la localisation en plein air pour laquelle les motifs de polarisation de la lumière naturelle associés à l'orientation du soleil à une heure précise peuvent aboutir à la localisation géographique d'un véhiculePolarization is the phenomenon that describes the oscillations orientations of the light waves which are restricted in direction. Polarized light has multiple uses in the animal kingdom ranging from foraging, defense and communication to orientation and navigation. Chapter (1) briefly covers some important aspects of polarization and explains our research problem. We are aiming to use a polarimetric-catadioptric sensor since there are many applications which can benefit from such combination in computer vision and robotics specially robot orientation (attitude estimation) and navigation applications. Chapter (2) mainly covers the state of art of visual based attitude estimation.As the unpolarized sunlight enters the Earth s atmosphere, it is Rayleigh-scattered by air, and it becomes partially linearly polarized. This skylight polarization provides a signi cant clue to understanding the environment. Its state conveys the information for obtaining the sun orientation. Robot navigation, sensor planning, and many other applications may bene t from using this navigation clue. Chapter (3) covers the state of art in capturing the skylight polarization patterns using omnidirectional sensors (e.g fisheye and catadioptric sensors). It also explains the skylight polarization characteristics and gives a new theoretical derivation of the skylight angle of polarization pattern. Our aim is to obtain an omnidirectional 360 view combined with polarization characteristics. Hence, this work is based on catadioptric sensors which are composed of reflective surfaces and lenses. Usually the reflective surface is metallic and hence the incident skylight polarization state, which is mostly partially linearly polarized, is changed to be elliptically polarized after reflection. Given the measured reflected polarization state, we want to obtain the incident polarization state. Chapter (4) proposes a method to measure the light polarization parameters using a catadioptric sensor. The possibility to measure the incident Stokes is proved given three Stokes out of the four reflected Stokes. Once the incident polarization patterns are available, the solar angles can be directly estimated using these patterns. Chapter (5) discusses polarization based robot orientation and navigation and proposes new algorithms to estimate these solar angles where, to the best of our knowledge, the sun zenith angle is firstly estimated in this work given these incident polarization patterns. We also propose to estimate any vehicle orientation given these polarization patterns. Finally the work is concluded and possible future research directions are discussed in chapter (6). More examples of skylight polarization patterns, their calibration, and the proposed applications are given in appendix (B). Our work may pave the way to move from the conventional polarization vision world to the omnidirectional one. It enables bio-inspired robot orientation and navigation applications and possible outdoor localization based on the skylight polarization patterns where given the solar angles at a certain date and instant of time may infer the current vehicle geographical location.DIJON-BU Doc.électronique (212319901) / SudocSudocFranceF

    The gesturing screen : art and screen agency within postmedia assemblages

    Full text link
    This thesis investigates screens as key elements in postmedia assemblages where multiple technical devices and media platforms function relationally to activate new capacities. I characterize the agency of screens as a gesturality that re-arranges and sustains medial relations with and between other components. The gestures of screens reformulate mediality, but also shift experiences and open elements to novel formations and affects. In each re- organisation of the postmedial assemblage, the function and relations of screens are not pre-defined but emerge in process. I develop this dynamic conception of screens and assemblages to account for their diverse manifestations in postmedia. Here I employ an agential realist framework found in the work of Karen Barad and draw on the concept of gestures set out by Giorgio Agamben. This research contributes to a new understanding of postmediality in conjunction with a new conception of the agency of screens. The thesis focuses on mainly digital screen-oriented artworks, repositioning these as heralding or firmly engaging with the postmedial condition. By challenging an understanding of screens that limits them to mere casings for images, this thesis expands the scope and role of screens in postmedia art practices stretching as far back as three decades. It argues that such art works and practices foreground the gesturality of screens and offers in-depth studies of works by Shilpa Gupta, Ulrike Gabriel, Natalie Bookchin, Blast Theory, Ragnar Kjartansson, Sandra Mujinga and Sondra Perry. Such works highlight how screens come to be relationally enacted in postmedia and how that enactment occurs through their performance of medial gestures. I identify two kinds of gestures of postmedia screens that support and connect the technical, aesthetic and in some cases political components of an assemblage. I turn to the multiplicity of frames both on-screen and distributed across screens observed by theorists of media such as Lev Manovich and Anne Friedberg. But the postmedial frame is consistently accompanied by what is out-of-frame – scrolling, swiping and ‘pinching’ continually calls on the out-of-frame to be moved on-screen. The out-of-frame is a postmedial screen gesture, then, that maintains an ongoing relation to the ‘inside’ of the frame, supporting and conditioning it. The multiple temporalities of postmedia assemblages – such as that of images, participants and software – allow an ‘out-of-frame’ to endure beyond the framed image. The second gesture is observed in the pervasiveness of chroma screens — blue and green screens used for compositing other images in postproduction. I suggest that this now ubiquitous technique suspends images from screens. Through a relational analysis of colour, and focusing on Perry’s work, I draw upon ways in which the blankness of chroma screens can be made to gesture a different enactment of race – ‘blackness’ as productive difference. Such gesture in postmedia entails the circulation and transference of social and cultural setting. These two gestures of screens highlight multiple dimensions of the relations of postmedial screens beyond that of framed images, offering us ways to be attentive to enactments of screens as they continue to gather relevance in our expanded visual setting. As screens multiply, this research suggests alternate ways of conceiving the aesthetics and experience of screens’ persistent medial configurations

    Synchronized Illumination Modulation for Digital Video Compositing

    Get PDF
    Informationsaustausch ist eines der Grundbedürfnisse der Menschen. Während früher dazu Wandmalereien,Handschrift, Buchdruck und Malerei eingesetzt wurden, begann man später, Bildfolgen zu erstellen, die als sogenanntes ”Daumenkino” den Eindruck einer Animation vermitteln. Diese wurden schnell durch den Einsatz rotierender Bildscheiben, auf denen mit Hilfe von Schlitzblenden, Spiegeln oder Optiken eine Animation sichtbar wurde, automatisiert – mit sogenannten Phenakistiskopen,Zoetropen oder Praxinoskopen. Mit der Erfindung der Fotografie begannen in der zweiten Hälfte des 19. Jahrhunderts die ersten Wissenschaftler wie Eadweard Muybridge, Etienne-Jules Marey und Ottomar Anschütz, Serienbildaufnahmen zu erstellen und diese dann in schneller Abfolge, als Film, abzuspielen. Mit dem Beginn der Filmproduktion wurden auch die ersten Versuche unternommen, mit Hilfe dieser neuen Technik spezielle visuelle Effekte zu generieren, um damit die Immersion der Bewegtbildproduktionen weiter zu erhöhen. Während diese Effekte in der analogen Phase der Filmproduktion bis in die achtziger Jahre des 20.Jahrhunderts recht beschränkt und sehr aufwendig mit einem enormen manuellen Arbeitsaufwand erzeugt werden mussten, gewannen sie mit der sich rapide beschleunigenden Entwicklung der Halbleitertechnologie und der daraus resultierenden vereinfachten digitalen Bearbeitung immer mehr an Bedeutung. Die enormen Möglichkeiten, die mit der verlustlosen Nachbearbeitung in Kombination mit fotorealistischen, dreidimensionalen Renderings entstanden, führten dazu, dass nahezu alle heute produzierten Filme eine Vielfalt an digitalen Videokompositionseffekten beinhalten. ...Besides home entertainment and business presentations, video projectors are powerful tools for modulating images spatially as well as temporally. The re-evolving need for stereoscopic displays increases the demand for low-latency projectors and recent advances in LED technology also offer high modulation frequencies. Combining such high-frequency illumination modules with synchronized, fast cameras, makes it possible to develop specialized high-speed illumination systems for visual effects production. In this thesis we present different systems for using spatially as well as temporally modulated illumination in combination with a synchronized camera to simplify the requirements of standard digital video composition techniques for film and television productions and to offer new possibilities for visual effects generation. After an overview of the basic terminology and a summary of related methods, we discuss and give examples of how modulated light can be applied to a scene recording context to enable a variety of effects which cannot be realized using standard methods, such as virtual studio technology or chroma keying. We propose using high-frequency, synchronized illumination which, in addition to providing illumination, is modulated in terms of intensity and wavelength to encode technical information for visual effects generation. This is carried out in such a way that the technical components do not influence the final composite and are also not visible to observers on the film set. Using this approach we present a real-time flash keying system for the generation of perspectively correct augmented composites by projecting imperceptible markers for optical camera tracking. Furthermore, we present a system which enables the generation of various digital video compositing effects outside of completely controlled studio environments, such as virtual studios. A third temporal keying system is presented that aims to overcome the constraints of traditional chroma keying in terms of color spill and color dependency. ..
    corecore