56 research outputs found

    The Pierre Auger Observatory IV: Operation and Monitoring

    Technical reports on the operation and monitoring of the Pierre Auger Observatory. Comment: Contributions to the 32nd International Cosmic Ray Conference, Beijing, China, August 2011

    Perception Intelligence Integrated Vehicle-to-Vehicle Optical Camera Communication.

    Ubiquitous use of cameras and LEDs in modern road and aerial vehicles opens up endless opportunities for novel applications in intelligent machine navigation, communication, and networking. To this end, in this thesis work, we hypothesize the benefit of dual-mode usage of vehicular built-in cameras through novel machine perception capabilities combined with optical camera communication (OCC). The current conception of understanding line-of-sight (LOS) scenery centres on detecting objects, events, and road situations. The idea of blending non-line-of-sight (NLOS) information with LOS information to achieve a virtual see-through vision, however, is new. This improves assistive driving performance by enabling a machine to see beyond occlusion. Another aspect of OCC in the vehicular setting is understanding the nature of mobility and its impact on optical communication channel quality. The research questions gathered from car-to-car mobility modelling and from evaluating a working OCC channel also carry over to aerial vehicle scenarios such as drone-to-drone OCC. The aim of this thesis is to answer the research questions along these new application domains, particularly: (i) how to enable virtual see-through perception in a car assistance system that alerts the human driver to visible and invisible critical driving events to help them drive more safely; (ii) how transmitter and receiver cars behave while in motion, and the overall channel performance of OCC under mobility; (iii) how to help rescue lost Unmanned Aerial Vehicles (UAVs) through coordinated localization fusing OCC and WiFi; and (iv) how to model and simulate an in-field drone swarm operation to design and validate UAV coordinated localization for a group of position-distressed drones. To this end, this thesis presents the end-to-end system design, proposes novel algorithms to solve the challenges in applying such a system, and reports evaluation results from experimentation and/or simulation.
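The receive side of an OCC link can be pictured with a small sketch. The following is a minimal, hypothetical illustration (not the thesis's actual pipeline): it assumes the transmitter LED has already been localised to a fixed region of interest (ROI) in each frame, that the transmitter uses simple on-off keying, and that the camera oversamples each bit period; all function names and parameters are invented for illustration.

```python
import numpy as np

def decode_ook(roi_means, samples_per_bit, threshold=None):
    """Recover on-off-keyed bits from per-frame LED ROI mean intensities."""
    roi_means = np.asarray(roi_means, dtype=float)
    if threshold is None:
        # Midpoint between the darkest and brightest observed levels.
        threshold = 0.5 * (roi_means.min() + roi_means.max())
    levels = roi_means > threshold
    n_bits = len(levels) // samples_per_bit
    # Majority vote within each bit period to tolerate frame jitter.
    return [int(levels[i * samples_per_bit:(i + 1) * samples_per_bit].mean() > 0.5)
            for i in range(n_bits)]

# Example: two frames per bit; bright frames around 90, dark frames around 10.
print(decode_ook([10, 12, 90, 88, 11, 9, 95, 91], samples_per_bit=2))  # [0, 1, 0, 1]
```

In a mobile vehicular setting the ROI would drift between frames, which is where a perception capability such as transmitter tracking would help keep the ROI locked while the channel is decoded.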

    Table tennis event detection and classification

    It is well understood that multiple video cameras and computer vision (CV) technology can be used in sport for match officiating, statistics and player performance analysis. A review of the literature reveals a number of existing solutions, both commercial and theoretical, within this domain. However, these solutions are expensive and often complex in their installation. The hypothesis for this research states that by considering only changes in ball motion, automatic event classification is achievable with low-cost monocular video recording devices, without the need for 3-dimensional (3D) positional ball data and representation. The focus of this research is a rigorous empirical study of low-cost, single consumer-grade video camera solutions applied to table tennis, confirming that monocular CV-detected ball location data contains sufficient information to enable key match-play events to be recognised and measured. In total, a library of 276 event-based video sequences, using a range of recording hardware, was produced for this research. The research has four key considerations: i) an investigation into an effective recording environment with minimum configuration and calibration, ii) the selection and optimisation of a CV algorithm to detect the ball from the resulting single-source video data, iii) validation of the accuracy of the 2-dimensional (2D) CV data for motion change detection, and iv) the data requirements and processing techniques necessary to automatically detect changes in ball motion and match those to match-play events. Throughout the thesis, table tennis has been chosen as the example sport for observational and experimental analysis since it offers a number of specific CV challenges due to the relatively high ball speed (in excess of 100 km/h) and small ball size (40 mm in diameter). Furthermore, the inherent rules of table tennis show potential for a monocular-based event classification vision system. As the initial stage, a proposed optimum location and configuration of the single camera is defined. Next, the selection of a CV algorithm is critical in obtaining usable ball motion data. It is shown in this research that segmentation processes vary in their ball detection capabilities and location outputs, which ultimately affects the ability of automated event detection and decision-making solutions. Therefore, a comparison of CV algorithms is necessary to establish confidence in the accuracy of the derived location of the ball. As part of the research, a CV software environment has been developed to allow robust, repeatable and direct comparisons between different CV algorithms. An event-based method of evaluating the success of a CV algorithm is proposed. Comparison of CV algorithms is made against the novel Efficacy Metric Set (EMS), producing a measurable Relative Efficacy Index (REI). Within the context of this low-cost, single-camera ball trajectory and event investigation, experimental results show that the Horn-Schunck optical flow algorithm, with a REI of 163.5, is the most successful method when compared to a discrete selection of CV detection and extraction techniques gathered from the literature review. Furthermore, evidence-based data from the REI also suggest switching to the Canny edge detector (a REI of 186.4) for segmentation of the ball when in close proximity to the net.
In addition to, and in support of, the data generated from the CV software environment, a novel method is presented for producing simultaneous data from 3D marker-based recordings, reduced to 2D and compared directly to the CV output to establish comparative time-resolved data for the ball location. It is proposed here that a continuous scale factor, based on the known dimensions of the ball, is incorporated at every frame. Using this method, comparison results show a mean accuracy of 3.01 mm when applied to a selection of nineteen video sequences and events. This tolerance is within 10% of the diameter of the ball and accounted for by the limits of image resolution. Further experimental results demonstrate the ability to identify a number of match-play events from a monocular image sequence using a combination of the suggested optimum algorithm and ball motion analysis methods. The results show a promising application of 2D-based CV processing to match-play event classification, with an overall success rate of 95.9%. The majority of failures occur when the ball, during returns and services, is partially occluded by either the player or the racket, due to the inherent problem of using a monocular recording device. Finally, the thesis proposes further research and extensions for developing and implementing monocular-based CV processing of motion-based event analysis and classification in a wider range of applications.
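The thesis's central hypothesis is that changes in 2D ball motion alone carry enough information to recognise match-play events. As a rough, hypothetical sketch of that idea (not the EMS/REI pipeline described above), the following flags frames where a tracked ball's velocity component reverses sign, a crude proxy for table bounces (vertical reversal) and racket strikes (horizontal reversal):

```python
import numpy as np

def detect_motion_changes(points, min_speed=2.0):
    """Return indices where the ball's 2-D velocity reverses direction.

    points: (N, 2) array of per-frame ball centres in image coordinates,
    e.g. the output of a CV ball-detection stage.
    """
    pts = np.asarray(points, dtype=float)
    v = np.diff(pts, axis=0)                 # per-frame velocity estimates
    speed = np.linalg.norm(v, axis=1)
    events = []
    for i in range(1, len(v)):
        if speed[i] < min_speed or speed[i - 1] < min_speed:
            continue                         # ignore near-stationary jitter
        if v[i, 0] * v[i - 1, 0] < 0 or v[i, 1] * v[i - 1, 1] < 0:
            events.append(i)                 # a velocity component flipped sign
    return events
```

A real classifier would additionally need to smooth detections, handle occlusion gaps, and map each motion change to a specific event type, which is where the thesis's event library and efficacy metrics come in.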

    Intelligent system for interaction with virtual characters based on volumetric sensors

    Master's dissertation, Electrical and Electronic Engineering, Instituto Superior de Engenharia, Universidade do Algarve, 2015. Technology has been developed to help us complete our daily tasks or to increase our productivity in them. Many of the machines built have been progressively refined to operate more like a human being, using a wide variety of sensors to do so. One of the most challenging problems technology has faced is how to give a machine the ability that an "animal" has to perceive the world through its visual system. One solution is to equip the machine with intelligent systems that use computer vision. A great help can come from the machine's perception of depth, which makes the detection and understanding of objects in an image less complex. With the arrival of volumetric (three-dimensional, 3D) sensors on the consumer market, development in this scientific area has grown, allowing their integration into most devices, such as computers or mobile devices, at a very competitive price. Volumetric sensors can be used in the most varied areas: although they first appeared in video games, they extend to video, 3D modelling, interfaces, games, and virtual and augmented reality. This dissertation focuses essentially on the development of (intelligent) systems based on volumetric sensors (in this case the Microsoft Kinect) for interaction with avatars or films. For video applications, a solution was developed in which a 3D sensor helps a user follow a narrative that starts as soon as the user is detected, with the events of the video changing according to predetermined actions of the user. The user can thus change the course of the story by changing position or performing a gesture. This solution is demonstrated using rear projection, with the additional possibility of being presented in hologram mode in a scaled approach. What is described in the previous paragraph can also be applied in a more commercial solution. To that end, a highly configurable application was developed, which can be adjusted (in visual terms) to the needs of different companies. The graphical environment is accompanied by an avatar or a (previously recorded) video that interacts with a user through gestures, giving a more realistic sensation through the use of holography. While the user interacts with the installation, all of their movements and interactions are recorded so that statistics can be compiled, in order to identify the content of greatest interest as well as the physical areas with the most interaction. Additionally, the user can have a full-body or ID-style photograph taken, which can be offered on the company's promotional products. Owing to the short interaction range offered by a sensor of this type (Kinect), the possibility of combining several sensors was also developed, four to cover the 180 degrees in front of the installation or eight to cover the 360 degrees around it, so that users can be detected by any of them and are not lost when they cross into another sensor's zone, or even when they leave the sensors' field of view and return later. Although these sensors are best known for interaction with virtual games, real, physical games can also benefit from this type of sensor.
On this last point, an augmented reality tool for snooker or billiards is presented. In this application, a 3D sensor placed above the table captures the playing area, which is then processed to detect the balls, the cue and the cushions. Whenever possible, this detection uses the third dimension (depth) offered by these sensors, making it more robust, for example, to changes in lighting conditions. From these data, the ball's trajectory is then predicted using vector algebra, and the result is projected onto the table.
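The trajectory prediction mentioned above can be sketched with basic vector algebra. The following is a simplified, hypothetical version that assumes an obstacle-free rectangular table, perfectly specular cushion reflections, and no spin or friction; all names and parameters are illustrative:

```python
import numpy as np

def predict_cushion_contacts(pos, vel, table_w, table_h, r, n_bounces=3):
    """Trace a ball's straight-line path, reflecting it off the cushions.

    pos: (x, y) ball centre in metres; vel: direction of travel;
    r: ball radius. Returns the successive cushion contact points.
    """
    pos = np.array(pos, dtype=float)
    vel = np.array(vel, dtype=float)
    lo = np.array([r, r])                          # reachable band for the centre
    hi = np.array([table_w - r, table_h - r])
    contacts = []
    for _ in range(n_bounces):
        t_hit, axis_hit = np.inf, None
        for axis in range(2):                      # which cushion is hit first?
            if vel[axis] > 0:
                t = (hi[axis] - pos[axis]) / vel[axis]
            elif vel[axis] < 0:
                t = (lo[axis] - pos[axis]) / vel[axis]
            else:
                continue
            if 0 < t < t_hit:
                t_hit, axis_hit = t, axis
        if axis_hit is None:
            break                                  # ball is not moving
        pos = pos + t_hit * vel
        vel[axis_hit] = -vel[axis_hit]             # specular reflection
        contacts.append(pos.copy())
    return contacts
```

The projector can then draw the predicted polyline onto the table surface; the depth-based detection described above would supply `pos` and `vel`.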

    Blickpunktabhängige Computergraphik

    Contemporary digital displays feature multi-million pixels at ever-increasing refresh rates. Reality, on the other hand, provides us with a view of the world that is continuous in space and time. The discrepancy between viewing the physical world and its sampled depiction on digital displays gives rise to perceptual quality degradations. By measuring or estimating where we look, gaze-contingent algorithms aim to exploit the way we visually perceive in order to remedy visible artifacts. This dissertation presents a variety of novel gaze-contingent algorithms and the respective perceptual studies. Chapters 4 and 5 present methods to boost the perceived visual quality of conventional video footage when viewed on commodity monitors or projectors. Chapter 6 describes a novel head-mounted display with real-time gaze tracking; the device enables a large variety of applications in the context of Virtual Reality (VR) and Augmented Reality (AR). Using the gaze-tracking VR headset, a novel gaze-contingent rendering method is described in Chapter 7: shading quality is analysed and adapted for every image pixel in real time on the basis of a perceptual model, which greatly reduces the computational effort of shading virtual worlds, potentially to a fraction of the full cost. The described methods and studies show that gaze-contingent algorithms are able to improve the quality of displayed images and videos, or to reduce the computational effort of image generation, while the display quality perceived by the user does not change.
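To make the gaze-contingent shading idea concrete, here is a minimal sketch of a per-pixel shading-rate map that falls off with angular distance from the tracked gaze point. The inverse-eccentricity falloff and all constants are illustrative assumptions, not the dissertation's perceptual model:

```python
import numpy as np

def shading_rate_map(width, height, gaze_px, full_rate_deg=5.0,
                     px_per_deg=40.0, min_rate=0.125):
    """Relative shading rate per pixel for gaze-contingent rendering.

    Pixels within `full_rate_deg` of eccentricity are shaded at full
    rate (1.0); beyond that the rate decays inversely with eccentricity,
    loosely mimicking the eye's acuity falloff, floored at `min_rate`.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    dist_px = np.hypot(xs - gaze_px[0], ys - gaze_px[1])
    ecc_deg = dist_px / px_per_deg               # small-angle approximation
    rate = full_rate_deg / np.maximum(ecc_deg, full_rate_deg)
    return np.clip(rate, min_rate, 1.0)

# Example: full quality around the gaze point, reduced toward the periphery.
rates = shading_rate_map(1920, 1080, gaze_px=(960, 540))
```

A renderer could use such a map to drive variable-rate shading or to skip shading samples in the periphery, which is the source of the computational savings the dissertation reports.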

    Techniques For Video Surveillance: Automatic Video Editing And Target Tracking

    Typical video surveillance control rooms include a collection of monitors connected to a large camera network, with many fewer operators than monitors. The cameras are usually cycled through the monitors, with provisions for manual override to display a camera of interest. In addition, cameras are often provided with pan, tilt and zoom capabilities to capture objects of interest. In this dissertation, we develop novel ways to control these limited resources by focusing them on acquiring and visualizing the critical information contained in the surveyed scenes. First, we consider the problem of cropping surveillance videos. This process chooses a trajectory that a small sub-window can take through the video, selecting the most important parts of the video for display on a smaller monitor area. We model the information content of the video simply, by whether the image changes at each pixel. Then we show that we can find the globally optimal trajectory for a cropping window by using a shortest path algorithm. In practice, we can speed up this process without affecting the results by stitching together trajectories computed over short intervals, which also reduces system latency. We then show that we can use a second shortest path formulation to find good cuts from one trajectory to another, improving coverage of interesting events in the video. We describe additional techniques to improve the quality and efficiency of the algorithm, and show results on surveillance videos. Second, we turn our attention to the problem of tracking multiple agents moving amongst obstacles, using multiple cameras. Given an environment with obstacles and many people moving through it, we construct a separate narrow-field-of-view video for as many people as possible, by stitching together video segments from multiple cameras over time. We employ a novel approach to assign cameras to people as a function of time, with camera switches when needed. The problem is modeled as a bipartite graph and the solution corresponds to a maximum matching. As people move, the solution is efficiently updated by computing an augmenting path rather than by solving for a new matching. This reduces computation time by an order of magnitude. In addition, solving for the shortest augmenting path minimizes the number of camera switches at each update. When not all people can be covered by the available cameras, we cluster as many people as possible into small groups, then assign cameras to groups using a minimum-cost matching algorithm. We test our method using numerous runs from different simulators. Third, we relax the restriction of using fixed cameras to track agents. In particular, we study the problem of maintaining a good view of an agent moving amongst obstacles with a moving camera, possibly fixed to a pursuing robot. This is known as a two-player pursuit-evasion game. Using a mesh discretization of the environment, we develop an algorithm that determines, given the initial positions of both pursuer and evader, whether the evader can adopt any moving strategy that takes it out of sight of the pursuer, and thus win the game. If it is decided that there is no winning strategy for the evader, we also compute a pursuer's trajectory that keeps the evader within sight, for every trajectory that the evader can take. We study the effect of varying the mesh size on both the efficiency and accuracy of our algorithm. Finally, we present earlier work in the domain of anomaly detection: based on modeling co-occurrence statistics of moving objects in time and space, experiments on synthetic data are described in which the time intervals and locations of unusual activity are identified.
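The camera-to-person assignment step can be illustrated with a standard augmenting-path (Kuhn's) bipartite matching sketch. This is a simplified stand-in, assuming a plain dictionary from each person to the cameras that can currently view them, and omitting the incremental updates, grouping, and minimum-cost stages described above:

```python
def _augment(person, adj, match_cam, seen):
    """Try to extend the matching with an augmenting path from `person`."""
    for cam in adj[person]:
        if cam in seen:
            continue
        seen.add(cam)
        # Take the camera if it is free, or re-route its current person.
        if match_cam.get(cam) is None or _augment(match_cam[cam], adj, match_cam, seen):
            match_cam[cam] = person
            return True
    return False

def assign_cameras(adj):
    """Maximum matching of people to cameras via augmenting paths.

    adj: dict person -> list of cameras with an unobstructed view.
    Returns a dict camera -> assigned person.
    """
    match_cam = {}
    for person in adj:
        _augment(person, adj, match_cam, set())
    return dict(match_cam)

# Example: three people, two cameras; at most two people get a camera.
print(assign_cameras({"p1": ["c1"], "p2": ["c1", "c2"], "p3": ["c2"]}))
```

When a person moves, only the affected vertices need a new augmenting-path search rather than a full re-solve, which is the order-of-magnitude update speed-up claimed above.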

    Irish Machine Vision and Image Processing Conference Proceedings 2017


    Pixelated Domes: Cinematic Code Changes through a Frank Lloyd Wright Lens

    Panoramic 360-degree documentary videos continue to saturate the visual landscape. As practitioners experiment with a new genre, an understanding of its meaning-making awaits the academic and marketplace landscapes. The new media journey of 360-degree documentary storytelling is ripe for media archaeologists to explore. New media scholar Lev Manovich (2016) believes we are witnessing "the new emergence of a cultural metalanguage, something that will be at least as significant as the printed word and cinema before it" (p. 49). Considering the meta-development of this new media genre, my dissertation seeks to discuss the historical roots of the panoramic image, define 360-degree Cinematic Virtual Reality (CVR) documentary video, establish production distinctions between 360-degree CVR and two-dimensional documentary video, and reveal the spatial cognitive abilities of 360-degree documentary video. The purpose of this dissertation study is to establish a media-archaeological context for the 360-degree image and to reveal the development of new cinematic code variations between 360 CVR modalities and the two-dimensional documentary form. The theoretical framework developed within this study will inform current and future 360-degree documentary narrative engagement practices. Secondly, this project seeks to evaluate spatial cognition levels when viewing a Frank Lloyd Wright walking tour through 360 CVR modalities and to examine the influence this has on narrative engagement compared to the traditional two-dimensional documentary form.

    Videos in Context for Telecommunication and Spatial Browsing

    The research presented in this thesis explores the use of videos embedded in panoramic imagery to transmit spatial and temporal information describing remote environments and their dynamics. Virtual environments (VEs) through which users can explore remote locations are rapidly emerging as a popular medium for presence and remote collaboration. However, capturing visual representations of locations for use in VEs is usually a tedious process that requires either manual modelling of environments or the employment of specific hardware. Capturing environment dynamics is not straightforward either, and is usually performed with specific tracking hardware. Similarly, browsing large unstructured video collections with available tools is difficult, as the abundance of spatial and temporal information makes them hard to comprehend. On a spectrum between 3D VEs and 2D images, panoramas lie in between: they offer the accessibility of 2D images while preserving the surrounding-environment representation of 3D VEs. For this reason, panoramas are an attractive basis for videoconferencing and browsing tools, as they can relate several videos temporally and spatially. This research explores methods to acquire, fuse, render and stream data coming from heterogeneous cameras with the help of panoramic imagery. Three distinct but interrelated questions are addressed. First, the thesis considers how spatially localised video can be used to increase the spatial information transmitted during video-mediated communication, and whether this improves the quality of communication. Second, the research asks whether videos in panoramic context can convey the spatial and temporal information of a remote place and the dynamics within it, and whether this improves users' performance in tasks that require spatio-temporal thinking. Finally, the thesis considers whether display type has an impact on reasoning about events within videos in panoramic context. These research questions were investigated over three experiments, covering scenarios common to computer-supported cooperative work and video browsing. To support the investigation, two distinct video+context systems were developed. The first telecommunication experiment compared our videos-in-context interface with fully panoramic video and conventional webcam video conferencing in an object placement scenario. The second experiment investigated the impact of videos in panoramic context on the quality of spatio-temporal thinking during localization tasks. To support the experiment, a novel interface to video collections in panoramic context was developed and compared with common video-browsing tools. The final experimental study investigated the impact of display type on reasoning about events, exploring three adaptations of our video-collection interface to three display types. The overall conclusion is that videos in panoramic context offer a valid solution to the spatio-temporal exploration of remote locations. Our approach presents a richer visual representation in terms of space and time than standard tools, showing that providing panoramic context to video collections makes spatio-temporal tasks easier. To this end, videos in context are a suitable alternative to more complex, and often expensive, solutions. These findings are beneficial to many applications, including teleconferencing, virtual tourism and remote assistance.
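A core operation in such video+context systems is anchoring each video at the panorama location that corresponds to its viewing direction. The sketch below is a deliberately simplified assumption (a single equirectangular panorama, a known relative orientation, and a y-up/z-forward coordinate convention), not the thesis's actual registration method:

```python
import numpy as np

def direction_to_equirect(direction, pano_w, pano_h):
    """Map a 3-D viewing direction to equirectangular pixel coordinates.

    direction: (x, y, z) in the panorama's frame (y up, z forward).
    Returns (u, v), the panorama pixel where a video looking along
    `direction` would be centred.
    """
    d = np.asarray(direction, dtype=float)
    x, y, z = d / np.linalg.norm(d)
    lon = np.arctan2(x, z)                     # yaw in [-pi, pi], 0 = forward
    lat = np.arcsin(np.clip(y, -1.0, 1.0))     # pitch in [-pi/2, pi/2]
    u = (lon / (2.0 * np.pi) + 0.5) * pano_w
    v = (0.5 - lat / np.pi) * pano_h
    return u, v

# Example: a camera looking straight ahead lands at the panorama centre.
print(direction_to_equirect((0.0, 0.0, 1.0), 4096, 2048))  # (2048.0, 1024.0)
```

Placing video frames this way preserves the spatial relationships between several cameras, which is what lets viewers relate events temporally and spatially in the panoramic context.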

    Advances in Object and Activity Detection in Remote Sensing Imagery

    The recent revolution in deep learning has enabled considerable development in the fields of object and activity detection. Visual object detection tries to find objects of target classes with precise localisation in an image and to assign each object instance a corresponding class label. Activity recognition, in turn, aims to determine the actions or activities of an agent or group of agents based on sensor or video observation data. Detecting, identifying, tracking, and understanding the behaviour of objects through images and videos taken by various cameras is a very important and challenging problem. Together, object and activity recognition in imaging data captured by remote sensing platforms is a highly dynamic and challenging research topic. During the last decade, there has been significant growth in the number of publications in the field of object and activity recognition. In particular, many researchers have proposed application domains that identify objects and their specific behaviours from airborne and spaceborne imagery. This Special Issue includes papers that explore novel and challenging topics for object and activity detection in remote sensing images and videos acquired by diverse platforms.
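As a small concrete anchor for the detection task described above, the following computes intersection-over-union (IoU), the standard overlap criterion for scoring a predicted localisation against a ground-truth box in detection benchmarks; the common IoU >= 0.5 match threshold is a convention of the field, not something specific to this Special Issue:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# Example: two half-overlapping unit squares give an IoU of about 0.333.
print(iou((0, 0, 1, 1), (0.5, 0, 1.5, 1)))
```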