7,433 research outputs found

    Mosaics from arbitrary stereo video sequences

    Get PDF
    lthough mosaics are well established as a compact and non-redundant representation of image sequences, their application still suffers from restrictions of the camera motion or has to deal with parallax errors. We present an approach that allows construction of mosaics from arbitrary motion of a head-mounted camera pair. As there are no parallax errors when creating mosaics from planar objects, our approach first decomposes the scene into planar sub-scenes from stereo vision and creates a mosaic for each plane individually. The power of the presented mosaicing technique is evaluated in an office scenario, including the analysis of the parallax error

    From images via symbols to contexts: using augmented reality for interactive model acquisition

    Get PDF
    Systems that perform in real environments need to bind the internal state to externally perceived objects, events, or complete scenes. How to learn this correspondence has been a long standing problem in computer vision as well as artificial intelligence. Augmented Reality provides an interesting perspective on this problem because a human user can directly relate displayed system results to real environments. In the following we present a system that is able to bootstrap internal models from user-system interactions. Starting from pictorial representations it learns symbolic object labels that provide the basis for storing observed episodes. In a second step, more complex relational information is extracted from stored episodes that enables the system to react on specific scene contexts

    Deep-sea image processing

    Get PDF
    High-resolution seafloor mapping often requires optical methods of sensing, to confirm interpretations made from sonar data. Optical digital imagery of seafloor sites can now provide very high resolution and also provides additional cues, such as color information for sediments, biota and divers rock types. During the cruise AT11-7 of the Woods Hole Oceanographic Institution (WHOI) vessel R/V Atlantis (February 2004, East Pacific Rise) visual imagery was acquired from three sources: (1) a digital still down-looking camera mounted on the submersible Alvin, (2) observer-operated 1-and 3-chip video cameras with tilt and pan capabilities mounted on the front of Alvin, and (3) a digital still camera on the WHOI TowCam (Fornari, 2003). Imagery from the first source collected on a previous cruise (AT7-13) to the Galapagos Rift at 86°W was successfully processed and mosaicked post-cruise, resulting in a single image covering area of about 2000 sq.m, with the resolution of 3 mm per pixel (Rzhanov et al., 2003). This paper addresses the issues of the optimal acquisition of visual imagery in deep-seaconditions, and requirements for on-board processing. Shipboard processing of digital imagery allows for reviewing collected imagery immediately after the dive, evaluating its importance and optimizing acquisition parameters, and augmenting acquisition of data over specific sites on subsequent dives.Images from the deepsea power and light (DSPL) digital camera offer the best resolution (3.3 Mega pixels) and are taken at an interval of 10 seconds (determined by the strobe\u27s recharge rate). This makes images suitable for mosaicking only when Alvin moves slowly (≪1/4 kt), which is not always possible for time-critical missions. Video cameras provided a source of imagery more suitable for mosaicking, despite its inferiority in resolution. We discuss required pre-processing and imageenhancement techniques and their influence on the interpretation of mosaic content. An algorithm for determination of camera tilt parameters from acquired imagery is proposed and robustness conditions are discussed

    A cognitive ego-vision system for interactive assistance

    Get PDF
    With increasing computational power and decreasing size, computers nowadays are already wearable and mobile. They become attendant of peoples' everyday life. Personal digital assistants and mobile phones equipped with adequate software gain a lot of interest in public, although the functionality they provide in terms of assistance is little more than a mobile databases for appointments, addresses, to-do lists and photos. Compared to the assistance a human can provide, such systems are hardly to call real assistants. The motivation to construct more human-like assistance systems that develop a certain level of cognitive capabilities leads to the exploration of two central paradigms in this work. The first paradigm is termed cognitive vision systems. Such systems take human cognition as a design principle of underlying concepts and develop learning and adaptation capabilities to be more flexible in their application. They are embodied, active, and situated. Second, the ego-vision paradigm is introduced as a very tight interaction scheme between a user and a computer system that especially eases close collaboration and assistance between these two. Ego-vision systems (EVS) take a user's (visual) perspective and integrate the human in the system's processing loop by means of a shared perception and augmented reality. EVSs adopt techniques of cognitive vision to identify objects, interpret actions, and understand the user's visual perception. And they articulate their knowledge and interpretation by means of augmentations of the user's own view. These two paradigms are studied as rather general concepts, but always with the goal in mind to realize more flexible assistance systems that closely collaborate with its users. This work provides three major contributions. First, a definition and explanation of ego-vision as a novel paradigm is given. Benefits and challenges of this paradigm are discussed as well. Second, a configuration of different approaches that permit an ego-vision system to perceive its environment and its user is presented in terms of object and action recognition, head gesture recognition, and mosaicing. These account for the specific challenges identified for ego-vision systems, whose perception capabilities are based on wearable sensors only. Finally, a visual active memory (VAM) is introduced as a flexible conceptual architecture for cognitive vision systems in general, and for assistance systems in particular. It adopts principles of human cognition to develop a representation for information stored in this memory. So-called memory processes continuously analyze, modify, and extend the content of this VAM. The functionality of the integrated system emerges from their coordinated interplay of these memory processes. An integrated assistance system applying the approaches and concepts outlined before is implemented on the basis of the visual active memory. The system architecture is discussed and some exemplary processing paths in this system are presented and discussed. It assists users in object manipulation tasks and has reached a maturity level that allows to conduct user studies. Quantitative results of different integrated memory processes are as well presented as an assessment of the interactive system by means of these user studies

    Montage: a grid portal and software toolkit for science-grade astronomical image mosaicking

    Full text link
    Montage is a portable software toolkit for constructing custom, science-grade mosaics by composing multiple astronomical images. The mosaics constructed by Montage preserve the astrometry (position) and photometry (intensity) of the sources in the input images. The mosaic to be constructed is specified by the user in terms of a set of parameters, including dataset and wavelength to be used, location and size on the sky, coordinate system and projection, and spatial sampling rate. Many astronomical datasets are massive, and are stored in distributed archives that are, in most cases, remote with respect to the available computational resources. Montage can be run on both single- and multi-processor computers, including clusters and grids. Standard grid tools are used to run Montage in the case where the data or computers used to construct a mosaic are located remotely on the Internet. This paper describes the architecture, algorithms, and usage of Montage as both a software toolkit and as a grid portal. Timing results are provided to show how Montage performance scales with number of processors on a cluster computer. In addition, we compare the performance of two methods of running Montage in parallel on a grid.Comment: 16 pages, 11 figure

    Smart environment monitoring through micro unmanned aerial vehicles

    Get PDF
    In recent years, the improvements of small-scale Unmanned Aerial Vehicles (UAVs) in terms of flight time, automatic control, and remote transmission are promoting the development of a wide range of practical applications. In aerial video surveillance, the monitoring of broad areas still has many challenges due to the achievement of different tasks in real-time, including mosaicking, change detection, and object detection. In this thesis work, a small-scale UAV based vision system to maintain regular surveillance over target areas is proposed. The system works in two modes. The first mode allows to monitor an area of interest by performing several flights. During the first flight, it creates an incremental geo-referenced mosaic of an area of interest and classifies all the known elements (e.g., persons) found on the ground by an improved Faster R-CNN architecture previously trained. In subsequent reconnaissance flights, the system searches for any changes (e.g., disappearance of persons) that may occur in the mosaic by a histogram equalization and RGB-Local Binary Pattern (RGB-LBP) based algorithm. If present, the mosaic is updated. The second mode, allows to perform a real-time classification by using, again, our improved Faster R-CNN model, useful for time-critical operations. Thanks to different design features, the system works in real-time and performs mosaicking and change detection tasks at low-altitude, thus allowing the classification even of small objects. The proposed system was tested by using the whole set of challenging video sequences contained in the UAV Mosaicking and Change Detection (UMCD) dataset and other public datasets. The evaluation of the system by well-known performance metrics has shown remarkable results in terms of mosaic creation and updating, as well as in terms of change detection and object detection

    Performance analysis on color image mosaicing techniques on FPGA

    Get PDF
    Today, the surveillance systems and other monitoring systems are considering the capturing of image sequences in a single frame. The captured images can be combined to get the mosaiced image or combined image sequence. But the captured image may have quality issues like brightness issue, alignment issue (correlation issue), resolution issue, manual image registration issue etc. The existing technique like cross correlation can offer better image mosaicing but faces brightness issue in mosaicing. Thus, this paper introduces two different methods for mosaicing i.e., (a) Sliding Window Module (SWM) based Color Image Mosaicing (CIM) and (b) Discrete Cosine Transform (DCT) based CIM on Field Programmable Gate Array (FPGA). The SWM based CIM adopted for corner detection of two images and perform the automatic image registration while DCT based CIM aligns both the local as well as global alignment of images by using phase correlation approach. Finally, these two methods performances are analyzed by comparing with parameters like PSNR, MSE, device utilization and execution time. From the analysis it is concluded that the DCT based CIM can offers significant results than SWM based CIM
    corecore