    The DICEMAN description schemes for still images and video sequences

    To address the problem of visual content description, two Description Schemes (DSs), developed within the context of a European ACTS project known as DICEMAN, are presented. The DSs, designed by analogy with well-known tools for document description, describe both the structure and semantics of still images and video sequences. The overall structure of both DSs, including the various sub-DSs and descriptors (Ds) of which they are composed, is described. In each case, the hierarchical sub-DS for describing structure can be constructed using automatic (or semi-automatic) image/video analysis tools. The hierarchical sub-DSs for describing the semantics, however, are constructed by a user. The integration of the two DSs into a video indexing application currently under development in DICEMAN is also briefly described. Peer reviewed. Postprint (published version).

    Smart environment monitoring through micro unmanned aerial vehicles

    In recent years, improvements in the flight time, automatic control, and remote transmission of small-scale Unmanned Aerial Vehicles (UAVs) have promoted the development of a wide range of practical applications. In aerial video surveillance, monitoring broad areas remains challenging because several tasks, including mosaicking, change detection, and object detection, must be performed in real time. In this thesis work, a small-scale UAV-based vision system that maintains regular surveillance over target areas is proposed. The system works in two modes. The first mode monitors an area of interest over several flights. During the first flight, it creates an incremental geo-referenced mosaic of the area and classifies all the known elements (e.g., persons) found on the ground using a previously trained, improved Faster R-CNN architecture. In subsequent reconnaissance flights, the system searches for any changes (e.g., disappearance of persons) in the mosaic using an algorithm based on histogram equalization and RGB-Local Binary Pattern (RGB-LBP). If changes are found, the mosaic is updated. The second mode performs real-time classification, again using the improved Faster R-CNN model, which is useful for time-critical operations. Thanks to several design features, the system works in real time and performs mosaicking and change detection at low altitude, thus allowing the classification of even small objects. The proposed system was tested on the whole set of challenging video sequences contained in the UAV Mosaicking and Change Detection (UMCD) dataset and on other public datasets. Evaluation with well-known performance metrics showed remarkable results in mosaic creation and updating, as well as in change detection and object detection.
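
    As a rough illustration of the Local Binary Pattern idea named in the abstract above, the sketch below computes the basic 8-neighbour LBP code for the centre of a single 3 × 3 patch. The function name `lbp_code`, the clockwise neighbour order, and the `>=` comparison are assumptions chosen for illustration; the thesis's RGB-LBP variant (presumably applied per colour channel) and its change-detection logic are not reproduced here.

```python
def lbp_code(patch):
    """Basic 8-neighbour LBP code for the centre pixel of a 3x3 patch.

    Assumed conventions (hypothetical, not taken from the thesis):
    neighbours are visited clockwise starting at the top-left pixel,
    a bit is set when the neighbour is >= the centre value, and the
    first neighbour visited is the least significant bit.
    """
    if len(patch) != 3 or any(len(row) != 3 for row in patch):
        raise ValueError("patch must be 3x3")
    center = patch[1][1]
    # Clockwise ring of (row, col) offsets around the centre pixel.
    ring = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(ring):
        if patch[r][c] >= center:
            code |= 1 << bit
    return code


if __name__ == "__main__":
    example = [
        [5, 9, 1],
        [4, 6, 7],
        [2, 3, 8],
    ]
    print(lbp_code(example))
```

    Comparing histograms of such codes between two registered images is one common way to flag changed regions, since the codes depend on local intensity ordering rather than absolute brightness.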

    A middleware for a large array of cameras

    Large arrays of cameras are increasingly being employed to produce the high-quality image sequences needed for motion analysis research. This creates the logistical problem of coordinating and controlling a large number of cameras. In this paper, we use a lightweight multi-agent system to coordinate such camera arrays. The agent framework provides more than a remote sensor access API: it allows reconfigurable and transparent access to cameras, as well as to software agents capable of intelligent processing. Furthermore, it eases maintenance by encouraging code reuse. Additionally, our agent system includes an automatic discovery mechanism at startup and multiple language bindings. Performance tests showed the lightweight nature of the framework while validating its correctness and scalability. Two different camera agents were implemented to provide access to a large array of distributed cameras. Correct operation of these camera agents was confirmed via several image processing agents.

    Human mobility monitoring in very low resolution visual sensor network

    This paper proposes an automated system for monitoring mobility patterns using a network of very low resolution visual sensors (30 × 30 pixels). The use of very low resolution sensors reduces privacy concerns, cost, computational requirements, and power consumption. The core of our proposed system is a robust people tracker that uses low resolution videos provided by the visual sensor network. The distributed processing architecture of our tracking system allows all image processing tasks to be done on the digital signal controller in each visual sensor. In this paper, we experimentally show that reliable tracking of people is possible using very low resolution imagery. We also compare the performance of our tracker against a state-of-the-art tracking method and show that our method outperforms it. Moreover, the mobility statistics of tracks, such as total distance traveled and average speed derived from trajectories, are compared with those derived from ground truth given by Ultra-Wide Band sensors. The results of this comparison show that the trajectories from our system are accurate enough to obtain useful mobility statistics.
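
    The mobility statistics mentioned in the abstract above (total distance traveled and average speed) can be derived from a trajectory in a few lines. The following is a minimal sketch, assuming a track is a time-ordered list of `(t_seconds, x_metres, y_metres)` samples; the function name `mobility_stats` and this data layout are hypothetical and are not taken from the paper.

```python
import math


def mobility_stats(track):
    """Return (total_distance, average_speed) for one trajectory.

    `track` is assumed to be a list of (t_seconds, x_metres, y_metres)
    tuples sorted by time. This layout is an illustrative assumption.
    """
    if len(track) < 2:
        return 0.0, 0.0
    # Sum the straight-line distance between consecutive samples.
    distance = sum(
        math.hypot(x1 - x0, y1 - y0)
        for (_, x0, y0), (_, x1, y1) in zip(track, track[1:])
    )
    duration = track[-1][0] - track[0][0]
    average_speed = distance / duration if duration > 0 else 0.0
    return distance, average_speed


if __name__ == "__main__":
    walk = [(0, 0.0, 0.0), (1, 3.0, 4.0), (2, 3.0, 4.0), (4, 6.0, 8.0)]
    total, speed = mobility_stats(walk)
    print(f"distance = {total} m, average speed = {speed} m/s")
```

    Running the same function over tracker output and over ground-truth positions gives directly comparable numbers, which is the kind of comparison the abstract describes.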

    ClusterNet: Detecting Small Objects in Large Scenes by Exploiting Spatio-Temporal Information

    Object detection in wide area motion imagery (WAMI) has drawn the attention of the computer vision research community for a number of years. WAMI poses a number of unique challenges, including extremely small object sizes, both sparse and densely-packed objects, and extremely large search spaces (large video frames). Nearly all state-of-the-art methods in WAMI object detection report that appearance-based classifiers fail on this challenging data and instead rely almost entirely on motion information in the form of background subtraction or frame-differencing. In this work, we experimentally verify the failure of appearance-based classifiers in WAMI, such as Faster R-CNN and a heatmap-based fully convolutional neural network (CNN), and propose a novel two-stage spatio-temporal CNN which effectively and efficiently combines both appearance and motion information to significantly surpass the state-of-the-art in WAMI object detection. To reduce the large search space, the first stage (ClusterNet) takes in a set of extremely large video frames, combines the motion and appearance information within the convolutional architecture, and proposes regions of objects of interest (ROOBI). These ROOBI can contain anything from a single object to clusters of several hundred objects due to the large video frame size and varying object density in WAMI. The second stage (FoveaNet) then estimates the centroid location of all objects in a given ROOBI simultaneously via heatmap estimation. The proposed method exceeds state-of-the-art results on the WPAFB 2009 dataset by 5-16% for moving objects and nearly 50% for stopped objects, and is the first proposed method in wide area motion imagery to detect completely stationary objects. Comment: Main paper is 8 pages. Supplemental section contains a walk-through of our method (using a qualitative example) and qualitative results for the WPAFB 2009 dataset.
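
    For readers unfamiliar with the motion cue that the abstract above says earlier WAMI methods rely on, here is a minimal sketch of plain frame-differencing on two grayscale frames stored as nested lists. It is not the ClusterNet or FoveaNet method; the function name `frame_difference` and the default threshold of 25 are assumptions made for illustration.

```python
def frame_difference(prev_frame, curr_frame, threshold=25):
    """Return a binary motion mask from two equally sized grayscale frames.

    A pixel is marked 1 when its absolute intensity change exceeds
    `threshold` (an arbitrary illustrative value), otherwise 0.
    """
    if len(prev_frame) != len(curr_frame):
        raise ValueError("frames must have the same number of rows")
    mask = []
    for row_prev, row_curr in zip(prev_frame, curr_frame):
        if len(row_prev) != len(row_curr):
            raise ValueError("frames must have the same number of columns")
        mask.append(
            [1 if abs(c - p) > threshold else 0 for p, c in zip(row_prev, row_curr)]
        )
    return mask


if __name__ == "__main__":
    previous = [[10, 10, 10], [10, 10, 10]]
    current = [[10, 200, 10], [10, 10, 40]]
    for row in frame_difference(previous, current):
        print(row)
```

    A mask like this only responds to pixels that changed between frames, which is why purely motion-based detectors miss objects that have stopped, the case the abstract highlights.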

    Realtime object extraction and tracking with an active camera using image mosaics

    Moving object extraction plays a key role in applications such as object-based videoconferencing and surveillance. The difficulty of moving object segmentation lies in the fact that physical objects are normally not homogeneous with respect to low-level features, so it is usually hard to segment them accurately and efficiently. Object segmentation based on prestored background information has proved to be effective and efficient in several applications, such as videophone, video conferencing, and surveillance. Previous work, however, concentrated mainly on object segmentation with a static camera and a stationary background. In this paper, we propose a robust and fast segmentation algorithm and a reliable tracking strategy that do not require the shape of the object to be known in advance. The proposed system can extract the foreground from the background in real time and track the moving object with an active (pan-tilt) camera such that the moving object always stays around the center of the image. Department: Electrical Engineering.

    Visualization and Correction of Automated Segmentation, Tracking and Lineaging from 5-D Stem Cell Image Sequences

    Results: We present an application that enables the quantitative analysis of multichannel 5-D (x, y, z, t, channel) and large montage confocal fluorescence microscopy images. The image sequences show stem cells together with blood vessels, enabling quantification of the dynamic behaviors of stem cells in relation to their vascular niche, with applications in developmental and cancer biology. Our application automatically segments, tracks, and lineages the image sequence data and then allows the user to view and edit the results of automated algorithms in a stereoscopic 3-D window while simultaneously viewing the stem cell lineage tree in a 2-D window. Using the GPU to store and render the image sequence data enables a hybrid computational approach. An inference-based approach utilizing user-provided edits to automatically correct related mistakes executes interactively on the system CPU while the GPU handles 3-D visualization tasks. Conclusions: By exploiting commodity computer gaming hardware, we have developed an application that can be run in the laboratory to facilitate rapid iteration through biological experiments. There is a pressing need for visualization and analysis tools for 5-D live cell image data. We combine accurate unsupervised processes with an intuitive visualization of the results. Our validation interface allows for each data set to be corrected to 100% accuracy, ensuring that downstream data analysis is accurate and verifiable. Our tool is the first to combine all of these aspects, leveraging the synergies obtained by utilizing validation information from stereo visualization to improve the low-level image processing tasks. Comment: BioVis 2014 conference.