    Cognitive visual tracking and camera control

    Cognitive visual tracking is the process of observing and understanding the behaviour of a moving person. This paper presents an efficient solution to extract, in real time, high-level information from an observed scene and to generate the most appropriate commands for a set of pan-tilt-zoom (PTZ) cameras in a surveillance scenario. Such a high-level feedback control loop, which is the main novelty of our work, serves to reduce uncertainties in the observed scene and to maximize the amount of information extracted from it. It is implemented with a distributed camera system that uses SQL tables as virtual communication channels, and Situation Graph Trees for knowledge representation, inference and high-level camera control. A set of experiments in a surveillance scenario shows the effectiveness of our approach and its potential for real applications of cognitive vision.
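    As a rough illustration of the "SQL tables as virtual communication channels" idea, the sketch below shows one way a high-level reasoning module and a PTZ camera process could exchange commands through a shared table. The schema, column names and sqlite back-end are assumptions for the example, not the paper's actual implementation.

        # Minimal sketch: an SQL table acting as a command channel between the
        # high-level reasoning module and a PTZ camera process. Schema and
        # names are illustrative; sqlite3 stands in for the real database.
        import sqlite3

        conn = sqlite3.connect(":memory:")
        conn.execute("""
            CREATE TABLE ptz_commands (
                id        INTEGER PRIMARY KEY AUTOINCREMENT,
                camera_id INTEGER NOT NULL,
                pan       REAL,
                tilt      REAL,
                zoom      REAL,
                consumed  INTEGER DEFAULT 0   -- 0 = pending, 1 = executed
            )
        """)

        def publish_command(camera_id, pan, tilt, zoom):
            """Reasoning module writes a command for one PTZ camera."""
            conn.execute(
                "INSERT INTO ptz_commands (camera_id, pan, tilt, zoom) "
                "VALUES (?, ?, ?, ?)", (camera_id, pan, tilt, zoom))
            conn.commit()

        def poll_commands(camera_id):
            """Camera process fetches and consumes its pending commands."""
            rows = conn.execute(
                "SELECT id, pan, tilt, zoom FROM ptz_commands "
                "WHERE camera_id = ? AND consumed = 0", (camera_id,)).fetchall()
            for cmd_id, *_ in rows:
                conn.execute(
                    "UPDATE ptz_commands SET consumed = 1 WHERE id = ?", (cmd_id,))
            conn.commit()
            return rows

        publish_command(1, pan=30.0, tilt=-5.0, zoom=2.0)
        print(poll_commands(1))   # [(1, 30.0, -5.0, 2.0)]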

    Deep Learning-Based Multiple Object Visual Tracking on Embedded System for IoT and Mobile Edge Computing Applications

    Compute and memory demands of state-of-the-art deep learning methods are still a shortcoming that must be addressed to make them useful at IoT end-nodes. In particular, recent results depict a hopeful prospect for image processing using Convolutional Neural Networks (CNNs), but the gap between software and hardware implementations is already considerable for IoT and mobile edge computing applications due to their high power consumption. This proposal performs low-power, real-time deep learning-based multiple object visual tracking implemented on an NVIDIA Jetson TX2 development kit. It includes a camera and wireless connection capability, and it is battery powered for mobile and outdoor applications. A collection of representative sequences captured with the on-board camera, the dETRUSC video dataset, is used to exemplify the performance of the proposed algorithm and to facilitate benchmarking. The results in terms of power consumption and frame rate demonstrate the feasibility of deep learning algorithms on embedded platforms, although more effort towards joint algorithm and hardware design of CNNs is needed.
    Comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
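    Since the contribution hinges on sustaining real-time throughput within a power budget, the stub below sketches the generic detect-then-track loop together with a frames-per-second measurement. The detector and tracker are placeholders; the authors' actual CNN and tracker are not specified in the abstract.

        # Detect-then-track loop with a simple frame-rate measurement.
        # detect_objects() and update_tracks() are placeholder stubs.
        import time

        def detect_objects(frame):
            """Stub for a CNN detector returning boxes (x, y, w, h)."""
            return [(10, 20, 50, 80)]

        def update_tracks(tracks, detections):
            """Stub tracker associating detections with track IDs."""
            return {i: det for i, det in enumerate(detections)}

        def run(frames):
            tracks, start = {}, time.perf_counter()
            for frame in frames:
                tracks = update_tracks(tracks, detect_objects(frame))
            elapsed = time.perf_counter() - start
            print(f"{len(frames) / elapsed:.1f} fps over {len(frames)} frames")

        run([object()] * 1000)   # dummy frames stand in for camera input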

    DALES: Automated Tool for Detection, Annotation, Labelling and Segmentation of Multiple Objects in Multi-Camera Video Streams

    In this paper, we propose a new software tool called DALES to extract semantic information from multi-view videos based on the analysis of their visual content. Our system is fully automatic and is well suited for multi-camera environments. Once the multi-view video sequences are loaded into DALES, our software performs the detection, counting and segmentation of the visual objects evolving in the provided video streams. These objects of interest are then labelled, and the related frames are annotated with the corresponding semantic content. Moreover, a textual script is automatically generated from the video annotations. The DALES system shows excellent performance in terms of accuracy and computational speed and is robustly designed to ensure view synchronization.
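    The chain the abstract describes (detect, label, annotate, emit a textual script) can be pictured with the short sketch below. All function names and the output format are invented for illustration; DALES' real interfaces are not given in the abstract.

        # Toy version of the detect -> label -> annotate -> script chain.
        # The detector stub and the script format are invented for illustration.
        def detect(frame):
            """Stub detector: labelled object regions for one frame."""
            return [{"label": "person", "bbox": (12, 30, 40, 90)}]

        def annotate_stream(frames, camera_id):
            """Build the textual annotation script for one camera view."""
            lines = []
            for t, frame in enumerate(frames):
                for obj in detect(frame):
                    lines.append(f"cam={camera_id} frame={t} "
                                 f"label={obj['label']} bbox={obj['bbox']}")
            return "\n".join(lines)

        print(annotate_stream([None, None], camera_id=0))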

    Energy-Efficient Lightweight Algorithms for Embedded Smart Cameras: Design, Implementation and Performance Analysis

    An embedded smart camera is a stand-alone unit that not only captures images but also includes a processor, memory and a communication interface. Battery-powered embedded smart cameras introduce many additional challenges since they have very limited resources, such as energy, processing power and memory. When camera sensors are added to an embedded system, the problem of limited resources becomes even more pronounced. Hence, computer vision algorithms running on these camera boards should be lightweight and efficient. This thesis is about designing and developing computer vision algorithms that are aware of, and successfully overcome, the limitations of embedded platforms in terms of power consumption and memory usage. In particular, we are interested in object detection and tracking methodologies and their impact on the performance and battery life of the CITRIC camera (the embedded smart camera employed in this research). This thesis aims to prolong the lifetime of the embedded smart camera platform without affecting the reliability of the system during surveillance tasks. The reader is therefore walked through the whole design process, from development and simulation, through implementation and optimization, to testing and performance analysis. The work presented in this thesis carries out not only software optimization but also hardware-level operations during the stages of object detection and tracking. The performance of the algorithms introduced in this thesis is comparable to state-of-the-art object detection and tracking methods, such as Mixture of Gaussians, eigen-segmentation, and color and coordinate tracking. Unlike the traditional methods, the newly designed algorithms notably reduce memory requirements as well as memory accesses per pixel. To accomplish the proposed goals, this work interconnects different levels of the embedded system architecture to make the platform more efficient in terms of energy and resource savings. Thus, the proposed algorithms are optimized at the API, middleware and hardware levels to access the pixel information of the CMOS sensor directly. Only the required pixels are acquired, in order to reduce unnecessary communication overhead. Experimental results show that, when the architectural capabilities of an embedded platform are exploited, a 41.24% decrease in energy consumption and a 107.2% increase in battery life can be accomplished. Compared to traditional object detection and tracking methods, the proposed work provides an additional 8 hours of continuous processing on 4 AA batteries, increasing the lifetime of the camera to 15.5 hours.
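    The reported battery figures are internally consistent, as a quick check shows: a 107.2% battery-life increase yielding a 15.5-hour lifetime implies a baseline of about 7.5 hours, i.e. a gain of roughly the stated 8 hours.

        # Quick consistency check of the battery figures quoted above.
        total_hours = 15.5          # lifetime with the proposed algorithms
        increase = 1.072            # 107.2% battery-life improvement
        baseline = total_hours / (1 + increase)
        print(f"baseline ≈ {baseline:.2f} h, gain ≈ {total_hours - baseline:.2f} h")
        # prints: baseline ≈ 7.48 h, gain ≈ 8.02 h  (matches the stated ~8 hours)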

    A sub-mW IoT-endnode for always-on visual monitoring and smart triggering

    This work presents a fully-programmable Internet of Things (IoT) visual sensing node that targets sub-mW power consumption in always-on monitoring scenarios. The system features a spatial-contrast 128x64 binary pixel imager with focal-plane processing. The sensor, when working in its lowest power mode (10 µW at 10 fps), provides as output the number of changed pixels. Based on this information, a dedicated camera interface, implemented on a low-power FPGA, wakes up an ultra-low-power parallel processing unit to extract context-aware visual information. We evaluate the smart sensor on three always-on visual triggering application scenarios. Triggering accuracy comparable to RGB image sensors is achieved at nominal lighting conditions, while consuming an average power between 193 µW and 277 µW, depending on context activity. The digital sub-system is extremely flexible, thanks to a fully-programmable digital signal processing engine, yet still achieves 19x lower power consumption compared to MCU-based cameras with significantly lower on-board computing capabilities.
    Comment: 11 pages, 9 figures, submitted to the IEEE IoT Journal.
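    At its core, the smart-triggering mechanism reduces to a threshold test on the changed-pixel count reported by the imager; the sketch below illustrates the idea. The threshold value and the wake-up callback are assumptions for the example, not the paper's actual FPGA logic.

        # Threshold-based smart triggering on the changed-pixel count.
        # WAKE_THRESHOLD is an illustrative value, not from the paper.
        WAKE_THRESHOLD = 200   # changed pixels per frame

        def wake_processor():
            print("waking the parallel processing unit for full analysis")

        def on_sensor_frame(changed_pixels):
            """Called by the camera interface for each low-power readout."""
            if changed_pixels >= WAKE_THRESHOLD:
                wake_processor()

        for count in [3, 12, 450, 8]:   # simulated per-frame change counts
            on_sensor_frame(count)      # only the 450-pixel frame triggers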

    Video analytics system for surveillance videos

    Developing an intelligent inspection system that can enhance public safety is challenging. An efficient video analytics system can help monitor unusual events and mitigate possible damage or loss. This thesis aims to analyze surveillance video data, report abnormal activities and retrieve the corresponding video clips. The surveillance video dataset used in this thesis is derived from the ALERT Dataset, a collection of surveillance videos at airport security checkpoints. The video analytics system in this thesis can be thought of as a pipelined process: the system takes the surveillance video as input and passes it through a series of processing stages, namely object detection, multi-object tracking, person-bin association and re-identification. In the end, we obtain trajectories of passengers and baggage in the surveillance videos. Abnormal events, such as taking away another passenger's belongings, are detected and trigger an alarm automatically. The system can also retrieve the corresponding video clips based on a user-defined query.
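    One plausible reading of the person-bin association step is an ownership check at pick-up time, sketched below. Event names, IDs and the alarm action are invented for the example; the thesis' actual rules may differ.

        # Ownership check for person-bin association (all IDs invented).
        owners = {}   # bag_id -> person_id recorded at drop-off

        def on_event(event, person_id, bag_id):
            if event == "dropoff":
                owners[bag_id] = person_id
            elif event == "pickup" and owners.get(bag_id) not in (None, person_id):
                print(f"ALARM: person {person_id} took bag {bag_id} "
                      f"owned by person {owners[bag_id]}")

        on_event("dropoff", person_id=7, bag_id=42)
        on_event("pickup",  person_id=9, bag_id=42)   # raises the alarm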