504 research outputs found

    Multi-View Pedestrian Detection using Statistical Colour Matching


    Enabling Runtime Self-Coordination of Reconfigurable Embedded Smart Cameras in Distributed Networks

    Smart camera networks are real-time distributed embedded systems able to perform computer vision using multiple cameras. This approach is a confluence of four major disciplines (computer vision, image sensors, embedded computing and sensor networks) and has been the subject of intensive work in the past decades. Recent advances in computer vision and network communication, and the rapid growth of high-performance computing, especially on reconfigurable devices, have enabled the design of more robust smart camera systems. Despite these advances, the effectiveness of current networked vision systems relative to their operating costs is still disappointing; the main reasons are poor coordination among camera entities at runtime and the lack of a clear formalism for dynamically capturing and addressing the self-organization problem without relying on human intervention. In this dissertation, we investigate a declarative modeling approach for capturing runtime self-coordination. We combine modeling approaches borrowed from logic programming, computer vision techniques, and high-performance computing to design an autonomous and cooperative smart camera. We propose a compact modeling approach based on Answer Set Programming for architecture synthesis of a system-on-reconfigurable-chip camera that supports runtime cooperative work and collaboration with other camera nodes in a distributed network setup. Additionally, we propose a declarative approach for modeling runtime camera self-coordination for distributed object tracking, in which moving targets are handed over in a distributed manner and recovered in case of node failure.
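
    The abstract does not reproduce the Answer Set Programming encoding, so the sketch below only illustrates, in plain Python, the runtime behaviour it describes: targets are handed over between camera nodes as they cross fields of view and are reassigned when a node fails. The class names, the 1-D corridor geometry, and the coordination loop are all hypothetical.

        class CameraNode:
            def __init__(self, name, fov):
                # fov is a hypothetical (x_min, x_max) strip on a 1-D corridor
                self.name, self.fov, self.alive = name, fov, True
                self.targets = set()

            def sees(self, x):
                return self.alive and self.fov[0] <= x <= self.fov[1]

        def assign(cameras, target_id, x):
            """Hand the target to the first live camera covering position x."""
            for cam in cameras:
                if cam.sees(x):
                    cam.targets.add(target_id)
                    return cam
            return None  # target currently uncovered

        def step(cameras, positions):
            """One coordination round: drop lost or failed views, then reassign."""
            for cam in cameras:
                cam.targets = {t for t in cam.targets if cam.sees(positions[t])}
            owned = set().union(*(c.targets for c in cameras))
            for t, x in positions.items():
                if t not in owned:  # handover, or recovery after node failure
                    assign(cameras, t, x)

        cams = [CameraNode("c1", (0, 10)), CameraNode("c2", (8, 20))]
        pos = {"p1": 9.0}
        step(cams, pos)        # p1 is in the overlap; c1 takes it first
        cams[0].alive = False  # simulate node failure
        step(cams, pos)        # c2 recovers the target
        print([(c.name, sorted(c.targets)) for c in cams])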

    Multi-Camera View Based Proactive BS Selection and Beam Switching for V2X

    Due to the short wavelength and large attenuation of millimeter-wave (mmWave) signals, mmWave BSs are densely deployed and require beamforming with high directivity. When the user moves out of the coverage of the current BS or is severely blocked, the mmWave BS must be switched to maintain communication quality. In this paper, we propose a multi-camera-view based proactive BS selection and beam switching scheme that predicts the optimal BS of the user in a future frame and switches to the corresponding beam pair. Specifically, we extract features from multi-camera-view images and a small amount of channel state information (CSI) in historical frames, and dynamically adjust the weight of each modality's features. We then design a multi-task learning module that guides the network to better understand the main task, thereby enhancing the accuracy and robustness of BS selection and beam switching. Using the outputs of all tasks, a prior-knowledge based fine-tuning network is designed to further increase BS switching accuracy. After the optimal BS is obtained, a beam-pair switching network directly predicts the optimal beam pair of the corresponding BS. Simulation results in an outdoor intersection environment show the superior performance of the proposed solution on several metrics, including prediction accuracy, achievable rate, and the harmonic mean of precision and recall.
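
    A hedged PyTorch sketch of the kind of architecture the abstract outlines: image and CSI features are fused with dynamically learned modality weights, a main head predicts the future BS, and a second head predicts the beam pair conditioned on it. All dimensions, layer choices, and the omitted fine-tuning stage are assumptions, not the paper's actual network.

        import torch
        import torch.nn as nn

        class ProactiveBSSelector(nn.Module):
            def __init__(self, img_dim=512, csi_dim=64, hidden=256,
                         num_bs=4, num_beams=64):
                super().__init__()
                self.img_enc = nn.Linear(img_dim, hidden)  # multi-camera image features
                self.csi_enc = nn.Linear(csi_dim, hidden)  # partial historical CSI
                self.gate = nn.Linear(2 * hidden, 2)       # dynamic modality weights
                self.bs_head = nn.Linear(hidden, num_bs)   # main task: future optimal BS
                self.beam_head = nn.Linear(hidden + num_bs, num_beams)

            def forward(self, img_feat, csi_feat):
                hi = torch.relu(self.img_enc(img_feat))
                hc = torch.relu(self.csi_enc(csi_feat))
                w = torch.softmax(self.gate(torch.cat([hi, hc], dim=-1)), dim=-1)
                fused = w[..., :1] * hi + w[..., 1:] * hc  # weighted fusion
                bs_logits = self.bs_head(fused)
                # beam prediction conditioned on the (soft) BS decision
                beam_in = torch.cat([fused, torch.softmax(bs_logits, dim=-1)], dim=-1)
                return bs_logits, self.beam_head(beam_in)

        model = ProactiveBSSelector()
        bs, beam = model(torch.randn(8, 512), torch.randn(8, 64))
        print(bs.shape, beam.shape)  # torch.Size([8, 4]) torch.Size([8, 64])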

    Long Range Automated Persistent Surveillance

    This dissertation addresses long range automated persistent surveillance, with focus on three topics: sensor planning, size preserving tracking, and high magnification imaging. In sensor planning, sufficient overlap between adjacent cameras' fields of view should be reserved so that camera handoff can be executed successfully before the object of interest becomes unidentifiable or untraceable. We design a sensor planning algorithm that not only maximizes coverage but also ensures uniform and sufficient overlap between cameras' fields of view for an optimal handoff success rate. This algorithm works for environments with multiple dynamic targets using different types of cameras. Significantly improved handoff success rates are demonstrated in experiments using floor plans of various scales. Size preserving tracking automatically adjusts the camera's zoom for a consistent view of the object of interest. Target scale estimation is carried out based on the paraperspective projection model, which compensates for the center offset and accounts for system latency and tracking errors. A computationally efficient foreground segmentation strategy, 3D affine shapes, is proposed. 3D affine shapes allow direct and real-time implementation and improved flexibility in accommodating the target's 3D motion, including off-plane rotations. The effectiveness of the scale estimation and foreground segmentation algorithms is validated via both offline and real-time tracking of pedestrians at various resolution levels. Face image quality assessment and enhancement compensate for the degradation in face recognition rates caused by high system magnification and long observation distances. A class of adaptive sharpness measures is proposed to evaluate and predict this degradation. A wavelet based enhancement algorithm with automated frame selection is developed and proves effective, considerably elevating the face recognition rate for severely blurred long range face images.
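
    The dissertation's actual sensor planning objective is not given in the abstract; the toy sketch below shows one plausible greedy formulation, scoring a hypothetical grid floor plan for coverage plus a bonus for cells seen by exactly two cameras (the overlap that handoff needs). The camera model and all parameters are made up.

        import itertools

        def covers(cam, cell, radius=2):
            # toy camera: covers a square of the given radius around its cell
            return abs(cam[0] - cell[0]) <= radius and abs(cam[1] - cell[1]) <= radius

        def score(cams, cells, overlap_bonus=0.5):
            s = 0.0
            for cell in cells:
                k = sum(covers(c, cell) for c in cams)
                s += (k >= 1) + overlap_bonus * (k == 2)  # coverage + handoff overlap
            return s

        def plan(cells, candidates, budget):
            chosen = []
            for _ in range(budget):  # greedy: pick the best marginal gain
                best = max((c for c in candidates if c not in chosen),
                           key=lambda c: score(chosen + [c], cells))
                chosen.append(best)
            return chosen

        cells = list(itertools.product(range(10), range(10)))            # toy floor plan
        candidates = list(itertools.product(range(0, 10, 2), repeat=2))  # mount points
        print(plan(cells, candidates, budget=3))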

    Algorithms for trajectory integration in multiple views

    This thesis addresses the problem of deriving a coherent and accurate localization of moving objects from partial visual information when data are generated by cameras placed at different view angles with respect to the scene. The framework is built around applications of scene monitoring with multiple cameras. Firstly, we demonstrate how a geometric-based solution exploits the relationships between corresponding feature points across views and improves accuracy in object location. Then, we improve the estimation of object locations with geometric transformations that account for lens distortion. Additionally, we study the integration of the partial visual information generated by each individual sensor and its combination into one single frame of observation that considers object association and data fusion. Our approach is fully image-based, relies only on 2D constructs and does not require any complex computation in 3D space. We exploit the continuity and coherence of objects' motion when crossing cameras' fields of view, and work under the assumptions of a planar ground plane and a wide baseline (i.e. cameras' viewpoints are far apart). The main contributions are: i) the development of a framework for distributed visual sensing that accounts for inaccuracies in the geometry of multiple views; ii) the reduction of trajectory mapping errors using a statistical-based homography estimation; iii) the integration of a polynomial method for correcting inaccuracies caused by the cameras' lens distortion; iv) a global trajectory reconstruction algorithm that associates and integrates fragments of trajectories generated by each camera.
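
    A short sketch of the image-based pipeline described, using OpenCV: a ground-plane homography between two views is estimated robustly (RANSAC here standing in for the thesis's statistical estimation), trajectory points are first corrected for lens distortion with a polynomial model, then mapped into the reference view. All point coordinates and calibration values are invented.

        import numpy as np
        import cv2

        # Invented corresponding ground-plane points seen in view A and view B.
        pts_a = np.float32([[100, 200], [400, 210], [390, 500], [120, 480],
                            [250, 220], [260, 460]]).reshape(-1, 1, 2)
        pts_b = np.float32([[80, 180], [420, 190], [400, 520], [90, 470],
                            [240, 200], [245, 480]]).reshape(-1, 1, 2)
        H, inliers = cv2.findHomography(pts_a, pts_b, cv2.RANSAC, 3.0)

        # Polynomial lens-distortion correction before mapping; K and dist
        # are invented calibration values.
        K = np.float32([[800, 0, 320], [0, 800, 240], [0, 0, 1]])
        dist = np.float32([-0.2, 0.05, 0, 0, 0])

        traj_a = np.float32([[150, 300], [180, 310], [215, 325]]).reshape(-1, 1, 2)
        traj_und = cv2.undistortPoints(traj_a, K, dist, P=K)  # back to pixel coords
        traj_b = cv2.perspectiveTransform(traj_und, H)        # fragment in view B
        print(traj_b.reshape(-1, 2))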

    Improving digital object handoff using the space above the table

    Object handoff – that is, passing an object or tool to another person – is an extremely common activity in collaborative tabletop work. On digital tables, object handoff is typically accomplished by sliding the object on the table surface – but surface-only interactions can be slow and error-prone, particularly when multiple people carry out multiple handoffs. An alternative approach is to use the space above the table for object handoff; this provides more room to move, but requires above-surface tracking. I developed two above-the-surface handoff techniques that use simple and inexpensive tracking: a force-field technique that uses a depth camera to determine hand proximity, and an electromagnetic-field technique called ElectroTouch that provides positive indication when people touch hands over the table. These new techniques were compared to three kinds of existing surface-only handoff (sliding, flicking, and surface-only force fields). The study showed that the above-surface techniques significantly improved both speed and accuracy, and that ElectroTouch was the best technique overall. As object interactions move above the surface of the table, the representation of off-table objects becomes crucial; to address this issue, several object designs were created and evaluated. This research provides designers with practical new techniques for substantially increasing performance and interaction richness on digital tables.
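
    As a rough illustration of the force-field idea, the fragment below assumes an upstream depth-camera tracker that yields 3-D hand positions and fires a handoff when two hands come within a proximity threshold; the threshold and function names are hypothetical, not the thesis's implementation.

        import math

        HANDOFF_RADIUS_MM = 120.0  # hypothetical force-field radius

        def owner_after_frame(giver_hand, receiver_hand, carrying):
            """Decide who holds the object after one tracking frame."""
            if carrying and math.dist(giver_hand, receiver_hand) < HANDOFF_RADIUS_MM:
                return "receiver"  # hands met above the table: hand off
            return "giver"

        # 3-D hand positions in millimetres from the assumed depth tracker
        print(owner_after_frame((0, 0, 300), (90, 40, 310), carrying=True))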

    Dimensionality reduction and sparse representations in computer vision

    The proliferation of camera-equipped devices, such as netbooks, smartphones and game stations, has led to a significant increase in the production of visual content. This visual information could be used for understanding the environment and offering a natural interface between users and their surroundings. However, the massive amount of data and the high computational cost associated with it encumber the transfer of sophisticated vision algorithms to real-life systems, especially ones that exhibit resource limitations such as restrictions in available memory, processing power and bandwidth. One approach to tackling these issues is to generate compact and descriptive representations of image data by exploiting inherent redundancies. We propose the investigation of dimensionality reduction and sparse representations to accomplish this task. In dimensionality reduction, the aim is to reduce the dimensionality of the space where image data reside, in order to allow resource-constrained systems to handle the data and, ideally, provide a more insightful description. This goal is achieved by exploiting the inherent redundancies that many classes of images exhibit, such as faces under different illumination conditions and objects from different viewpoints. We explore the description of natural images by low dimensional non-linear models called image manifolds and investigate the performance of computer vision tasks such as recognition and classification using these low dimensional models. In addition to dimensionality reduction, we study a novel approach to representing images as sparse linear combinations of dictionary examples. We investigate how sparse image representations can be used for a variety of tasks, including low-level image modeling and higher-level semantic information extraction. Using tools from dimensionality reduction and sparse representation, we propose the application of these methods in three hierarchical image layers, namely low-level features, mid-level structures and high-level attributes. Low-level features are image descriptors that can be extracted directly from raw image pixels and include pixel intensities, histograms, and gradients. In the first part of this work, we explore how various techniques in dimensionality reduction, ranging from traditional image compression to the recently proposed Random Projections method, affect the performance of computer vision algorithms such as face detection and face recognition. In addition, we discuss a method that is able to increase the spatial resolution of a single image, without using any training examples, within the sparse representations framework. In the second part, we explore mid-level structures, including image manifolds and sparse models, produced by abstracting information from low-level features to offer compact modeling of high dimensional data. We propose novel techniques for generating more descriptive image representations and investigate their application in face recognition and object tracking. In the third part of this work, we propose a novel framework for representing the semantic contents of images. This framework employs high-level semantic attributes that aim to bridge the gap between the visual information of an image and its textual description by utilizing low-level features and mid-level structures. This innovative paradigm offers revolutionary possibilities, including recognizing the category of an object from purely textual information without any explicit visual example.
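
    The Random Projections method mentioned for low-level features admits a compact sketch: face-image vectors are reduced with a random Gaussian matrix (per the Johnson-Lindenstrauss lemma) and recognition is done by nearest neighbour in the reduced space. The data below are synthetic placeholders, not an experiment from the thesis.

        import numpy as np

        rng = np.random.default_rng(0)
        d, k = 64 * 64, 128                        # ambient and reduced dimensions
        R = rng.normal(size=(k, d)) / np.sqrt(k)   # random Gaussian projection

        gallery = rng.normal(size=(10, d))         # stand-ins for enrolled faces
        labels = np.arange(10)
        probe = gallery[3] + 0.05 * rng.normal(size=d)  # noisy view of subject 3

        g_low, p_low = gallery @ R.T, probe @ R.T  # project to k dimensions
        pred = labels[np.argmin(np.linalg.norm(g_low - p_low, axis=1))]
        print(pred)                                # expected: 3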