317,257 research outputs found

    Concurrent Viola Jones classifiers on a portable Beowulf cluster : a thesis presented in partial fulfilment of the requirements for the degree of Master of Engineering in Mechatronics at Massey University

    Get PDF
    Real-time Computer Vision is an interesting application for supercomputing, real-time applications (vision processing in particular) employ special purpose hardware such as DSPs to achieve high performance. This thesis explores parallel computers particularly commodity general purpose hardware. We also build a prototype to better understand the economics of supercomputing, specifically related to mobile computing - low power, rugged design by building a mobile computer. A new communication layer is built, where by the nature of the locality of the nodes allows one to optimise the protocols to reduce the latency comparably. Finally a study and in depth results of the algorithm, the Viola Jones Object detector in parallel are presented followed by reflection and future work based on the current results and platform

    Enabling Runtime Self-Coordination of Reconfigurable Embedded Smart Cameras in Distributed Networks

    Get PDF
    Smart camera networks are real-time distributed embedded systems able to perform computer vision using multiple cameras. This new approach is a confluence of four major disciplines (computer vision, image sensors, embedded computing and sensor networks) and has been subject of intensive work in the past decades. The recent advances in computer vision and network communication, and the rapid growing in the field of high-performance computing, especially using reconfigurable devices, have enabled the design of more robust smart camera systems. Despite these advancements, the effectiveness of current networked vision systems (compared to their operating costs) is still disappointing; the main reason being the poor coordination among cameras entities at runtime and the lack of a clear formalism to dynamically capture and address the self-organization problem without relying on human intervention. In this dissertation, we investigate the use of a declarative-based modeling approach for capturing runtime self-coordination. We combine modeling approaches borrowed from logic programming, computer vision techniques, and high-performance computing for the design of an autonomous and cooperative smart camera. We propose a compact modeling approach based on Answer Set Programming for architecture synthesis of a system-on-reconfigurable-chip camera that is able to support the runtime cooperative work and collaboration with other camera nodes in a distributed network setup. Additionally, we propose a declarative approach for modeling runtime camera self-coordination for distributed object tracking in which moving targets are handed over in a distributed manner and recovered in case of node failure

    Cost-effective HPC clustering for computer vision applications

    Get PDF
    We will present a cost-effective and flexible realization of high performance computing (HPC) clustering and its potential in solving computationally intensive problems in computer vision. The featured software foundation to support the parallel programming is the GNU parallel Knoppix package with message passing interface (MPI) based Octave, Python and C interface capabilities. The implementation is especially of interest in applications where the main objective is to reuse the existing hardware infrastructure and to maintain the overall budget cost. We will present the benchmark results and compare and contrast the performances of Octave and MATLAB

    Fast and accurate object detection in high resolution 4K and 8K video using GPUs

    Full text link
    Machine learning has celebrated a lot of achievements on computer vision tasks such as object detection, but the traditionally used models work with relatively low resolution images. The resolution of recording devices is gradually increasing and there is a rising need for new methods of processing high resolution data. We propose an attention pipeline method which uses two staged evaluation of each image or video frame under rough and refined resolution to limit the total number of necessary evaluations. For both stages, we make use of the fast object detection model YOLO v2. We have implemented our model in code, which distributes the work across GPUs. We maintain high accuracy while reaching the average performance of 3-6 fps on 4K video and 2 fps on 8K video.Comment: 6 pages, 12 figures, Best Paper Finalist at IEEE High Performance Extreme Computing Conference (HPEC) 2018; copyright 2018 IEEE; (DOI will be filled when known

    Computer vision-based structural assessment exploiting large volumes of images

    Get PDF
    Visual assessment is a process to understand the state of a structure based on evaluations originating from visual information. Recent advances in computer vision to explore new sensors, sensing platforms and high-performance computing have shed light on the potential for vision-based visual assessment in civil engineering structures. The use of low-cost, high-resolution visual sensors in conjunction with mobile and aerial platforms can overcome spatial and temporal limitations typically associated with other forms of sensing in civil structures. Also, GPU-accelerated and parallel computing offer unprecedented speed and performance, accelerating processing the collected visual data. However, despite the enormous endeavor in past research to implement such technologies, there are still many practical challenges to overcome to successfully apply these techniques in real world situations. A major challenge lies in dealing with a large volume of unordered and complex visual data, collected under uncontrolled circumstance (e.g. lighting, cluttered region, and variations in environmental conditions), while just a tiny fraction of them are useful for conducting actual assessment. Such difficulty induces an undesirable high rate of false-positive and false-negative errors, reducing the trustworthiness and efficiency of their implementation. To overcome the inherent challenges in using such images for visual assessment, high-level computer vision algorithms must be integrated with relevant prior knowledge and guidance, thus aiming to have similar performance with those of humans conducting visual assessment. Moreover, the techniques must be developed and validated in the realistic context of a large volume of real-world images, which is likely contain numerous practical challenges. In this dissertation, the novel use of computer vision algorithms is explored to address two promising applications of vision-based visual assessment in civil engineering: visual inspection, and visual data analysis for post-disaster evaluation. For both applications, powerful techniques are developed here to enable reliable and efficient visual assessment for civil structures and demonstrate them using a large volume of real-world images collected from actual structures. State-of-art computer vision techniques, such as structure-from-motion and convolutional neural network techniques, facilitate these tasks. The core techniques derived from this study are scalable and expandable to many other applications in vision-based visual assessment, and will serve to close the existing gaps between past research efforts and real-world implementations

    Self-paced Convolutional Neural Network for Computer Aided Detection in Medical Imaging Analysis

    Full text link
    Tissue characterization has long been an important component of Computer Aided Diagnosis (CAD) systems for automatic lesion detection and further clinical planning. Motivated by the superior performance of deep learning methods on various computer vision problems, there has been increasing work applying deep learning to medical image analysis. However, the development of a robust and reliable deep learning model for computer-aided diagnosis is still highly challenging due to the combination of the high heterogeneity in the medical images and the relative lack of training samples. Specifically, annotation and labeling of the medical images is much more expensive and time-consuming than other applications and often involves manual labor from multiple domain experts. In this work, we propose a multi-stage, self-paced learning framework utilizing a convolutional neural network (CNN) to classify Computed Tomography (CT) image patches. The key contribution of this approach is that we augment the size of training samples by refining the unlabeled instances with a self-paced learning CNN. By implementing the framework on high performance computing servers including the NVIDIA DGX1 machine, we obtained the experimental result, showing that the self-pace boosted network consistently outperformed the original network even with very scarce manual labels. The performance gain indicates that applications with limited training samples such as medical image analysis can benefit from using the proposed framework.Comment: accepted by 8th International Workshop on Machine Learning in Medical Imaging (MLMI 2017

    Estimation of Success in Collaborative Learning Based on Multimodal Learning Analytics Features

    Get PDF
    Multimodal learning analytics provides researchers new tools and techniques to capture different types of data from complex learning activities in dynamic learning environments. This paper investigates high-fidelity synchronised multimodal recordings of small groups of learners interacting from diverse sensors that include computer vision, user generated content, and data from the learning objects (like physical computing components or laboratory equipment). We processed and extracted different aspects of the students' interactions to answer the following question: which features of student group work are good predictors of team success in open-ended tasks with physical computing? The answer to the question provides ways to automatically identify the students' performance during the learning activities
    corecore