
    Parallel Architectures and Parallel Algorithms for Integrated Vision Systems

    Computer vision is regarded as one of the most complex and computationally intensive problems. An integrated vision system (IVS) is a system that uses vision algorithms from all levels of processing to perform a high-level task (e.g., object recognition). An IVS normally involves algorithms from low-level, intermediate-level, and high-level vision. Designing parallel architectures for vision systems is of tremendous interest to researchers. This work addresses several issues in parallel architectures and parallel algorithms for integrated vision systems.

    Parallel architectures for image analysis

    This thesis is concerned with the problem of designing an architecture specifically for the application of image analysis and object recognition. Image analysis is a complex subject area that remains only partially defined and only partially solved. This makes the task of designing an architecture aimed at efficiently implementing image analysis and recognition algorithms a difficult one. Within this work a massively parallel heterogeneous architecture, the Warwick Pyramid Machine, is described. This architecture consists of SIMD, MIMD, and MSIMD modes of parallelism, each directed at a different part of the problem. The performance of this architecture is analysed with respect to many tasks drawn from very different areas of the image analysis problem. These tasks include an efficient straight-line extraction algorithm and a robust and novel geometric model-based recognition system. The straight-line extraction method is based on the local extraction of line segments using a Hough-style algorithm, followed by careful global matching and merging. The recognition system avoids quantising the pose space, hence overcoming many of the problems inherent in this class of methods, and includes an analytical verification stage. Results and detailed implementations of both of these tasks are given.
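
    The abstract does not give the extraction algorithm itself; the following is only a rough illustrative sketch of the global merging step, in which two locally extracted segments are joined when their orientations and perpendicular offsets agree within tolerances. The structure, names, thresholds, and toy data are hypothetical and are not taken from the thesis.

        /* Illustrative sketch: merge two nearly collinear line segments, loosely
         * following the "local extraction then global matching and merging" idea.
         * Thresholds and the toy data are hypothetical. */
        #include <math.h>
        #include <stdio.h>

        typedef struct { double x0, y0, x1, y1; } Segment;

        static double seg_angle(const Segment *s)
        {
            return atan2(s->y1 - s->y0, s->x1 - s->x0);   /* orientation in radians */
        }

        /* Perpendicular distance from point (px, py) to the infinite line through s. */
        static double perp_dist(const Segment *s, double px, double py)
        {
            double dx = s->x1 - s->x0, dy = s->y1 - s->y0;
            return fabs(dy * (px - s->x0) - dx * (py - s->y0)) / hypot(dx, dy);
        }

        /* Merge b into a if the two segments are nearly collinear; returns 1 on success. */
        static int try_merge(Segment *a, const Segment *b,
                             double max_angle_diff, double max_offset)
        {
            double da = fabs(seg_angle(a) - seg_angle(b));
            da = fmod(da, 3.141592653589793);               /* line orientation is mod pi */
            if (da > 1.5707963267948966) da = 3.141592653589793 - da;
            if (da > max_angle_diff) return 0;
            if (perp_dist(a, b->x0, b->y0) > max_offset ||
                perp_dist(a, b->x1, b->y1) > max_offset) return 0;

            /* Keep the two endpoints that are farthest apart. */
            double p[4][2] = { {a->x0, a->y0}, {a->x1, a->y1},
                               {b->x0, b->y0}, {b->x1, b->y1} };
            int bi = 0, bj = 1;
            double best = hypot(p[0][0] - p[1][0], p[0][1] - p[1][1]);
            for (int i = 0; i < 4; ++i)
                for (int j = i + 1; j < 4; ++j) {
                    double d = hypot(p[i][0] - p[j][0], p[i][1] - p[j][1]);
                    if (d > best) { best = d; bi = i; bj = j; }
                }
            a->x0 = p[bi][0]; a->y0 = p[bi][1];
            a->x1 = p[bj][0]; a->y1 = p[bj][1];
            return 1;
        }

        int main(void)
        {
            Segment a = {0, 0, 10, 0.1}, b = {11, 0.2, 20, 0.3};
            if (try_merge(&a, &b, 0.05, 1.0))
                printf("merged: (%.1f,%.1f)-(%.1f,%.1f)\n", a.x0, a.y0, a.x1, a.y1);
            return 0;
        }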

    Computer vision algorithms on reconfigurable logic arrays

    A multiple-SIMD architecture for image and tracking analysis

    The computational requirements of real-time image-based applications are such as to warrant the use of a parallel architecture. Commonly used parallel architectures conform to the classifications of Single Instruction Multiple Data (SIMD) or Multiple Instruction Multiple Data (MIMD). Each class of architecture has its advantages and disadvantages. For example, SIMD architectures can be used on data-parallel problems, such as the processing of an image, whereas MIMD architectures are more flexible and better suited to general-purpose computing. Both types of processing are typically required for the analysis of the contents of an image. This thesis describes a novel massively parallel heterogeneous architecture, implemented as the Warwick Pyramid Machine. Both SIMD and MIMD processor types are combined within this architecture. Furthermore, the SIMD array is partitioned into smaller SIMD sub-arrays, forming a Multiple-SIMD array. Thus, local data-parallel, global data-parallel, and control-parallel processing are supported. After describing the present options available in the design of massively parallel machines and the nature of the image analysis problem, the architecture of the Warwick Pyramid Machine is described in some detail. The performance of this architecture is then analysed, both in terms of peak available computational power and in terms of representative applications in image analysis and numerical computation. Two tracking applications are also analysed to show the performance of this architecture; in addition, they illustrate the possible partitioning of applications between the SIMD and MIMD processor arrays. Load-balancing techniques are then described which have the potential to increase the utilisation of the Warwick Pyramid Machine at run-time. These include mapping techniques for image regions across the Multiple-SIMD arrays, and for the compression of sparse data. It is envisaged that these techniques may also prove useful in other parallel systems.
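
    The load-balancing and mapping techniques themselves are not reproduced in the abstract; as a simplified, hypothetical illustration of the idea, the sketch below cuts an image into strips, estimates the work in each strip, and greedily assigns each strip to the currently least-loaded SIMD sub-array. Names, sizes, and the work estimates are invented for the example.

        /* Hypothetical sketch of mapping image strips onto SIMD sub-arrays by a
         * greedy "least-loaded first" rule; the per-strip work estimate stands in
         * for whatever cost measure the real load balancer would use. */
        #include <stdio.h>

        #define N_STRIPS     8   /* horizontal strips the image is cut into    */
        #define N_SUBARRAYS  3   /* SIMD sub-arrays in the Multiple-SIMD array */

        int main(void)
        {
            /* Invented per-strip work estimates, e.g. foreground pixels per strip. */
            long work[N_STRIPS] = { 900, 120, 450, 800, 60, 300, 700, 150 };

            long load[N_SUBARRAYS] = { 0 };
            int  owner[N_STRIPS];

            for (int s = 0; s < N_STRIPS; ++s) {
                int best = 0;
                for (int a = 1; a < N_SUBARRAYS; ++a)   /* find least-loaded sub-array */
                    if (load[a] < load[best]) best = a;
                owner[s] = best;
                load[best] += work[s];
            }

            for (int s = 0; s < N_STRIPS; ++s)
                printf("strip %d -> sub-array %d\n", s, owner[s]);
            for (int a = 0; a < N_SUBARRAYS; ++a)
                printf("sub-array %d: total work %ld\n", a, load[a]);
            return 0;
        }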

    Automatic visual recognition using parallel machines

    Invariant features and quick matching algorithms are two major concerns in the area of automatic visual recognition. The former reduce the size of an established model database, and the latter shorten the computation time. This dissertation discusses both line invariants under perspective projection and a parallel implementation of a dynamic programming technique for shape recognition. The feasibility of using parallel machines is demonstrated through the dramatically reduced time complexity. In this dissertation, our algorithms are implemented on the AP1000 MIMD parallel machine. For processing an object with n features, the time complexity of the proposed parallel algorithm is O(n), while that of a uniprocessor is O(n²). Two applications, one for shape matching and the other for chain-code extraction, are used to demonstrate the usefulness of our methods. Invariants from four general lines under perspective projection are also discussed here. In contrast to the approach which uses the epipolar geometry, we investigate the invariants under isotropy subgroups. Theoretically speaking, two independent invariants can be found for four general lines in 3D space. In practice, we show how to obtain these two invariants from the projective images of four general lines without the need for camera calibration. A projective invariant recognition system based on a hypothesis-generation-testing scheme is run on the hypercube parallel architecture. Object recognition is achieved by matching the scene projective invariants to the model projective invariants, a process called transfer, after which the hypothesis-generation-testing scheme is carried out on the hypercube parallel architecture.
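
    The dissertation's dynamic programming formulation and its AP1000 parallelisation are not reproduced here; as a generic illustration of dynamic-programming matching over chain codes, the sketch below computes a plain edit distance between two Freeman chain-code strings. The sequential table fill is the O(n²) step that a parallel machine can speed up, typically by filling anti-diagonals concurrently; the strings and names are hypothetical.

        /* Generic dynamic-programming (edit distance) comparison of two Freeman
         * chain-code strings, as a stand-in illustration of DP shape matching.
         * This is the plain O(n^2) sequential table fill, not the dissertation's
         * parallel AP1000 algorithm. */
        #include <stdio.h>
        #include <string.h>

        #define MAXLEN 64

        static int min3(int a, int b, int c)
        {
            int m = a < b ? a : b;
            return m < c ? m : c;
        }

        /* Edit distance between chain codes a and b (digits '0'..'7'). */
        static int chain_edit_distance(const char *a, const char *b)
        {
            int n = (int)strlen(a), m = (int)strlen(b);
            static int d[MAXLEN + 1][MAXLEN + 1];
            if (n > MAXLEN || m > MAXLEN) return -1;

            for (int i = 0; i <= n; ++i) d[i][0] = i;   /* deletions  */
            for (int j = 0; j <= m; ++j) d[0][j] = j;   /* insertions */

            for (int i = 1; i <= n; ++i)
                for (int j = 1; j <= m; ++j) {
                    int subst = (a[i - 1] == b[j - 1]) ? 0 : 1;
                    d[i][j] = min3(d[i - 1][j] + 1,          /* delete     */
                                   d[i][j - 1] + 1,          /* insert     */
                                   d[i - 1][j - 1] + subst); /* substitute */
                }
            return d[n][m];
        }

        int main(void)
        {
            /* Two invented boundary chain codes of similar shapes. */
            printf("edit distance = %d\n",
                   chain_edit_distance("001122334455667", "00112233445566"));
            return 0;
        }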

    Implementation of a real time Hough transform using FPGA technology

    This thesis is concerned with the modelling, design and implementation of efficient architectures for performing the Hough Transform (HT) on mega-pixel resolution real-time images using Field Programmable Gate Array (FPGA) technology. Although the HT has been around for many years and a number of algorithms have been developed, it still remains a significant bottleneck in many image processing applications. Although the basic idea of the HT is to locate curves in an image that can be parameterized in a suitable parameter space (e.g., straight lines, polynomials or circles), the research presented in this thesis focuses only on the location of straight lines in binary images. The HT algorithm uses an accumulator array (accumulator bins) to detect the existence of a straight line in an image. As the image needs to be binarized, a novel generic synchronization circuit for windowing operations was designed to perform edge detection. An edge detection method of special interest, the Canny method, is used, and its design and implementation in hardware are presented in this thesis. As each image pixel can be processed independently, parallel processing can be performed. However, the main disadvantage of the HT is its large storage and computational requirements. This thesis presents new and state-of-the-art hardware implementations that minimize the computational cost, using the Hybrid-Logarithmic Number System (Hybrid-LNS) for calculating the HT in fixed bit-width architectures. It is shown that using the Hybrid-LNS the computational cost is minimized, while the precision of the HT algorithm is maintained. Advances in FPGA technology now make it possible to implement functions such as the HT in reconfigurable fabrics. Methods for storing large arrays on FPGAs are presented, where data from a 1024 x 1024 pixel camera are processed at a rate of up to 25 frames per second.
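
    The FPGA and Hybrid-LNS specifics belong to the thesis; the sketch below only shows the underlying accumulator voting of the standard straight-line HT on a toy binary image, in ordinary floating point, to make the storage and arithmetic costs that the hardware attacks visible. Image size, resolutions, and the test pattern are illustrative values, not those of the thesis.

        /* Standard software Hough transform voting for straight lines in a binary
         * image, shown only to make the accumulator-bin structure explicit; the
         * thesis replaces this arithmetic with Hybrid-LNS hardware and streams
         * mega-pixel frames through FPGA fabric.  Sizes here are toy values. */
        #include <math.h>
        #include <stdio.h>

        #define W        16                 /* image width  */
        #define H        16                 /* image height */
        #define N_THETA  180                /* 1-degree angular resolution */
        #define MAX_RHO  23                 /* >= ceil(sqrt(W*W + H*H))    */
        #define N_RHO    (2 * MAX_RHO + 1)  /* rho in [-MAX_RHO, +MAX_RHO] */

        int main(void)
        {
            static unsigned char img[H][W];           /* binary (edge) image */
            static unsigned int  acc[N_THETA][N_RHO]; /* accumulator bins    */
            const double pi = 3.141592653589793;

            for (int x = 0; x < W; ++x) img[5][x] = 1;  /* toy horizontal edge row */

            /* Voting: every edge pixel votes once per quantised angle. */
            for (int y = 0; y < H; ++y)
                for (int x = 0; x < W; ++x) {
                    if (!img[y][x]) continue;
                    for (int t = 0; t < N_THETA; ++t) {
                        double theta = t * pi / N_THETA;
                        int rho = (int)lround(x * cos(theta) + y * sin(theta));
                        acc[t][rho + MAX_RHO]++;
                    }
                }

            /* Peak detection: for the edge row y = 5 we expect theta = 90 deg, rho = 5. */
            int bt = 0, br = 0;
            for (int t = 0; t < N_THETA; ++t)
                for (int r = 0; r < N_RHO; ++r)
                    if (acc[t][r] > acc[bt][br]) { bt = t; br = r; }
            printf("peak: theta = %d deg, rho = %d, votes = %u\n",
                   bt, br - MAX_RHO, acc[bt][br]);
            return 0;
        }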

    A high-performance matrix-matrix multiplication methodology for CPU and GPU architectures

    Current compilers cannot generate code that can compete with hand-tuned code in efficiency, even for a simple kernel like matrix–matrix multiplication (MMM). A key step in program optimization is the estimation of optimal values for parameters such as tile sizes and the number of levels of tiling. Selecting the scheduling parameter values is a very difficult and time-consuming task, since the parameter values depend on each other; this is why they are found by using searching methods and empirical techniques. To overcome this problem, the scheduling sub-problems must be optimized together, as one problem and not separately. In this paper, an MMM methodology is presented in which the optimum scheduling parameters are found by theoretically decreasing the search space, while the major scheduling sub-problems are addressed together as one problem, and not separately, according to the hardware architecture parameters and the input size; for different hardware architecture parameters and/or input sizes, a different implementation is produced. This is achieved by fully exploiting the software characteristics (e.g., data reuse) and the hardware architecture parameters (e.g., data cache sizes and associativities), giving high-quality solutions and a smaller search space. This methodology applies to a wide range of CPU and GPU architectures.
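
    The methodology in the paper derives the tile sizes, tiling levels, and loop structure analytically from the cache sizes, associativities, and input size; the sketch below is only the generic cache-blocked MMM kernel that such parameters would feed, with an arbitrary tile size rather than one produced by the methodology.

        /* Generic cache-blocked (tiled) matrix-matrix multiplication kernel.
         * TILE is an arbitrary illustrative value; the paper's methodology would
         * choose it (and further levels of tiling / register blocking) from the
         * data cache sizes and associativities and the input size. */
        #include <stdio.h>
        #include <stdlib.h>

        #define N     256   /* square matrices, N x N; N is a multiple of TILE  */
        #define TILE  32    /* illustrative tile size, not analytically derived */

        static void mmm_tiled(const double *A, const double *B, double *C)
        {
            for (int ii = 0; ii < N; ii += TILE)
                for (int kk = 0; kk < N; kk += TILE)
                    for (int jj = 0; jj < N; jj += TILE)
                        /* One TILE x TILE block product: operands stay cache-resident. */
                        for (int i = ii; i < ii + TILE; ++i)
                            for (int k = kk; k < kk + TILE; ++k) {
                                double a = A[i * N + k];
                                for (int j = jj; j < jj + TILE; ++j)
                                    C[i * N + j] += a * B[k * N + j];
                            }
        }

        int main(void)
        {
            double *A = calloc(N * N, sizeof *A);
            double *B = calloc(N * N, sizeof *B);
            double *C = calloc(N * N, sizeof *C);
            if (!A || !B || !C) return 1;

            for (int i = 0; i < N * N; ++i) { A[i] = 1.0; B[i] = 2.0; }
            mmm_tiled(A, B, C);
            printf("C[0][0] = %.1f (expected %.1f)\n", C[0], 2.0 * N);

            free(A); free(B); free(C);
            return 0;
        }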