5,972 research outputs found

    Throughput-Distortion Computation Of Generic Matrix Multiplication: Toward A Computation Channel For Digital Signal Processing Systems

    Get PDF
    The generic matrix multiply (GEMM) function is the core element of high-performance linear algebra libraries used in many computationally-demanding digital signal processing (DSP) systems. We propose an acceleration technique for GEMM based on dynamically adjusting the imprecision (distortion) of computation. Our technique employs adaptive scalar companding and rounding to input matrix blocks followed by two forms of packing in floating-point that allow for concurrent calculation of multiple results. Since the adaptive companding process controls the increase of concurrency (via packing), the increase in processing throughput (and the corresponding increase in distortion) depends on the input data statistics. To demonstrate this, we derive the optimal throughput-distortion control framework for GEMM for the broad class of zero-mean, independent identically distributed, input sources. Our approach converts matrix multiplication in programmable processors into a computation channel: when increasing the processing throughput, the output noise (error) increases due to (i) coarser quantization and (ii) computational errors caused by exceeding the machine-precision limitations. We show that, under certain distortion in the GEMM computation, the proposed framework can significantly surpass 100% of the peak performance of a given processor. The practical benefits of our proposal are shown in a face recognition system and a multi-layer perceptron system trained for metadata learning from a large music feature database.Comment: IEEE Transactions on Signal Processing (vol. 60, 2012

    FPGA applications in signal and image processing

    Get PDF
    The increasing demand for real-time and smart digital signal processing (DSP) systems, calls for a better platform for their implementation. Most of these systems (e.g. digital image processing) are highly parallelisable, memory and processor hungry; such that the increasing performance of today�s general-purpose microprocessors are no longer able to handle them. A highly parallel hardware architecture, which offers enough memory resources, offers an alternative for such DSP implementations

    Multimodal person recognition for human-vehicle interaction

    Get PDF
    Next-generation vehicles will undoubtedly feature biometric person recognition as part of an effort to improve the driving experience. Today's technology prevents such systems from operating satisfactorily under adverse conditions. A proposed framework for achieving person recognition successfully combines different biometric modalities, borne out in two case studies

    Precise motion descriptors extraction from stereoscopic footage using DaVinci DM6446

    Get PDF
    A novel approach to extract target motion descriptors in multi-camera video surveillance systems is presented. Using two static surveillance cameras with partially overlapped field of view (FOV), control points (unique points from each camera) are identified in regions of interest (ROI) from both cameras footage. The control points within the ROI are matched for correspondence and a meshed Euclidean distance based signature is computed. A depth map is estimated using disparity of each control pair and the ROI is graded into number of regions with the help of relative depth information of the control points. The graded regions of different depths will help calculate accurately the pace of the moving target and also its 3D location. The advantage of estimating a depth map for background static control points over depth map of the target itself is its accuracy and robustness to outliers. The performance of the algorithm is evaluated in the paper using several test sequences. Implementation issues of the algorithm onto the TI DaVinci DM6446 platform are considered in the paper
    corecore