    H.264/AVC inter prediction on accelerator-based multi-core systems

    Among other new features, the AVC video coding standard adopts variable block sizes for inter frame coding to increase compression efficiency. As a consequence, an AVC encoder has to employ a complex mode decision technique with a high computational cost. Several techniques aimed at accelerating the inter prediction process have been proposed in the literature in recent years. Recently, with the emergence of many-core processors or accelerators, a new way of supporting inter frame prediction has presented itself. In this paper, we present a step forward in the implementation of an AVC inter prediction algorithm on a graphics processing unit, using Compute Unified Device Architecture. The results show a negligible drop in rate distortion with a time reduction, on average, of over 98.8 % compared with full search and fast full search, and of over 80 % compared with UMHexagonS search.
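
For orientation, the full search baseline mentioned in the abstract above can be sketched in a few lines. This is not the paper's GPU algorithm, only a minimal CPU reference of exhaustive block matching, assuming a sum-of-absolute-differences (SAD) cost and frames stored as lists of rows. The names `sad` and `full_search` are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of `cur` at (bx, by) and the block of
    `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Try every in-bounds displacement within +/- search_range and return
    (dx, dy, cost) for the lowest SAD (first one found on ties)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x0, y0 = bx + dx, by + dy
            # Skip candidates whose reference block leaves the frame.
            if x0 < 0 or y0 < 0 or x0 + n > width or y0 + n > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


# Synthetic 8 x 8 reference frame and a current frame whose content is the
# reference shifted by (2, 1); out-of-frame samples are filled with 0.
ref = [[x * x + 10 * y * y for x in range(8)] for y in range(8)]
cur = [[ref[y + 1][x + 2] if y + 1 < 8 and x + 2 < 8 else 0
        for x in range(8)] for y in range(8)]
motion = full_search(cur, ref, bx=2, by=2, n=4, search_range=2)
```

Every candidate displacement is evaluated independently of the others, which is the property that makes this search a natural fit for many-core accelerators.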

    A Self Organization-Based Optical Flow Estimator with GPU Implementation

    This work describes a parallelizable optical flow estimator that uses a modified batch version of the Self Organizing Map (SOM). This gradient-based estimator handles the ill-posedness in motion estimation via a novel combination of regression and a self organization strategy. The aperture problem is explicitly modeled using an algebraic framework that partitions motion estimates obtained from regression into two sets, one (set Hc) with high-confidence estimates and another (set Hp) with low-confidence estimates. The self organization step uses a uniquely designed pair of sets: the training set (Q=Hc) and the initial weights set (W=Hc U Hp). It is shown that with this specific choice of training and initial weights sets, the interpolation of flow vectors is achieved primarily due to the regularization property of SOM. Moreover, the computationally involved step of finding the winner unit in SOM simplifies to indexing into a 2D array, making the algorithm parallelizable and highly scalable. To preserve flow discontinuities at occlusion boundaries, we have designed an anisotropic neighborhood function for SOM that uses a novel OFCE residual-based distance measure. A multi-resolution or pyramidal approach is used to estimate large motion. As the algorithm is scalable, with a sufficient number of computing cores (for example on a GPU), the implementation of the estimator can be made real-time. With the available true motion from the Middlebury database, error metrics are computed.

    Markerless 3D human pose tracking through multiple cameras and AI: Enabling high accuracy, robustness, and real-time performance

    Tracking 3D human motion in real-time is crucial for numerous applications across many fields. Traditional approaches involve attaching artificial fiducial objects or sensors to the body, limiting their usability and comfort-of-use and consequently narrowing their application fields. Recent advances in Artificial Intelligence (AI) have allowed for markerless solutions. However, most of these methods operate in 2D, while those providing 3D solutions compromise accuracy and real-time performance. To address this challenge and unlock the potential of visual pose estimation methods in real-world scenarios, we propose a markerless framework that combines multi-camera views and 2D AI-based pose estimation methods to track 3D human motion. Our approach integrates a Weighted Least Square (WLS) algorithm that computes 3D human motion from multiple 2D pose estimations provided by an AI-driven method. The method is integrated within the Open-VICO framework allowing simulation and real-world execution. Several experiments have been conducted, which have shown high accuracy and real-time performance, demonstrating the high level of readiness for real-world applications and the potential to revolutionize human motion capture. Comment: 19 pages, 7 figures
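
The abstract above does not spell out its WLS formulation, so the following is only a sketch of one common way to recover a 3D point from several 2D detections: linear triangulation with per-view weights, solved through the weighted normal equations. The 3 x 4 projection matrices, the use of detector confidence as the weight, and the names `triangulate_wls`, `solve_linear`, and `project` are all assumptions made for illustration.

```python
def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with
    partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate camera geometry")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def project(P, X):
    """Project 3D point X with the 3 x 4 matrix P; returns pixel (u, v)."""
    h = [sum(P[i][k] * X[k] for k in range(3)) + P[i][3] for i in range(3)]
    return h[0] / h[2], h[1] / h[2]


def triangulate_wls(projections, points_2d, weights):
    """Weighted linear triangulation: each view contributes the two equations
    (u * P[2] - P[0]) . [X, 1] = 0 and (v * P[2] - P[1]) . [X, 1] = 0,
    with squared residuals scaled by that view's weight."""
    ata = [[0.0] * 3 for _ in range(3)]
    atb = [0.0] * 3
    for P, (u, v), w in zip(projections, points_2d, weights):
        for coord, row in ((u, P[0]), (v, P[1])):
            r = [coord * P[2][k] - row[k] for k in range(4)]
            a, b = r[:3], -r[3]
            for i in range(3):
                atb[i] += w * a[i] * b
                for j in range(3):
                    ata[i][j] += w * a[i] * a[j]
    return solve_linear(ata, atb)


# Two hypothetical cameras one unit apart along x, both looking down +z.
cam_a = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
cam_b = [[1.0, 0.0, 0.0, -1.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
joint = [0.5, 0.2, 4.0]
detections = [project(cam_a, joint), project(cam_b, joint)]
estimate = triangulate_wls([cam_a, cam_b], detections, weights=[0.9, 0.6])
```

With noise-free detections the weights do not change the result; they matter once some views are less reliable, since a low-confidence view then pulls the estimate less than a high-confidence one.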

    Large Scale Kernel Methods for Fun and Profit

    Kernel methods are among the most flexible classes of machine learning models with strong theoretical guarantees. Wide classes of functions can be approximated arbitrarily well with kernels, while fast convergence and learning rates have been formally shown to hold. Exact kernel methods are known to scale poorly with increasing dataset size, and we believe that one of the factors limiting their usage in modern machine learning is the lack of scalable and easy-to-use algorithms and software. The main goal of this thesis is to study kernel methods from the point of view of efficient learning, with particular emphasis on large-scale data, but also on low-latency training, and user efficiency. We improve the state-of-the-art for scaling kernel solvers to datasets with billions of points using the Falkon algorithm, which combines random projections with fast optimization. Running it on GPUs, we show how to fully utilize available computing power for training kernel machines. To boost the ease-of-use of approximate kernel solvers, we propose an algorithm for automated hyperparameter tuning. By minimizing a penalized loss function, a model can be learned together with its hyperparameters, reducing the time needed for user-driven experimentation. In the setting of multi-class learning, we show that – under stringent but realistic assumptions on the separation between classes – a wide set of algorithms needs far fewer data points than in the more general setting (without assumptions on class separation) to reach the same accuracy. The first part of the thesis develops a framework for efficient and scalable kernel machines. This raises the question of whether our approaches can be used successfully in real-world applications, especially compared to alternatives based on deep learning which are often deemed hard to beat. The second part aims to investigate this question on two main applications, chosen because of the paramount importance of having an efficient algorithm. First, we consider the problem of instance segmentation of images taken from the iCub robot. Here Falkon is used as part of a larger pipeline, but the efficiency afforded by our solver is essential to ensure smooth human-robot interactions. In the second instance, we consider time-series forecasting of wind speed, analysing the relevance of different physical variables on the predictions themselves. We investigate different schemes to adapt i.i.d. learning to the time-series setting. Overall, this work aims to demonstrate, through novel algorithms and examples, that kernel methods are up to computationally demanding tasks, and that there are concrete applications in which their use is warranted and more efficient than that of other, more complex, and less theoretically grounded models.

    Enhancing State Estimator for Autonomous Race Car: Leveraging Multi-modal System and Managing Computing Resources

    This paper introduces an innovative approach to enhance the state estimator for high-speed autonomous race cars, addressing challenges related to unreliable measurements, localization failures, and computing resource management. The proposed robust localization system utilizes a Bayesian-based probabilistic approach to evaluate multimodal measurements, ensuring the use of credible data for accurate and reliable localization, even in harsh racing conditions. To tackle potential localization failures during intense racing, we present a resilient navigation system. This system enables the race car to continue track-following by leveraging direct perception information in planning and execution, ensuring continuous performance despite localization disruptions. Efficient computing resource management is critical to avoid overload and system failure. We optimize computing resources using an efficient LiDAR-based state estimation method. Leveraging CUDA programming and GPU acceleration, we perform nearest points search and covariance computation efficiently, overcoming CPU bottlenecks. Real-world and simulation tests validate the system's performance and resilience. The proposed approach successfully recovers from failures, effectively preventing accidents and ensuring race car safety. Comment: arXiv admin note: text overlap with arXiv:2207.1223
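
The two operations the abstract above moves to the GPU, nearest points search and covariance computation, can be illustrated with a plain CPU reference. This is a brute-force sketch under assumed conventions (points as 3-tuples, population covariance), not the paper's CUDA implementation. The names `nearest_neighbors` and `covariance` are hypothetical.

```python
def nearest_neighbors(points, query, k):
    """Indices of the k points closest to `query`, by brute-force squared
    Euclidean distance (ties keep the original point order)."""
    def dist2(p):
        return sum((a - b) ** 2 for a, b in zip(p, query))
    order = sorted(range(len(points)), key=lambda i: dist2(points[i]))
    return order[:k]


def covariance(points):
    """Population covariance matrix (divide by n) of a list of points."""
    n = len(points)
    dim = len(points[0])
    mean = [sum(p[d] for p in points) / n for d in range(dim)]
    cov = [[0.0] * dim for _ in range(dim)]
    for p in points:
        diff = [p[d] - mean[d] for d in range(dim)]
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += diff[i] * diff[j]
    return [[c / n for c in row] for row in cov]


# A flat patch of four points plus one distant outlier.
cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
         (1.0, 1.0, 0.0), (10.0, 10.0, 10.0)]
idx = nearest_neighbors(cloud, query=(0.4, 0.4, 0.0), k=4)
local_cov = covariance([cloud[i] for i in idx])
```

Each query point runs the same distance loop with no dependence on the other queries, so the work maps onto one GPU thread per query; the zero variance along z in `local_cov` reflects that the selected neighbors lie on a plane.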