H.264/AVC inter prediction on accelerator-based multi-core systems
The AVC video coding standard adopts variable block sizes for inter-frame coding, among other new features, to increase compression efficiency. Consequently, an AVC encoder has to employ a complex, computationally expensive mode decision technique. Several techniques aimed at accelerating the inter prediction process have been proposed in the literature in recent years. Recently, the emergence of many-core processors and accelerators has opened a new way of supporting inter-frame prediction. In this paper, we present a step forward in the implementation of an AVC inter prediction algorithm on a graphics processing unit, using the Compute Unified Device Architecture (CUDA). The results show a negligible drop in rate-distortion performance with a time reduction, on average, of over 98.8% compared with full search and fast full search, and of over 80% compared with UMHexagonS search.
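The full-search baseline mentioned in this abstract can be illustrated with a minimal sketch. It assumes frames stored as lists of rows of integer luma samples and uses the sum of absolute differences (SAD) as the matching cost; the function names are hypothetical, and this sequential Python code is not the paper's CUDA implementation. Each candidate displacement is evaluated independently of the others, which is the property a GPU implementation can exploit.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, size, search_range):
    """Test every displacement in the search window.

    Returns (best_sad, (dx, dy)), or None if no candidate fits in the frame.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates that fall outside the reference frame.
            if by + dy < 0 or by + dy + size > height:
                continue
            if bx + dx < 0 or bx + dx + size > width:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best
```

For a square window of radius r this evaluates up to (2r + 1)^2 candidates per block, which is why exhaustive search is the usual reference point for speed-up figures.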
A Self Organization-Based Optical Flow Estimator with GPU Implementation
This work describes a parallelizable optical flow estimator that uses a modified batch version of the Self-Organizing Map (SOM). This gradient-based estimator handles the ill-posedness of motion estimation via a novel combination of regression and a self-organization strategy. The aperture problem is explicitly modeled using an algebraic framework that partitions the motion estimates obtained from regression into two sets: one (set Hc) with high-confidence estimates and another (set Hp) with low-confidence estimates. The self-organization step uses a uniquely designed pair of sets: the training set (Q = Hc) and the initial weights set (W = Hc ∪ Hp). It is shown that, with this specific choice of training and initial weights sets, the interpolation of flow vectors is achieved primarily through the regularization property of the SOM. Moreover, the computationally involved step of finding the winner unit in the SOM simplifies to indexing into a 2D array, making the algorithm parallelizable and highly scalable. To preserve flow discontinuities at occlusion boundaries, we have designed an anisotropic neighborhood function for the SOM that uses a novel OFCE residual-based distance measure. A multi-resolution (pyramidal) approach is used to estimate large motion. As the algorithm is scalable, with a sufficient number of computing cores (for example, on a GPU) the implementation of the estimator can be made real-time. Error metrics are computed against the true motion available from the Middlebury database.
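The partition into Hc and Hp can be illustrated with a minimal sketch. The abstract does not state the confidence test, so the one used here is an assumption: a window-wise least-squares regression on image gradients, with confidence decided by the smallest eigenvalue of the 2x2 gradient matrix against a hypothetical threshold `tau`. The function names and the normal-flow fallback are likewise illustrative and are not the authors' algebraic framework.

```python
import math


def local_flow(ix, iy, it, tau=1e-2):
    """Least-squares flow for one window from gradient samples.

    ix, iy, it: lists of x, y and temporal derivatives in the window.
    Returns ((u, v), confident).
    """
    a = sum(gx * gx for gx in ix)
    b = sum(gx * gy for gx, gy in zip(ix, iy))
    c = sum(gy * gy for gy in iy)
    p = -sum(gx * gt for gx, gt in zip(ix, it))
    q = -sum(gy * gt for gy, gt in zip(iy, it))
    # Smallest eigenvalue of [[a, b], [b, c]]; near zero means the gradients
    # point in (almost) one direction, i.e. the aperture problem.
    lam_min = (a + c) / 2 - math.sqrt(((a - c) / 2) ** 2 + b * b)
    if lam_min > tau:
        det = a * c - b * b
        return ((c * p - b * q) / det, (a * q - b * p) / det), True
    if a + c == 0:
        return (0.0, 0.0), False  # no gradient information at all
    # Assumed fallback: flow component along the gradient only (normal flow).
    return (p / (a + c), q / (a + c)), False


def partition(windows, tau=1e-2):
    """Split per-window estimates into high-confidence (Hc) and
    low-confidence (Hp) lists."""
    hc, hp = [], []
    for ix, iy, it in windows:
        flow, confident = local_flow(ix, iy, it, tau)
        (hc if confident else hp).append(flow)
    return hc, hp
```

Each window is processed independently, so a step of this shape maps naturally onto one GPU thread per pixel or window.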
Markerless 3D human pose tracking through multiple cameras and AI: Enabling high accuracy, robustness, and real-time performance
Tracking 3D human motion in real-time is crucial for numerous applications
across many fields. Traditional approaches involve attaching artificial
fiducial objects or sensors to the body, limiting their usability and
comfort-of-use and consequently narrowing their application fields. Recent
advances in Artificial Intelligence (AI) have allowed for markerless solutions.
However, most of these methods operate in 2D, while those providing 3D
solutions compromise accuracy and real-time performance. To address this
challenge and unlock the potential of visual pose estimation methods in
real-world scenarios, we propose a markerless framework that combines
multi-camera views and 2D AI-based pose estimation methods to track 3D human
motion. Our approach integrates a Weighted Least Squares (WLS) algorithm that
computes 3D human motion from multiple 2D pose estimations provided by an
AI-driven method. The method is integrated within the Open-VICO framework
allowing simulation and real-world execution. Several experiments have been
conducted, which have shown high accuracy and real-time performance,
demonstrating the high level of readiness for real-world applications and the
potential to revolutionize human motion capture. Comment: 19 pages, 7 figure
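One way a weighted least-squares step can recover a 3D joint from several 2D detections is sketched below. It rests on assumptions not stated in the abstract: each camera's 2D keypoint has already been back-projected to a ray (camera center plus direction), and the weight is a per-detection confidence. The function names are hypothetical, and this is not the Open-VICO implementation. The sketch minimizes the weighted sum of squared distances from the 3D point to the rays.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(a, b):
    """Solve the 3x3 system a x = b with Cramer's rule."""
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("rays are degenerate (e.g. all parallel)")
    solution = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]
        solution.append(det3(m) / d)
    return solution


def triangulate_wls(rays):
    """rays: iterable of (origin, direction, weight) with 3-vectors.

    Solves sum_i w_i (I - d_i d_i^T) x = sum_i w_i (I - d_i d_i^T) o_i.
    """
    a = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for origin, direction, weight in rays:
        norm = math.sqrt(sum(c * c for c in direction))
        d = [c / norm for c in direction]
        for r in range(3):
            for c in range(3):
                # Entry of the projector onto the plane orthogonal to the ray.
                proj = (1.0 if r == c else 0.0) - d[r] * d[c]
                a[r][c] += weight * proj
                b[r] += weight * proj * origin[c]
    return solve3(a, b)
```

Down-weighting a camera reduces its pull on the solution, which is the usual motivation for weighting views by detection confidence.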
Large Scale Kernel Methods for Fun and Profit
Kernel methods are among the most flexible classes of machine learning models with strong theoretical guarantees. Wide classes of functions can be approximated arbitrarily well with kernels, while fast convergence and learning rates have been formally shown to hold. Exact kernel methods are known to scale poorly with increasing dataset size, and we believe that one of the factors limiting their usage in modern machine learning is the lack of scalable and easy-to-use algorithms and software. The main goal of this thesis is to study kernel methods from the point of view of efficient learning, with particular emphasis on large-scale data, but also on low-latency training, and user efficiency. We improve the state of the art for scaling kernel solvers to datasets with billions of points using the Falkon algorithm, which combines random projections with fast optimization. Running it on GPUs, we show how to fully utilize available computing power for training kernel machines. To boost the ease-of-use of approximate kernel solvers, we propose an algorithm for automated hyperparameter tuning. By minimizing a penalized loss function, a model can be learned together with its hyperparameters, reducing the time needed for user-driven experimentation. In the setting of multi-class learning, we show that – under stringent but realistic assumptions on the separation between classes – a wide set of algorithms needs far fewer data points than in the more general setting (without assumptions on class separation) to reach the same accuracy. The first part of the thesis develops a framework for efficient and scalable kernel machines. This raises the question of whether our approaches can be used successfully in real-world applications, especially compared to alternatives based on deep learning which are often deemed hard to beat. The second part aims to investigate this question on two main applications, chosen because of the paramount importance of having an efficient algorithm.
First, we consider the problem of instance segmentation of images taken from the iCub robot. Here Falkon is used as part of a larger pipeline, but the efficiency afforded by our solver is essential to ensure smooth human-robot interactions. Second, we consider time-series forecasting of wind speed, analysing the relevance of different physical variables on the predictions themselves. We investigate different schemes to adapt i.i.d. learning to the time-series setting. Overall, this work aims to demonstrate, through novel algorithms and examples, that kernel methods are up to computationally demanding tasks, and that there are concrete applications in which their use is warranted and more efficient than that of other, more complex, and less theoretically grounded models.
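The idea of restricting a kernel solver to a small set of centers can be sketched as follows. This is a toy illustration under stated assumptions: it treats the random projection as Nyström-style subsampling of centers, uses a Gaussian kernel, and swaps in a direct Gaussian-elimination solve where a large-scale solver would use an iterative method. All function names are hypothetical, and this is not the Falkon library API.

```python
import math
import random


def gaussian_kernel(x, z, sigma):
    sq = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq / (2.0 * sigma ** 2))


def solve(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def sample_centers(xs, m, seed=0):
    """Pick m centers uniformly at random from the training points."""
    return random.Random(seed).sample(xs, m)


def fit(xs, ys, centers, sigma, lam):
    """Solve (K_nm^T K_nm + lam * n * K_mm) alpha = K_nm^T y."""
    n, m = len(xs), len(centers)
    k_nm = [[gaussian_kernel(x, c, sigma) for c in centers] for x in xs]
    k_mm = [[gaussian_kernel(ci, cj, sigma) for cj in centers] for ci in centers]
    lhs = [[sum(k_nm[i][p] * k_nm[i][q] for i in range(n)) + lam * n * k_mm[p][q]
            for q in range(m)] for p in range(m)]
    rhs = [sum(k_nm[i][p] * ys[i] for i in range(n)) for p in range(m)]
    return solve(lhs, rhs)


def predict(x, centers, alpha, sigma):
    return sum(a * gaussian_kernel(x, c, sigma) for a, c in zip(alpha, centers))
```

With m centers and n points, the linear system is m x m instead of n x n, which is where the savings come from when m is much smaller than n.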
Enhancing State Estimator for Autonomous Race Car: Leveraging Multi-modal System and Managing Computing Resources
This paper introduces an innovative approach to enhance the state estimator
for high-speed autonomous race cars, addressing challenges related to
unreliable measurements, localization failures, and computing resource
management. The proposed robust localization system utilizes a Bayesian-based
probabilistic approach to evaluate multimodal measurements, ensuring the use of
credible data for accurate and reliable localization, even in harsh racing
conditions. To tackle potential localization failures during intense racing, we
present a resilient navigation system. This system enables the race car to
continue track-following by leveraging direct perception information in
planning and execution, ensuring continuous performance despite localization
disruptions. Efficient computing resource management is critical to avoid
overload and system failure. We optimize computing resources using an efficient
LiDAR-based state estimation method. Leveraging CUDA programming and GPU
acceleration, we perform nearest-point search and covariance computation
efficiently, overcoming CPU bottlenecks. Real-world and simulation tests
validate the system's performance and resilience. The proposed approach
successfully recovers from failures, effectively preventing accidents and
ensuring race car safety. Comment: arXiv admin note: text overlap with arXiv:2207.1223
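The two operations this abstract offloads to the GPU, nearest-point search and covariance computation, can be sketched in sequential form. The sketch assumes a brute-force search over a small point cloud and a sample covariance over the k nearest neighbors of a query point; pairing the two steps this way, the parameter `k`, and the function names are illustrative assumptions, not the paper's CUDA code.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_nearest(query, cloud, k):
    """Brute-force k-nearest-neighbor search (one distance per cloud point)."""
    return sorted(cloud, key=lambda p: squared_distance(query, p))[:k]


def covariance(points):
    """Unbiased sample covariance matrix of a list of equal-length points."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    dim = len(points[0])
    mean = [sum(p[i] for p in points) / n for i in range(dim)]
    cov = [[0.0] * dim for _ in range(dim)]
    for p in points:
        for r in range(dim):
            for c in range(dim):
                cov[r][c] += (p[r] - mean[r]) * (p[c] - mean[c])
    return [[v / (n - 1) for v in row] for row in cov]
```

Every query point can be handled independently of the others, so on a GPU each one can be assigned to its own thread.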