2,270 research outputs found

    Design and implementation of an efficient hardware integer motion estimator for an HEVC video encoder

    High-Efficiency Video Coding (HEVC) was developed to improve on its predecessor standard, H.264/AVC, by doubling its compression efficiency. As in previous standards, Motion Estimation (ME) is one of the encoder's critical blocks for achieving significant compression gains. However, accurately removing temporal redundancy in video carries a very high complexity cost, especially when encoding very high-resolution video sequences. To reduce the overall video encoding time, we propose implementing the HEVC ME block in hardware. The proposed architecture is based on (a) a new memory scan order and (b) a new adder tree structure, which supports asymmetric partitioning modes in a fast and efficient way. The proposed system has been designed in VHDL (VHSIC Hardware Description Language), then synthesized and implemented on a Xilinx Virtex-7 XC7VX550T-3FFG1158 FPGA. Our design achieves encoding frame rates of up to 116 and 30 fps at 2K and 4K video formats, respectively.
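
The abstract above does not describe the adder tree itself, so the following is only a rough software sketch of the general idea that sums of absolute differences (SADs) computed once for small blocks can be reused for every partition shape, including the asymmetric ones. The function names, the 16x16 coding unit, and the 4x4 leaf size are assumptions made for illustration; they are not taken from the paper.

```python
def block_sad(cur, ref, x0, y0, size):
    """Sum of absolute differences over one size x size block."""
    return sum(
        abs(cur[y][x] - ref[y][x])
        for y in range(y0, y0 + size)
        for x in range(x0, x0 + size)
    )


def partition_sads(cur, ref, n=16, leaf=4):
    """Reuse leaf SADs to obtain the SAD of each partition of an n x n block.

    Hypothetical helper: leaf SADs are computed once, then only additions
    are needed for symmetric and asymmetric partition modes.
    """
    if n % (4 * leaf) != 0:
        raise ValueError("n must be a multiple of 4 * leaf")
    g = n // leaf  # leaf blocks per side
    leaf_sad = [
        [block_sad(cur, ref, c * leaf, r * leaf, leaf) for c in range(g)]
        for r in range(g)
    ]
    # First adder level: one partial sum per leaf row and per leaf column.
    row = [sum(leaf_sad[r]) for r in range(g)]
    col = [sum(leaf_sad[r][c] for r in range(g)) for c in range(g)]
    q, h = g // 4, g // 2  # quarter and half of the block, in leaf units
    return {
        "2Nx2N": (sum(row),),
        "2NxN": (sum(row[:h]), sum(row[h:])),
        "Nx2N": (sum(col[:h]), sum(col[h:])),
        # Asymmetric modes split the block at one quarter or three quarters.
        "2NxnU": (sum(row[:q]), sum(row[q:])),
        "2NxnD": (sum(row[:g - q]), sum(row[g - q:])),
        "nLx2N": (sum(col[:q]), sum(col[q:])),
        "nRx2N": (sum(col[:g - q]), sum(col[g - q:])),
    }
```

Calling `partition_sads(cur, ref)` on two 16x16 lists of pixel rows returns one tuple of SADs per partition mode. In hardware the additions would be arranged as a tree evaluated in parallel, whereas this sketch evaluates them sequentially.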

    Fast Motion Estimation Algorithms for Block-Based Video Coding Encoders

    The objective of my research is to reduce the complexity of video coding standards in real-time scalable and multi-view applications.

    Radar and RGB-depth sensors for fall detection: a review

    This paper reviews recent works in the literature on the use of systems based on radar and RGB-Depth (RGB-D) sensors for fall detection, and discusses outstanding research challenges and trends in this field. Systems that reliably detect fall events and promptly alert carers and first responders have gained significant interest in the past few years, in order to address the societal issue of an increasing number of elderly people living alone, with the associated risk of falling and the consequences in terms of health treatments, reduced well-being, and costs. The interest in radar and RGB-D sensors is related to their capability to enable contactless and non-intrusive monitoring, which is an advantage for practical deployment and for users’ acceptance and compliance compared with other sensor technologies, such as video cameras or wearables. Furthermore, the possibility of combining and fusing information from heterogeneous types of sensors is expected to improve the overall performance of practical fall detection systems. Researchers from different fields can benefit from multidisciplinary knowledge and awareness of the latest developments in radar and RGB-D sensors discussed in this paper.

    Towards visualization and searching: a dual-purpose video coding approach

    In modern video applications, the role of the decoded video is much more than filling a screen for visualization. To offer powerful video-enabled applications, it is increasingly critical not only to visualize the decoded video but also to provide efficient searching capabilities for similar content. Video surveillance and personal communication applications are critical examples of these dual visualization and searching requirements. However, current video coding solutions are strongly biased towards visualization needs. In this context, the goal of this work is to propose a dual-purpose video coding solution targeting both visualization and searching needs by adopting a hybrid coding framework where the usual pixel-based coding approach is combined with a novel feature-based coding approach. In this novel dual-purpose video coding solution, some frames are coded using a set of keypoint matches, which not only allow decoding for visualization but also provide the decoder with valuable feature-related information, extracted at the encoder from the original frames, that is instrumental for efficient searching. The proposed solution is based on a flexible joint Lagrangian optimization framework where pixel-based and feature-based processing are combined to find the most appropriate trade-off between visualization and searching performance. Extensive experimental results for the assessment of the proposed dual-purpose video coding solution under meaningful test conditions are presented. The results show the flexibility of the proposed coding solution to achieve different optimization trade-offs, notably competitive performance relative to the state-of-the-art HEVC standard in terms of both visualization and searching performance.

    Motion Scalability for Video Coding with Flexible Spatio-Temporal Decompositions

    The research presented in this thesis aims to extend the scalability range of wavelet-based video coding systems in order to achieve fully scalable coding with a wide range of available decoding points. Since temporal redundancy regularly comprises the main portion of the global video sequence redundancy, the techniques that can be generally termed motion decorrelation techniques have a central role in the overall compression performance. For this reason, scalable motion modelling and coding are of utmost importance, and possible solutions are identified and analysed in this thesis. The main contributions of the presented research are grouped into two interrelated and complementary topics. Firstly, a flexible motion model with a rate-optimised estimation technique is introduced. The proposed motion model is based on tree structures and allows the high adaptability needed for layered motion coding. The flexible structure for motion compensation allows for optimisation at different stages of the adaptive spatio-temporal decomposition, which is crucial for scalable coding that targets decoding at different resolutions. By utilising an adaptive choice of wavelet filterbank, the model enables high compression based on efficient mode selection. Secondly, solutions for scalable motion modelling and coding are developed. These solutions are based on precision limiting of motion vectors and on the creation of a layered motion structure that describes hierarchically coded motion. The solution based on precision limiting relies on layered bit-plane coding of motion vector values. The second solution builds on recently established techniques that impose scalability on a motion structure. The new approach is based on two major improvements: the evaluation of distortion in temporal subbands, and motion search in temporal subbands that finds the optimal motion vectors for the layered motion structure. Exhaustive tests of rate-distortion performance in demanding scalable video coding scenarios show the benefits of applying both the developed flexible motion model and the various solutions for scalable motion coding.
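
As a hedged illustration of what precision limiting through layered bit-plane coding can look like, the sketch below splits one motion vector component into bit-planes, most significant first, and reconstructs it from only the first few layers. The sign-magnitude layout, the function names, and the fixed plane count are assumptions for this example; the abstract does not specify the thesis's actual scheme.

```python
def to_bitplanes(value, num_planes):
    """Split one motion vector component into a sign and magnitude bit-planes.

    Planes are ordered most significant first, so earlier layers carry the
    coarse part of the vector. Hypothetical layout for illustration.
    """
    sign = -1 if value < 0 else 1
    magnitude = abs(value)
    if magnitude >= (1 << num_planes):
        raise ValueError("value does not fit in num_planes bits")
    planes = [(magnitude >> p) & 1 for p in range(num_planes - 1, -1, -1)]
    return sign, planes


def from_bitplanes(sign, planes, layers):
    """Rebuild the component from the first `layers` bit-planes only.

    Missing low-order planes are treated as zero, which limits precision.
    """
    num_planes = len(planes)
    magnitude = 0
    for i in range(min(layers, num_planes)):
        magnitude |= planes[i] << (num_planes - 1 - i)
    return sign * magnitude
```

For example, a component of -13 (in whatever sub-pixel unit the codec uses) stored in 5 planes decodes exactly with all 5 layers, and to a coarser value when the lower planes are dropped.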

    Low-power and application-specific SRAM design for energy-efficient motion estimation

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012, by Mahmut Ersin Sinangil. Cataloged from PDF version of thesis. Includes bibliographical references (p. 181-189). Video content is expected to account for 70% of total mobile data traffic in 2015. High-efficiency video coding, in this context, is crucial for lowering the transmission and storage costs for portable electronics. However, modern video coding standards impose a large hardware complexity. Hence, the energy efficiency of these hardware blocks is becoming more critical than ever for mobile devices. SRAMs are critical components in almost all SoCs and affect the overall energy efficiency. This thesis focuses on algorithm and architecture development as well as low-power and application-specific SRAM design targeting motion estimation. First, a motion estimation design is considered for the next-generation video standard, HEVC. Hardware cost and coding efficiency trade-offs are quantified, and an optimum design choice between hardware complexity and coding efficiency is proposed. A hardware-efficient search algorithm, a shared search range across CU engines, and pixel pre-fetching algorithms provide 4.3x area, 56x on-chip bandwidth, and 151x off-chip bandwidth reduction. Second, a highly parallel motion estimation design targeting ultra-low-voltage operation and supporting the AVC/H.264 and VC-1 standards is considered. Hardware reconfigurability along with frame-parallel and macro-block-parallel processing are implemented for this engine to maximize hardware sharing between multiple standards and to meet throughput constraints. Third, in the context of low-power SRAMs, a 6T and an 8T SRAM are designed in 28nm and 45nm CMOS technologies targeting low-voltage operation. The 6T design achieves operation down to 0.6V and the 8T design achieves operation down to 0.5V, providing ~2.8x and ~4.8x reduction in energy/access, respectively. Finally, an application-specific SRAM design targeted at motion estimation is developed. By utilizing the correlation of pixel data to reduce bit-line switching activity, this SRAM achieves up to 1.9x energy savings compared to a similar conventional 8T design. These savings demonstrate that application-specific SRAM design can introduce a new dimension and can be combined with voltage scaling to maximize energy efficiency.

    Fast motion estimation algorithms for block-based video coding encoders

    The objective of my research is to reduce the complexity of video coding standards in real-time scalable and multi-view applications. EThOS - Electronic Theses Online Service, GB, United Kingdom.

    Joint CFO Estimation and Data Detection in OFDM systems

    Orthogonal frequency division multiplexing (OFDM) is a multicarrier modulation technique that is widely used in wireless broadband communication systems. The spectral efficiency of OFDM is very high since the subcarriers are spaced as closely as possible while maintaining orthogonality. However, one of the major problems with OFDM that can cause performance degradation is carrier frequency offset (CFO), which impairs the orthogonality among OFDM subcarriers and, as a consequence, results in inter-subcarrier interference. In this thesis, an iterative algorithm for joint CFO estimation and data detection in OFDM systems over frequency-selective channels is proposed. The proposed algorithm performs both CFO estimation and data detection in the frequency domain, based on the Expectation-Maximization (EM) algorithm. It can achieve the same bit-error-rate (BER) performance as its time-domain counterpart with much lower complexity. Simulation results show that the proposed algorithm can converge after three iterations and that an estimate of the CFO can be obtained with high accuracy.
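
To make the effect of CFO on subcarrier orthogonality concrete, here is a small numerical sketch. It is not the thesis's EM-based estimator; it only shows, for a single noiseless tone on an ideal channel, how much power leaks from one subcarrier into the others when the offset is a fraction of the subcarrier spacing. The function names and the normalized offset `eps` are assumptions made for this illustration.

```python
import cmath


def received_tone(n, k0, eps):
    """Time samples of subcarrier k0 shifted by a normalized CFO eps.

    eps is the offset expressed as a fraction of the subcarrier spacing.
    """
    return [cmath.exp(2j * cmath.pi * (k0 + eps) * i / n) for i in range(n)]


def dft_bin(samples, k):
    """Single DFT output for subcarrier k, normalized by the block length."""
    n = len(samples)
    return sum(
        s * cmath.exp(-2j * cmath.pi * k * i / n) for i, s in enumerate(samples)
    ) / n


def ici_power(n, k0, eps):
    """Total power that leaks from subcarrier k0 into all other subcarriers."""
    rx = received_tone(n, k0, eps)
    return sum(abs(dft_bin(rx, k)) ** 2 for k in range(n) if k != k0)
```

With `eps = 0` the leakage is zero up to rounding error, because the subcarriers are orthogonal over one block. With a nonzero `eps` the same call returns a positive value, which is the inter-subcarrier interference the abstract refers to.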

    Reactive traffic control mechanisms for communication networks with self-similar bandwidth demands

    Communication network architectures are in the process of being redesigned so that many different services are integrated within the same network. Due to this integration, traffic management algorithms need to balance the requirements of the traffic which the algorithms are directly controlling with the Quality of Service (QoS) requirements of other classes of traffic which will be encountered in the network. Of particular interest is one class of traffic, termed elastic traffic, that responds to dynamic feedback from the network regarding the amount of available resources within the network. Examples of this type of traffic include the Available Bit Rate (ABR) service in Asynchronous Transfer Mode (ATM) networks and connections using Transmission Control Protocol (TCP) in the Internet. Both examples aim to utilise available bandwidth within a network. Reactive traffic management, like that which occurs in the ABR service and TCP, depends explicitly on the dynamic bandwidth requirements of other traffic which is currently using the network. In particular, there is significant evidence that a wide range of network traffic, including Ethernet, World Wide Web, Variable Bit Rate video and signalling traffic, is self-similar. The term self-similar refers to the particular characteristic of network traffic to remain bursty over a wide range of time scales. A closely associated characteristic of self-similar traffic is its long-range dependence (LRD), which refers to the significant correlations that occur within the traffic. By utilising these correlations, greater predictability of network traffic can be achieved, and hence the performance of reactive traffic management algorithms can be enhanced. A predictive rate control algorithm, called PERC (Predictive Explicit Rate Control), is proposed in this thesis which is targeted to the ABR service in ATM networks. 
By incorporating the LRD stochastic structure of background traffic, measurements of the bandwidth requirements of background traffic, and the delay associated with a particular ABR connection, a predictive algorithm is defined which provides explicit rate information that is conveyed to ABR sources. An enhancement to PERC is also described. This algorithm, called PERC+, uses previous control information to correct prediction errors that occur for connections with larger round-trip delay. These algorithms have been extensively analysed with regard to their network performance, and simulation results show that queue lengths and cell loss rates are significantly reduced when these algorithms are deployed. An adaptive version of PERC has also been developed using real-time parameter estimates of self-similar traffic. This has excellent performance compared with standard ABR rate control algorithms such as ERICA. Since PERC and its enhancement PERC+ explicitly utilise the index of self-similarity, known as the Hurst parameter, the sensitivity of these algorithms to this parameter can be determined analytically. Research work described in this thesis shows that the algorithms have an asymmetric sensitivity to the Hurst parameter, with significant sensitivity in the region where the parameter is underestimated as being close to 0.5. Simulation results reveal the same bias in the performance of the algorithm with regard to the Hurst parameter. In contrast, PERC is insensitive to estimates of the mean, using the sample mean estimator, and estimates of the traffic variance, which is due to the algorithm primarily utilising the correlation structure of the traffic to predict future bandwidth requirements. Sensitivity analysis falls into the area of investigative research, but it naturally leads to the area of robust control, where algorithms are designed so that uncertainty in traffic parameter estimation or modelling can be accommodated. 
An alternative robust design approach to the standard maximum entropy approach is proposed in this thesis, which uses the maximum likelihood function to develop the predictive rate controller. The likelihood function defines the proximity of a specific traffic model to the traffic data, and hence gives a measure of the performance of a chosen model. Maximising the likelihood function leads to optimising robust performance, and it is shown, through simulations, that the system performance is close to the optimal performance as compared with maximising the spectral entropy. There is still debate regarding the influence of LRD on network performance. This thesis also considers the question of the influence of LRD on traffic predictability, and demonstrates that predictive rate control algorithms that only use short-term correlations have close performance to algorithms that utilise long-term correlations. It is noted that predictors based on LRD still out-perform ones which use short-term correlations, but that there is potential for simplification in the design of predictors, since traffic predictability can be achieved using short-term correlations. This thesis forms a substantial contribution to the understanding of control in the case where self-similar processes form part of the overall system. Rather than doggedly pursuing self-similar control, a broader view has been taken where the performance of algorithms has been considered from a number of perspectives. A number of different research avenues lead on from this work, and these are outlined.
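
The role of the Hurst parameter in predictability can be illustrated with the standard autocorrelation function of fractional Gaussian noise, r(k) = 0.5 (|k+1|^(2H) - 2|k|^(2H) + |k-1|^(2H)). Using this particular traffic model is an assumption made here for illustration, since the abstract does not state which self-similar model the thesis adopts, and the function name is hypothetical.

```python
def fgn_autocorr(lag, hurst):
    """Autocorrelation of fractional Gaussian noise at an integer lag.

    Assumed model for illustration: 0 < hurst < 1, with hurst = 0.5
    corresponding to uncorrelated increments.
    """
    if not 0.0 < hurst < 1.0:
        raise ValueError("hurst must lie strictly between 0 and 1")
    k = abs(lag)
    e = 2.0 * hurst
    return 0.5 * ((k + 1) ** e - 2.0 * k ** e + abs(k - 1) ** e)
```

At H = 0.5 every nonzero lag has zero correlation, so a predictor that relies on the correlation structure has nothing to work with. For H above 0.5 the correlations stay positive and decay slowly with lag, which gives some intuition for why underestimating H as close to 0.5 is the costly direction described in the abstract.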

    Insight into the fundamental trade-offs of diffusion MRI from polarization-sensitive optical coherence tomography in ex vivo human brain

    In the first study comparing high angular resolution diffusion MRI (dMRI) in the human brain to axonal orientation measurements from polarization-sensitive optical coherence tomography (PSOCT), we compare the accuracy of orientation estimates from various dMRI sampling schemes and reconstruction methods. We find that, if the reconstruction approach is chosen carefully, single-shell dMRI data can yield the same accuracy as multi-shell data, and only moderately lower accuracy than a full Cartesian-grid sampling scheme. Our results suggest that current dMRI reconstruction approaches do not benefit substantially from ultra-high b-values or from very large numbers of diffusion-encoding directions. We also show that accuracy remains stable across dMRI voxel sizes of 1 mm or smaller but degrades at 2 mm, particularly in areas of complex white-matter architecture. We also show that, as the spatial resolution is reduced, axonal configurations in a dMRI voxel can no longer be modeled as a small set of distinct axon populations, violating an assumption that is sometimes made by dMRI reconstruction techniques. Our findings have implications for in vivo studies and illustrate the value of PSOCT as a source of ground-truth measurements of white-matter organization that does not suffer from the distortions typical of histological techniques.