208 research outputs found
Algorithms & implementation of advanced video coding standards
Advanced video coding standards are widely deployed in products such as broadcast systems, video conferencing, mobile television and Blu-ray Disc. New compression techniques are gradually incorporated into video coding standards, so that a 50% bit-rate reduction is achievable every five years. However, this trend has also brought problems: dramatically increased computational complexity, multiple co-existing standards and steadily growing development time. To address these problems, this thesis investigates efficient algorithms for the latest video coding standard, H.264/AVC. Two aspects of the H.264/AVC standard are examined: (1) speeding up intra4x4 prediction with a parallel architecture; (2) applying an efficient rate control algorithm based on a deviation measure to intra frames. Another aim of this thesis is to develop low-complexity algorithms for an MPEG-2 to H.264/AVC transcoder. The thesis focuses on three main mapping algorithms and one computational complexity reduction algorithm: motion vector mapping, block mapping, field-frame mapping and an efficient mode ranking algorithm. Finally, a new video coding framework methodology for reducing development time is examined. This thesis explores the implementation of the MPEG-4 Simple Profile with the RVC framework. A key problem, automatically generating the variable length decoder table, is solved in this thesis. Moreover, another important video coding standard, DV/DVCPRO, is modeled with the RVC framework. Consequently, besides the available MPEG-4 Simple Profile and the China audio/video standard, a new member is added to the RVC framework family. Part of the research work presented in this thesis targets algorithms and implementation of video coding standards. Within this wide topic, three main problems are investigated. The results show that the methodologies presented in this thesis are efficient and encourage
Adaptive video delivery using semantics
The spread of network appliances such as cellular phones, personal digital assistants and hand-held computers has created the need to personalize the way media content is delivered to the end user. Moreover, recent devices, such as digital radio receivers with graphics displays, and new applications, such as intelligent visual surveillance, require novel forms of video analysis for content adaptation and summarization. To cope with these challenges, we propose an automatic method for extracting semantics from video, and we present a framework that exploits these semantics to provide adaptive video delivery. First, an algorithm that relies on motion information to extract multiple semantic video objects is proposed. The algorithm operates in two stages. In the first stage, a statistical change detector segments moving objects from the background. This process is robust to camera noise and does not need manual tuning along a sequence or for different sequences. In the second stage, feedback between an object partition and a region partition is used to track individual objects across frames. These interactions allow us to cope with multiple, deformable objects, occlusions, splitting, appearance and disappearance of objects, and complex motion. Subsequently, semantics are used to prioritize visual data in order to improve the performance of adaptive video delivery. The idea behind this approach is to organize the content so that a particular network or device does not inhibit the main content message. Specifically, we propose two new video adaptation strategies. The first strategy combines semantic analysis with a traditional frame-based video encoder. Background simplifications resulting from this approach do not penalize overall quality at low bitrates. The second strategy uses metadata to efficiently encode the main content message. The metadata-based representation of the objects' shape and motion suffices to convey the meaning and action of a scene when the objects are familiar. The impact of the different video adaptation strategies is then quantified with subjective experiments. We ask a panel of human observers to rate the quality of adapted video sequences on a normalized scale. From these results, we further derive an objective quality metric, the semantic peak signal-to-noise ratio (SPSNR), that accounts for different image areas and for their relevance to the observer in order to reflect the focus of attention of the human visual system. Finally, we determine the adaptation strategy that provides maximum value for the end user by maximizing the SPSNR for given client resources at the time of delivery. By combining semantic video analysis and adaptive delivery, the solution presented in this dissertation permits the distribution of video in complex media environments and supports a large variety of content-based applications.
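The abstract does not give the SPSNR formula. As a rough illustration of the general idea of weighting image areas by their relevance to the observer, the sketch below computes a PSNR from a relevance-weighted mean squared error. The function name `weighted_psnr`, the weight map and the 8-bit peak value are assumptions made for illustration, not the dissertation's definition.

```python
import math


def weighted_psnr(reference, adapted, weights, peak=255.0):
    """PSNR computed from a per-pixel weighted mean squared error.

    reference, adapted: 2-D lists of pixel intensities with the same shape.
    weights: 2-D list of non-negative relevance weights (hypothetical; for
             example larger inside semantic objects, smaller in background).
    peak: maximum pixel value (255 assumes 8-bit samples).
    """
    weighted_error = 0.0
    total_weight = 0.0
    for ref_row, ada_row, w_row in zip(reference, adapted, weights):
        for r, a, w in zip(ref_row, ada_row, w_row):
            weighted_error += w * (r - a) ** 2
            total_weight += w
    if total_weight == 0:
        raise ValueError("weights must not all be zero")
    wmse = weighted_error / total_weight
    if wmse == 0:
        return math.inf  # identical frames under this weighting
    return 10.0 * math.log10(peak ** 2 / wmse)


# Illustrative 2x2 frame: the only error sits in a low-weight (background) pixel.
reference = [[100, 100], [100, 100]]
adapted = [[100, 110], [100, 100]]
uniform = [[1, 1], [1, 1]]
object_focused = [[4, 1], [4, 1]]  # left column treated as the object

score_uniform = weighted_psnr(reference, adapted, uniform)
score_semantic = weighted_psnr(reference, adapted, object_focused)
```

With these made-up values, an error confined to the background is penalized less under the object-focused weights than under uniform weights, which is the behaviour a relevance-aware metric is meant to capture.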
CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap
After surveying the state of the art and establishing the existing landscape in multimedia search engines
during the first year of Chorus, we identified and analyzed gaps within the European research effort during our second year.
In this period we focused on three directions: technological issues, user-centred issues and use cases, and
socio-economic and legal aspects. These were assessed by two central studies: first, a concerted vision of the functional breakdown
of a generic multimedia search engine, and second, representative use-case descriptions with a related discussion of
the requirements for technological challenges. Both studies were carried out in cooperation and consultation with the
community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our
Think-Tank, presentations at international conferences, and surveys addressed to EU project coordinators as well as
national initiative coordinators. Based on the feedback obtained, we identified two types of gaps: core
technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research
challenges but have an impact on innovation progress. New socio-economic trends are presented, as well as emerging legal
challenges.
Video QoS/QoE over IEEE802.11n/ac: A Contemporary Survey
The demand for video applications over wireless networks has increased tremendously, and the IEEE 802.11 standards have provided progressively better support for video transmission. However, providing Quality of Service (QoS) and Quality of Experience (QoE) for video over WLAN remains a challenge because compressed video is sensitive to errors and wireless channels are dynamic. This thesis presents a contemporary survey of issues and solutions for video QoS/QoE over WLAN. The objective of the study is to provide an overview of the issues through a background study of video codecs and their features and characteristics, followed by a study of QoS and QoE support in the IEEE 802.11 standards. Since IEEE 802.11n is the most widely deployed current standard and IEEE 802.11ac is the upcoming standard, this survey investigates the most recent video QoS/QoE solutions based on these two standards. The solutions are divided into two broad categories: academic solutions and vendor solutions. Academic solutions are mostly based on three main layers, namely Application, Media Access Control (MAC) and Physical (PHY), and are further divided into two major categories: single-layer solutions and cross-layer solutions. Single-layer solutions focus on a single layer to enhance video transmission performance over WLAN. Cross-layer solutions involve two or more layers to provide a single QoS solution for video over WLAN. This thesis also presents and technically analyzes QoS solutions from three popular vendors. The thesis concludes that single-layer solutions are not directly related to video QoS/QoE, and that cross-layer solutions perform better than single-layer solutions but are much more complicated and not easy to implement. Most vendors rely on their network infrastructure to provide QoS for multimedia applications. They have their own techniques and mechanisms, but the concept of providing QoS/QoE for video is almost the same because they use the same standards and rely on Wi-Fi Multimedia (WMM) to provide QoS.
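WMM, which the abstract names as the mechanism vendors rely on, sorts frames into four access categories according to their 802.1D user priority. A minimal sketch of that lookup follows. The function name `access_category` is illustrative, and the table reflects the commonly documented WMM default mapping rather than anything stated in the thesis.

```python
# Commonly documented WMM default mapping from 802.1D user priority (0-7)
# to access category. Vendor equipment may remap priorities, so treat this
# table as an assumption rather than a description of any specific product.
UP_TO_AC = {
    1: "AC_BK",  # background
    2: "AC_BK",
    0: "AC_BE",  # best effort
    3: "AC_BE",
    4: "AC_VI",  # video
    5: "AC_VI",
    6: "AC_VO",  # voice
    7: "AC_VO",
}


def access_category(user_priority):
    """Return the WMM access category for an 802.1D user priority."""
    if user_priority not in UP_TO_AC:
        raise ValueError("user priority must be an integer from 0 to 7")
    return UP_TO_AC[user_priority]


# A frame tagged with user priority 5 is queued in the video category.
video_queue = access_category(5)
```

Under this mapping, a video stream only receives prioritized channel access if its frames carry user priority 4 or 5 when they reach the access point.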
Design Space Exploration of Accelerators for Warehouse Scale Computing
With Moore’s law grinding to a halt, accelerators are one of the ways that new silicon can improve performance, and they are already a key component in modern datacenters. Accelerators are integrated circuits that implement parts of an application with the objective of higher energy efficiency compared to execution on a standard general purpose CPU. Many accelerator designs can target any particular workload, generally spanning a wide range of performance and of costs such as area or power. Exploring these design choices, called Design Space Exploration (DSE), is a crucial step in trying to find the most efficient accelerator design, the one that produces the largest reduction of the total cost of ownership.
This work aims to improve this design space exploration phase for accelerators and to avoid pitfalls in the process. This dissertation supports the thesis that early design choices – including the level of specialization – are critical for accelerator development and therefore require benchmarks reflective of production workloads. We present three studies that support this thesis. First, we show how to benchmark datacenter applications by creating a benchmark for large video sharing infrastructures. Then, we present two studies focused on accelerators for analytical query processing. The first is an analysis of the impact of Network on Chip specialization, while the second analyzes the impact of the level of specialization.
The first part of this dissertation introduces vbench: a video transcoding benchmark tailored to the growing video-as-a-service market. Video transcoding is not accurately represented in current computer architecture benchmarks such as SPEC or PARSEC. Despite posing a big computational burden for cloud video providers, such as YouTube and Facebook, it is not included in cloud benchmarks such as CloudSuite. Using vbench, we found that the microarchitectural profile of video transcoding is highly dependent on the input video, that SIMD extensions provide limited benefits, and that commercial hardware transcoders impose tradeoffs that are not ideal for cloud video providers. Our benchmark should spur architectural innovations for this critical workload. This work shows how to benchmark a real world warehouse scale application and the possible pitfalls in case of a mischaracterization.
When considering accelerators for the different, but no less important, application of analytical query processing, design space exploration plays a critical role. We analyzed the Q100, a class of accelerators for this application domain, using TPC-H as the reference benchmark. We found that not only must the hardware computational blocks be tailored to the requirements of the application, but the Network on Chip (NoC) can also be specialized. We developed an algorithm capable of producing more effective Q100 designs by tailoring the NoC to the communication requirements of the system. Our algorithm is capable of producing designs that are Pareto optimal compared to standard NoC topologies. This shows that NoC specialization is highly effective for accelerators and should be an integral part of design space exploration for large accelerator designs.
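The notion of Pareto optimality used above can be made concrete with a small filter over candidate designs. This is a generic sketch, not the dissertation's NoC algorithm: the design names and the (area, runtime) cost values are made up for illustration, and lower is assumed to be better on every axis.

```python
def dominates(a, b):
    """True if cost tuple a is no worse than b on every axis and better on one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(designs):
    """Return the designs that no other design dominates.

    designs: dict mapping a design name to a tuple of costs to minimize.
    """
    front = {}
    for name, cost in designs.items():
        dominated = any(
            dominates(other_cost, cost)
            for other_name, other_cost in designs.items()
            if other_name != name
        )
        if not dominated:
            front[name] = cost
    return front


# Hypothetical (area, runtime) costs; none of these numbers come from the thesis.
designs = {
    "mesh": (10.0, 5.0),
    "ring": (8.0, 7.0),
    "bus": (12.0, 9.0),
    "custom_a": (7.0, 6.0),
    "custom_b": (9.0, 4.0),
}
front = pareto_front(designs)
```

In this toy data the two custom designs form the front because each standard topology is beaten on both axes by at least one of them, while neither custom design beats the other on both.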
The third part of this dissertation analyzes the impact of the level of specialization, e.g. using an ASIC or Coarse Grain Reconfigurable Architecture (CGRA) implementation, on an accelerator’s performance. We developed a CGRA architecture capable of executing SQL query plans. We compare this architecture against Q100, an ASIC that targets the same class of workloads. Despite being less specialized, this programmable architecture shows comparable performance to the Q100 given an area and power budget. Resource usage explains this counterintuitive result, since a well programmed, homogeneous array of resources is able to more effectively harness silicon for the workload at hand. This suggests that a balanced accelerator research portfolio must include alternative programmable architectures – and their software stacks.
People tracking with drones in smart spaces (Seguimento de pessoas com drones em espaços inteligentes)
Technological progress over the last decades in the field
of Computer Vision has introduced new methods and algorithms with
ever increasing performance. In particular, the emergence of
machine learning algorithms has enabled class-based object detection on
live video feeds. Alongside these advances, Unmanned Aerial Vehicles (more commonly known as drones) have also seen advances in both hardware miniaturization and software optimization. Thanks to these improvements, drones have moved beyond their
military origins and are now used by both the general
public and the scientific community for applications as distinct as aerial
photography and environmental monitoring.
This dissertation aims to take advantage of these recent technological
advancements and apply state-of-the-art machine learning algorithms
in order to create an Unmanned Aerial Vehicle (UAV) based network
architecture capable of performing real-time people tracking through
image detection.
To perform object detection, two distinct machine learning algorithms
are presented. The first uses an SVM-based approach, while the
second uses a Convolutional Neural Network (CNN) based architecture. Both methods will be evaluated using an image dataset
created for the purposes of this dissertation’s work.
The evaluations of the object detectors’ performance
showed that the method using a CNN-based architecture was the best
in terms of both required processing time and detection accuracy, and
was therefore the most suitable method for our implementation.
The developed network architecture was tested in a live scenario, with the results showing that the system is capable of performing
people tracking at average walking speeds.
Recent technological progress over the last decades in the
field of Computer Vision has introduced new methods and algorithms with ever higher performance. In particular,
the creation of machine learning algorithms has made possible
object detection applied to video feeds captured in real
time. In parallel with this progress, the technology of unmanned aerial vehicles,
or drones, has also benefited from advances both in the
miniaturization of hardware components and in software optimization. Thanks to these improvements, drones have emerged
from their military past and are now used both by the general public
and by the scientific community for applications as distinct as
photography and environmental monitoring.
The aim of this dissertation is to take advantage of these recent technological advances and apply state-of-the-art machine learning algorithms to create a system capable of
automatically tracking people with drones through computer
vision.
To perform object detection, two distinct machine learning algorithms are presented. The first uses an
approach based on Support Vector Machine (SVM), while the
second is characterized by an architecture based on Convolutional Neural Networks. Both methods will be evaluated using an
image database created for the purposes of this dissertation.
The evaluations of the object detection algorithms' performance showed that the method based on a Convolutional Neural Network architecture was the best both in
average processing time and in detection accuracy, thus proving to be the most suitable method
for the intended objectives.
The developed system was tested in a real context, with the results showing that the system is capable of
tracking people at speeds comparable to a normal human
walking pace.
Master's in Electronics and Telecommunications Engineering