Image and Video Coding Techniques for Ultra-low Latency
The next generation of wireless networks fosters the adoption of latency-critical applications such as XR, connected industry, or autonomous driving. This survey gathers implementation aspects of different image and video coding schemes and discusses their tradeoffs. Standardized video coding technologies such as HEVC or VVC provide a high compression ratio, but their enormous complexity sets the scene for alternative approaches like still image, mezzanine, or texture compression in scenarios with tight resource or latency constraints. Regardless of the coding scheme, we identify inter-device memory transfers and the lack of sub-frame coding as limitations of current full-system and software-programmable implementations.
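As a back-of-the-envelope illustration of why sub-frame (e.g., slice-based) coding matters at these latencies, consider the per-frame time budget at common frame rates; the frame rates and slice count below are illustrative assumptions, not figures from the survey:

```python
# Illustrative latency-budget arithmetic: the per-frame time budget bounds
# how long capture + encode + transmit + decode may take end to end.

def frame_budget_ms(fps: float) -> float:
    """Time available per frame in milliseconds."""
    return 1000.0 / fps

def first_bits_latency_ms(fps: float, slices_per_frame: int) -> float:
    """With sub-frame (slice-based) coding, transmission can start after the
    first slice is encoded instead of waiting for the full frame."""
    return frame_budget_ms(fps) / slices_per_frame

for fps in (60, 90, 120):
    print(f"{fps:>3} fps: frame budget {frame_budget_ms(fps):5.2f} ms, "
          f"first slice ready after {first_bits_latency_ms(fps, 8):4.2f} ms (8 slices)")
```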
Need for speed: Achieving fast image processing in acute stroke care
This thesis aims to investigate the use of high-performance computing (HPC) techniques in developing imaging biomarkers to support the clinical workflow of acute stroke patients. In the first part of this thesis, we evaluate different HPC technologies and how they can be leveraged by the image analysis applications used in acute stroke care. More specifically, Chapter 2 evaluates how computers with multiple computing devices can be used to accelerate medical imaging applications. Chapter 3 proposes a novel data compression technique that allows CT perfusion (CTP) images to be processed efficiently on GPUs; the size of CTP datasets otherwise makes data transfers to computing devices time-consuming and, therefore, unsuitable in acute situations. Chapter 4 further evaluates the usefulness of the algorithm proposed in Chapter 3 with two different applications: a double-threshold segmentation and a time-intensity profile similarity (TIPS) bilateral filter to reduce noise in CTP scans. Finally, Chapter 5 presents a cloud platform for deploying high-performance medical applications for acute stroke patients. In Part 2 of this thesis, Chapter 6 presents a convolutional neural network (CNN) for the detection and volumetric segmentation of subarachnoid hemorrhages (SAH) in non-contrast CT scans. Chapter 7 proposes another CNN-based method to quantify final infarct volumes in follow-up non-contrast CT scans from ischemic stroke patients.
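For intuition, a minimal NumPy sketch of a double-threshold segmentation of the kind evaluated in Chapter 4; the thresholds, data, and function name are placeholders, not the thesis's GPU implementation:

```python
import numpy as np

def double_threshold_segment(volume: np.ndarray, lo: float, hi: float) -> np.ndarray:
    """Return a boolean mask of voxels whose intensity lies in [lo, hi]."""
    return (volume >= lo) & (volume <= hi)

# Toy data standing in for a CTP-derived image (pseudo Hounsfield units).
rng = np.random.default_rng(0)
ct_slice = rng.normal(40.0, 15.0, size=(512, 512))
mask = double_threshold_segment(ct_slice, lo=20.0, hi=60.0)
print(f"selected {mask.mean():.1%} of voxels")
```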
Distributed Implementation of eXtended Reality Technologies over 5G Networks
The revolution of Extended Reality (XR) has already started and is rapidly expanding as technology advances. Announcements such as Meta's Metaverse have boosted the general interest in XR technologies, producing novel use cases. With the advent of the fifth generation of cellular networks (5G), XR technologies are expected to improve significantly by offloading heavy computational processes from the XR Head Mounted Display (HMD) to an edge server. XR offloading can rapidly boost XR technologies by considerably reducing the burden on the XR hardware, while improving the overall user experience by enabling smoother graphics and more realistic interactions. Overall, the combination of XR and 5G has the potential to revolutionize the way we interact with technology and experience the world around us.
However, XR offloading is a complex task that requires state-of-the-art tools and solutions, as well as an advanced wireless network that can meet the demanding throughput, latency, and reliability requirements of XR. The definition of these requirements strongly depends on the use case and the particular XR offloading implementation. Therefore, it is crucial to perform a thorough Key Performance Indicators (KPIs) analysis to ensure a successful design of any XR offloading solution. Additionally, distributed XR implementations can be intricate systems with multiple processes running on different devices or virtual instances. All these agents must be well handled and synchronized to achieve XR real-time requirements and ensure the expected user experience while guaranteeing a low processing overhead. XR offloading requires a carefully designed architecture which complies with the required KPIs while efficiently synchronizing and handling multiple heterogeneous devices.
Offloading XR has become an essential use case for 5G and beyond-5G technologies. However, testing distributed XR implementations requires access to advanced 5G deployments that are often unavailable to most XR application developers. Conversely, the development of 5G technologies requires constant feedback from potential applications and use cases. Unfortunately, most 5G providers, engineers, or researchers lack access to cutting-edge XR hardware or applications, which can hinder the fast implementation and improvement of 5G's most advanced features. Both technology fields require ongoing input and continuous development from each other to fully realize their potential. As a result, XR and 5G researchers and developers must have access to the necessary tools and knowledge to ensure the rapid and satisfactory development of both technology fields.
In this thesis, we focus on these challenges, providing knowledge, tools, and solutions towards the implementation of advanced offloading technologies, opening the door to more immersive, comfortable, and accessible XR technologies. Our contributions to the field of XR offloading include a detailed study and description of the network throughput and latency KPIs necessary for XR offloading, an architecture for low-latency XR offloading, and our full end-to-end XR offloading implementation, ready for a commercial XR HMD. We also present a set of tools which can facilitate the joint development of 5G networks and XR offloading technologies: our 5G RAN real-time emulator and a multi-scenario XR IP traffic dataset.
Firstly, we thoroughly examine and explain the KPIs that are required to achieve the expected Quality of Experience (QoE) and enhanced immersiveness in XR offloading solutions. Our analysis focuses on individual XR algorithms rather than potential use cases. Additionally, we provide an initial description of feasible 5G deployments that could fulfill some of the proposed KPIs for different offloading scenarios.
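As a hedged illustration of the kind of throughput-KPI arithmetic such an analysis involves (the resolution, frame rate, and compression ratio below are assumed for illustration, not the thesis's figures):

```python
# Rough downlink throughput estimate for a split-rendering scenario.

def video_bitrate_gbps(width, height, fps, bits_per_pixel=24, compression_ratio=1.0):
    """Raw (or compressed) video bitrate in Gbit/s."""
    return width * height * fps * bits_per_pixel / compression_ratio / 1e9

raw = video_bitrate_gbps(3840, 2160, 90)                          # uncompressed 4K @ 90 fps
enc = video_bitrate_gbps(3840, 2160, 90, compression_ratio=100)   # assuming a ~100:1 codec
print(f"uncompressed: {raw:.1f} Gbit/s, compressed ~100:1: {enc * 1000:.0f} Mbit/s")
```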
We also present our low-latency multi-modal XR offloading architecture, which has already been tested on a commercial XR device and on advanced 5G deployments, such as millimeter-wave (mmW) technologies. We further describe our full end-to-end XR offloading system, which relies on our offloading architecture to provide low-latency communication between a commercial XR device and a server running a Machine Learning (ML) algorithm. To the best of our knowledge, this is one of the first successful XR offloading implementations for complex ML algorithms on a commercial device.
With the goal of providing XR developers and researchers access to complex 5G deployments and accelerating the development of future XR technologies, we present FikoRE, our 5G RAN real-time emulator. FikoRE has been specifically designed not only to model the network with sufficient accuracy but also to support the emulation of a massive number of users and actual IP throughput. As FikoRE can handle actual IP traffic above 1 Gbps, it can directly be used to test distributed XR solutions. As we describe in the thesis, its emulation capabilities make FikoRE a potential candidate to become a reference testbed for distributed XR developers and researchers.
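For intuition only, a toy sketch of the core job any such emulator performs on actual IP traffic: queueing packets and releasing them at the rate the modeled radio link allows. This is a minimal illustration under simplifying assumptions (fixed link rate, simultaneous arrivals), not FikoRE's design:

```python
from collections import deque

def emulate_link(packets_bits, link_rate_bps):
    """Serve queued packets over a fixed-rate link; return per-packet delay (s)."""
    queue, t_free, delays = deque(packets_bits), 0.0, []
    t_arrival = 0.0                        # simplification: all packets arrive at t = 0
    while queue:
        bits = queue.popleft()
        t_free += bits / link_rate_bps     # transmission time on the modeled link
        delays.append(t_free - t_arrival)
    return delays

# 100 full-size (1500-byte) packets on a 1 Gbit/s emulated link.
delays = emulate_link([12000] * 100, 1e9)
print(f"last packet delayed {delays[-1] * 1e6:.0f} us")
```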
Finally, we used our XR offloading tools to generate an XR IP traffic dataset which can accelerate the development of 5G technologies by providing a straightforward way to test novel 5G solutions using realistic XR data. This dataset covers two relevant XR offloading scenarios: split rendering, in which the rendering step is moved to an edge server, and heavy ML algorithm offloading. We also derive the corresponding IP traffic models from the captured data, which can be used to generate realistic XR IP traffic, and present the validation experiments performed on the derived models and their results. This work has received funding from the European Union (EU) Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie ETN TeamUp5G, grant agreement No. 813391.
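As a hedged sketch of how such traffic models can be used, the snippet below synthesizes a split-rendering downlink trace as periodic frame bursts; the distribution and its parameters are placeholders, not values fitted from the dataset:

```python
import random

def split_rendering_trace(fps=90, mean_frame_bytes=25_000, jitter=0.2, n_frames=5):
    """Yield (timestamp_s, frame_bytes) pairs for a synthetic downlink trace:
    one burst per rendered frame, sizes drawn from a Gaussian for illustration."""
    period = 1.0 / fps
    for i in range(n_frames):
        size = int(random.gauss(mean_frame_bytes, jitter * mean_frame_bytes))
        yield i * period, max(size, 0)

for t, size in split_rendering_trace():
    print(f"t={t * 1000:6.2f} ms  frame={size} bytes")
```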
Efficient streaming for high fidelity imaging
Researchers and practitioners of graphics, visualisation and imaging have an ever-expanding list of technologies to account for, including (but not limited to) HDR, VR, 4K, 360°, light field and wide colour gamut. As these technologies move from theory to practice, the methods of encoding and transmitting this information need to become more advanced and capable year on year, placing greater demands on latency, bandwidth, and encoding performance.
High dynamic range (HDR) video is still in its infancy; the tools for capture, transmission and display of true HDR content are still restricted to professional technicians. Meanwhile, computer graphics are nowadays near-ubiquitous, but to achieve the highest fidelity in real or even reasonable time, a user must be located at or near a supercomputer or other specialist workstation. These physical requirements mean that it is not always possible to demonstrate these graphics in any given place at any time, and when the graphics in question are intended to provide a virtual reality experience, the constraints on performance and latency are even tighter.
This thesis presents an overall framework for adapting upcoming imaging technologies for efficient streaming, constituting novel work across three areas of imaging technology. Over the course of the thesis, high dynamic range capture, transmission and display is considered, before specifically focusing on the transmission and display of high fidelity rendered graphics, including HDR graphics. Finally, this thesis considers the technical challenges posed by upcoming head-mounted displays (HMDs). In addition, a full literature review is presented across all three of these areas, detailing state-of-the-art methods for approaching all three problem sets.
In the area of high dynamic range capture, transmission and display, a framework is presented and evaluated for efficient processing, streaming and encoding of high dynamic range video using general-purpose graphics processing unit (GPGPU) technologies.
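By way of illustration, one standard building block of an HDR processing pipeline is a global tone-mapping operator; the sketch below applies the classic Reinhard curve to toy data (illustrative only, not the thesis's GPGPU implementation):

```python
import numpy as np

def reinhard_tonemap(hdr: np.ndarray) -> np.ndarray:
    """Global Reinhard operator L/(1+L): maps [0, inf) radiance into [0, 1)."""
    return hdr / (1.0 + hdr)

rng = np.random.default_rng(1)
hdr_frame = rng.exponential(scale=2.0, size=(4, 4))  # toy HDR radiance values
ldr_frame = reinhard_tonemap(hdr_frame)
print(ldr_frame.max() < 1.0)  # True: values are in a display-safe range
```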
For remote rendering, state-of-the-art methods of augmenting a streamed graphical render are adapted to incorporate HDR video and high fidelity graphics rendering, specifically with regard to path tracing.
Finally, a novel method is proposed for streaming graphics to an HMD for virtual reality (VR). This method utilises 360° projections to transmit and reproject stereo imagery to an HMD with minimal latency, with an adaptation for the rapid local production of depth maps.
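For intuition, the core of such a reprojection is a per-pixel lookup from a view direction into the 360° (equirectangular) image. A minimal sketch, assuming one common axis convention (conventions vary between systems):

```python
import math

def direction_to_equirect_uv(x: float, y: float, z: float):
    """Map a unit view direction to (u, v) in an equirectangular panorama.

    u follows longitude (atan2), v follows latitude (asin); this is the
    lookup a reprojection shader performs per pixel, here on the CPU."""
    u = 0.5 + math.atan2(x, -z) / (2.0 * math.pi)
    v = 0.5 - math.asin(y) / math.pi
    return u, v

print(direction_to_equirect_uv(0.0, 0.0, -1.0))  # forward -> image centre (0.5, 0.5)
```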
Symmetry-Adapted Machine Learning for Information Security
Symmetry-adapted machine learning has shown encouraging ability to mitigate security risks in information and communication technology (ICT) systems. It is a subset of artificial intelligence (AI) that relies on the principle of anticipating future events by learning from past events and historical data. The autonomous nature of symmetry-adapted machine learning supports effective data processing and analysis for security detection in ICT systems without the interference of human authorities. Many industries are developing machine-learning-adapted solutions to support security for smart hardware, distributed computing, and the cloud. In our Special Issue book, we focus on the deployment of symmetry-adapted machine learning for information security in various application areas. This security approach can support effective methods for handling the dynamic nature of security attacks through the extraction and analysis of data to identify hidden patterns. The main topics of this Issue include malware classification, intrusion detection systems, image and color-image watermarking, a battlefield target aggregation behavior recognition model, IP camera security, Internet of Things (IoT) security, service function chains, indoor positioning systems, and cryptanalysis.
Videos in Context for Telecommunication and Spatial Browsing
The research presented in this thesis explores the use of videos embedded in panoramic imagery to transmit spatial and temporal information describing remote environments and their dynamics. Virtual environments (VEs) through which users can explore remote locations are rapidly emerging as a popular medium of presence and remote collaboration. However, capturing visual representations of locations to be used in VEs is usually a tedious process that requires either manual modelling of environments or the employment of specific hardware. Capturing environment dynamics is not straightforward either, and it is usually performed through specific tracking hardware. Similarly, browsing large unstructured video collections with available tools is difficult, as the abundance of spatial and temporal information makes them hard to comprehend. On a spectrum between 3D VEs and 2D images, panoramas lie in between: they offer the accessibility of 2D images while preserving the surrounding representation of 3D virtual environments. For this reason, panoramas are an attractive basis for videoconferencing and browsing tools, as they can relate several videos temporally and spatially. This research explores methods to acquire, fuse, render and stream data coming from heterogeneous cameras with the help of panoramic imagery. Three distinct but interrelated questions are addressed. First, the thesis considers how spatially localised video can be used to increase the spatial information transmitted during video-mediated communication, and whether this improves quality of communication. Second, the research asks whether videos in panoramic context can be used to convey spatial and temporal information of a remote place and the dynamics within, and whether this improves users' performance in tasks that require spatio-temporal thinking. Finally, the thesis considers whether display type has an impact on reasoning about events within videos in panoramic context. These research questions were investigated over three experiments, covering scenarios common to computer-supported cooperative work and video browsing. To support the investigation, two distinct video+context systems were developed. The first, a telecommunication experiment, compared our videos-in-context interface with fully panoramic video and conventional webcam video conferencing in an object placement scenario. The second experiment investigated the impact of videos in panoramic context on the quality of spatio-temporal thinking during localization tasks. To support the experiment, a novel interface to video collections in panoramic context was developed and compared with common video-browsing tools. The final experimental study investigated the impact of display type on reasoning about events, exploring three adaptations of our video-collection interface to three display types. The overall conclusion is that videos in panoramic context offer a valid solution to spatio-temporal exploration of remote locations. Our approach presents a richer visual representation in terms of space and time than standard tools, showing that providing panoramic context to video collections makes spatio-temporal tasks easier. To this end, videos in context are a suitable alternative to more difficult, and often more expensive, solutions. These findings are beneficial to many applications, including teleconferencing, virtual tourism and remote assistance.
From Capture to Display: A Survey on Volumetric Video
Volumetric video, which offers immersive viewing experiences, is gaining increasing prominence. With its six degrees of freedom, it provides viewers with greater immersion and interactivity compared to traditional videos. Despite their potential, volumetric video services pose significant challenges. This survey conducts a comprehensive review of the existing literature on volumetric video. We first provide a general framework of volumetric video services, followed by a discussion of prerequisites for volumetric video, encompassing representations, open datasets, and quality assessment metrics. We then delve into the current methodologies for each stage of the volumetric video service pipeline, detailing capturing, compression, transmission, rendering, and display techniques. Lastly, we explore various applications enabled by this pioneering technology and present an array of research challenges and opportunities in the domain of volumetric video services. This survey aspires to provide a holistic understanding of this burgeoning field and shed light on potential future research trajectories, aiming to bring the vision of volumetric video to fruition.
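A quick, illustrative data-rate estimate (assumed figures, not from the survey) shows why compression sits at the heart of this pipeline:

```python
# Back-of-the-envelope rate for an uncompressed point-cloud stream.

def point_cloud_gbps(points, fps, bytes_per_point=15):
    """Raw stream rate, e.g. three 4-byte float coordinates + 3-byte RGB per point."""
    return points * bytes_per_point * 8 * fps / 1e9

rate = point_cloud_gbps(points=1_000_000, fps=30)
print(f"1M points @ 30 fps ~ {rate:.1f} Gbit/s uncompressed")
```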