17 research outputs found
Toward the distributed implementation of immersive augmented reality architectures on 5G networks
Augmented reality (AR) has lately been presented as one of the key technology fields in which 5G networks can become a disruptive tool, raising interest from both industry and academia. The main goal of this article is to extend the current state of the art of distributed AR studies and implementations by extracting the offloading requirements of the main AR algorithms individually. This extension is further achieved by depicting the data flow between these algorithms and their hardware requirements. From the obtained results, we estimate a preliminary set of network key performance indicators (KPIs) for a subset of three examples of distributed AR implementations, highlighting the necessity of 5G technologies and their ecosystem to unveil the full potential of AR. Finally, based on these KPIs, we propose a set of 5G configuration parameters for a successful distributed AR implementation. As most of the described algorithms are also used in virtual reality (VR) applications, our contributions can facilitate future distributed implementations of both AR and VR applications. This work has received funding from the European Union (EU) Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie ETN TeamUp5G, grant agreement no. 813391.
Improving the performance of object detection by preserving label distribution
Object detection is a task that performs position identification and label
classification of objects in images or videos. The information obtained through
this process plays an essential role in various tasks in the field of computer
vision. In object detection, the data utilized for training and validation
typically originate from public datasets that are well-balanced in terms of the
number of objects ascribed to each class in an image. However, in real-world
scenarios, handling datasets with much greater class imbalance, i.e., very
different numbers of objects for each class, is much more common, and this
imbalance may reduce the performance of object detection when predicting unseen
test images. In our study, thus, we propose a method that evenly distributes
the classes in an image for training and validation, solving the class
imbalance problem in object detection. Our proposed method aims to maintain a
uniform class distribution through multi-label stratification. We tested our
proposed method not only on public datasets that typically exhibit balanced
class distribution but also on custom datasets that may have imbalanced class
distribution. We found that our proposed method was more effective on datasets
containing severe imbalance and less data. Our findings indicate that the
proposed method can be effectively used on datasets with substantially
imbalanced class distribution. Comment: Code is available at
https://github.com/leeheewon-01/YOLOstratifiedKFold/tree/mai
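The abstract's core idea, keeping the per-class object distribution uniform across training/validation splits, can be illustrated with a minimal greedy sketch. This is not the paper's exact multi-label stratified K-fold procedure (see the linked repository for that); the function name and the rarest-class-first heuristic here are illustrative assumptions.

```python
from collections import Counter

def stratified_folds(image_labels, n_folds=3):
    """Greedy multi-label stratification sketch (illustrative, not the
    paper's algorithm): assign each image to the fold that currently
    holds the fewest objects of the image's rarest class."""
    # Total object count per class across the dataset.
    totals = Counter()
    for labels in image_labels.values():
        totals.update(labels)
    # Per-fold running counts of objects per class.
    fold_counts = [Counter() for _ in range(n_folds)]
    assignment = {}
    # Place images containing globally rare classes first, since they
    # are the hardest to spread evenly across folds.
    order = sorted(image_labels,
                   key=lambda img: min(totals[c] for c in image_labels[img]))
    for img in order:
        rarest = min(image_labels[img], key=lambda c: totals[c])
        fold = min(range(n_folds), key=lambda f: fold_counts[f][rarest])
        fold_counts[fold].update(image_labels[img])
        assignment[img] = fold
    return assignment
```

With a toy dataset of six single-"cat" images, one of which also contains a rare "dog" object, the greedy pass spreads the images evenly (two per fold) while placing the rare class deterministically.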
Flexible development of location-based mobile augmented reality applications with AREA
Mobile applications have garnered a lot of attention in recent years. The computational capabilities of mobile devices are the mainstay for developing completely new application types. The provision of augmented reality experiences on mobile devices paves one avenue in this field. For example, in the automotive domain, augmented reality applications are used to experience, inter alia, the interior of a car by moving a mobile device around. The device's camera then detects interior parts and shows additional information to the customer within the camera view. Another application type that is increasingly utilized is the combination of serious games with mobile augmented reality functions. Although the latter combination is promising for many scenarios, technically, it is a complex endeavor. In the AREA (Augmented Reality Engine Application) project, a kernel was implemented that enables location-based mobile augmented reality applications. Importantly, this kernel provides a flexible architecture that fosters the development of individual location-based mobile augmented reality applications. The work at hand shows the flexibility of AREA based on a developed serious game. Furthermore, the algorithm framework and its major features are presented. In conclusion, this paper shows that mobile augmented reality applications require high development efforts. Therefore, flexible frameworks like AREA are crucial to develop respective applications in a reasonable time.
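The central primitive of a location-based AR kernel like the one the abstract describes is deciding whether a point of interest (POI) lies inside the device camera's field of view, given GPS coordinates and a compass heading. The sketch below is a generic geometric illustration under that assumption, not AREA's actual algorithm; the function names and the 60° default field of view are hypothetical.

```python
import math

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from the device position to a POI,
    in degrees clockwise from north (standard forward-azimuth formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360

def in_camera_view(device_heading, poi_bearing, fov_deg=60):
    """True if the POI's bearing falls inside the camera's horizontal
    field of view centred on the device heading."""
    # Wrap the angular difference into (-180, 180] before comparing.
    diff = (poi_bearing - device_heading + 180) % 360 - 180
    return abs(diff) <= fov_deg / 2
```

A POI due east of the device (bearing 90°) is visible when the device faces roughly east, and filtered out otherwise; the kernel would then project visible POIs onto screen coordinates.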
Deep Video Codec Control
Lossy video compression is commonly used when transmitting and storing video
data. Unified video codecs (e.g., H.264 or H.265) remain the de facto standard,
despite the availability of advanced (neural) compression approaches.
Transmitting videos in the face of dynamic network bandwidth conditions
requires video codecs to adapt to vastly different compression strengths. Rate
control modules augment the codec's compression such that bandwidth constraints
are satisfied and video distortion is minimized. While both standard video
codecs and their rate control modules are developed to minimize video distortion
w.r.t. human quality assessment, preserving the downstream performance of deep
vision models is not considered. In this paper, we present the first end-to-end
learnable deep video codec control considering both bandwidth constraints and
downstream vision performance, while not breaking existing standardization. We
demonstrate for two common vision tasks (semantic segmentation and optical flow
estimation) and on two different datasets that our deep codec control better
preserves downstream performance than using 2-pass average bit rate control
while meeting dynamic bandwidth constraints and adhering to standardizations. Comment: 22 pages, 26 figures, 6 tables
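To make the role of a rate control module concrete, here is a toy feedback step of the kind such modules perform: raising the quantization parameter (QP) when the observed bitrate overshoots the bandwidth target and lowering it when it undershoots. This is an illustrative sketch only, not the paper's learned controller nor the 2-pass average bitrate baseline; the function name and step size are assumptions. The 0-51 QP range matches H.264/H.265.

```python
def adjust_qp(qp, observed_kbps, target_kbps, step=2, qp_min=0, qp_max=51):
    """Toy per-segment rate control step: higher QP means stronger
    compression and thus lower bitrate. A small dead band (90-100% of
    target) avoids oscillating around the constraint."""
    if observed_kbps > target_kbps:
        # Overshooting the bandwidth budget: compress harder.
        qp = min(qp + step, qp_max)
    elif observed_kbps < 0.9 * target_kbps:
        # Well under budget: spend bits to reduce distortion.
        qp = max(qp - step, qp_min)
    return qp
```

The paper's contribution replaces hand-tuned loops like this, which target human-perceived distortion, with a learned controller that also preserves downstream vision-model accuracy while staying within the standard codec's control interface.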
ECLM: Efficient Edge-Cloud Collaborative Learning with Continuous Environment Adaptation
Pervasive mobile AI applications primarily employ one of the two learning
paradigms: cloud-based learning (with powerful large models) or on-device
learning (with lightweight small models). Despite their own advantages, neither
paradigm can effectively handle dynamic edge environments with frequent data
distribution shifts and on-device resource fluctuations, inevitably suffering
from performance degradation. In this paper, we propose ECLM, an edge-cloud
collaborative learning framework for rapid model adaptation for dynamic edge
environments. We first propose a novel block-level model decomposition design
to decompose the original large cloud model into multiple combinable modules.
By flexibly combining a subset of the modules, this design enables the
derivation of compact, task-specific sub-models for heterogeneous edge devices
from the large cloud model, and the seamless integration of new knowledge
learned on these devices into the cloud model periodically. As such, ECLM
ensures that the cloud model always provides up-to-date sub-models for edge
devices. We further propose an end-to-end learning framework that incorporates
the modular model design into an efficient model adaptation pipeline including
an offline on-cloud model prototyping and training stage, and an online
edge-cloud collaborative adaptation stage. Extensive experiments over various
datasets demonstrate that ECLM significantly improves model performance (e.g.,
18.89% accuracy increase) and resource efficiency (e.g., 7.12x communication
cost reduction) in adapting models to dynamic edge environments by efficiently
coordinating the edge and the cloud models.
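The block-level decomposition the abstract describes can be sketched as a pool of named blocks from which edge sub-models are composed and into which on-device updates are folded back. The class and method names below are illustrative assumptions, not ECLM's actual API.

```python
class ModularModel:
    """Toy sketch of block-level model decomposition: the cloud model
    is a pool of named, composable blocks; an edge sub-model is an
    ordered subset of those blocks."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)  # name -> callable block

    def derive_submodel(self, names):
        """Compose a compact, task-specific sub-model for an edge
        device from a subset of the cloud model's blocks."""
        chain = [self.blocks[n] for n in names]
        def submodel(x):
            for block in chain:
                x = block(x)
            return x
        return submodel

    def integrate(self, name, updated_block):
        """Fold a block updated on-device back into the cloud pool, so
        later sub-model derivations see the new knowledge."""
        self.blocks[name] = updated_block
```

A heterogeneous device would request only the blocks it can afford; periodic `integrate` calls keep the cloud pool up to date, mirroring the offline prototyping / online adaptation split in the paper.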