9,524 research outputs found
The Metaverse: Survey, Trends, Novel Pipeline Ecosystem & Future Directions
The Metaverse offers a second world beyond reality, where boundaries are
non-existent, and possibilities are endless through engagement and immersive
experiences using the virtual reality (VR) technology. Many disciplines can
benefit from the advancement of the Metaverse when accurately developed,
including the fields of technology, gaming, education, art, and culture.
Nevertheless, developing the Metaverse environment to its full potential is an
ambiguous task that needs proper guidance and directions. Existing surveys on
the Metaverse focus only on a specific aspect and discipline of the Metaverse
and lack a holistic view of the entire process. To this end, a more holistic,
multi-disciplinary, in-depth, and academic and industry-oriented review is
required to provide a thorough study of the Metaverse development pipeline. To
address these issues, we present in this survey a novel multi-layered pipeline
ecosystem composed of (1) the Metaverse computing, networking, communications
and hardware infrastructure, (2) environment digitization, and (3) user
interactions. For every layer, we discuss the components that detail the steps
of its development. Also, for each of these components, we examine the impact
of a set of enabling technologies and empowering domains (e.g., Artificial
Intelligence, Security & Privacy, Blockchain, Business, Ethics, and Social) on
its advancement. In addition, we explain the importance of these technologies
to support decentralization, interoperability, user experiences, interactions,
and monetization. Our presented study highlights the existing challenges for
each component, followed by research directions and potential solutions. To the
best of our knowledge, this survey is the most comprehensive and allows users,
scholars, and entrepreneurs to get an in-depth understanding of the Metaverse
ecosystem to find their opportunities and potentials for contribution
NF-Atlas: Multi-Volume Neural Feature Fields for Large Scale LiDAR Mapping
LiDAR Mapping has been a long-standing problem in robotics. Recent progress
in neural implicit representation has brought new opportunities to robotic
mapping. In this paper, we propose the multi-volume neural feature fields,
called NF-Atlas, which bridge the neural feature volumes with pose graph
optimization. By regarding the neural feature volume as pose graph nodes and
the relative pose between volumes as pose graph edges, the entire neural
feature field becomes both locally rigid and globally elastic. Locally, the
neural feature volume employs a sparse feature Octree and a small MLP to encode
the submap SDF with an option of semantics. Learning the map using this
structure allows for end-to-end solving of maximum a posteriori (MAP) based
probabilistic mapping. Globally, the map is built volume by volume
independently, avoiding catastrophic forgetting when mapping incrementally.
Furthermore, when a loop closure occurs, with the elastic pose graph based
representation, only updating the origin of neural volumes is required without
remapping. Finally, these functionalities of NF-Atlas are validated. Thanks to
the sparsity and the optimization based formulation, NF-Atlas shows competitive
performance in terms of accuracy, efficiency and memory usage on both
simulation and real-world datasets
ADS_UNet: A Nested UNet for Histopathology Image Segmentation
The UNet model consists of fully convolutional network (FCN) layers arranged
as contracting encoder and upsampling decoder maps. Nested arrangements of
these encoder and decoder maps give rise to extensions of the UNet model, such
as UNete and UNet++. Other refinements include constraining the outputs of the
convolutional layers to discriminate between segment labels when trained end to
end, a property called deep supervision. This reduces feature diversity in
these nested UNet models despite their large parameter space. Furthermore, for
texture segmentation, pixel correlations at multiple scales contribute to the
classification task; hence, explicit deep supervision of shallower layers is
likely to enhance performance. In this paper, we propose ADS UNet, a stage-wise
additive training algorithm that incorporates resource-efficient deep
supervision in shallower layers and takes performance-weighted combinations of
the sub-UNets to create the segmentation model. We provide empirical evidence
on three histopathology datasets to support the claim that the proposed ADS
UNet reduces correlations between constituent features and improves performance
while being more resource efficient. We demonstrate that ADS_UNet outperforms
state-of-the-art Transformer-based models by 1.08 and 0.6 points on CRAG and
BCSS datasets, and yet requires only 37% of GPU consumption and 34% of training
time as that required by Transformers.Comment: To be published in Expert Systems With Application
DiffRF: Rendering-Guided 3D Radiance Field Diffusion
We introduce DiffRF, a novel approach for 3D radiance field synthesis based
on denoising diffusion probabilistic models. While existing diffusion-based
methods operate on images, latent codes, or point cloud data, we are the first
to directly generate volumetric radiance fields. To this end, we propose a 3D
denoising model which directly operates on an explicit voxel grid
representation. However, as radiance fields generated from a set of posed
images can be ambiguous and contain artifacts, obtaining ground truth radiance
field samples is non-trivial. We address this challenge by pairing the
denoising formulation with a rendering loss, enabling our model to learn a
deviated prior that favours good image quality instead of trying to replicate
fitting errors like floating artifacts. In contrast to 2D-diffusion models, our
model learns multi-view consistent priors, enabling free-view synthesis and
accurate shape generation. Compared to 3D GANs, our diffusion-based approach
naturally enables conditional generation such as masked completion or
single-view 3D synthesis at inference time.Comment: Project page: https://sirwyver.github.io/DiffRF/ Video:
https://youtu.be/qETBcLu8SUk - CVPR 2023 Highlight - updated evaluations
after fixing initial data mapping error on all method
Machine Learning Research Trends in Africa: A 30 Years Overview with Bibliometric Analysis Review
In this paper, a critical bibliometric analysis study is conducted, coupled
with an extensive literature survey on recent developments and associated
applications in machine learning research with a perspective on Africa. The
presented bibliometric analysis study consists of 2761 machine learning-related
documents, of which 98% were articles with at least 482 citations published in
903 journals during the past 30 years. Furthermore, the collated documents were
retrieved from the Science Citation Index EXPANDED, comprising research
publications from 54 African countries between 1993 and 2021. The bibliometric
study shows the visualization of the current landscape and future trends in
machine learning research and its application to facilitate future
collaborative research and knowledge exchange among authors from different
research institutions scattered across the African continent
Um modelo para suporte automatizado ao reconhecimento, extração, personalização e reconstrução de gráficos estáticos
Data charts are widely used in our daily lives, being present in regular media,
such as newspapers, magazines, web pages, books, and many others. A well constructed
data chart leads to an intuitive understanding of its underlying data
and in the same way, when data charts have wrong design choices, a redesign
of these representations might be needed. However, in most cases, these
charts are shown as a static image, which means that the original data are not
usually available. Therefore, automatic methods could be applied to extract the
underlying data from the chart images to allow these changes. The task of
recognizing charts and extracting data from them is complex, largely due to the
variety of chart types and their visual characteristics.
Computer Vision techniques for image classification and object detection are
widely used for the problem of recognizing charts, but only in images without
any disturbance. Other features in real-world images that can make this task
difficult are not present in most literature works, like photo distortions, noise,
alignment, etc. Two computer vision techniques that can assist this task and
have been little explored in this context are perspective detection and
correction. These methods transform a distorted and noisy chart in a clear
chart, with its type ready for data extraction or other uses. The task of
reconstructing data is straightforward, as long the data is available the
visualization can be reconstructed, but the scenario of reconstructing it on the
same context is complex.
Using a Visualization Grammar for this scenario is a key component, as these
grammars usually have extensions for interaction, chart layers, and multiple
views without requiring extra development effort.
This work presents a model for automated support for custom recognition, and
reconstruction of charts in images. The model automatically performs the
process steps, such as reverse engineering, turning a static chart back into its
data table for later reconstruction, while allowing the user to make modifications
in case of uncertainties. This work also features a model-based architecture
along with prototypes for various use cases. Validation is performed step by
step, with methods inspired by the literature. This work features three use
cases providing proof of concept and validation of the model.
The first use case features usage of chart recognition methods focused on
documents in the real-world, the second use case focus on vocalization of
charts, using a visualization grammar to reconstruct a chart in audio format,
and the third use case presents an Augmented Reality application that
recognizes and reconstructs charts in the same context (a piece of paper)
overlaying the new chart and interaction widgets. The results showed that with
slight changes, chart recognition and reconstruction methods are now ready for
real-world charts, when taking time, accuracy and precision into consideration.Os gráficos de dados são amplamente utilizados na nossa vida diária, estando
presentes nos meios de comunicação regulares, tais como jornais, revistas,
páginas web, livros, e muitos outros. Um gráfico bem construído leva a uma
compreensão intuitiva dos seus dados inerentes e da mesma forma, quando
os gráficos de dados têm escolhas de conceção erradas, poderá ser
necessário um redesenho destas representações. Contudo, na maioria dos
casos, estes gráficos são mostrados como uma imagem estática, o que
significa que os dados originais não estão normalmente disponíveis. Portanto,
poderiam ser aplicados métodos automáticos para extrair os dados inerentes
das imagens dos gráficos, a fim de permitir estas alterações. A tarefa de
reconhecer os gráficos e extrair dados dos mesmos é complexa, em grande
parte devido à variedade de tipos de gráficos e às suas características visuais.
As técnicas de Visão Computacional para classificação de imagens e deteção
de objetos são amplamente utilizadas para o problema de reconhecimento de
gráficos, mas apenas em imagens sem qualquer ruído. Outras características
das imagens do mundo real que podem dificultar esta tarefa não estão
presentes na maioria das obras literárias, como distorções fotográficas, ruído,
alinhamento, etc. Duas técnicas de visão computacional que podem ajudar
nesta tarefa e que têm sido pouco exploradas neste contexto são a deteção e
correção da perspetiva. Estes métodos transformam um gráfico distorcido e
ruidoso em um gráfico limpo, com o seu tipo pronto para extração de dados
ou outras utilizações. A tarefa de reconstrução de dados é simples, desde que
os dados estejam disponíveis a visualização pode ser reconstruída, mas o
cenário de reconstrução no mesmo contexto é complexo.
A utilização de uma Gramática de Visualização para este cenário é um
componente chave, uma vez que estas gramáticas têm normalmente
extensões para interação, camadas de gráficos, e visões múltiplas sem exigir
um esforço extra de desenvolvimento.
Este trabalho apresenta um modelo de suporte automatizado para o
reconhecimento personalizado, e reconstrução de gráficos em imagens
estáticas. O modelo executa automaticamente as etapas do processo, tais
como engenharia inversa, transformando um gráfico estático novamente na
sua tabela de dados para posterior reconstrução, ao mesmo tempo que
permite ao utilizador fazer modificações em caso de incertezas. Este trabalho
também apresenta uma arquitetura baseada em modelos, juntamente com
protótipos para vários casos de utilização. A validação é efetuada passo a
passo, com métodos inspirados na literatura. Este trabalho apresenta três
casos de uso, fornecendo prova de conceito e validação do modelo.
O primeiro caso de uso apresenta a utilização de métodos de reconhecimento
de gráficos focando em documentos no mundo real, o segundo caso de uso
centra-se na vocalização de gráficos, utilizando uma gramática de visualização
para reconstruir um gráfico em formato áudio, e o terceiro caso de uso
apresenta uma aplicação de Realidade Aumentada que reconhece e reconstrói
gráficos no mesmo contexto (um pedaço de papel) sobrepondo os novos
gráficos e widgets de interação. Os resultados mostraram que com pequenas
alterações, os métodos de reconhecimento e reconstrução dos gráficos estão
agora prontos para os gráficos do mundo real, tendo em consideração o
tempo, a acurácia e a precisão.Programa Doutoral em Engenharia Informátic
The Role of Transient Vibration of the Skull on Concussion
Concussion is a traumatic brain injury usually caused by a direct or indirect blow to the head that affects brain function. The maximum mechanical impedance of the brain tissue occurs at 450±50 Hz and may be affected by the skull resonant frequencies. After an impact to the head, vibration resonance of the skull damages the underlying cortex. The skull deforms and vibrates, like a bell for 3 to 5 milliseconds, bruising the cortex. Furthermore, the deceleration forces the frontal and temporal cortex against the skull, eliminating a layer of cerebrospinal fluid. When the skull vibrates, the force spreads directly to the cortex, with no layer of cerebrospinal fluid to reflect the wave or cushion its force. To date, there is few researches investigating the effect of transient vibration of the skull. Therefore, the overall goal of the proposed research is to gain better understanding of the role of transient vibration of the skull on concussion. This goal will be achieved by addressing three research objectives. First, a MRI skull and brain segmentation automatic technique is developed. Due to bones’ weak magnetic resonance signal, MRI scans struggle with differentiating bone tissue from other structures. One of the most important components for a successful segmentation is high-quality ground truth labels. Therefore, we introduce a deep learning framework for skull segmentation purpose where the ground truth labels are created from CT imaging using the standard tessellation language (STL). Furthermore, the brain region will be important for a future work, thus, we explore a new initialization concept of the convolutional neural network (CNN) by orthogonal moments to improve brain segmentation in MRI. Second, the creation of a novel 2D and 3D Automatic Method to Align the Facial Skeleton is introduced. An important aspect for further impact analysis is the ability to precisely simulate the same point of impact on multiple bone models. To perform this task, the skull must be precisely aligned in all anatomical planes. Therefore, we introduce a 2D/3D technique to align the facial skeleton that was initially developed for automatically calculating the craniofacial symmetry midline. In the 2D version, the entire concept of using cephalometric landmarks and manual image grid alignment to construct the training dataset was introduced. Then, this concept was extended to a 3D version where coronal and transverse planes are aligned using CNN approach. As the alignment in the sagittal plane is still undefined, a new alignment based on these techniques will be created to align the sagittal plane using Frankfort plane as a framework. Finally, the resonant frequencies of multiple skulls are assessed to determine how the skull resonant frequency vibrations propagate into the brain tissue. After applying material properties and mesh to the skull, modal analysis is performed to assess the skull natural frequencies. Finally, theories will be raised regarding the relation between the skull geometry, such as shape and thickness, and vibration with brain tissue injury, which may result in concussive injury
Optical coherence tomography methods using 2-D detector arrays
Optical coherence tomography (OCT) is a non-invasive, non-contact optical technique that allows cross-section imaging of biological tissues with high spatial resolution, high sensitivity and high dynamic range. Standard OCT uses a focused beam to illuminate a point on the target and detects the signal using a single photodetector. To acquire transverse information, transversal scanning of the illumination point is required. Alternatively, multiple OCT channels can be operated in parallel simultaneously; parallel OCT signals are recorded by a two-dimensional (2D) detector array. This approach is known as Parallel-detection OCT. In this thesis, methods, experiments and results using three parallel OCT techniques, including full -field (time-domain) OCT (FF-OCT), full-field swept-source OCT (FF-SS-OCT) and line-field Fourier-domain OCT (LF-FD-OCT), are presented. Several 2D digital cameras of different formats have been used and evaluated in the experiments of different methods. With the LF-FD-OCT method, photography equipment, such as flashtubes and commercial DSLR cameras have been equipped and tested for OCT imaging. The techniques used in FF-OCT and FF-SS-OCT are employed in a novel wavefront sensing technique, which combines OCT methods with a Shack-Hartmann wavefront sensor (SH-WFS). This combination technique is demonstrated capable of measuring depth-resolved wavefront aberrations, which has the potential to extend the applications of SH-WFS in wavefront-guided biomedical imaging techniques
Graphical scaffolding for the learning of data wrangling APIs
In order for students across the sciences to avail themselves of modern data streams, they must first know how to wrangle data: how to reshape ill-organised, tabular data into another format, and how to do this programmatically, in languages such as Python and R. Despite the cross-departmental demand and the ubiquity of data wrangling in analytical workflows, the research on how to optimise the instruction of it has been minimal. Although data wrangling as a programming domain presents distinctive challenges - characterised by on-the-fly syntax lookup and code example integration - it also presents opportunities. One such opportunity is how tabular data structures are easily visualised. To leverage the inherent visualisability of data wrangling, this dissertation evaluates three types of graphics that could be employed as scaffolding for novices: subgoal graphics, thumbnail graphics, and parameter graphics. Using a specially built e-learning platform, this dissertation documents a multi-institutional, randomised, and controlled experiment that investigates the pedagogical effects of these. Our results indicate that the graphics are well-received, that subgoal graphics boost the completion rate, and that thumbnail graphics improve navigability within a command menu. We also obtained several non-significant results, and indications that parameter graphics are counter-productive. We will discuss these findings in the context of general scaffolding dilemmas, and how they fit into a wider research programme on data wrangling instruction
- …