34 research outputs found
Synthetic Aperture Radar (SAR) Meets Deep Learning
This reprint focuses on the application of the combination of synthetic aperture radars and depth learning technology. It aims to further promote the development of SAR image intelligent interpretation technology. A synthetic aperture radar (SAR) is an important active microwave imaging sensor, whose all-day and all-weather working capacity give it an important place in the remote sensing community. Since the United States launched the first SAR satellite, SAR has received much attention in the remote sensing community, e.g., in geological exploration, topographic mapping, disaster forecast, and traffic monitoring. It is valuable and meaningful, therefore, to study SAR-based remote sensing applications. In recent years, deep learning represented by convolution neural networks has promoted significant progress in the computer vision community, e.g., in face recognition, the driverless field and Internet of things (IoT). Deep learning can enable computational models with multiple processing layers to learn data representations with multiple-level abstractions. This can greatly improve the performance of various applications. This reprint provides a platform for researchers to handle the above significant challenges and present their innovative and cutting-edge research results when applying deep learning to SAR in various manuscript types, e.g., articles, letters, reviews and technical reports
X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Textual Guidance
Text-driven 3D stylization is a complex and crucial task in the fields of
computer vision (CV) and computer graphics (CG), aimed at transforming a bare
mesh to fit a target text. Prior methods adopt text-independent multilayer
perceptrons (MLPs) to predict the attributes of the target mesh with the
supervision of CLIP loss. However, such text-independent architecture lacks
textual guidance during predicting attributes, thus leading to unsatisfactory
stylization and slow convergence. To address these limitations, we present
X-Mesh, an innovative text-driven 3D stylization framework that incorporates a
novel Text-guided Dynamic Attention Module (TDAM). The TDAM dynamically
integrates the guidance of the target text by utilizing text-relevant spatial
and channel-wise attentions during vertex feature extraction, resulting in more
accurate attribute prediction and faster convergence speed. Furthermore,
existing works lack standard benchmarks and automated metrics for evaluation,
often relying on subjective and non-reproducible user studies to assess the
quality of stylized 3D assets. To overcome this limitation, we introduce a new
standard text-mesh benchmark, namely MIT-30, and two automated metrics, which
will enable future research to achieve fair and objective comparisons. Our
extensive qualitative and quantitative experiments demonstrate that X-Mesh
outperforms previous state-of-the-art methods.Comment: Technical repor
D4.2 Intelligent D-Band wireless systems and networks initial designs
This deliverable gives the results of the ARIADNE project's Task 4.2: Machine Learning based network intelligence. It presents the work conducted on various aspects of network management to deliver system level, qualitative solutions that leverage diverse machine learning techniques. The different chapters present system level, simulation and algorithmic models based on multi-agent reinforcement learning, deep reinforcement learning, learning automata for complex event forecasting, system level model for proactive handovers and resource allocation, model-driven deep learning-based channel estimation and feedbacks as well as strategies for deployment of machine learning based solutions. In short, the D4.2 provides results on promising AI and ML based methods along with their limitations and potentials that have been investigated in the ARIADNE project
State of the Art on Diffusion Models for Visual Computing
The field of visual computing is rapidly advancing due to the emergence of
generative artificial intelligence (AI), which unlocks unprecedented
capabilities for the generation, editing, and reconstruction of images, videos,
and 3D scenes. In these domains, diffusion models are the generative AI
architecture of choice. Within the last year alone, the literature on
diffusion-based tools and applications has seen exponential growth and relevant
papers are published across the computer graphics, computer vision, and AI
communities with new works appearing daily on arXiv. This rapid growth of the
field makes it difficult to keep up with all recent developments. The goal of
this state-of-the-art report (STAR) is to introduce the basic mathematical
concepts of diffusion models, implementation details and design choices of the
popular Stable Diffusion model, as well as overview important aspects of these
generative AI tools, including personalization, conditioning, inversion, among
others. Moreover, we give a comprehensive overview of the rapidly growing
literature on diffusion-based generation and editing, categorized by the type
of generated medium, including 2D images, videos, 3D objects, locomotion, and
4D scenes. Finally, we discuss available datasets, metrics, open challenges,
and social implications. This STAR provides an intuitive starting point to
explore this exciting topic for researchers, artists, and practitioners alike
AI-generated Content for Various Data Modalities: A Survey
AI-generated content (AIGC) methods aim to produce text, images, videos, 3D
assets, and other media using AI algorithms. Due to its wide range of
applications and the demonstrated potential of recent works, AIGC developments
have been attracting lots of attention recently, and AIGC methods have been
developed for various data modalities, such as image, video, text, 3D shape (as
voxels, point clouds, meshes, and neural implicit fields), 3D scene, 3D human
avatar (body and head), 3D motion, and audio -- each presenting different
characteristics and challenges. Furthermore, there have also been many
significant developments in cross-modality AIGC methods, where generative
methods can receive conditioning input in one modality and produce outputs in
another. Examples include going from various modalities to image, video, 3D
shape, 3D scene, 3D avatar (body and head), 3D motion (skeleton and avatar),
and audio modalities. In this paper, we provide a comprehensive review of AIGC
methods across different data modalities, including both single-modality and
cross-modality methods, highlighting the various challenges, representative
works, and recent technical directions in each setting. We also survey the
representative datasets throughout the modalities, and present comparative
results for various modalities. Moreover, we also discuss the challenges and
potential future research directions
Fully Unsupervised Image Denoising, Diversity Denoising and Image Segmentation with Limited Annotations
Understanding the processes of cellular development and the interplay of cell shape changes, division and migration requires investigation of developmental processes at the spatial resolution of single cell. Biomedical imaging experiments enable the study of dynamic processes as they occur in living organisms. While biomedical imaging is essential, a key component of exposing unknown biological phenomena is quantitative image analysis. Biomedical images, especially microscopy images, are usually noisy owing to practical limitations such as available photon budget, sample sensitivity, etc. Additionally, microscopy images often contain artefacts due to the optical aberrations in microscopes or due to imperfections in camera sensor and internal electronics. The noisy nature of images as well as the artefacts prohibit accurate downstream analysis such as cell segmentation. Although countless approaches have been proposed for image denoising, artefact removal and segmentation, supervised Deep Learning (DL) based content-aware algorithms are currently the best performing for all these tasks.
Supervised DL based methods are plagued by many practical limitations. Supervised denoising and artefact removal algorithms require paired corrupted and high quality images for training. Obtaining such image pairs can be very hard and virtually impossible in most biomedical imaging applications owing to photosensitivity and the dynamic nature of the samples being imaged. Similarly, supervised DL based segmentation methods need copious amounts of annotated data for training, which is often very expensive to obtain. Owing to these restrictions, it is imperative to look beyond supervised methods. The objective of this thesis is to develop novel unsupervised alternatives for image denoising, and artefact removal as well as semisupervised approaches for image segmentation.
The first part of this thesis deals with unsupervised image denoising and artefact removal. For unsupervised image denoising task, this thesis first introduces a probabilistic approach for training DL based methods using parametric models of imaging noise. Next, a novel unsupervised diversity denoising framework is presented which addresses the fundamentally non-unique inverse nature of image denoising by generating multiple plausible denoised solutions for any given noisy image. Finally, interesting properties of the diversity denoising methods are presented which make them suitable for unsupervised spatial artefact removal in microscopy and medical imaging applications.
In the second part of this thesis, the problem of cell/nucleus segmentation is addressed. The focus is especially on practical scenarios where ground truth annotations for training DL based segmentation methods are scarcely available. Unsupervised denoising is used as an aid to improve segmentation performance in the presence of limited annotations. Several training strategies are presented in this work to leverage the representations learned by unsupervised denoising networks to enable better cell/nucleus segmentation in microscopy data. Apart from DL based segmentation methods, a proof-of-concept is introduced which views cell/nucleus segmentation from the perspective of solving a label fusion problem. This method, through limited human interaction, learns to choose the best possible segmentation for each cell/nucleus using only a pool of diverse (and possibly faulty) segmentation hypotheses as input.
In summary, this thesis seeks to introduce new unsupervised denoising and artefact removal methods as well as semi-supervised segmentation methods which can be easily deployed to directly and immediately benefit biomedical practitioners with their research
Deep Model-Based Super-Resolution with Non-uniform Blur
We propose a state-of-the-art method for super-resolution with non-uniform
blur. Single-image super-resolution methods seek to restore a high-resolution
image from blurred, subsampled, and noisy measurements. Despite their
impressive performance, existing techniques usually assume a uniform blur
kernel. Hence, these techniques do not generalize well to the more general case
of non-uniform blur. Instead, in this paper, we address the more realistic and
computationally challenging case of spatially-varying blur. To this end, we
first propose a fast deep plug-and-play algorithm, based on linearized ADMM
splitting techniques, which can solve the super-resolution problem with
spatially-varying blur. Second, we unfold our iterative algorithm into a single
network and train it end-to-end. In this way, we overcome the intricacy of
manually tuning the parameters involved in the optimization scheme. Our
algorithm presents remarkable performance and generalizes well after a single
training to a large family of spatially-varying blur kernels, noise levels and
scale factors
Applied Methuerstic computing
For decades, Applied Metaheuristic Computing (AMC) has been a prevailing optimization technique for tackling perplexing engineering and business problems, such as scheduling, routing, ordering, bin packing, assignment, facility layout planning, among others. This is partly because the classic exact methods are constrained with prior assumptions, and partly due to the heuristics being problem-dependent and lacking generalization. AMC, on the contrary, guides the course of low-level heuristics to search beyond the local optimality, which impairs the capability of traditional computation methods. This topic series has collected quality papers proposing cutting-edge methodology and innovative applications which drive the advances of AMC
Bridging generative models and Convolutional Neural Networks for domain-agnostic segmentation of brain MRI
Segmentation of brain MRI scans is paramount in neuroimaging, as it is a prerequisite for many subsequent analyses. Although manual segmentation is considered the gold standard, it suffers from severe reproducibility issues, and is extremely tedious, which limits its application to large datasets. Therefore, there is a clear need for automated tools that enable fast and accurate segmentation of brain MRI scans.
Recent methods rely on convolutional neural networks (CNNs). While CNNs obtain accurate results on their training domain, they are highly sensitive to changes in resolution and MRI contrast. Although data augmentation and domain adaptation techniques can increase the generalisability of CNNs, these methods still need to be retrained for every new domain, which requires costly labelling of images.
Here, we present a learning strategy to make CNNs agnostic to MRI contrast, resolution, and numerous artefacts. Specifically, we train a network with synthetic data sampled from a generative model conditioned on segmentations. Crucially, we adopt a domain randomisation approach where all generation parameters are drawn for each example from uniform priors. As a result, the network is forced to learn domain-agnostic features, and can segment real test scans without retraining. The proposed method almost achieves the accuracy of supervised CNNs on their training domain, and substantially outperforms state-of-the-art domain adaptation methods. Finally, based on this learning strategy, we present a segmentation suite for robust analysis of heterogeneous clinical scans. Overall, our approach unlocks the development of morphometry on millions of clinical scans, which ultimately has the potential to improve the diagnosis and characterisation of neurological disorders
Compressive learning: new models and applications
Today’s world is fuelled by data. From self-driving cars through to agriculture, massive amounts of data are used to fit learning models to provide valuable insights and predictions. Such insights come at a significant price as many traditional learning procedures have both memory and computational costs that scale with the size of the data. This quickly becomes prohibitive, even when substantial resources are available. A new way of learning is therefore needed to allow for efficient model fitting in the 21st century. The birth of compressive learning in recent years has provided a novel solution to the bottleneck of learning from big data. Situated at the core of the compressive learning framework is the construction of a so-called sketch. The sketch is a compact representation of the data that provides sufficient information for specific learning tasks. In this thesis we develop the compressive learning framework to a host of new models
and applications. In the first part of the thesis, we consider the group of semi-parametric models and demonstrate the unique advantages and challenges associated with creating a compressive learning paradigm for these particular models. Concentrating on the independent component analysis model, we develop a framework of algorithms and theory enabling magnitudes of compression
with respect to memory complexity compared to existing methods. In the second part of the thesis, we develop a compressive learning framework to the emerging technology of single-photon counting lidar. We demonstrate that forming a sketch of the time-of-flight data circumvents the inherent data-transfer bottleneck of existing lidar techniques. Finally, we extend the compressive lidar technology by developing both an efficient sketch-based detection algorithm that can detect the presence of a surface solely from the sketch and a sketched plug
and play framework that can integrate existing powerful denoisers that are robust to noisy lidar scenes with low photon counts