216 research outputs found
A Comprehensive Survey of Convolutions in Deep Learning: Applications, Challenges, and Future Trends
In today's digital age, Convolutional Neural Networks (CNNs), a subset of
Deep Learning (DL), are widely used for various computer vision tasks such as
image classification, object detection, and image segmentation. There are
numerous types of CNNs designed to meet specific needs and requirements,
including 1D, 2D, and 3D CNNs, as well as dilated, grouped, attention,
depthwise convolutions, and NAS, among others. Each type of CNN has its unique
structure and characteristics, making it suitable for specific tasks. It's
crucial to gain a thorough understanding and perform a comparative analysis of
these different CNN types to understand their strengths and weaknesses.
Furthermore, studying the performance, limitations, and practical applications
of each type of CNN can aid in the development of new and improved
architectures in the future. We also dive into the platforms and frameworks
that researchers utilize for their research or development from various
perspectives. Additionally, we explore the main research fields of CNN like 6D
vision, generative models, and meta-learning. This survey paper provides a
comprehensive examination and comparison of various CNN architectures,
highlighting their architectural differences and emphasizing their respective
advantages, disadvantages, applications, challenges, and future trends
Multi-source Remote Sensing for Forest Characterization and Monitoring
As a dominant terrestrial ecosystem of the Earth, forest environments play profound roles in ecology, biodiversity, resource utilization, and management, which highlights the significance of forest characterization and monitoring. Some forest parameters can help track climate change and quantify the global carbon cycle and therefore attract growing attention from various research communities. Compared with traditional in-situ methods with expensive and time-consuming field works involved, airborne and spaceborne remote sensors collect cost-efficient and consistent observations at global or regional scales and have been proven to be an effective way for forest monitoring. With the looming paradigm shift toward data-intensive science and the development of remote sensors, remote sensing data with higher resolution and diversity have been the mainstream in data analysis and processing. However, significant heterogeneities in the multi-source remote sensing data largely restrain its forest applications urging the research community to come up with effective synergistic strategies.
The work presented in this thesis contributes to the field by exploring the potential of the Synthetic Aperture Radar (SAR), SAR Polarimetry (PolSAR), SAR Interferometry (InSAR), Polarimetric SAR Interferometry (PolInSAR), Light Detection and Ranging (LiDAR), and multispectral remote sensing in forest characterization and monitoring from three main aspects including forest height estimation, active fire detection, and burned area mapping.
First, the forest height inversion is demonstrated using airborne L-band dual-baseline repeat-pass PolInSAR data based on modified versions of the Random Motion over Ground (RMoG) model, where the scattering attenuation and wind-derived random motion are described in conditions of homogeneous and heterogeneous volume layer, respectively. A boreal and a tropical forest test site are involved in the experiment to explore the flexibility of different models over different forest types and based on that, a leveraging strategy is proposed to boost the accuracy of forest height estimation.
The accuracy of the model-based forest height inversion is limited by the discrepancy between the theoretical models and actual scenarios and exhibits a strong dependency on the system and scenario parameters. Hence, high vertical accuracy LiDAR samples are employed to assist the PolInSAR-based forest height estimation. This multi-source forest height estimation is reformulated as a pan-sharpening task aiming to generate forest heights with high spatial resolution and vertical accuracy based on the synergy of the sparse LiDAR-derived heights and the information embedded in the PolInSAR data. This process is realized by a specifically designed generative adversarial network (GAN) allowing high accuracy forest height estimation less limited by theoretical models and system parameters. Related experiments are carried out over a boreal and a tropical forest to validate the flexibility of the method.
An automated active fire detection framework is proposed for the medium resolution multispectral remote sensing data. The basic part of this framework is a deep-learning-based semantic segmentation model specifically designed for active fire detection. A dataset is constructed with open-access Sentinel-2 imagery for the training and testing of the deep-learning model. The developed framework allows an automated Sentinel-2 data download, processing, and generation of the active fire detection results through time and location information provided by the user. Related performance is evaluated in terms of detection accuracy and processing efficiency.
The last part of this thesis explored whether the coarse burned area products can be further improved through the synergy of multispectral, SAR, and InSAR features with higher spatial resolutions. A Siamese Self-Attention (SSA) classification is proposed for the multi-sensor burned area mapping and a multi-source dataset is constructed at the object level for the training and testing. Results are analyzed by different test sites, feature sources, and classification methods to assess the improvements achieved by the proposed method.
All developed methods are validated with extensive processing of multi-source data acquired by Uninhabited Aerial Vehicle Synthetic Aperture Radar (UAVSAR), Land, Vegetation, and Ice Sensor (LVIS), PolSARproSim+, Sentinel-1, and Sentinel-2. I hope these studies constitute a substantial contribution to the forest applications of multi-source remote sensing
Introduction to Facial Micro Expressions Analysis Using Color and Depth Images: A Matlab Coding Approach (Second Edition, 2023)
The book attempts to introduce a gentle introduction to the field of Facial
Micro Expressions Recognition (FMER) using Color and Depth images, with the aid
of MATLAB programming environment. FMER is a subset of image processing and it
is a multidisciplinary topic to analysis. So, it requires familiarity with
other topics of Artifactual Intelligence (AI) such as machine learning, digital
image processing, psychology and more. So, it is a great opportunity to write a
book which covers all of these topics for beginner to professional readers in
the field of AI and even without having background of AI. Our goal is to
provide a standalone introduction in the field of MFER analysis in the form of
theorical descriptions for readers with no background in image processing with
reproducible Matlab practical examples. Also, we describe any basic definitions
for FMER analysis and MATLAB library which is used in the text, that helps
final reader to apply the experiments in the real-world applications. We
believe that this book is suitable for students, researchers, and professionals
alike, who need to develop practical skills, along with a basic understanding
of the field. We expect that, after reading this book, the reader feels
comfortable with different key stages such as color and depth image processing,
color and depth image representation, classification, machine learning, facial
micro-expressions recognition, feature extraction and dimensionality reduction.
The book attempts to introduce a gentle introduction to the field of Facial
Micro Expressions Recognition (FMER) using Color and Depth images, with the aid
of MATLAB programming environment.Comment: This is the second edition of the boo
Echocardiography
The book "Echocardiography - New Techniques" brings worldwide contributions from highly acclaimed clinical and imaging science investigators, and representatives from academic medical centers. Each chapter is designed and written to be accessible to those with a basic knowledge of echocardiography. Additionally, the chapters are meant to be stimulating and educational to the experts and investigators in the field of echocardiography. This book is aimed primarily at cardiology fellows on their basic echocardiography rotation, fellows in general internal medicine, radiology and emergency medicine, and experts in the arena of echocardiography. Over the last few decades, the rate of technological advancements has developed dramatically, resulting in new techniques and improved echocardiographic imaging. The authors of this book focused on presenting the most advanced techniques useful in today's research and in daily clinical practice. These advanced techniques are utilized in the detection of different cardiac pathologies in patients, in contributing to their clinical decision, as well as follow-up and outcome predictions. In addition to the advanced techniques covered, this book expounds upon several special pathologies with respect to the functions of echocardiography
Generalizable automated pixel-level structural segmentation of medical and biological data
Over the years, the rapid expansion in imaging techniques and equipments has driven the demand for more automation in handling large medical and biological data sets. A wealth of approaches have been suggested as optimal solutions for their respective imaging types. These
solutions span various image resolutions, modalities and contrast (staining) mechanisms. Few approaches generalise well across multiple image types, contrasts or resolution.
This thesis proposes an automated pixel-level framework that addresses 2D, 2D+t and 3D
structural segmentation in a more generalizable manner, yet has enough adaptability to address
a number of specific image modalities, spanning retinal funduscopy, sequential
fluorescein angiography and two-photon microscopy.
The pixel-level segmentation scheme involves: i ) constructing a phase-invariant orientation field of the local spatial neighbourhood; ii ) combining local feature maps with intensity-based
measures in a structural patch context; iii ) using a complex supervised learning process to interpret the combination of all the elements in the patch in order to reach a classification decision. This has the advantage of transferability from retinal blood vessels in 2D to neural structures in 3D.
To process the temporal components in non-standard 2D+t retinal angiography sequences, we first introduce a co-registration procedure: at the pairwise level, we combine projective
RANSAC with a quadratic homography transformation to map the coordinate systems between any two frames. At the joint level, we construct a hierarchical approach in order for each individual frame to be registered to the global reference intra- and inter- sequence(s). We then take a non-training approach that searches in both the spatial neighbourhood of each pixel and the filter output across varying scales to locate and link microvascular centrelines to (sub-)
pixel accuracy. In essence, this \link while extract" piece-wise segmentation approach combines the local phase-invariant orientation field information with additional local phase estimates to obtain a soft classification of the centreline (sub-) pixel locations.
Unlike retinal segmentation problems where vasculature is the main focus, 3D neural segmentation requires additional
exibility, allowing a variety of structures of anatomical importance yet with different geometric properties to be differentiated both from the background and against other structures. Notably, cellular structures, such as Purkinje cells, neural dendrites and interneurons, all display certain elongation along their medial axes, yet each class has a characteristic shape captured by an orientation field that distinguishes it from other structures. To take this
into consideration, we introduce a 5D orientation mapping to capture these orientation properties.
This mapping is incorporated into the local feature map description prior to a learning
machine. Extensive performance evaluations and validation of each of the techniques presented in this thesis is carried out. For retinal fundus images, we compute Receiver Operating Characteristic (ROC) curves on existing public databases (DRIVE & STARE) to assess and compare our algorithms with other benchmark methods. For 2D+t retinal angiography sequences, we compute the error metrics ("Centreline Error") of our scheme with other benchmark methods.
For microscopic cortical data stacks, we present segmentation results on both surrogate data with known ground-truth and experimental rat cerebellar cortex two-photon microscopic tissue stacks.Open Acces
Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries
This two-volume set LNCS 12962 and 12963 constitutes the thoroughly refereed proceedings of the 7th International MICCAI Brainlesion Workshop, BrainLes 2021, as well as the RSNA-ASNR-MICCAI Brain Tumor Segmentation (BraTS) Challenge, the Federated Tumor Segmentation (FeTS) Challenge, the Cross-Modality Domain Adaptation (CrossMoDA) Challenge, and the challenge on Quantification of Uncertainties in Biomedical Image Quantification (QUBIQ). These were held jointly at the 23rd Medical Image Computing for Computer Assisted Intervention Conference, MICCAI 2020, in September 2021. The 91 revised papers presented in these volumes were selected form 151 submissions. Due to COVID-19 pandemic the conference was held virtually. This is an open access book
Foetal echocardiographic segmentation
Congenital heart disease affects just under one percentage of all live births [1].
Those defects that manifest themselves as changes to the cardiac chamber volumes
are the motivation for the research presented in this thesis.
Blood volume measurements in vivo require delineation of the cardiac chambers and
manual tracing of foetal cardiac chambers is very time consuming and operator
dependent. This thesis presents a multi region based level set snake deformable
model applied in both 2D and 3D which can automatically adapt to some extent
towards ultrasound noise such as attenuation, speckle and partial occlusion artefacts.
The algorithm presented is named Mumford Shah Sarti Collision Detection (MSSCD).
The level set methods presented in this thesis have an optional shape prior term for
constraining the segmentation by a template registered to the image in the presence
of shadowing and heavy noise.
When applied to real data in the absence of the template the MSSCD algorithm is
initialised from seed primitives placed at the centre of each cardiac chamber. The
voxel statistics inside the chamber is determined before evolution. The MSSCD stops
at open boundaries between two chambers as the two approaching level set fronts
meet. This has significance when determining volumes for all cardiac compartments
since cardiac indices assume that each chamber is treated in isolation. Comparison
of the segmentation results from the implemented snakes including a previous level
set method in the foetal cardiac literature show that in both 2D and 3D on both real
and synthetic data, the MSSCD formulation is better suited to these types of data.
All the algorithms tested in this thesis are within 2mm error to manually traced
segmentation of the foetal cardiac datasets. This corresponds to less than 10% of
the length of a foetal heart. In addition to comparison with manual tracings all the
amorphous deformable model segmentations in this thesis are validated using a
physical phantom. The volume estimation of the phantom by the MSSCD
segmentation is to within 13% of the physically determined volume
Medical image enhancement
Each image acquired from a medical imaging system is often part of a two-dimensional (2-D) image set whose total presents a three-dimensional (3-D) object for diagnosis. Unfortunately, sometimes these images are of poor quality. These distortions cause an inadequate object-of-interest presentation, which can result in inaccurate image analysis. Blurring is considered a serious problem. Therefore, “deblurring” an image to obtain better quality is an important issue in medical image processing. In our research, the image is initially decomposed. Contrast improvement is achieved by modifying the coefficients obtained from the decomposed image. Small coefficient values represent subtle details and are amplified to improve the visibility of the corresponding details. The stronger image density variations make a major contribution to the overall dynamic range, and have large coefficient values. These values can be reduced without much information loss
Advanced Computational Methods for Oncological Image Analysis
[Cancer is the second most common cause of death worldwide and encompasses highly variable clinical and biological scenarios. Some of the current clinical challenges are (i) early diagnosis of the disease and (ii) precision medicine, which allows for treatments targeted to specific clinical cases. The ultimate goal is to optimize the clinical workflow by combining accurate diagnosis with the most suitable therapies. Toward this, large-scale machine learning research can define associations among clinical, imaging, and multi-omics studies, making it possible to provide reliable diagnostic and prognostic biomarkers for precision oncology. Such reliable computer-assisted methods (i.e., artificial intelligence) together with clinicians’ unique knowledge can be used to properly handle typical issues in evaluation/quantification procedures (i.e., operator dependence and time-consuming tasks). These technical advances can significantly improve result repeatability in disease diagnosis and guide toward appropriate cancer care. Indeed, the need to apply machine learning and computational intelligence techniques has steadily increased to effectively perform image processing operations—such as segmentation, co-registration, classification, and dimensionality reduction—and multi-omics data integration.
- …