216 research outputs found

    A Comprehensive Survey of Convolutions in Deep Learning: Applications, Challenges, and Future Trends

    In today's digital age, Convolutional Neural Networks (CNNs), a subset of Deep Learning (DL), are widely used for various computer vision tasks such as image classification, object detection, and image segmentation. There are numerous types of CNNs designed to meet specific needs and requirements, including 1D, 2D, and 3D CNNs, as well as dilated, grouped, attention, and depthwise convolutions, and NAS, among others. Each type of CNN has its unique structure and characteristics, making it suitable for specific tasks. It is crucial to perform a comparative analysis of these different CNN types to understand their strengths and weaknesses. Furthermore, studying the performance, limitations, and practical applications of each type of CNN can aid in the development of new and improved architectures in the future. We also examine, from various perspectives, the platforms and frameworks that researchers use for research and development. Additionally, we explore the main research fields of CNNs, such as 6D vision, generative models, and meta-learning. This survey paper provides a comprehensive examination and comparison of various CNN architectures, highlighting their architectural differences and emphasizing their respective advantages, disadvantages, applications, challenges, and future trends.
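As a rough illustration of two of the convolution variants named in this abstract, the following NumPy sketch contrasts a standard, a dilated, and a depthwise 1D convolution. It is not taken from the survey; the function names `conv1d` and `depthwise_conv1d` and the toy inputs are assumptions made for this example only.

```python
import numpy as np

def conv1d(x, w, dilation=1):
    """Valid-mode 1D cross-correlation with an optional dilation factor."""
    k = len(w)
    span = (k - 1) * dilation + 1  # receptive field of one output sample
    n_out = len(x) - span + 1
    return np.array([
        sum(w[j] * x[i + j * dilation] for j in range(k))
        for i in range(n_out)
    ])

def depthwise_conv1d(x, w):
    """Depthwise variant: x is (channels, length), w is (channels, k).

    Each channel is filtered by its own kernel; channels are never mixed.
    """
    return np.stack([conv1d(x[c], w[c]) for c in range(x.shape[0])])

# Toy input (hypothetical): a ramp signal and a difference kernel.
x = np.arange(8, dtype=float)
w = np.array([1.0, 0.0, -1.0])

standard = conv1d(x, w)             # each output sees 3 input samples
dilated = conv1d(x, w, dilation=2)  # same 3 weights, but spread over 5 samples

x2 = np.stack([x, 2 * x])           # two channels
w2 = np.stack([w, w])               # one kernel per channel
depthwise = depthwise_conv1d(x2, w2)
```

The dilated call reuses the same three weights but covers five input samples, which is how dilation enlarges the receptive field without adding parameters.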

    Multi-source Remote Sensing for Forest Characterization and Monitoring

    As a dominant terrestrial ecosystem of the Earth, forest environments play profound roles in ecology, biodiversity, resource utilization, and management, which highlights the significance of forest characterization and monitoring. Some forest parameters can help track climate change and quantify the global carbon cycle and therefore attract growing attention from various research communities. Compared with traditional in-situ methods, which involve expensive and time-consuming fieldwork, airborne and spaceborne remote sensors collect cost-efficient and consistent observations at global or regional scales and have proven to be an effective means of forest monitoring. With the looming paradigm shift toward data-intensive science and the development of remote sensors, remote sensing data with higher resolution and diversity have become the mainstream in data analysis and processing. However, significant heterogeneities in multi-source remote sensing data largely restrain its forest applications, urging the research community to come up with effective synergistic strategies. The work presented in this thesis contributes to the field by exploring the potential of Synthetic Aperture Radar (SAR), SAR Polarimetry (PolSAR), SAR Interferometry (InSAR), Polarimetric SAR Interferometry (PolInSAR), Light Detection and Ranging (LiDAR), and multispectral remote sensing in forest characterization and monitoring from three main aspects: forest height estimation, active fire detection, and burned area mapping. First, forest height inversion is demonstrated using airborne L-band dual-baseline repeat-pass PolInSAR data based on modified versions of the Random Motion over Ground (RMoG) model, where the scattering attenuation and wind-derived random motion are described under the conditions of homogeneous and heterogeneous volume layers, respectively. 
A boreal and a tropical forest test site are involved in the experiment to explore the flexibility of different models over different forest types and based on that, a leveraging strategy is proposed to boost the accuracy of forest height estimation. The accuracy of the model-based forest height inversion is limited by the discrepancy between the theoretical models and actual scenarios and exhibits a strong dependency on the system and scenario parameters. Hence, high vertical accuracy LiDAR samples are employed to assist the PolInSAR-based forest height estimation. This multi-source forest height estimation is reformulated as a pan-sharpening task aiming to generate forest heights with high spatial resolution and vertical accuracy based on the synergy of the sparse LiDAR-derived heights and the information embedded in the PolInSAR data. This process is realized by a specifically designed generative adversarial network (GAN) allowing high accuracy forest height estimation less limited by theoretical models and system parameters. Related experiments are carried out over a boreal and a tropical forest to validate the flexibility of the method. An automated active fire detection framework is proposed for the medium resolution multispectral remote sensing data. The basic part of this framework is a deep-learning-based semantic segmentation model specifically designed for active fire detection. A dataset is constructed with open-access Sentinel-2 imagery for the training and testing of the deep-learning model. The developed framework allows an automated Sentinel-2 data download, processing, and generation of the active fire detection results through time and location information provided by the user. Related performance is evaluated in terms of detection accuracy and processing efficiency. 
The last part of this thesis explores whether coarse burned area products can be further improved through the synergy of multispectral, SAR, and InSAR features with higher spatial resolutions. A Siamese Self-Attention (SSA) classification is proposed for multi-sensor burned area mapping, and a multi-source dataset is constructed at the object level for training and testing. Results are analyzed by test site, feature source, and classification method to assess the improvements achieved by the proposed method. All developed methods are validated with extensive processing of multi-source data acquired by Uninhabited Aerial Vehicle Synthetic Aperture Radar (UAVSAR), Land, Vegetation, and Ice Sensor (LVIS), PolSARproSim+, Sentinel-1, and Sentinel-2. I hope these studies constitute a substantial contribution to the forest applications of multi-source remote sensing.

    Introduction to Facial Micro Expressions Analysis Using Color and Depth Images: A Matlab Coding Approach (Second Edition, 2023)

    This book offers a gentle introduction to the field of Facial Micro Expressions Recognition (FMER) using color and depth images, with the aid of the MATLAB programming environment. FMER is a subset of image processing and a multidisciplinary topic of analysis, so it requires familiarity with other topics of Artificial Intelligence (AI) such as machine learning, digital image processing, psychology and more. It is therefore a great opportunity to write a book that covers all of these topics for beginner to professional readers in the field of AI, and even for readers without a background in AI. Our goal is to provide a standalone introduction to the field of FMER analysis in the form of theoretical descriptions for readers with no background in image processing, together with reproducible MATLAB practical examples. We also describe the basic definitions for FMER analysis and the MATLAB libraries used in the text, which helps the reader apply the experiments in real-world applications. We believe that this book is suitable for students, researchers, and professionals alike who need to develop practical skills along with a basic understanding of the field. We expect that, after reading this book, the reader will feel comfortable with different key stages such as color and depth image processing, color and depth image representation, classification, machine learning, facial micro-expressions recognition, feature extraction and dimensionality reduction. Comment: This is the second edition of the book.

    Echocardiography

    The book "Echocardiography - New Techniques" brings worldwide contributions from highly acclaimed clinical and imaging science investigators, and representatives from academic medical centers. Each chapter is designed and written to be accessible to those with a basic knowledge of echocardiography. Additionally, the chapters are meant to be stimulating and educational to the experts and investigators in the field of echocardiography. This book is aimed primarily at cardiology fellows on their basic echocardiography rotation, fellows in general internal medicine, radiology and emergency medicine, and experts in the arena of echocardiography. Over the last few decades, the rate of technological advancement has increased dramatically, resulting in new techniques and improved echocardiographic imaging. The authors of this book focused on presenting the most advanced techniques useful in today's research and in daily clinical practice. These advanced techniques are utilized in the detection of different cardiac pathologies in patients, in contributing to clinical decisions about them, and in follow-up and outcome predictions. In addition to the advanced techniques covered, this book expounds upon several special pathologies with respect to the functions of echocardiography.

    Generalizable automated pixel-level structural segmentation of medical and biological data

    Over the years, the rapid expansion in imaging techniques and equipment has driven the demand for more automation in handling large medical and biological data sets. A wealth of approaches have been suggested as optimal solutions for their respective imaging types. These solutions span various image resolutions, modalities and contrast (staining) mechanisms. Few approaches generalise well across multiple image types, contrasts or resolutions. This thesis proposes an automated pixel-level framework that addresses 2D, 2D+t and 3D structural segmentation in a more generalizable manner, yet has enough adaptability to address a number of specific image modalities, spanning retinal funduscopy, sequential fluorescein angiography and two-photon microscopy. The pixel-level segmentation scheme involves: i) constructing a phase-invariant orientation field of the local spatial neighbourhood; ii) combining local feature maps with intensity-based measures in a structural patch context; iii) using a complex supervised learning process to interpret the combination of all the elements in the patch in order to reach a classification decision. This has the advantage of transferability from retinal blood vessels in 2D to neural structures in 3D. To process the temporal components in non-standard 2D+t retinal angiography sequences, we first introduce a co-registration procedure: at the pairwise level, we combine projective RANSAC with a quadratic homography transformation to map the coordinate systems between any two frames. At the joint level, we construct a hierarchical approach in order for each individual frame to be registered to the global reference intra- and inter-sequence(s). We then take a non-training approach that searches in both the spatial neighbourhood of each pixel and the filter output across varying scales to locate and link microvascular centrelines to (sub-)pixel accuracy. 
In essence, this "link while extract" piece-wise segmentation approach combines the local phase-invariant orientation field information with additional local phase estimates to obtain a soft classification of the centreline (sub-)pixel locations. Unlike retinal segmentation problems where vasculature is the main focus, 3D neural segmentation requires additional flexibility, allowing a variety of structures of anatomical importance yet with different geometric properties to be differentiated both from the background and against other structures. Notably, cellular structures, such as Purkinje cells, neural dendrites and interneurons, all display certain elongation along their medial axes, yet each class has a characteristic shape captured by an orientation field that distinguishes it from other structures. To take this into consideration, we introduce a 5D orientation mapping to capture these orientation properties. This mapping is incorporated into the local feature map description prior to a learning machine. Extensive performance evaluations and validation of each of the techniques presented in this thesis are carried out. For retinal fundus images, we compute Receiver Operating Characteristic (ROC) curves on existing public databases (DRIVE & STARE) to assess and compare our algorithms with other benchmark methods. For 2D+t retinal angiography sequences, we compare the error metrics ("Centreline Error") of our scheme with those of other benchmark methods. For microscopic cortical data stacks, we present segmentation results on both surrogate data with known ground-truth and experimental rat cerebellar cortex two-photon microscopic tissue stacks.

    Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries

    This two-volume set LNCS 12962 and 12963 constitutes the thoroughly refereed proceedings of the 7th International MICCAI Brainlesion Workshop, BrainLes 2021, as well as the RSNA-ASNR-MICCAI Brain Tumor Segmentation (BraTS) Challenge, the Federated Tumor Segmentation (FeTS) Challenge, the Cross-Modality Domain Adaptation (CrossMoDA) Challenge, and the challenge on Quantification of Uncertainties in Biomedical Image Quantification (QUBIQ). These were held jointly at the 23rd Medical Image Computing for Computer Assisted Intervention Conference, MICCAI 2020, in September 2021. The 91 revised papers presented in these volumes were selected from 151 submissions. Due to the COVID-19 pandemic, the conference was held virtually. This is an open access book.

    Foetal echocardiographic segmentation

    Congenital heart disease affects just under one percent of all live births [1]. Those defects that manifest themselves as changes to the cardiac chamber volumes are the motivation for the research presented in this thesis. Blood volume measurements in vivo require delineation of the cardiac chambers, and manual tracing of foetal cardiac chambers is very time consuming and operator dependent. This thesis presents a multi-region-based level set snake deformable model, applied in both 2D and 3D, which can automatically adapt to some extent to ultrasound noise such as attenuation, speckle and partial occlusion artefacts. The algorithm presented is named Mumford Shah Sarti Collision Detection (MSSCD). The level set methods presented in this thesis have an optional shape prior term for constraining the segmentation by a template registered to the image in the presence of shadowing and heavy noise. When applied to real data in the absence of the template, the MSSCD algorithm is initialised from seed primitives placed at the centre of each cardiac chamber. The voxel statistics inside each chamber are determined before evolution. The MSSCD stops at open boundaries between two chambers as the two approaching level set fronts meet. This has significance when determining volumes for all cardiac compartments, since cardiac indices assume that each chamber is treated in isolation. Comparison of the segmentation results from the implemented snakes, including a previous level set method in the foetal cardiac literature, shows that in both 2D and 3D, on both real and synthetic data, the MSSCD formulation is better suited to these types of data. All the algorithms tested in this thesis are within 2 mm error of manually traced segmentations of the foetal cardiac datasets. This corresponds to less than 10% of the length of a foetal heart. 
In addition to comparison with manual tracings, all the amorphous deformable model segmentations in this thesis are validated using a physical phantom. The volume estimation of the phantom by the MSSCD segmentation is within 13% of the physically determined volume.

    Medical image enhancement

    Each image acquired from a medical imaging system is often part of a two-dimensional (2-D) image set that together represents a three-dimensional (3-D) object for diagnosis. Unfortunately, these images are sometimes of poor quality. The resulting distortions cause an inadequate presentation of the object of interest, which can result in inaccurate image analysis. Blurring is considered a serious problem. Therefore, "deblurring" an image to obtain better quality is an important issue in medical image processing. In our research, the image is initially decomposed. Contrast improvement is achieved by modifying the coefficients obtained from the decomposed image. Small coefficient values represent subtle details and are amplified to improve the visibility of the corresponding details. The stronger image density variations make a major contribution to the overall dynamic range and have large coefficient values. These values can be reduced without much information loss.
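The coefficient modification described in this abstract can be sketched roughly as follows. The abstract does not name the decomposition, so this example assumes a simple two-band split (a box-blurred base plus a detail residual) and a power-law remapping; `box_blur`, `remap`, `enhance`, `pivot`, and `p` are hypothetical names and parameters, not the authors' method.

```python
import numpy as np

def box_blur(img, radius=2):
    """Separable box filter with edge padding; stands in for the low-pass band."""
    k = 2 * radius + 1
    kernel = np.ones(k) / k
    padded = np.pad(img, radius, mode="edge")
    # Filter along rows, then along columns; "valid" undoes the padding.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)

def remap(d, pivot=0.1, p=0.7):
    """Power-law remapping of detail coefficients (assumed form).

    With 0 < p < 1, magnitudes below `pivot` are amplified and
    magnitudes above `pivot` are reduced; the sign is preserved.
    """
    return np.sign(d) * pivot * (np.abs(d) / pivot) ** p

def enhance(img, pivot=0.1, p=0.7, radius=2):
    """Decompose into base + detail, remap the detail, and recombine."""
    low = box_blur(img, radius)
    detail = img - low
    return low + remap(detail, pivot, p)

# Hypothetical usage on a random test image with values in [0, 1).
rng = np.random.default_rng(0)
image = rng.random((16, 16))
result = enhance(image)
```

A multi-level pyramid or wavelet decomposition would apply the same remapping per level; the single split here only keeps the sketch short.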

    Advanced Computational Methods for Oncological Image Analysis

    Cancer is the second most common cause of death worldwide and encompasses highly variable clinical and biological scenarios. Some of the current clinical challenges are (i) early diagnosis of the disease and (ii) precision medicine, which allows for treatments targeted to specific clinical cases. The ultimate goal is to optimize the clinical workflow by combining accurate diagnosis with the most suitable therapies. Toward this, large-scale machine learning research can define associations among clinical, imaging, and multi-omics studies, making it possible to provide reliable diagnostic and prognostic biomarkers for precision oncology. Such reliable computer-assisted methods (i.e., artificial intelligence) together with clinicians' unique knowledge can be used to properly handle typical issues in evaluation/quantification procedures (i.e., operator dependence and time-consuming tasks). These technical advances can significantly improve result repeatability in disease diagnosis and guide toward appropriate cancer care. Indeed, the need to apply machine learning and computational intelligence techniques has steadily increased to effectively perform image processing operations (such as segmentation, co-registration, classification, and dimensionality reduction) and multi-omics data integration.