Cooking State Recognition from Images Using Inception Architecture
A kitchen robot needs to understand the cooking environment properly to
carry out any cooking activity. However, object state detection has not been
researched as thoroughly as object detection. In this paper, we propose a
deep learning approach to identify different cooking states from images for a
kitchen robot. In particular, we investigate the performance of the
Inception architecture and propose a modified architecture based on the
Inception model to classify different cooking states. The model is analyzed
in terms of different layers and optimizers. Experimental results on a cooking
dataset demonstrate that the proposed model can be a potential solution to the
cooking state recognition problem.
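As a concrete illustration of the kind of model the abstract describes, here is a minimal transfer-learning sketch that adapts an ImageNet-pretrained Inception v3 to cooking state classification. The number of states and the choice of replacing only the final classifiers are assumptions for illustration, not the authors' exact modification.

```python
import torch
import torch.nn as nn
from torchvision import models

# Hypothetical number of cooking states (e.g., whole, sliced, diced, ...);
# the actual classes depend on the cooking state dataset used.
NUM_STATES = 7

# Start from an ImageNet-pretrained Inception v3 backbone.
model = models.inception_v3(weights=models.Inception_V3_Weights.IMAGENET1K_V1)

# Replace the final classifier (and the auxiliary one used during training)
# so the network predicts cooking states instead of ImageNet classes.
model.fc = nn.Linear(model.fc.in_features, NUM_STATES)
model.AuxLogits.fc = nn.Linear(model.AuxLogits.fc.in_features, NUM_STATES)

# Inception v3 expects 299x299 inputs.
x = torch.randn(1, 3, 299, 299)
model.eval()
with torch.no_grad():
    logits = model(x)            # shape: (1, NUM_STATES)
    state = logits.argmax(dim=1)
```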
Blind Image Deconvolution Using Variational Deep Image Prior
Conventional deconvolution methods utilize hand-crafted image priors to
constrain the optimization. While deep-learning-based methods have simplified
the optimization by end-to-end training, they fail to generalize well to blurs
unseen in the training dataset. Thus, training image-specific models is
important for higher generalization. Deep image prior (DIP) provides an
approach to optimize the weights of a randomly initialized network with a
single degraded image by maximum a posteriori (MAP), which shows that the
architecture of a network can itself serve as a hand-crafted image prior.
Unlike conventional hand-crafted priors, which are obtained statistically, a
suitable network architecture is hard to find because the relationship between
images and their corresponding network architectures is unclear. As a result,
the network architecture alone cannot sufficiently constrain the latent sharp
image. This paper proposes a new variational deep image
prior (VDIP) for blind image deconvolution, which exploits additive
hand-crafted image priors on latent sharp images and approximates a
distribution for each pixel to avoid suboptimal solutions. Our mathematical
analysis shows that the proposed method can better constrain the optimization.
The experimental results further demonstrate that the generated images have
better quality than those of the original DIP on benchmark datasets. The source
code of our VDIP is available at
https://github.com/Dong-Huo/VDIP-Deconvolution
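To make the DIP idea concrete, the sketch below optimizes the weights of a small randomly initialized network against a single degraded image with a plain MAP data term. It is a generic DIP-style illustration, not the authors' VDIP: VDIP additionally imposes hand-crafted priors on the latent sharp image and approximates a per-pixel distribution, and in the blind setting the kernel would be estimated jointly rather than fixed as it is here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# A deliberately tiny stand-in for the DIP generator; the actual architecture
# (an encoder-decoder in the DIP literature) is itself the implicit prior.
net = nn.Sequential(
    nn.Conv2d(8, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1), nn.Sigmoid(),
)

y = torch.rand(1, 1, 64, 64)                 # the single degraded observation
z = torch.randn(1, 8, 64, 64)                # fixed random input code
kernel = torch.full((1, 1, 5, 5), 1 / 25.0)  # blur kernel; in *blind*
                                             # deconvolution this is estimated

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(500):
    opt.zero_grad()
    x_hat = net(z)                              # latent sharp image estimate
    y_hat = F.conv2d(x_hat, kernel, padding=2)  # re-blur with the kernel
    loss = F.mse_loss(y_hat, y)                 # MAP data term (Gaussian noise)
    loss.backward()
    opt.step()
```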
DSMRI: Domain Shift Analyzer for Multi-Center MRI Datasets
In medical research and clinical applications, the utilization of MRI datasets from multiple centers has become increasingly prevalent. However, inherent variability between these centers presents challenges due to domain shift, which can impact the quality and reliability of the analysis. Regrettably, the absence of adequate tools for domain shift analysis hinders the development and validation of domain adaptation and harmonization techniques. To address this issue, this paper presents a novel Domain Shift analyzer for MRI (DSMRI) framework designed explicitly for domain shift analysis in multi-center MRI datasets. The proposed model assesses the degree of domain shift within an MRI dataset by leveraging various MRI-quality-related metrics derived from the spatial domain. DSMRI also incorporates features from the frequency domain to capture low- and high-frequency information about the image. It further includes wavelet-domain features that effectively measure the sparsity and energy present in the wavelet coefficients. Furthermore, DSMRI introduces several texture features, thereby enhancing the robustness of the domain shift analysis process. The proposed framework includes visualization techniques such as t-SNE and UMAP to demonstrate that similar data are grouped closely while dissimilar data fall into separate clusters. Additionally, quantitative analysis is used to measure the domain shift distance, domain classification accuracy, and the ranking of significant features. The effectiveness of the proposed approach is demonstrated through experimental evaluations on seven large-scale multi-site neuroimaging datasets.
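The sketch below illustrates the flavor of such an analysis: simple spatial, frequency, and wavelet features are extracted per scan and embedded with t-SNE so that scans from different centers can be compared. The feature definitions are stand-ins chosen for brevity, not the actual DSMRI metric set.

```python
import numpy as np
import pywt
from sklearn.manifold import TSNE

def shift_features(img: np.ndarray) -> np.ndarray:
    """Illustrative spatial/frequency/wavelet features for one 2D MRI slice.

    Simple stand-ins for the quality, frequency, and wavelet metrics
    described in the paper, not the exact DSMRI feature set.
    """
    # Spatial domain: intensity statistics as crude quality proxies.
    mean, std = img.mean(), img.std()

    # Frequency domain: share of spectral energy in the low-frequency band.
    spec = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2
    h, w = spec.shape
    low_block = spec[h // 4: 3 * h // 4, w // 4: 3 * w // 4]
    low_ratio = low_block.sum() / spec.sum()

    # Wavelet domain: sparsity and energy of the detail coefficients.
    _, (cH, cV, cD) = pywt.dwt2(img, "haar")
    details = np.concatenate([cH.ravel(), cV.ravel(), cD.ravel()])
    sparsity = np.mean(np.abs(details) < 1e-3)
    energy = np.sum(details ** 2)

    return np.array([mean, std, low_ratio, sparsity, energy])

# Embed per-scan feature vectors with t-SNE; scans from the same center
# should cluster together when a domain shift is present.
feats = np.stack([shift_features(np.random.rand(128, 128)) for _ in range(50)])
emb = TSNE(n_components=2, perplexity=10).fit_transform(feats)
```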
Accurate personalized survival prediction for amyotrophic lateral sclerosis patients
Amyotrophic Lateral Sclerosis (ALS) is a rapidly progressive neurodegenerative disease. Accurately predicting the survival time for ALS patients can help patients and clinicians to plan for future treatment and care. We describe the application of a machine-learned tool that incorporates clinical features and cortical thickness from brain magnetic resonance (MR) images to estimate the time until a composite respiratory failure event for ALS patients, and presents the prediction as individual survival distributions (ISDs). These ISDs provide the probability of survival (i.e., of remaining free of respiratory failure) at multiple future time points for each individual patient. Our learner considers several survival prediction models and selects the best model to provide predictions. We evaluate our learned model using the mean absolute error margin (MAE-margin), a modified version of mean absolute error that handles data with censored outcomes. We show that our tool can provide helpful information for patients and clinicians in planning future treatment.
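As a small illustration of what an ISD gives a clinician, the sketch below reads the event-free probability at a horizon and a predicted median off one patient's curve. The time grid and probabilities are invented for illustration and do not come from the paper.

```python
import numpy as np

# A hypothetical individual survival distribution (ISD) for one patient:
# probability of remaining free of the composite respiratory failure event
# at each future time point (months). Values are invented for illustration.
times = np.array([0, 6, 12, 18, 24, 36, 48])
surv = np.array([1.00, 0.92, 0.78, 0.61, 0.45, 0.24, 0.10])

# Probability of being event-free past 18 months, read off the curve.
p18 = surv[times == 18][0]

# Predicted median survival: first time the curve drops to 0.5 or below,
# linearly interpolated between the surrounding grid points.
i = np.argmax(surv <= 0.5)
median = np.interp(0.5, [surv[i], surv[i - 1]], [times[i], times[i - 1]])
print(f"P(event-free at 18 mo) = {p18:.2f}, predicted median = {median:.1f} mo")
```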
SF2Former: Amyotrophic Lateral Sclerosis Identification From Multi-center MRI Data Using Spatial and Frequency Fusion Transformer
Amyotrophic Lateral Sclerosis (ALS) is a complex neurodegenerative disorder
involving motor neuron degeneration. Significant research has begun to
establish brain magnetic resonance imaging (MRI) as a potential biomarker to
diagnose and monitor the state of the disease. Deep learning has emerged as a
prominent class of machine learning methods in computer vision and has been
successfully employed to solve diverse medical image analysis tasks. However,
deep learning-based methods applied to neuroimaging have not achieved superior
performance in classifying ALS patients from healthy controls, because the
structural changes correlated with pathological features are subtle. The
critical challenge for deep models is therefore to identify useful
discriminative features from limited training data. By exploiting the
long-range relationships among image features, this study introduces a
framework named SF2Former that leverages the power of the vision transformer
architecture to distinguish ALS subjects from the control group. To further
improve the network's performance, spatial and frequency domain information is
combined, because MRI scans are captured in the frequency domain before being
converted to the spatial domain. The proposed framework is trained on a set of
consecutive coronal 2D slices and uses weights pre-trained on ImageNet via
transfer learning. Finally, a majority voting scheme is applied to the coronal
slices of each subject to produce the final classification decision. Our
proposed architecture has been thoroughly assessed with multi-modal
neuroimaging data using two well-organized versions of the Canadian ALS
Neuroimaging Consortium (CALSNIC) multi-center datasets. The experimental
results demonstrate the superiority of our proposed strategy in terms of
classification accuracy compared with several popular deep learning-based
techniques.
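The subject-level decision described above reduces to a simple majority vote over the slice-level predictions; here is a minimal sketch with hypothetical slice labels:

```python
import numpy as np

# Hypothetical per-slice class predictions (0 = control, 1 = ALS) produced by
# the slice-level classifier for the consecutive coronal slices of one subject.
slice_preds = np.array([1, 1, 0, 1, 1, 0, 1, 1, 1])

# Majority voting: the subject-level decision is the most frequent slice label.
subject_pred = np.bincount(slice_preds, minlength=2).argmax()
print("ALS" if subject_pred == 1 else "Control")
```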
SF2Former: Amyotrophic lateral sclerosis identification from multi-center MRI data using spatial and frequency fusion transformer
Amyotrophic Lateral Sclerosis (ALS) is a complex neurodegenerative disorder characterized by motor neuron degeneration. Significant research has begun to establish brain magnetic resonance imaging (MRI) as a potential biomarker to diagnose and monitor the state of the disease. Deep learning has emerged as a prominent class of machine learning algorithms in computer vision and has shown successful applications in various medical image analysis tasks. However, deep learning methods applied to neuroimaging have not achieved superior performance in classifying ALS patients from healthy controls due to insignificant structural changes correlated with pathological features. Thus, a critical challenge in deep models is to identify discriminative features from limited training data. To address this challenge, this study introduces a framework called SF2Former, which leverages the power of the vision transformer architecture to distinguish ALS subjects from the control group by exploiting the long-range relationships among image features. Additionally, spatial and frequency domain information is combined to enhance the network’s performance, as MRI scans are initially captured in the frequency domain and then converted to the spatial domain. The proposed framework is trained using a series of consecutive coronal slices and utilizes pre-trained weights from ImageNet through transfer learning. Finally, a majority voting scheme is employed on the coronal slices of each subject to generate the final classification decision. The proposed architecture is extensively evaluated with multi-modal neuroimaging data (i.e., T1-weighted, R2*, FLAIR) using two well-organized versions of the Canadian ALS Neuroimaging Consortium (CALSNIC) multi-center datasets. The experimental results demonstrate the superiority of the proposed strategy in terms of classification accuracy compared to several popular deep learning-based techniques.
• We propose a novel vision transformer-based deep model to classify ALS subjects from the healthy control group.
• We analyze two independent and extensive MRI datasets of 120 and 232 scans of ALS and healthy control subjects.
• To evaluate the effectiveness of our proposed framework, we leverage multi-center and multi-modal neuroimaging data, including T1-weighted, R2*, and FLAIR images.
• Our results demonstrate superior classification accuracy when compared to various state-of-the-art deep learning-based techniques.