RetiFluidNet: A Self-Adaptive and Multi-Attention Deep Convolutional Network for Retinal OCT Fluid Segmentation
Optical coherence tomography (OCT) helps ophthalmologists assess macular
edema, accumulation of fluids, and lesions at microscopic resolution.
Quantification of retinal fluids is necessary for OCT-guided treatment
management, which relies on a precise image segmentation step. As manual
analysis of retinal fluids is a time-consuming, subjective, and error-prone
task, there is increasing demand for fast and robust automatic solutions. In
this study, a new convolutional neural architecture named RetiFluidNet is
proposed for multi-class retinal fluid segmentation. The model benefits from
hierarchical representation learning of textural, contextual, and edge features
using a new self-adaptive dual-attention (SDA) module, multiple self-adaptive
attention-based skip connections (SASC), and a novel multi-scale deep
self-supervision learning (DSL) scheme. The attention mechanism in the proposed SDA
module enables the model to automatically extract deformation-aware
representations at different levels, and the introduced SASC paths further
consider spatial-channel interdependencies for concatenation of counterpart
encoder and decoder units, which improve representational capability.
RetiFluidNet is also optimized using a joint loss function comprising a
weighted version of Dice overlap and edge-preserved connectivity-based losses,
where several hierarchical stages of multi-scale local losses are integrated
into the optimization process. The model is validated based on three publicly
available datasets: RETOUCH, OPTIMA, and DUKE, with comparisons against several
baselines. Experimental results on the datasets prove the effectiveness of the
proposed model in retinal OCT fluid segmentation and reveal that the suggested
method is more effective than existing state-of-the-art fluid segmentation
algorithms in adapting to retinal OCT scans recorded by various image scanning
instruments.
Comment: 11 pages, Early Access Version, IEEE Transactions on Medical Imaging
DeepSignals: Predicting Intent of Drivers Through Visual Signals
Detecting the intention of drivers is an essential task in self-driving,
necessary to anticipate sudden events like lane changes and stops. Turn signals
and emergency flashers communicate such intentions, providing seconds of
potentially critical reaction time. In this paper, we propose to detect these
signals in video sequences by using a deep neural network that reasons about
both spatial and temporal information. Our experiments on more than a million
frames show high per-frame accuracy in very challenging scenarios.
Comment: To be presented at the IEEE International Conference on Robotics and Automation (ICRA), 201
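Why temporal reasoning matters here can be seen with a toy frequency-domain cue, which is not the paper's method (the paper learns spatio-temporal features end to end with a deep network): turn signals flash at roughly 1-2 Hz, so a lamp that looks "off" in a single frame can still be detected from its brightness over time. The name `blink_score` and its defaults are illustrative assumptions:

```python
import numpy as np

def blink_score(brightness, fps=30.0, lo=0.75, hi=2.5):
    # Fraction of non-DC spectral energy of a lamp region's brightness
    # signal that falls in a typical turn-signal flash band.
    x = np.asarray(brightness, dtype=float)
    x = x - x.mean()
    spec = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fps)
    band = (freqs >= lo) & (freqs <= hi)
    total = spec[1:].sum() + 1e-9      # skip the DC bin
    return spec[band].sum() / total
```

A blinking lamp concentrates energy in the flash band and scores high; a steady or dark lamp scores near zero. A per-frame classifier has no access to this cue, which is the motivation for a network that reasons over video sequences.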
Generalized Sparse Convolutional Neural Networks for Semantic Segmentation of Point Clouds Derived from Tri-Stereo Satellite Imagery
We studied the applicability of point clouds derived from tri-stereo satellite imagery for
semantic segmentation with generalized sparse convolutional neural networks, using
an Austrian study area as an example. We examined, in particular, whether the distorted geometric information, in addition
to color, influences the performance of segmenting clutter, roads, buildings, trees, and vehicles. In this
regard, we trained a fully convolutional neural network that uses generalized sparse convolution
once solely on 3D geometric information (i.e., a 3D point cloud derived by dense image matching),
and twice on 3D geometric as well as color information. In the first experiment, we did not use
class weights, whereas in the second we did. We compared the results with a fully convolutional
neural network that was trained on a 2D orthophoto, and a decision tree that was once trained on
hand-crafted 3D geometric features, and once trained on hand-crafted 3D geometric as well as color
features. The decision tree using hand-crafted features has been successfully applied to aerial laser
scanning data in the literature. Hence, we compared our main interest of study, a representation
learning technique, with another representation learning technique, and a non-representation learning
technique. Our study area is located in Waldviertel, a region in Lower Austria. The territory is
a hilly region covered mainly by forests, agriculture, and grasslands. Our classes of interest are heavily
unbalanced. However, we did not use any data augmentation techniques to counter overfitting. For our
study area, we found that adding color to the geometric information only improves the performance of the
Generalized Sparse Convolutional Neural Network (GSCNN) on the dominant class, which leads to a
higher overall performance in our case. We also found that training the network with median class
weighting partially reverts the effects of adding color: the network then also starts to learn the classes
with lower occurrences. The fully convolutional neural network that was trained on the 2D orthophoto
generally outperforms the other two with a kappa score of over 90% and an average per class accuracy
of 61%. However, the decision tree trained on colors and hand-crafted geometric features has a 2%
higher accuracy for roads.
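Median class weighting of the kind used in the second experiment is commonly computed as median-frequency balancing; the function below is an assumed sketch, and the study's exact weighting scheme may differ:

```python
import numpy as np

def median_frequency_weights(labels, num_classes):
    # Median-frequency balancing: weight each class by
    # median(class frequencies) / its own frequency, so rare classes
    # (e.g. vehicles) are up-weighted in the loss and the dominant
    # class (e.g. forest) is down-weighted.
    counts = np.bincount(labels.ravel(), minlength=num_classes).astype(float)
    freqs = counts / counts.sum()
    weights = np.zeros(num_classes)
    nz = freqs > 0
    weights[nz] = np.median(freqs[nz]) / freqs[nz]
    return weights
```

Classes at the median frequency get weight 1.0; heavily unbalanced classes get weights far above or below it, which matches the reported effect of the network starting to learn the low-occurrence classes.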
Data-Driven Deep Learning-Based Analysis on THz Imaging
Breast cancer affects about 12.5% of women in the United States, and surgical operations are often needed after diagnosis. Breast-conserving surgery can remove malignant tumors while preserving as much of the healthy tissue as possible. Due to the lack of effective real-time tumor analysis tools and a unified operating standard, the re-excision rate can exceed 30% among breast-conserving surgery patients, resulting in significant physical, psychological, and financial burdens for those patients. This work designs deep learning-based segmentation algorithms that detect tissue types in excised tissue using pulsed THz technology, and evaluates the algorithms on the tissue-type classification task for freshly excised tumor samples. Freshly excised tumor samples are more challenging than their formalin-fixed, paraffin-embedded (FFPE) block counterparts due to excessive fluid, image registration difficulties, and the lack of trustworthy pixelwise labels for each tissue sample. Additionally, evaluating freshly excised tumor samples is a step toward applying pulsed THz scan technology to breast-conserving cancer surgery in the operating room. Deep learning techniques have been heavily researched in recent years as GPU-based computation has become economical and more powerful. This dissertation revisits breast cancer tissue segmentation problems using the pulsed terahertz wave scan technique on murine samples and applies recent deep learning frameworks to enhance performance on various tasks. The study first performs pixelwise classification on terahertz scans with CNN-based neural networks and time-frequency feature tensors obtained by wavelet transformation. It then explores neural network-based semantic segmentation of terahertz scans that takes spatial information into account and handles noisy labels with label correction techniques.
Additionally, the study performs resolution restoration for visual enhancement of terahertz scans using an unsupervised, generative image-to-image translation methodology. This work also proposes a novel data processing pipeline that trains a semantic segmentation network using only neurally generated synthetic terahertz scans. The performance is evaluated using various evaluation metrics across the different tasks.
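The wavelet-based time-frequency feature tensors mentioned above can be sketched with a toy Morlet transform over a single pixel's time trace; the function name, scale set, and center frequency `w0` are illustrative assumptions, not the dissertation's exact pipeline:

```python
import numpy as np

def morlet_features(trace, scales=(2, 4, 8), w0=5.0):
    # Toy time-frequency feature tensor for one pixel's pulsed-THz
    # time trace: magnitudes of a Morlet wavelet transform at a few
    # scales, stacked into a (scales, time) array that a CNN could
    # consume per pixel.
    trace = np.asarray(trace, dtype=float)
    feats = []
    for s in scales:
        half = min(4 * s, (len(trace) - 1) // 2)   # keep kernel shorter than the trace
        t = np.arange(-half, half + 1) / s
        wavelet = np.exp(1j * w0 * t) * np.exp(-t**2 / 2) / np.sqrt(s)
        feats.append(np.abs(np.convolve(trace, wavelet, mode="same")))
    return np.stack(feats)
```

Each row localizes energy at a different temporal scale, which is the point of feeding time-frequency tensors rather than raw traces to the classifier.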
CKD-TransBTS: Clinical Knowledge-Driven Hybrid Transformer with Modality-Correlated Cross-Attention for Brain Tumor Segmentation
Brain tumor segmentation (BTS) in magnetic resonance image (MRI) is crucial
for brain tumor diagnosis, cancer management and research purposes. With the
great success of the ten-year BraTS challenges as well as the advances of CNN
and Transformer algorithms, many outstanding BTS models have been proposed
to tackle the difficulties of BTS in different technical aspects. However,
existing studies hardly consider how to fuse the multi-modality images in a
reasonable manner. In this paper, we leverage the clinical knowledge of how
radiologists diagnose brain tumors from multiple MRI modalities and propose a
clinical knowledge-driven brain tumor segmentation model, called CKD-TransBTS.
Instead of directly concatenating all the modalities, we re-organize the input
modalities by separating them into two groups according to the imaging
principle of MRI. A dual-branch hybrid encoder with the proposed
modality-correlated cross-attention block (MCCA) is designed to extract the
multi-modality image features. The proposed model inherits the strengths from
both Transformer and CNN with the local feature representation ability for
precise lesion boundaries and long-range feature extraction for 3D volumetric
images. To bridge the gap between Transformer and CNN features, we propose a
Trans&CNN Feature Calibration block (TCFC) in the decoder. We compare the
proposed model with five CNN-based models and six transformer-based models on
the BraTS 2021 challenge dataset. Extensive experiments demonstrate that the
proposed model achieves state-of-the-art brain tumor segmentation performance
compared with all the competitors.
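The idea of a modality-correlated cross-attention block, one encoder branch querying the other's features, can be sketched as plain scaled dot-product attention. This is a minimal stand-in under assumed names, not the paper's actual MCCA design:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(x_a, x_b, wq, wk, wv):
    # Queries come from modality group A, keys/values from group B,
    # so branch A attends to (and absorbs) branch B's features.
    q, k, v = x_a @ wq, x_b @ wk, x_b @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    return softmax(scores) @ v
```

Running the block in both directions (A queries B, then B queries A) is one plausible way two branches exchange multi-modality information before decoding.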
A Survey on Knowledge Graphs: Representation, Acquisition and Applications
Human knowledge provides a formal understanding of the world. Knowledge
graphs that represent structural relations between entities have become an
increasingly popular research direction towards cognition and human-level
intelligence. In this survey, we provide a comprehensive review of knowledge
graphs, covering research topics on 1) knowledge graph representation
learning, 2) knowledge acquisition and completion, 3) temporal knowledge graph,
and 4) knowledge-aware applications, and summarize recent breakthroughs and
perspective directions to facilitate future research. We propose a full-view
categorization and new taxonomies on these topics. Knowledge graph embedding is
organized from four aspects of representation space, scoring function, encoding
models, and auxiliary information. For knowledge acquisition, especially
knowledge graph completion, embedding methods, path inference, and logical rule
reasoning are reviewed. We further explore several emerging topics, including
meta relational learning, commonsense reasoning, and temporal knowledge graphs.
To facilitate future research on knowledge graphs, we also provide a curated
collection of datasets and open-source libraries on different tasks. In the
end, we offer a thorough outlook on several promising research directions.
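As a concrete example of the scoring functions such surveys organize, the classic TransE embedding model scores a triple (h, r, t) by how well the relation embedding translates the head entity to the tail, i.e. h + r ≈ t:

```python
import numpy as np

def transe_score(h, r, t, norm=1):
    # TransE: a triple (head, relation, tail) is plausible when the
    # relation vector translates head onto tail; the score is the
    # negative L1 (or L2) distance ||h + r - t||, higher is better.
    return -np.linalg.norm(h + r - t, ord=norm)
```

In training, true triples are pushed toward score 0 while corrupted triples (random head or tail) are pushed to lower scores, which is the "translation in representation space" view of knowledge graph embedding.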