6 research outputs found

    Multiclass CBCT image segmentation for orthodontics with deep learning

    Accurate segmentation of the jaw (i.e., mandible and maxilla) and the teeth in cone beam computed tomography (CBCT) scans is essential for orthodontic diagnosis and treatment planning. Although various (semi)automated methods have been proposed to segment the jaw or the teeth, fully automated methods that can simultaneously segment both anatomic structures in CBCT scans (i.e., multiclass segmentation) are still lacking. In this study, we aimed to train and validate a mixed-scale dense (MS-D) convolutional neural network for multiclass segmentation of the jaw, the teeth, and the background in CBCT scans. Thirty CBCT scans were obtained from patients who had undergone orthodontic treatment. Gold standard segmentation labels were manually created by 4 dentists. As a benchmark, we also evaluated MS-D networks that segmented either the jaw or the teeth (i.e., binary segmentation). All segmented CBCT scans were converted to virtual 3-dimensional (3D) models. The segmentation performance of all trained MS-D networks was assessed by the Dice similarity coefficient and surface deviation. The CBCT scans segmented by the MS-D network demonstrated a large overlap with the gold standard segmentations (Dice similarity coefficient: 0.934 ± 0.019, jaw; 0.945 ± 0.021, teeth). The MS-D network–based 3D models of the jaw and the teeth showed minor surface deviations when compared with the corresponding gold standard 3D models (0.390 ± 0.093 mm, jaw; 0.204 ± 0.061 mm, teeth). The MS-D network took approximately 25 s to segment 1 CBCT scan, whereas manual segmentation took about 5 h. This study showed that multiclass segmentation of the jaw and the teeth was accurate, with performance comparable to binary segmentation. The MS-D network trained for multiclass segmentation would therefore make patient-specific orthodontic treatment more feasible by greatly reducing the time required to segment multiple anatomic structures in CBCT scans.
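
    The Dice similarity coefficient reported above measures volumetric overlap between a predicted and a gold standard segmentation. A minimal sketch of the per-class computation, assuming an illustrative label encoding (0 = background, 1 = jaw, 2 = teeth) rather than the study's actual one:

```python
import numpy as np

def dice_per_class(pred, gold, labels=(1, 2)):
    """Dice similarity coefficient for each foreground class.

    pred, gold: integer label volumes of identical shape
    (assumed encoding: 0 = background, 1 = jaw, 2 = teeth).
    """
    scores = {}
    for label in labels:
        p = (pred == label)
        g = (gold == label)
        intersection = np.logical_and(p, g).sum()
        denom = p.sum() + g.sum()
        # Dice = 2 * |P ∩ G| / (|P| + |G|); define as 1.0 if both are empty
        scores[label] = 2.0 * intersection / denom if denom > 0 else 1.0
    return scores

# Toy example with two small 3D volumes:
pred = np.zeros((4, 4, 4), dtype=int); pred[1:3, 1:3, 1:3] = 1
gold = np.zeros((4, 4, 4), dtype=int); gold[1:3, 1:3, 2:4] = 1
print(dice_per_class(pred, gold))  # {1: 0.5, 2: 1.0}
```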

    Emotional classification of music using neural networks with the MediaEval dataset

    The proven ability of music to transmit emotions has provoked increasing interest in the development of new algorithms for music emotion recognition (MER). In this work, we present an automatic system for the emotional classification of music by implementing a neural network. This work is based on a previous implementation of a dimensional emotional prediction system in which a multilayer perceptron (MLP) was trained on the freely available MediaEval database. Although these previous results are good in terms of the prediction metrics, they are not good enough to obtain a classification by quadrant based on the valence and arousal values predicted by the neural network, mainly due to the imbalance between classes in the dataset. To achieve better classification values, a pre-processing phase was implemented to stratify and balance the dataset. Three different classifiers were compared: a linear support vector machine (SVM), random forest, and MLP. The best results were obtained with the MLP, with an averaged F-measure of 50% in a four-quadrant classification schema. Two binary classification approaches are also presented: a one-vs.-rest (OvR) approach over the four quadrants, and separate binary classifiers for valence and arousal. The OvR approach achieves an average F-measure of 69%, and the binary classifiers obtain F-measures of 73% and 69% for valence and arousal, respectively. Finally, a dynamic classification analysis with different time windows was performed using the temporal annotation data of the MediaEval database. The results show that the four-quadrant classification F-measures are practically constant, regardless of the duration of the time window. This work also reflects some limitations related to the characteristics of the dataset, including its size, class balance, annotation quality, and the available sound features.
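
    The quadrant labels come from thresholding the valence and arousal annotations. A hedged sketch of that mapping and of a stratified MLP baseline (the random feature matrix, thresholds at zero, and network size below are placeholder assumptions, not the paper's setup):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import f1_score

def quadrant(valence, arousal):
    """Map a (valence, arousal) pair to one of four quadrants:
    0 = V+/A+, 1 = V-/A+, 2 = V-/A-, 3 = V+/A-."""
    if valence >= 0:
        return 0 if arousal >= 0 else 3
    return 1 if arousal >= 0 else 2

# X: audio feature matrix, va: (valence, arousal) annotations -- placeholders
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))
va = rng.uniform(-1, 1, size=(500, 2))
y = np.array([quadrant(v, a) for v, a in va])

# Stratified split keeps quadrant proportions comparable across train/test
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
clf.fit(X_tr, y_tr)
print("macro F1:", f1_score(y_te, clf.predict(X_te), average="macro"))
```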

    Optimization of deep learning methods for visualization of tumor heterogeneity and brain tumor grading through digital pathology

    Background: Variations in prognosis and treatment options for gliomas depend on tumor grading. When tissue is available for analysis, grade is established based on histological criteria. However, histopathological diagnosis is not always reliable or straightforward due to tumor heterogeneity, sampling error, and subjectivity, and hence there is great interobserver variability in readings. Methods: We trained convolutional neural network models to classify digital whole-slide histopathology images from The Cancer Genome Atlas. We tested a number of optimization parameters. Results: Data augmentation did not improve model training, while a smaller batch size helped to prevent overfitting and led to improved model performance. There was no significant difference in performance between a modular 2-class model system and a single 3-class model. The best models achieved a mean accuracy of 73% in classifying glioblastoma from other grades and 53% between WHO grade II and III gliomas. A visualization method was developed to convey the model output in a clinically relevant manner by overlaying color-coded predictions on the original whole-slide image. Conclusions: Our visualization method reflects the clinical decision-making process by highlighting intratumor heterogeneity and may be used in a clinical setting to aid diagnosis. Explainable artificial intelligence techniques may allow further evaluation of the model and highlight areas for improvement, such as biases. Due to intratumor heterogeneity, data annotation for training was imprecise, and hence performance was lower than expected. The models may be further improved by employing advanced data augmentation strategies and using more precise semiautomatic or manually labeled training data.
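
    The overlay idea can be illustrated with a small sketch: per-patch class predictions are upsampled to the resolution of a slide thumbnail and alpha-blended on top of it. The thumbnail, prediction grid, and class-to-color mapping below are placeholder assumptions, not the paper's actual pipeline:

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap

# Hypothetical inputs: a downsampled slide thumbnail and a grid of
# per-patch class predictions (0 = grade II, 1 = grade III, 2 = GBM).
thumbnail = np.random.rand(64, 64, 3)          # placeholder RGB thumbnail
patch_preds = np.random.randint(0, 3, (8, 8))  # placeholder prediction grid

# Upsample the patch grid to thumbnail resolution (nearest neighbour)
scale = thumbnail.shape[0] // patch_preds.shape[0]
overlay = np.kron(patch_preds, np.ones((scale, scale), dtype=int))

cmap = ListedColormap(["green", "yellow", "red"])  # one color per class
plt.imshow(thumbnail)
plt.imshow(overlay, cmap=cmap, alpha=0.4, vmin=0, vmax=2)  # blend predictions
plt.axis("off")
plt.savefig("slide_overlay.png", dpi=150)
```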

    Multiclass Bone Segmentation of PET/CT Scans for Automatic SUV Extraction

    In this thesis I present an automated framework for the segmentation of bone structures from dual-modality PET/CT scans and the subsequent extraction of SUV measurements. The first stage of this framework consists of a variant of the 3D U-Net architecture for segmentation of three bone structures: vertebral body, pelvis, and sternum. The dataset for this model consists of annotated slices from CT scans retrieved from a study of post-HSCT patients imaged with the 18F-FLT radiotracer; these are undersampled volumes due to the low-dose radiation used during scanning. The mean Dice scores obtained by the proposed model are 0.9162, 0.9163, and 0.8721 for the vertebral body, pelvis, and sternum classes, respectively. The next step of the proposed framework consists of identifying the individual vertebrae, which is a particularly difficult task due to the low resolution of the CT scans in the axial dimension. To address this issue, I present an iterative algorithm for instance segmentation of vertebral bodies, based on anatomical priors of the spine for detecting the starting point of a vertebra. The spatial information contained in the CT and PET scans is used to translate the resulting masks to the PET image space and extract SUV measurements. I then present a CNN model based on the DenseNet architecture that, for the first time, classifies the spatial distribution of SUV within the marrow cavities of the vertebral bodies as normal engraftment or possible relapse. With an AUC of 0.931 and an accuracy of 92% obtained on real patient data, this method shows good potential as a future automated tool to assist in monitoring the recovery process of HSCT patients.
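
    The SUV extraction step reduces to a standard formula: SUV = activity concentration × body weight / injected dose. A minimal sketch, assuming the CT-derived mask has already been resampled into PET image space and the PET volume is calibrated in Bq/mL (function and variable names are illustrative, not the thesis's code):

```python
import numpy as np

def mean_suv(pet_activity, mask, injected_dose_bq, body_weight_kg):
    """Mean standardized uptake value within a segmentation mask.

    pet_activity: PET volume in Bq/mL, assumed to be in the same image
    space as `mask` (i.e., the CT mask was resampled beforehand).
    SUV = activity concentration * body weight [g] / injected dose [Bq].
    """
    suv = pet_activity * (body_weight_kg * 1000.0) / injected_dose_bq
    return float(suv[mask > 0].mean())

# Toy example with placeholder values:
pet = np.full((10, 10, 10), 5000.0)  # Bq/mL
vertebra_mask = np.zeros((10, 10, 10)); vertebra_mask[4:6, 4:6, 4:6] = 1
print(mean_suv(pet, vertebra_mask, injected_dose_bq=3.7e8,
               body_weight_kg=70))  # 5000 * 70000 / 3.7e8 ≈ 0.95
```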

    Deep learning in computer-assisted maxillofacial surgery


    Development and validation of a neural network for adaptive gait cycle detection from kinematic data

    (1) Background: Instrumented gait analysis is a tool for quantifying the different aspects of the locomotor system. Gait analysis technology has evolved substantially over the last decade, and most modern systems provide real-time capability. The ability to calculate joint angles with low delays paves the way for new applications such as real-time movement feedback, e.g., the control of functional electrical stimulation in the rehabilitation of individuals with gait disorders. For any kind of therapeutic application, the timely determination of different gait phases such as stance or swing is crucial. Gait phases are usually estimated based on heuristics of joint angles or the time points of certain gait events. Such heuristic approaches often do not work properly in people with gait disorders due to the greater variability of their pathological gait patterns. To improve on the current state of the art, this thesis introduces a data-driven approach for real-time determination of gait phases from kinematic variables based on long short-term memory recurrent neural networks (LSTM RNNs). (2) Methods: In this thesis, 56 measurements with gait data of 11 healthy subjects, 13 individuals with incomplete spinal cord injury, and 10 stroke survivors, with walking speeds ranging from 0.2 m/s up to 1 m/s, were used to train the networks. Each measurement contained kinematic data from the corresponding subject walking on a treadmill for 90 seconds. Kinematic data were obtained by measuring the positions of reflective markers on body landmarks (Helen Hayes marker set) at a sample rate of 60 Hz. For constructing a ground truth, gait data were annotated manually by three raters. Two approaches, direct regression of gait phases and estimation via detection of the gait events Initial Contact and Final Contact, were implemented to evaluate the performance of LSTM RNNs. For comparison of performance, the frequently cited coordinate- and velocity-based event detection approaches of Zeni et al. were used. All aspects of this thesis were implemented in MATLAB Version 9.6 using the Deep Learning Toolbox. (3) Results: The mean time difference between events annotated by the three raters was −0.07 ± 20.17 ms. Correlation coefficients of inter-rater and intra-rater reliability yielded mainly excellent or perfect results. For detection of gait events, the LSTM RNN algorithm covered 97.05% of all events within a margin of 50 ms. The overall mean time difference between detected events and the ground truth was −11.62 ± 7.01 ms. Temporal differences and deviations were consistently small across different walking speeds and gait pathologies. The mean time difference to the ground truth was 13.61 ± 17.88 ms for the coordinate-based approach of Zeni et al. and 17.18 ± 15.67 ms for the velocity-based approach. For gait phase estimation, the gait phase was expressed as a percentage of the gait cycle. The mean squared error relative to the ground truth was 0.95 ± 0.55% for the proposed algorithm using event detection and 1.50 ± 0.55% for regression. For the approaches of Zeni et al., the mean squared error was 2.04 ± 1.23% for the coordinate-based approach and 2.24 ± 1.34% for the velocity-based approach. Regarding mean absolute error to the ground truth, the proposed algorithm achieved 1.95 ± 1.10% using event detection and 7.25 ± 1.45% using regression. The mean absolute error was 4.08 ± 2.51% for the coordinate-based approach of Zeni et al. and 4.50 ± 2.73% for the velocity-based approach.
    (4) Conclusion: The newly introduced LSTM RNN algorithm offers a high recognition rate of gait events with a small delay. It outperforms several state-of-the-art gait event detection methods while offering the possibility of real-time processing and high generalization across trained gait patterns. Additionally, the proposed algorithm is easy to integrate into existing applications and contains parameters that self-adapt to an individual's gait behavior to further improve performance. With respect to gait phase estimation, the performance of the proposed algorithm using event detection is in line with current wearable state-of-the-art methods. Compared with conventional methods, the performance of direct regression of gait phases is only moderate. Given these results, LSTM RNNs demonstrate feasibility for event detection and are applicable to many clinical and research applications; they may not be suitable for the estimation of gait phases via regression. It can be assumed that, with a more optimal configuration of the networks, much higher performance could be achieved.
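
    The event-based phase estimate can be made concrete with a small sketch: once Initial Contact (IC) events are detected, the phase at any time point is interpolated linearly between consecutive ICs. This is a common convention and an assumption here; the thesis's exact phase definition may differ:

```python
import numpy as np

def gait_phase_percent(t, initial_contacts):
    """Gait phase (0-100%) at time t, interpolated linearly between
    consecutive Initial Contact events.

    initial_contacts: sorted array of IC timestamps in seconds.
    """
    ic = np.asarray(initial_contacts)
    # Index of the last IC at or before t
    i = np.searchsorted(ic, t, side="right") - 1
    if i < 0 or i >= len(ic) - 1:
        raise ValueError("t lies outside the annotated stride range")
    return 100.0 * (t - ic[i]) / (ic[i + 1] - ic[i])

# Example: ICs detected at 0.0 s, 1.2 s, 2.4 s; query at t = 0.6 s
print(gait_phase_percent(0.6, [0.0, 1.2, 2.4]))  # 50.0
```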