
    Validation of automated artificial intelligence segmentation of optical coherence tomography images

    PURPOSE To benchmark human and machine performance of spectral-domain (SD) and swept-source (SS) optical coherence tomography (OCT) image segmentation, i.e., pixel-wise classification, for the compartments vitreous, retina, choroid, and sclera. METHODS A convolutional neural network (CNN) was trained on OCT B-scan images annotated by a senior retina specialist serving as ground truth expert to segment the posterior eye compartments. Independent benchmark data sets (30 SDOCT and 30 SSOCT) were manually segmented by three classes of graders with varying levels of ophthalmic proficiency. Nine graders contributed to benchmarking an additional 60 images in three consecutive runs. Inter-human and intra-human class agreement was measured and compared to the CNN results. RESULTS The CNN training data consisted of a total of 6210 manually segmented images derived from 2070 B-scans (1046 SDOCT and 1024 SSOCT; 630 C-scans). The CNN segmentation showed high agreement with all grader groups. For all compartments and groups, the mean Intersection over Union (IOU) score of the CNN's compartmentalization versus the group graders' compartmentalization was higher than the mean score for the intra-grader group comparison. CONCLUSION The proposed deep learning segmentation algorithm (CNN) for automated eye compartment segmentation in OCT B-scans (SDOCT and SSOCT) is on par with manual segmentation by human graders.
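    The mean Intersection over Union reported above compares two pixel-wise label maps compartment by compartment. As a minimal illustration of that metric (not the study's own code), the sketch below computes per-compartment IoU between a CNN prediction and a grader's annotation, assuming a hypothetical label encoding (0 = vitreous, 1 = retina, 2 = choroid, 3 = sclera) and an arbitrary B-scan resolution.

```python
import numpy as np

def compartment_iou(pred: np.ndarray, truth: np.ndarray, n_classes: int = 4) -> dict:
    """Per-compartment Intersection over Union between two pixel-wise label maps."""
    scores = {}
    for c in range(n_classes):
        pred_c, truth_c = (pred == c), (truth == c)
        union = np.logical_or(pred_c, truth_c).sum()
        # If a compartment is absent from both maps, count it as perfect agreement.
        scores[c] = float(np.logical_and(pred_c, truth_c).sum() / union) if union else 1.0
    return scores

# Hypothetical example: compare a CNN prediction with one grader's annotation.
rng = np.random.default_rng(0)
cnn_mask = rng.integers(0, 4, size=(496, 512))      # assumed B-scan resolution
grader_mask = rng.integers(0, 4, size=(496, 512))
print(compartment_iou(cnn_mask, grader_mask))
```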

    Unraveling the deep learning gearbox in optical coherence tomography image segmentation towards explainable artificial intelligence

    Machine learning has greatly facilitated the analysis of medical data, while its internal operations usually remain opaque. To better comprehend these procedures, a convolutional neural network for optical coherence tomography image segmentation was enhanced with a Traceable Relevance Explainability (T-REX) technique. The proposed application was based on three components: ground truth generation by multiple graders, calculation of Hamming distances among graders and the machine learning algorithm, and a smart data visualization ('neural recording'). An overall average variability of 1.75% between the human graders and the algorithm was found, slightly lower than the 2.02% among the human graders. The ambiguity in the ground truth had a noteworthy impact on the machine learning results, which could be visualized. The convolutional neural network balanced between graders and allowed for modifiable predictions dependent on the compartment. Using the proposed T-REX setup, machine learning processes could be rendered more transparent and understandable, possibly leading to optimized applications.
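    The pairwise Hamming-distance comparison described above can be pictured as counting, for every pair of label maps, the fraction of pixels labelled differently. The following sketch is an assumption about that workflow rather than the published T-REX code; grader names, the four-compartment encoding, and the image size are placeholders.

```python
import numpy as np
from itertools import combinations

def disagreement(a: np.ndarray, b: np.ndarray) -> float:
    """Normalised Hamming distance: fraction of pixels labelled differently."""
    return float(np.mean(a != b))

# Hypothetical label maps (four compartments) from three graders and the CNN.
rng = np.random.default_rng(1)
maps = {name: rng.integers(0, 4, size=(496, 512))
        for name in ("grader_1", "grader_2", "grader_3", "cnn")}

# Average variability among human graders vs. between each grader and the CNN.
inter_grader = [disagreement(maps[a], maps[b])
                for a, b in combinations(("grader_1", "grader_2", "grader_3"), 2)]
vs_cnn = [disagreement(maps[g], maps["cnn"]) for g in ("grader_1", "grader_2", "grader_3")]
print(f"mean inter-grader variability: {np.mean(inter_grader):.2%}")
print(f"mean grader-vs-CNN variability: {np.mean(vs_cnn):.2%}")
```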

    Human selection bias drives the linear nature of the more ground truth effect in explainable deep learning optical coherence tomography image segmentation

    Supervised deep learning (DL) algorithms are highly dependent on training data for which human graders are assigned, for example, for optical coherence tomography (OCT) image annotation. Despite the tremendous success of DL, these ground truth labels can be inaccurate and/or ambiguous due to human judgment and thus introduce a human selection bias. We therefore investigated the impact of the size of the ground truth and variable numbers of graders on the predictive performance of the same DL architecture, repeating each experiment three times. The largest training dataset delivered a prediction performance close to that of human experts. All DL systems utilized were highly consistent. Nevertheless, the DL under-performers could not achieve any further autonomous improvement even after repeated training. Furthermore, a quantifiable linear relationship between ground truth ambiguity and the beneficial effect of having a larger amount of ground truth data was detected and termed the more-ground-truth effect.
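    The more-ground-truth effect is described as a quantifiable linear relationship between ground truth ambiguity and the benefit of enlarging the ground truth. A straight-line fit is one simple way to quantify such a relationship; the sketch below uses placeholder numbers (not data from the paper) purely to show the calculation.

```python
import numpy as np

# Placeholder values, for illustration only: label ambiguity (e.g. inter-grader
# disagreement) and the performance gain observed when the ground truth is enlarged.
ambiguity = np.array([0.01, 0.02, 0.03, 0.05, 0.08])
gain = np.array([0.004, 0.009, 0.014, 0.024, 0.038])

# Least-squares line and correlation coefficient quantifying the linear trend.
slope, intercept = np.polyfit(ambiguity, gain, deg=1)
r = np.corrcoef(ambiguity, gain)[0, 1]
print(f"gain ~= {slope:.2f} * ambiguity + {intercept:.4f} (Pearson r = {r:.3f})")
```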

    Defining response to anti-VEGF therapies in neovascular AMD

    The introduction of anti-vascular endothelial growth factor (anti-VEGF) therapy has made a significant impact on reducing visual loss due to neovascular age-related macular degeneration (n-AMD). There are significant inter-individual differences in response to an anti-VEGF agent, made more complex by the availability of multiple anti-VEGF agents with different molecular configurations. The response to anti-VEGF therapy has been found to depend on a variety of factors, including the patient's age, lesion characteristics, lesion duration, baseline visual acuity (VA) and the presence of particular genotype risk alleles. Furthermore, a proportion of eyes with n-AMD show a decline in acuity or morphology despite therapy, or require very frequent re-treatment. There is currently no consensus on how to classify optimal response, or lack of it, with these therapies. There is, in particular, confusion over terms such as 'responder status' after treatment for n-AMD, 'tachyphylaxis' and 'recalcitrant' n-AMD. This document aims to provide a consensus on the definition and categorisation of the response of n-AMD to anti-VEGF therapies and on the time points at which response to treatment should be determined.

    Primary response is best determined at 1 month after the last initiation dose, while maintained treatment (secondary) response is determined at any time after the fourth visit. In a particular eye, secondary responses do not mirror, and cannot be predicted from, those in the primary phase. Morphological and functional responses to anti-VEGF treatments do not necessarily correlate and may be dissociated in an individual eye. Furthermore, there is a ceiling effect that can negate currently used functional metrics, such as a >5-letter improvement, when the baseline VA is good (ETDRS >70 letters). It is therefore important to use a combination of both parameters in determining the response. The following definitions are proposed. Optimal (good) response is defined as resolution of fluid (intraretinal fluid, IRF; subretinal fluid, SRF; and retinal thickening) and/or an improvement of >5 letters, subject to the ceiling effect of good starting VA. Poor response is defined as <25% reduction from baseline in central retinal thickness (CRT), with persistent or new IRF or SRF, or minimal change in VA (that is, a change in VA of 0 to 4 letters). Non-response is defined as an increase in fluid (IRF, SRF and CRT) or increasing haemorrhage compared with baseline, and/or a loss of >5 letters compared with baseline or with the best corrected vision achieved subsequently. Poor or non-response to anti-VEGF may be due to clinical factors, including suboptimal dosing relative to that required by a particular patient, increased dosing intervals, treatment initiation when the disease is already at an advanced or chronic stage, cellular mechanisms, lesion type, genetic variation and potential tachyphylaxis; non-clinical factors, including poor access to clinics or delayed appointments, may also result in poor treatment outcomes.

    In eyes classified as good responders, treatment should be continued with the same agent when disease activity is present or reactivation occurs following temporary dose holding. In eyes that show a partial response, treatment may be continued, although re-evaluation with further imaging may be required to exclude confounding factors. Where there is persistent, unchanging accumulated fluid following three consecutive injections at monthly intervals, treatment may be withheld temporarily but recommenced with the same or an alternative anti-VEGF agent if the fluid subsequently increases (the lesion being considered active). Poor or non-response to anti-VEGF treatments requires re-evaluation of the diagnosis and, if necessary, a switch to alternative therapies, including other anti-VEGF agents and/or photodynamic therapy (PDT). Idiopathic polypoidal choroidopathy may require treatment with PDT monotherapy or in combination with anti-VEGF. A committee comprising retinal specialists with experience of managing patients with n-AMD, similar to that which developed the Royal College of Ophthalmologists Guidelines to Ranibizumab, was assembled. Individual aspects of the guidelines were proposed by the committee lead (WMA), based on relevant published evidence following a search of Medline, and circulated to all committee members for discussion before approval or modification. Each draft was modified according to feedback from committee members until unanimous approval was obtained for the final draft. A system for categorising the range of responsiveness of n-AMD lesions to anti-VEGF therapy is proposed. The proposal is based primarily on morphological criteria, but functional criteria have been included. Recommendations are made on when to consider discontinuation of therapy, either because of success or futility. These guidelines should help clinical decision-making and may prevent over- and/or undertreatment with anti-VEGF therapy.
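    To make the proposed categories concrete, the sketch below encodes the thresholds from the abstract (resolution of fluid and/or a >5-letter gain for optimal response; <25% CRT reduction with persistent fluid or minimal VA change for poor response; increasing fluid or a >5-letter loss for non-response) as a simple decision rule. The field names and the fallback "partial response" branch are assumptions for illustration, and the ceiling effect for good starting VA is not modelled; this is not part of the consensus document itself.

```python
from dataclasses import dataclass

@dataclass
class Assessment:
    crt_change_pct: float    # % change in central retinal thickness vs baseline (negative = reduction)
    fluid_resolved: bool     # IRF, SRF and retinal thickening resolved
    fluid_increased: bool    # new or increasing IRF/SRF/CRT or haemorrhage vs baseline
    va_change_letters: int   # ETDRS letters gained vs baseline (negative = lost)

def classify_response(a: Assessment) -> str:
    # Non-response: increasing fluid/haemorrhage and/or loss of >5 letters.
    if a.fluid_increased or a.va_change_letters < -5:
        return "non-response"
    # Optimal (good) response: resolution of fluid and/or a gain of >5 letters
    # (the ceiling effect of good starting VA is not modelled here).
    if a.fluid_resolved or a.va_change_letters > 5:
        return "optimal (good) response"
    # Poor response: <25% CRT reduction with persistent/new fluid or minimal VA change.
    if a.crt_change_pct > -25 and 0 <= a.va_change_letters <= 4:
        return "poor response"
    return "partial response (re-evaluate with further imaging)"

print(classify_response(Assessment(-40.0, True, False, 7)))   # -> optimal (good) response
```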