15 research outputs found

    The Lung Image Database Consortium (LIDC): ensuring the integrity of expert-defined "truth"

    RATIONALE AND OBJECTIVES: Computer-aided diagnostic (CAD) systems fundamentally require the opinions of expert human observers to establish “truth” for algorithm development, training, and testing. The integrity of this “truth,” however, must be established before investigators commit to this “gold standard” as the basis for their research. The purpose of this study was to develop a quality assurance (QA) model as an integral component of the “truth” collection process concerning the location and spatial extent of lung nodules observed on computed tomography (CT) scans to be included in the Lung Image Database Consortium (LIDC) public database. MATERIALS AND METHODS: One hundred CT scans were interpreted by four radiologists through a two-phase process. For the first of these reads (the “blinded read phase”), radiologists independently identified and annotated lesions, assigning each to one of three categories: “nodule ≥ 3mm,” “nodule < 3mm,” or “non-nodule ≥ 3mm.” For the second read (the “unblinded read phase”), the same radiologists independently evaluated the same CT scans but with all of the annotations from the previously performed blinded reads presented; each radiologist could add marks, edit or delete their own marks, change the lesion category of their own marks, or leave their marks unchanged. The post-unblinded-read set of marks was grouped into discrete nodules and subjected to the QA process, which consisted of (1) identification of potential errors introduced during the complete image annotation process (such as two marks on what appears to be a single lesion or an incomplete nodule contour) and (2) correction of those errors. Seven categories of potential error were defined; any nodule with a mark that satisfied the criterion for one of these categories was referred to the radiologist who assigned that mark for either correction or confirmation that the mark was intentional. 
    RESULTS: A total of 105 QA issues were identified across 45 (45.0%) of the 100 CT scans. Radiologist review resulted in modifications to 101 (96.2%) of these potential errors. Twenty-one lesions erroneously marked as lung nodules after the unblinded reads had this designation removed through the QA process. CONCLUSION: The establishment of “truth” must incorporate a QA process to guarantee the integrity of the datasets that will provide the basis for the development, training, and testing of CAD systems.

    Assessment of Radiologist Performance in the Detection of Lung Nodules

    RATIONALE AND OBJECTIVES: Studies that evaluate the lung-nodule-detection performance of radiologists or computerized methods depend on an initial inventory of the nodules within the thoracic images (the “truth”). The purpose of this study was to analyze (1) variability in the “truth” defined by different combinations of experienced thoracic radiologists and (2) variability in the performance of other experienced thoracic radiologists based on these definitions of “truth” in the context of lung nodule detection on computed tomography (CT) scans. MATERIALS AND METHODS: Twenty-five thoracic CT scans were reviewed by four thoracic radiologists, who independently marked lesions they considered to be nodules ≥ 3 mm in maximum diameter. Panel “truth” sets of nodules then were derived from the nodules marked by different combinations of two and three of these four radiologists. The nodule-detection performance of the other radiologists was evaluated based on these panel “truth” sets. RESULTS: The number of “true” nodules in the different panel “truth” sets ranged from 15–89 (mean: 49.8±25.6). The mean radiologist nodule-detection sensitivities across radiologists and panel “truth” sets for different panel “truth” conditions ranged from 51.0–83.2%; mean false-positive rates ranged from 0.33–1.39 per case. CONCLUSION: Substantial variability exists across radiologists in the task of lung nodule identification in CT scans. The definition of “truth” on which lung nodule detection studies are based must be carefully considered, since even experienced thoracic radiologists may not perform well when measured against the “truth” established by other experienced thoracic radiologists
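
    The panel "truth" procedure described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the lesion IDs and reader labels are invented toy data, and the `min_votes` rule for combining a panel's marks is a hypothetical stand-in, since the abstract does not specify how each panel "truth" condition was defined.

```python
from itertools import combinations

# Hypothetical toy data (not from the study): lesion IDs that each of four
# readers marked as "nodule >= 3 mm".
marks = {
    "A": {1, 2, 3, 5},
    "B": {1, 2, 4},
    "C": {1, 3, 4, 5},
    "D": {1, 2, 3},
}

def panel_truth(panel, min_votes):
    """Lesions marked by at least `min_votes` members of the panel.

    The voting rule is an assumption for illustration only.
    """
    counts = {}
    for reader in panel:
        for lesion in marks[reader]:
            counts[lesion] = counts.get(lesion, 0) + 1
    return {lesion for lesion, n in counts.items() if n >= min_votes}

def evaluate(reader, truth):
    """Return (sensitivity, false-positive count) for one reader."""
    found = marks[reader]
    sensitivity = len(found & truth) / len(truth) if truth else float("nan")
    return sensitivity, len(found - truth)

# Every panel of two or three readers defines a "truth" set; each reader
# outside the panel is then scored against it.
results = []
for size in (2, 3):
    for panel in combinations(sorted(marks), size):
        truth = panel_truth(panel, min_votes=1)  # union of the panel's marks
        for reader in sorted(set(marks) - set(panel)):
            results.append((panel, reader, *evaluate(reader, truth)))
```

    Raising `min_votes` shrinks the "truth" set, which in this toy setup changes both the sensitivity and the false-positive count of the same reader; that dependence on the panel definition is the effect the study quantifies.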

    The Lung Image Database Consortium (LIDC): A comparison of different size metrics for pulmonary nodule measurements

    RATIONALE AND OBJECTIVES: To investigate the effects of choosing between different metrics in estimating the size of pulmonary nodules as a factor both of nodule characterization and of performance of computer-aided detection systems, since the latter are always qualified with respect to a given size range of nodules. MATERIALS AND METHODS: This study used 265 whole-lung CT scans documented by the Lung Image Database Consortium using their protocol for nodule evaluation. Each inspected lesion was reviewed independently by four experienced radiologists who provided boundary markings for nodules larger than 3 mm. Four size metrics, based on the boundary markings, were considered: a uni-dimensional and two bi-dimensional measures on a single image slice and a volumetric measurement based on all the image slices. The radiologist boundaries were processed and those with four markings were analyzed to characterize the inter-radiologist variation, while those with at least one marking were used to examine the difference between the metrics. RESULTS: The processing of the annotations found 127 nodules marked by all four radiologists and an extended set of 518 nodules each having at least one observation, with three-dimensional sizes ranging from 2.03 to 29.4 mm (average 7.05 mm, median 5.71 mm). A very high inter-observer variation was observed for all these metrics: 95% of estimated standard deviations were in the ranges [0.49, 1.25], [0.67, 2.55], [0.78, 2.11], and [0.96, 2.69] for the three-dimensional, the uni-dimensional, and the two bi-dimensional size metrics, respectively (in mm). A very large difference among the metrics was also observed: 0.95 probability-coverage region widths for the volume estimation conditional on uni-dimensional and the two bi-dimensional size measurements of 10 mm were 7.32, 7.72, and 6.29 mm, respectively. CONCLUSIONS: The selection of data subsets for performance evaluation is highly impacted by the size metric choice.
    The LIDC plans to include a single size measure for each nodule in its database. This metric is not intended as a gold standard for nodule size; rather, it is intended to facilitate the selection of unique, repeatable, size-limited nodule subsets.

    The Lung Image Database Consortium (LIDC): An Evaluation of Radiologist Variability in the Identification of Lung Nodules on CT Scans

    RATIONALE AND OBJECTIVES: The purpose of this study was to analyze the variability of experienced thoracic radiologists in the identification of lung nodules on CT scans and thereby to investigate variability in the establishment of the “truth” against which nodule-based studies are measured. MATERIALS AND METHODS: Thirty CT scans were reviewed twice by four thoracic radiologists through a two-phase image annotation process. During the initial “blinded read” phase, radiologists independently marked lesions they identified as “nodule ≥ 3mm (diameter),” “nodule < 3mm,” or “non-nodule ≥ 3mm.” During the subsequent “unblinded read” phase, the blinded read results of all radiologists were revealed to each of the four radiologists, who then independently reviewed their marks along with the anonymous marks of their colleagues; a radiologist’s own marks then could be deleted, added, or left unchanged. This approach was developed to identify, as completely as possible, all nodules in a scan without requiring forced consensus. RESULTS: After the initial blinded read phase, a total of 71 lesions received “nodule ≥ 3mm” marks from at least one radiologist; however, all four radiologists assigned such marks to only 24 (33.8%) of these lesions. Following the unblinded reads, a total of 59 lesions were marked as “nodule ≥ 3 mm” by at least one radiologist. Of these lesions, 27 (45.8%) received such marks from all four radiologists, 3 (5.1%) were identified as such by three radiologists, 12 (20.3%) were identified by two radiologists, and 17 (28.8%) were identified by only a single radiologist. CONCLUSION: The two-phase image annotation process yields improved agreement among radiologists in the interpretation of nodules ≥ 3mm. Nevertheless, substantial variability remains across radiologists in the task of lung nodule identification.
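
    The agreement percentages quoted in this abstract follow directly from the reported counts. A minimal check, using only the numbers given above (the variable and function names are illustrative):

```python
# Counts reported in the abstract.
blinded_total, blinded_all_four = 71, 24
# Unblinded read: number of radiologists marking a lesion -> lesion count.
unblinded = {4: 27, 3: 3, 2: 12, 1: 17}

def pct(part, whole):
    """Percentage rounded to one decimal place, as reported in the abstract."""
    return round(100 * part / whole, 1)

unblinded_total = sum(unblinded.values())  # 59 lesions
shares = {k: pct(n, unblinded_total) for k, n in unblinded.items()}
# shares == {4: 45.8, 3: 5.1, 2: 20.3, 1: 28.8}

# Full four-reader agreement before and after the unblinded read.
before = pct(blinded_all_four, blinded_total)  # 33.8
after = shares[4]                              # 45.8
```

    The rise in unanimous agreement from 33.8% to 45.8% is the improvement the conclusion refers to, while the 28.8% of lesions marked by a single radiologist reflects the variability that remains.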

    Evaluation of lung MDCT nodule annotation across radiologists and methods

    RATIONALE AND OBJECTIVES: Integral to the mission of the National Institutes of Health–sponsored Lung Imaging Database Consortium is the accurate definition of the spatial location of pulmonary nodules. Because the majority of small lung nodules are not resected, a reference standard from histopathology is generally unavailable. Thus assessing the source of variability in defining the spatial location of lung nodules by expert radiologists using different software tools as an alternative form of truth is necessary. MATERIALS AND METHODS: The relative differences in performance of six radiologists each applying three annotation methods to the task of defining the spatial extent of 23 different lung nodules were evaluated. The variability of radiologists’ spatial definitions for a nodule was measured using both volumes and probability maps (p-map). Results were analyzed using a linear mixed-effects model that included nested random effects. RESULTS: Across the combination of all nodules, volume and p-map model parameters were found to be significant at P < .05 for all methods, all radiologists, and all second-order interactions except one. The radiologist and methods variables accounted for 15% and 3.5% of the total p-map variance, respectively, and 40.4% and 31.1% of the total volume variance, respectively. CONCLUSION: Radiologists represent the major source of variance as compared with drawing tools independent of drawing metric used. Although the random noise component is larger for the p-map analysis than for volume estimation, the p-map analysis appears to have more power to detect differences in radiologist-method combinations. The standard deviation of the volume measurement task appears to be proportional to nodule volume

    The role of hydrothermal fluids in sedimentation in saline alkaline lakes: evidence from Nasikie Engida, Kenya Rift Valley

    Saline alkaline lakes that precipitate sodium carbonate evaporites are most common in volcanic terrains in semi‐arid environments. Processes that lead to trona precipitation are poorly understood compared to those in sulphate‐dominated and chloride‐dominated lake brines. Nasikie Engida (Little Magadi) in the southern Kenya Rift shows the initial stages of soda evaporite formation. This small shallow (<2 m deep; 7 km long) lake is recharged by alkaline hot springs and seasonal runoff but unlike neighbouring Lake Magadi is perennial. This study aims to understand modern sedimentary and geochemical processes in Nasikie Engida and to assess the importance of geothermal fluids in evaporite formation. Perennial hot‐spring inflow waters along the northern shoreline evaporate and become saturated with respect to nahcolite and trona, which precipitate in the southern part of the lake, up to 6 km from the hot springs. Nahcolite (NaHCO3) forms bladed crystals that nucleate on the lake floor. Trona (Na2CO3·NaHCO3·2H2O) precipitates from more concentrated brines as rafts and as bottom‐nucleated shrubs of acicular crystals that coalesce laterally to form bedded trona. Many processes modify the fluid composition as it evolves. Silica is removed as gels and by early diagenetic reactions and diatoms. Sulphate is depleted by bacterial reduction. Potassium and chloride, of moderate concentration, remain conservative in the brine. Clastic sedimentation is relatively minor because of the predominant hydrothermal inflow. Nahcolite precipitates when and where pCO2 is high, notably near sublacustrine spring discharge. Results from Nasikie Engida show that hot spring discharge has maintained the lake for at least 2 kyr, and that the evaporite formation is strongly influenced by local discharge of carbon dioxide. 
    Brine evolution and evaporite deposition at Nasikie Engida help to explain conditions under which ancient sodium carbonate evaporites formed, including those in other East African rift basins, the Eocene Green River Formation (western USA), and elsewhere.