5 research outputs found

    Validation, comparison, and combination of algorithms for automatic detection of pulmonary nodules in computed tomography images: The LUNA16 challenge

    Automatic detection of pulmonary nodules in thoracic computed tomography (CT) scans has been an active area of research for the last two decades. However, only a few studies have provided a comparative performance evaluation of different systems on a common database. We have therefore set up the LUNA16 challenge, an objective evaluation framework for automatic nodule detection algorithms using the largest publicly available reference database of chest CT scans, the LIDC-IDRI data set. In LUNA16, participants develop their algorithm and upload their predictions on 888 CT scans in one of two tracks: 1) the complete nodule detection track, where a complete CAD system should be developed, or 2) the false positive reduction track, where a provided set of nodule candidates should be classified. This paper describes the setup of LUNA16 and presents the results of the challenge so far. The impact of combining individual systems on detection performance was also investigated. The leading solutions employed convolutional networks and used the provided set of nodule candidates. The combination of these solutions achieved a sensitivity of over 95% at fewer than 1.0 false positives per scan, which highlights the potential of combining algorithms to improve detection performance. Our observer study with four expert readers has shown that the best system detects nodules that were missed by the expert readers who originally annotated the LIDC-IDRI data. We released this set of additional nodules for further development of CAD systems.
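
    The abstract quotes sensitivity at a given number of false positives per scan. A minimal sketch of how such an operating point might be computed from scored candidates is given below. It is not the challenge's evaluation script: the function names, the (score, is_nodule) input layout, and the simplification that every true candidate hits a distinct nodule are assumptions made for illustration. The default rates are the seven operating points commonly cited for LUNA16 and should also be treated as an assumption here.

```python
from typing import Iterable, List, Sequence, Tuple

# Hypothetical input layout: one (score, is_nodule) pair per candidate,
# pooled over all scans. Assumes each true candidate is a distinct nodule.
Candidate = Tuple[float, bool]


def sensitivity_at_fp_rate(
    candidates: Iterable[Candidate],
    n_nodules: int,
    n_scans: int,
    fp_per_scan: float,
) -> float:
    """Sensitivity at the loosest threshold that keeps FPs/scan <= fp_per_scan."""
    max_fp = fp_per_scan * n_scans  # total false positives allowed
    ranked: List[Candidate] = sorted(candidates, key=lambda c: c[0], reverse=True)
    tp = 0
    fp = 0
    for _score, is_nodule in ranked:
        if is_nodule:
            tp += 1
        else:
            fp += 1
            if fp > max_fp:
                break  # this false positive exceeds the budget; stop here
    return tp / n_nodules


def average_sensitivity(
    candidates: Sequence[Candidate],
    n_nodules: int,
    n_scans: int,
    rates: Sequence[float] = (0.125, 0.25, 0.5, 1.0, 2.0, 4.0, 8.0),
) -> float:
    """Mean sensitivity over several FP/scan operating points (assumed rates)."""
    values = [
        sensitivity_at_fp_rate(candidates, n_nodules, n_scans, r) for r in rates
    ]
    return sum(values) / len(values)
```

    Tied scores are processed in sorted order here, so a production evaluation would need an explicit tie-breaking rule and a proper candidate-to-nodule matching step.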

    Evaluation of data balancing techniques. Application to CAD of lung nodules using the LUNA16 framework

    Due to the high incidence of lung cancer, computer-aided detection (CAD) systems may play an increasingly important role in screening. Classification in CAD systems has to deal with highly imbalanced datasets composed of actual nodules and non-nodule structures. Applying data balancing techniques helps the training process of the classifiers, making the generation of the classification rules more effective. The purpose of this paper is to compare the performance of different data balancing techniques applied to the classification of lung nodules. According to the reviewed literature, this is the first time that different data balancing methods have been evaluated on the problem of lung nodule detection using a large data set and at low false positive rates. A web-based framework was used to evaluate the different methods applied to a classical CAD system (ETROCAD) presented in the LUNA16 Challenge by calculating a score of average sensitivity at different values of false positives per scan. In our experiments, data balancing using SMOTE and SMOTE-TL led to the best results, with scores of 0.760 and 0.759 respectively, in comparison to 0.748 when not balancing the data. Although the impact on the overall score may seem marginal, adequate data balancing resulted in the correct classification of 36 additional candidate nodules at 4 FP/scan. At the time of writing this paper, the SMOTE-based ETROCAD system had the best score among all the classical systems using handcrafted features on the LUNA16 website.
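
    For readers unfamiliar with SMOTE, the core idea is to create synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below is a simplified, standard-library illustration of that idea, not the paper's implementation: the function name, the random choice of base samples, and the parameters `k` and `seed` are assumptions. The Tomek-link cleaning step of SMOTE-TL is not shown.

```python
import random
from typing import List, Sequence


def smote_sketch(
    minority: Sequence[Sequence[float]],
    n_new: int,
    k: int = 5,
    seed: int = 0,
) -> List[List[float]]:
    """Generate n_new synthetic minority samples by neighbour interpolation."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)

    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        # Squared Euclidean distance between two feature vectors
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic: List[List[float]] = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base sample, excluding itself
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: sq_dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic
```

    In a CAD setting the minority class would be the true-nodule candidates, and the synthetic samples would be added to the training set only, never to the evaluation data.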