1,109 research outputs found

    Automatic Pulmonary Nodule Detection in CT Scans Using Convolutional Neural Networks Based on Maximum Intensity Projection

    Get PDF
    Accurate pulmonary nodule detection is a crucial step in lung cancer screening. Computer-aided detection (CAD) systems are not routinely used by radiologists for pulmonary nodule detection in clinical practice despite their potential benefits. Maximum intensity projection (MIP) images improve the detection of pulmonary nodules in radiological evaluation with computed tomography (CT) scans. Inspired by the clinical methodology of radiologists, we aim to explore the feasibility of applying MIP images to improve the effectiveness of automatic lung nodule detection using convolutional neural networks (CNNs). We propose a CNN-based approach that takes MIP images of different slab thicknesses (5 mm, 10 mm, 15 mm) and 1 mm axial section slices as input. Such an approach augments the two-dimensional (2-D) CT slice images with more representative spatial information that helps discriminate nodules from vessels through their morphologies. Our proposed method achieves sensitivity of 92.67% with 1 false positive per scan and sensitivity of 94.19% with 2 false positives per scan for lung nodule detection on 888 scans in the LIDC-IDRI dataset. The use of thick MIP images helps the detection of small pulmonary nodules (3 mm-10 mm) and results in fewer false positives. Experimental results show that utilizing MIP images can increase the sensitivity and lower the number of false positives, which demonstrates the effectiveness and significance of the proposed MIP-based CNNs framework for automatic pulmonary nodule detection in CT scans. The proposed method also shows the potential that CNNs could gain benefits for nodule detection by combining the clinical procedure.Comment: Submitted to IEEE TM

    Contribution to Graph-based Manifold Learning with Application to Image Categorization.

    Get PDF
    122 pLos algoritmos de aprendizaje de variedades basados en grafos (Graph,based manifold) son técnicas que han demostrado ser potentes herramientas para la extracción de características y la reducción de la dimensionalidad en los campos de reconomiento de patrones, visión por computador y aprendizaje automático. Estos algoritmos utilizan información basada en las similitudes de pares de muestras y del grafo ponderado resultante para revelar la estructura geométrica intrínseca de la variedad

    Contribution to Graph-based Manifold Learning with Application to Image Categorization.

    Get PDF
    122 pLos algoritmos de aprendizaje de variedades basados en grafos (Graph,based manifold) son técnicas que han demostrado ser potentes herramientas para la extracción de características y la reducción de la dimensionalidad en los campos de reconomiento de patrones, visión por computador y aprendizaje automático. Estos algoritmos utilizan información basada en las similitudes de pares de muestras y del grafo ponderado resultante para revelar la estructura geométrica intrínseca de la variedad

    What-and-Where to Match: Deep Spatially Multiplicative Integration Networks for Person Re-identification

    Full text link
    Matching pedestrians across disjoint camera views, known as person re-identification (re-id), is a challenging problem that is of importance to visual recognition and surveillance. Most existing methods exploit local regions within spatial manipulation to perform matching in local correspondence. However, they essentially extract \emph{fixed} representations from pre-divided regions for each image and perform matching based on the extracted representation subsequently. For models in this pipeline, local finer patterns that are crucial to distinguish positive pairs from negative ones cannot be captured, and thus making them underperformed. In this paper, we propose a novel deep multiplicative integration gating function, which answers the question of \emph{what-and-where to match} for effective person re-id. To address \emph{what} to match, our deep network emphasizes common local patterns by learning joint representations in a multiplicative way. The network comprises two Convolutional Neural Networks (CNNs) to extract convolutional activations, and generates relevant descriptors for pedestrian matching. This thus, leads to flexible representations for pair-wise images. To address \emph{where} to match, we combat the spatial misalignment by performing spatially recurrent pooling via a four-directional recurrent neural network to impose spatial dependency over all positions with respect to the entire image. The proposed network is designed to be end-to-end trainable to characterize local pairwise feature interactions in a spatially aligned manner. To demonstrate the superiority of our method, extensive experiments are conducted over three benchmark data sets: VIPeR, CUHK03 and Market-1501.Comment: Published at Pattern Recognition, Elsevie

    Object Detection using Dimensionality Reduction on Image Descriptors

    Get PDF
    The aim of object detection is to recognize objects in a visual scene. Performing reliable object detection is becoming increasingly important in the fields of computer vision and robotics. Various applications of object detection include video surveillance, traffic monitoring, digital libraries, navigation, human computer interaction, etc. The challenges involved with detecting real world objects include the multitude of colors, textures, sizes, and cluttered or complex backgrounds making objects difficult to detect. This thesis contributes to the exploration of various dimensionality reduction techniques on descriptors for establishing an object detection system that achieves the best trade-offs between performance and speed. Histogram of Oriented Gradients (HOG) and other histogram-based descriptors were used as an input to a Support Vector Machine (SVM) classifier to achieve good classification performance. Binary descriptors were considered as a computationally efficient alternative to HOG. It was determined that single local binary descriptors in combination with Support Vector Machine (SVM) classifier don\u27t work as well as histograms of features for object detection. Thus, histogram of binary descriptors features were explored as a viable alternative and the results were found to be comparable to those of the popular Histogram of Oriented Gradients descriptor. Histogram-based descriptors can be high dimensional and working with large amounts of data can be computationally expensive and slow. Thus, various dimensionality reduction techniques were considered, such as principal component analysis (PCA), which is the most widely used technique, random projections, which is data independent and fast to compute, unsupervised locality preserving projections (LPP), and supervised locality preserving projections (SLPP), which incorporate non-linear reduction techniques. The classification system was tested on eye detection as well as different object classes. The eye database was created using BioID and FERET databases. Additionally, the CalTech-101 data set, which has 101 object categories, was used to evaluate the system. The results showed that the reduced-dimensionality descriptors based on SLPP gave improved classification performance with fewer computations
    corecore