Automatic Pulmonary Nodule Detection in CT Scans Using Convolutional Neural Networks Based on Maximum Intensity Projection
Accurate pulmonary nodule detection is a crucial step in lung cancer
screening. Computer-aided detection (CAD) systems are not routinely used by
radiologists for pulmonary nodule detection in clinical practice despite their
potential benefits. Maximum intensity projection (MIP) images improve the
detection of pulmonary nodules in radiological evaluation with computed
tomography (CT) scans. Inspired by the clinical methodology of radiologists, we
aim to explore the feasibility of applying MIP images to improve the
effectiveness of automatic lung nodule detection using convolutional neural
networks (CNNs). We propose a CNN-based approach that takes MIP images of
different slab thicknesses (5 mm, 10 mm, 15 mm) and 1 mm axial section slices
as input. Such an approach augments the two-dimensional (2-D) CT slice images
with more representative spatial information that helps discriminate nodules
from vessels through their morphologies. Our proposed method achieves
sensitivity of 92.67% with 1 false positive per scan and sensitivity of 94.19%
with 2 false positives per scan for lung nodule detection on 888 scans in the
LIDC-IDRI dataset. The use of thick MIP images helps the detection of small
pulmonary nodules (3 mm-10 mm) and results in fewer false positives.
Experimental results show that utilizing MIP images can increase the
sensitivity and lower the number of false positives, which demonstrates the
effectiveness and significance of the proposed MIP-based CNN framework for
automatic pulmonary nodule detection in CT scans. The proposed method also
shows that CNNs could benefit in nodule detection from incorporating clinical
procedures. Comment: Submitted to IEEE TM
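The slab-based MIP input described in this abstract can be illustrated with a minimal sketch. It assumes 1 mm slice spacing, so that a slab of n slices corresponds to roughly n mm, and a volume stored as nested Python lists; the function names `slab_mip` and `mip_input_stack` are hypothetical and are not taken from the paper.

```python
def slab_mip(volume, center, slab_slices):
    """Maximum intensity projection along the axial axis over a slab.

    volume      -- list of 2-D slices (each a list of rows), in axial order
    center      -- index of the slice the slab is centred on
    slab_slices -- number of slices in the slab (assumed 1 mm spacing)
    """
    # Clamp the slab to the volume so slabs near the ends are simply thinner.
    start = center - slab_slices // 2
    lo = max(0, start)
    hi = min(len(volume), start + slab_slices)
    slab = volume[lo:hi]
    rows, cols = len(slab[0]), len(slab[0][0])
    # Keep the brightest voxel along the axial direction at each (row, col).
    return [[max(s[r][c] for s in slab) for c in range(cols)]
            for r in range(rows)]


def mip_input_stack(volume, center, slab_sizes=(1, 5, 10, 15)):
    """Stack the plain axial slice (slab of 1) with MIPs of thicker slabs."""
    return [slab_mip(volume, center, n) for n in slab_sizes]


if __name__ == "__main__":
    # Toy 4-slice volume of 2x2 images; values are illustrative only.
    toy = [[[k, -k], [0, 2 * k]] for k in range(4)]
    for image in mip_input_stack(toy, center=1):
        print(image)
```

How the projections are fed to the network (for example as separate channels or separate streams) is not stated in the abstract and is left open here.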
Contribution to Graph-based Manifold Learning with Application to Image Categorization.
122 p. Graph-based manifold learning algorithms are techniques that have proven to be powerful tools for feature extraction and dimensionality reduction in the fields of pattern recognition, computer vision, and machine learning. These algorithms use information based on pairwise sample similarities and the resulting weighted graph to reveal the intrinsic geometric structure of the manifold.
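As a rough illustration of how a weighted graph can be built from pairwise sample similarities, the sketch below constructs a symmetric k-nearest-neighbour graph with Gaussian (heat-kernel) weights. This is one common construction and is assumed here for illustration; the thesis may use a different one. The function name `heat_kernel_graph` and the parameters `k` and `sigma` are hypothetical.

```python
import math


def heat_kernel_graph(samples, k=2, sigma=1.0):
    """Return a symmetric weight matrix W for a k-nearest-neighbour graph.

    An edge (i, j) exists if j is among the k nearest neighbours of i or
    vice versa; its weight is exp(-||x_i - x_j||^2 / (2 * sigma^2)).
    """
    n = len(samples)

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Indices of the k samples closest to sample i (excluding i itself).
        neighbours = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: sqdist(samples[i], samples[j]),
        )[:k]
        for j in neighbours:
            w = math.exp(-sqdist(samples[i], samples[j]) / (2 * sigma ** 2))
            weights[i][j] = weights[j][i] = w  # symmetrise the graph
    return weights


if __name__ == "__main__":
    points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
    for row in heat_kernel_graph(points, k=1):
        print(row)
```

Graph-based methods typically derive a low-dimensional embedding from such a matrix, for instance through its graph Laplacian.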
What-and-Where to Match: Deep Spatially Multiplicative Integration Networks for Person Re-identification
Matching pedestrians across disjoint camera views, known as person
re-identification (re-id), is a challenging problem that is of importance to
visual recognition and surveillance. Most existing methods exploit local
regions within spatial manipulation to perform matching in local
correspondence. However, they essentially extract fixed representations
from pre-divided regions for each image and perform matching based on the
extracted representation subsequently. For models in this pipeline, finer
local patterns that are crucial for distinguishing positive pairs from negative
ones cannot be captured, which degrades their performance. In this paper, we
propose a novel deep multiplicative integration gating function, which answers
the question of "what-and-where to match" for effective person re-id. To
address what to match, our deep network emphasizes common local patterns
by learning joint representations in a multiplicative way. The network
comprises two Convolutional Neural Networks (CNNs) to extract convolutional
activations and generates relevant descriptors for pedestrian matching. This,
in turn, leads to flexible representations for pairwise images. To address
where to match, we combat spatial misalignment by performing
spatially recurrent pooling via a four-directional recurrent neural network to
impose spatial dependency over all positions with respect to the entire image.
The proposed network is designed to be end-to-end trainable to characterize
local pairwise feature interactions in a spatially aligned manner. To
demonstrate the superiority of our method, extensive experiments are conducted
over three benchmark data sets: VIPeR, CUHK03 and Market-1501. Comment: Published at Pattern Recognition, Elsevie
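To make the idea of combining two feature maps "in a multiplicative way" concrete, the sketch below takes the element-wise product of two activation maps of identical shape. It illustrates multiplicative fusion only: the abstract does not specify the paper's gating function, so no learned weights or recurrent pooling are modelled, and the name `multiplicative_fusion` is hypothetical.

```python
def multiplicative_fusion(feat_a, feat_b):
    """Element-wise product of two feature maps shaped [height][width][channels].

    A position/channel is large in the output only when it is active in both
    inputs, which emphasises patterns common to the two images.
    """
    def shape(fm):
        return (len(fm), len(fm[0]), len(fm[0][0]))

    if shape(feat_a) != shape(feat_b):
        raise ValueError("feature maps must have the same shape")
    return [
        [[a * b for a, b in zip(pix_a, pix_b)]
         for pix_a, pix_b in zip(row_a, row_b)]
        for row_a, row_b in zip(feat_a, feat_b)
    ]


if __name__ == "__main__":
    # Toy 1x2x2 activations standing in for the outputs of two CNN branches.
    a = [[[1.0, 0.0], [2.0, 3.0]]]
    b = [[[4.0, 5.0], [0.5, 0.0]]]
    print(multiplicative_fusion(a, b))
```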
Object Detection using Dimensionality Reduction on Image Descriptors
The aim of object detection is to recognize objects in a visual scene. Performing reliable object detection is becoming increasingly important in the fields of computer vision and robotics. Various applications of object detection include video surveillance, traffic monitoring, digital libraries, navigation, human computer interaction, etc. The challenges involved with detecting real world objects include the multitude of colors, textures, sizes, and cluttered or complex backgrounds making objects difficult to detect.
This thesis contributes to the exploration of various dimensionality reduction techniques on descriptors for establishing an object detection system that achieves the best trade-offs between performance and speed. Histogram of Oriented Gradients (HOG) and other histogram-based descriptors were used as input to a Support Vector Machine (SVM) classifier to achieve good classification performance. Binary descriptors were considered as a computationally efficient alternative to HOG. It was determined that single local binary descriptors in combination with an SVM classifier do not work as well as histograms of features for object detection. Thus, histograms of binary descriptor features were explored as a viable alternative, and the results were found to be comparable to those of the popular Histogram of Oriented Gradients descriptor.
Histogram-based descriptors can be high dimensional, and working with large amounts of data can be computationally expensive and slow. Thus, various dimensionality reduction techniques were considered: principal component analysis (PCA), which is the most widely used technique; random projections, which are data independent and fast to compute; unsupervised locality preserving projections (LPP); and supervised locality preserving projections (SLPP), which incorporate non-linear reduction techniques.
The classification system was tested on eye detection as well as different object classes. The eye database was created using BioID and FERET databases. Additionally, the CalTech-101 data set, which has 101 object categories, was used to evaluate the system. The results showed that the reduced-dimensionality descriptors based on SLPP gave improved classification performance with fewer computations
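The data-independent random projection mentioned above can be sketched in a few lines. The example assumes a dense Gaussian projection matrix scaled by 1/sqrt(d_out), which is one standard choice; the thesis does not state which variant was used, and the function names and the 36-dimensional toy descriptor are hypothetical.

```python
import math
import random


def random_projection_matrix(d_in, d_out, seed=0):
    """Draw a d_out x d_in Gaussian matrix; no training data is needed."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(d_out)  # keeps squared norms unchanged in expectation
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d_in)]
            for _ in range(d_out)]


def project(vector, matrix):
    """Map a d_in-dimensional descriptor to d_out dimensions."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


if __name__ == "__main__":
    # Toy stand-in for a histogram-based descriptor.
    descriptor = [float(i % 7) for i in range(36)]
    matrix = random_projection_matrix(d_in=36, d_out=8, seed=42)
    print(project(descriptor, matrix))
```

Because the matrix depends only on the seed and the dimensions, it can be generated once and reused for every descriptor, which is what makes the method fast compared with data-dependent techniques such as PCA.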