9 research outputs found

    Learning detectors quickly using structured covariance matrices

    Computer vision is increasingly interested in the rapid estimation of object detectors. Canonical hard negative mining strategies are slow because they require multiple passes over the large negative training set. Recent work has demonstrated that if the distribution of negative examples is assumed to be stationary, then Linear Discriminant Analysis (LDA) can learn comparable detectors without ever revisiting the negative set. Even with this insight, however, the time to learn a single object detector can still be on the order of tens of seconds on a modern desktop computer. This paper proposes to leverage the resulting structured covariance matrix to obtain detectors with identical performance in orders of magnitude less time and memory. We elucidate an important connection to the correlation filter literature, demonstrating that correlation filters can also be trained without ever revisiting the negative set.
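
    A minimal sketch of the LDA idea described in this abstract, not the paper's implementation: once the mean and covariance of the negative set have been estimated, a detector for a new class only needs the positive mean, w = cov^-1 (mu_pos - mu_neg). All function names, the diagonal regularisation constant, the mean-centred score and the toy data are assumptions made for illustration. The structured-covariance speed-up itself is not shown.

```python
def mean(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance(rows, mu, reg=1e-3):
    """Covariance matrix with `reg` added to the diagonal (an assumption)."""
    d, n = len(mu), len(rows)
    cov = [[0.0] * d for _ in range(d)]
    for r in rows:
        diff = [x - m for x, m in zip(r, mu)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += diff[i] * diff[j] / n
    for i in range(d):
        cov[i][i] += reg
    return cov


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def lda_detector(positives, neg_mu, neg_cov):
    """w = cov^-1 (mu_pos - mu_neg); the negative samples are not revisited."""
    pos_mu = mean(positives)
    return solve(neg_cov, [p - q for p, q in zip(pos_mu, neg_mu)])


# Toy data: negative statistics are computed once and reused for every class.
negatives = [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0], [0.1, 0.2]]
neg_mu = mean(negatives)
neg_cov = covariance(negatives, neg_mu)
w = lda_detector([[1.0, 1.1], [0.9, 1.0]], neg_mu, neg_cov)


def score(x):
    """Detector response, centred on the negative mean (an assumption)."""
    return sum(wi * (xi - mi) for wi, xi, mi in zip(w, x, neg_mu))
```

    Training a detector for another class would call `lda_detector` again with new positives and the same `neg_mu` and `neg_cov`.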

    Identity verification using computer vision for automatic garage door opening

    We present a novel system for automatic identification of vehicles as part of an intelligent access control system for a garage entrance. Using a camera in the door, cars are detected and matched to the database of authenticated cars. Once a car is detected, License Plate Recognition (LPR) is applied using character detection and recognition. The recognized license plate number is matched with the database of authenticated plates. If the car is allowed access, the door opens automatically. The recognition of both cars and characters (LPR) is performed using state-of-the-art shape descriptors and a linear classifier. Experiments have revealed that 90% of all cars are correctly authenticated from a single image only. Analysis of the computational complexity shows that an embedded implementation allows user authentication within approximately 300 ms, which is well within the application constraints.

    A tool for fast ground truth generation for object detection and tracking from video

    Object detection and tracking is one of the most important components in computer vision applications. To carefully evaluate the performance of detection and tracking algorithms, it is important to develop benchmark data sets. One of the most tedious and error-prone aspects of developing benchmarks is the generation of the ground truth. This paper presents FAST-GT (FAst Semi-automatic Tool for Ground Truth generation), a new generic framework for the semi-automatic generation of ground truths. FAST-GT reduces the need for manual intervention, thus speeding up the ground-truthing process.

    Shallow U-Net Deep Learning Approach for Phase Retrieval in Propagation-Based Phase-Contrast Imaging

    X-ray Computed Tomography (CT) has revolutionised modern medical imaging. However, X-ray CT imaging requires patients to be exposed to radiation, which can increase the risk of cancer. The aim is therefore to reduce radiation doses for CT imaging without sacrificing image accuracy. This research combines phase retrieval with the ShallowU-Net CNN method to achieve that aim. This paper shows that a significant change to existing neural network algorithms could improve X-ray phase retrieval in propagation-based phase-contrast imaging. It applies deep learning methods, through a variant of the existing U-Net architecture named ShallowU-Net, to show that it is possible to perform two-distance X-ray phase retrieval on composite materials by predicting a portion of the required data. ShallowU-Net is faster in training and in deployment. The method also performs data stretching and pre-processing to reduce the numerical instability of the U-Net algorithm, thereby improving the phase retrieval images.

    Inverted cone convolutional neural network for deboning MRIs

    Data plenitude is the power but also the bottleneck of data-driven approaches, including neural networks. In particular, Convolutional Neural Networks (CNNs) require an abundant database of training images to achieve a desired high accuracy. Current techniques for boosting small datasets are data augmentation and synthetic data generation, which suffer from computational complexity and imprecision compared to original datasets. In this thesis, we incorporate prior knowledge based on the temporal relation between the images in the third dimension. Specifically, we compute the gradient of subsequent images in the dataset to remove extraneous information and highlight subtle variations between the images. The approach is coined Inverted Cone because the volume of brain images below the level of the eyes is ordered to form an inverted cone geometry. The application explored in this work is deboning, or brain extraction, in brain magnetic resonance imaging (MRI) scans. We considered a limited dataset of 23 patients with and without malignant glioblastoma provided by the University of Alabama at Birmingham School of Medicine. Automatic deboning was performed by employing an optimized CNN architecture with and without the Inverted Cone processing. The classic CNN achieved a validation accuracy of 77%, while the Inverted Cone CNN model achieved a validation accuracy of 86% in a dataset of 451 brain MRI slices.
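
    One possible reading of "the gradient of subsequent images" is a forward difference between consecutive slices of the volume. The sketch below illustrates only that step, under that assumption; the function name and the toy volume are hypothetical, and the thesis may define the gradient differently.

```python
def slice_differences(volume):
    """Return forward differences between consecutive 2-D slices.

    `volume` is a list of slices; each slice is a list of rows of pixel values.
    Pixels that do not change between neighbouring slices become zero, so only
    the variation along the third dimension remains.
    """
    return [
        [[b - a for a, b in zip(row_a, row_b)] for row_a, row_b in zip(s0, s1)]
        for s0, s1 in zip(volume, volume[1:])
    ]


# Toy volume of three 2x2 slices (hypothetical values).
volume = [
    [[0, 0], [0, 0]],
    [[0, 1], [0, 0]],
    [[0, 1], [2, 0]],
]
diffs = slice_differences(volume)
```

    A volume of N slices yields N - 1 difference images, which could then be fed to the CNN in place of, or alongside, the raw slices.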

    Design, implementation and evaluation of a new learning strategy for image spatial transformation convolutional neural networks (STNs)

    This work will consist of designing, implementing and evaluating different convergence methods for spatial transformation convolutional neural networks, in this case working on images of worms (C. elegans). Initially, the work will focus on studying and understanding this type of neural network so that the strategies to follow can be proposed. For the evaluation, a dataset of pairs of C. elegans images captured by two cameras will be available, and the main objective of the experiments will be to transform one of the images, in which the worm is not centred, into the other, in which it is. The experiments will be carried out in PyCharm, which uses Python as its programming language; both the neural networks and the convergence methods to be evaluated will be designed with the PyTorch library together with other typical Python libraries. Finally, the proposals will be evaluated using several criteria, including the success rate and the computational time cost of the runs. In addition, several applications will be proposed in which the results of this study can be used.
    Navarro Moya, F. (2021). Diseño, implementación y evaluación de una nueva estrategia de aprendizaje para redes neuronales convolucionales de transformación espacial de imágenes (STNs). Universitat Politècnica de València. http://hdl.handle.net/10251/173981 TFG

    Fast training of object detection using stochastic gradient descent

    Training datasets for object detection problems are typically very large and Support Vector Machine (SVM) implementations are computationally complex. As opposed to these complex techniques, we use Stochastic Gradient Descent (SGD) algorithms that use only a single new training sample in each iteration and process samples in a stream-like fashion. We have incorporated SGD optimization in an object detection framework. The object detection problem is typically highly asymmetric because of the limited variation in object appearance compared to the background. Incorporating SGD speeds up the optimization process significantly, requiring only a single pass over the training set to obtain results comparable to state-of-the-art SVM techniques. SGD optimization scales linearly in time, and the obtained speedup in computation time is two to three orders of magnitude. We show that by considering only part of the total training set, SGD converges quickly to the overall optimum.
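
    The single-pass, one-sample-per-iteration training described here can be sketched as SGD on the regularised hinge loss of a linear SVM. The Pegasos-style step size, the absence of a bias term, the function name and the toy data are assumptions for illustration; the abstract does not specify the exact update used in the paper.

```python
import random


def sgd_linear_svm(samples, labels, lam=0.01, seed=0):
    """One pass of SGD on the L2-regularised hinge loss (Pegasos-style steps)."""
    rng = random.Random(seed)
    order = list(range(len(samples)))
    rng.shuffle(order)  # visit each sample exactly once, in random order
    w = [0.0] * len(samples[0])
    for t, i in enumerate(order, start=1):
        x, y = samples[i], labels[i]
        eta = 1.0 / (lam * t)  # decreasing step size
        margin = y * sum(wj * xj for wj, xj in zip(w, x))
        # Shrink the weights for the regulariser, then step on the sample
        # only if it violated the margin under the previous weights.
        w = [(1.0 - eta * lam) * wj for wj in w]
        if margin < 1.0:
            w = [wj + eta * y * xj for wj, xj in zip(w, x)]
    return w


# Toy, asymmetric stream: few object samples (+1), more background (-1).
samples = [[1.0, 0.8], [0.9, 1.2],
           [-1.0, -0.7], [-0.8, -1.1], [-1.2, -0.9], [-0.6, -1.0]]
labels = [1, 1, -1, -1, -1, -1]
w = sgd_linear_svm(samples, labels)
predictions = [1 if sum(wj * xj for wj, xj in zip(w, x)) > 0 else -1
               for x in samples]
```

    Each sample is touched once, so the cost grows linearly with the number of samples, which matches the linear scalability the abstract claims.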