
    Application of Super Resolution Convolutional Neural Networks (SRCNNs) to enhance medical images resolution

    Resolution is crucial when working with medical images: the ability to visualize fine details leads to more accurate diagnoses and makes segmentation easier. However, obtaining high-resolution medical images requires long acquisition times, and in clinical environments lack of time leads to the acquisition of low-resolution images. Super Resolution (SR) consists of post-processing images to enhance their resolution. In recent years, a branch of SR that applies Convolutional Neural Networks (CNNs) to the images has produced promising results. This project aims to create a network able to enhance the resolution of knee MR images stored in DICOM format. Different networks are proposed, and evaluation is performed by computing the Peak Signal-to-Noise Ratio (PSNR) and normalized cross-correlation. One of the proposed networks, SR-DCNN, produced better results than the conventional method, bicubic interpolation. Finally, visual comparison of SR-DCNN and bicubic interpolation also showed that the proposed network outperforms the conventional method.
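    The two evaluation metrics named above can be sketched as follows. This is a minimal illustration, not the project's code; the function names and the 8-bit `max_value` default are our assumptions.

    ```python
    import numpy as np

    def psnr(reference, estimate, max_value=255.0):
        """Peak Signal-to-Noise Ratio in dB between two images."""
        diff = reference.astype(np.float64) - estimate.astype(np.float64)
        mse = np.mean(diff ** 2)
        return 10.0 * np.log10(max_value ** 2 / mse)

    def ncc(reference, estimate):
        """Normalized cross-correlation: 1.0 is a perfect linear match."""
        a = reference.astype(np.float64) - reference.mean()
        b = estimate.astype(np.float64) - estimate.mean()
        return float(np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
    ```

    Higher PSNR means the super-resolved image is closer to the ground-truth high-resolution image; NCC is insensitive to global brightness and contrast shifts, which is why the two metrics complement each other.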

    Connecting mathematical models for image processing and neural networks

    This thesis deals with the connections between mathematical models for image processing and deep learning. While data-driven deep learning models such as neural networks are flexible and perform well, they are often used as black boxes. This makes it hard to provide theoretical model guarantees and scientific insights. On the other hand, more traditional, model-driven approaches such as diffusion, wavelet shrinkage, and variational models offer a rich set of mathematical foundations. Our goal is to transfer these foundations to neural networks. To this end, we pursue three strategies. First, we design trainable variants of traditional models and reduce their parameter set after training to obtain transparent and adaptive models. Moreover, we investigate the architectural design of numerical solvers for partial differential equations and translate them into building blocks of popular neural network architectures. This yields criteria for stable networks and inspires novel design concepts. Lastly, we present novel hybrid models for inpainting that rely on our theoretical findings. These strategies provide three ways of combining the best of the two worlds of model- and data-driven approaches. Our work contributes to the overarching goal of closing the gap in performance and understanding that still exists between these worlds.
    ERC Advanced Grant INCOVI
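    The translation of numerical PDE solvers into network building blocks can be illustrated with a minimal sketch (our own toy example, not the thesis's code): one explicit step of 1-D linear diffusion has exactly the structure of a residual block with a fixed convolution kernel.

    ```python
    import numpy as np

    def diffusion_step(u, tau=0.25):
        """One explicit step of 1-D linear diffusion u_t = u_xx with
        periodic boundaries. Structurally a residual block:
        u_new = u + tau * conv(u, [1, -2, 1])."""
        laplacian = np.roll(u, -1) - 2.0 * u + np.roll(u, 1)
        return u + tau * laplacian  # explicit scheme, stable for tau <= 0.5
    ```

    The stability bound tau <= 1/2 of the explicit scheme is a simple instance of the kind of criterion that, per the thesis, carries over to stable network designs.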

    Gabor Filter Initialization And Parameterization Strategies In Convolutional Neural Networks

    Convolutional neural networks (CNNs) are widely known in the literature to be extremely effective for classifying images. Some of the filters learned during training of the first layer of a CNN resemble the Gabor filter, and Gabor filters are extremely good at extracting features from an image. This motivated us to replace the first layer of a CNN with a Gabor filter bank to increase speed and accuracy when classifying images. We created two simple 5-layer AlexNet-like CNNs, comparing grid search to random search for initializing the Gabor filter bank. We trained on MNIST, CIFAR-10, and CIFAR-100, as well as a rock dataset created at Western University to study the classification of rock images using a CNN. When training on this rock dataset, we use an architecture from the literature and apply our Gabor filter substitution method to demonstrate the usage of the Gabor filter. Using the Gabor convolutional neural network (GCNN) showed improvements in training speed across all datasets tested. We also found that the GCNN underperforms when dropout is added, even when overfitting becomes an issue. The size of the Gabor filter bank becomes a hyperparameter that can be tuned per dataset. Applying our Gabor filter replacement method to a 3-layer CNN reduced final accuracy at epoch 200 by 1.16% but showed large improvements in the speed of convergence during training, with 93.44% accuracy on a validation set after 10 epochs compared to the original network's 82.19%.
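    A first-layer Gabor initialization of the kind described can be sketched as follows. This is a hedged illustration under our own parameter choices (wavelength, sigma, aspect ratio), not the thesis's filter bank; the grid-search variant corresponds to evenly spaced orientations.

    ```python
    import numpy as np

    def gabor_kernel(size, theta, lam, sigma, gamma=0.5, psi=0.0):
        """Real Gabor kernel: Gaussian envelope times a cosine carrier."""
        half = size // 2
        y, x = np.mgrid[-half:half + 1, -half:half + 1]
        xr = x * np.cos(theta) + y * np.sin(theta)   # rotate coordinates
        yr = -x * np.sin(theta) + y * np.cos(theta)
        envelope = np.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2.0 * sigma ** 2))
        return envelope * np.cos(2.0 * np.pi * xr / lam + psi)

    def gabor_bank(n_filters, size=5):
        """Grid-search-style bank: evenly spaced orientations, one scale."""
        thetas = np.linspace(0.0, np.pi, n_filters, endpoint=False)
        return np.stack([gabor_kernel(size, t, lam=4.0, sigma=2.0) for t in thetas])
    ```

    The resulting array can be copied into the first convolution layer's weights; the number of filters in the bank is then the tunable hyperparameter the abstract mentions.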

    Robust Traffic Sign Detection by means of Vision and V2I Communications

    14th International IEEE Annual Conference on Intelligent Transportation Systems (ITSC), 05/10/2011-07/10/2011, Washington DC, United States.
    This paper presents a complete traffic sign recognition system, including the steps of detection, recognition, and tracking. The Hough transform is used as the detection method, applied to information extracted from contour images, while the proposed recognition system is based on Support Vector Machines (SVMs) and is able to recognize up to one hundred of the main road signs. In addition, a novel solution to the problem of discarding detected signs that do not pertain to the host road is proposed; for that purpose, vehicle-to-infrastructure (V2I) communication and stereo information are used. The paper presents numerous tests in real driving conditions, both day and night, in which a high success rate and low numbers of false positives and false negatives were obtained, with an average runtime of 35 ms, allowing real-time performance.
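    The Hough-based detection stage can be illustrated with a minimal sketch (our own toy version, not the authors' implementation): every contour pixel votes for all circle centres lying at a known radius from it, and strong accumulator peaks mark circular sign candidates.

    ```python
    import numpy as np

    def hough_circles(edge_points, radius, shape, n_angles=90):
        """Vote for circle centres: each edge point votes for every centre
        at distance `radius` from it. Returns the accumulator array."""
        acc = np.zeros(shape, dtype=np.int32)
        angles = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
        for (y, x) in edge_points:
            cy = np.round(y - radius * np.sin(angles)).astype(int)
            cx = np.round(x - radius * np.cos(angles)).astype(int)
            ok = (cy >= 0) & (cy < shape[0]) & (cx >= 0) & (cx < shape[1])
            np.add.at(acc, (cy[ok], cx[ok]), 1)  # unbuffered accumulation
        return acc
    ```

    In a full system the radius would also be searched over, and peaks above a vote threshold would be passed on to the SVM recognition stage.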

    Using CNNs in the domain of Visual Trademark Retrieval

    Nowadays we are immersed in the age of Artificial Intelligence, and it seems that every application has to be developed following this trend. Yet despite the current boom, this technique is still uncommon in the domain of image trademark retrieval. This thesis therefore proposes a tool to help the people in charge of indexing and classifying incoming visual trademarks, supporting them with suggestions that follow a standard classification. The project covers the entire end-to-end process of a deep learning classification task: downloading and extracting a data set for training and testing, analysing the state of the art, and evaluating the results on our own database. As this field is constantly developing, we cannot predict what the future will bring. However, using deep learning in this scenario may help achieve more concise labelling and classification of visual trademarks compared to the results of the manual process.

    Learning Learning Algorithms

    Machine learning models rely on data to learn any given task, and depending on the diversity of the task's elements and the design objectives, large amounts of data may be required for good performance, which in turn can greatly increase learning time and computational cost. Although most training of machine learning models today is done on GPUs (Graphics Processing Units) to speed up the process, many models, depending on the dataset, still require a huge amount of training time to attain good performance. This study looks into learning learning algorithms, popularly known as metalearning: methods that try to improve not only learning speed but also model performance, while requiring fewer data and spanning multiple tasks. The concept involves training a model that constantly learns to learn novel tasks quickly from previously learned tasks. In the review of related work, attention is given to optimization-based methods and, more precisely, to MAML (Model-Agnostic Meta-Learning): first, because it is one of the most popular state-of-the-art metalearning methods, and second, because this thesis focuses on creating a MAML-based method called MAML-DBL that uses an adaptive learning-rate technique with dynamic bounds, enabling quick convergence at the beginning of training and good generalization towards the end. The proposed MAML variant aims to prevent vanishing learning rates during training and slowdown at the end, where dense features are prevalent, although further hyperparameter tuning might be necessary for some models, or where sparse features are prevalent, to improve performance.
    MAML-DBL and MAML were tested on the datasets most commonly used for metalearning models. Based on the experimental results, the proposed method showed competitive performance on some of the models and even outperformed the baseline in some of the tests carried out. The results obtained with both MAML-DBL (on one of the datasets) and MAML show that metalearning methods are highly recommendable whenever good performance, less data, and a multi-task or versatile model are required or desired.
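    The inner/outer loop that MAML-style methods share can be sketched on a toy problem. This is a hedged illustration of first-order MAML on scalar linear-regression tasks, under our own assumptions; it does not reproduce MAML-DBL's dynamic-bound learning rate.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)

    def task_batch(a, n=20):
        """One regression task y = a*x; returns a sampled data batch."""
        x = rng.uniform(-1.0, 1.0, size=n)
        return x, a * x

    def grad(w, x, y):
        """d/dw of the mean squared error of the model y_hat = w*x."""
        return 2.0 * np.mean(x * (w * x - y))

    def fomaml(w, alpha=0.1, beta=0.01, steps=500):
        """First-order MAML: adapt per task with an inner SGD step, then
        update the meta-parameter with the gradient at the adapted value."""
        for _ in range(steps):
            a = rng.uniform(-2.0, 2.0)             # sample a task
            xs, ys = task_batch(a)                 # support set
            w_task = w - alpha * grad(w, xs, ys)   # inner adaptation step
            xq, yq = task_batch(a)                 # query set
            w = w - beta * grad(w_task, xq, yq)    # outer (meta) update
        return w
    ```

    Full MAML differentiates through the inner step (a second-order gradient); the first-order variant shown here drops that term, which keeps the sketch short while preserving the two-loop structure.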

    JPEG-like Image Compression using Neural-network-based Block Classification and Adaptive Reordering of Transform Coefficients

    The research described in this thesis addresses aspects of coding of discrete-cosine-transform (DCT) coefficients that are present in a variety of transform-based digital-image-compression schemes such as JPEG. Coefficient reordering, which directly affects the symbol statistics for entropy coding and therefore the effectiveness of entropy coding, is investigated. Adaptive zigzag reordering, a novel versatile technique that achieves efficient reordering by processing variable-size rectangular sub-blocks of coefficients, is developed. Classification of blocks of DCT coefficients using an artificial neural network (ANN) prior to adaptive zigzag reordering is also considered. Some established digital-image-compression techniques are reviewed, and the JPEG standard for the DCT-based method is studied in more detail. An introduction to artificial neural networks is provided. Lossless conversion of blocks of coefficients using adaptive zigzag reordering is investigated, and experimental results are presented. A versatile algorithm that generates zigzag scan paths for sub-blocks of any dimensions using a binary decision tree is developed. An implementation of the algorithm based on programmable logic devices (PLDs) is described, demonstrating the feasibility of hardware implementations. Coding of the sub-block dimensions, which need to be retained in order to reconstruct a sub-block during decoding, based on the scan-path length is developed. Lossy conversion of blocks of coefficients is also considered, and experimental results are presented. A two-layer feedforward artificial neural network, trained using an error-backpropagation algorithm, that determines the sub-block dimensions is described. Isolated nonzero coefficients of small significance are discarded in some blocks, and therefore smaller sub-blocks are generated.
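    A zigzag scan path for a rectangular sub-block of any dimensions can be sketched as follows. This is our own minimal illustration of the ordering, not the thesis's binary-decision-tree algorithm or its PLD implementation.

    ```python
    def zigzag_order(rows, cols):
        """Zigzag scan path for a rows x cols sub-block: traverse the
        anti-diagonals, alternating direction as in JPEG's 8x8 scan."""
        cells = [(i, j) for i in range(rows) for j in range(cols)]
        # Sort by anti-diagonal i+j; within a diagonal, go up-right on
        # even diagonals (i descending) and down-left on odd ones.
        return sorted(cells, key=lambda c: (c[0] + c[1],
                                            c[0] if (c[0] + c[1]) % 2 else -c[0]))
    ```

    For an 8x8 block this reproduces the familiar JPEG scan; for the variable-size rectangular sub-blocks the thesis processes, the same rule yields a valid path of rows*cols distinct positions.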

    Improving the accuracy of weed species detection for robotic weed control in complex real-time environments

    Alex Olsen applied deep learning and machine vision to improve the accuracy of weed species detection in complex real-time environments. His robotic weed control prototype, AutoWeed, presents a new, efficient tool for weed management in crops and pastures and has launched a startup agricultural technology company.

    Ensemble Approach to the Semantic Segmentation of Satellite Images

    Automatic classification and segmentation of land use and land cover (LULC) is extremely important for understanding the relationship between humans and nature. Human pressure on the environment has accelerated drastically in recent decades, threatening biodiversity and ecosystem services. Remote sensing via satellite imagery is an excellent tool for studying LULC. Research has shown that deep learning encoder-decoder architectures achieve worthy results for LULC; however, the application of an ensemble approach has not been well quantified, although studies have shown it to be useful in medical imaging. Ensembling, pooling together predictions to produce better predictions, is a well-known technique in machine learning. This study aims to quantify the statistical improvement that a deep learning ensemble approach can bring to a semantic segmentation problem on satellite imagery.
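    The pooling of predictions described above can be sketched as follows (a minimal illustration assuming per-pixel softmax outputs; `ensemble_segment` is a hypothetical helper, not the study's code):

    ```python
    import numpy as np

    def ensemble_segment(prob_maps):
        """Average per-pixel class probabilities from several models
        (each of shape H x W x C) and take the argmax class per pixel."""
        mean_probs = np.mean(np.stack(prob_maps), axis=0)
        return mean_probs.argmax(axis=-1)
    ```

    Averaging probabilities (soft voting) lets confident models outvote an uncertain one at each pixel, which is the mechanism behind the statistical improvement the study sets out to quantify.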