298 research outputs found

    Advances in Image Processing, Analysis and Recognition Technology

    For many decades, researchers have been trying to make computers' analysis of images as effective as human vision. For this purpose, many algorithms and systems have been created. The whole process covers various stages, including image processing, representation and recognition. The results of this work can be applied to many computer-assisted areas of everyday life. They improve particular activities and provide handy tools, which are sometimes only for entertainment, but quite often significantly increase our safety. In fact, the range of practical implementations of image processing algorithms is particularly wide. Moreover, the rapid growth of computational power has allowed for the development of more sophisticated and effective algorithms and tools. Although significant progress has been made so far, many issues remain, resulting in the need for the development of novel approaches.

    Style Transfer with Generative Adversarial Networks

    This dissertation focuses on applying concepts from style transfer and image-to-image translation to the problem of defogging. Defogging (or dehazing) is the task of removing fog from an image, restoring it as if the photograph had been taken under optimal weather conditions. Defogging is of particular interest in many fields, such as surveillance or self-driving cars. In this thesis an unpaired approach to defogging is adopted, translating a foggy image into the corresponding clear picture without having pairs of foggy and ground-truth haze-free images during training. This approach is particularly significant, due to the difficulty of gathering an image collection of exactly the same scenes with and without fog. Many of the models and techniques used in this dissertation already existed in the literature, but they are extremely difficult to train, and it is often highly problematic to obtain the desired behavior. Our contribution is a systematic implementation and experimental activity, conducted with the aim of attaining a comprehensive understanding of how these models work, and of the role of datasets and training procedures in the final results. We also analyzed metrics and evaluation strategies, in order to assess the quality of the presented model in the most appropriate manner. First, the feasibility of an unpaired approach to defogging was analyzed, using the CycleGAN model. Then, the base model was enhanced with a cycle perceptual loss, inspired by style transfer techniques. Next, the role of the training set was investigated, showing that improving the quality of the data is at least as important as using more powerful models. Finally, our approach is compared with state-of-the-art defogging methods, showing that the quality of our results is in line with preexisting approaches, even though our model was trained using unpaired data.
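    As a rough illustration of the cycle perceptual loss mentioned above, here is a minimal PyTorch sketch: it compares a foggy image with its cycle reconstruction (foggy → clear → foggy) in the feature space of a frozen VGG16. The layer choice, the L1 distance, and the function names are assumptions for illustration, not the dissertation's exact implementation.

```python
# Minimal sketch of a cycle perceptual loss for CycleGAN-style defogging.
# Layer choice and distance are illustrative; inputs are assumed to be
# ImageNet-normalized image batches of shape (B, 3, H, W).
import torch
import torch.nn as nn
from torchvision.models import vgg16

class PerceptualLoss(nn.Module):
    """L1 distance between VGG16 feature maps of two images."""
    def __init__(self, layer_index=16):  # slice up to relu3_3 in VGG16
        super().__init__()
        extractor = vgg16(pretrained=True).features[:layer_index].eval()
        for p in extractor.parameters():
            p.requires_grad = False  # frozen feature extractor
        self.extractor = extractor
        self.criterion = nn.L1Loss()

    def forward(self, x, y):
        return self.criterion(self.extractor(x), self.extractor(y))

def cycle_perceptual_loss(real_foggy, reconstructed_foggy, perc):
    # Compares the original image with its cycle reconstruction
    # (foggy -> clear -> foggy) in feature space instead of pixel space.
    return perc(real_foggy, reconstructed_foggy)
```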

    Cancer diagnosis using deep learning: A bibliographic review

    In this paper, we first describe the basics of the field of cancer diagnosis, covering the steps of cancer diagnosis followed by the typical classification methods used by doctors, thus giving readers a historical view of cancer classification techniques. These methods include the Asymmetry, Border, Color and Diameter (ABCD) method, the seven-point detection method, the Menzies method, and pattern analysis. They are used regularly by doctors for cancer diagnosis, although they are not considered very efficient in terms of performance. Moreover, with all types of audience in mind, the basic evaluation criteria are also discussed. These criteria include the receiver operating characteristic (ROC) curve, the area under the ROC curve (AUC), F1 score, accuracy, specificity, sensitivity, precision, Dice coefficient, average accuracy, and Jaccard index. Since the previously used methods are considered inefficient, better and smarter methods for cancer diagnosis are needed. Artificial intelligence applied to cancer diagnosis is gaining attention as a way to build better diagnostic tools; in particular, deep neural networks can be successfully used for intelligent image analysis. This study outlines the basic framework of how such machine learning operates on medical imaging, i.e., pre-processing, image segmentation and post-processing. The second part of this manuscript describes the different deep learning techniques, such as convolutional neural networks (CNNs), generative adversarial networks (GANs), deep autoencoders (DANs), restricted Boltzmann machines (RBMs), stacked autoencoders (SAEs), convolutional autoencoders (CAEs), recurrent neural networks (RNNs), long short-term memory (LSTM), multi-scale convolutional neural networks (M-CNNs), and multi-instance learning convolutional neural networks (MIL-CNNs). For each technique, we provide Python code, allowing interested readers to experiment with the cited algorithms on their own diagnostic problems. The third part of this manuscript compiles the deep learning models successfully applied to different types of cancers. Given the length of the manuscript, we restrict ourselves to the discussion of breast cancer, lung cancer, brain cancer, and skin cancer. The purpose of this bibliographic review is to give researchers who intend to implement deep learning and artificial neural networks for cancer diagnosis a from-scratch view of the state-of-the-art achievements.
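    The review ships Python code for each technique; in the same spirit, the following sketch shows how the evaluation criteria listed above can be computed with scikit-learn. The labels and scores are toy data made up for illustration.

```python
# Illustrative computation of the evaluation criteria discussed above,
# using standard scikit-learn functions on toy predictions.
import numpy as np
from sklearn.metrics import (accuracy_score, f1_score, jaccard_score,
                             precision_score, recall_score, roc_auc_score,
                             roc_curve)

y_true = np.array([0, 1, 1, 0, 1, 0, 1, 1])                 # toy ground truth
y_score = np.array([0.1, 0.8, 0.6, 0.3, 0.9, 0.4, 0.7, 0.2])  # toy scores
y_pred = (y_score >= 0.5).astype(int)                        # thresholded labels

print("accuracy             :", accuracy_score(y_true, y_pred))
print("precision            :", precision_score(y_true, y_pred))
print("sensitivity (recall) :", recall_score(y_true, y_pred))
# Specificity is the recall of the negative class.
print("specificity          :", recall_score(y_true, y_pred, pos_label=0))
# For binary labels the F1 score coincides with the Dice coefficient.
print("F1 score (Dice)      :", f1_score(y_true, y_pred))
print("Jaccard index        :", jaccard_score(y_true, y_pred))
print("AUC                  :", roc_auc_score(y_true, y_score))
fpr, tpr, thresholds = roc_curve(y_true, y_score)  # points on the ROC curve
```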

    Deep Learning-Based Object Detection in Maritime Unmanned Aerial Vehicle Imagery: Review and Experimental Comparisons

    With the advancement of maritime unmanned aerial vehicles (UAVs) and deep learning technologies, UAV-based object detection has become increasingly significant in the maritime industry and ocean engineering. Endowed with intelligent sensing capabilities, maritime UAVs enable effective and efficient maritime surveillance. To further promote the development of maritime UAV-based object detection, this paper provides a comprehensive review of challenges, related methods, and UAV aerial datasets. Specifically, we first briefly summarize four challenges for object detection with maritime UAVs, i.e., object feature diversity, device limitations, maritime environment variability, and dataset scarcity. We then focus on computational methods to improve maritime UAV-based object detection performance, covering scale-aware and small object detection, view-aware and rotated object detection, lightweight methods, and others. Next, we review UAV aerial image/video datasets and propose a maritime UAV aerial dataset named MS2ship for ship detection. Furthermore, we conduct a series of experiments to present the performance evaluation and robustness analysis of object detection methods on maritime datasets. Finally, we discuss and outline future work on maritime UAV-based object detection. The MS2ship dataset is available at https://github.com/zcj234/MS2ship. Comment: 32 pages, 18 figures.

    A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community

    In recent years, deep learning (DL), a re-branding of neural networks (NNs), has risen to the top in numerous areas, namely computer vision (CV), speech recognition, natural language processing, etc. Whereas remote sensing (RS) poses a number of unique challenges, primarily related to sensors and applications, RS inevitably draws from many of the same theories as CV, e.g., statistics, fusion, and machine learning, to name a few. This means that the RS community should be aware of, if not at the leading edge of, advancements like DL. Herein, we provide the most comprehensive survey of state-of-the-art RS DL research. We also review recent new developments in the DL field that can be used in DL for RS; namely, we focus on theories, tools and challenges for the RS community. Specifically, we focus on unsolved challenges and opportunities related to (i) inadequate data sets, (ii) human-understandable solutions for modelling physical phenomena, (iii) Big Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and learning algorithms for spectral, spatial and temporal data, (vi) transfer learning, (vii) an improved theoretical understanding of DL systems, (viii) high barriers to entry, and (ix) training and optimizing DL models. Comment: 64 pages, 411 references. To appear in Journal of Applied Remote Sensing.

    Intelligent Transportation Related Complex Systems and Sensors

    Building on innovative services related to different modes of transport and traffic management, intelligent transport systems (ITS) are being widely adopted worldwide to improve the efficiency and safety of transportation. They enable users to be better informed and to make safer, more coordinated, and smarter use of transport networks. Current ITSs are complex systems, made up of several components/sub-systems characterized by time-dependent interactions among themselves. Examples of these transportation-related complex systems include road traffic sensors, autonomous/automated cars, smart cities, smart sensors, virtual sensors, traffic control systems, smart roads, logistics systems, and smart mobility systems, with many others emerging from niche areas. The efficient operation of these complex systems requires: (i) efficient solutions for the sensors/actuators used to capture and control the physical parameters of these systems, as well as for the quality of the data collected from them; (ii) tackling complexity using simulation and analytical modelling techniques; and (iii) applying optimization techniques to improve their performance. This collection includes twenty-four papers, which cover scientific concepts, frameworks, architectures and various other ideas on analytics, trends and applications of transportation-related data.

    Deep learning with multiple modalities : making the most out of available data

    Deep learning, a sub-domain of machine learning, is known to require a very large amount of data to achieve satisfactory generalization performance. Another current limitation of machine learning systems is the need to have access to the same type of data during the training phase of the model as during its testing phase. In many cases, this makes it impossible to exploit, during training, data from an additional modality that could bring extra information to the system and improve it. In this thesis, several training methods are proposed to take advantage of additional modalities that are available in datasets only during training and not during testing. We are particularly interested in reducing the noise present in images. The thesis begins with the simplest technique, a denoising step applied before a task to increase the system's ability to perform that task. Then, two more advanced techniques are presented, which propose guided denoising to increase the performance of a subsequent task. Finally, we conclude this thesis by presenting a technique called Input Dropout that makes it very easy to use a modality available only during training to increase the performance of a system, and this for a multitude of varied computer vision tasks.
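    As a hedged sketch of the Input Dropout idea described above, the following PyTorch snippet concatenates an auxiliary modality (here assumed to be a depth map) to the RGB input and randomly zeroes it during training, so the trained network can still be used when that modality is missing at test time. The modality and the drop probability are illustrative assumptions.

```python
# Minimal sketch of Input Dropout: the auxiliary modality is randomly
# zeroed during training and always zeroed at test time, when it is
# unavailable. Shapes and p_drop are illustrative assumptions.
import torch

def input_dropout(rgb, aux, p_drop=0.5, training=True):
    """rgb: (B, 3, H, W); aux: (B, 1, H, W). Returns a (B, 4, H, W) tensor."""
    if training:
        # Per-sample mask: each example keeps or drops its auxiliary modality.
        keep = (torch.rand(aux.size(0), 1, 1, 1, device=aux.device) > p_drop)
        aux = aux * keep.float()
    else:
        # At test time the extra modality is unavailable, so feed zeros.
        aux = torch.zeros_like(aux)
    return torch.cat([rgb, aux], dim=1)

# Usage: x = input_dropout(rgb_batch, depth_batch, training=model.training)
```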

    LiDAR-based Weather Detection: Automotive LiDAR Sensors in Adverse Weather Conditions

    Technological improvements are increasing the degree of vehicle automation. The natural next step is to support the driver where support is wanted most: in bad weather. Weather affects all sensors used to perceive the environment, so it is crucial to account for and mitigate these effects. This dissertation focuses on the emerging technology of automotive Light Detection and Ranging (LiDAR) sensors and contributes to the development of autonomous vehicles capable of driving in various weather conditions. Its foundation is the first LiDAR point cloud dataset focused on adverse weather, which contains point-wise annotated weather information and was recorded under controlled weather conditions. This dataset is extended with a novel weather augmentation method for generating realistic weather effects. A novel approach for classifying the weather state and the first CNN-based denoising algorithm are developed. The result is an accurate prediction of the weather state and an improvement of point cloud quality. Controlled environments under different weather conditions allow the evaluation of the above approaches and provide valuable insights for automated and autonomous driving.
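    To make the weather-state classification step more concrete, here is a hedged Python sketch that derives a few simple echo statistics from a LiDAR scan and feeds them to a standard classifier. The feature set, class labels, and model choice are illustrative assumptions, not the dissertation's actual pipeline.

```python
# Hedged sketch of per-scan weather-state classification from LiDAR data.
# Features, labels, and classifier are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def cloud_features(points, intensities):
    """points: (N, 3) xyz array; intensities: (N,) echo intensities."""
    ranges = np.linalg.norm(points, axis=1)
    return np.array([
        len(points),          # number of returns (rain/fog adds clutter)
        ranges.mean(),        # mean range (fog shortens visibility)
        ranges.std(),
        intensities.mean(),   # attenuation lowers echo intensity
        intensities.std(),
    ])

# X: one feature vector per scan; y: one weather label per scan
# (e.g. 0 = clear, 1 = rain, 2 = fog).
clf = RandomForestClassifier(n_estimators=100)
# clf.fit(X_train, y_train); y_pred = clf.predict(X_test)
```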

    Retinal image quality assessment using deep convolutional neural networks

    Integrated master's dissertation in Biomedical Engineering (specialization in Medical Informatics). Diabetic retinopathy (DR) and diabetic macular edema (DME) are forms of damage to the retina and complications that can affect the diabetic population. Diabetic retinopathy (DR) is the most common such disease, characterized by the presence of exudates, and has three levels of severity (mild, moderate and severe), depending on the distribution of the exudates in the retina. For diabetic retinopathy screening or a population-based clinical study, a large number of digital fundus images are captured, and to make it possible to recognize the signs of DR and DME, the images must have sufficient quality, because low-quality images may force the patient to return for a second examination, wasting time and possibly delaying treatment. These images are evaluated by trained human experts, which can be a time-consuming and expensive task given the number of images that need to be examined. Therefore, this is a field that would benefit enormously from the development of automated eye fundus quality assessment and analysis systems, which could facilitate health care in remote regions and in developing countries where reading skills are scarce. Deep Learning is a kind of Machine Learning method that involves learning multi-level representations, beginning with raw data input and gradually moving to more abstract levels through non-linear transformations. With enough training data and sufficiently deep architectures, neural networks such as Convolutional Neural Networks (CNNs) can learn very complex functions and discover complex structures in the data. Thus, Deep Learning emerges as a powerful tool for medical image analysis and for evaluating retinal image quality in computer-aided diagnosis. The aim of this study is therefore to automatically assess each of the three quality parameters alone (focus, illumination and color), and then the overall quality of fundus images, classifying the images into the classes "accept" or "reject", with a Deep Learning approach using convolutional neural networks (CNNs). For the overall classification, the following results were obtained: test accuracy = 97.89%, SN = 97.9%, AUC = 0.98 and F1-score = 97.91%.
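    As a minimal sketch of the accept/reject classification step with a CNN, the following PyTorch snippet fine-tunes a pretrained backbone for two classes. The backbone (ResNet-18) and the hyperparameters are illustrative assumptions rather than the dissertation's exact architecture.

```python
# Hedged sketch of the accept/reject fundus-quality classifier: a
# pretrained CNN fine-tuned for two classes. Backbone and learning rate
# are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision.models import resnet18

model = resnet18(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, 2)  # classes: accept / reject

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(images, labels):
    """images: (B, 3, H, W) fundus photos; labels: (B,) 0=reject, 1=accept."""
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```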

    Sea surface wind and wave parameter estimation from X-band marine radar images with rain detection and mitigation

    In this research, the application of X-band marine radar backscatter images to sea surface wind and wave parameter estimation, with rain detection and mitigation, is investigated. In the presence of rain, rain echoes in the radar image blur the wave signatures and degrade estimation accuracy. Hence, in order to improve estimation accuracy, it is worthwhile to detect the presence of those rain echoes and mitigate their influence on the estimation results. Since rain alters the radar backscatter intensity distribution, features are extracted from the normalized histogram of each radar image. Then, a support vector machine (SVM)-based rain detection model is proposed to classify radar images obtained under rainless and rainy conditions. The classification accuracy shows significant improvement over the existing threshold-based method. By further inspecting images obtained under rainy conditions, it is found that many of them are only partially contaminated by rain echoes. Therefore, in order to segment rain-contaminated regions from those that are less affected or unaffected by rain, two types of methods are developed, based on unsupervised learning techniques and on a convolutional neural network (CNN), respectively. Specifically, for the unsupervised learning-based method, texture features are first extracted from each pixel and then clustered with a self-organizing map (SOM)-based model, which enables pixel-based identification of rain-contaminated regions. As for the CNN-based method, a SegNet-based semantic segmentation CNN is first designed and then trained using images with manually annotated labels. Both shipborne and shore-based marine radar data are used to train and validate the proposed methods, and high classification accuracies of around 90% are obtained. Given the similarities between how haze affects terrestrial images and how rain affects marine radar images, a CNN designed for image dehazing, i.e., DehazeNet, is applied to rain-contaminated regions in radar images to correct the influence of rain, which significantly reduces the estimation error of wind direction. Besides, after extracting histogram and texture features from rain-corrected radar images, a support vector regression (SVR)-based model, which achieves high estimation accuracy, is trained for wind speed estimation. Finally, a convolutional gated recurrent unit (CGRU) network is designed and trained for significant wave height (SWH) estimation. As an end-to-end system, the proposed network generates estimation results directly from radar image sequences by automatically extracting multi-scale spatial and temporal features. Compared to the classic signal-to-noise ratio (SNR)-based method, the CGRU-based model shows significant improvement in both estimation accuracy (under both rainless and rainy conditions) and computational efficiency.
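    As a hedged illustration of the SVM-based rain detection step, the following Python sketch extracts features from each radar image's normalized intensity histogram and trains a support vector classifier. The bin count and the added moment features are assumptions for illustration, not the thesis's exact feature set.

```python
# Hedged sketch of SVM-based rain detection from normalized-histogram
# features of X-band marine radar images. Bin count and features are
# illustrative assumptions.
import numpy as np
from sklearn.svm import SVC

def histogram_features(radar_image, bins=32):
    """radar_image: 2-D array of backscatter intensities."""
    hist, _ = np.histogram(radar_image, bins=bins, density=True)
    hist = hist / (hist.sum() + 1e-12)          # normalized histogram
    # Rain shifts the intensity distribution, so append simple moments.
    return np.concatenate([hist, [radar_image.mean(), radar_image.std()]])

# X: one feature vector per image; y: 1 = rainy, 0 = rainless.
svm = SVC(kernel="rbf", C=1.0, gamma="scale")
# svm.fit(X_train, y_train); y_pred = svm.predict(X_test)
```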