Advances in Image Processing, Analysis and Recognition Technology
For many decades, researchers have tried to make computer analysis of images as effective as human vision. Many algorithms and systems have been created for this purpose. The whole process covers several stages, including image processing, representation and recognition. The results of this work can be applied to many computer-assisted areas of everyday life. They improve particular activities and provide handy tools that are sometimes merely for entertainment but quite often significantly increase our safety. Indeed, the range of practical implementations of image processing algorithms is particularly wide. Moreover, the rapid growth of computational power has allowed the development of more sophisticated and effective algorithms and tools. Although significant progress has been made, many issues remain, calling for the development of novel approaches.
Style Transfer with Generative Adversarial Networks
This dissertation applies concepts from style transfer and image-to-image translation to the problem of defogging. Defogging (or dehazing) is the task of removing fog from an image, restoring it as if the photograph had been taken under optimal weather conditions. Defogging is of particular interest in many fields, such as surveillance and self-driving cars.
In this thesis an unpaired approach to defogging is adopted: a foggy image is translated to the corresponding clear picture without pairs of foggy and ground-truth haze-free images being available during training. This approach is particularly significant, given the difficulty of gathering an image collection of exactly the same scenes with and without fog.
Many of the models and techniques used in this dissertation already existed in the literature, but they are extremely difficult to train, and obtaining the desired behavior is often highly problematic. Our contribution is a systematic implementation and experimentation effort, conducted with the aim of attaining a comprehensive understanding of how these models work and of the role of datasets and training procedures in the final results. We also analyzed metrics and evaluation strategies, in order to assess the quality of the presented model in the most appropriate manner.
First, the feasibility of an unpaired approach to defogging was analyzed using the CycleGAN model. Then, the base model was enhanced with a cycle perceptual loss, inspired by style transfer techniques. Next, the role of the training set was investigated, showing that improving the quality of the data is at least as important as using more powerful models. Finally, our approach was compared with state-of-the-art defogging methods, showing that the quality of our results is in line with preexisting approaches, even though our model was trained on unpaired data.
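The cycle perceptual loss described above can be illustrated with a minimal sketch. This is not the thesis's implementation: the feature extractor here is a toy stand-in (a real model would use something like pretrained VGG activations), and the weights `lam_cyc` and `lam_perc` are hypothetical.

```python
import numpy as np

def l1_loss(a, b):
    # Mean absolute error, the standard cycle-consistency term in CycleGAN
    return np.mean(np.abs(a - b))

def perceptual_loss(a, b, feature_fn):
    # L2 distance in a feature space; in practice feature_fn would be a
    # pretrained network, but any callable works for illustration
    fa, fb = feature_fn(a), feature_fn(b)
    return np.mean((fa - fb) ** 2)

def cycle_losses(x, x_reconstructed, feature_fn, lam_cyc=10.0, lam_perc=1.0):
    # Total cycle objective: pixel-level L1 plus a feature-level
    # "cycle perceptual" term, weighted by lam_cyc / lam_perc
    return (lam_cyc * l1_loss(x, x_reconstructed)
            + lam_perc * perceptual_loss(x, x_reconstructed, feature_fn))

# Toy stand-in for a pretrained feature extractor: per-channel means
toy_features = lambda img: img.mean(axis=(0, 1))

x = np.ones((8, 8, 3))        # "original" image
x_rec = np.ones((8, 8, 3)) * 0.5  # image after the fog->clear->fog cycle
loss = cycle_losses(x, x_rec, toy_features)
```

The design point is simply that the perceptual term penalizes reconstruction errors in feature space rather than pixel space, which is what ties the approach back to style transfer.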
Cancer diagnosis using deep learning: A bibliographic review
In this paper, we first describe the basics of the field of cancer diagnosis, which includes the steps of cancer diagnosis followed by the typical classification methods used by doctors, providing readers with a historical view of cancer classification techniques. These methods include the Asymmetry, Border, Color and Diameter (ABCD) method, the seven-point checklist method, the Menzies method, and pattern analysis. They are used regularly by doctors for cancer diagnosis, although they are not considered very efficient. Moreover, considering all types of audience, the basic evaluation criteria are also discussed. These criteria include the receiver operating characteristic curve (ROC curve), the area under the ROC curve (AUC), F1 score, accuracy, specificity, sensitivity, precision, Dice coefficient, average accuracy, and Jaccard index. Previously used methods are considered inefficient, calling for better and smarter methods of cancer diagnosis. Artificial intelligence applied to cancer diagnosis is gaining attention as a way to define better diagnostic tools. In particular, deep neural networks can be successfully used for intelligent image analysis. The basic framework of how this machine learning works on medical imaging is provided in this study, i.e., pre-processing, image segmentation and post-processing. The second part of this manuscript describes the different deep learning techniques, such as convolutional neural networks (CNNs), generative adversarial networks (GANs), deep autoencoders (DANs), restricted Boltzmann machines (RBMs), stacked autoencoders (SAEs), convolutional autoencoders (CAEs), recurrent neural networks (RNNs), long short-term memory (LSTM), multi-scale convolutional neural networks (M-CNNs), and multi-instance learning convolutional neural networks (MIL-CNNs). For each technique, we provide Python code, to allow interested readers to experiment with the cited algorithms on their own diagnostic problems.
The third part of this manuscript compiles the successfully applied deep learning models for different types of cancer. Considering the length of the manuscript, we restrict ourselves to the discussion of breast cancer, lung cancer, brain cancer, and skin cancer. The purpose of this bibliographic review is to provide researchers who opt to implement deep learning and artificial neural networks for cancer diagnosis with a from-scratch account of the state-of-the-art achievements.
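The evaluation criteria listed in the abstract above all reduce to simple formulas over confusion-matrix counts. As an illustrative sketch (not code from the reviewed paper), the following computes them for binary labels:

```python
import numpy as np

def diagnostic_metrics(y_true, y_pred):
    """Evaluation criteria for a binary diagnostic classifier.
    TP/TN/FP/FN counts drive sensitivity, specificity, precision, F1,
    and the Dice/Jaccard overlap indices (here applied to label vectors)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    sensitivity = tp / (tp + fn)        # a.k.a. recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / len(y_true)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    dice = 2 * tp / (2 * tp + fp + fn)  # equals F1 for binary label vectors
    jaccard = tp / (tp + fp + fn)
    return dict(accuracy=accuracy, sensitivity=sensitivity,
                specificity=specificity, precision=precision,
                f1=f1, dice=dice, jaccard=jaccard)

# Toy ground truth vs. predictions for six cases
m = diagnostic_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 0, 1, 1])
```

AUC is the exception: it is computed from ranked scores over all thresholds rather than from a single confusion matrix.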
Deep Learning-Based Object Detection in Maritime Unmanned Aerial Vehicle Imagery: Review and Experimental Comparisons
With the advancement of maritime unmanned aerial vehicles (UAVs) and deep learning technologies, the application of UAV-based object detection has become increasingly significant in the fields of the maritime industry and ocean engineering. Endowed with intelligent sensing capabilities, maritime UAVs enable effective and efficient maritime surveillance. To further promote the development of maritime UAV-based object detection, this paper provides a comprehensive review of challenges, related methods, and UAV aerial datasets. Specifically, we first briefly summarize four challenges for object detection on maritime UAVs, i.e., object feature diversity, device limitations, maritime environment variability, and dataset scarcity. We then focus on computational methods to improve maritime UAV-based object detection performance in terms of scale-aware methods, small object detection, view-aware methods, rotated object detection, lightweight methods, and others. Next, we review UAV aerial image/video datasets and propose a maritime UAV aerial dataset named MS2ship for ship detection. Furthermore, we conduct a series of experiments to present the performance evaluation and robustness analysis of object detection methods on maritime datasets. Finally, we discuss and give an outlook on future work for maritime UAV-based object detection. The MS2ship dataset is available at \href{https://github.com/zcj234/MS2ship}{https://github.com/zcj234/MS2ship}.
Comment: 32 pages, 18 figures
A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community
In recent years, deep learning (DL), a re-branding of neural networks (NNs), has risen to the top in numerous areas, namely computer vision (CV), speech recognition, natural language processing, etc. Whereas remote sensing (RS) possesses a number of unique challenges, primarily related to sensors and applications, RS inevitably draws from many of the same theories as CV, e.g., statistics, fusion, and machine learning, to name a few. This means that the RS community should be aware of, if not at the leading edge of, advancements like DL. Herein, we provide the most comprehensive survey of state-of-the-art RS DL research. We also review recent new developments in the DL field that can be used in DL for RS. Namely, we focus on theories, tools and challenges for the RS community. Specifically, we focus on unsolved challenges and opportunities as they relate to (i) inadequate data sets, (ii) human-understandable solutions for modelling physical phenomena, (iii) Big Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and learning algorithms for spectral, spatial and temporal data, (vi) transfer learning, (vii) an improved theoretical understanding of DL systems, (viii) high barriers to entry, and (ix) training and optimizing the DL.
Comment: 64 pages, 411 references. To appear in Journal of Applied Remote Sensing
Intelligent Transportation Related Complex Systems and Sensors
Building around innovative services related to different modes of transport and traffic management, intelligent transport systems (ITS) are being widely adopted worldwide to improve the efficiency and safety of the transportation system. They enable users to be better informed and to make safer, more coordinated, and smarter decisions on the use of transport networks. Current ITSs are complex systems, made up of several components/sub-systems characterized by time-dependent interactions among themselves. Examples of these transportation-related complex systems include road traffic sensors, autonomous/automated cars, smart cities, smart sensors, virtual sensors, traffic control systems, smart roads, logistics systems, smart mobility systems, and many others emerging from niche areas. The efficient operation of these complex systems requires: i) efficient solutions to the issues of the sensors/actuators used to capture and control the physical parameters of these systems, as well as to the quality of the data collected from them; ii) tackling complexity using simulation and analytical modelling techniques; and iii) applying optimization techniques to improve the performance of these systems. This collection includes twenty-four papers, which cover scientific concepts, frameworks, architectures and various other ideas on analytics, trends and applications of transportation-related data.
Deep learning with multiple modalities : making the most out of available data
Deep learning, a sub-domain of machine learning, is known to require a very large amount of data to achieve satisfactory generalization performance. Another current limitation of machine learning systems is the need to have access to the same type of data during the training phase of the model as during its testing phase. In many cases, this makes it impossible to exploit additional-modality data that could bring extra information to the system and improve it. In this thesis, several training methods are proposed to take advantage of additional modalities available in datasets only during training and not during testing. We are particularly interested in reducing the noise present in images. The thesis begins with the simplest technique, denoising before a task, to increase the system's ability to perform that task. Then, two more advanced techniques are presented, which propose guided denoising to increase the performance of a subsequent task. Finally, we conclude this thesis by presenting a technique called Input Dropout, which makes it very easy to use a modality available only at training time to increase the performance of a system, across a multitude of varied computer vision tasks.
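The Input Dropout idea can be sketched in a few lines. This is a simplified illustration under stated assumptions (the extra modality is concatenated as an input channel, zeroed with probability `p_drop` during training, and replaced by zeros at test time), not the authors' exact code:

```python
import numpy as np

def input_dropout(rgb, extra, p_drop=0.5, training=True, rng=None):
    """Input Dropout (sketch): concatenate a training-only modality
    (e.g. a depth map) to the RGB input, and randomly zero it during
    training so the network learns to cope with its absence at test time."""
    rng = rng or np.random.default_rng()
    if extra is None or not training:
        # Test time: the extra modality is unavailable, feed zeros instead
        extra = np.zeros(rgb.shape[:2] + (1,))
    elif rng.random() < p_drop:
        # Training time: drop the extra channel with probability p_drop
        extra = np.zeros_like(extra)
    return np.concatenate([rgb, extra], axis=-1)

rgb = np.ones((4, 4, 3))          # toy RGB image
depth = np.full((4, 4, 1), 2.0)   # toy training-only modality
x_train = input_dropout(rgb, depth, p_drop=0.0, rng=np.random.default_rng(0))
x_test = input_dropout(rgb, None, training=False)
```

Because the network sees the zeroed channel regularly during training, the same weights remain usable when the modality is missing entirely at test time.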
LiDAR-based Weather Detection: Automotive LiDAR Sensors in Adverse Weather Conditions
Technological improvements are increasing the degree of vehicle automation. The natural next step is to support the driver where support is most wanted: in bad weather. Weather affects all sensors used to perceive the environment, so it is crucial to account for and mitigate these effects.
This dissertation focuses on the emerging technology of automotive Light Detection and Ranging (LiDAR) sensors and contributes to the development of autonomous vehicles capable of driving under various weather conditions.
The foundation is the first LiDAR point cloud dataset focused on adverse weather, which contains point-wise annotated weather information and was recorded under controlled weather conditions. This dataset is extended by a novel weather augmentation in order to generate realistic weather effects.
A novel approach to classifying the weather state and the first CNN-based denoising algorithm are developed. The result is an accurate prediction of the weather status and an improvement in point cloud quality.
Controlled environments under different weather conditions enable the evaluation of the above approaches and provide valuable insights for automated and autonomous driving.
Retinal image quality assessment using deep convolutional neural networks
Master's dissertation in Biomedical Engineering (specialization in Medical Informatics). Diabetic retinopathy (DR) and diabetic macular edema (DME) are forms of damage to the retina and are complications that can affect the diabetic population. Diabetic retinopathy is the most common of these diseases, characterized by the presence of exudates, and has three levels of severity (mild, moderate and severe), depending on the distribution of exudates in the retina. For diabetic retinopathy screening or a population-based clinical study, a large number of digital fundus images are captured, and for the signs of DR and DME to be recognizable the images must be of sufficient quality: low-quality images may force the patient to return for a second examination, wasting time and possibly delaying treatment.
These images are evaluated by trained human experts, which can be a time-consuming and expensive task given the number of images that need to be examined. Therefore, this is a field that would benefit hugely from the development of automated eye fundus quality assessment and analysis systems. Such systems could facilitate health care in remote regions and in developing countries where image-reading expertise is scarce. Deep Learning is a kind of Machine Learning method that involves learning multi-level representations, beginning with raw data input and gradually moving to more abstract levels through non-linear transformations. With enough training data and sufficiently deep architectures, neural networks, such as Convolutional Neural Networks (CNNs), can learn very complex functions and discover complex structures in the data. Thus, Deep Learning emerges as a powerful tool for medical image analysis and for the evaluation of retinal image quality in computer-aided diagnosis.
Therefore, the aim of this study is to automatically assess each of the three quality parameters (focus, illumination and color) on its own, and then the overall quality of the fundus images, classifying the images into the classes “accept” or “reject” with a Deep Learning approach using convolutional neural networks (CNNs). For the overall classification, the following results were obtained: test accuracy = 97.89%, SN = 97.9%, AUC = 0.98 and F1-score = 97.91%.
Sea surface wind and wave parameter estimation from X-band marine radar images with rain detection and mitigation
In this research, the application of X-band marine radar backscatter images to sea surface wind and wave parameter estimation with rain detection and mitigation is investigated. In the presence of rain, the rain echoes in the radar image blur the wave signatures and negatively affect estimation accuracy. Hence, in order to improve estimation accuracy, it is meaningful to detect the presence of those rain echoes and mitigate their influence on the estimation results. Since rain alters the radar backscatter intensity distribution, features are extracted from the normalized histogram of each radar image. Then, a support vector machine (SVM)-based rain detection model is proposed to classify radar images obtained under rainless and rainy conditions. The classification accuracy shows significant improvement compared to the existing threshold-based method. By further observing images obtained under rainy conditions, it is found that many of them are only partially contaminated by rain echoes. Therefore, in order to segment rain-contaminated regions from those that are less affected or unaffected by rain, two types of methods are developed, based on unsupervised learning techniques and a convolutional neural network (CNN), respectively. Specifically, for the unsupervised learning-based method, texture features are first extracted from each pixel and then trained using a self-organizing map (SOM)-based clustering model, which is able to conduct pixel-based identification of rain-contaminated regions. As for the CNN-based method, a SegNet-based semantic segmentation CNN is first designed and then trained using images with manually annotated labels. Both shipborne and shore-based marine radar data are used to train and validate the proposed methods, and high classification accuracies of around 90% are obtained.
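The histogram-feature rain-detection pipeline described above can be sketched as below. To keep the example dependency-free, a nearest-centroid classifier stands in for the thesis's SVM (in practice one would use, e.g., a trained SVM over the same features), and the "radar images" are synthetic; the specific features are assumptions:

```python
import numpy as np

def histogram_features(image, bins=32):
    # Normalized intensity histogram plus simple statistics; rain broadens
    # the backscatter distribution, which these features are meant to capture
    hist, _ = np.histogram(image, bins=bins, range=(0.0, 1.0))
    hist = hist / hist.sum()
    entropy = -np.sum(hist[hist > 0] * np.log2(hist[hist > 0]))
    return np.array([image.mean(), image.std(), entropy])

class NearestCentroid:
    # Stand-in for the SVM classifier: any two-class model over the
    # same feature vectors follows this fit/predict shape
    def fit(self, X, y):
        self.c0 = X[y == 0].mean(axis=0)
        self.c1 = X[y == 1].mean(axis=0)
        return self
    def predict(self, X):
        d0 = np.linalg.norm(X - self.c0, axis=1)
        d1 = np.linalg.norm(X - self.c1, axis=1)
        return (d1 < d0).astype(int)

rng = np.random.default_rng(0)
# Synthetic "rainless" (narrow) vs "rainy" (broad) backscatter images
rainless = [rng.normal(0.3, 0.05, (64, 64)).clip(0, 1) for _ in range(20)]
rainy = [rng.normal(0.5, 0.25, (64, 64)).clip(0, 1) for _ in range(20)]
X = np.array([histogram_features(im) for im in rainless + rainy])
y = np.array([0] * 20 + [1] * 20)
clf = NearestCentroid().fit(X, y)
acc = (clf.predict(X) == y).mean()
```

The point of the sketch is the feature design: rain flattens and widens the intensity histogram, so mean, spread and entropy already separate the two conditions well.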
Due to the similarities between how haze affects terrestrial images and how rain affects marine radar images, a type of CNN designed for image dehazing, i.e., DehazeNet, is applied to rain-contaminated regions in radar images to correct the influence of rain, which reduces the estimation error of wind direction significantly. In addition, after extracting histogram and texture features from rain-corrected radar images, a support vector regression (SVR)-based model, which achieves high estimation accuracy, is trained for wind speed estimation. Finally, a convolutional gated recurrent unit (CGRU) network is designed and trained for significant wave height (SWH) estimation. As an end-to-end system, the proposed network is able to generate estimation results directly from radar image sequences by automatically extracting multi-scale spatial and temporal features from those sequences. Compared to the classic signal-to-noise ratio (SNR)-based method, the CGRU-based model shows significant improvement in both estimation accuracy (under both rainless and rainy conditions) and computational efficiency.
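The wind-speed regression step can likewise be sketched. Here a closed-form ridge regression stands in for the SVR model of the thesis, and both the feature vectors and their linear relation to wind speed are synthetic assumptions for illustration only:

```python
import numpy as np

def fit_ridge(X, y, lam=1e-3):
    # Stand-in for the SVR regressor: a ridge regression over the same
    # kind of histogram/texture features, kept dependency-free
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    w = np.linalg.solve(Xb.T @ Xb + lam * np.eye(Xb.shape[1]), Xb.T @ y)
    return w

def predict(w, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return Xb @ w

rng = np.random.default_rng(1)
# Synthetic features (e.g. histogram mean/std) linearly tied to wind speed
X = rng.uniform(0, 1, (50, 2))
y = 3.0 * X[:, 0] + 5.0 * X[:, 1] + 2.0
w = fit_ridge(X, y)
err = np.max(np.abs(predict(w, X) - y))
```

Any regressor with the same fit/predict contract, the SVR included, slots into this pipeline after the rain-correction and feature-extraction stages.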