
    Geographic information extraction from texts

    A large volume of unstructured text containing valuable geographic information is available online. This information, provided implicitly or explicitly, is useful not only for scientific studies (e.g., spatial humanities) but also for many practical applications (e.g., geographic information retrieval). Although considerable progress has been made in geographic information extraction from texts, unsolved challenges and issues remain, ranging from methods, systems, and data to applications and privacy. This workshop therefore provides a timely opportunity to discuss recent advances, new ideas, and concepts, and to identify research gaps in geographic information extraction.

    A generic Self-Supervised Learning (SSL) framework for representation learning from spectral–spatial features of unlabeled remote sensing imagery

    Remote sensing data have been widely used for various Earth Observation (EO) missions such as land use and cover classification, weather forecasting, agricultural management, and environmental monitoring. Most existing remote-sensing models rely on supervised learning, which requires large, representative, human-labeled datasets for training; producing these is costly and time-consuming. The recent introduction of self-supervised learning (SSL) enables models to learn representations from orders of magnitude more unlabeled data. The success of SSL depends heavily on a pre-designed pretext task, which introduces an inductive bias into the model from a large amount of unlabeled data. Since remote sensing imagery has rich spectral information beyond the standard RGB color space, the pretext tasks established in computer vision for RGB images may not extend straightforwardly to the multi/hyperspectral domain. To address this challenge, this work proposes a generic self-supervised learning framework for remote sensing data at both the object and pixel levels. The framework contains two novel pretext tasks, one for object-based and one for pixel-based remote sensing analysis. The first reconstructs the spectral profile from masked data; it can be used to extract a representation of pixel information and improve the performance of downstream tasks associated with pixel-based analysis. The second identifies objects from multiple views of the same object in multispectral data; it can be used to extract a representation and improve the performance of downstream tasks associated with object-based analysis. Results on two typical downstream tasks (multilabel land cover classification on Sentinel-2 multispectral datasets and ground soil parameter retrieval on hyperspectral datasets) demonstrate that the proposed SSL method learns a representation covering both spatial and spectral information from massive unlabeled data. A comparison with currently available SSL methods shows that the proposed method, which emphasizes both spectral and spatial features, outperforms existing SSL methods on multi- and hyperspectral remote sensing datasets. We believe this approach has the potential to be effective in a wider range of remote sensing applications, and we will explore its utility in further remote sensing applications in the future.
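    As a rough illustration of the pixel-level pretext task described above, the sketch below masks part of each pixel's spectral profile and trains a small network to reconstruct it. This is a minimal PyTorch sketch under assumed shapes; SpectralAutoencoder, pretext_step, and the band count are illustrative names and values, not the paper's implementation.

```python
import torch
import torch.nn as nn

class SpectralAutoencoder(nn.Module):
    """Toy encoder/decoder over a single pixel's spectral profile."""
    def __init__(self, n_bands, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_bands, hidden), nn.ReLU())
        self.decoder = nn.Linear(hidden, n_bands)

    def forward(self, x):
        return self.decoder(self.encoder(x))

def pretext_step(model, spectra, mask_ratio=0.3):
    # Randomly mask a fraction of the bands, then train the model to
    # reconstruct the full spectral profile from the masked input.
    mask = torch.rand_like(spectra) < mask_ratio
    masked = spectra.masked_fill(mask, 0.0)
    recon = model(masked)
    # The reconstruction loss is measured only on the masked bands.
    return nn.functional.mse_loss(recon[mask], spectra[mask])

model = SpectralAutoencoder(n_bands=13)   # stand-in band count
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
batch = torch.rand(32, 13)                # stand-in unlabeled pixel spectra
loss = pretext_step(model, batch)
loss.backward()
optimizer.step()
```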

    Machine Learning Methods with Noisy, Incomplete or Small Datasets

    In many machine learning applications, available datasets are incomplete, noisy, or affected by artifacts. In supervised scenarios, label information may be of low quality, with problems including unbalanced training sets and noisy labels. Moreover, in practice, the available data samples are often not enough to derive useful supervised or unsupervised classifiers. These issues are commonly referred to as the low-quality data problem. This book collects novel contributions on machine learning methods for low-quality datasets, aiming to disseminate new ideas for solving this challenging problem and to provide clear examples of application in real scenarios.

    RSVD-CUR Decomposition for Matrix Triplets

    We propose a restricted-SVD-based CUR (RSVD-CUR) decomposition for matrix triplets (A, B, G). Given matrices A, B, and G of compatible dimensions, such a decomposition provides a coordinated low-rank approximation of the three matrices using a subset of their rows and columns. We pick the subset of rows and columns of the original matrices by applying either the discrete empirical interpolation method (DEIM) or the L-DEIM scheme to the orthogonal and nonsingular matrices from the restricted singular value decomposition of the matrix triplet. We investigate the connections between a DEIM-type RSVD-CUR approximation and both a DEIM-type CUR factorization and a DEIM-type generalized CUR decomposition. We provide an error analysis showing that the accuracy of the proposed RSVD-CUR decomposition is within a factor of the approximation error of the restricted singular value decomposition of the given matrices. An RSVD-CUR factorization may be suitable for applications where we are interested in approximating one data matrix relative to two other given matrices. Two applications that we discuss are multi-view/label dimension reduction, and data perturbation problems of the form A_E = A + BFG, where BFG is a nonwhite noise matrix. In numerical experiments, we show the advantages of the new method over the standard CUR approximation for these applications.
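    For readers unfamiliar with DEIM, the sketch below shows the greedy index-selection step and a generic CUR assembly built from it. It uses a plain SVD basis for illustration; the paper applies (L-)DEIM to the factors of the restricted SVD of the triplet, and the middle-factor construction here is a common generic choice, not the paper's exact scheme.

```python
import numpy as np

def deim(U):
    """Select k interpolation indices from a basis U with k orthonormal columns."""
    n, k = U.shape
    p = [int(np.argmax(np.abs(U[:, 0])))]
    for j in range(1, k):
        # Interpolate the j-th basis vector at the indices chosen so far,
        # then pick the index of the largest residual entry.
        c = np.linalg.solve(U[np.ix_(p, range(j))], U[p, j])
        r = U[:, j] - U[:, :j] @ c
        p.append(int(np.argmax(np.abs(r))))
    return np.array(p)

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 80))
k = 10
U, _, Vh = np.linalg.svd(A, full_matrices=False)
rows, cols = deim(U[:, :k]), deim(Vh[:k, :].T)   # row/column selection
C, R = A[:, cols], A[rows, :]
M = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)    # generic middle factor
print(np.linalg.norm(A - C @ M @ R) / np.linalg.norm(A))
```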

    Automatic diagnosis of melanoma using modern machine learning techniques

    The incidence and mortality rates of skin cancer remain a huge concern in many countries. According to the latest statistics on melanoma skin cancer, in the United States alone, 7,650 deaths are expected in 2022, which represents 800 and 470 more deaths than in 2020 and 2021, respectively. In 2022, melanoma ranks as the fifth most common cause of new cancer cases, with a total of 99,780 people affected. This illness is mainly diagnosed by visual inspection of the skin; then, if doubts remain, a dermoscopic analysis is performed. The development of effective non-invasive diagnostic tools for the early stages of the illness should increase quality of life and decrease the required economic resources. The early diagnosis of skin lesions remains a tough task even for expert dermatologists because of the complexity, variability, and dubiousness of the symptoms, and the similarities between the different categories of skin lesions. Previous works have shown that early diagnosis from skin images can benefit greatly from computational methods. Several studies have applied handcrafted-feature-based methods to high-quality dermoscopic and histological images, combined with machine learning techniques such as the k-nearest neighbors approach, support vector machines, and random forests. However, although the prior extraction of handcrafted features incorporates an important knowledge base into the analysis, the quality of the extracted descriptors relies heavily on the contribution of experts, and lesion segmentation is also performed manually. These procedures share a common issue: they are time-consuming manual processes prone to errors. Furthermore, an explicit definition of an intuitive and interpretable feature is hardly achievable, since such features depend on the pixel intensity space and are therefore not invariant to differences between input images. On the other hand, the use of mobile devices has sharply increased, offering an almost unlimited source of data. In the past few years, more and more attention has been paid to designing deep learning models for diagnosing melanoma, more specifically Convolutional Neural Networks. This type of model is able to extract and learn high-level features from raw images and/or other data without the intervention of experts. Several studies showed that deep learning models can outperform handcrafted-feature-based methods, and even match the predictive performance of dermatologists. The International Skin Imaging Collaboration encourages the development of methods for digital skin imaging. Every year from 2016 to 2019, a challenge and a conference have been organized, in which more than 185 teams have participated. However, convolutional models present several issues for skin diagnosis. These models can fit a wide diversity of non-linear data points, being prone to overfitting on datasets with small numbers of training examples per class and, therefore, attaining poor generalization capacity. Moreover, this type of model is sensitive to certain characteristics of the data, such as large inter-class similarities and intra-class variances, variations in viewpoint, changes in lighting conditions, occlusions, and background clutter, which are mostly found in non-dermoscopic images. These issues represent challenges for the application of automatic diagnosis techniques in the early phases of the illness. As a consequence of the above, the aim of this Ph.D.
thesis is to make significant contributions to the automatic diagnosis of melanoma. The proposals aim to avoid overfitting and improve the generalization capacity of deep models, as well as to achieve more stable learning and better convergence. Bear in mind that research into deep learning commonly requires overwhelming processing power to train complex architectures. For example, when developing the NASNet architecture, researchers used 500 NVidia P100s; each graphics unit cost from $5,899 to $7,374, which represents a total of $2,949,500 to $3,687,000. Unfortunately, the majority of research groups, including ours, do not have access to such resources. In this Ph.D. thesis, the use of several techniques has been explored. First, an extensive experimental study was carried out, which included state-of-the-art models and methods to further increase performance. Well-known techniques were applied, such as data augmentation and transfer learning. Data augmentation is performed in order to balance the number of instances per category and to act as a regularizer that prevents overfitting in neural networks. Transfer learning uses the weights of a model pre-trained on another task as the initial condition for the learning of the target network. Results demonstrate that the automatic diagnosis of melanoma is a complex task, but different techniques are able to mitigate such issues to some degree. Finally, suggestions are given about how to train convolutional models for melanoma diagnosis, and interesting future research lines are presented. Next, the discovery of ensemble-based architectures is tackled using genetic algorithms. The proposal is able to stabilize the training process by finding sub-optimal combinations of abstract features from the ensemble, which are used to train a convolutional block. Several predictive blocks are then trained at the same time, and the final diagnosis is achieved by combining all individual predictions. We empirically investigate the benefits of the proposal, which shows better convergence, mitigates model overfitting, and improves generalization performance. On top of that, the proposed model is available online and can be consulted by experts. The next proposal focuses on designing an advanced architecture capable of fusing classical convolutional blocks with a novel model known as Dynamic Routing Between Capsules. This approach addresses the limitations of convolutional blocks by using a set of neurons, instead of an individual neuron, to represent objects. Each capsule learns an implicit description of an object, such as its position, size, texture, deformation, and orientation. In addition, a hyper-tuning of the main parameters is carried out to ensure effective learning under limited training data. An extensive experimental study was conducted in which the fusion of both methods outperformed six state-of-the-art models. Additionally, a robust method for melanoma diagnosis, inspired by residual connections and Generative Adversarial Networks, is proposed. The architecture is able to produce plausible, photorealistic synthetic 512 x 512 skin images, even with small dermoscopic and non-dermoscopic skin image datasets as problem domains. In this manner, the lack of data, the imbalance problems, and the overfitting issues are tackled.
Finally, several convolutional models are extensively trained and evaluated using the synthetic images, illustrating their effectiveness in the diagnosis of melanoma. In addition, a framework inspired by Active Learning is proposed. The batch-based query strategy proposed in this work enables a faster training process by learning about the complexity of the data. Such complexities allow the training process to be adjusted after each epoch, which leads the model to achieve better performance in fewer iterations compared to random mini-batch sampling. The training method is then assessed by analyzing both the informativeness value of each image and the predictive performance of the models. An extensive experimental study is conducted, in which models trained with the proposal attain significantly better results than the baseline models. The findings suggest that there is still room for improvement in the diagnosis of skin lesions. Structured laboratory data, unstructured narrative data, and, in some cases, audio or observational data are given by radiologists as key points during the interpretation of a prediction. This is particularly true in the diagnosis of melanoma, where substantial clinical context is often essential; for example, symptoms like itching, or several shots of a skin lesion over a period of time showing that the lesion is growing, are very likely to suggest cancer. The use of different types of input data could help improve the performance of medical predictive models. In this regard, a first evolutionary algorithm aimed at exploring multimodal multiclass data has been proposed, which surpassed a single-input model. Furthermore, the predictive features extracted by primary capsules could be used to train other models, such as Support Vector Machines.
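    As a hedged illustration of the data augmentation and transfer learning setup mentioned above, the sketch below re-heads an ImageNet-pretrained CNN for a two-class melanoma task. The backbone choice, augmentations, dataset path, and class count are assumptions for illustration, not the thesis's exact configuration.

```python
import torch.nn as nn
from torchvision import models, transforms

# Data augmentation acts as a regularizer and helps balance classes.
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(20),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.ToTensor(),
])
# e.g. datasets.ImageFolder("path/to/train", transform=augment)  # hypothetical path

# Transfer learning: start from ImageNet weights, freeze the backbone,
# and train only a new classification head for the target task.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
for p in model.parameters():
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 2)  # benign vs. melanoma head
```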

    Application of Artificial Intelligence Approaches in the Flood Management Process for Assessing Blockage at Cross-Drainage Hydraulic Structures

    Floods are the most recurrent, widespread, and damaging natural disasters, and are expected to become even more devastating because of global warming. Blockage of cross-drainage hydraulic structures (e.g., culverts, bridges) by flood-borne debris is an influential factor which usually results in reduced hydraulic capacity, diverted flows, damaged structures, and downstream scouring. Australia is among the countries adversely impacted by blockage issues (e.g., the 1998 floods in Wollongong and the 2007 floods in Newcastle). In this context, Wollongong City Council (WCC), under Australian Rainfall and Runoff (ARR), investigated the impact of blockage on floods and, for the first time, proposed guidelines to consider blockage in the design process. However, the existing WCC guidelines are based on various assumptions (i.e., visual inspections as representative of hydraulic behaviour, post-flood blockage as representative of peak floods, and blockage remaining constant during the whole flooding event) that are not supported by scientific research and have been criticised by hydraulic design engineers. This suggests the need to perform detailed investigations of blockage from both visual and hydraulic perspectives, in order to develop quantifiable relationships and incorporate blockage into the design guidelines of hydraulic structures. However, because of the complex nature of blockage as a process and the lack of blockage-related data from actual floods, conventional numerical-modelling-based approaches have not achieved much success. The research in this thesis applies artificial intelligence (AI) approaches to assess blockage at cross-drainage hydraulic structures, motivated by the recent success of AI in addressing complex real-world problems (e.g., scour depth estimation and flood inundation monitoring). The research has been carried out in three phases: (a) literature review, (b) hydraulic blockage assessment, and (c) visual blockage assessment. The first phase investigates the use of computer vision in the flood management domain and provides context for blockage. The second phase investigates hydraulic blockage using lab-scale experiments and the implementation of multiple machine learning approaches on datasets collected from the lab experiments (i.e., the Hydraulics-Lab Dataset (HD) and the Visual Hydraulics-Lab Dataset (VHD)). The artificial neural network (ANN) and end-to-end deep learning approaches were the top performers among those implemented, demonstrating the potential of learning-based approaches in addressing blockage issues. The third phase assesses visual blockage at culverts using deep learning classification, detection, and segmentation approaches for two types of visual assessment (i.e., blockage status classification and percentage visual blockage estimation). First, a range of existing convolutional neural network (CNN) image classification models are implemented and compared using visual datasets (i.e., Images of Culvert Openings and Blockage (ICOB), VHD, and Synthetic Images of Culverts (SIC)), with the aim of automating the manual visual blockage classification of culverts. The Neural Architecture Search Network (NASNet) model achieved the best classification results among those implemented. Furthermore, the study identified background noise and simplified labelling criteria as two contributing factors in the degraded performance of existing CNN models for blockage classification.
To address the background clutter issue, a detection-classification pipeline is proposed, which achieved improved visual blockage classification performance. The proposed pipeline has been deployed on edge computing hardware for blockage monitoring of actual culverts. The role of synthetic data (i.e., SIC) in the performance of culvert opening detection is also investigated. Second, an automated segmentation-classification deep learning pipeline is proposed to estimate the percentage of visual blockage at circular culverts and thus better prioritise culvert maintenance. The AI solutions proposed in this thesis are integrated into a blockage assessment framework, designed to be deployed through edge computing to monitor, record, and assess blockage at cross-drainage hydraulic structures.
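    A minimal sketch of the detection-then-classification idea described above: detect the culvert opening, crop to the best detection to discard background clutter, and classify blockage on the crop. Here `detector` and `classifier` are hypothetical stand-ins for trained models, and the two-class output is an assumption; the thesis's actual models and thresholds are not specified here.

```python
# `image` is assumed to be a CHW torch.Tensor; `detector` returns
# bounding boxes as [(x1, y1, x2, y2, score), ...] (an assumed format).
def assess_blockage(image, detector, classifier):
    boxes = detector(image)
    if not boxes:
        return "no culvert opening detected"
    # Keep only the highest-scoring detection of the culvert opening.
    x1, y1, x2, y2, _ = max(boxes, key=lambda b: b[-1])
    crop = image[:, y1:y2, x1:x2]          # crop away background clutter
    logits = classifier(crop.unsqueeze(0))  # add batch dimension
    return "blocked" if logits.argmax(dim=1).item() == 1 else "clear"
```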

    Coarse-to-Fine Contrastive Learning on Graphs

    Inspired by the impressive success of contrastive learning (CL), a variety of graph augmentation strategies have been employed to learn node representations in a self-supervised manner. Existing methods construct contrastive samples by adding perturbations to the graph structure or node attributes. Although impressive results are achieved, these methods are rather blind to the wealth of prior information the augmentations imply: as the degree of perturbation applied to the original graph increases, 1) the similarity between the original graph and the generated augmented view gradually decreases, and 2) the discrimination among the nodes within each augmented view gradually increases. In this paper, we argue that both kinds of prior information can be incorporated (in different ways) into the contrastive learning paradigm following our general ranking framework. In particular, we first interpret CL as a special case of learning to rank (L2R), which inspires us to leverage the ranking order among positive augmented views. Meanwhile, we introduce a self-ranking paradigm to ensure that the discriminative information among different nodes is maintained and is less affected by perturbations of different degrees. Experimental results on various benchmark datasets verify the effectiveness of our algorithm compared with supervised and unsupervised models.
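    The ranking intuition can be made concrete with a margin-ranking penalty over views ordered by perturbation strength: an anchor embedding should be more similar to a weakly perturbed view than to a strongly perturbed one. The sketch below is a generic illustration of that ordering constraint, not the paper's exact loss.

```python
import torch.nn.functional as F

def ranking_cl_loss(anchor, views, margin=0.1):
    """anchor: (n, d) node embeddings; views: list of (n, d) embeddings
    ordered from weakest to strongest perturbation."""
    sims = [F.cosine_similarity(anchor, v, dim=-1) for v in views]
    loss = 0.0
    for weak, strong in zip(sims, sims[1:]):
        # Penalize violations of the ordering sim(weak) >= sim(strong) + margin,
        # so more heavily perturbed views rank further from the anchor.
        loss = loss + F.relu(strong - weak + margin).mean()
    return loss
```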

    Interpreting intermediate feature representations of raw-waveform deep CNNs by sonification

    Most recent works addressing the interpretability of raw-waveform deep neural networks (DNNs) for audio processing focus on interpreting spectral and frequency-response information, often limited to visual and signal-theoretic means of interpretation, and solely for the first layer. This work proposes sonification, a method to interpret the intermediate feature representations of sound event recognition (SER) 1D-convolutional neural networks (1D-CNNs) trained on raw waveforms by mapping these representations back into the discrete-time input signal domain, highlighting the substructures in the input that maximally activate a feature map as intelligible acoustic events. Sonification is used to compare supervised and contrastive self-supervised feature representations, showing that the latter learn more acoustically discernible representations, especially in the deeper layers. A metric to quantify acoustic similarity between the interpretations and their corresponding inputs is proposed, and a layer-by-layer analysis of the trained feature representations using this metric supports these observations.
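    A rough sketch of the underlying idea: locate the frames where a 1D-CNN feature map activates most strongly, map them back to sample ranges through the layer's stride and receptive field, and keep only those input segments for listening. The receptive-field arithmetic here is simplified to a single stride/kernel pair, and the function names are illustrative, not the paper's implementation.

```python
import numpy as np

def activation_segments(feature_map, stride, receptive_field, top_k=3):
    """feature_map: 1D activation array over time frames."""
    frames = np.argsort(feature_map)[-top_k:]          # strongest frames
    return [(int(f * stride), int(f * stride + receptive_field))
            for f in frames]                            # sample ranges

def sonify(waveform, segments):
    # Zero everything except the maximally-activating substructures,
    # yielding an audible interpretation of the feature map.
    out = np.zeros_like(waveform)
    for start, end in segments:
        out[start:end] = waveform[start:end]
    return out
```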

    Alzheimer’s Dementia Recognition Through Spontaneous Speech
