    Fast Shadow Detection from a Single Image Using a Patched Convolutional Neural Network

    In recent years, various methods for detecting shadows from a single image have been proposed and used in vision systems; however, most of them are not appropriate for robotic applications because of their expensive time complexity. This paper introduces a fast shadow detection method using a deep learning framework, with a time cost appropriate for robotic applications. In our solution, we first obtain a shadow prior map with the help of a multi-class support vector machine using statistical features. Then, we use a semantic-aware patch-level Convolutional Neural Network that trains efficiently on shadow examples by combining the original image and the shadow prior map. Experiments on benchmark datasets demonstrate that the proposed method significantly decreases the time complexity of shadow detection, by one to two orders of magnitude compared with state-of-the-art methods, without losing accuracy.
    Comment: 6 pages, 5 figures, Submitted to IROS 201
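
    The abstract gives no implementation details, but a minimal sketch of the two-stage idea it describes, an SVM-derived shadow prior stacked with the image and fed to a small patch-level CNN, might look as follows. All names, features, and architecture choices are illustrative assumptions, not the authors' implementation, and the prior is reduced to a binary shadow/non-shadow case for brevity.

```python
# Illustrative sketch only: an SVM prior map plus a patch-level CNN,
# mirroring the pipeline described in the abstract. Feature choice,
# architecture, and all names are assumptions, not the authors' code.
import numpy as np
import torch
import torch.nn as nn
from sklearn.svm import SVC

def shadow_prior(image, svm):
    """Per-pixel shadow probability from simple per-pixel features."""
    h, w, _ = image.shape
    feats = image.reshape(-1, 3)          # stand-in statistical features
    return svm.predict_proba(feats)[:, 1].reshape(h, w)

class PatchShadowCNN(nn.Module):
    """Tiny patch classifier over RGB + prior (4 input channels)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(4, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 2))             # shadow / non-shadow logits

    def forward(self, x):
        return self.net(x)

# Usage with random stand-in data:
svm = SVC(probability=True).fit(np.random.rand(100, 3),
                                np.random.randint(0, 2, 100))
img = np.random.rand(64, 64, 3).astype(np.float32)
x = np.concatenate([img, shadow_prior(img, svm)[..., None]], axis=-1)
patch = torch.from_numpy(x[:32, :32].transpose(2, 0, 1).copy())[None]
logits = PatchShadowCNN()(patch.float())  # -> shape (1, 2)
```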

    Phase Transitions in Semidefinite Relaxations

    Statistical inference problems arising within signal processing, data mining, and machine learning naturally give rise to hard combinatorial optimization problems. These problems become intractable when the dimensionality of the data is large, as is often the case for modern datasets. A popular idea is to construct convex relaxations of these combinatorial problems, which can be solved efficiently for large-scale datasets. Semidefinite programming (SDP) relaxations are among the most powerful methods in this family, and are surprisingly well-suited for a broad range of problems where data take the form of matrices or graphs. It has been observed several times that, when the `statistical noise' is small enough, SDP relaxations correctly detect the underlying combinatorial structures. In this paper we develop asymptotic predictions for several `detection thresholds,' as well as for the estimation error above these thresholds. We study some classical SDP relaxations for statistical problems motivated by graph synchronization and community detection in networks. We map these optimization problems to statistical mechanics models with vector spins, and use non-rigorous techniques from statistical mechanics to characterize the corresponding phase transitions. Our results clarify the effectiveness of SDP relaxations in solving high-dimensional statistical problems.
    Comment: 71 pages, 24 pdf figures
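
    As a concrete instance of the class of relaxations studied in this line of work, the following sketch sets up the standard SDP relaxation for Z2 synchronization (recovering labels x in {±1}^n from a noisy rank-one matrix), assuming cvxpy is available; the data model and problem size are illustrative.

```python
# Minimal sketch of the Z2 synchronization SDP relaxation: relax
# max over x in {+-1}^n of x^T Y x to an SDP over X = x x^T.
import numpy as np
import cvxpy as cp

n, snr = 30, 1.5
x = np.random.choice([-1.0, 1.0], size=n)              # hidden labels
W = np.random.randn(n, n); W = (W + W.T) / np.sqrt(2)  # symmetric noise
Y = (snr / n) * np.outer(x, x) + W / np.sqrt(n)        # observed matrix

X = cp.Variable((n, n), PSD=True)                      # X >= 0 (PSD)
prob = cp.Problem(cp.Maximize(cp.trace(Y @ X)),
                  [cp.diag(X) == 1])                   # unit diagonal
prob.solve()

# Round: the top eigenvector of the SDP solution estimates the labels.
vals, vecs = np.linalg.eigh(X.value)
x_hat = np.sign(vecs[:, -1])
overlap = abs(x_hat @ x) / n   # approaches 1 above the detection threshold
print(f"overlap with truth: {overlap:.2f}")
```

    Running this at several values of snr shows the qualitative phase transition the paper characterizes: below a critical noise level the rounded solution correlates with the hidden labels; above it the overlap collapses.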

    Improving estimates and change detection of forest above-ground biomass using statistical methods

    Forests store approximately as much carbon as is in the atmosphere, with the potential to take in or release carbon rapidly based on growth, climate change, and human disturbance. Above-ground biomass (AGB) is the largest carbon pool in most forest systems, and the quickest to change following disturbance. Quantifying AGB on a global scale and reliably mapping how it is changing are therefore required for tackling climate change by targeting and monitoring policies. AGB can be mapped using remote sensing and machine learning methods, but such maps have high uncertainties, and simply subtracting one from another does not give a reliable indication of changes. To improve the quantification of AGB changes, it is necessary to add advanced statistical methodology to existing machine learning and remote sensing methods. This review discusses the areas in which techniques used in statistical research could positively impact AGB quantification. Nine global or continental AGB maps, and a further eight local AGB maps, were investigated in detail to understand the limitations of the techniques currently used. It was found that both the modelling and the validation of maps lacked spatial consideration. Spatial cross-validation, or other sampling methods that specifically account for the spatial nature of these data, are important to introduce into AGB map validation. Modelling techniques that capture this spatial structure should also be used; for example, spatial random effects can be included in various forms of hierarchical statistical models, which can be estimated using frequentist or Bayesian inference. Strategies including hierarchical modelling, Bayesian inference, and simulation methods can also be applied to improve uncertainty estimation. Additionally, visualising these uncertainties using pixelation or contour maps could improve interpretation. Reduced uncertainty, which currently sits commonly between 30% and 40%, is also needed to produce accurate change maps that will benefit policy decisions, policy implementation, and our understanding of the carbon cycle.
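
    As an illustration of the spatial cross-validation the review calls for, the following sketch (assuming scikit-learn, with entirely synthetic stand-in data) groups pixels into spatial blocks so that validation folds are geographically disjoint from training folds; the block size and model are placeholder choices.

```python
# Spatially blocked cross-validation: pixels in the same geographic
# block never appear in both training and validation folds, which
# avoids the optimistic bias of ordinary (random) cross-validation
# on spatially autocorrelated data.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GroupKFold, cross_val_score

rng = np.random.default_rng(0)
n = 2000
coords = rng.uniform(0, 100, size=(n, 2))    # pixel x, y positions (km)
X = rng.normal(size=(n, 5))                  # stand-in remote-sensing bands
agb = X @ rng.normal(size=5) + rng.normal(size=n)  # stand-in AGB (Mg/ha)

# Assign each pixel to a 20 km x 20 km block; folds never split a block.
blocks = ((coords[:, 0] // 20).astype(int) * 100
          + (coords[:, 1] // 20).astype(int))

scores = cross_val_score(RandomForestRegressor(n_estimators=100),
                         X, agb, groups=blocks,
                         cv=GroupKFold(n_splits=5),
                         scoring="neg_root_mean_squared_error")
print("spatially blocked RMSE:", -scores.mean())
```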

    Application of Machine Learning and Statistical Learning Methods for Prediction in a Large-Scale Vegetation Map

    Original analyses of a large vegetation cover dataset from Roosevelt National Forest in northern Colorado were carried out by Blackard (1998) and Blackard and Dean (1998; 2000). They compared the classification accuracies of linear and quadratic discriminant analysis (LDA and QDA) with artificial neural networks (ANN) and obtained an overall classification accuracy of 70.58% for a tuned ANN, compared to 58.38% for LDA and 52.76% for QDA. Because there has been tremendous development of machine learning classification methods over the last 35 years in both computer science and statistics, as well as substantial improvements in the speed of computer hardware, I applied five modern machine learning algorithms to the data to determine whether significant improvements in classification accuracy were possible using one or more of these methods. I found that only a tuned gradient boosting machine had a higher accuracy (71.62%) than the ANN of Blackard and Dean (1998), and the difference in accuracies was only about 1%. Of the other four methods, Random Forests (RF), Support Vector Machines (SVM), Classification Trees (CT), and AdaBoosted trees (ADA), a tuned SVM and RF had accuracies of 67.17% and 67.57%, respectively. The partition of the data by Blackard and Dean (1998) was unusual in that the training and validation datasets had equal representation of the seven vegetation classes, even though 85% of the data fell into classes 1 and 2. For the second part of my analyses I randomly selected 60% of the data for the training data and 20% each for the validation and test data. On this partition of the data a single classification tree achieved an accuracy of 92.63% on the test data, and the accuracy of RF was 83.98%. Unsurprisingly, most of the gains in accuracy were in classes 1 and 2, the largest classes, which also had the highest misclassification rates under the original partition of the data. By decreasing the size of the training data while maintaining the same relative occurrences of the vegetation classes as in the full dataset, I found that even for a training dataset of the same size as that of Blackard and Dean (1998), a single classification tree was more accurate (73.80%) than their ANN (70.58%). The final part of my thesis was to explore the possibility that combining the predictions of several machine learning classifiers could result in higher predictive accuracies. In the analyses I carried out, the answer seems to be that increased accuracies do not occur with a simple voting of the five machine learning classifiers.
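
    The dataset analysed here is distributed with scikit-learn as fetch_covtype, so the 60/20/20 stratified partition and a simple voting combination of classifiers can be sketched directly; the hyperparameters below are untuned placeholders and the data are subsampled for speed, so accuracies will not match the figures reported above.

```python
# Sketch of the thesis design on the same covertype dataset:
# 60/20/20 stratified split, single tree, random forest, and a
# simple hard-voting ensemble.
from sklearn.datasets import fetch_covtype
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = fetch_covtype(return_X_y=True)
X, y = X[:100_000], y[:100_000]   # subsample; the thesis uses all rows

# 60% train, 20% validation, 20% test, preserving class proportions.
X_tr, X_rest, y_tr, y_rest = train_test_split(
    X, y, train_size=0.6, stratify=y, random_state=0)
X_val, X_te, y_val, y_te = train_test_split(
    X_rest, y_rest, test_size=0.5, stratify=y_rest, random_state=0)

tree = DecisionTreeClassifier().fit(X_tr, y_tr)
rf = RandomForestClassifier(n_estimators=100, n_jobs=-1).fit(X_tr, y_tr)
vote = VotingClassifier([("ct", DecisionTreeClassifier()),
                         ("rf", RandomForestClassifier(n_estimators=100))],
                        voting="hard").fit(X_tr, y_tr)

for name, clf in [("tree", tree), ("rf", rf), ("vote", vote)]:
    print(name, clf.score(X_te, y_te))
```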

    Evaluation of Sampling and Cross-Validation Tuning Strategies for Regional-Scale Machine Learning Classification

    High spatial resolution (1–5 m) remotely sensed datasets are increasingly being used to map land cover over large geographic areas using supervised machine learning algorithms. Although many studies have compared machine learning classification methods, sample selection methods for acquiring training and validation data, and cross-validation techniques for tuning classifier parameters, are rarely investigated, particularly on large, high spatial resolution datasets. This work therefore examines four sample selection methods (simple random, proportional stratified random, disproportional stratified random, and deliberative sampling) as well as three cross-validation tuning approaches (k-fold, leave-one-out, and Monte Carlo methods). In addition, the effect on accuracy of localizing sample selection to a small geographic subset of the entire area, an approach sometimes used to reduce the costs of training data collection, is investigated. These methods are investigated in the context of support vector machine (SVM) classification and geographic object-based image analysis (GEOBIA), using high spatial resolution National Agricultural Imagery Program (NAIP) orthoimagery and LIDAR-derived rasters covering a 2,609 km² regional-scale area in northeastern West Virginia, USA. Stratified, statistically based sampling methods were found to generate the highest classification accuracy. Using a small number of training samples collected from only a subset of the study area provided a similar level of overall accuracy to a sample of equivalent size collected in a dispersed manner across the entire regional-scale dataset. There were minimal differences in accuracy among the cross-validation tuning methods. The processing times for Monte Carlo and leave-one-out cross-validation were high, especially with large training sets; for this reason, k-fold cross-validation appears to be a good choice. Classifiers trained with samples collected deliberately (i.e., not randomly) were less accurate than classifiers trained on statistically based samples, which may be due to the high positive spatial autocorrelation in the deliberative training set. Thus, if possible, samples for training should be selected randomly; deliberative samples should be avoided.
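
    The three tuning strategies compared above can be sketched with scikit-learn's GridSearchCV, using KFold, LeaveOneOut, and ShuffleSplit (a Monte Carlo scheme) as the cross-validation iterators; the synthetic data below stands in for the GEOBIA feature set, and the SVM grid is a placeholder.

```python
# Tuning the same SVM under three cross-validation strategies;
# the choice of cv iterator is the only thing that changes.
from sklearn.datasets import make_classification
from sklearn.model_selection import (GridSearchCV, KFold, LeaveOneOut,
                                     ShuffleSplit)
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=10, random_state=0)
grid = {"C": [0.1, 1, 10], "gamma": ["scale", 0.01, 0.1]}

strategies = {
    "k-fold (k=5)": KFold(n_splits=5, shuffle=True, random_state=0),
    "leave-one-out": LeaveOneOut(),
    "Monte Carlo": ShuffleSplit(n_splits=50, test_size=0.2, random_state=0),
}
for name, cv in strategies.items():
    search = GridSearchCV(SVC(), grid, cv=cv).fit(X, y)
    print(f"{name}: best {search.best_params_}, "
          f"score {search.best_score_:.3f}")
```

    Even on this toy problem, leave-one-out and the 50-split Monte Carlo scheme fit many more models than 5-fold, which is the cost difference the study reports.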

    WorldView-2 Satellite Image Classification using U-Net Deep Learning Model

    Land cover maps are important documents for local governments performing urban planning and management. A field survey using measuring instruments can produce an accurate land cover map; however, this method is time-consuming, expensive, and labor-intensive. A number of researchers have proposed using remote sensing, which generates land cover maps from optical satellite images with various statistical classification procedures. Recently, artificial intelligence (AI) technology, such as deep learning, has been used in multiple fields, including satellite image classification, with satisfactory results. In this study, a WorldView-2 image of Terangun in Aceh Province, acquired on August 2, 2016, was classified using a commonly used deep-learning-based model, U-Net. Eight classes were used in the experiment: building, road, open land (such as green open space, bare land, grass, or low vegetation), river, farm, field, aquaculture pond, and garden. For comparison, three classification methods (maximum likelihood, random forest, and support vector machine) were also applied. A land cover map provided by the government was used as a reference to evaluate the accuracy of the land cover maps generated by each classification method. The results on 100 randomly selected pixels revealed that U-Net obtained an overall accuracy of 72% and a kappa of 0.585, whereas the maximum likelihood, random forest, and support vector machine methods obtained overall and kappa accuracies of 49% and 0.148, 59% and 0.392, and 67% and 0.511, respectively. Therefore, U-Net outperformed the three other classification methods in classifying the image.
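
    The study's exact U-Net configuration is not given in the abstract, but a compact, illustrative PyTorch U-Net with one skip connection, eight input bands (as in WorldView-2), and an eight-class per-pixel output might look like this; the published architecture uses more encoder/decoder levels.

```python
# Minimal one-level U-Net: encode, downsample, decode with a skip
# connection, and emit per-pixel class logits.
import torch
import torch.nn as nn

def conv_block(cin, cout):
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(),
        nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU())

class MiniUNet(nn.Module):
    def __init__(self, in_ch=8, n_classes=8):   # WorldView-2 has 8 bands
        super().__init__()
        self.enc1 = conv_block(in_ch, 32)
        self.enc2 = conv_block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(64, 32)           # 32 skip + 32 upsampled
        self.head = nn.Conv2d(32, n_classes, 1)

    def forward(self, x):
        e1 = self.enc1(x)                        # full resolution
        e2 = self.enc2(self.pool(e1))            # half resolution
        d1 = self.dec1(torch.cat([self.up(e2), e1], dim=1))
        return self.head(d1)                     # per-pixel class logits

logits = MiniUNet()(torch.randn(1, 8, 128, 128))  # -> (1, 8, 128, 128)
```

    The overall and kappa accuracies reported above can be computed from the 100 sampled pixels with sklearn.metrics.accuracy_score and cohen_kappa_score.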

    Spatial prediction models for shallow landslide hazards: a comparative assessment of the efficacy of support vector machines, artificial neural networks, kernel logistic regression, and logistic model tree

    Preparation of landslide susceptibility maps is considered the first important step in landslide risk assessment, and these maps are also accepted as an end product that can be used for land-use planning. The main objective of this study is to explore some new state-of-the-art machine learning techniques and introduce a framework for training and validating shallow landslide susceptibility models using the latest statistical methods. The Son La hydropower basin (Vietnam) was selected as a case study. First, a landslide inventory map was constructed using historical landslide locations from two national projects in Vietnam. A total of 12 landslide conditioning factors were then constructed from various data sources. Landslide locations were randomly split in a ratio of 70:30 for training and validating the models. To choose the best subset of conditioning factors, the predictive ability of the factors was assessed using the Information Gain Ratio with a 10-fold cross-validation technique; factors with null predictive ability were removed to optimize the models. Subsequently, five landslide models were built using support vector machines (SVM), multi-layer perceptron neural networks (MLP Neural Nets), radial basis function neural networks (RBF Neural Nets), kernel logistic regression (KLR), and logistic model trees (LMT). The resulting models were validated and compared using the receiver operating characteristic (ROC) curve, the Kappa index, and several statistical evaluation measures. Additionally, Friedman and Wilcoxon signed-rank tests were applied to confirm significant statistical differences among the five machine learning models employed in this study. Overall, the MLP Neural Nets model had the highest prediction capability (90.2%), followed by the SVM model (88.7%), the KLR model (87.9%), the RBF Neural Nets model (87.1%), and the LMT model (86.1%). Results revealed that both the KLR and LMT models are promising methods for shallow landslide susceptibility mapping. This study demonstrates the benefit of selecting optimal machine learning techniques, together with a proper conditioning factor selection method, for shallow landslide susceptibility mapping.
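
    The validation protocol described, per-fold performance compared across models with a Wilcoxon signed-rank test, can be sketched with scikit-learn and SciPy; the synthetic data below stands in for the 12 conditioning factors, and an SVM serves as a stand-in for kernel logistic regression, which scikit-learn does not provide directly.

```python
# Compare per-fold ROC AUC of two models and test whether the
# difference is statistically significant (Wilcoxon signed-rank).
import numpy as np
from scipy.stats import wilcoxon
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=12, random_state=1)
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=1)

auc_mlp, auc_svm = [], []
for tr, te in cv.split(X, y):
    mlp = MLPClassifier(max_iter=1000, random_state=1).fit(X[tr], y[tr])
    svm = SVC(probability=True, random_state=1).fit(X[tr], y[tr])
    auc_mlp.append(roc_auc_score(y[te], mlp.predict_proba(X[te])[:, 1]))
    auc_svm.append(roc_auc_score(y[te], svm.predict_proba(X[te])[:, 1]))

print("MLP AUC %.3f  SVM AUC %.3f" % (np.mean(auc_mlp), np.mean(auc_svm)))
print("Wilcoxon p = %.3f" % wilcoxon(auc_mlp, auc_svm).pvalue)
```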