    Machine Learning Techniques for Land Use/Land Cover Classification of Medium Resolution Optical Satellite Imagery Focusing on Temporary Inundated Areas

    Classification of multispectral optical satellite data using machine learning techniques to derive land use/land cover thematic data is important for many applications. Comparing the latest algorithms, our research aims to determine the best option to classify land use/land cover with special focus on temporarily inundated land in a flat area in the south of Hungary. These inundations disrupt agricultural practices and can cause large financial losses. Sentinel-2 data with a high temporal and medium spatial resolution are classified using open source implementations of a random forest, a support vector machine and an artificial neural network. Each classification model is applied to the same data set and the results are compared qualitatively and quantitatively. The accuracy of the results is high for all methods and does not show large overall differences. A quantitative spatial comparison demonstrates that the neural network gives the best results, but that all models are strongly influenced by atmospheric disturbances in the image.

    Object-Based Greenhouse Classification from GeoEye-1 and WorldView-2 Stereo Imagery

    Remote sensing technologies have been commonly used to perform greenhouse detection and mapping. In this research, stereo pairs acquired by very high-resolution optical satellites GeoEye-1 (GE1) and WorldView-2 (WV2) have been utilized to carry out the land cover classification of an agricultural area through an object-based image analysis approach, paying special attention to greenhouse extraction. The main novelty of this work lies in the joint use of single-source stereo-photogrammetrically derived heights and multispectral information from both panchromatic and pan-sharpened orthoimages. The main features tested in this research can be grouped into different categories, such as basic spectral information, elevation data (normalized digital surface model; nDSM), band indexes and ratios, texture and shape geometry. Furthermore, spectral information was based on both single orthoimages and multiangle orthoimages. The overall accuracies attained by applying nearest neighbor and support vector machine classifiers to the four multispectral bands of GE1 were very similar to those computed from WV2, for either four or eight multispectral bands. Height data, in the form of nDSM, were the most important feature for greenhouse classification. The best overall accuracy values were close to 90%, and they were not improved by using multiangle orthoimages.

    Rain Fall Prediction using Ada Boost Machine Learning Ensemble Algorithm

    Every government takes initiatives for the well-being of its citizens with respect to the environment and climate in which they live. Global warming is one of the reasons for climate change. With the help of machine learning algorithms, in the light of Artificial Intelligence and Data Mining techniques, weather events (not only rainfall but also lightning, thunder outbreaks, etc.) can be predicted. Management of water reservoirs, flooding, traffic control in smart cities, sewer system functioning and agricultural production are the hydro-meteorological factors that affect human life very drastically. Due to the dynamic nature of the atmosphere, existing statistical techniques (Support Vector Machine (SVM), Decision Tree (DT) and logistic regression (LR)) fail to provide good accuracy for rainfall forecasting. Different weather features (Temperature, Relative Humidity, Dew Point, Solar Radiation and Precipitable Water Vapour) are extracted for rainfall prediction. In this research work, data analysis using a machine learning ensemble algorithm, Adaptive Boosting (Ada Boost), is proposed. The dataset used for this classification application is taken from the hydrological department, India, and covers 1901-2015. Overall, the proposed algorithm is feasible for qualitatively predicting rainfall with the help of the R tool and the Ada Boost algorithm. Accuracy and false error rates are compared with the existing Support Vector Machine (SVM) algorithm, and the proposed one gives the better result.
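The abstract names Adaptive Boosting but does not describe its mechanics. The block below is a minimal sketch of discrete AdaBoost with one-feature threshold stumps, written in Python rather than the R tool the authors used. The helper names (`train_stump`, `adaboost_fit`, `adaboost_predict`) and the toy data are illustrative assumptions, not the study's code or its 1901-2015 dataset.

```python
import math


def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best stump.

    Labels in y are -1 or +1. A stump predicts `sign` when the feature value
    is <= threshold and `-sign` otherwise.
    """
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                pred = [sign if row[j] <= thr else -sign for row in X]
                err = sum(wi for wi, p, t in zip(w, pred, y) if p != t)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best


def stump_predict(stump, row):
    _, j, thr, sign = stump
    return sign if row[j] <= thr else -sign


def adaboost_fit(X, y, rounds=10):
    """Fit a list of (alpha, stump) pairs by reweighting misclassified points."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = max(stump[0], 1e-10)  # avoid log(0) for a perfect stump
        if err >= 0.5:  # no better than chance: stop adding learners
            break
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified points, decrease the others.
        w = [wi * math.exp(-alpha * t * stump_predict(stump, row))
             for wi, row, t in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def adaboost_predict(ensemble, row):
    """Weighted vote of all stumps; returns -1 or +1."""
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1


# Toy one-feature data (hypothetical): no single threshold separates the classes.
X_toy = [[1], [2], [3], [4], [5], [6]]
y_toy = [1, 1, -1, -1, 1, 1]
model = adaboost_fit(X_toy, y_toy, rounds=3)
predictions = [adaboost_predict(model, row) for row in X_toy]
```

On this toy set a single stump misclassifies at least two points, while the weighted vote of three stumps separates both classes, which is the motivation for boosting weak learners.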

    Survey: Detection Of Crop Diseases Using Multiscaling Technique

    Diseases in crops reduce both the quality and quantity of agricultural products. Classification is a technique in which a leaf is classified based on its different morphological features. Since the quality of the result can vary for different input data, selecting a classification method is always a difficult task. There are different classification techniques, such as the K-Nearest Neighbor Classifier (KNN), Probabilistic Neural Network (PNN), Genetic Algorithm, Support Vector Machine (SVM), Principal Component Analysis, Artificial Neural Network (ANN) and Fuzzy logic. Plant leaf disease classification has wide applications in various fields, such as biological research and agriculture. Farmers experience great difficulties in switching from one disease control policy to another. Early information on crop health and disease detection can facilitate the control of diseases through proper management strategies. This technique will improve the productivity of crops. In practice, the traditional approach for detection and identification of plant diseases is naked-eye observation. DOI: 10.17762/ijritcc2321-8169.15016

    An Automated Machine Learning Framework in Unmanned Aircraft Systems:New Insights into Agricultural Management Practices Recognition Approaches

    The recent trend of automated machine learning (AutoML) has been driving further significant technological innovation in the application of artificial intelligence from its automated algorithm selection and hyperparameter optimization of the deployable pipeline model for unraveling substance problems. However, a current knowledge gap lies in the integration of AutoML technology and unmanned aircraft systems (UAS) within image-based data classification tasks. Therefore, we employed a state-of-the-art (SOTA) and completely open-source AutoML framework, Auto-sklearn, which was constructed based on one of the most widely used ML systems: Scikit-learn. It was combined with two novel AutoML visualization tools to focus particularly on the recognition and adoption of UAS-derived multispectral vegetation indices (VI) data across a diverse range of agricultural management practices (AMP). These include soil tillage methods (STM), cultivation methods (CM), and manure application (MA), and are under the four-crop combination fields (i.e., red clover-grass mixture, spring wheat, pea-oat mixture, and spring barley). Furthermore, they have currently not been efficiently examined, and accessible parameters in UAS applications are absent for them. We compared AutoML performance with that of three other common machine learning classifiers, namely random forest (RF), support vector machine (SVM), and artificial neural network (ANN). The results showed that AutoML achieved the highest overall classification accuracy after 1200 s of calculation. RF yielded the second-best classification accuracy, and SVM and ANN were revealed to be less capable on some of the given datasets. Regarding the classification of AMPs, the best recognized period for data capture occurred in the crop vegetative growth stage (in May). The results demonstrated that CM yielded the best performance in terms of classification, followed by MA and STM. 
Our framework presents new insights into plant–environment interactions with strong classification capabilities. It further illustrated that the automatic system would become an important tool for furthering the understanding of future sustainable smart farming and field-based crop phenotyping research across a diverse range of agricultural environmental assessment and management applications.

    Improving DNA Barcode-based Fish Identification System on Imbalanced Data using SMOTE

    Imbalanced data is a very common problem in classification or identification. The problem arises when the number of instances of one class far exceeds that of the others. In previous research, our DNA barcode-based identification system for Tuna and Mackerel was developed on an imbalanced dataset. The Tuna and Mackerel samples far outnumbered those of other fish. Therefore, the accuracy of the classification model was probably still biased. This research aimed at employing the Synthetic Minority Oversampling Technique (SMOTE) to yield a balanced dataset. We used k-mer frequencies from DNA barcode sequences as features and a Support Vector Machine (SVM) as the classification method. In this research we used trinucleotides (3-mers) and tetranucleotides (4-mers). The training dataset was taken from the Barcode of Life Database (BOLD). To evaluate the model, we compared the accuracy of the model with and without SMOTE in classifying DNA barcode sequences taken from the Department of Aquatic Product Technology, Bogor Agricultural University. The results showed that the accuracy of the model at the species level using SMOTE was 7% and 13% higher than without SMOTE for trinucleotides (3-mers) and tetranucleotides (4-mers), respectively. It is expected that the use of SMOTE, as one data balancing technique, could increase the accuracy of DNA barcode-based fish classification systems, particularly at the species level, which is difficult to identify.
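The two ingredients named in this abstract, k-mer frequency features and SMOTE oversampling, can be sketched in a few lines. The block below is a pure-Python illustration under stated assumptions: the function names, the toy points and the choice to skip windows containing ambiguous bases are hypothetical and not taken from the paper, and the SMOTE step shows only the core interpolation between a minority sample and one of its nearest minority neighbours.

```python
import random
from itertools import product


def kmer_frequencies(sequence, k):
    """Relative frequency of every possible DNA k-mer (4**k values, ACGT order)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = sequence.upper()
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # assumption: skip windows with ambiguous bases (e.g. N)
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority points by interpolating towards neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Other minority points, sorted by squared Euclidean distance to x.
        others = sorted(
            (m for m in minority if m is not x),
            key=lambda m: sum((a - b) ** 2 for a, b in zip(x, m)),
        )
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from x to the neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(x, neighbour)])
    return synthetic


features = kmer_frequencies("ACGTACGT", k=3)          # 64 trinucleotide frequencies
new_points = smote([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]], n_new=4, k=2)
```

Because each synthetic point lies on a segment between two real minority samples, the oversampled class stays inside the region the minority class already occupies rather than duplicating points exactly.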

    Object-Based Image Classification of Summer Crop with Machine Learning Methods

    The strategic management of agricultural lands involves crop field monitoring each year. Crop discrimination via remote sensing is a complex task, especially if different crops have a similar spectral response and cropping pattern. In such cases, crop identification could be improved by combining object-based image analysis and advanced machine learning methods. In this investigation, we evaluated the C4.5 decision tree, logistic regression (LR), support vector machine (SVM) and multilayer perceptron (MLP) neural network methods, both as single classifiers and combined in a hierarchical classification, for the mapping of nine major summer crops (both woody and herbaceous) from ASTER satellite images captured on two different dates. Each method was built with different combinations of spectral and textural features obtained after the segmentation of the remote images in an object-based framework. As single classifiers, MLP and SVM obtained a maximum overall accuracy of 88%, slightly higher than LR (86%) and notably higher than C4.5 (79%). The SVM+SVM classifier (best method) improved these results to 89%. In most cases, the hierarchical classifiers considerably increased the accuracy of the most poorly classified class (minimum sensitivity). The SVM+SVM method offered a significant improvement in classification accuracy for all of the studied crops compared to the conventional decision tree classifier, ranging between 4% for safflower and 29% for corn, which supports the application of object-based image analysis and advanced machine learning methods in complex crop classification tasks. This research was partly financed by the TIN2011-22794 project of the Spanish Ministerial Commission of Science and Technology (MICYT), FEDER funds, the P2011-TIC-7508 project of the “Junta de Andalucía” (Spain) and the Kearney Foundation of Soil Science (USA). 
The research of Peña was co-financed by the Fulbright-MEC postdoctoral program, financed by the Spanish Ministry for Science and Innovation, and by the JAEDoc Program, supported by CSIC and FEDER funds. ASTER data were available to us through a NASA EOS scientific investigator affiliation. We acknowledge support by the CSIC Open Access Publication Initiative through its Unit of Information Resources for Research (URICI). Peer Reviewe

    Water Across Synthetic Aperture Radar Data (WASARD): SAR Water Body Classification for the Open Data Cube

    The detection of inland water bodies from Synthetic Aperture Radar (SAR) data provides a great advantage over water detection with optical data, since SAR imaging is not impeded by cloud cover. Traditional methods of detecting water from SAR data involve thresholding, which can be labor intensive and imprecise. This paper describes Water Across Synthetic Aperture Radar Data (WASARD): a method of water detection from SAR data which automates and simplifies the thresholding process using machine learning on training data created from Geoscience Australia's WOFS algorithm. Of the machine learning models tested, the Linear Support Vector Machine was determined to be optimal, with the option of training using solely the VH polarization or a combination of the VH and VV polarizations. WASARD was able to identify water in the target area with a correlation of 97% with WOFS. Keywords: Sentinel-1, Open Data Cube, Earth Observations, Machine Learning, Water Detection. 1. INTRODUCTION Water classification is an important function of Earth imaging satellites, as accurate remote classification of land and water can assist in land use analysis, flood prediction, climate change research, as well as a variety of agricultural applications [2]. The ability to identify bodies of water remotely via satellite is far cheaper than contracting surveys of the areas in question, meaning that an application that can accurately use satellite data towards this function can make valuable information available to nations which would not be able to afford it otherwise. Highly reliable applications for the remote detection of water currently exist for use with optical satellite data such as that provided by LANDSAT. One such application, Geoscience Australia's Water Observations from Space (WOFS), has already been ported for use with the Open Data Cube [6]. 
However, water detection using optical data from Landsat is constrained by its relatively long revisit cycle of 16 days [5], and water detection using any optical data is constrained in that it lacks the ability to make accurate classifications through cloud cover [2]. The alternative solution which solves these problems is water detection using SAR data, which images the Earth using cloud-penetrating microwaves. Because of its advantages over optical data, much research has been done into water detection using SAR data. Traditionally, this has been done using the thresholding method, which involves picking a polarization band and labeling all pixels for which this band's value is below a certain threshold as containing water. The thresholding method works since water tends to return a much lower backscatter value to the satellite than land [1]. However, this method can be flawed since estimating the proper threshold is often imprecise, complicated, and labor intensive for the end user. Thresholding also tends to use data from only one SAR polarization, when a combination of polarizations can provide insight into whether water is present [2]. In order to alleviate these problems, this paper presents an application for the Open Data Cube to detect water from SAR data using support vector machine (SVM) classification. 2. PLATFORM WASARD is an application for the Open Data Cube, a mechanism which provides a simple yet efficient means of ingesting, storing, and retrieving remote sensing data. Data can be ingested and made analysis ready according to whatever specifications the researcher chooses, and easily resampled to artificially alter a scene's resolution. Currently WASARD supports water detection on scenes from ESA's Sentinel-1 and JAXA's ALOS. When testing WASARD, Sentinel-1 was most commonly used due to its relatively high spatial resolution and its rapid 6-day revisit cycle [5]. 
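The traditional thresholding method described in the introduction reduces to one comparison per pixel. The sketch below illustrates it in plain Python; the function name, the backscatter values and the threshold of -18.0 are hypothetical and chosen only for illustration, since the text gives no concrete threshold.

```python
def threshold_water_mask(backscatter, threshold):
    """Label a pixel as water when its backscatter value is below `threshold`.

    `backscatter` is a 2D list holding one polarization band of a SAR scene.
    """
    return [[value < threshold for value in row] for row in backscatter]


# Hypothetical 2x2 scene: low backscatter pixels are treated as water.
scene = [[-24.0, -9.5],
         [-12.1, -21.3]]
mask = threshold_water_mask(scene, threshold=-18.0)
```

The whole result depends on the single `threshold` argument, which is the value the text describes as imprecise and labor intensive to estimate by hand.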
With minor alterations to the application's code, however, it could support data from other satellites. 3. METHODOLOGY Using supervised classification, WASARD compares SAR data to a dataset pre-classified by WOFS in order to train an SVM classifier. This classifier is then used to detect water in other SAR scenes outside the training set. Accuracy was measured according to the following metrics: Precision: a measure of what percentage of the points WASARD labels as water are truly water. Recall: a measure of what percentage of the total water cover WASARD was able to identify. F1 Score: a harmonic average of the precision and recall scores. Both precision and recall are calculated at the end of the training phase, when the trained classifier is compared to a testing dataset. Because the WOFS algorithm's classifications are used as the truth values when training a WASARD classifier, when precision and recall are mentioned in this paper, they are always with respect to the values produced by WOFS on a similar scene of Landsat data, which themselves have a classification accuracy of 97% [6]. Visual representations of water identified by WASARD in this paper were produced using the function wasard_plot(), which is included in WASARD. 3.1 Algorithm Selection The machine learning model used by WASARD is the Linear Support Vector Machine (SVM). This model uses a supervised learning algorithm to develop a classifier, meaning it creates a vector which can be multiplied by the vector formed by the relevant data bands to determine whether a pixel in a SAR scene contains water. This classifier is trained by comparing data points from selected bands in a SAR scene to their respective labels, which in this case are water or not water as given by the WOFS algorithm. 
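The three accuracy metrics defined in the methodology can be computed directly from a predicted water mask and a reference mask. The following is a minimal sketch, assuming flat lists of booleans where True means water and the reference plays the role of the WOFS labels; the function name is illustrative and is not part of WASARD.

```python
def precision_recall_f1(predicted, truth):
    """Return (precision, recall, f1) for two equal-length boolean sequences."""
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have the same length")
    true_pos = sum(1 for p, t in zip(predicted, truth) if p and t)
    false_pos = sum(1 for p, t in zip(predicted, truth) if p and not t)
    false_neg = sum(1 for p, t in zip(predicted, truth) if not p and t)
    # Precision: share of pixels labeled water that are water in the reference.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    # Recall: share of reference water pixels that were found.
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    # F1: harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# One true positive, one false positive, two false negatives.
scores = precision_recall_f1([True, True, False, False],
                             [True, False, True, True])
```

Here precision is 1/2, recall is 1/3, and their harmonic mean is 0.4, which shows how the F1 score is pulled towards the weaker of the two metrics.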
The SVM was selected over the Random Forest model, which outperformed the SVM in training speed, but had a greater classification time and lower accuracy, and the Multilayer Perceptron Artificial Neural Network, which had a slightly higher average accuracy than the SVM, but much greater training and classification times. Figure 1: Visual representation of the SVM Classifier. Each white point represents a pixel in a SAR scene. In Figure 1, the diagonal line separating pixels determined to be water from those determined not to be water represents the actual classification vector produced by the SVM. It is worth noting that once the model has been trained, classification of pixels is done in a manner similar to the thresholding method. This is especially true if only one band was used to train the model. 3.1 Feature Selection Sentinel-1 collects data from two bands: the Vertical/Vertical polarization (VV) and the Vertical/Horizontal polarization (VH). When 100 SVM classifiers were created for each polarization individually, and for the combination of the two, the following results were achieved: Figure 2: Accuracy of classifiers trained using different polarization bands. Precision and Recall were measured with respect to the values produced by WOFS. Figure 2 demonstrates that using both the VV and VH bands trades slightly lower recall for significantly greater precision when compared with the VH band alone, and that using the VV band alone is inferior in both metrics. WASARD therefore defaults to using both the VV and VH bands, and includes the option to use solely the VH band. The VV polarization's lower precision compared to the VH polarization is in contrast to results from previous research and may merit further analysis [4]. 3.2 Training a Classifier The steps in training a classifier with WASARD are 1. 
Selecting two scenes (one SAR, one optical) with the same spatial extents, and acquired close to each other in time, with a preference that the scenes are taken on the same day. 2. Using the WOFS algorithm to produce an array of the detected water in the scene of optical data, to be used as the labels during supervised learning. 3. Data points from the selected bands from the SAR acquisition are bundled together into an array with the corresponding labels gathered from WOFS. A random sample with an equal number of points labeled Water and Not Water is selected to be partitioned into a training and a testing dataset. 4. Using scikit-learn's LinearSVC object, the training dataset is used to produce a classifier, which is then tested against the testing dataset to determine its precision and recall. The result is a wasard_classifier object, which has the following attributes: 1. f1, recall, and precision: 3 metrics used to determine the classifier's accuracy. 2. Coefficient: Vector which the SVM uses to make its predictions. The classifier detects water when the dot product of the coefficient and the vector formed by the SAR bands is positive. 3. Save(): allows a user to save a classifier to the disk in order to use it without retraining. 4. wasard_classify(): Classifies an entire xarray of SAR data using the SVM classifier. All of the above steps are performed automatically when the user creates a wasard_classifier object. 3.3 Classifying a Dataset Once the classifier has been created, it can be used to detect water in an xarray of SAR data using wasard_classify(). By taking the dot product of the classifier's coefficients and the vector formed by the selected bands of SAR data, an array of predictions is constructed. A classifier can effectively be used on the same spatial extents as the ones where it was trained, or on any area with a similar landscape. Whil
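Step 3 of the training procedure (balanced sampling and partitioning) and the dot-product decision rule from Section 3.3 can be sketched without any dependencies. This is not WASARD's code: the function names, the coefficient values and the example pixels are hypothetical, and the fitting step itself (scikit-learn's LinearSVC in the text) is deliberately left out. The `intercept` argument is an assumption added because linear SVMs commonly fit one; leaving it at 0.0 reproduces the rule exactly as stated in the text.

```python
import random


def balanced_sample(features, labels, n_per_class, seed=0):
    """Randomly pick n_per_class water and n_per_class non-water points."""
    rng = random.Random(seed)
    water = [i for i, lab in enumerate(labels) if lab]
    land = [i for i, lab in enumerate(labels) if not lab]
    if min(len(water), len(land)) < n_per_class:
        raise ValueError("not enough points in one of the classes")
    chosen = rng.sample(water, n_per_class) + rng.sample(land, n_per_class)
    rng.shuffle(chosen)
    return [features[i] for i in chosen], [labels[i] for i in chosen]


def split_train_test(features, labels, test_fraction=0.25):
    """Partition an already shuffled sample into (train, test) pairs."""
    n_test = int(len(features) * test_fraction)
    train = (features[n_test:], labels[n_test:])
    test = (features[:n_test], labels[:n_test])
    return train, test


def classify(coefficient, pixels, intercept=0.0):
    """Water when dot(coefficient, band values) + intercept is positive."""
    return [
        sum(c * v for c, v in zip(coefficient, pixel)) + intercept > 0
        for pixel in pixels
    ]


# Hypothetical (VV, VH) band values and WOFS-style labels (True = water).
band_values = [[-22.0, -25.0], [-21.0, -26.0], [-23.5, -24.0],
               [-8.0, -14.0], [-9.0, -15.5], [-7.5, -13.0], [-10.0, -16.0]]
wofs_labels = [True, True, True, False, False, False, False]

sample_x, sample_y = balanced_sample(band_values, wofs_labels, n_per_class=2)
train, test = split_train_test(sample_x, sample_y)

# Hypothetical coefficient and intercept standing in for a trained classifier.
water_mask = classify([-1.0, -1.0], [(-22.0, -25.0), (-8.0, -14.0)], intercept=-40.0)
```

Sampling the same number of points from each class keeps the classifier from being dominated by whichever label covers more of the scene, which matters because water usually occupies a small share of the pixels.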

    Delineating smallholder maize farms from Sentinel-1 coupled with Sentinel-2 data using machine learning

    Rural communities rely on smallholder maize farms for subsistence agriculture, the main driver of local economic activity and food security. However, their planted area estimates are unknown in most developing countries. This study explores the use of Sentinel-1 and Sentinel-2 data to map smallholder maize farms. The random forest (RF) and support vector machine (SVM) learning algorithms and model stacking (ST) were applied. Results show that the classification of combined Sentinel-1 and Sentinel-2 data improved the RF, SVM and ST algorithms by 24.2%, 8.7%, and 9.1%, respectively, compared to the classification of Sentinel-1 data individually. Similarities in the estimated areas (7001.35 ± 1.2 ha for RF, 7926.03 ± 0.7 ha for SVM and 7099.59 ± 0.8 ha for ST) show that machine learning can estimate smallholder maize areas with high accuracies. The study concludes that the single-date Sentinel-1 data were insufficient to map smallholder maize farms. However, single-date Sentinel-1 combined with Sentinel-2 data were sufficient in mapping smallholder farms. These results can be used to support the generation and validation of national crop statistics, thus contributing to food security. The Agricultural Research Council, the National Research Foundation and the University of Pretoria. https://www.mdpi.com/journal/sustainability Geography, Geoinformatics and Meteorolog

    Characterizing and mapping cropping patterns in a complex agro-ecosystem: An iterative participatory mapping procedure using machine learning algorithms and MODIS vegetation indices

    Accurate and up-to-date spatial agricultural information is essential for applications including agro-environmental assessment, crop management, and appropriate targeting of agricultural technologies. There is growing research interest in spatial analysis of agricultural ecosystems applying satellite remote sensing technologies. However, the usability of information generated from much remotely sensed data is often constrained by accuracy problems. This is of particular concern in mapping complex agro-ecosystems in countries where small farm holdings are dominated by diverse crop types. This study is a contribution to the ongoing efforts towards overcoming accuracy challenges faced in remote sensing of agricultural ecosystems. We applied time-series analysis of vegetation indices (Normalized Difference Vegetation Index (NDVI) and Enhanced Vegetation Index (EVI)) derived from the Moderate Resolution Imaging Spectroradiometer (MODIS) sensor to detect seasonal patterns of irrigated and rainfed cropping in five townships in the Central Dry Zone of Myanmar, an important agricultural region of the country that has been poorly mapped with respect to cropping practices. To improve mapping accuracy and map legend completeness, we implemented a combination of (i) an iterative participatory approach to field data collection and classification, (ii) the identification of appropriate size and types of predictor variables (VIs), and (iii) evaluation of the suitability of three Machine Learning algorithms: Support Vector Machine (SVM), Random Forest (RF), and C5.0 algorithms under varying training sample sizes. Through these procedures, we were able to progressively improve accuracy and achieve a maximum overall accuracy of 95%. When a small training dataset was used, accuracy achieved by RF was significantly higher compared to SVM and C5.0 (P < 0.01), but as sample size increased, accuracy differences among the three machine learning algorithms diminished. 
Accuracy achieved by use of NDVI was consistently better than that of EVI (P < 0.01). The maximum overall accuracy was achieved using RF and 8-day NDVI composites for three years of remote sensing data. In conclusion, our findings highlight the important role of participatory classification, especially in areas where cropping systems are highly diverse and differ over space and time. We also show that the choice of classifiers and size of predictor variables are essential and complementary to the participatory mapping approach in achieving desired accuracy of cropping pattern mapping in areas where other sources of spatial information are scarce.
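The abstract compares NDVI and EVI without stating how they are computed. As a brief illustration, the sketch below uses the standard definitions, NDVI = (NIR - Red) / (NIR + Red) and EVI = G (NIR - Red) / (NIR + C1 Red - C2 Blue + L) with the coefficients commonly used for MODIS (G = 2.5, C1 = 6, C2 = 7.5, L = 1). The function names and reflectance values are illustrative assumptions and do not come from the study.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from surface reflectances."""
    denominator = nir + red
    return (nir - red) / denominator if denominator else 0.0


def evi(nir, red, blue, gain=2.5, c1=6.0, c2=7.5, soil=1.0):
    """Enhanced Vegetation Index with the coefficients commonly used for MODIS."""
    return gain * (nir - red) / (nir + c1 * red - c2 * blue + soil)


# Hypothetical reflectances for a vegetated pixel.
pixel_ndvi = ndvi(nir=0.5, red=0.1)
pixel_evi = evi(nir=0.5, red=0.1, blue=0.05)
```

Applying either function to each composite date of a pixel yields the per-pixel time series that a classifier such as RF, SVM or C5.0 would take as predictor variables.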