    Field Sampling Scheme Optimization Using Simulated Annealing

    Assessment of accuracy: systematic reduction of training points for maximum likelihood classification and mixture discriminant analysis (Gaussian and t-distribution)

    Remote sensing provides a valuable tool for monitoring land cover across large areas. A simple yet popular method for land cover classification is Maximum Likelihood Classification (MLC), which assumes a single normal distribution of the samples per class in the feature space. Mixture Discriminant Analysis (MDA) is a natural extension of MLC that can be used with varying distributions and multiple distributions per class, which simplifies the classification process tremendously. We compare the accuracies of MLC and MDA (using a Gaussian and a t-distribution) as the number of training points is systematically reduced in order to simulate varying reference data availability conditions. The results show that the more robust t-distribution MDA performs comparably to the Gaussian MDA and that both outperform MLC when sufficient training points are available. As the number of training points increases, the MDA accuracies increase while the MLC accuracy stagnates. At very low numbers of training samples (ranging from 22 to 169, depending on the class), there is more variability in terms of which method performs best.
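The MLC decision rule described in this abstract (one normal distribution per class, assign each sample to the class with the highest likelihood) can be sketched as follows. This is a minimal one-feature illustration, not the authors' implementation: the class names, reflectance values, and the assumption of equal class priors are all hypothetical. MDA would replace the single Gaussian per class with a mixture of components, which is not shown here.

```python
import math
from statistics import mean, variance


def fit_mlc(training):
    """Estimate one normal distribution (mean, sample variance) per class."""
    return {label: (mean(values), variance(values))
            for label, values in training.items()}


def log_likelihood(x, mu, var):
    """Log-density of a univariate normal distribution at x."""
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mu) ** 2 / (2.0 * var)


def classify(x, model):
    """Assign x to the class with the highest likelihood (equal priors assumed)."""
    return max(model, key=lambda label: log_likelihood(x, *model[label]))


# Hypothetical single-band reflectance values for two land cover classes.
training = {
    "water": [0.05, 0.07, 0.06, 0.04],
    "forest": [0.30, 0.35, 0.28, 0.33],
}
model = fit_mlc(training)
print(classify(0.06, model))
print(classify(0.31, model))
```

With very few training points per class the variance estimates become unstable, which is one way to picture the variability at low sample sizes that the abstract reports.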

    Distinguishing tree species from in situ hyperspectral and temporal measurements through ensemble statistical learning

    Hyperspectral sensors capture and compute spectral reflectance of objects over many wavelength bands, resulting in a high-dimensional space with enough information to differentiate between spectrally similar objects. Due to the curse of dimensionality, high spectral dimensionality can also be difficult to handle and analyse, demanding complex processing and the use of advanced analytical techniques. Moreover, when hyperspectral measurements are taken at different temporal frequencies, separation is likely to improve; however, additional complexities in modelling time variability concurrently with this high spectral dimensionality may be created. As a result, the applicability of ensemble-based techniques suitable for high-dimensional data is examined in this research, together with the statistical evaluation of time-induced variability, since spectral measurements of tree species were taken at different time periods. Classification errors for the stochastic gradient boosting (SGB) and random forest (RF) methods ranged between 5.6% and 13.5%, respectively. Differences in classification accuracy or errors were also accounted for in the assessment of the models, with up to 46% of variation in classification error due to the effect of time in the RF model, indicating that measurement time is important in improving discrimination between tree species. This is because optical leaf characteristics can vary during the course of the year due to seasonal effects, health status, or the developmental stage of a tree. Different spectral properties (assumed from relevant wavelength bands) were found to be key factors impacting the models’ discrimination performance at various measurement times.
The data presented in this study may be obtained from the corresponding author upon request. Due to intellectual property and confidentiality concerns, the data are not publicly available. The Council for Scientific and Industrial Research (CSIR). https://www.mdpi.com/journal/remotesensing

    A Markov chain model for geographical accessibility

    Accessibility analyses are conducted for a variety of applications, including urban planning and public health studies. These applications may aggregate data at the level of administrative units, such as provinces or municipalities. Accessibility between administrative units can be quantified by travel distance. However, modelling the distances between all administrative units in a region is computationally expensive if a large number of administrative units is considered. We propose a methodology to model accessibility between administrative units as a homogeneous Markov chain, where the administrative units are states and standardised inverse travel distances act as transition probabilities. Single transitions are allowed only between adjacent administrative units, resulting in a sparse one-step transition probability matrix (TPM). Powers of the TPM are taken to obtain transition probabilities between non-adjacent units. The methodology assumes that the Markov property holds for travel between units. We apply the methodology to administrative units within Tshwane, South Africa, considering only major roads for the sake of computation. The results are compared to those obtained using Euclidean distance, and we show that using network distance yields more reasonable results. The proposed methodology is computationally efficient and can be used to estimate accessibility between any set of administrative units connected by a road network.
Supported in part by the National Research Foundation of South Africa and the NRF-SASA Academic Statistics Grant. http://www.elsevier.com/locate/spasta
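The construction described in this abstract can be sketched in a few lines. The sketch below is an illustration under stated assumptions, not the authors' code: the four units, their distances, and the symmetric-travel assumption are hypothetical, and "standardised" is interpreted here as dividing each row of inverse distances by its row sum, with no self-transitions.

```python
# Hypothetical road-network travel distances (km) between ADJACENT units only.
ADJACENT_KM = {("A", "B"): 5.0, ("B", "C"): 10.0, ("C", "D"): 4.0}
UNITS = ["A", "B", "C", "D"]


def one_step_tpm(units, adjacent_km):
    """Row-standardise inverse distances into a one-step TPM (assumed scheme)."""
    index = {unit: i for i, unit in enumerate(units)}
    n = len(units)
    weights = [[0.0] * n for _ in range(n)]
    for (u, v), km in adjacent_km.items():
        # Travel is assumed symmetric; non-adjacent pairs keep weight 0.
        weights[index[u]][index[v]] = 1.0 / km
        weights[index[v]][index[u]] = 1.0 / km
    # Every unit must have at least one neighbour, otherwise the row sum is 0.
    return [[w / sum(row) for w in row] for row in weights]


def mat_mul(a, b):
    """Multiply two square matrices stored as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def k_step_tpm(tpm, k):
    """Return the k-th power of the TPM (k >= 1) by repeated multiplication."""
    result = tpm
    for _ in range(k - 1):
        result = mat_mul(result, tpm)
    return result


P = one_step_tpm(UNITS, ADJACENT_KM)
P2 = k_step_tpm(P, 2)
print(P[0][2])   # A -> C in one step: the units are not adjacent
print(P2[0][2])  # A -> C in two steps, reached via B
```

The one-step matrix is sparse because only adjacent pairs carry weight; the two-step matrix fills in probabilities for units that are two hops apart, which is the role the abstract assigns to powers of the TPM.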

    Short-term real-time prediction of total number of reported COVID-19 cases and deaths in South Africa : a data driven approach

    BACKGROUND: The rising burden of the ongoing COVID-19 epidemic in South Africa has motivated the application of modeling strategies to predict COVID-19 cases and deaths. Reliable and accurate short- and long-term forecasts of COVID-19 cases and deaths, at both the national and provincial level, are a key aspect of the strategy to handle the COVID-19 epidemic in the country.
METHODS: In this paper we apply the previously validated approach of phenomenological models, fitting several nonlinear growth curves (Richards, three- and four-parameter logistic, Weibull and Gompertz), to produce short-term forecasts of COVID-19 cases and deaths at the national level as well as the provincial level. Using publicly available daily reported cumulative case and death data up until 22 June 2020, we report 5, 10, 15, 20, 25 and 30-day ahead forecasts of cumulative cases and deaths. All predictions are compared to the actual observed values in the forecasting period.
RESULTS: We observed that all models for cases provided accurate and similar short-term forecasts for a period of 5 days ahead at the national level, and that the three- and four-parameter logistic growth models provided more accurate forecasts than those obtained from the Richards model 10 days ahead. However, beyond 10 days all models underestimated the cumulative cases. Our forecasts across the models predict an additional 23,551–26,702 cases in 5 days and an additional 47,449–57,358 cases in 10 days. While the three-parameter logistic growth model provided the most accurate forecasts of cumulative deaths within the 10-day period, the Gompertz model was able to better capture the changes in cumulative deaths beyond this period. Our forecasts across the models predict an additional 145–437 COVID-19 deaths in 5 days and an additional 243–947 deaths in 10 days.
CONCLUSIONS: By comparing both the predictions of deaths and cases to the observed data in the forecasting period, we found that this modeling approach provides reliable and accurate forecasts for a maximum period of 10 days ahead.
http://www.biomedcentral.com/bmcmedresmethodol
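To make the forecasting idea concrete, the sketch below evaluates two of the growth curves named in the abstract and computes an "additional cases in h days" figure from them. It rests on assumptions: the parameterisations are common textbook forms that may differ from those used in the paper, the fitting step is omitted, and the parameter values are hypothetical rather than estimates from the study.

```python
import math


def logistic3(t, K, r, t0):
    """Three-parameter logistic curve: asymptote K, growth rate r, midpoint t0."""
    return K / (1.0 + math.exp(-r * (t - t0)))


def gompertz(t, K, b, c):
    """Gompertz curve: asymptote K, displacement b, growth rate c."""
    return K * math.exp(-b * math.exp(-c * t))


def additional_cases(curve, params, last_day, horizon):
    """Forecast increase in the cumulative count `horizon` days after last_day."""
    return curve(last_day + horizon, *params) - curve(last_day, *params)


# Hypothetical parameters, standing in for values a fitting routine would return.
logistic_params = (1000.0, 0.1, 50.0)
gompertz_params = (1000.0, 5.0, 0.05)

for horizon in (5, 10):
    print(horizon,
          additional_cases(logistic3, logistic_params, 40, horizon),
          additional_cases(gompertz, gompertz_params, 40, horizon))
```

Because both curves flatten towards a fixed asymptote K, forecasts far beyond the last observed day are bounded by the fitted ceiling, which is consistent with the underestimation beyond 10 days that the abstract reports.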

    A spatial model with vaccinations for COVID-19 in South Africa

    Since the emergence of the COVID-19 pandemic in December 2019, numerous mathematical models have been published to assess the transmission dynamics of the disease, predict its future course, and evaluate the impact of different control measures. The simplest models make the basic assumptions that individuals are perfectly and evenly mixed and have the same social structures. Such assumptions become problematic for large developing countries that aggregate heterogeneous COVID-19 outbreaks in local areas. Thus, this paper proposes a spatial SEIRDV model that includes spatial vaccination coverage, spatial vulnerability, and level of mobility, to take into account the spatial–temporal clustering pattern of COVID-19 cases. The conclusion of this study is that immunity, government interventions, infectiousness and virulence are the main drivers of the spread of COVID-19. These factors should be taken into consideration when scientists, public policy makers and other stakeholders in the health community analyse, create and project future disease prevention scenarios. Such a model has a place for disease outbreaks that may occur in future, allowing for the inclusion of vaccination rates in a spatial manner.
Supported in part by the National Research Foundation of South Africa and also funded by Canada’s International Development Research Centre (IDRC). http://www.elsevier.com/locate/spasta

    Are earth sciences lagging behind in data integration methodologies?

    This article reflects discussions that German and South African Earth scientists, statisticians and risk analysts had on the occasion of two bilateral workshops on Data Integration Technologies for Earth System Modelling and Resource Management. The workshops were held in October 2012 in Leipzig, Germany, and April 2013 in Pretoria, South Africa, and were attended by about 70 researchers, practitioners and data managers from both countries. Both events were arranged as part of the South African-German Year of Science 2012/2013. The South African National Research Foundation (NRF, UID 81579) supported the two workshops as part of the South African-German Year of Science 2012/2013 activities established by the German Federal Ministry of Education and Research and the South African Department of Science and Technology.
http://link.springer.com/journal/12665