1,620 research outputs found

    Framework for Aligning Big-Data Strategy with Organizational Goals

    Organisations are looking to adopt Big Data technology but are uncertain of the benefits it may bring and concerned about implementation costs. To this end, this research proposes a strategic framework that helps align business objectives with Big Data projects. The framework is expected to clarify the value that a proposed Big Data project may bring to the organisation. This paper focuses on the third phase of the framework: generation of strategic Big Data goals. The framework was tested at a broadcasting TV station in Nigeria. The conclusions of this phase are that identifying strategic goals before implementation offered a clearer view of the benefits the proposed project would bring to the organisation and helped focus the project on delivering the best value. In addition, it helped the station reduce implementation time and cost.

    On the Statistical and Scaling Properties of Observed and Simulated Soil Moisture

    Soil moisture (θ) is a fundamental variable controlling the exchange of water and energy at the land surface. As a result, characterizing the statistical properties of θ across multiple scales is essential for many applications, including flood prediction, drought monitoring, and weather forecasting. Empirical evidence has demonstrated the existence of emergent relationships and scale invariance properties in θ fields collected from ground and airborne sensors during intensive field campaigns, mostly in natural landscapes. This dissertation advances the characterization of these relations and statistical properties of θ by (1) analyzing the role of irrigation, and (2) investigating how these properties change in time and across different landscape conditions through θ outputs of a distributed hydrologic model. First, θ observations from two field campaigns in Australia are used to explore how the presence of irrigated fields modifies the spatial distribution of θ and the associated scale invariance properties. Results reveal that the impact of irrigation is larger in drier regions or conditions, where irrigation creates a drastic contrast with the surrounding areas. Second, a physically-based distributed hydrologic model is applied in a regional basin in northern Mexico to generate hyperresolution θ fields, which are useful for conducting analyses in regions and times where θ has not been monitored. To this end, strategies are proposed to address the data, model validation, and computational challenges associated with hyperresolution hydrologic simulations. Third, analyses are carried out to investigate whether the hyperresolution simulated θ fields reproduce the statistical and scaling properties observed from ground or remote sensors. 
Results confirm that (i) the relations between the spatial mean and standard deviation of θ derived from the model outputs are very similar to those observed in other areas, and (ii) simulated θ fields exhibit scale invariance properties consistent with those analyzed from aircraft-derived estimates. The simulated θ fields are then used to explore the influence of physical controls on the statistical properties, finding that soil properties significantly affect spatial variability and multifractality. The knowledge acquired through this dissertation provides insights into θ statistical properties in regions and landscape conditions that had not been investigated before; supports the refinement of the calibration of multifractal downscaling models; and contributes to the improvement of hyperresolution hydrologic modeling. Dissertation/Thesis: Doctoral Dissertation, Civil, Environmental and Sustainable Engineering 201

    Using Sentinel-1 and Sentinel-2 time series for slangbos mapping in the Free State Province, South Africa

    Increasing woody cover and overgrazing in semi-arid ecosystems are known to be the major factors driving land degradation. This study focuses on mapping the distribution of the slangbos shrub (Seriphium plumosum) in a test region in the Free State Province of South Africa. The goal of this study is to monitor slangbos encroachment on cultivated land by synergistically combining Synthetic Aperture Radar (SAR) (Sentinel-1) and optical (Sentinel-2) Earth observation information. Optical and radar satellite data are each sensitive to different vegetation properties and to different surface scattering or reflection mechanisms, owing to the specific sensor characteristics. We used a supervised random forest classification to predict slangbos encroachment for each individual crop year between 2015 and 2020. Training data were derived from expert knowledge and in situ information from the Department of Agriculture, Land Reform and Rural Development (DALRRD). We found that the Sentinel-1 VH (cross-polarization) and Sentinel-2 SAVI (Soil Adjusted Vegetation Index) time series have the highest importance for the random forest classifier among all input parameters. The modelling results confirm the in situ observations that pastures are most affected by slangbos encroachment. Model accuracy was estimated via spatial cross-validation (SpCV) and resulted in a classification precision of around 80% for the slangbos class within each time step.
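The SAVI input named in the abstract above can be illustrated with a minimal sketch. It uses the standard SAVI definition with a soil adjustment factor L; the value L = 0.5 and the reflectance numbers are assumptions for illustration and are not taken from the study.

```python
def savi(nir, red, soil_factor=0.5):
    """Soil Adjusted Vegetation Index from near-infrared and red reflectance.

    soil_factor is the soil brightness correction L; 0.5 is a commonly used
    default and an assumption here, not a value reported in the abstract.
    """
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)


# Hypothetical surface reflectances for one pixel (illustrative values only).
pixel_savi = savi(nir=0.5, red=0.1)
print(round(pixel_savi, 3))
```

In a workflow like the one described, an index of this kind would be computed per pixel and per acquisition date to build the time series fed to the classifier.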

    Big Earth Data and Machine Learning for Sustainable and Resilient Agriculture

    Big streams of Earth images from satellites or other platforms (e.g., drones and mobile phones) are becoming increasingly available at low or no cost and with enhanced spatial and temporal resolution. This thesis recognizes the unprecedented opportunities offered by the high-quality, open-access Earth observation data of our times and introduces novel machine learning and big data methods to exploit them in developing applications for sustainable and resilient agriculture. The thesis addresses three distinct thematic areas: the monitoring of the Common Agricultural Policy (CAP), the monitoring of food security, and applications for smart and resilient agriculture. The methodological innovations related to the three thematic areas address the following issues: i) the processing of big Earth Observation (EO) data, ii) the scarcity of annotated data for machine learning model training, and iii) the gap between machine learning outputs and actionable advice. This thesis demonstrates how big data technologies such as data cubes, distributed learning, linked open data and semantic enrichment can be used to exploit the data deluge and extract knowledge to address real user needs. Furthermore, it argues for the importance of semi-supervised and unsupervised machine learning models that circumvent the ever-present challenge of scarce annotations and thus allow for model generalization in space and time. Specifically, it is shown that only a few ground truth data are needed to generate high-quality crop type maps and crop phenology estimations. Finally, this thesis argues that there is a considerable distance in value between model inferences and decision making in real-world scenarios, and showcases the power of causal and interpretable machine learning in bridging this gap. Comment: PhD thesis

    An Approach to Designing Clusters for Large Data Processing

    Cloud computing is increasingly being adopted due to its cost savings and ability to scale. As data continues to grow rapidly, an increasing number of institutions are adopting non-standard SQL clusters to address the storage and processing demands of large data. However, evaluating and modelling non-SQL clusters presents many challenges. To address some of these challenges, this thesis proposes a methodology for designing and modelling large-scale processing configurations that respond to end-user requirements. First, goals are established for the big data cluster; in this thesis, we use performance and cost as our goals. Second, the data is transformed from a relational data schema to an appropriate HBase schema. Third, we iteratively deploy different clusters. We then model the clusters and evaluate different topologies (size of instances, number of instances, number of clusters, etc.). We use HBase as the large data processing cluster, and we evaluate our methodology on traffic data from a large city and on a distributed community cloud infrastructure.
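The second step of the methodology above, moving from a relational schema to an HBase schema, can be sketched roughly as follows. This is an illustration only: the column names, the composite row key, and the single column family "d" are hypothetical and do not come from the thesis, and no HBase client library is used.

```python
def to_hbase_put(row, key_fields, family="d"):
    """Map one relational row (a dict) to an HBase-style (row_key, cells) pair.

    key_fields: columns concatenated into the row key (hypothetical choice).
    family: a single column family name (assumed for illustration).
    """
    # Composite row key: key columns joined with a separator.
    row_key = "#".join(str(row[field]) for field in key_fields)
    # Remaining columns become "family:qualifier" -> value cells.
    cells = {
        f"{family}:{column}": str(value)
        for column, value in row.items()
        if column not in key_fields
    }
    return row_key, cells


# Hypothetical traffic record; field names are illustrative only.
record = {"sensor_id": "S12", "ts": "20150101T0800", "speed_kmh": 42, "vehicles": 17}
row_key, cells = to_hbase_put(record, key_fields=["sensor_id", "ts"])
print(row_key, cells)
```

Putting the sensor identifier first in the key keeps readings from one sensor adjacent in sorted order. Whether that suits a given workload depends on the access pattern, which is the kind of choice the methodology's goal-setting step would inform.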

    Estimating Groundnut Yield in Smallholder Agriculture Systems Using PlanetScope Data

    Crop yield is related to household food security and community resilience, especially in smallholder agricultural systems. As such, it is crucial to accurately estimate within-season yield in order to provide critical information for farm management and decision making. Therefore, the primary objective of this paper is to assess the most appropriate method, indices, and growth stage for predicting the groundnut yield in smallholder agricultural systems in northern Malawi. We have estimated the yield of groundnut in two smallholder farms using the observed yield and vegetation indices (VIs), which were derived from multitemporal PlanetScope satellite data. Simple linear, multiple linear (MLR), and random forest (RF) regressions were applied for the prediction. The leave-one-out cross-validation method was used to validate the models. The results showed that (i) of the modelling approaches, the RF model using the five most important variables (RF5) was the best approach for predicting the groundnut yield, with a coefficient of determination (R2) of 0.96 and a root mean square error (RMSE) of 0.29 kg/ha, followed by the MLR model (R2 = 0.84, RMSE = 0.84 kg/ha); in addition, (ii) the best within-season stage to accurately predict groundnut yield is during the R5/beginning seed stage. The RF5 model was used to estimate the yield for four different farms. The estimated yields were compared with the total reported yields from the farms. The results revealed that the RF5 model generally accurately estimated the groundnut yields, with the margins of error ranging between 0.85% and 11%. The errors are within the post-harvest loss margins in Malawi. The results indicate that the observed yield and VIs, which were derived from open-source remote sensing data, can be applied to estimate yield in order to facilitate farming and food security planning
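The leave-one-out validation mentioned in the abstract above can be sketched for the simple linear case. This is a minimal pure-Python illustration; the vegetation index and yield values are made up and are not data from the paper.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def leave_one_out(xs, ys):
    """Return (RMSE, R2) from leave-one-out predictions of a linear model."""
    predictions = []
    for i in range(len(xs)):
        # Fit on every sample except i, then predict the held-out sample.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        predictions.append(slope * xs[i] + intercept)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return sqrt(ss_res / len(ys)), 1.0 - ss_res / ss_tot


# Hypothetical plot-level vegetation index values and observed yields.
vi_values = [0.31, 0.42, 0.48, 0.55, 0.63, 0.70]
yields = [0.9, 1.3, 1.4, 1.8, 2.0, 2.4]
rmse, r2 = leave_one_out(vi_values, yields)
print(f"RMSE={rmse:.3f}, R2={r2:.3f}")
```

The same loop applies to multiple linear or random forest regressors by swapping the fitting step. With few plots, leave-one-out uses nearly all of the data for training in each fold.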

    Literature Review of Software Process Assessment Methodology ISO/IEC 15504

    An assessment method aimed at process improvement, adapted to small software companies and based on the ISO/IEC 15504 standard, is being developed. This article describes the design, development, validation and results of a Process Assessment Model for assessing technological and business competencies in software development. The model follows the ISO/IEC 15504 (SPICE) requirements for Process Assessment Models. A prime motivation for developing this standard has been the perceived need for an internationally recognized software process assessment framework that pulls together the existing public and proprietary models and methods. The assessment process has been adapted and refined in order to provide ready support. The method includes an adapted and enhanced assessment model based on the ISO 15504 exemplar model.

    Use of remote sensing to monitor dehesa vegetation and its influence on the water balance at basin scale

    The Mediterranean region is characterized by hot summers with long dry periods, a situation that may be exacerbated by the progressive global warming. In these water-limited environments where productivity of the ecosystems depends mainly on water availability, the reduction of freshwater resources can have severe consequences. An increase in aridity may lead to low productivity, land degradation and unwanted changes in land use. To reduce the vulnerability of Mediterranean landscapes it is important to improve our knowledge of the hydrological processes conditioning the water exchanges, with evapotranspiration (ET) being a key indicator of the state of ecosystems and playing a crucial role in the basin's water and energy balances. The goal of this dissertation is to improve our understanding of the evapotranspiration dynamics over Mediterranean heterogeneous and complex vegetation covers, with a focus on the dehesa ecosystem. The final aim is to contribute to the conservation of the water resources in these regions in the medium to long term, supporting the decision-making processes with quantitative, distributed, and high-quality information. To reach this goal, in this research the evaluation of remote sensing-based soil water balance (SWB) and surface energy balance (SEB) models was proposed to monitor the water consumption and water stress of typical Mediterranean vegetation at different spatial and temporal scales. In particular, the VI-ETo methodology (SWB) and the ALEXI/DisALEXI approach (SEB) have been adapted and applied. ET modeling using the VI-ETo scheme has been improved through the assessment of the vegetation layers' effective parameters. A data fusion algorithm was applied to the ET maps produced by the SEB model over the dehesa ecosystem, and we analyzed the opportunities that this high-resolution ET product in time and space can provide for water and vegetation resource management. 
The results have demonstrated the feasibility of both approaches (SWB and SEB models) to accurately monitor ET dynamics over the dehesa landscape, adequately reproducing the annual bimodal behavior and the response of the vegetation in periods of water deficit. The error obtained using the SWB approach (the VI-ETo method) was RMSE = 0.47 mm day⁻¹, both over the whole dehesa system (grass + trees) and over an open grassland. The monitoring of water stress for both systems, which differ in canopy structure, using the ET/ETo ratio and the stress coefficient (Ks) as proxies, was successful. Improvements to the specific spectral properties of oak trees and layer-specific parameters were incorporated into the modeling. We also analyzed the influence of the spectral properties of oak trees and another typical Mediterranean tree canopy, the olive orchard, in the VI-ETo model. We found that the use of appropriate values of the parameter SAVImax (0.51 for oak trees and 0.57 for olive trees) had notable implications for the computation of ET and water stress, in contrast to using a generic value for Mediterranean crops (SAVImax = 0.75). The accuracy of this water balance-based approach was also evaluated over two heterogeneous Mediterranean basins, with a mosaic of holm oaks and grasslands, shrubs, coniferous plantations, and irrigated horticultural crops. The annual discharge flows of both watersheds, which were determined from the modeled ET data using a simple surface water balance, were very similar to those obtained with the HBV hydrological model, and to the values measured at the outlet of one of the basins, corroborating the usefulness of the VI-ETo methodology for these vegetation types. On the other hand, the resulting ET series (30 m, daily) derived with the SEB approach (ALEXI/DisALEXI method) and the STARFM fusion algorithm provided an RMSE value of 0.67 mm day⁻¹, which was considered an acceptable error for management purposes. 
This error was slightly lower than that obtained with simpler interpolation methods, probably due to the high temporal frequency and better spatial representation of the flux tower footprint in the fused time series. The analysis of ET patterns over the small heterogeneous vegetated patches that form the dehesa structure revealed the importance of having fine-resolution information at field scale to distinguish the water consumed by the different vegetation components, which influences the provision of many ecosystem services. For example, it was key for identifying phenology dates of grasslands, and for understanding the hydrological functioning of dense evergreen riverside vegetation with high ET rates throughout the year, in contrast with the herbaceous areas. Accurately modeling these different behaviors of dehesa microclimates is useful to support farmers' management and to provide recommendations tailored to each structural component and its requirements.
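The sensitivity to SAVImax reported in the abstract above can be illustrated with a rough sketch. It assumes a linear scaling between SAVI and the basal crop coefficient Kcb and the FAO-56 dual crop coefficient form ET = (Kcb · Ks + Ke) · ETo; the abstract does not give the exact formulation used in the thesis, and the SAVImin, Kcb,max, SAVI, and ETo values below are hypothetical. Only the SAVImax values 0.51 and 0.75 come from the abstract.

```python
def basal_crop_coefficient(savi, savi_max, savi_min=0.1, kcb_max=1.0):
    """Kcb from SAVI by linear scaling between savi_min and savi_max.

    savi_min and kcb_max are placeholder values for illustration only.
    """
    fraction = (savi - savi_min) / (savi_max - savi_min)
    fraction = min(max(fraction, 0.0), 1.0)  # clamp to the valid range
    return kcb_max * fraction


def evapotranspiration(eto, kcb, ks=1.0, ke=0.0):
    """Dual crop coefficient form: ET = (Kcb * Ks + Ke) * ETo, in mm/day."""
    return (kcb * ks + ke) * eto


# Same hypothetical canopy SAVI, two choices of SAVImax from the abstract.
observed_savi = 0.30
reference_et = 5.0  # hypothetical ETo in mm/day
for savi_max in (0.51, 0.75):
    kcb = basal_crop_coefficient(observed_savi, savi_max)
    print(savi_max, round(evapotranspiration(reference_et, kcb), 2))
```

Under these assumptions, the generic SAVImax of 0.75 gives a lower Kcb, and therefore a lower ET, than the oak-specific 0.51 for the same observed SAVI. That is consistent with the abstract's remark that the parameter choice has notable implications.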