
    Machine Learning Approaches for Natural Resource Data

    Abstract: Real-life applications involving efficient management of natural resources depend on accurate geographical information. This information is usually obtained by manual on-site data collection, by automatic remote sensing methods, or by a mixture of the two. Besides accurate data collection, natural resource management also requires detailed analysis of this data, which in the era of the data flood can be a cumbersome process. With the rising trend in both computational power and storage capacity, together with falling hardware prices, data-driven decision analysis has an ever greater role. In this thesis, we examine the predictability of terrain trafficability conditions and forest attributes by using a machine learning approach with geographic information system data. Quantitative measures of the prediction performance of terrain conditions using natural resource data sets are given for five distinct research areas located around Finland. Furthermore, the estimation capability of key forest attributes is inspected with a multitude of modeling and feature selection techniques. The research results provide empirical evidence on whether the natural resource data used is sufficiently accurate for practical applications, or whether further refinement of the data is needed. The results are important especially to the forest industry, since even slight improvements to the natural resource data sets used in practice can result in large savings in operation time and costs. Model evaluation is also addressed in this thesis by proposing a novel method for estimating the prediction performance of spatial models. Classical goodness-of-fit measures usually rely on the assumption of independently and identically distributed data samples, an assumption that normally does not hold for spatial data sets. 
Spatio-temporal data sets contain an intrinsic property called spatial autocorrelation, which is partly responsible for breaking these assumptions. The proposed cross-validation-based evaluation method provides model performance estimation where optimistic bias due to spatial autocorrelation is decreased by partitioning the data sets in a suitable way. Keywords: Open natural resource data, machine learning, model evaluation
Tiivistelmä (Finnish abstract, translated): Practical applications involving natural resource management depend on accurate geospatial data. This data is often collected manually on site, by automatic remote sensing methods, or by a combination of the two. Besides accurate data collection, natural resource management also requires detailed analysis of the data, which can be a demanding process in the era of the data flood. With growing computational power and storage capacity and falling hardware prices, data-driven decision making plays an ever greater role. This dissertation studies the predictability of terrain trafficability and forest attributes using machine learning methods with geospatial data. The prediction of terrain trafficability is measured quantitatively using remote sensing data from five research areas around Finland. We also examine the predictability of the most important forest attributes with many different modeling techniques and with feature selection. The results provide empirical evidence on whether the natural resource data used is of sufficient quality for practical applications. The results are important especially to the forest industry, because even small improvements to the natural resource data used in practical applications can lead to large savings in both operation time and costs. This work also addresses model evaluation by presenting a new method for estimating the prediction performance of spatial models. Classical model selection criteria usually rely on the assumption of independent and identically distributed data samples, which mostly does not hold for spatial data sets. Spatio-temporal data sets contain an intrinsic property called spatial autocorrelation, which is partly responsible for violating these assumptions. The presented cross-validation-based evaluation method provides a measure of model prediction performance in which the effect of spatial autocorrelation is reduced by partitioning the data sets in a suitable way.
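
The abstract above describes reducing the optimistic bias caused by spatial autocorrelation by partitioning the data in a suitable way before cross validation. The listing does not specify the thesis's actual partitioning scheme, so the following is only a minimal sketch of one common alternative, grid-block partitioning, in which nearby points always share a fold. The function names `spatial_block_folds` and `spatial_splits`, the `block_size` parameter, and the sample coordinates are all hypothetical.

```python
import math
from collections import defaultdict


def spatial_block_folds(coords, block_size, n_folds):
    """Assign point indices to folds so that all points in the same
    square grid block (side length block_size) land in the same fold.

    This is an illustrative assumption, not the method from the thesis.
    """
    blocks = defaultdict(list)
    for idx, (x, y) in enumerate(coords):
        key = (math.floor(x / block_size), math.floor(y / block_size))
        blocks[key].append(idx)

    folds = [[] for _ in range(n_folds)]
    # Deal the blocks out round-robin in a deterministic (sorted) order.
    for i, key in enumerate(sorted(blocks)):
        folds[i % n_folds].extend(blocks[key])
    return folds


def spatial_splits(coords, block_size, n_folds):
    """Yield (train_indices, test_indices) pairs, one per non-empty fold."""
    folds = spatial_block_folds(coords, block_size, n_folds)
    all_indices = set(range(len(coords)))
    for test in folds:
        if test:
            yield sorted(all_indices - set(test)), sorted(test)


if __name__ == "__main__":
    # Hypothetical coordinates in arbitrary map units.
    points = [(0.5, 0.5), (0.6, 0.4), (10.2, 0.1),
              (10.9, 0.3), (0.1, 10.5), (10.5, 10.5)]
    for train, test in spatial_splits(points, block_size=10.0, n_folds=2):
        print("train:", train, "test:", test)
```

Compared with a random split, neighbouring points here never straddle the train/test boundary inside a block, which is the property that limits leakage through spatial autocorrelation. Points near block edges can still be close to points in another fold, so a buffer zone around each test block may be needed in practice.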

    Machine learned daily life history classification using low frequency tracking data and automated modelling pipelines: application to North American waterfowl

    Background: Identifying animal behaviors, life history states, and movement patterns is a prerequisite for many animal behavior analyses and for effective management of wildlife and habitats. Most approaches classify short-term movement patterns with high frequency location or accelerometry data. However, patterns reflecting life history across longer time scales can have greater relevance to species biology or management needs, especially when available in near real-time. Given limitations in collecting and using such data to accurately classify complex behaviors in the long term, we used hourly GPS data from 5 waterfowl species to produce daily activity classifications with machine-learned models using “automated modelling pipelines”. Methods: Automated pipelines are computer-generated code that completes many tasks, including feature engineering, multi-framework model development, training, validation, and hyperparameter tuning, to produce daily classifications from eight activity patterns reflecting waterfowl life history or movement states. We developed several input features for modeling, grouped into three broad categories, hereafter “feature sets”: GPS locations, habitat information, and movement history. Each feature set used different data sources or data collected across different time intervals to develop the “features” (independent variables) used in models. Results: Automated modelling pipelines rapidly developed easily reproducible data preprocessing and analysis steps, identified and optimized the best-performing model, and provided outputs for interpreting feature importance. Unequal expression of life history states caused unbalanced classes, so we evaluated feature set importance using a weighted F1-score to balance model recall and precision among individual classes. 
Although the best model using the least restrictive feature set (only 24 hourly relocations in a day) produced effective classifications (weighted F1 = 0.887), models using all feature sets performed substantially better (weighted F1 = 0.95), particularly for rarer but demographically more impactful life history states (i.e., nesting). Conclusions: Automated pipelines generated models producing highly accurate classifications of complex daily activity patterns using relatively low frequency GPS and incorporating more classes than previous GPS studies. Near real-time classification is possible which is ideal for time-sensitive needs such as identifying reproduction. Including habitat and longer sequences of spatial information produced more accurate classifications but incurred slight delays in processing
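
The weighted F1-score mentioned in the abstract above averages the per-class F1 values, weighting each class by its number of true samples. A minimal sketch follows; the function name `weighted_f1` and the two class labels in the example are hypothetical and do not reproduce the study's eight activity classes or its data.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n_true - tp
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        # Weight each class by its share of the true labels.
        score += (n_true / total) * f1
    return score


if __name__ == "__main__":
    # Hypothetical, deliberately unbalanced labels.
    truth = ["feeding"] * 4 + ["nesting"]
    predicted = ["feeding", "feeding", "feeding", "nesting", "nesting"]
    print(round(weighted_f1(truth, predicted), 4))
```

Because the weights follow class frequency, a rare class contributes little to the aggregate, which is why per-class scores are usually reported alongside it when the rare class matters most.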

    Proceedings of the First PhD Symposium on Sustainable Ultrascale Computing Systems (NESUS PhD 2016)

    Proceedings of the First PhD Symposium on Sustainable Ultrascale Computing Systems (NESUS PhD 2016), Timisoara, Romania, February 8-11, 2016. The PhD Symposium was a very good opportunity for young researchers to share information and knowledge, to present their current research, and to discuss topics with other students in order to look for synergies and common research topics. The idea was very successful, and the assessment made by the PhD students was very good. It also helped to achieve one of the major goals of the NESUS Action: to establish an open European research network targeting sustainable solutions for ultrascale computing, aiming at cross-fertilization among HPC, large-scale distributed systems, and big data management. The network also aims at training, at bringing together disparate researchers working across different areas, and at providing a meeting ground for researchers in these separate areas to exchange ideas, identify synergies, and pursue common activities in research topics such as sustainable software solutions (applications and system software stack), data management, energy efficiency, and resilience. European Cooperation in Science and Technology. COS

    Exploiting Multi-Level Parallelism in Streaming Applications for Heterogeneous Platforms with GPUs

    Heterogeneous computing platforms support the traditional types of parallelism, such as instruction-level, data, task, and pipeline parallelism, and provide the opportunity to exploit a combination of different types of parallelism at different platform levels. The architectural diversity of platform components makes tapping into the platform potential a challenging programming task. This thesis makes an important step in this direction by introducing a novel methodology for automatic generation of structured, multi-level parallel programs from sequential applications. We introduce a novel hierarchical intermediate program representation (HiPRDG) that captures the notions of structure and hierarchy in the polyhedral model used for compile-time program transformation and code generation. Using the HiPRDG as the starting point, we present a novel method for generation of multi-level programs (MLPs) featuring different types of parallelism, such as task, data, and pipeline parallelism. Moreover, we introduce concepts and techniques for data parallelism identification, GPU code generation, and asynchronous data-driven execution on heterogeneous platforms with efficient overlapping of host-accelerator communication and computation. By enabling the modular, hybrid parallelization of program model components via HiPRDG, this thesis opens the door for highly efficient tailor-made parallel program generation and auto-tuning for next generations of multi-level heterogeneous platforms with diverse accelerators. Computer Systems, Imagery and Medi

    Action-oriented Scene Understanding

    To allow robots to act autonomously, it is crucial that they not only describe their environment accurately but also identify how to interact with their surroundings. While we have witnessed tremendous progress in descriptive computer vision, approaches that explicitly target action are scarcer. This cumulative dissertation approaches the goal of interpreting visual scenes “in the wild” with respect to actions implied by the scene. We call this approach action-oriented scene understanding. It involves identifying and judging opportunities for interaction with constituents of the scene (e.g. objects and their parts) as well as understanding object functions and how interactions will impact the future. All of these aspects are addressed on three levels of abstraction: elements, perception and reasoning. On the elementary level, we investigate semantic and functional grouping of objects by analyzing annotated natural image scenes. We compare object label-based and visual context definitions with respect to their suitability for generating meaningful object class representations. Our findings suggest that representations generated from visual context are on par in terms of semantic quality with those generated from large quantities of text. The perceptive level concerns action identification. We propose a system to identify possible interactions for robots and humans with the environment (affordances) on a pixel level using state-of-the-art machine learning methods. Pixel-wise part annotations of images are transformed into 12 affordance maps. Using these maps, a convolutional neural network is trained to densely predict affordance maps from unknown RGB images. In contrast to previous work, this approach operates exclusively on RGB images during both training and testing, and yet achieves state-of-the-art performance. At the reasoning level, we extend the question from asking what actions are possible to what actions are plausible. 
For this, we gathered a dataset of household images associated with human ratings of the likelihoods of eight different actions. Based on the judgement provided by the human raters, we train convolutional neural networks to generate plausibility scores from unseen images. Furthermore, having considered only static scenes previously in this thesis, we propose a system that takes video input and predicts plausible future actions. Since this requires careful identification of relevant features in the video sequence, we analyze this particular aspect in detail using a synthetic dataset for several state-of-the-art video models. We identify feature learning as a major obstacle for anticipation in natural video data. The presented projects analyze the role of action in scene understanding from various angles and in multiple settings while highlighting the advantages of assuming an action-oriented perspective. We conclude that action-oriented scene understanding can augment classic computer vision in many real-life applications, in particular robotics

    Development of Deep Learning Hybrid Models for Hydrological Predictions

    The abstract is currently unavailable because the thesis is under embargo.

    Dynamic Generalisation of Continuous Action Spaces in Reinforcement Learning: A Neurally Inspired Approach

    Institute for Adaptive and Neural Computation. Award number: 98318242. This thesis is about the dynamic generalisation of continuous action spaces in reinforcement learning problems. The standard Reinforcement Learning (RL) account provides a principled and comprehensive means of optimising a scalar reward signal in a Markov Decision Process. However, the theory itself does not directly address the imperative issue of generalisation which naturally arises as a consequence of large or continuous state and action spaces. A current thrust of research is aimed at fusing the generalisation capabilities of supervised (and unsupervised) learning techniques with the RL theory. An example par excellence is Tesauro’s TD-Gammon. Although much effort has gone into researching ways to represent and generalise over the input space, much less attention has been paid to the action space. This thesis first considers the motivation for learning real-valued actions, and then proposes a set of key properties desirable in any candidate algorithm addressing generalisation of both input and action spaces. These properties include: provision of adaptive and online generalisation, adherence to the standard theory with a central focus on estimating expected reward, provision for real-valued states and actions, and full support for a real-valued discounted reward signal. Of particular interest are issues pertaining to robustness in non-stationary environments, scalability, and efficiency for real-time learning in applications such as robotics. Since exploring the action space is discovered to be a potentially costly process, the system should also be flexible enough to enable maximum reuse of learned actions. A new approach is proposed which succeeds for the first time in addressing all of the key issues identified. The algorithm, which is based on the ubiquitous self-organising map, is analysed and compared with other techniques including those based on the backpropagation algorithm. 
The investigation uncovers some important implications of the differences between these two particular approaches with respect to RL. In particular, the distributed representation of the multi-layer perceptron is judged to be something of a double-edged sword offering more sophisticated and more scalable generalising power, but potentially causing problems in dynamic or non-equiprobable environments, and tasks involving a highly varying input-output mapping. The thesis concludes that the self-organising map can be used in conjunction with current RL theory to provide real-time dynamic representation and generalisation of continuous action spaces. The proposed model is shown to be reliable in non-stationary, unpredictable and noisy environments and judged to be unique in addressing and satisfying a number of desirable properties identified as important to a large class of RL problems
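
The abstract above builds on the self-organising map but does not give the algorithm itself. For orientation only, here is a minimal sketch of a generic Kohonen-style update on a one-dimensional map, not the thesis's method: the unit closest to the input is found, and it and its map neighbours are moved toward the input. The function name `som_update` and the parameters `learning_rate` and `sigma` are hypothetical, as is the reading of each unit as a candidate real-valued action.

```python
import math


def som_update(weights, x, learning_rate=0.5, sigma=1.0):
    """One generic SOM step on a 1-D map.

    weights: list of weight vectors, ordered by map position.
    x: input vector of the same length as each weight vector.
    Returns (index of best-matching unit, updated weights).
    """
    def sq_dist(w):
        return sum((wj - xj) ** 2 for wj, xj in zip(w, x))

    # Best-matching unit: the unit whose weights are closest to x.
    bmu = min(range(len(weights)), key=lambda i: sq_dist(weights[i]))

    updated = []
    for i, w in enumerate(weights):
        # Gaussian neighbourhood: units near the BMU on the map move more.
        h = math.exp(-((i - bmu) ** 2) / (2 * sigma ** 2))
        updated.append([wj + learning_rate * h * (xj - wj)
                        for wj, xj in zip(w, x)])
    return bmu, updated


if __name__ == "__main__":
    # Hypothetical example: three units covering a scalar action range.
    units = [[0.0], [0.5], [1.0]]
    best, units = som_update(units, [0.9])
    print(best, units)
```

The local, neighbourhood-limited update is what distinguishes this representation from the distributed one of a multi-layer perceptron discussed in the abstract: an input changes only the units near the winner instead of every weight.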

    Machine Intelligence in Africa: a survey

    In the last 5 years, the availability of large audio datasets in African countries has opened unlimited opportunities to build machine intelligence (MI) technologies that are closer to the people and that speak, learn, understand, and do business in local languages, including for those who cannot read and write. Unfortunately, these audio datasets are not fully exploited by current MI tools, leaving many Africans out of MI business opportunities. Additionally, many state-of-the-art MI models are not culture-aware, and the ethics of their adoption indexes are questionable. This lack of cultural awareness is a major drawback in many applications in Africa. This paper summarizes recent developments in machine intelligence in Africa from a multi-layer, multiscale, and culture-aware ethics perspective, showcasing MI use cases in 54 African countries through 400 articles on MI research, industry, and government actions, as well as uses in art, music, the informal economy, and small businesses in Africa. The survey also opens discussions on the reliability of MI rankings and indexes in the African continent as well as algorithmic definitions of unclear terms used in MI. Comment: Accepted and to be presented at DSAI 202