
    Lessons for Data-Driven Modelling from Harmonics in the Norwegian Grid

    With the advancing integration of fluctuating renewables, a more dynamic demand side, and a grid running closer to its operational limits, future power system operators require new tools to anticipate unwanted events. Advances in machine learning and the availability of data suggest great potential in data-driven approaches, but these will only ever be as good as the data they are based on. To lay the groundwork for future data-driven modelling, we establish a baseline state by analysing the statistical distribution of voltage measurements from three sites in the Norwegian power grid (22, 66, and 300 kV). Measurements span four years, are line and phase voltages, are cycle-by-cycle, and include all (even and odd) harmonics up to the 96th order. They are based on four years of historical data from three Elspec Power Quality Analyzers (corresponding to one trillion samples), which we have extracted, processed, and analyzed. We find that: (i) the distribution of harmonics depends on phase and voltage level; (ii) there is little power beyond the 13th harmonic; (iii) there is temporal clumping of extreme values; and (iv) there is seasonality on different time-scales. For machine learning based modelling these findings suggest that: (i) models should be trained in two steps (first with data from all sites, then adapted to site level); (ii) including harmonics beyond the 13th is unlikely to increase model performance; and that modelling should include features that encode (iii) the state of the grid and (iv) seasonality. Keywords: machine learning; power systems; harmonic distortion; power quality
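To make finding (ii) concrete, the following is a minimal sketch of how harmonic amplitudes and the share of power above a given order could be computed from a single sampled cycle. It is not the authors' pipeline: the function names, the synthetic waveform, and the choice to include the fundamental in the total power are all assumptions made for illustration.

```python
import cmath
import math


def harmonic_amplitudes(samples, max_order):
    """Amplitudes of harmonics 1..max_order, assuming `samples` covers
    exactly one fundamental cycle and max_order < len(samples) / 2."""
    n = len(samples)
    amps = {}
    for k in range(1, max_order + 1):
        # Direct DFT bin k: correlate the cycle with a complex exponential.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        amps[k] = 2 * abs(coeff) / n
    return amps


def power_share_above(amps, order):
    """Fraction of total squared amplitude carried by harmonics above `order`.
    The total includes the fundamental (an assumption of this sketch)."""
    total = sum(a * a for a in amps.values())
    high = sum(a * a for k, a in amps.items() if k > order)
    return high / total


# Synthetic cycle (illustrative values only, not measured data):
# fundamental plus small 5th, 7th and 17th harmonics.
N = 256
cycle = [
    math.sin(2 * math.pi * i / N)
    + 0.03 * math.sin(2 * math.pi * 5 * i / N)
    + 0.02 * math.sin(2 * math.pi * 7 * i / N)
    + 0.001 * math.sin(2 * math.pi * 17 * i / N)
    for i in range(N)
]

amps = harmonic_amplitudes(cycle, 96)
share = power_share_above(amps, 13)
```

On real measurements the same share would be evaluated per cycle and per phase, which is where the dependence on phase and voltage level reported in finding (i) would show up.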

    Modelling maritime pine (Pinus pinaster Aiton) spatial distribution and productivity in Portugal: tools for forest management.

    Research Highlights: Modelling species’ distribution and productivity is key to support integrated landscape planning, species’ afforestation, and sustainable forest management. Background and Objectives: Maritime pine (Pinus pinaster Aiton) forests in Portugal were lately affected by wildfires and measures to overcome this situation are needed. The aims of this study were: (1) to model species’ spatial distribution and productivity using a machine learning (ML) regression approach to produce current species’ distribution and productivity maps; (2) to model the species’ spatial productivity using a stochastic sequential simulation approach to produce the species’ current productivity map; (3) to produce the species’ potential distribution map, by using a ML classification approach to define species’ ecological envelope thresholds; and (4) to identify present and future key factors for the species’ afforestation and management. Materials and Methods: Spatial land cover/land use data, inventory, and environmental data (climate, topography, and soil) were used in a coupled ML regression and stochastic sequential simulation approaches to model species’ current and potential distributions and productivity. Results: Maritime pine spatial distribution modelling by the ML approach provided 69% fitting efficiency, while species productivity modelling achieved only 43%. The species’ potential area covered 60% of the country’s area, where 78% of the species’ forest inventory plots (1995) were found. The change in the Maritime pine stands’ age structure observed in the last decades is causing the species’ recovery by natural regeneration to be at risk. Conclusions: The maps produced allow for best site identification for species afforestation, wood production regulation support, landscape planning considering species’ diversity, and fire hazard mitigation. 
These maps were obtained by modelling using environmental covariates, such as climate attributes, so their projection in future climate change scenarios can be performed.

    Supporting systematic reviews using LDA-based document representations

    BACKGROUND: Identifying relevant studies for inclusion in a systematic review (i.e. screening) is a complex, laborious and expensive task. Recently, a number of studies have shown that the use of machine learning and text mining methods to automatically identify relevant studies has the potential to drastically decrease the workload involved in the screening phase. The vast majority of these machine learning methods exploit the same underlying principle, i.e. a study is modelled as a bag-of-words (BOW). METHODS: We explore the use of topic modelling methods to derive a more informative representation of studies. We apply Latent Dirichlet Allocation (LDA), an unsupervised topic modelling approach, to automatically identify topics in a collection of studies. We then represent each study as a distribution of LDA topics. Additionally, we enrich topics derived using LDA with multi-word terms identified by using an automatic term recognition (ATR) tool. For evaluation purposes, we carry out automatic identification of relevant studies using support vector machine (SVM)-based classifiers that employ both our novel topic-based representation and the BOW representation. RESULTS: Our results show that the SVM classifier is able to identify a greater number of relevant studies when using the LDA representation than when using the BOW representation. These observations hold for two systematic reviews of the clinical domain and three reviews of the social science domain. CONCLUSIONS: A topic-based feature representation of documents outperforms the BOW representation when applied to the task of automatic citation screening. The proposed term-enriched topics are more informative and less ambiguous to systematic reviewers. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13643-015-0117-0) contains supplementary material, which is available to authorized users.
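The step of representing each study as a distribution of LDA topics can be sketched as follows. The abstract does not say how LDA was fitted; this example assumes collapsed Gibbs sampling, and the function name `lda_gibbs`, the hyperparameters, and the toy token lists are all hypothetical. The resulting vectors are what a downstream classifier such as an SVM would consume in place of BOW counts (the classifier itself is not shown).

```python
import random


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative sketch).
    `docs` is a list of token lists; returns one topic distribution per doc."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]             # tokens of doc d in topic k
    topic_word = [[0] * n_words for _ in range(n_topics)]  # word w assigned to topic k
    topic_total = [0] * n_topics                           # all tokens in topic k
    assign = []                                            # topic of every token

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assign.append(zs)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                z = assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][wid] -= 1
                topic_total[z] -= 1
                # Conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][wid] + beta)
                    / (topic_total[k] + n_words * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][wid] += 1
                topic_total[z] += 1

    # Smoothed per-document topic proportions: the feature vectors.
    return [
        [(doc_topic[d][k] + alpha) / (len(doc) + n_topics * alpha)
         for k in range(n_topics)]
        for d, doc in enumerate(docs)
    ]


# Toy, hand-made token lists standing in for tokenised abstracts.
toy_docs = [
    "trial patients treatment randomised trial".split(),
    "patients treatment outcome trial".split(),
    "school students teaching intervention school".split(),
    "students teaching classroom intervention".split(),
]
features = lda_gibbs(toy_docs, n_topics=2)
```

Each row of `features` has a fixed length equal to the number of topics, so the representation is far lower-dimensional than a vocabulary-sized BOW vector.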

    Deep learning for supervised classification of temporal data in ecology

    Temporal data are ubiquitous in ecology, and ecologists often face the challenge of accurately differentiating these data into predefined classes, such as biological entities or ecological states. The usual approach consists of transforming the time series into user-defined features and then using these features as predictors in conventional statistical or machine learning models. Here we suggest the use of deep learning models as an alternative to this approach. Recent deep learning techniques can perform the classification directly from the time series, eliminating subjective and resource-consuming data transformation steps, and potentially improving classification results. We describe some of the deep learning architectures relevant for time series classification and show how these architectures and their hyper-parameters can be tested and used for the classification problems at hand. We illustrate the approach using three case studies from distinct ecological subdisciplines: (i) insect species identification from wingbeat spectrograms; (ii) species distribution modelling from climate time series; and (iii) the classification of phenological phases from continuous meteorological data. The deep learning approach delivered ecologically sensible and accurate classifications, demonstrating its potential for wide applicability across subfields of ecology.

    Machine learning stochastic design models.

    Due to the fluid nature of the early stages of the design process, it is difficult to obtain deterministic product design evaluations. This is primarily due to the flexibility of the design at this stage, namely that there can be multiple interpretations of a single design concept. However, it is important for designers to understand how these design concepts are likely to fulfil the original specification, thus enabling the designer to select or bias towards solutions with favourable outcomes. One approach is to create a stochastic model of the design domain. This paper tackles the issues of using a product database to induce a Bayesian model that represents the relationships between the design parameters and characteristics. A greedy learning algorithm is presented and illustrated using a simple case study.