34 research outputs found

    Predictive modeling of PV energy production: How to set up the learning task for a better prediction?

    In this paper, we tackle the problem of predicting the power output of several photovoltaic (PV) plants spread over an extended geographic area and connected to a power grid. The paper is intended to be a comprehensive study of one-day-ahead forecasting of PV energy production along several dimensions of analysis: i) the consideration of spatio-temporal autocorrelation, which characterizes geophysical phenomena, to obtain more accurate predictions; ii) the learning setting to be considered, i.e., simple output prediction for each hour or structured output prediction for each day; iii) the learning algorithms: we compare artificial neural networks, most often used for PV forecasting, with regression trees for learning adaptive models. The results obtained on two PV power plant datasets show that taking spatio-temporal autocorrelation into account is beneficial, that the structured output prediction setting significantly outperforms the non-structured output prediction setting, and that regression trees provide better models than artificial neural networks.
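
    The two learning settings named in this abstract differ mainly in how training examples are assembled. The sketch below is a minimal illustration, not the authors' pipeline: the function names and the input layout (one pair of per-hour feature tuples and per-hour production values for each day) are assumptions made for this example.

```python
def hourly_examples(days):
    """Per-hour setting: one example per (day, hour), with a scalar target.

    `days` is a list of (features_by_hour, production_by_hour) pairs;
    this layout is an assumption made for illustration.
    """
    examples = []
    for features_by_hour, production_by_hour in days:
        for hour, (x, y) in enumerate(zip(features_by_hour, production_by_hour)):
            # The hour index is prepended so the model can tell hours apart.
            examples.append(((hour, *x), y))
    return examples


def daily_examples(days):
    """Structured setting: one example per day, with a vector target."""
    examples = []
    for features_by_hour, production_by_hour in days:
        # All hourly features are flattened into one descriptive vector.
        flat = tuple(v for x in features_by_hour for v in x)
        examples.append((flat, tuple(production_by_hour)))
    return examples
```

    In the first setting a model predicts one number at a time; in the second it predicts the whole daily production curve at once, which lets it exploit dependencies between hours.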

    Sensor networks and data streams: Basics

    Recent advances in pervasive computing and sensor technologies have significantly influenced the field of geosciences by changing the type of dynamic environmental phenomena that can be detected, monitored, and reacted to. Another important aspect is the real-time data delivery of novel platforms. In this chapter, we describe the specific characteristics of sensor data and sensor networks. Furthermore, we identify the most promising streaming models, which can be embedded in intelligent sensor platforms and used to mine real-time data for a variety of analytical insights.

    Sensor data surveillance

    A growing volume of geodata requires appropriate data management systems, which ensure data acquisition and memory-preserving storage as well as continuous surveillance of this unbounded amount of georeferenced data. Trend cluster discovery, as a spatiotemporal aggregate operator, may play a crucial role in the surveillance of sensor data. We describe a computation-preserving algorithm, which employs an incremental learning strategy to continuously maintain sliding window trend clusters across a sensor network. The analysis of the trend clusters discovered in consecutive sliding windows is useful for detecting possible changes in the data, as well as for producing forecasts.

    Mapping web pages to database records via link paths

    In this paper we propose a new knowledge management task which aims to map Web pages to their corresponding records in a structured database. For example, the DBLP database contains records for many computer scientists, and most of these persons have public Web pages; if we can map the database record to the appropriate Web page, then the new information could be used to further describe the person’s database record. To accomplish this goal we employ link paths, which contain anchor texts from multiple paths through the Web ending at the Web page in question. We hypothesize that the information from these link paths can be used to generate an accurate Web page to database record mapping. Experiments on two large, real-world data sets, with DBLP and IMDB as the structured data and computer science faculty members’ Web pages and official movie homepages as the Web page data, show that our method does provide an accurate mapping. Finally, we conclude by issuing a call for further research on this promising new task.

    Sensor data analysis applications

    A photovoltaic (PV) plant is a power station which converts sunlight into electric energy. In the last decade, PV plants have become ubiquitous in several countries of the European Union, due to a policy of economic incentives (e.g., feed-in tariffs). Today, this ubiquity of PV plants has paved the way to the marketing of new smart systems, designed to monitor the energy production of a PV plant grid and supply intelligent services for customer and production applications. In this chapter, we start moving in this direction by addressing the urgent demand of PV customers and PV companies for knowledge-based management and monitoring services, integrated within a PV plant network. In particular, we illustrate a business intelligence solution developed to monitor the efficiency of the energy production of PV plants and a data mining solution for fault diagnosis in PV plants.

    Missing sensor data interpolation

    Ubiquitous sensor stations continuously measure several geophysical variables over large zones and long (potentially unbounded) periods of time. However, observations cannot cover every location or every point in time. Interpolation, i.e., the estimation of unknown data at each location or time of interest, can be used to supplement station records. Although in GIScience there has been a tendency to treat space and time separately, there is now great interest in analyzing data in both domains. This suggests that integrating space and time would yield better results than treating them separately when interpolating several geophysical fields. This chapter contributes to the investigation of spatiotemporal interpolators in a remote-sensing scenario. We describe two interpolation techniques, which use trend clusters to interpolate missing data. The former performs the estimation phase by using the Inverse Distance Weighting approach, while the latter uses Kriging. Both have been adapted to a sensor network scenario. The proposed techniques have been evaluated on a large air-climate sensor network. The empirical study compares the accuracy and efficiency of both techniques.
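
    For readers unfamiliar with Inverse Distance Weighting, the sketch below shows the textbook form of the estimator on plain 2D coordinates. It is not the trend-cluster adaptation described in the chapter; the function name, the input layout, and the default exponent of 2 are assumptions made for this example.

```python
import math


def idw_estimate(known, target, power=2.0):
    """Estimate the value at `target` from known observations.

    `known` is a list of ((x, y), value) pairs and `target` is an (x, y)
    pair. Each observation is weighted by 1 / distance**power.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in known:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            # The target coincides with a station: return its reading as is.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("at least one known observation is required")
    return weighted_sum / weight_total
```

    Nearby stations dominate the estimate, and a larger `power` makes that dominance stronger.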

    Geodata stream summarization

    The management of massive amounts of geodata collected by sensor networks creates several challenges, including the real-time application of summarization techniques, which should allow this unbounded volume of georeferenced and timestamped data to be stored in a server with limited memory for any future query. SUMATRA is a summarization technique which accounts for the spatial and temporal information of sensor data to produce an appropriate trade-off between the size and the accuracy of the geodata summary. It uses the count-based model to process the stream. In particular, it segments the stream into windows, computes summaries window by window, and stores these summaries in a database. Trend clusters are discovered as a summary of each window. They are clusters of georeferenced data which vary according to a similar trend along the time horizon of the window. Signal compression techniques are also considered to derive a compact representation of these trends for storage in the database. The empirical analysis of trend clusters helps assess the summarization capability, the accuracy, and the efficiency of the trend cluster-based summarization schema in real applications. Finally, a stream cube, called geo-trend stream cube, is defined. It uses trends to aggregate a numeric measure, which is streamed by a sensor network and is organized around space and time dimensions. Space-time roll-up and drill-down operators allow the exploration of trends from a coarse-grained and inner-grained hierarchical view.
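
    The count-based processing loop described in this abstract (segment the stream into windows, summarize each window, store the summary) can be sketched as follows. This is an illustration only: a per-sensor mean stands in for trend cluster discovery, and the function names, the snapshot layout (a dict from sensor id to reading), and the choice to drop a trailing partial window are all assumptions.

```python
def count_based_windows(stream, window_size):
    """Yield consecutive windows of `window_size` snapshots from `stream`.

    A trailing window with fewer snapshots is dropped (an assumption of
    this sketch).
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    window = []
    for snapshot in stream:
        window.append(snapshot)
        if len(window) == window_size:
            yield window
            window = []


def summarize_window(window):
    """Stand-in summary: the mean reading of each sensor over the window."""
    totals = {}
    counts = {}
    for snapshot in window:
        for sensor_id, reading in snapshot.items():
            totals[sensor_id] = totals.get(sensor_id, 0.0) + reading
            counts[sensor_id] = counts.get(sensor_id, 0) + 1
    return {sensor_id: totals[sensor_id] / counts[sensor_id] for sensor_id in totals}


def summarize_stream(stream, window_size):
    """Summarize window by window; the returned list stands in for the database."""
    return [summarize_window(w) for w in count_based_windows(stream, window_size)]
```

    Only the summaries are retained, so memory use grows with the number of windows rather than with the number of raw readings.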

    Automatic generation of sitemaps based on navigation systems

    In this paper we present a method to automatically discover sitemaps from websites. Given a website, existing automatic solutions extract only a flat list of URLs that does not show the hierarchical structure of its content. Manual approaches, performed by webmasters, extract deeper sitemaps than automatic methods do. However, in many cases, partly because of the natural evolution of a website's content, the generated sitemaps do not reflect the actual content and soon become unhelpful and confusing for users. We propose a different approach that is both automatic and effective. Our solution combines an algorithm that extracts frequent patterns from the navigation systems (e.g., menu, nav-bar, content list) contained in a website with a hierarchy extraction algorithm able to discover rich hierarchies that unveil relationships among web pages (e.g., super-topic/sub-topic relationships). Experimental results show that our approach discovers high-quality sitemaps that have a deep hierarchy and are complete in the extracted URLs.