    Predicting large scale fine grain energy consumption

    Today, a large volume of energy-related data is continuously collected. Extracting actionable knowledge from such data is a multi-step process that raises a variety of novel research issues across two domains: energy and computer science. The aim on the computer science side is to provide energy scientists with cutting-edge, scalable engines that effectively support their daily research activities. This paper presents SPEC, a scalable and distributed predictor of fine grain energy consumption in buildings. SPEC applies a data stream analysis methodology over a sliding time window to train a prediction model tailored to each building. The building model is then used to predict the upcoming energy consumption at a time instant in the near future. SPEC currently integrates artificial neural networks and the random forest regression algorithm. The SPEC methodology exploits the computational advantages of distributed computing frameworks; the current implementation runs on Spark. As a case study, real thermal energy consumption data collected in a major city were used to preliminarily assess the accuracy of SPEC. The initial results are promising and represent a first step towards predicting fine grain energy consumption over a sliding time window.
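The sliding-window idea in this abstract can be illustrated with a minimal sketch. This is not the SPEC implementation: a plain least-squares trend line stands in for the neural network and random forest models, nothing is distributed, and the class name `SlidingWindowPredictor`, the `models` dictionary, and the building identifier are hypothetical.

```python
from collections import deque


class SlidingWindowPredictor:
    """Keeps the most recent readings and extrapolates one step ahead.

    Assumption: a least-squares trend line replaces the models named
    in the abstract (neural networks, random forest regression).
    """

    def __init__(self, window_size):
        # The deque silently drops the oldest reading once it is full.
        self.window = deque(maxlen=window_size)

    def update(self, value):
        self.window.append(float(value))

    def predict_next(self):
        n = len(self.window)
        if n == 0:
            raise ValueError("no readings in the window yet")
        if n == 1:
            return self.window[0]
        # Fit y = mean_y + slope * (x - mean_x) on x = 0 .. n-1.
        mean_x = (n - 1) / 2
        mean_y = sum(self.window) / n
        sxx = sum((x - mean_x) ** 2 for x in range(n))
        sxy = sum((x - mean_x) * (y - mean_y)
                  for x, y in enumerate(self.window))
        slope = sxy / sxx
        # Evaluate the fitted line at the next time index, x = n.
        return mean_y + slope * (n - mean_x)


# One model per building, keyed by a hypothetical building identifier.
models = {"building-A": SlidingWindowPredictor(window_size=3)}
for reading in [1, 2, 3, 4, 5]:
    models["building-A"].update(reading)
forecast = models["building-A"].predict_next()
```

Keeping one small model per building mirrors the per-building tailoring described above; in a distributed setting each model could be trained independently, which is an assumption about how such a design would parallelise rather than a description of SPEC.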

    Towards a Repository of Bx Examples

    We argue for the creation of a curated repository of examples of bidirectional transformations (bx). In particular, such a resource may support research on bx, especially cross-fertilisation between the different communities involved. We have initiated a bx repository, which is introduced in this paper. We discuss our design decisions and their rationale, and illustrate them using the now classic Composers example. We discuss the difficulties that this undertaking may face, and comment on how they may be overcome.

    Finite Automata Algorithms in Map-Reduce

    In this thesis, the intersection of several large nondeterministic finite automata (NFAs) and the minimization of a large deterministic finite automaton (DFA) in map-reduce are studied. We derive a lower bound on the replication rate for computing NFA intersections and provide three concrete algorithms for the problem. Our investigation of the replication rate of each of the three algorithms shows, through detailed experiments on large datasets of finite automata, where each algorithm could be applied. Denoting by n the number of states of a DFA A, we propose an algorithm that minimizes A in n map-reduce rounds in the worst case. Our experiments, however, indicate that the number of rounds is, in practice, much smaller than n for all DFAs we examined. In other words, this algorithm converges in d iterations by computing the equivalence classes of each state, where d is the diameter of the input DFA.
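The round-by-round computation of state equivalence classes can be sketched sequentially. The code below is a Moore-style partition refinement on a single machine, not the map-reduce algorithm of the thesis; each loop iteration only loosely corresponds to one round, and the function name `minimize_rounds` and the example automaton are hypothetical.

```python
def minimize_rounds(states, alphabet, delta, accepting):
    """Refine state equivalence classes until they stop changing.

    delta maps (state, symbol) to the next state and must be total.
    Returns (class id per state, number of rounds that split a class).
    """
    # Initial partition: accepting versus non-accepting states.
    block = {s: int(s in accepting) for s in states}
    rounds = 0
    while True:
        ids = {}
        new_block = {}
        for s in states:
            # Signature: own class plus the class reached on each symbol.
            sig = (block[s],) + tuple(block[delta[(s, a)]] for a in alphabet)
            new_block[s] = ids.setdefault(sig, len(ids))
        # Classes can only split, so an equal count means a fixed point.
        if len(ids) == len(set(block.values())):
            return new_block, rounds
        block = new_block
        rounds += 1


# Hypothetical 5-state DFA over {"a"}: states 2 and 4 behave identically.
states = [0, 1, 2, 3, 4]
alphabet = ["a"]
delta = {(0, "a"): 1, (1, "a"): 2, (2, "a"): 3, (3, "a"): 3, (4, "a"): 3}
classes, rounds = minimize_rounds(states, alphabet, delta, accepting={3})
```

On this example the refinement merges states 2 and 4 and stops after two splitting rounds, well below the number of states, which is in the spirit of the observation that practical round counts are small.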

    GeoTriples: Transforming geospatial data into RDF graphs using R2RML and RML mappings

    A lot of geospatial data has recently become available at no charge in many countries. Geospatial data currently made available by government agencies usually does not follow the linked data paradigm. In the few cases where government agencies do follow the linked data paradigm (e.g., Ordnance Survey in the United Kingdom), specialized scripts have been used to transform geospatial data into RDF. In this paper we present the open source tool GeoTriples, which generates and processes extended R2RML and RML mappings that transform geospatial data from many input formats into RDF. GeoTriples allows the transformation of geospatial data stored in raw files (shapefiles, CSV, KML, XML, GML and GeoJSON) and in spatially-enabled RDBMS (PostGIS and MonetDB) into RDF graphs using well-known vocabularies such as GeoSPARQL and stSPARQL, without being tightly coupled to a specific vocabulary. GeoTriples has been developed in the European projects LEO and Melodies and has been used to transform many geospatial data sources into linked data. We study the performance of GeoTriples experimentally using large publicly available geospatial datasets, and show that GeoTriples is very efficient and scalable, especially when its mapping processor is implemented using Apache Hadoop.
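To make the kind of transformation described here concrete, the following sketch turns CSV rows with point coordinates into N-Triples lines using the GeoSPARQL `hasGeometry` and `asWKT` properties. It does not use GeoTriples, R2RML, or RML: the mapping is hard-coded, and the function name `rows_to_ntriples`, the `http://example.org/` base IRI, and the column names `id`, `lon`, `lat` are assumptions made for illustration.

```python
import csv
import io

GEO = "http://www.opengis.net/ont/geosparql#"  # GeoSPARQL namespace
BASE = "http://example.org/"  # hypothetical base IRI for minted resources


def rows_to_ntriples(csv_text):
    """Map each CSV row (assumed columns: id, lon, lat) to two triples."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        feature = f"<{BASE}feature/{row['id']}>"
        geometry = f"<{BASE}geometry/{row['id']}>"
        # WKT point literal built from the coordinate columns.
        wkt = f"POINT({row['lon']} {row['lat']})"
        triples.append(f"{feature} <{GEO}hasGeometry> {geometry} .")
        triples.append(
            f'{geometry} <{GEO}asWKT> "{wkt}"^^<{GEO}wktLiteral> .'
        )
    return triples


sample = "id,lon,lat\n1,23.7,37.9\n"
lines = rows_to_ntriples(sample)
```

A mapping language such as R2RML or RML expresses this row-to-triple correspondence declaratively instead of in code, which is what lets a generic mapping processor handle many input formats.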

    A Survey on User Interaction with Linked Data

    Since the beginning of the Semantic Web and the coining of the term Linked Data in 2006, more than one thousand datasets with over sixteen thousand links have been published to the Linked Open Data Cloud. This rising interest is fuelled by the benefits that semantically annotated and machine-readable information can have in many systems. Alongside this growth we also observe a rise in humans creating and consuming Linked Data, and with it the opportunity to study and develop guidelines for tackling the new user interaction problems that arise. To gather information on current solutions for modelling user interaction in these applications, we surveyed the interaction techniques provided by state-of-the-art Linked Data tools and applications developed for users with no experience with Semantic Web technologies. The 18 tools reviewed are described and compared according to the interaction features provided, the techniques used for visualising one instance and a set of instances, the search solutions implemented, and the methods used to evaluate the proposed interaction solutions. From this review, we can conclude that researchers have started to deviate from more traditional visualisation techniques, like graph visualisations, when developing for lay users. This shows a current effort in developing Semantic Web tools to be used by lay users and motivates the documentation and formalisation of the solutions encountered in the studied tools. Copyright (c) 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).