Smart Building Data Collection and Ventilation System Energy Prediction
Data has the potential to transform our environments for the better if utilized to its full
potential. A highly interesting use case of data is in relation to Smart Buildings, where
IoT technology presents new possibilities. With appropriate collection and structuring
of the available data, many new opportunities present themselves.
In this thesis, a data gathering system is proposed for sensors in Arkivenes Hus. To
illustrate the potential in the data, one specific problem is researched, namely that of
indoor climate optimization and its effects on energy usage. The problem description
and the development of the data system comprise identifying governing system equations
using sparse identification of nonlinear dynamics, designing a control strategy using
model predictive control, and applying various machine learning methods to predict
energy usage.
For a one-day simulation, the proposed optimization strategy yields a 174.86% increase
in energy usage. The conducted work indicates that the proposed model identification
technique is unsuitable for the underlying data utilized in this work. The proposed
model predictive control strategy and machine learning methods show promising results.
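To illustrate what sparse identification of nonlinear dynamics does in general, the following is a minimal sketch of sequentially thresholded least squares on a toy system. It is not the thesis's model or data: the synthetic signal, the candidate library, the threshold value, and the function name `stlsq` are all assumptions chosen for illustration.

```python
import numpy as np

def stlsq(theta, dxdt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares (illustrative helper).

    Looks for a sparse coefficient vector xi with theta @ xi ~ dxdt by
    repeatedly zeroing small coefficients and refitting the remaining ones.
    """
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        big = ~small
        if not big.any():
            break
        # Refit using only the candidate terms that survived thresholding.
        xi[big] = np.linalg.lstsq(theta[:, big], dxdt, rcond=None)[0]
    return xi

# Toy data (assumption): x(t) = 3 * exp(-2 t), so dx/dt = -2 x exactly.
t = np.linspace(0.0, 2.0, 200)
x = 3.0 * np.exp(-2.0 * t)
dxdt = -2.0 * x

# Candidate library of terms: [1, x, x^2].
theta = np.column_stack([np.ones_like(x), x, x**2])

xi = stlsq(theta, dxdt)
print(dict(zip(["1", "x", "x^2"], xi)))
```

On this noise-free toy signal the method should keep only the linear term. With measured sensor data the derivatives have to be estimated numerically, and the result depends heavily on noise and on the chosen library, which is one way such an identification can turn out unsuitable for a given data set.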
Survey and Analysis of Production Distributed Computing Infrastructures
This report has two objectives. First, we describe a set of the production
distributed infrastructures currently available, so that the reader has a basic
understanding of them. This includes explaining why each infrastructure was
created and made available and how it has succeeded and failed. The set is not
complete, but we believe it is representative.
Second, we describe the infrastructures in terms of their use, which is a
combination of how they were designed to be used and how users have found ways
to use them. Applications are often designed and created with specific
infrastructures in mind, with both an appreciation of the existing capabilities
provided by those infrastructures and an anticipation of their future
capabilities. Here, the infrastructures we discuss were often designed and
created with specific applications in mind, or at least specific types of
applications. The reader should understand how the interplay between the
infrastructure providers and the users leads to such usages, which we call
usage modalities. These usage modalities are really abstractions that exist
between the infrastructures and the applications; they influence the
infrastructures by representing the applications, and they influence the
applications by representing the infrastructures.
Hive on Spark and MapReduce: a methodology for parameter tuning
Project Work presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Information Systems and Technologies Management.
As the era of “big data” has arrived, more and more companies are using distributed file systems, such as the Hadoop Distributed File System (HDFS), to manage and process their data streams. This software library offers a way to store large files across multiple machines. Large data sets are processed using its inherent programming model, MapReduce. Apache Spark is a relatively new alternative to Hadoop MapReduce and claims to offer a performance boost of up to 10 times for certain applications while maintaining automatic fault tolerance. To leverage the data warehouse capabilities of Hadoop, Apache Hive was introduced. It is a concept for Big Data analytics that works on top of Hadoop, provides data analysis tools and, most importantly, translates queries into MapReduce and Spark jobs. It thereby exploits the scalability of Hadoop and offers data exploration and mining capabilities to non-developers. However, it is difficult for users to utilize the full potential of the Apache Spark execution engine, which results in very long execution times. This project work therefore gives researchers and companies a tuning methodology that can significantly improve the execution time of queries. Applied to a real-world batch-processing query, the methodology sped up execution by a factor of 5. Moreover, it gives insights into the underlying reasons for this improvement by using Apache Spark monitoring tools. The results can be helpful for practitioners and researchers who would like to optimise the performance of Spark and MapReduce queries executed in Hive on top of an Apache Hadoop cluster.
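The abstract does not say which parameters were tuned or how candidates were compared, so the following is only a sketch of one possible approach: an exhaustive search over a small grid of settings, keeping the fastest. The grid values, the helper names `tune` and `hive_set_statements`, and the stand-in timing function `fake_run_query` are all hypothetical; `spark.executor.memory` and `spark.executor.cores` are standard Spark properties that can be set from a Hive session.

```python
import itertools

def hive_set_statements(settings):
    """Render a settings dict as Hive session SET statements."""
    return [f"SET {key}={value};" for key, value in settings.items()]

def tune(run_query, grid):
    """Try every combination in the grid and return the fastest one.

    run_query is any callable that takes a settings dict and returns the
    measured execution time in seconds.
    """
    best_settings, best_time = None, float("inf")
    keys = list(grid)
    for values in itertools.product(*(grid[key] for key in keys)):
        settings = dict(zip(keys, values))
        elapsed = run_query(settings)
        if elapsed < best_time:
            best_settings, best_time = settings, elapsed
    return best_settings, best_time

# Hypothetical candidate values, chosen only for illustration.
grid = {
    "spark.executor.memory": ["2g", "4g"],
    "spark.executor.cores": [2, 4],
}

def fake_run_query(settings):
    # Stand-in for submitting the query to a cluster and timing it.
    # The formula is made up so that the example runs without a cluster.
    memory_gb = int(settings["spark.executor.memory"].rstrip("g"))
    return 100.0 / (memory_gb * settings["spark.executor.cores"])

best, seconds = tune(fake_run_query, grid)
for statement in hive_set_statements(best):
    print(statement)
```

In practice `run_query` would submit the real query and time it, and a full grid quickly becomes too expensive to run, so a methodology guided by monitoring output would likely narrow the candidates first.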