
    Data science for buildings, a multi-scale approach bridging occupants to smart-city energy planning

    In the context of global carbon emission reduction goals, buildings have been identified as holding valuable energy-saving potential. With the exponential increase of smart, connected building automation systems, massive amounts of data are now accessible for analysis. These, coupled with powerful data science methods and machine learning algorithms, present a unique opportunity to identify untapped energy-saving potential from field information and effectively turn buildings into active assets of the built energy infrastructure. However, the diversity of building occupants and infrastructures, and the disparities in collected information, have produced disjointed scales of analytics that make it tedious for approaches to scale and generalize over the building stock. This, coupled with the lack of standards in the sector, has hindered the broader adoption of data science practices in the field and raised the following question: How can data science facilitate the scaling of approaches and bridge disconnected spatiotemporal scales of the built environment to deliver enhanced energy-saving strategies? This thesis addresses this question by investigating data-driven, scalable, interpretable, and multi-scale approaches across varying analytical classes. The work particularly explores descriptive, predictive, and prescriptive analytics to connect occupants, buildings, and urban energy planning for improved energy performance. First, a novel multi-dimensional data-mining framework is developed, producing distinct dimensional outlines supporting systematic methodological approaches and refined knowledge discovery. Second, an automated building heat dynamics identification method is put forward, supporting large-scale thermal performance examination of buildings in a non-intrusive manner. The method produced 64% good-quality model fits, against 14% close and 22% poor ones, out of 225 Dutch residential buildings, which were open-sourced in the interest of developing benchmarks. Third, a pioneering hierarchical forecasting method was designed, bridging individual and aggregated building load predictions in a coherent, data-efficient fashion. The approach was evaluated over hierarchies of 37, 140, and 383 nodal elements and showed improved accuracy and coherency against disjointed prediction systems. Finally, building occupants and urban energy planning strategies are investigated under the prism of uncertainty. In a neighborhood of 41 Dutch residential buildings, occupants were found to significantly impact optimal energy community designs in the context of weather and economic uncertainties. Overall, the thesis demonstrates the added value of multi-scale approaches in all analytical classes while fostering best data-science practices in the sector through benchmarks and open-source implementations.
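
The abstract does not say how coherency between individual and aggregated load predictions is obtained. As a minimal sketch of what "coherent" means in a forecast hierarchy, the example below uses plain bottom-up aggregation, a standard baseline that is swapped in here for illustration and is not the thesis's method. The building names, the `neighborhood` node, and all values are hypothetical.

```python
def bottom_up(building_forecasts):
    """Build the aggregate forecast as the sum of the leaf forecasts at each step."""
    horizon = len(next(iter(building_forecasts.values())))
    total = [
        sum(series[t] for series in building_forecasts.values())
        for t in range(horizon)
    ]
    return {"neighborhood": total, **building_forecasts}


def is_coherent(forecasts, parent, children, tol=1e-9):
    """A hierarchy is coherent when the parent equals the sum of its children."""
    return all(
        abs(forecasts[parent][t] - sum(forecasts[c][t] for c in children)) <= tol
        for t in range(len(forecasts[parent]))
    )


# Hypothetical two-step load forecasts for three buildings (arbitrary units).
leaf = {
    "building_a": [1.0, 2.0],
    "building_b": [0.5, 0.5],
    "building_c": [2.0, 1.5],
}
children = list(leaf)

coherent = bottom_up(leaf)

# A disjointed system forecasts the aggregate separately, so the sums may not match.
disjointed = {**leaf, "neighborhood": [3.0, 4.5]}

print(is_coherent(coherent, "neighborhood", children))
print(is_coherent(disjointed, "neighborhood", children))
```

Bottom-up aggregation guarantees coherency by construction but ignores any information in a separately trained aggregate model, which is one reason more elaborate reconciliation schemes exist.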

    On-the-fly tracing for data-centric computing : parallelization, workflow and applications

    As data-centric computing becomes the trend in science and engineering, more and more hardware systems and middleware frameworks are emerging to handle the intensive computations associated with big data. At the programming level, it is crucial to have corresponding programming paradigms for dealing with big data. Although MapReduce is now a well-known programming model for data-centric computing, in which parallelization is completely replaced by partitioning the computing task through data, not all programs can be refactored in such a fashion, particularly those using statistical computing and data mining algorithms with interdependence. On the other hand, many traditional automatic parallelization methods emphasize formalism and may not achieve optimal performance with the limited computing resources available. In this work we propose a cross-platform programming paradigm, called on-the-fly data tracing, that provides source-to-source transformation; the same framework also provides workflow optimization for larger applications. Using a big-data approximation, computations related to large-scale data input are identified in the code and workflow, and a simplified core dependence graph is built based on the computational load, taking big data into account. The code can then be partitioned into sections for efficient parallelization; at the workflow level, optimization can be performed by adjusting the scheduling for big-data considerations, including the I/O performance of the machine. By regarding each unit in both source code and workflow as a model, this framework enables model-based parallel programming that matches the available computing resources. The dissertation presents the techniques used in model-based parallel programming, the design of the software framework for both parallelization and workflow optimization, and its implementations in multiple programming languages. The following experiments are then performed to validate the framework: (i) benchmarking of parallelization speed-up using typical examples in data analysis and machine learning (e.g., naive Bayes, k-means), and (ii) three real-world applications in data-centric computing, described to illustrate the framework's efficiency: pattern detection from hurricane and storm surge simulations, road traffic flow prediction, and text mining from social media data. The applications illustrate how to build scalable workflows with the framework, along with the resulting performance enhancements.
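
The idea of partitioning code into sections from a dependence graph can be illustrated with a small sketch. It assumes a hand-written graph of hypothetical workflow steps and uses Python's standard `graphlib.TopologicalSorter` to group steps whose dependencies are all satisfied; it ignores the computational-load weighting and I/O considerations the abstract mentions, and it is not the dissertation's framework.

```python
from graphlib import TopologicalSorter


def parallel_stages(dependencies):
    """Group steps into stages; steps within one stage have no mutual dependence."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every step runnable right now
        stages.append(ready)
        sorter.done(*ready)  # mark the whole stage finished to unlock successors
    return stages


# Hypothetical workflow: each key lists the steps it depends on.
deps = {
    "load": set(),
    "clean": {"load"},
    "features_a": {"clean"},
    "features_b": {"clean"},
    "train": {"features_a", "features_b"},
}

stages = parallel_stages(deps)
for number, stage in enumerate(stages, start=1):
    print(number, stage)
```

Here `features_a` and `features_b` land in the same stage, so they are the candidates for parallel execution; a load-aware scheduler would additionally weigh how much data each step touches.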

    Event Discovery and Classification in Space-Time Series: A Case Study for Storms

    Recent advances in sensor technology have enabled the deployment of wireless sensors for surveillance and monitoring of phenomena in diverse domains such as environment and health. Data generated by these sensors are typically high-dimensional and therefore difficult to analyze and comprehend. Additionally, high-level phenomena that humans commonly recognize, such as storms, fires, and traffic jams, are often complex and multivariate, and individual univariate sensors are incapable of detecting them. This thesis describes the Event Oriented approach, which addresses these challenges by providing a way to reduce the dimensionality of space-time series and a way to integrate multivariate data over space and/or time for the purpose of detecting and exploring high-level events. The proposed Event Oriented approach is implemented using space-time series data from the Gulf of Maine Ocean Observation System (GOMOOS). GOMOOS is a long-standing network of wireless sensors in the Gulf of Maine monitoring the high-energy ocean environment. As a case study, high-level storm events are detected and classified using the Event Oriented approach. A domain-independent ontology for detecting high-level composite events, called the General Composite Event Ontology, is presented and used as the basis of the Storm Event Ontology. Primitive events are detected from univariate sensors and assembled into Composite Storm Events using the Storm Event Ontology. To evaluate the effectiveness of the Event Oriented approach, the resulting candidate storm events are compared with an independent historic Storm Events Database from the National Climatic Data Center (NCDC); the comparison indicates that the Event Oriented approach detected about 92% of the storms recorded by the NCDC. The Event Oriented approach also facilitates classification of high-level composite events. In the case study, candidate storms were classified based on their spatial progression and profile. Since ontological knowledge is used to construct the high-level event ontology, detection of candidate high-level events could in turn help refine existing ontological knowledge about them. In summary, this thesis demonstrates the Event Oriented approach for reducing dimensionality in complex space-time series sensor data and for integrating time series data over space to detect high-level phenomena.
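
The step from primitive events on univariate sensors to composite events can be sketched as follows. This is an illustration under stated assumptions, not the thesis's ontology-driven method: primitive events are taken to be runs where a reading meets a fixed threshold, and a composite event is taken to be any time overlap between two primitive events. The variable names, thresholds, and readings are all hypothetical.

```python
def primitive_events(series, threshold):
    """Return (start, end) index pairs for maximal runs with value >= threshold."""
    events, start = [], None
    for i, value in enumerate(series):
        if value >= threshold and start is None:
            start = i  # a run begins
        elif value < threshold and start is not None:
            events.append((start, i - 1))  # the run ended at the previous index
            start = None
    if start is not None:
        events.append((start, len(series) - 1))  # run extends to the last sample
    return events


def composite_events(events_a, events_b):
    """Return the time intervals where an event from each list overlaps."""
    overlaps = []
    for start_a, end_a in events_a:
        for start_b, end_b in events_b:
            start, end = max(start_a, start_b), min(end_a, end_b)
            if start <= end:
                overlaps.append((start, end))
    return overlaps


# Hypothetical hourly readings from two univariate sensors on one buoy.
wind_speed = [5, 12, 14, 15, 6, 13, 4]
wave_height = [1.0, 1.2, 3.1, 3.4, 3.0, 1.1, 0.9]

wind_events = primitive_events(wind_speed, threshold=10)
wave_events = primitive_events(wave_height, threshold=3.0)
candidate_storms = composite_events(wind_events, wave_events)

print(wind_events, wave_events, candidate_storms)
```

Reducing each series to a handful of intervals is what shrinks the dimensionality; the composite step then only has to reason about interval relations rather than raw samples.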