Waveform Signal Entropy and Compression Study of Whole-Building Energy Datasets
Electrical energy consumption has been an active research area since the
advent of smart homes and Internet of Things devices. Consumption
characteristics and usage profiles are directly influenced by building
occupants and their interaction with electrical appliances. Information
extracted from these data can be used to conserve energy and increase user
comfort levels. Data analysis together with machine learning models can be
utilized to extract valuable information for the benefit of occupants
themselves, power plants, and grid operators. Public energy datasets provide a
scientific foundation to develop and benchmark these algorithms and techniques.
With datasets exceeding tens of terabytes, we present a novel study of five
whole-building energy datasets with high sampling rates, their signal entropy,
and how a well-calibrated measurement can have a significant effect on the
overall storage requirements. We show that some datasets do not fully utilize
the available measurement precision, therefore leaving potential accuracy and
space savings untapped. We benchmark a comprehensive list of 365 file formats,
transparent data transformations, and lossless compression algorithms. The
primary goal is to reduce the overall dataset size while maintaining an
easy-to-use file format and access API. We show that with careful selection of
file format and encoding scheme, we can reduce the size of some datasets by up
to 73%.
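
As a rough illustration of the ideas summarized above (signal entropy, unused measurement precision, and a transparent transformation applied before lossless compression), the following minimal sketch uses only the Python standard library. It is not the study's benchmark: the synthetic 50 Hz waveform, the 12 kHz sampling rate, the amplitude, the 16-bit sample width, and all function names are assumptions chosen for illustration, and delta encoding with zlib stands in for the many formats and algorithms the study compares.

```python
import math
import struct
import zlib
from collections import Counter


def shannon_entropy_bits(samples):
    """Empirical Shannon entropy of the sample values, in bits per sample."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def delta_encode(samples):
    """Reversible transform: keep the first value, then store differences."""
    return [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]


def delta_decode(deltas):
    """Inverse of delta_encode (running sum of the differences)."""
    out = [deltas[0]]
    for d in deltas[1:]:
        out.append(out[-1] + d)
    return out


def pack_int16(values):
    """Serialize integers as little-endian signed 16-bit samples."""
    return struct.pack(f"<{len(values)}h", *values)


# Assumed setup: one second of a 50 Hz sine sampled at 12 kHz and stored as
# 16-bit integers, but calibrated so it only spans about +/-2000 counts.
RATE = 12_000
AMPLITUDE = 2_000
signal = [
    round(AMPLITUDE * math.sin(2 * math.pi * 50 * t / RATE))
    for t in range(RATE)
]

entropy = shannon_entropy_bits(signal)
# Bits needed to cover the observed value range, versus the 16 bits stored.
used_bits = math.log2(max(signal) - min(signal) + 1)

raw = pack_int16(signal)
plain = zlib.compress(raw, 9)
delta = zlib.compress(pack_int16(delta_encode(signal)), 9)

print(f"entropy: {entropy:.2f} bits/sample, range uses {used_bits:.2f} of 16 bits")
print(f"raw: {len(raw)} B, zlib: {len(plain)} B, delta+zlib: {len(delta)} B")
```

The gap between the observed value range and the 16 stored bits mirrors the abstract's point about unused measurement precision. A clean periodic test signal compresses far better than real measurements would, so the printed sizes say nothing about the ratios achievable on actual datasets.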