3 research outputs found
Stopword detection for streaming content
© Springer International Publishing AG, part of Springer Nature 2018. The removal of stopwords is an important preprocessing step in many natural language processing tasks, which can lead to enhanced performance and execution time. Many existing methods either rely on a predefined list of stopwords or compute word significance based on metrics such as tf-idf. The objective of our work in this paper is to identify stopwords, in an unsupervised way, for streaming textual corpora such as Twitter, which have a temporal nature. We propose to consider and model the dynamics of a word within the streaming corpus to identify the ones that are less likely to be informative or discriminative. Our work is based on the discrete wavelet transform (DWT) of word signals in order to extract two features, namely scale and energy. We show that our proposed approach is effective in identifying stopwords and improves the quality of topics in the task of topic detection
Cyclic Homomorphic Encryption Aggregation (CHEA)—A Novel Approach to Data Aggregation in the Smart Grid
The transactive energy market is an emerging development in energy economics built on advanced metering infrastructure. Data generated in this context is often required for market operations, while also being privacy sensitive. This dual concern has necessitated the development of various methods of obfuscation in order to maintain privacy while still facilitating operations. While data aggregation is a common approach in this context, many of the existing aggregation methods rely on additional network components or lack flexibility. In this paper, we introduce Cyclic Homomorphic Encryption Aggregation (CHEA), a secure aggregation protocol that eliminates the need for additional network components or complicated key distribution schemes, while providing additional capabilities compared to similar protocols. We validate our scheme with formal security analysis as well as a software simulation of a transactive energy network running the scheme. Results indicate that CHEA performs well in comparison to similar works, with minimal communication overheads. Additionally, CHEA retains all standard security properties held by other aggregation schemes, while improving flexibility and reducing infrastructural requirements. Our scheme operates on similar assumptions as other works, but current smart metering hardware lags in terms of processing power, making the scheme infeasible on the current generation of hardware. However, these capabilities should quickly advance to an accommodating state. With this in mind, and given the results, we believe CHEA is a strong candidate for aggregating transactive energy data