Adaptive Normalization in Streaming Data
In today's digital era, data are everywhere, from the Internet of Things to
health care and financial applications. This leads to potentially unbounded,
ever-growing Big Data streams that need to be utilized effectively. Data
normalization is an important preprocessing technique for data analytics. It
helps prevent mismodeling and reduces the complexity inherent in the data,
especially for data integrated from multiple sources and contexts.
Normalization of Big Data streams is challenging because of evolving
inconsistencies, time and memory constraints, and the unavailability of the
whole data set beforehand. This paper proposes a distributed approach to
adaptive normalization for Big Data streams. Using sliding windows of fixed size, it
provides a simple mechanism to adapt the statistics for normalizing changing
data in each window. Implemented on Apache Storm, a distributed real-time
stream data framework, our approach exploits distributed data processing for
efficient normalization. Unlike other existing adaptive approaches, which
normalize data for a specific use (e.g., classification), ours is not tied to
any particular use.
Moreover, our adaptive mechanism allows flexible controls, via user-specified
thresholds, for normalization tradeoffs between time and precision. The paper
illustrates our proposed approach along with a few other techniques and
experiments on both synthesized and real-world data. The normalized data
obtained from our proposed approach, on 160,000 instances of a data stream,
improve over the baseline by 89%, with a root-mean-square error of 0.0041
compared with the actual data.
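The window-based adaptation described in this abstract can be illustrated with a minimal sketch. It is not the paper's Apache Storm implementation: the function name `adaptive_minmax`, the choice of min-max scaling, the use of consecutive non-overlapping windows, and the single `threshold` parameter (a relative drift beyond which the reference statistics are refreshed) are all assumptions made for illustration.

```python
from typing import Iterable, Iterator, List, Optional, Tuple


def adaptive_minmax(stream: Iterable[float], window_size: int,
                    threshold: float = 0.1) -> Iterator[List[float]]:
    """Yield one list of min-max normalized values per full window.

    Hypothetical sketch: the reference (min, max) is refreshed only when a
    window's statistics drift from it by more than `threshold`, expressed as
    a fraction of the reference range. A trailing partial window is dropped.
    """
    ref: Optional[Tuple[float, float]] = None  # reference (min, max)
    window: List[float] = []
    for x in stream:
        window.append(x)
        if len(window) < window_size:
            continue
        lo, hi = min(window), max(window)
        if ref is None:
            ref = (lo, hi)  # first window initializes the statistics
        else:
            span = (ref[1] - ref[0]) or 1.0  # guard against a zero range
            drift = max(abs(lo - ref[0]), abs(hi - ref[1])) / span
            if drift > threshold:
                ref = (lo, hi)  # statistics changed enough: adapt
        r_lo, r_hi = ref
        scale = (r_hi - r_lo) or 1.0
        yield [(v - r_lo) / scale for v in window]
        window = []
```

In this sketch, a small threshold refreshes the statistics often (more precise, more work), while a large threshold reuses stale statistics (cheaper, less precise), which loosely mirrors the time and precision tradeoff the abstract mentions.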
Speaker Normalization Using Cortical Strip Maps: A Neural Model for Steady State Vowel Identification
Auditory signals of speech are speaker-dependent, but representations of language meaning are speaker-independent. Such a transformation enables speech to be understood from different speakers. A neural model is presented that performs speaker normalization to generate a pitch-independent representation of speech sounds, while also preserving information about speaker identity. This speaker-invariant representation is categorized into unitized speech items, which input to sequential working memories whose distributed patterns can be categorized, or chunked, into syllable and word representations. The proposed model fits into an emerging model of auditory streaming and speech categorization. The auditory streaming and speaker normalization parts of the model both use multiple strip representations and asymmetric competitive circuits, thereby suggesting that these two circuits arose from similar neural designs. The normalized speech items are rapidly categorized and stably remembered by Adaptive Resonance Theory circuits. Simulations use synthesized steady-state vowels from the Peterson and Barney [J. Acoust. Soc. Am. 24, 175-184 (1952)] vowel database and achieve accuracy rates similar to those achieved by human listeners. These results are compared to behavioral data and other speaker normalization models. National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624
Speaker Normalization Using Cortical Strip Maps: A Neural Model for Steady State Vowel Categorization
Auditory signals of speech are speaker-dependent, but representations of language meaning are speaker-independent. The transformation from speaker-dependent to speaker-independent language representations enables speech to be learned and understood from different speakers. A neural model is presented that performs speaker normalization to generate a pitch-independent representation of speech sounds, while also preserving information about speaker identity. This speaker-invariant representation is categorized into unitized speech items, which input to sequential working memories whose distributed patterns can be categorized, or chunked, into syllable and word representations. The proposed model fits into an emerging model of auditory streaming and speech categorization. The auditory streaming and speaker normalization parts of the model both use multiple strip representations and asymmetric competitive circuits, thereby suggesting that these two circuits arose from similar neural designs. The normalized speech items are rapidly categorized and stably remembered by Adaptive Resonance Theory circuits. Simulations use synthesized steady-state vowels from the Peterson and Barney [J. Acoust. Soc. Am. 24, 175-184 (1952)] vowel database and achieve accuracy rates similar to those achieved by human listeners. These results are compared to behavioral data and other speaker normalization models. National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624
FLEET: Butterfly Estimation from a Bipartite Graph Stream
We consider space-efficient single-pass estimation of the number of
butterflies, a fundamental bipartite graph motif, from a massive bipartite
graph stream where each edge represents a connection between entities in two
different partitions. We present a space lower bound for any streaming
algorithm that can estimate the number of butterflies accurately, as well as
FLEET, a suite of algorithms for accurately estimating the number of
butterflies in the graph stream. Estimates returned by the algorithms come with
provable guarantees on the approximation error, and experiments show good
tradeoffs between the space used and the accuracy of approximation. We also
present space-efficient algorithms for estimating the number of butterflies
within a sliding window of the most recent elements in the stream. While there
is a significant body of work on counting subgraphs such as triangles in a
unipartite graph stream, our work seems to be one of the few to tackle the case
of bipartite graph streams. Comment: This is the author's version of the work. It is posted here by
permission of ACM for your personal use. Not for redistribution. The
definitive version was published in Seyed-Vahid Sanei-Mehri, Yu Zhang, Ahmet
Erdem Sariyuce and Srikanta Tirthapura. "FLEET: Butterfly Estimation from a
Bipartite Graph Stream". The 28th ACM International Conference on Information
and Knowledge Managemen
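For readers unfamiliar with the motif named in the FLEET abstract: a butterfly is a complete 2x2 bipartite subgraph, that is, two vertices on one side that are both connected to the same two vertices on the other side. The sketch below is an exact, store-everything baseline for counting them, not one of the FLEET streaming estimators; the function name `count_butterflies` and the edge representation as (left, right) pairs are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations
from typing import Hashable, Iterable, Tuple


def count_butterflies(edges: Iterable[Tuple[Hashable, Hashable]]) -> int:
    """Exactly count butterflies (2x2 bicliques) in a bipartite edge list.

    Each edge is a (left_vertex, right_vertex) pair. Duplicate edges are
    collapsed. This keeps the whole graph in memory, which is exactly what a
    space-efficient streaming estimator tries to avoid.
    """
    # Right-side neighbors of every left vertex.
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)

    total = 0
    # Two left vertices with c common neighbors form C(c, 2) butterflies.
    for a, b in combinations(adj, 2):
        c = len(adj[a] & adj[b])
        total += c * (c - 1) // 2
    return total
```

An exact count like this is mainly useful as ground truth when measuring the approximation error of a sampling-based estimator on small inputs.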