5,043 research outputs found

On the Online Classification of Data Streams Using Weak Estimators

Author: Oommen John
Tavasoli Hanane
Yazidi Anis
Publication date: 01/01/2016

On the online classification of data streams using weak estimators

Author: Oommen J. (B. John)
Tavasoli H. (Hanane)
Yazidi A. (Anis)
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

In this paper, we propose a novel online classifier for complex data streams that are generated by sources with non-stationary stochastic properties. Instead of using a single training model and counters to keep important data statistics, the introduced online classifier scheme provides a real-time, self-adjusting learning model. The learning model applies the multiplication-based update algorithm of the Stochastic Learning Weak Estimator (SLWE) at each time instant, as a new labeled instance arrives. In this way, the data statistics are updated every time a new element is inserted, without requiring us to rebuild the model when changes occur in the data distributions. Finally, and most importantly, the model operates with the understanding that the correct classes of previously classified patterns become available only at a later juncture, after some time instances, thus requiring us to update the training set and the training model. The results obtained from a rigorous empirical analysis on multinomial distributions are remarkable. Indeed, they demonstrate the applicability of our method to synthetic datasets and the advantages of the introduced scheme.
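The multiplication-based update mentioned in this abstract can be illustrated with a minimal sketch. It assumes the commonly cited multinomial form of the SLWE rule, in which every estimate is scaled by a constant lam in (0, 1) and the observed category absorbs the remaining probability mass. The names slwe_update and run_stream and the value lam = 0.9 are illustrative assumptions, not taken from the paper, and the classifier built on top of the estimator is not shown.

```python
def slwe_update(probs, observed, lam=0.9):
    """Return new probability estimates after observing category `observed`.

    Assumed rule: p_i <- lam * p_i for every i != observed, and the
    observed category receives whatever mass is left so the sum stays 1.
    """
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must lie strictly between 0 and 1")
    updated = [lam * p for p in probs]  # shrink every estimate
    updated[observed] = 1.0 - sum(
        p for i, p in enumerate(updated) if i != observed
    )
    return updated


def run_stream(stream, num_classes, lam=0.9):
    """Feed a sequence of category indices through the estimator."""
    probs = [1.0 / num_classes] * num_classes  # start from a uniform guess
    for symbol in stream:
        probs = slwe_update(probs, symbol, lam)
    return probs


if __name__ == "__main__":
    # The source switches from always emitting 0 to always emitting 1;
    # the estimate follows the change without any model rebuild.
    print(run_stream([0] * 200 + [1] * 200, num_classes=2))
```

Because old observations are discounted geometrically, a smaller lam tracks changes faster at the cost of noisier estimates.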

Carleton University's Institutional Repository

NORA - Norwegian Open Research Archives

Agder University Research Archive

On utilizing weak estimators to achieve the online classification of data streams

Author: Oommen John B.
Tavasoli Hanane
Yazidi Anis
Publication venue: 'Elsevier BV'
Publication date: 01/01/2019

Author's accepted version (post-print). Available from 03/09/2021.

NORA - Norwegian Open Research Archives

Agder University Research Archive

Improved Algorithms for Time Decay Streams

Author: Braverman Vladimir
Lang Harry
Ullah Enayat
Zhou Samson
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019)
Publication date: 01/01/2019

In the time-decay model for data streams, elements of an underlying data set arrive sequentially, with the recently arrived elements being more important. A common approach for handling large data sets is to maintain a coreset, a succinct summary of the processed data that allows approximate recovery of a predetermined query. We provide a general framework that takes any offline coreset and gives a time-decay coreset for polynomial time decay functions. We also consider the exponential time decay model for k-median clustering, where we provide a constant-factor approximation algorithm that utilizes the online facility location algorithm. Our algorithm stores O(k log(h Delta) + h) points, where h is the half-life of the decay function and Delta is the aspect ratio of the dataset. Our techniques extend to k-means clustering and M-estimators as well.
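To make the exponential time-decay model concrete, the sketch below assumes the usual convention that an element of age a carries weight 2^(-a/h), where h is the half-life. It only maintains a decayed sum over a stream; it is not the coreset or clustering algorithm from the paper. The names decay_weight and DecayedCounter are hypothetical.

```python
def decay_weight(age, half_life):
    """Weight of an element that arrived `age` time units ago."""
    return 0.5 ** (age / half_life)


class DecayedCounter:
    """Exponentially decayed sum of a stream (timestamps must not decrease)."""

    def __init__(self, half_life):
        self.half_life = half_life
        self.total = 0.0
        self.last_time = None

    def add(self, value, time):
        # Age the running total up to `time`, then add the new value
        # at full weight.
        if self.last_time is not None:
            self.total *= decay_weight(time - self.last_time, self.half_life)
        self.total += value
        self.last_time = time

    def query(self, time):
        """Decayed sum as seen at `time` (at or after the last arrival)."""
        if self.last_time is None:
            return 0.0
        return self.total * decay_weight(time - self.last_time, self.half_life)


if __name__ == "__main__":
    counter = DecayedCounter(half_life=10)
    counter.add(1.0, time=0)
    counter.add(1.0, time=10)
    print(counter.query(10), counter.query(20))
```

Since the exponential weight factorizes over time, the whole history can be aged with a single multiplication, which is why only one running total needs to be stored here.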

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Detection of fast radio transients with multiple stations: a case study using the Very Long Baseline Array

Author: Adam T. Deller
David R. Thompson
Kiri L. Wagstaff
Randall B. Wayth
Steven J. Tingay
Walid A. Majid
Walter F. Brisken
Publication venue: 'IOP Publishing'
Publication date: 01/01/2011

Recent investigations reveal an important new class of transient radio phenomena that occur on sub-millisecond timescales. Often transient surveys' data volumes are too large to archive exhaustively. Instead, an on-line automatic system must excise impulsive interference and detect candidate events in real-time. This work presents a case study using data from multiple geographically distributed stations to perform simultaneous interference excision and transient detection. We present several algorithms that incorporate dedispersed data from multiple sites, and report experiments with a commensal real-time transient detection system on the Very Long Baseline Array (VLBA). We test the system using observations of pulsar B0329+54. The multiple-station algorithms enhanced sensitivity for detection of individual pulses. These strategies could improve detection performance for a future generation of geographically distributed arrays such as the Australian Square Kilometre Array Pathfinder and the Square Kilometre Array. Comment: 12 pages, 14 figures. Accepted for Ap
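One simple way to picture how multiple stations can help separate real pulses from local interference is a coincidence vote: a time sample counts as a candidate only if enough stations exceed a detection threshold at that sample. The sketch below is a generic illustration of that idea, not one of the algorithms evaluated in the paper. The function name coincident_detections, the threshold of 5.0 and the input layout (one list of signal-to-noise values per station, already dedispersed and time-aligned) are assumptions.

```python
def coincident_detections(station_snr, threshold=5.0, min_stations=2):
    """Return indices of time samples where enough stations see a pulse.

    station_snr: list of per-station sequences of signal-to-noise values,
    assumed to be dedispersed and aligned to a common time axis.
    """
    if not station_snr:
        return []
    num_samples = min(len(series) for series in station_snr)
    hits = []
    for t in range(num_samples):
        # Count how many stations exceed the threshold at this sample.
        votes = sum(1 for series in station_snr if series[t] >= threshold)
        if votes >= min_stations:
            hits.append(t)
    return hits


if __name__ == "__main__":
    snr = [
        [1.0, 6.0, 7.0, 1.0],  # station A: local spike at t=1
        [1.0, 1.0, 8.0, 1.0],  # station B
        [9.0, 1.0, 6.0, 1.0],  # station C: local spike at t=0
    ]
    print(coincident_detections(snr))
```

Spikes seen at a single station are rejected as likely local interference, while the event at t=2 is seen everywhere and survives the vote.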

arXiv.org e-Print Archive

Crossref

espace@Curtin

Network Sampling: From Static to Streaming Graphs

Author: Ahmed Nesreen K.
Kompella Ramana
Neville Jennifer
Publication date: 13/11/2012

Network sampling is integral to the analysis of social, information, and biological networks. Since many real-world networks are massive in size, continuously evolving, and/or distributed in nature, the network structure is often sampled in order to facilitate study. For these reasons, a more thorough and complete understanding of network sampling is critical to support the field of network science. In this paper, we outline a framework for the general problem of network sampling, by highlighting the different objectives, population and units of interest, and classes of network sampling methods. In addition, we propose a spectrum of computational models for network sampling methods, ranging from the traditionally studied model based on the assumption of a static domain to a more challenging model that is appropriate for streaming domains. We design a family of sampling methods based on the concept of graph induction that generalize across the full spectrum of computational models (from static to streaming) while efficiently preserving many of the topological properties of the input graphs. Furthermore, we demonstrate how traditional static sampling algorithms can be modified for graph streams for each of the three main classes of sampling methods: node, edge, and topology-based sampling. Our experimental results indicate that our proposed family of sampling methods more accurately preserves the underlying properties of the graph for both static and streaming graphs. Finally, we study the impact of network sampling algorithms on the parameter estimation and performance evaluation of relational classification algorithms.
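As a small illustration of what edge sampling over a graph stream can look like, the sketch below applies standard reservoir sampling to a stream of edges, keeping a uniform sample of k edges in a single pass. This is a generic baseline, not the graph-induction family proposed in the paper. The names reservoir_sample_edges and sampled_nodes are hypothetical.

```python
import random


def reservoir_sample_edges(edge_stream, k, seed=None):
    """Keep a uniform random sample of k edges from a one-pass edge stream."""
    rng = random.Random(seed)
    reservoir = []
    for t, edge in enumerate(edge_stream):
        if t < k:
            reservoir.append(edge)  # fill the reservoir first
        else:
            j = rng.randint(0, t)  # inclusive on both ends
            if j < k:
                reservoir[j] = edge  # replace a stored edge
    return reservoir


def sampled_nodes(edges):
    """Node set touched by the sampled edges."""
    return {node for edge in edges for node in edge}


if __name__ == "__main__":
    stream = ((i, i + 1) for i in range(1000))  # a path graph as a stream
    sample = reservoir_sample_edges(stream, k=10, seed=1)
    print(sample)
    print(sorted(sampled_nodes(sample)))
```

The sampler needs memory proportional to k only, which is what makes it usable when the full graph cannot be stored.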

arXiv.org e-Print Archive

CiteSeerX