32 research outputs found

The Global Legal Information Network (GLIN)

Author: Adam Nabil R.
Edelson Burton I.
El-Ghazawi Tarek A.
Halem Milton
Kalpakis Konstantinos
Kozura Nick J.
Medina Rubens
Yesha Yelena
Publication venue: Digital Commons @ American University Washington College of Law
Publication date: 01/12/1996

bepress Legal Repository

Digital Commons @ American University Washington College of Law

Linear Sketches for Approximate Aggregate Range Queries

Author: Konstantinos Kalpakis
Vasundhara Puttagunta

Answering aggregate queries approximately over multidimensional data is an important problem that arises naturally in many applications. An approach to the problem is to maintain a succinct (i.e. O(k) space) representation ĥ, called a sketch, of the frequency distribution h of the data, and use ĥ for answering queries. Common sketches are constructed via linear mappings of h onto a k-dimensional space, e.g. map h to its top-k Fourier/Wavelet coefficients. We call such sketches linear sketches, since ĥ = P h for some sketching matrix P. Linear sketches have the benefit that they can be easily maintained incrementally over data streams. Sketches are typically optimized for approximating the data distribution, but not the answers to queries. In this paper, we are concerned with linear sketches that approximate well not only the data but also the answers to the aggregate queries. The quality of approximations is measured using the mean squared and relative errors (MSE and RLE). A query is represented by a column vector q such that its answer is q^T h. A given set of queries can be represented by an appropriate query matrix Q. We show that the MSE for the queries is minimized when the sketching matrix used to construct a linear sketch of h has as columns the top-k eigenvectors of the query matrix Q. Further, if the quer

CiteSeerX
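
The linear-sketch idea in the abstract above (ĥ = P h, maintained incrementally as the data change) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the 2×4 matrix `P` with two Haar-like orthonormal rows, the helper names, and the choice to answer a query by reconstructing h as Pᵀĥ. It does not compute the eigenvector-based sketching matrix the abstract describes.

```python
def make_sketch(P, h):
    """Return h_hat = P h, where P is a k x n matrix given as a list of rows."""
    return [sum(p * x for p, x in zip(row, h)) for row in P]

def update_sketch(h_hat, P, j, delta):
    """Update the sketch after h[j] changes by delta, without touching h.
    Linearity gives P (h + delta * e_j) = P h + delta * (column j of P)."""
    return [s + row[j] * delta for s, row in zip(h_hat, P)]

def approx_answer(P, h_hat, q):
    """Approximate q^T h by q^T (P^T h_hat).
    Assumes the rows of P are orthonormal, so P^T h_hat is the projection
    of h onto the row space of P."""
    n = len(q)
    h_rec = [sum(P[i][j] * h_hat[i] for i in range(len(P))) for j in range(n)]
    return sum(qj * hj for qj, hj in zip(q, h_rec))

# Hypothetical example: n = 4 frequency cells, sketch size k = 2.
P = [
    [0.5, 0.5, 0.5, 0.5],    # overall average (scaled)
    [0.5, 0.5, -0.5, -0.5],  # coarse left/right difference (scaled)
]
h = [4.0, 4.0, 2.0, 2.0]
h_hat = make_sketch(P, h)

# Range-sum query over the first two cells, written as an indicator vector q.
q = [1.0, 1.0, 0.0, 0.0]
estimate = approx_answer(P, h_hat, q)

# A stream update: one more item arrives in cell 2.
h_hat_updated = update_sketch(h_hat, P, 2, 1.0)
```

The update touches only k numbers per arriving item, which is the property the abstract refers to when it says linear sketches are easy to maintain over data streams.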

Distance Measures for Effective Clustering of ARIMA Time-Series

Author: Dhiral Gada
Konstantinos Kalpakis
Vasundhara Puttagunta

Many environmental and socioeconomic time-series data can be adequately modeled using Auto-Regressive Integrated Moving Average (ARIMA) models. We call such time-series ARIMA time-series. We consider the problem of clustering ARIMA time-series. We propose the use of the Linear Predictive Coding (LPC) cepstrum of time-series for clustering ARIMA time-series, by using the Euclidean distance between the LPC cepstra of two time-series as their dissimilarity measure. We demonstrate that LPC cepstral coefficients have the desired features for accurate clustering and efficient indexing of ARIMA time-series. For example, few LPC cepstral coefficients are sufficient in order to discriminate between time-series that are modeled by different ARIMA models. In fact, this approach requires fewer coefficients than traditional approaches, such as DFT and DWT. The proposed distance measure can be used for measuring the similarity between different ARIMA models as well

CiteSeerX
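
The dissimilarity measure described in the abstract above, the Euclidean distance between LPC cepstra, can be sketched in Python as follows. The abstract does not give the recursion, so this uses the standard one for an autoregressive polynomial under the assumed sign convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, with the cepstrum defined by ln(1/A(z)) = c_1 z^-1 + c_2 z^-2 + ... . The function names and example coefficients are hypothetical, and fitting an ARIMA model to raw data is outside the scope of the example.

```python
import math

def lpc_cepstrum(ar_coeffs, n_coeffs):
    """First n_coeffs LPC cepstral coefficients c_1..c_n for the AR polynomial
    A(z) = 1 + sum_m a_m z^-m (assumed sign convention), using
        c_n = -a_n - sum_{m=1}^{n-1} (1 - m/n) a_m c_{n-m},  with a_m = 0 for m > p.
    """
    p = len(ar_coeffs)
    c = []
    for n in range(1, n_coeffs + 1):
        acc = -ar_coeffs[n - 1] if n <= p else 0.0
        for m in range(1, min(n - 1, p) + 1):
            acc -= (1.0 - m / n) * ar_coeffs[m - 1] * c[n - m - 1]
        c.append(acc)
    return c

def cepstral_distance(ar_a, ar_b, n_coeffs=8):
    """Euclidean distance between the truncated LPC cepstra of two AR models."""
    ca = lpc_cepstrum(ar_a, n_coeffs)
    cb = lpc_cepstrum(ar_b, n_coeffs)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ca, cb)))

# Hypothetical AR coefficient vectors for three fitted models.
model_a = [0.5]
model_b = [0.5]
model_c = [-0.3, 0.2]

d_same = cepstral_distance(model_a, model_b)
d_diff = cepstral_distance(model_a, model_c)
```

Identical models give a distance of zero, and the resulting pairwise distances can be passed to any clustering routine that accepts a dissimilarity matrix.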