Search CORE

10 research outputs found

Approximation trade-offs in Markovian stream processing: An empirical study

Authors: Christopher Ré, Julie Letchner, Magdalena Balazinska, Matthai Philipose
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 19/06/2010

A large amount of the world’s data is both sequential and imprecise. Such data is commonly modeled as Markovian streams; examples include words/sentences inferred from raw audio signals, or discrete location sequences inferred from RFID or GPS data. The rich semantics and large volumes of these streams make them difficult to query efficiently. In this paper, we study the effects of two common stream approximations on both efficiency and accuracy. Through experiments on a real-world RFID data set, we identify conditions under which these approximations can improve performance by several orders of magnitude, with only minimal effects on query results. We also identify cases when the full rich semantics are necessary.

CiteSeerX

Crossref

RFID-Based Indoor Spatial Query Evaluation with Bayesian Filtering Techniques

Authors: Gong Zhitao, Hui Bo, Ku Wei-Shinn, Lu Hua, Sun Min-Te, Wang Wenlu, Yu Jiao
Publication date: 01/04/2022

People spend a significant amount of time in indoor spaces (e.g., office buildings, subway systems, etc.) in their daily lives. Therefore, it is important to develop efficient indoor spatial query algorithms for supporting various location-based applications. However, indoor spaces differ from outdoor spaces because users have to follow the indoor floor plan for their movements. In addition, positioning in indoor environments is mainly based on sensing devices (e.g., RFID readers) rather than GPS devices. Consequently, we cannot apply existing spatial query evaluation techniques devised for outdoor environments for this new challenge. Because Bayesian filtering techniques can be employed to estimate the state of a system that changes over time using a sequence of noisy measurements made on the system, in this research, we propose the Bayesian filtering-based location inference methods as the basis for evaluating indoor spatial queries with noisy RFID raw data. Furthermore, two novel models, indoor walking graph model and anchor point indexing model, are created for tracking object locations in indoor environments. Based on the inference method and tracking models, we develop innovative indoor range and k nearest neighbor (kNN) query algorithms. We validate our solution through use of both synthetic data and real-world data. Our experimental results show that the proposed algorithms can evaluate indoor spatial queries effectively and efficiently. We open-source the code, data, and floor plan at https://github.com/DataScienceLab18/IndoorToolKit
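The Bayesian filtering idea mentioned in this abstract, estimating a changing state from a sequence of noisy measurements, can be illustrated with a minimal discrete Bayes filter. This is a generic sketch, not the authors' inference method or their IndoorToolKit code: the three rooms, the transition probabilities, and the reader likelihoods are all made-up assumptions for illustration.

```python
from typing import Dict, List

Belief = Dict[str, float]  # maps a location name to its probability


def normalize(belief: Belief) -> Belief:
    """Rescale the values so they sum to 1."""
    total = sum(belief.values())
    if total <= 0.0:
        raise ValueError("belief has zero total probability")
    return {state: p / total for state, p in belief.items()}


def predict(belief: Belief, transition: Dict[str, Dict[str, float]]) -> Belief:
    """Motion step: transition[s][t] is P(next = t | current = s)."""
    predicted = {state: 0.0 for state in belief}
    for state, p in belief.items():
        for nxt, q in transition[state].items():
            predicted[nxt] += p * q
    return predicted


def update(belief: Belief, likelihood: Dict[str, float]) -> Belief:
    """Measurement step: likelihood[s] is P(reading | location = s)."""
    weighted = {s: p * likelihood.get(s, 0.0) for s, p in belief.items()}
    return normalize(weighted)


def bayes_filter(prior: Belief,
                 transition: Dict[str, Dict[str, float]],
                 readings: List[Dict[str, float]]) -> Belief:
    """Alternate predict and update over a sequence of noisy readings."""
    belief = normalize(prior)
    for likelihood in readings:
        belief = update(predict(belief, transition), likelihood)
    return belief


# Hypothetical floor plan: rooms A - B - C along one hallway, so an
# object can only stay put or move to an adjacent room in one step.
transition = {
    "A": {"A": 0.7, "B": 0.3},
    "B": {"A": 0.2, "B": 0.6, "C": 0.2},
    "C": {"B": 0.3, "C": 0.7},
}
prior = {"A": 1.0, "B": 1.0, "C": 1.0}  # uniform before normalization

# Hypothetical reader likelihoods: first a reading near B, then near C.
readings = [
    {"A": 0.1, "B": 0.8, "C": 0.1},
    {"A": 0.05, "B": 0.15, "C": 0.8},
]

posterior = bayes_filter(prior, transition, readings)
print(max(posterior, key=posterior.get))
```

The transition table plays the role of the floor-plan constraint described in the abstract: probability mass can only flow between adjacent rooms, so a single noisy reading cannot move the estimate across the building in one step.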

arXiv.org e-Print Archive

Probabilistic management of OCR data using an RDBMS

Authors: Allauzen C., Baeza-Yates R. A., Bishop C. M., Cho J., Cowell R. G., Gupta R., Hopcroft J. E., Jordan M. I., Kimura H., Lafferty J., Mori S., Widom J., Yen J. Y., Zobel J.
Publication venue: VLDB Endowment

Crossref

The Trail, 2009-04-17

Author: Associated Students of the University of Puget Sound
Publication venue: Sound Ideas
Publication date: 17/04/2009

Sound Ideas

The Trail, 2009-04-10

Author: Associated Students of the University of Puget Sound
Publication venue: Sound Ideas
Publication date: 10/04/2009

Sound Ideas

Towards Enabling Probabilistic Databases for Participatory Sensing

Authors: Aberer Karl, Duong Chi Thang, Nguyen Quoc Viet Hung, Sathe Saket
Publication date: 12/09/2014

Participatory sensing has emerged as a new data collection paradigm, in which humans use their own devices (cell phone accelerometers, cameras, etc.) as sensors. This paradigm makes it possible to collect a huge amount of data from the crowd for world-wide applications, without the cost of buying dedicated sensors. Despite this benefit, the data collected from human sensors are inherently uncertain because participants give no quality guarantee. Moreover, the participatory sensing data are time series that not only exhibit highly irregular dependencies on time, but also vary from sensor to sensor. To overcome these issues, we study in this paper the problem of creating probabilistic data from given (uncertain) time series collected by participatory sensors. We approach the problem in two steps. In the first step, we generate probabilistic time series from raw time series using a dynamical model from the time series literature. In the second step, we combine probabilistic time series from multiple sensors based on the mutual relationship between the reliability of the sensors and the quality of their data. Through extensive experimentation, we demonstrate the efficiency of our approach on both real data and synthetic data.

Infoscience - École polytechnique fédérale de Lausanne

Access Methods for Markovian Streams

Authors: Christopher Ré, Julie Letchner, Magdalena Balazinska, Matthai Philipose
Publication date: 01/01/2009

CiteSeerX

Crossref