PPQ-Trajectory : spatio-temporal quantization for querying in large trajectory repositories
We present PPQ-trajectory, a spatio-temporal quantization-based solution for querying large dynamic trajectory data. PPQ-trajectory includes a partition-wise predictive quantizer (PPQ) that generates an error-bounded codebook with autocorrelation and spatial proximity-based partitions. The codebook is indexed to run approximate and exact spatio-temporal queries over compressed trajectories. PPQ-trajectory includes a coordinate quadtree coding for the codebook with support for exact queries. An incremental temporal partition-based index is utilised to avoid full reconstruction of trajectories during queries. An extensive set of experimental results for spatio-temporal queries on real trajectory datasets is presented. PPQ-trajectory shows significant improvements over the alternatives with respect to several performance measures, including the accuracy of results when the summary is used directly to provide approximate query results, the spatial deviation with which spatio-temporal path queries can be answered when the summary is used as an index, and the time taken to construct the summary. Superior results on the quality of the summary and the compression ratio are also demonstrated.
Data Management and Mining in Astrophysical Databases
We analyse the issues involved in the management and mining of astrophysical
data. The traditional approach to data management in the astrophysical field is
not able to keep up with the increasing size of the data gathered by modern
detectors. Automatic tools for extracting information from large datasets, i.e. data
mining techniques such as clustering and classification algorithms, will play an
essential role in astrophysical research. This calls for an approach to data
management based on data warehousing, emphasizing the
efficiency and simplicity of data access; efficiency is obtained using
multidimensional access methods and simplicity is achieved by properly handling
metadata. Applied to large datasets, clustering and classification techniques pose
additional requirements: computational and memory scalability with respect to
the data size, and interpretability and objectivity of the clustering or
classification results. In this study we address some possible solutions.
Comment: 10 pages, Late
Fertility and its Meaning: Evidence from Search Behavior
Fertility choices are linked to the different preferences and constraints of
individuals and couples, and vary markedly by socio-economic status, as well
as by cultural and institutional context. The meaning of childbearing and
child-rearing, therefore, differs between individuals and across groups. In
this paper, we combine data from Google Correlate and Google Trends for the
U.S. with ground truth data from the American Community Survey to derive new
insights into fertility and its meaning. First, we show that Google Correlate
can be used to illustrate socio-economic differences in the circumstances
around pregnancy and birth: e.g., searches for "flying while pregnant" are
linked to high-income fertility, and searches for "paternity test" are linked to non-marital
fertility. Second, we combine several search queries to build predictive models
of regional variation in fertility, explaining about 75% of the variance.
Third, we explore whether aggregated web search data can also be used to model
fertility trends.
Comment: This is a preprint of a short paper accepted at ICWSM'17. Please cite
that version instead.