Onion Curve: A Space Filling Curve with Near-Optimal Clustering
Space filling curves (SFCs) are widely used in the design of indexes for
spatial and temporal data. Clustering is a key metric for an SFC, that measures
how well the curve preserves locality in moving from higher dimensions to a
single dimension. We present the {\em onion curve}, an SFC whose clustering
performance is provably close to optimal for the cube and near-cube shaped
query sets, irrespective of the side length of the query. We show that in
contrast, the clustering performance of the widely used Hilbert curve can be
far from optimal, even for cube-shaped queries. Since the clustering
performance of an SFC is critical to the efficiency of multi-dimensional
indexes based on the SFC, the onion curve can deliver improved performance for
data structures involving multi-dimensional data.
Comment: The short version is published in ICDE 1
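The clustering metric described in this abstract is commonly defined as the number of contiguous runs of curve positions needed to cover a query region (fewer runs means fewer index seeks). The sketch below measures it for a 2D Hilbert curve and, for comparison, a row-major ordering. It does not implement the onion curve, which the abstract does not define; the function names `hilbert_index`, `row_major_index`, and `clustering_number` are illustrative assumptions, and the grid side `n` is assumed to be a power of two.

```python
def hilbert_index(n, x, y):
    """Position of cell (x, y) along a Hilbert curve on an n x n grid (n a power of two)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def row_major_index(n, x, y):
    """Position of cell (x, y) in a plain row-by-row ordering."""
    return y * n + x


def clustering_number(index_fn, n, x0, y0, w, h):
    """Count maximal runs of consecutive curve positions inside a w x h query at (x0, y0)."""
    positions = sorted(
        index_fn(n, x, y)
        for x in range(x0, x0 + w)
        for y in range(y0, y0 + h)
    )
    # Every gap between neighbouring sorted positions starts a new run.
    return 1 + sum(1 for a, b in zip(positions, positions[1:]) if b != a + 1)


if __name__ == "__main__":
    n = 8
    for name, fn in (("hilbert", hilbert_index), ("row-major", row_major_index)):
        print(name, clustering_number(fn, n, 1, 1, 3, 3))
```

Averaging `clustering_number` over all placements of a fixed-size query gives the kind of average clustering figure that such curves are compared on.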
Generation of Porous Structures Using Fused Deposition
The Fused Deposition Modeling process uses hardware and a machine-level software
language that are very similar to those of a pen plotter. Consequently, the use of patterns with
poly-lines as basic geometric features, instead of the current method based on filled polygons
(monolithic models), can increase its efficiency.
In the current study, various toolpath planning methods have been developed to fabricate
porous structures. Computational domain decomposition methods can be applied to the physical
or to slice-level domains to generate structured and unstructured grids. Also, textures can be
created using periodic tiling of the layer with unit cells (squares, honeycombs, etc). Methods
based on curves include fractal space filling curves and change of effective road width within a
layer or within a continuous curve. Individual phases can also be placed in binary compositions.
In the present investigation, custom software has been developed and implemented to
generate build files (SML) and slice files (SSL) for the above-mentioned structures, demonstrating efficient control of the size, shape, and distribution of porosity.
Mechanical Engineerin
Compressive Mining: Fast and Optimal Data Mining in the Compressed Domain
Real-world data typically contain repeated and periodic patterns. This
suggests that they can be effectively represented and compressed using only a
few coefficients of an appropriate basis (e.g., Fourier, Wavelets, etc.).
However, distance estimation when the data are represented using different sets
of coefficients is still a largely unexplored area. This work studies the
optimization problems related to obtaining the \emph{tightest} lower/upper
bound on Euclidean distances when each data object is potentially compressed
using a different set of orthonormal coefficients. Our technique leads to
tighter distance estimates, which translates into more accurate search,
learning and mining operations \textit{directly} in the compressed domain.
We formulate the problem of estimating lower/upper distance bounds as an
optimization problem. We establish the properties of optimal solutions, and
leverage the theoretical analysis to develop a fast algorithm to obtain an
\emph{exact} solution to the problem. The suggested solution provides the
tightest estimation of the $L_2$-norm or the correlation. We show that typical
data-analysis operations, such as k-NN search or k-Means clustering, can
operate more accurately using the proposed compression and distance
reconstruction technique. We compare it with many other prevalent compression
and reconstruction techniques, including random projections and PCA-based
techniques. We highlight a surprising result, namely that when the data are
highly sparse in some basis, our technique may even outperform PCA-based
compression.
The contributions of this work are generic as our methodology is applicable
to any sequential or high-dimensional data as well as to any orthogonal data
transformation used for the underlying data compression scheme.
Comment: 25 pages, 20 figures, accepted in VLD
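To make the setting of this abstract concrete, the sketch below compresses two sequences by keeping different sets of orthonormal coefficients and then brackets their Euclidean distance. It is a simple Cauchy-Schwarz baseline, not the exact optimal algorithm the abstract refers to. It assumes an orthonormal DCT-II basis and that each compressed object also stores its total energy; the names `dct_ortho`, `compress`, and `distance_bounds` are hypothetical.

```python
import math


def dct_ortho(x):
    """Orthonormal DCT-II, so Euclidean distances are preserved (Parseval)."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        out.append(scale * sum(v * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, v in enumerate(x)))
    return out


def compress(coeffs, keep):
    """Keep the `keep` largest-magnitude coefficients plus the total energy."""
    order = sorted(range(len(coeffs)), key=lambda i: -abs(coeffs[i]))[:keep]
    kept = {i: coeffs[i] for i in order}
    energy = sum(c * c for c in coeffs)
    return kept, energy


def distance_bounds(a, b):
    """Lower and upper bounds on the distance between two compressed objects."""
    kept_a, energy_a = a
    kept_b, energy_b = b
    shared = kept_a.keys() & kept_b.keys()
    # Part of the inner product that is known exactly from shared positions.
    known_dot = sum(kept_a[i] * kept_b[i] for i in shared)
    # Energy each object has outside the shared positions.
    rest_a = max(energy_a - sum(kept_a[i] ** 2 for i in shared), 0.0)
    rest_b = max(energy_b - sum(kept_b[i] ** 2 for i in shared), 0.0)
    # Cauchy-Schwarz bounds the unknown part of the inner product.
    slack = math.sqrt(rest_a * rest_b)
    base = energy_a + energy_b - 2.0 * known_dot
    lower = math.sqrt(max(base - 2.0 * slack, 0.0))
    upper = math.sqrt(max(base + 2.0 * slack, 0.0))
    return lower, upper


if __name__ == "__main__":
    x = [math.sin(2 * math.pi * i / 8) + 0.1 * i for i in range(32)]
    y = [math.cos(2 * math.pi * i / 4) for i in range(32)]
    lo, hi = distance_bounds(compress(dct_ortho(x), 4), compress(dct_ortho(y), 6))
    print(lo, math.dist(x, y), hi)
```

This baseline ignores coefficients that only one of the two objects retained, so its bounds are valid but generally looser than the tightest bounds the abstract describes.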
On the non-local geometry of turbulence
A multi-scale methodology for the study of the non-local geometry of eddy structures in turbulence is developed. Starting from a given three-dimensional field, this consists of three main steps: extraction, characterization and classification of structures. The extraction step is done in two stages. First, a multi-scale decomposition based on the curvelet transform is applied to the full three-dimensional field, resulting in a finite set of component three-dimensional fields, one per scale. Second, by iso-contouring each component field at one or more iso-contour levels, a set of closed iso-surfaces is obtained that represents the structures at that scale. The characterization stage is based on the joint probability density function (p.d.f.), in terms of area coverage on each individual iso-surface, of two differential-geometry properties, the shape index and curvedness, plus the stretching parameter, a dimensionless global invariant of the surface. Taken together, this defines the geometrical signature of the iso-surface. The classification step is based on the construction of a finite set of parameters, obtained from algebraic functions of moments of the joint p.d.f. of each structure, that specify its location as a point in a multi-dimensional ‘feature space’. At each scale the set of points in feature space represents all structures at that scale, for the specified iso-contour value. This then allows the application, to the set, of clustering techniques that search for groups of structures with a common geometry. Results are presented of a first application of this technique to a passive scalar field obtained from a 512^3 direct numerical simulation of scalar mixing by forced, isotropic turbulence (Re_λ = 265).
These show a transition, with decreasing scale, from blob-like structures in the larger scales to blob- and tube-like structures with small or moderate stretching in the inertial range of scales, and then toward tube and, predominantly, sheet-like structures with a high level of stretching in the dissipation range of scales. Implications of these results for the dynamical behaviour of passive scalar stirring and mixing by turbulence are discussed.
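The two local descriptors named in this abstract, shape index and curvedness, have standard definitions in terms of the principal curvatures at a surface point. The sketch below assumes the Koenderink and van Doorn convention; the paper may use a different sign or a nondimensionalized curvedness, and the function names are illustrative.

```python
import math


def shape_index(k1, k2):
    """Shape index in [-1, 1] from principal curvatures (assumed convention)."""
    hi, lo = max(k1, k2), min(k1, k2)
    if hi == 0.0 and lo == 0.0:
        raise ValueError("shape index is undefined at a planar point")
    # atan2 handles umbilic points (hi == lo), where the ratio would divide by zero.
    return (2.0 / math.pi) * math.atan2(hi + lo, hi - lo)


def curvedness(k1, k2):
    """Curvedness: root-mean-square of the two principal curvatures."""
    return math.sqrt((k1 * k1 + k2 * k2) / 2.0)


if __name__ == "__main__":
    for label, (k1, k2) in {
        "sphere-like": (1.0, 1.0),
        "cylinder-like": (1.0, 0.0),
        "saddle": (1.0, -1.0),
    }.items():
        print(label, shape_index(k1, k2), curvedness(k1, k2))
```

Under this convention the shape index separates cap-like, ridge-like, and saddle-like points independently of size, while curvedness carries the scale. Which sign corresponds to convex or concave depends on the chosen normal orientation.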