Efficient Compression Technique for Sparse Sets
Recent technological advancements have led to the generation of huge amounts
of data over the web, such as text, image, audio and video. Most of this data
is high-dimensional and sparse, e.g., the bag-of-words representation used
for representing text. Often, an efficient search for similar data points needs
to be performed in many applications like clustering, nearest neighbour search,
ranking and indexing. Even though there have been significant increases in
computational power, a simple brute-force similarity-search on such datasets is
inefficient and at times impossible. Thus, it is desirable to get a compressed
representation which preserves the similarity between data points. In this
work, we consider the data points as sets and use Jaccard similarity as the
similarity measure. Compression techniques are generally evaluated on the
following parameters: 1) Randomness required for compression, 2) Time required
for compression, 3) Dimension of the data after compression, and 4) Space
required to store the compressed data. Ideally, the compressed representation
of the data should be such that the similarity between each pair of data
points is preserved, while keeping the time and the randomness required for
compression as low as possible.
We show that the compression technique suggested by Pratap and Kulkarni also
works well for Jaccard similarity. We present a theoretical proof of the same
and complement it with rigorous experiments on synthetic as well as
real-world datasets. We also compare our results with the state-of-the-art
"min-wise independent permutation", and show that our compression algorithm
achieves almost equal accuracy while significantly reducing the compression
time and the randomness.
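The min-wise hashing baseline mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the compression scheme of Pratap and Kulkarni, whose details the abstract does not give. The function names are hypothetical, sets are assumed to be non-empty collections of non-negative integer feature indices smaller than the prime 2^31 - 1, and random linear hash functions stand in for truly min-wise independent permutations.

```python
import random

PRIME = 2_147_483_647  # 2**31 - 1; item indices are assumed to be smaller than this


def jaccard(a, b):
    """Exact Jaccard similarity |A & B| / |A | B| of two sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def make_seeds(k, rng):
    """Draw k random (a, b) pairs defining hashes h(x) = (a*x + b) mod PRIME."""
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(k)]


def minhash_signature(items, seeds):
    """Keep, for each hash function, the minimum hash value over the set.

    Assumes `items` is non-empty; the linear hashes only approximate
    min-wise independent permutations.
    """
    return [min((a * x + b) % PRIME for x in items) for a, b in seeds]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree; estimates the Jaccard similarity."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)
```

Each set is reduced to k integers regardless of its original size, and the fraction of matching positions approximates the Jaccard similarity, with an error that shrinks as k grows. The k random hash functions are the "randomness" cost that the abstract says the proposed method reduces.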
The 2-dimensional non-linear sigma-model on a random lattice
The O(n) non-linear sigma-model is simulated on 2-dimensional regular and
random lattices. We use two different levels of randomness in the construction
of the random lattices and give a detailed explanation of the geometry of such
lattices. In the simulations, we calculate the mass gap for and 8,
analysing the asymptotic scaling of the data and computing the ratio of Lambda
parameters. These ratios are in
agreement with previous semi-analytical calculations. We also numerically
calculate the topological susceptibility by using the cooling method.
Comment: REVTeX file, 23 pages. 13 postscript figures in a separate compressed
tar file
Compressed/reconstructed test images for CRAF/Cassini
A set of compressed, then reconstructed, test images submitted to the Comet Rendezvous Asteroid Flyby (CRAF)/Cassini project is presented as part of its evaluation of near-lossless, high-compression algorithms for representing image data. A total of seven test image files were provided by the project. The seven test images were compressed, then reconstructed with high quality (root mean square error of approximately one or two gray levels on an 8-bit gray scale), using discrete cosine transforms or Hadamard transforms and efficient entropy coders. The resulting compression ratios varied from about 2:1 to about 10:1, depending on the activity or randomness in the source image. This was accomplished without any special effort to optimize the quantizer or to introduce special postprocessing to filter the reconstruction errors. A more complete set of measurements, showing the relative performance of the compression algorithms over a wide range of compression ratios and reconstruction errors, shows that additional compression is possible at a small sacrifice in fidelity.
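The transform-and-quantize step described in this abstract, together with the root mean square error it reports, can be sketched in plain Python. This is an illustration only: the abstract does not specify the block size, the quantizer, or the entropy coder, so the square blocks, the uniform quantization step, and all function names here are assumptions, and entropy coding is left out.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(r) for r in zip(*m)]


def compress_block(block, step):
    """2-D DCT of a square block, then uniform quantization with the given step."""
    c = dct_matrix(len(block))
    coeffs = matmul(matmul(c, block), transpose(c))
    return [[round(v / step) for v in row] for row in coeffs]


def reconstruct_block(quantized, step):
    """Dequantize the coefficients and apply the inverse 2-D DCT."""
    c = dct_matrix(len(quantized))
    dequantized = [[v * step for v in row] for row in quantized]
    return matmul(matmul(transpose(c), dequantized), c)


def rmse(a, b):
    """Root mean square error between two equally sized blocks."""
    diffs = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return math.sqrt(sum(diffs) / len(diffs))
```

Because the transform is orthonormal, the reconstruction RMSE in this sketch cannot exceed half the quantization step. A larger step produces more zero coefficients for an entropy coder to exploit, at the cost of a higher error, which is the trade-off between compression ratio and fidelity that the abstract describes.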
Computing A Glimpse of Randomness
A Chaitin Omega number is the halting probability of a universal Chaitin
(self-delimiting Turing) machine. Every Omega number is both computably
enumerable (the limit of a computable, increasing, converging sequence of
rationals) and random (its binary expansion is an algorithmic random sequence).
In particular, every Omega number is strongly non-computable. The aim of this
paper is to describe a procedure, which combines Java programming and
mathematical proofs, for computing the exact values of the first 64 bits of a
Chaitin Omega:
0000001000000100000110001000011010001111110010111011101000010000. Full
description of programs and proofs will be given elsewhere.
Comment: 16 pages; Experimental Mathematics (accepted)
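For reference, the standard definition behind this abstract can be written out as follows. The notation is an assumption, since the abstract fixes none: U is the universal self-delimiting machine, |p| is the length in bits of a program p, and the superscript (s) marks the s-th approximation.

```latex
% Halting probability of a universal self-delimiting machine U.
% The domain of U is prefix-free, so by Kraft's inequality the sum is at most 1.
\[
  \Omega_U \;=\; \sum_{p \,:\, U(p)\ \text{halts}} 2^{-|p|}
\]
% Computable enumerability: summing only over programs of length at most s
% that halt within s steps gives a computable, non-decreasing sequence of
% rationals converging to \Omega_U from below.
\[
  \Omega_U^{(s)} \;=\; \sum_{\substack{|p| \le s \\ U(p)\ \text{halts within } s \text{ steps}}} 2^{-|p|},
  \qquad
  \Omega_U^{(s)} \le \Omega_U^{(s+1)}, \qquad
  \lim_{s \to \infty} \Omega_U^{(s)} = \Omega_U .
\]
```

The approximations converge, but no computable bound tells how close a given one is to the limit. This is why fixing exact leading bits requires mathematical proofs in addition to the programs.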