Bolt: Accelerated Data Mining with Fast Vector Compression
Vectors of data are at the heart of machine learning and data mining.
Recently, vector quantization methods have shown great promise in reducing both
the time and space costs of operating on vectors. We introduce a vector
quantization algorithm that can compress vectors over 12x faster than existing
techniques while also accelerating approximate vector operations such as
distance and dot product computations by up to 10x. Because it can encode over
2GB of vectors per second, it makes vector quantization cheap enough to employ
in many more circumstances. For example, using our technique to compute
approximate dot products in a nested loop can multiply matrices faster than a
state-of-the-art BLAS implementation, even when our algorithm must first
compress the matrices.
In addition to showing the above speedups, we demonstrate that our approach
can accelerate nearest neighbor search and maximum inner product search by over
100x compared to floating point operations and up to 10x compared to other
vector quantization methods. Our approximate Euclidean distance and dot product
computations are not only faster than those of related algorithms with slower
encodings, but also faster than Hamming distance computations, which have
direct hardware support on the tested platforms. We also assess the errors of
our algorithm's approximate distances and dot products, and find that it is
competitive with existing, slower vector quantization algorithms.
Comment: Research track paper at KDD 201
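The abstract does not detail the encoding scheme, but the table-based speedups it describes are characteristic of product-quantization-style methods: split each vector into subspaces, quantize each subspace against a small codebook, and reduce a dot product to a few table lookups. A minimal sketch in Python (a generic product quantizer, not Bolt itself; all names and parameters are illustrative):

```python
import numpy as np

def pq_train(X, n_subspaces=4, n_centroids=16, iters=10, seed=0):
    """Train a product quantizer: split dims into subspaces, run k-means in each."""
    rng = np.random.default_rng(seed)
    d = X.shape[1] // n_subspaces
    codebooks = []
    for s in range(n_subspaces):
        sub = X[:, s * d:(s + 1) * d]
        C = sub[rng.choice(len(sub), n_centroids, replace=False)].copy()
        for _ in range(iters):
            assign = ((sub[:, None] - C[None]) ** 2).sum(-1).argmin(axis=1)
            for k in range(n_centroids):
                if np.any(assign == k):
                    C[k] = sub[assign == k].mean(axis=0)
        codebooks.append(C)
    return codebooks

def pq_encode(X, codebooks):
    """Replace each subvector by the index of its nearest centroid (one byte each)."""
    d = X.shape[1] // len(codebooks)
    codes = np.empty((len(X), len(codebooks)), dtype=np.uint8)
    for s, C in enumerate(codebooks):
        sub = X[:, s * d:(s + 1) * d]
        codes[:, s] = ((sub[:, None] - C[None]) ** 2).sum(-1).argmin(axis=1)
    return codes

def pq_dot(query, codes, codebooks):
    """Approximate dot products: one lookup table per subspace, then table sums."""
    d = len(query) // len(codebooks)
    tables = np.stack([C @ query[s * d:(s + 1) * d]
                       for s, C in enumerate(codebooks)])   # (subspaces, centroids)
    return tables[np.arange(len(codebooks)), codes].sum(axis=1)
```

Because each query builds its lookup tables once and then reads one entry per subspace per encoded vector, the inner loop over the database avoids floating-point multiplies entirely.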
Comparative Analysis: Heart Diagnosis Classification using BP-LVQ Neural Network Models For Analog and Digital Data
For decades, companies have been building massive data warehouses to store the resources they collect. Yet even though these resources are available, only a few companies have been able to determine the actual value stored in their databases. The procedure used to extract that value is known as data mining. Many technologies can be applied to data mining, and artificial neural networks (ANNs) are among them: an ANN is an information-processing system modeled on biological nervous systems. Backpropagation is one technique used for classification, and LVQ (Learning Vector Quantization), which falls under the competitive learning scheme, is also used for classification. This paper describes artificial neural networks and their characteristics, along with the workings of the backpropagation and LVQ algorithms. We present comparisons between backpropagation and LVQ for both analog and digital data, and attempt to explain the results obtained with back-propagation and LVQ
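The LVQ scheme compared in the paper can be summarized by its prototype update rule: move the winning prototype toward the input if their labels agree, and away from it otherwise. A minimal LVQ1 sketch (illustrative only, not the authors' implementation):

```python
import numpy as np

def lvq1_train(X, y, prototypes, proto_labels, lr=0.1, epochs=20, seed=0):
    """LVQ1: attract the winning prototype to same-class inputs, repel otherwise."""
    rng = np.random.default_rng(seed)
    W = prototypes.copy()
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            j = np.argmin(((W - X[i]) ** 2).sum(axis=1))   # nearest prototype
            sign = 1.0 if proto_labels[j] == y[i] else -1.0
            W[j] += sign * lr * (X[i] - W[j])
    return W

def lvq1_predict(X, W, proto_labels):
    """Classify each input with the label of its nearest prototype."""
    return proto_labels[((X[:, None] - W[None]) ** 2).sum(-1).argmin(axis=1)]
```

Backpropagation, by contrast, adjusts all weights by gradient descent on a global error; LVQ only ever moves the single winning prototype per presentation, which is what places it in the competitive learning family.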
Competitive learning/reflected residual vector quantization for coding angiogram images
Medical images need to be compressed for the storage and transmission of large volumes of medical data. Reflected residual vector quantization (RRVQ) has recently emerged as one of the computationally cheap compression algorithms. RRVQ, a lossy compression scheme, was introduced as an alternative design algorithm for the residual vector quantization (RVQ) structure (a structure known for providing progressive quantization). However, RRVQ is not guaranteed to reach a global minimum, and it was found to have a higher probability of diverging when used with non-Gaussian and non-Laplacian image sources such as angiogram images. By employing a competitive learning neural network in the codebook design process, we sought a stable, convergent algorithm. This paper therefore applies a competitive learning neural network to the RRVQ design algorithm, yielding a competitive learning RRVQ algorithm for the RVQ structure. Simulation results indicate that the proposed algorithm is indeed convergent with high probability and provides a peak signal-to-noise ratio (PSNR) of approximately 32 dB for angiogram images at an average encoding bit rate of 0.25 bits per pixel.
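The competitive learning step used here for codebook design can be sketched independently of the RRVQ structure: for each training vector, only the nearest codeword moves. A generic simple-competitive-learning trainer (illustrative; the paper embeds this idea inside the RRVQ design loop, which is not reproduced here):

```python
import numpy as np

def scl_codebook(X, n_codes=8, lr0=0.5, epochs=30, seed=0):
    """Simple competitive learning: only the winning codeword moves toward each input."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), n_codes, replace=False)].copy()  # init from data
    for e in range(epochs):
        lr = lr0 * (1.0 - e / epochs)                 # linearly decaying learning rate
        for i in rng.permutation(len(X)):
            j = np.argmin(((C - X[i]) ** 2).sum(axis=1))  # winner
            C[j] += lr * (X[i] - C[j])                    # move winner only
    return C
```

The decaying learning rate is what gives the stochastic update its convergent behavior; a residual quantizer would apply this trainer stage by stage to the residuals left by earlier stages.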
On the use of self-organizing maps to accelerate vector quantization
Self-organizing maps (SOM) are widely used for their topology preservation
property: neighboring input vectors are quantized (or classified) either at
the same location or at neighboring ones on a predefined grid. SOM are also widely
used for their more classical vector quantization property. We show in this
paper that using SOM instead of the more classical Simple Competitive Learning
(SCL) algorithm drastically increases the speed of convergence of the vector
quantization process. This fact is demonstrated through extensive simulations
on artificial and real examples, with specific SOM (fixed and decreasing
neighborhoods) and SCL algorithms.
Comment: Following the ESANN 199 conference
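The difference between SOM and SCL lies in the neighborhood function: in SOM, every unit moves toward each input, weighted by its grid distance to the winner, and that neighborhood typically shrinks over training. A minimal 1-D SOM sketch with a decreasing neighborhood (all parameters are illustrative):

```python
import numpy as np

def som_train(X, grid=5, sigma0=2.0, lr0=0.5, epochs=20, seed=0):
    """1-D SOM: all units move toward each input, weighted by grid distance to the winner."""
    rng = np.random.default_rng(seed)
    W = X[rng.choice(len(X), grid, replace=False)].copy()
    pos = np.arange(grid)                     # unit positions on the 1-D grid
    T = epochs * len(X)
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            frac = t / T
            lr = lr0 * (1.0 - frac)           # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.5   # shrinking neighborhood radius
            j = np.argmin(((W - X[i]) ** 2).sum(axis=1))        # winner
            h = np.exp(-((pos - j) ** 2) / (2.0 * sigma ** 2))  # neighborhood weights
            W += lr * h[:, None] * (X[i] - W)
            t += 1
    return W
```

Setting `h` to 1 for the winner and 0 elsewhere recovers SCL; the paper's claim is that keeping the smooth, shrinking neighborhood makes the quantization error fall much faster.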
S-TREE: Self-Organizing Trees for Data Clustering and Online Vector Quantization
This paper introduces S-TREE (Self-Organizing Tree), a family of models that use unsupervised learning to construct hierarchical representations of data and online tree-structured vector quantizers. The S-TREE1 model, which features a new tree-building algorithm, can be implemented with various cost functions. An alternative implementation, S-TREE2, which uses a new double-path search procedure, is also developed. S-TREE2 implements an online procedure that approximates an optimal (unstructured) clustering solution while imposing a tree-structure constraint. The performance of the S-TREE algorithms is illustrated with data clustering and vector quantization examples, including a Gauss-Markov source benchmark and an image compression application. S-TREE performance on these tasks is compared with the standard tree-structured vector quantizer (TSVQ) and the generalized Lloyd algorithm (GLA). The image reconstruction quality with S-TREE2 approaches that of GLA while taking less than 10% of the computing time. S-TREE1 and S-TREE2 also compare favorably with the standard TSVQ in both the time needed to create the codebook and the quality of image reconstruction.
Office of Naval Research (N00014-95-10409, N00014-95-0G57
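For context, the TSVQ baseline mentioned above encodes by a greedy root-to-leaf descent, testing only two children per level instead of every codeword. A minimal binary TSVQ sketch (not S-TREE itself; helper names are illustrative):

```python
import numpy as np

def two_means(X, iters=10, seed=0):
    """2-means split of the rows of X; returns centroids and assignments."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), 2, replace=False)].copy()
    for _ in range(iters):
        a = ((X[:, None] - C[None]) ** 2).sum(-1).argmin(axis=1)
        for k in (0, 1):
            if np.any(a == k):
                C[k] = X[a == k].mean(axis=0)
    return C, a

def build_tsvq(X, depth):
    """Recursively split the data with 2-means; leaves are codewords."""
    if depth == 0 or len(X) < 2:
        return X.mean(axis=0)
    C, a = two_means(X)
    if a.min() == a.max():            # degenerate split: stop here
        return X.mean(axis=0)
    return (C, build_tsvq(X[a == 0], depth - 1), build_tsvq(X[a == 1], depth - 1))

def tsvq_encode(x, node):
    """Greedy root-to-leaf descent: O(depth) comparisons instead of O(#codewords)."""
    while isinstance(node, tuple):
        C, left, right = node
        node = left if ((x - C[0]) ** 2).sum() <= ((x - C[1]) ** 2).sum() else right
    return node  # leaf codeword
```

The greedy descent is what makes tree-structured quantizers fast but suboptimal relative to a full search; S-TREE2's double-path search is described as a way to narrow that gap while keeping the tree constraint.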
Batch and median neural gas
Neural Gas (NG) constitutes a very robust clustering algorithm given
Euclidean data which does not suffer from the problem of local minima like
simple vector quantization, or topological restrictions like the
self-organizing map. Based on the cost function of NG, we introduce a batch
variant of NG which shows much faster convergence and which can be interpreted
as an optimization of the cost function by the Newton method. This formulation
has the additional benefit that, based on the notion of the generalized median
in analogy to Median SOM, a variant for non-vectorial proximity data can be
introduced. We prove convergence of batch and median versions of NG, SOM, and
k-means in a unified formulation, and we investigate the behavior of the
algorithms in several experiments.
Comment: In Special Issue after WSOM 05 Conference, 5-8 September 2005, Pari
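The batch update alluded to here replaces the online step with a rank-weighted mean over all data, using the neighborhood function h_lambda(k) = exp(-k/lambda) applied to each prototype's distance rank. A minimal sketch (an illustrative reading of batch NG, not the authors' code):

```python
import numpy as np

def batch_neural_gas(X, n_protos=8, epochs=30, lam0=4.0, seed=0):
    """Batch NG: each prototype becomes a rank-weighted mean of all data points."""
    rng = np.random.default_rng(seed)
    W = X[rng.choice(len(X), n_protos, replace=False)].copy()
    for e in range(epochs):
        lam = lam0 * (0.01 / lam0) ** (e / (epochs - 1))    # annealed neighborhood range
        d = ((X[:, None] - W[None]) ** 2).sum(-1)           # (N, k) squared distances
        ranks = np.argsort(np.argsort(d, axis=1), axis=1)   # k_ij: rank of proto j for x_i
        h = np.exp(-ranks / lam)                            # (N, k) neighborhood weights
        W = (h.T @ X) / h.sum(axis=0)[:, None]              # rank-weighted means
    return W
```

As lambda is annealed toward zero, the weights concentrate on each point's winner and the update degenerates to the k-means (Lloyd) batch step, which is consistent with the unified convergence treatment of NG, SOM, and k-means described in the abstract.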
Magnification Control in Self-Organizing Maps and Neural Gas
We consider different ways to control the magnification in self-organizing
maps (SOM) and neural gas (NG). Starting from early approaches of magnification
control in vector quantization, we then concentrate on different approaches for
SOM and NG. We show that three structurally similar approaches can be applied
to both algorithms: localized learning, concave-convex learning, and winner
relaxing learning. Thereby, the approach of concave-convex learning in SOM is
extended to a more general description, whereas the concave-convex learning for
NG is new. In general, the control mechanisms generate only slightly different
behavior comparing both neural algorithms. However, we emphasize that the NG
results are valid for any data dimension, whereas in the SOM case the results
hold only for the one-dimensional case.
Comment: 24 pages, 4 figures