
    Bolt: Accelerated Data Mining with Fast Vector Compression

    Vectors of data are at the heart of machine learning and data mining. Recently, vector quantization methods have shown great promise in reducing both the time and space costs of operating on vectors. We introduce a vector quantization algorithm that can compress vectors over 12x faster than existing techniques while also accelerating approximate vector operations such as distance and dot product computations by up to 10x. Because it can encode over 2GB of vectors per second, it makes vector quantization cheap enough to employ in many more circumstances. For example, using our technique to compute approximate dot products in a nested loop can multiply matrices faster than a state-of-the-art BLAS implementation, even when our algorithm must first compress the matrices. In addition to showing the above speedups, we demonstrate that our approach can accelerate nearest neighbor search and maximum inner product search by over 100x compared to floating point operations and up to 10x compared to other vector quantization methods. Our approximate Euclidean distance and dot product computations are not only faster than those of related algorithms with slower encodings, but also faster than Hamming distance computations, which have direct hardware support on the tested platforms. We also assess the errors of our algorithm's approximate distances and dot products, and find that it is competitive with existing, slower vector quantization algorithms. Comment: Research track paper at KDD 2017.
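
    As a concrete illustration of the lookup-table idea behind such quantized dot products, the sketch below implements plain product quantization in NumPy: vectors are split into subspaces, each subspace is encoded as a centroid index, and a query's dot product is approximated by summing precomputed per-subspace table entries. This is a minimal sketch of the generic technique, not Bolt itself; all function names and parameters here are illustrative assumptions.

```python
import numpy as np

def train_pq_codebooks(X, n_subspaces=4, n_centroids=16, n_iter=10, seed=0):
    """Toy product-quantization codebooks: independent k-means per subspace."""
    rng = np.random.default_rng(seed)
    d = X.shape[1] // n_subspaces
    codebooks = []
    for s in range(n_subspaces):
        sub = X[:, s * d:(s + 1) * d]
        C = sub[rng.choice(len(sub), n_centroids, replace=False)].astype(float)
        for _ in range(n_iter):
            assign = np.argmin(((sub[:, None, :] - C[None, :, :]) ** 2).sum(-1), axis=1)
            for k in range(n_centroids):
                if np.any(assign == k):
                    C[k] = sub[assign == k].mean(axis=0)
        codebooks.append(C)
    return codebooks

def pq_encode(x, codebooks):
    """Map a vector to one centroid index per subspace."""
    d = len(x) // len(codebooks)
    return [int(np.argmin(((x[s * d:(s + 1) * d] - C) ** 2).sum(-1)))
            for s, C in enumerate(codebooks)]

def pq_approx_dot(q, codes, codebooks):
    """Approximate <q, x>: build per-subspace lookup tables for the query once,
    then sum one table entry per stored code."""
    d = len(q) // len(codebooks)
    tables = [C @ q[s * d:(s + 1) * d] for s, C in enumerate(codebooks)]
    return float(sum(tables[s][c] for s, c in enumerate(codes)))
```

    The speed advantage of this family of methods comes from reusing the per-query tables across many encoded vectors, so each approximate dot product costs only a handful of table lookups.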

    Comparative Analysis: Heart Diagnosis Classification using BP-LVQ Neural Network Models For Analog and Digital Data

    For decades, companies have been building massive data warehouses to store the resources they collect. Although these stored resources are available, only a few companies have been able to extract the actual value held in their databases. The procedure used to extract that value is known as data mining. Many technologies can be used to apply data mining; among them are artificial neural networks (ANNs), information-processing systems that resemble biological nervous systems. Backpropagation is one technique used for classification, and LVQ (Learning Vector Quantization), which falls under the competitive learning scheme, is also used for classification. This paper describes artificial neural networks and their characteristics, and explains the workings of the backpropagation and LVQ algorithms. We present comparisons between backpropagation and LVQ for both analog and digital data, and attempt to explain the differences in results between the two.
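
    To make the LVQ side of the comparison concrete, the following is a minimal LVQ1 sketch in NumPy, assuming labeled prototypes have already been initialized (for example, from per-class means): the winning prototype is pulled toward a sample of its own class and pushed away from samples of other classes. The learning rate, epoch count, and function names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def lvq1_train(X, y, prototypes, proto_labels, lr=0.05, epochs=20, seed=0):
    """LVQ1: pull the winning prototype toward same-class samples and push it
    away from samples of a different class."""
    rng = np.random.default_rng(seed)
    W = prototypes.astype(float).copy()
    proto_labels = np.asarray(proto_labels)
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            x, label = X[i], y[i]
            winner = np.argmin(((W - x) ** 2).sum(axis=1))
            sign = 1.0 if proto_labels[winner] == label else -1.0
            W[winner] += sign * lr * (x - W[winner])
    return W

def lvq1_predict(X, W, proto_labels):
    """Classify each sample with the label of its nearest prototype."""
    proto_labels = np.asarray(proto_labels)
    return proto_labels[np.argmin(((X[:, None, :] - W[None, :, :]) ** 2).sum(-1), axis=1)]
```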

    Competitive learning/reflected residual vector quantization for coding angiogram images

    Medical images need to be compressed for the storage and transmission of large volumes of medical data. Reflected residual vector quantization (RRVQ) has recently emerged as one of the computationally cheap compression algorithms. RRVQ, a lossy compression scheme, was introduced as an alternative design algorithm for the residual vector quantization (RVQ) structure, a structure known for providing progressive quantization. However, RRVQ is not guaranteed to reach a global minimum, and it was found to have a higher probability of diverging when used with non-Gaussian and non-Laplacian image sources such as angiograms. By employing a competitive learning neural network in the codebook design process, we sought a stable, convergent algorithm. This paper presents the resulting competitive learning RRVQ design algorithm for the RVQ structure. Simulation results indicate that the proposed algorithm converges with high probability and provides a peak signal-to-noise ratio (PSNR) of approximately 32 dB for angiogram images at an average encoding bit rate of 0.25 bits per pixel.
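
    The multi-stage residual structure shared by RVQ and its variants can be sketched as follows. This shows only the generic encode/decode pipeline, in which each stage quantizes the residual left by the previous stages; the reflected codebook design and the competitive-learning training proposed in the paper are not reproduced. Codebook shapes and names are assumptions.

```python
import numpy as np

def rvq_encode(x, stage_codebooks):
    """Multi-stage residual quantization: each stage picks the codeword closest
    to whatever the previous stages could not represent."""
    residual = np.asarray(x, dtype=float).copy()
    codes = []
    for C in stage_codebooks:                      # each C: (n_codewords, dim)
        idx = int(np.argmin(((residual - C) ** 2).sum(axis=1)))
        codes.append(idx)
        residual -= C[idx]
    return codes

def rvq_decode(codes, stage_codebooks):
    """The reconstruction is simply the sum of the selected codewords."""
    return sum(C[i] for i, C in zip(codes, stage_codebooks))
```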

    On the use of self-organizing maps to accelerate vector quantization

    Self-organizing maps (SOM) are widely used for their topology preservation property: neighboring input vectors are quantized (or classified) either at the same location or at neighboring ones on a predefined grid. SOM are also widely used for their more classical vector quantization property. We show in this paper that using SOM instead of the more classical Simple Competitive Learning (SCL) algorithm drastically increases the speed of convergence of the vector quantization process. This fact is demonstrated through extensive simulations on artificial and real examples, with specific SOM (fixed and decreasing neighborhoods) and SCL algorithms. Comment: Following the ESANN 199 conference.
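
    The difference between the two update rules can be sketched in a few lines of NumPy: SCL moves only the winning prototype, while the SOM also moves the winner's neighbors on the predefined grid, weighted by a neighborhood kernel. The Gaussian kernel, the in-place updates, and the variable names below are illustrative assumptions rather than the exact setup of the paper's simulations.

```python
import numpy as np

def scl_step(W, x, lr):
    """Simple Competitive Learning: only the winning prototype moves."""
    winner = np.argmin(((W - x) ** 2).sum(axis=1))
    W[winner] += lr * (x - W[winner])

def som_step(W, grid_pos, x, lr, sigma):
    """SOM: the winner and its grid neighbours move, weighted by a Gaussian
    neighbourhood on the grid (grid_pos holds each unit's grid coordinates)."""
    winner = np.argmin(((W - x) ** 2).sum(axis=1))
    grid_d2 = ((grid_pos - grid_pos[winner]) ** 2).sum(axis=1)
    h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
    W += lr * h[:, None] * (x - W)
```

    Shrinking sigma over training recovers SCL in the limit, since the neighborhood then collapses onto the winner alone.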

    S-TREE: Self-Organizing Trees for Data Clustering and Online Vector Quantization

    This paper introduces S-TREE (Self-Organizing Tree), a family of models that use unsupervised learning to construct hierarchical representations of data and online tree-structured vector quantizers. The S-TREE1 model, which features a new tree-building algorithm, can be implemented with various cost functions. An alternative implementation, S-TREE2, which uses a new double-path search procedure, is also developed. S-TREE2 implements an online procedure that approximates an optimal (unstructured) clustering solution while imposing a tree-structure constraint. The performance of the S-TREE algorithms is illustrated with data clustering and vector quantization examples, including a Gauss-Markov source benchmark and an image compression application. S-TREE performance on these tasks is compared with the standard tree-structured vector quantizer (TSVQ) and the generalized Lloyd algorithm (GLA). The image reconstruction quality with S-TREE2 approaches that of GLA while taking less than 10% of computer time. S-TREE1 and S-TREE2 also compare favorably with the standard TSVQ in both the time needed to create the codebook and the quality of image reconstruction. Office of Naval Research (N00014-95-10409, N00014-95-0G57)
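
    For reference, the standard greedy TSVQ descent that S-TREE is compared against can be sketched as below; S-TREE2's double-path search, which tracks two candidate paths instead of one, is not reproduced here. The Node class and function signature are illustrative assumptions.

```python
import numpy as np

class Node:
    """Binary TSVQ node; internal nodes have two children, leaves have none."""
    def __init__(self, centroid, left=None, right=None):
        self.centroid = np.asarray(centroid, dtype=float)
        self.left, self.right = left, right

def tsvq_encode(x, root):
    """Greedy tree-structured VQ: at each internal node, descend to the child
    whose centroid is closer to x; the leaf centroid is the reconstruction."""
    node, path = root, []
    while node.left is not None and node.right is not None:
        if ((x - node.left.centroid) ** 2).sum() <= ((x - node.right.centroid) ** 2).sum():
            path.append(0)
            node = node.left
        else:
            path.append(1)
            node = node.right
    return path, node.centroid
```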

    Batch and median neural gas

    Neural Gas (NG) constitutes a very robust clustering algorithm given Euclidean data which does not suffer from the problem of local minima like simple vector quantization, or from topological restrictions like the self-organizing map. Based on the cost function of NG, we introduce a batch variant of NG which shows much faster convergence and which can be interpreted as an optimization of the cost function by the Newton method. This formulation has the additional benefit that, based on the notion of the generalized median in analogy to Median SOM, a variant for non-vectorial proximity data can be introduced. We prove convergence of batch and median versions of NG, SOM, and k-means in a unified formulation, and we investigate the behavior of the algorithms in several experiments. Comment: In the special issue after the WSOM 05 conference, 5-8 September 2005, Paris.
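
    The batch update described here has a compact closed form: every prototype is recomputed as a mean of all data points, weighted by exp(-rank/lambda), where rank is that prototype's position when the prototypes are sorted by distance to the point. Below is a minimal NumPy sketch of one such step; the neighborhood range lambda and its annealing schedule are left to the caller and are assumptions here.

```python
import numpy as np

def batch_ng_step(X, W, lam):
    """One batch Neural Gas step: each prototype becomes a weighted mean of all
    data points, with weights exp(-rank / lam) based on its distance rank."""
    d2 = ((X[:, None, :] - W[None, :, :]) ** 2).sum(-1)    # (n_points, n_protos)
    ranks = np.argsort(np.argsort(d2, axis=1), axis=1)     # 0 = closest prototype
    h = np.exp(-ranks / lam)
    return (h.T @ X) / h.sum(axis=0)[:, None]
```

    Annealing lam from a large value toward zero across iterations recovers k-means in the limit, since the weight then concentrates entirely on the closest prototype for each point.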

    Magnification Control in Self-Organizing Maps and Neural Gas

    We consider different ways to control the magnification in self-organizing maps (SOM) and neural gas (NG). Starting from early approaches to magnification control in vector quantization, we then concentrate on different approaches for SOM and NG. We show that three structurally similar approaches can be applied to both algorithms: localized learning, concave-convex learning, and winner-relaxing learning. The approach of concave-convex learning in SOM is extended to a more general description, whereas concave-convex learning for NG is new. In general, the control mechanisms generate only slightly different behavior in the two neural algorithms. However, we emphasize that the NG results are valid for any data dimension, whereas in the SOM case the results hold only for the one-dimensional case. Comment: 24 pages, 4 figures.