A method of storing vector data in compressed form using clustering
Recent advances in machine learning algorithms for information retrieval have made it possible to represent
text and multimodal documents as vectors. These vector representations (embeddings) preserve the semantic content of documents and allow search to be performed by calculating the distance between vectors. Compressing
embeddings can reduce the amount of memory they occupy and improve computational efficiency. The article discusses
existing lossless and lossy methods for compressing vector representations. A method
is proposed that reduces the error of lossy compression by first clustering the vector representations. The method
performs a preliminary clustering of the vector representations, stores the center of each cluster, and stores
the coordinates of each vector representation relative to the center of its cluster. The cluster centers
are then compressed losslessly, and the resulting shifted vector representations are compressed lossily.
To restore the original vector representations, the coordinates of the center of the corresponding cluster are
added to the coordinates of the shifted representation. The proposed method was tested on the
fashion-mnist-784-euclidean and NYT-256-angular datasets. Vector representations compressed lossily
by reducing the bit depth were compared with vector representations compressed using the proposed method. With a
slight (around 10%) increase in the size of the compressed data, the absolute error caused by lossy compression
decreased by factors of four and two, respectively, for the tested datasets. The developed method can be applied in tasks that
require storing and processing vector representations of multimodal documents, for example, in the development
of search engines.
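The compression scheme described in this abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not name the clustering algorithm or the lossy codec, so the sketch assumes a plain k-means with farthest-point initialization and a uniform 8-bit quantizer as the "reduced bit depth" step. Cluster centers are kept at full precision as a stand-in for lossless compression, the data are synthetic, and all function names are hypothetical.

```python
import numpy as np

def kmeans(x, k, iters=10):
    """Lloyd's k-means with farthest-point initialization (an assumed choice)."""
    centers = [x[0]]
    for _ in range(k - 1):
        # squared distance from each point to its nearest chosen center
        d = np.min([((x - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(x[d.argmax()])
    centers = np.array(centers, dtype=np.float64)
    for _ in range(iters):
        d = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = x[labels == j].mean(axis=0)
    return centers, labels

def quantize(x):
    """Lossy step: uniform 8-bit quantization over the global value range."""
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / 255 if hi > lo else 1.0
    codes = np.round((x - lo) / scale).astype(np.uint8)
    return codes, lo, scale

def dequantize(codes, lo, scale):
    return codes.astype(np.float64) * scale + lo

def compress(x, k):
    """Cluster, keep centers exact, quantize coordinates relative to centers."""
    centers, labels = kmeans(x, k)
    shifted = x - centers[labels]
    codes, lo, scale = quantize(shifted)
    return centers, labels, codes, lo, scale

def decompress(centers, labels, codes, lo, scale):
    """Add each vector's cluster center back to its dequantized shifted form."""
    return dequantize(codes, lo, scale) + centers[labels]

# Synthetic, well-separated clusters (made-up data, not the paper's datasets).
rng = np.random.default_rng(0)
blob_centers = rng.uniform(-10, 10, size=(4, 16))
x = blob_centers[rng.integers(0, 4, size=2000)] + rng.normal(0, 0.1, size=(2000, 16))

# Baseline: quantize the raw vectors directly.
direct = dequantize(*quantize(x))
err_direct = np.abs(direct - x).mean()

# Clustered variant: quantize only the offsets from the cluster centers.
restored = decompress(*compress(x, k=4))
err_clustered = np.abs(restored - x).mean()

print(f"direct: {err_direct:.5f}  clustered: {err_clustered:.5f}")
```

In this sketch the shifted vectors span a much narrower value range than the raw vectors, so the same 8 bits give a finer quantization step. The extra storage is the table of cluster centers plus one cluster label per vector, which corresponds to the modest size increase mentioned in the abstract.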
Interactions of single and multi-layer graphene oxides with water, methane, organic solvents and HCl studied by 1H NMR
Contemporary characterisation techniques for graphenes are often applied to samples in a dried state or in vacuum, which can lead to significant structural changes and make it difficult to assess the actual physical or physicochemical characteristics of graphenes in a colloidal state. The interfacial phenomena between water or mixtures (of water with benzene, methane, or HCl) bound to single-layer graphene oxide (SLGO) and multi-layer graphene oxide (MLGO) in different dispersion media (CDCl3, CCl4, CDCl3/DMSO, air) were studied using low-temperature (200–280 K) 1H NMR spectroscopy. The NMR cryoporometry method allows the textural characteristics of SLGO and MLGO to be determined as a function of their degree of hydration. It was found that SLGO in diluted suspensions is more agglomerated after freezing and thawing. This effect could be attributed to cryogelation of the carbon sheets, which leads to a decrease in the specific surface area (from 1841 to 533 m²/g), that is, the area of the sheets accessible to water that remains unfrozen at subzero temperatures. The results show that the cryoporometry method is appropriate for investigating the texture of both wetted and suspended graphene oxides.