5 research outputs found

    Warped K-Means: An algorithm to cluster sequentially-distributed data

    Many devices generate large amounts of data that follow some sort of sequentiality, e.g., motion sensors, e-pens, and eye trackers, and often these data need to be compressed for classification, storage, and/or retrieval tasks. Traditional clustering algorithms can be used for this purpose, but unfortunately they do not cope with the sequential information implicitly embedded in such data. Thus, we revisit the well-known K-means algorithm and provide a general method to properly cluster sequentially-distributed data. We present Warped K-Means (WKM), a multi-purpose partitional clustering procedure that minimizes the sum of squared error criterion, while imposing a hard sequentiality constraint in the classification step. We illustrate the properties of WKM in three applications, one being the segmentation and classification of human activity. WKM outperformed five state-of-the-art clustering techniques to simplify data trajectories, achieving a recognition accuracy of near 97%, which is an improvement of around 66% over its peers. Moreover, such an improvement came with a reduction in the computational cost of more than one order of magnitude.
    This work has been partially supported by Casmacat (FP7-ICT-2011-7, Project 287576), tranScriptorium (FP7-ICT-2011-9, Project 600707), STraDA (MINECO, TIN2012-37475-0O2-01), and ALMPR (GVA, Prometeo/20091014) projects.
    Leiva Torres, LA.; Vidal, E. (2013). Warped K-Means: An algorithm to cluster sequentially-distributed data. Information Sciences. 237:196-210. https://doi.org/10.1016/j.ins.2013.02.042
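    The abstract describes clustering under a hard sequentiality constraint while minimizing the sum of squared error (SSE). The sketch below is not the WKM procedure, whose details the abstract does not give. It is a plain dynamic-programming baseline, swapped in for illustration, that splits a 1-D sequence into k contiguous segments with minimal total SSE. The names `segment_sse` and `sequential_clusters` are hypothetical.

```python
from typing import List, Sequence, Tuple


def segment_sse(prefix: List[float], prefix_sq: List[float], i: int, j: int) -> float:
    """SSE of samples i..j-1 around their own mean, computed from prefix sums."""
    n = j - i
    s = prefix[j] - prefix[i]
    sq = prefix_sq[j] - prefix_sq[i]
    return sq - s * s / n


def sequential_clusters(xs: Sequence[float], k: int) -> Tuple[float, List[int]]:
    """Split xs into k contiguous, non-empty segments with minimal total SSE.

    Illustrative baseline only (assumption: 1-D samples, exhaustive DP).
    Returns (total SSE, start index of each segment).
    """
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= len(xs)")

    # Prefix sums of x and x^2 give any segment's SSE in O(1).
    prefix, prefix_sq = [0.0], [0.0]
    for x in xs:
        prefix.append(prefix[-1] + x)
        prefix_sq.append(prefix_sq[-1] + x * x)

    inf = float("inf")
    # cost[c][j]: best SSE for the first j samples using exactly c segments.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):  # last segment covers samples i..j-1
                cand = cost[c - 1][i] + segment_sse(prefix, prefix_sq, i, j)
                if cand < cost[c][j]:
                    cost[c][j] = cand
                    back[c][j] = i

    # Walk the back-pointers to recover the segment boundaries.
    starts: List[int] = []
    j = n
    for c in range(k, 0, -1):
        j = back[c][j]
        starts.append(j)
    starts.reverse()
    return cost[k][n], starts
```

    Because every cluster is a contiguous run of samples, the temporal order of the data is preserved by construction, which is the property that unconstrained K-means lacks. This exhaustive search costs O(k n^2) and is meant only to make the objective concrete.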

    Vacuum ultraviolet laser induced breakdown spectroscopy (VUV-LIBS) for pharmaceutical analysis

    Laser induced breakdown spectroscopy (LIBS) allows quick analysis to determine the elemental composition of the target material. Samples need little or no preparation, removing the risk of contamination or loss of analyte. It is minimally ablative, so a negligible amount of the sample is destroyed, while allowing quantitative and qualitative results. Vacuum ultraviolet (VUV)-LIBS, due to the abundance of transitions at shorter wavelengths, offers improvements over LIBS in the visible region, such as lower limits of detection for trace elements, and extends LIBS to elements and samples not suited to visible LIBS. These qualities also make VUV-LIBS attractive for pharmaceutical analysis. Due to success in the pharmaceutical sector, molecules representing the active pharmaceutical ingredients (APIs) have become increasingly complex. These organic compounds reveal spectra densely populated with carbon and oxygen lines in the visible and infrared regions, making it increasingly difficult to identify an inorganic analyte. The VUV region poses a solution as there is much better spacing between spectral lines. VUV-LIBS experiments were carried out on pharmaceutical samples. This work is a proof of principle that VUV-LIBS in conjunction with machine learning can tell pharmaceuticals apart via classification. This work will attempt to test this principle in two ways. Firstly, by classifying pharmaceuticals that are very different from one another, i.e., having different APIs. This first test will gauge how well analytes that are essentially carbohydrates, but carry distinctly different APIs, can be separated into different classes using their VUV emission spectra. Secondly, by classifying two different brands of the same pharmaceutical, i.e., paracetamol. The second test will investigate the ability of machine learning to abstract and identify the differences in the spectra of two pharmaceuticals with the same API and separate them.
    This second test presents the application of VUV-LIBS combined with machine learning as a solution for at-line analysis of similar analytes, e.g., quality control. The machine learning techniques explored in this thesis were convolutional neural networks (CNNs), support vector machines, self-organizing maps and competitive learning. Principal component analysis (PCA) and machine learning are applied to classify analytes, allowing us to distinguish pharmaceuticals from one another based on their spectra. PCA and the machine learning techniques are compared against one another in this thesis. Several innovations were made; this work is the first in LIBS to use a short-time Fourier transform (STFT) method to generate input images for a CNN from VUV-LIBS spectra. This is also believed to be the first work in LIBS to develop and apply an ellipsoidal classifier based on PCA. The results of this work show that by lowering the pulse energy it is possible to gather more useful spectra over the surface of a sample. Although this yields spectra with poorer signal-to-noise, the samples can still be classified using the machine learning analytics. The results in this thesis indicate that, of all the machine learning techniques evaluated, CNNs have the best classification accuracy combined with the fastest run time. Prudent data augmentation can significantly reduce experimental workloads without reducing classification rates.
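    The abstract mentions using a short-time Fourier transform (STFT) to turn a 1-D spectrum into a 2-D image for a CNN. A minimal sketch of that idea follows, using only the Python standard library. The Hann window, the frame length, the hop size, and the function name `stft_magnitude` are all assumptions made for illustration; the abstract does not state the parameters used in the thesis.

```python
import cmath
import math
from typing import List, Sequence


def stft_magnitude(signal: Sequence[float], frame_len: int, hop: int) -> List[List[float]]:
    """Return a 2-D magnitude array: one row per frame, one column per frequency bin.

    Assumed settings: symmetric Hann window, direct DFT per frame,
    non-negative frequency bins only (0 .. frame_len // 2).
    """
    if frame_len < 2 or hop < 1 or len(signal) < frame_len:
        raise ValueError("need frame_len >= 2, hop >= 1, len(signal) >= frame_len")

    # Hann window tapers each frame to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1

    image: List[List[float]] = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(n_bins):
            # Direct DFT of the windowed frame at bin k.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            row.append(abs(acc))
        image.append(row)
    return image
```

    Each row of the returned array describes the local frequency content of one window of the input, so stacking the rows yields an image-like array that a 2-D network can consume. A real pipeline would normally use an FFT routine rather than this O(N^2) transform.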

    Competitive Learning Algorithms for Robust Vector Quantization

    The efficient representation and encoding of signals with limited resources, e.g., finite storage capacity and restricted transmission bandwidth, is a fundamental problem in technical as well as biological information processing systems. Typically, under realistic circumstances, the encoding and communication of messages has to deal with different sources of noise and disturbances. In this paper, we propose a unifying approach to data compression by robust vector quantization, which explicitly deals with channel noise, bandwidth limitations, and random elimination of prototypes. The resulting algorithm is able to limit the detrimental effect of noise in a very general communication scenario. In addition, the presented model allows us to derive a novel competitive neural networks algorithm, which covers topology preserving feature maps, the so-called neural-gas algorithm, and the maximum entropy soft-max rule as special cases. Furthermore, continuation methods based on these noise models impr..
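    For readers unfamiliar with competitive learning for vector quantization, the basic winner-take-all update can be sketched as follows. This is the plain textbook rule, not the robust algorithm proposed in the paper: it models no channel noise, bandwidth limits, or prototype elimination. The learning rate, epoch count, and function names are assumptions.

```python
from typing import List, Sequence


def nearest(prototypes: Sequence[Sequence[float]], x: Sequence[float]) -> int:
    """Index of the prototype closest to x in squared Euclidean distance."""
    return min(
        range(len(prototypes)),
        key=lambda i: sum((p - v) ** 2 for p, v in zip(prototypes[i], x)),
    )


def competitive_learning(
    data: Sequence[Sequence[float]],
    prototypes: Sequence[Sequence[float]],
    lr: float = 0.1,
    epochs: int = 20,
) -> List[List[float]]:
    """Online winner-take-all vector quantization (illustrative sketch).

    For each sample, only the winning (nearest) prototype moves,
    by a fraction lr of the way toward that sample.
    """
    protos = [list(p) for p in prototypes]  # copy so the input is not mutated
    for _ in range(epochs):
        for x in data:
            w = nearest(protos, x)
            protos[w] = [p + lr * (v - p) for p, v in zip(protos[w], x)]
    return protos
```

    After training, a sample is encoded by the index returned by `nearest`, which is the compressed representation a vector quantizer transmits or stores.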

    Competitive learning algorithms for robust vector quantization
