7 research outputs found

    Applying Fourier-Transform Infrared Spectroscopy and Self-Organizing Maps for Forensic Classification of White-Copy Papers

    White-copy A4 paper is an important substrate for the preparation of most formal and informal documents. It is often encountered as a questioned document in cases such as falsification, embezzlement, or forgery. By comparing the questioned piece (e.g., a page of a contract) against the rest deemed authentic, an indicator of forgery can be derived from an inconsistent chemical composition. However, classification and even differentiation of white copy paper have been difficult because of highly similar physical properties and chemical composition. The self-organizing map (SOM) has proven useful in many published works as a tool for clustering and classification of samples, especially with high-dimensional data. In this preliminary paper, we explore the feasibility of SOM in classifying white copy paper for forensic purposes. A total of 150 infrared spectra were collected from three varieties of white paper using attenuated total reflectance Fourier-transform infrared (ATR-FTIR) spectroscopy. Each IR spectrum is composed of thousands of wavenumbers (i.e., input variables) and serves as a chemical fingerprint of the sample. The performance of SOM models built on raw wavenumbers and on their reduced form (i.e., principal components, PCs) was also compared. Results showed that the SOM built with PCs is much more efficient than the one built with raw wavenumbers, with a classification accuracy of over 90% obtained in an external validation test. This study shows that SOM coupled with ATR-FTIR spectroscopy could be a potential non-destructive approach for forensic paper analysis.
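The PCA-then-SOM pipeline described in this abstract can be sketched roughly as follows. This is a minimal illustration on the assumption of a plain online SOM with a Gaussian neighbourhood and majority-vote node labelling; the function names (`pca_scores`, `train_som`, `som_classify`), the grid size, and the learning schedule are hypothetical and are not taken from the paper.

```python
import numpy as np

def pca_scores(X, n_components):
    """Project mean-centred rows of X onto the top principal components."""
    Xc = X - X.mean(axis=0)
    # SVD of the centred data: the rows of Vt are the principal axes.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def train_som(X, rows, cols, n_iter=2000, lr0=0.5, seed=0):
    """Train a rows x cols SOM online; returns weights of shape (rows*cols, d)."""
    rng = np.random.default_rng(seed)
    sigma0 = max(rows, cols) / 2.0
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise each node from a randomly chosen training sample.
    weights = X[rng.integers(0, len(X), rows * cols)].astype(float).copy()
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 1e-3     # shrinking neighbourhood radius
        x = X[rng.integers(0, len(X))]
        bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
        grid_d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
        h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights

def bmu_index(X, weights):
    """Index of the best-matching unit for every row of X."""
    d2 = ((X[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def som_classify(X_train, y_train, X_test, weights):
    """Label each node by majority vote, then classify test rows by their BMU.

    y_train must hold non-negative integer class labels; nodes that no
    training sample maps to keep the placeholder label -1.
    """
    node_label = np.full(len(weights), -1)
    train_bmu = bmu_index(X_train, weights)
    for node in np.unique(train_bmu):
        node_label[node] = np.bincount(y_train[train_bmu == node]).argmax()
    return node_label[bmu_index(X_test, weights)]
```

Reducing thousands of wavenumbers to a handful of PC scores before training shrinks every distance computation in the SOM loop, which is one plausible reading of the efficiency gain the abstract reports.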

    Visualizing engineering design data using a modified two-level self-organizing map clustering approach

    Engineers tasked with designing large and complex systems are continually in need of decision-making aids able to sift through enormous amounts of data produced through simulation and experimentation. Understanding these systems often requires visualizing multidimensional design data. Visual cues such as size, color, and symbols are often used to denote specific variables (dimensions) as well as characteristics of the data. However, these cues are unable to effectively convey information attributed to a system containing more than three dimensions. Two general techniques can be employed to reduce the complexity of information presented to an engineer: dimension reduction, and individual variable comparison. Each approach can provide a comprehensible visualization of the resulting design space, which is vital for an engineer to decide upon an appropriate optimization algorithm. Visualization techniques, such as self-organizing maps (SOMs), offer powerful methods able to surmount the difficulties of reducing the complexity of n-dimensional data by producing simple to understand visual representations that quickly highlight trends to support decision-making. The SOM can be extended by providing relevant output information in the form of contextual labels. Furthermore, these contextual labels can be leveraged to visualize a set of output maps containing statistical evaluations of each node residing within a trained SOM. These maps give a designer a visual context to the data set’s natural topology by highlighting the nodal performance amongst the maps. A drawback to using SOMs is the clustering of promising points with predominately less desirable data. Similar data groupings can be revealed from the trained output maps using visualization techniques such as the SOM, but these are not inherently cluster analysis methods. 
Cluster analysis is an approach able to assimilate similar data objects into “natural groups” without prior knowledge of a data set. Engineering data composed of design alternatives with associated variable parameters often contain data objects with unknown classification labels. Consequently, identifying the correct classifications can be difficult and costly. This thesis applies a cluster analysis technique to SOMs to segment a high-dimensional dataset into “meta-clusters”. Furthermore, the thesis describes the algorithm created to establish these meta-clusters through the development of several computational metrics involving intra- and inter-cluster densities. The results from this work show the presented algorithm’s ability to narrow a large, complex system’s plethora of design alternatives into a few overarching sets of design groups containing similar principal characteristics, which saves the time a designer would otherwise spend analyzing numerous design alternatives.
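The two-level idea (train a SOM, then cluster its nodes into meta-clusters) might be illustrated as below. Note that this sketch swaps in a simple greedy centroid-linkage merge for the second level; it does not reproduce the thesis's density-based metrics, and the names `merge_prototypes` and `assign_meta_clusters` are hypothetical.

```python
import numpy as np

def merge_prototypes(codebook, n_clusters):
    """Greedily merge the closest groups of SOM prototypes (centroid linkage).

    codebook: array of shape (n_nodes, d) holding trained SOM weight vectors.
    Returns an integer meta-cluster label for every node.
    """
    groups = [[i] for i in range(len(codebook))]
    while len(groups) > n_clusters:
        cents = np.array([codebook[g].mean(axis=0) for g in groups])
        d = np.linalg.norm(cents[:, None] - cents[None, :], axis=2)
        np.fill_diagonal(d, np.inf)          # ignore self-distances
        a, b = np.unravel_index(np.argmin(d), d.shape)
        a, b = int(min(a, b)), int(max(a, b))
        groups[a] += groups.pop(b)           # merge group b into group a
    labels = np.empty(len(codebook), dtype=int)
    for k, g in enumerate(groups):
        labels[g] = k
    return labels

def assign_meta_clusters(X, codebook, node_labels):
    """Give each data row the meta-cluster of its best-matching node."""
    d = np.linalg.norm(X[:, None, :] - codebook[None, :, :], axis=2)
    return node_labels[d.argmin(axis=1)]
```

Clustering the prototypes rather than the raw data keeps the second level cheap, since a trained map usually has far fewer nodes than the data set has design alternatives.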

    Spiking neurons in 3D growing self-organising maps

    In Kohonen’s Self-Organising Map (SOM) learning, preserving the map topology so that it reflects the actual input features is a significant process. Misinterpretation of the training samples can lead to failure in identifying the important features that may affect the outcomes generated by the SOM model. Nonetheless, this is a challenging task, as most real problems involve complex and insufficient data. The Spiking Neural Network (SNN) is the third generation of Artificial Neural Network (ANN), in which information is transferred from one neuron to another using spikes, processed, and used to trigger a response as output. This study therefore embedded spiking neurons in SOM learning in order to enhance the learning process. The proposed method was divided into five main phases. Phase 1 investigated issues related to the SOM learning algorithm. In Phase 2, datasets were collected for the analyses carried out in Phase 3, in which a neural coding scheme for the data representation process was implemented in the classification task. Next, in Phase 4, the spiking SOM model was designed, developed, and evaluated using classification accuracy rate and quantisation error. The outcomes showed that the proposed model attained a high classification accuracy rate with low quantisation error, preserving the quality of the generated map with respect to the original input data. Lastly, in the final phase, a Spiking 3D Growing SOM is proposed to address the surface reconstruction issue by enhancing the spiking SOM with a 3D map structure and a growing grid mechanism. The application of spiking neurons to enhance the performance of SOM is relevant in this study because of their ability to spike and to send a reaction when special features are identified, based on what has been learned from the presented datasets. The study outcomes contribute to the enhancement of SOM in learning the patterns of the datasets, as well as to proposing a better tool for data analysis.
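For reference, the quantisation error mentioned above is commonly computed as the mean distance between each sample and the weight vector of its best-matching unit. The sketch below assumes that standard definition with Euclidean distance; the abstract does not state which variant the study used, and the function name is hypothetical.

```python
import numpy as np

def quantisation_error(X, weights):
    """Mean Euclidean distance from each row of X to its nearest SOM node.

    weights: array of shape (n_nodes, d); a 2D or 3D map is simply flattened
    to a list of nodes, since grid positions do not enter this measure.
    """
    d = np.linalg.norm(X[:, None, :] - weights[None, :, :], axis=2)
    return float(d.min(axis=1).mean())
```

A lower value means the node weights sit closer to the data, which is the sense in which a low quantisation error indicates a map that represents the original inputs well.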

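    Redução de dimensionalidade e visualização interativa de dados multimensionais utilizando processamento paralelo em GPU [Dimensionality reduction and interactive visualization of multidimensional data using parallel GPU processing]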

    Advisor: Prof. Dr. Sérgio Scheer. Thesis (doctorate), Universidade Federal do Paraná, Setor de Tecnologia, Programa de Pós-Graduação em Métodos Numéricos em Engenharia. Defense: Curitiba, 29/08/2016. Includes references: f. 102-105.
    Abstract: The way a data set is presented influences the analysis and decision-making processes concerning its contents. The visualization process should therefore represent, as well as possible, the relations between its elements. Real phenomena and processes produce multidimensional data sets, for which it would be ideal to use visual representations with as many features as possible; this is not always feasible because of device limitations and because understanding a set with more than three dimensions is not natural. The problem addressed is the visualization of a large data set, such as those resulting from numerical simulations or from the sensing of a structure, process, or natural phenomenon by a set of different types of sensors, using a low-cost computing environment. In these cases, tools are needed to assist in the visualization and analysis of the data produced, facilitating their comprehension by the various professionals involved. Based on these considerations, this research aims to propose an approach for interactive visualization and analysis of a multidimensional data volume, such that the entire data set is represented in the resulting image. To this end, parallel processing on graphics processing units is used to implement two dimensionality reduction (DR) techniques, Multidimensional Scaling (MDS) and the Star Coordinates transformation, to produce images that represent the contents of the n-dimensional data volume. Four approaches to multidimensional data visualization are described and subsequently tested in a prototype using General-Purpose Computation on Graphics Processing Units (GPGPU). The processing results indicate the feasibility of visualizing an n-dimensional data volume using a DR technique on a low-cost computer equipped with a graphics card.
    Keywords: Multidimensional scaling, Parallel processing, Star coordinates, Dimensionality reduction (DR), Three-dimensional image.
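As an illustration of one of the two techniques named in this abstract, the Star Coordinates transformation can be sketched as a sum of per-dimension axis vectors arranged evenly around a circle. This is a CPU-only NumPy sketch under that standard formulation; it does not reproduce the thesis's GPU implementation, and the function name and the min-max normalisation are assumptions.

```python
import numpy as np

def star_coordinates(X):
    """Project n-dimensional rows of X to 2D with Star Coordinates.

    Each dimension j gets a unit axis vector at angle 2*pi*j/n. Every value is
    min-max scaled to [0, 1] per dimension, and a point's 2D position is the
    sum of its scaled values times the corresponding axis vectors.
    """
    X = np.asarray(X, dtype=float)
    n_dims = X.shape[1]
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0                     # constant columns map to 0
    Xn = (X - lo) / span
    angles = 2.0 * np.pi * np.arange(n_dims) / n_dims
    axes = np.column_stack([np.cos(angles), np.sin(angles)])  # shape (n_dims, 2)
    return Xn @ axes
```

Because the projection is a single matrix product in which each row is handled independently, it is the kind of operation that parallelises naturally across data points, which may be why it suits a GPU implementation.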

    A new approach for data clustering and visualization using self-organizing maps

    A self-organizing map (SOM) is a nonlinear, unsupervised neural network model that can be used for data clustering and visualization. One of the major shortcomings of the SOM algorithm is the difficulty non-expert users have in interpreting the information contained in a trained SOM. In this paper, this problem is tackled by introducing an enhanced version of the proposed visualization method, which consists of three major steps: (1) calculating the single-linkage inter-neuron distance, (2) calculating the number of data points in each neuron, and (3) finding the cluster boundary. The experimental results show that the proposed approach effectively conveys the data distribution, inter-neuron distances, and cluster boundary, and indicate that its visualization quality is better than that of other visualization methods. Furthermore, the proposed visualization scheme not only makes the clustering results intuitively easy to understand but also achieves good visualization effects on unlabeled data sets.
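The three steps listed in this abstract can be loosely illustrated as follows. This sketch assumes a rectangular grid, Euclidean distances between grid-adjacent neurons (in the spirit of a U-matrix), and a fixed threshold for marking boundaries; the paper's exact single-linkage distance and boundary rule are not given in the abstract, so all names and the threshold are hypothetical.

```python
import numpy as np

def neighbour_distances(weights, rows, cols):
    """Step 1 (approximation): distances between grid-adjacent neurons.

    Returns (horiz, vert) with shapes (rows, cols-1) and (rows-1, cols).
    """
    W = weights.reshape(rows, cols, -1)
    horiz = np.linalg.norm(W[:, 1:] - W[:, :-1], axis=2)
    vert = np.linalg.norm(W[1:] - W[:-1], axis=2)
    return horiz, vert

def hit_counts(X, weights):
    """Step 2: number of data points whose best-matching unit is each neuron."""
    d = np.linalg.norm(X[:, None, :] - weights[None, :, :], axis=2)
    return np.bincount(d.argmin(axis=1), minlength=len(weights))

def boundary_edges(horiz, vert, threshold):
    """Step 3 (approximation): mark edges between dissimilar neighbours."""
    return horiz > threshold, vert > threshold
```

Drawing the marked edges over a grid whose cells are shaded by hit count gives one picture that shows data density, inter-neuron distance, and cluster borders together, which is the combination the abstract emphasises.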