10 research outputs found

    Recognition of Hijaiyah characters using the Modified Direction Feature (MDF) and Learning Vector Quantization 3 (LVQ 3) methods

    One script whose letters have unique characteristics is the Hijaiyah alphabet: the characteristics of a Hijaiyah letter change depending on whether the isolated letter is placed at the beginning, middle, or end of a word. Besides examining the characteristics of each letter, another way to recognise Hijaiyah characters is to use pattern recognition and artificial neural networks. In this study, feature extraction is performed with the Modified Direction Feature (MDF) method and classification with the Learning Vector Quantization 3 (LVQ 3) method. The tests carried out were: image matrix sizes of 80x80, 100x100, and 120x120 pixels; and learning rate values of 0.01, 0.03, 0.05, and 0.07. The results show that the system is able to recognise Hijaiyah character patterns, with a best accuracy of 82.44% on a 120x120-pixel image matrix with a learning rate of 0.03. Keywords: Hijaiyah letters, pattern recognition, Modified Direction Feature, Learning Vector Quantization
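
    Since the abstract names Kohonen's LVQ 3 as the classifier, a minimal sketch of a single LVQ 3 update step may help fix ideas. It follows the standard formulation; the MDF feature extraction stage, prototype initialisation, and the outer training loop are omitted, and the parameter defaults merely echo the values tested above.

```python
import numpy as np

def lvq3_step(protos, labels, x, y, alpha=0.03, window=0.3, epsilon=0.3):
    """One LVQ 3 update on the two prototypes nearest to sample x of class y."""
    d = np.linalg.norm(protos - x, axis=1)
    i, j = np.argsort(d)[:2]                        # two nearest prototypes
    s = (1.0 - window) / (1.0 + window)
    in_window = min(d[i] / d[j], d[j] / d[i]) > s   # x lies near the decision border
    if labels[i] == y and labels[j] == y:
        # both nearest prototypes correct: move both slightly toward x
        protos[i] += epsilon * alpha * (x - protos[i])
        protos[j] += epsilon * alpha * (x - protos[j])
    elif in_window and (labels[i] == y) != (labels[j] == y):
        # one correct, one wrong, inside the window: LVQ 2.1-style update
        c, w = (i, j) if labels[i] == y else (j, i)
        protos[c] += alpha * (x - protos[c])        # attract the correct prototype
        protos[w] -= alpha * (x - protos[w])        # repel the wrong one
    return protos
```

    A full run would repeat this step over the MDF feature vectors for several epochs while decaying alpha, then label each test image by its nearest prototype.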

    Multi-image classification and compression using vector quantization

    Vector Quantization (VQ) is an image processing technique based on statistical clustering, originally designed for image compression. In this dissertation, several methods for multi-image classification and compression based on a VQ design are presented. It is demonstrated that VQ can perform joint multi-image classification and compression by associating a class identifier with each multi-spectral signature codevector. We extend the Weighted Bayes Risk VQ (WBRVQ) method, previously used for single-component images, which explicitly incorporates a Bayes risk component into the distortion measure used in the quantizer design and thereby permits a flexible trade-off between classification and compression priorities. In the specific case of multi-spectral images, we investigate the application of the Multi-scale Retinex algorithm as a preprocessing stage, before classification and compression, that performs dynamic range compression, reduces the dependence on lighting conditions, and generally enhances apparent spatial resolution. The goals of this research are four-fold: (1) to study the interrelationship between statistical clustering, classification and compression in a multi-image VQ context; (2) to study mixed-pixel classification and combined classification and compression for simulated and actual, multispectral and hyperspectral multi-images; (3) to study the effects of multi-image enhancement on class spectral signatures; and (4) to study the preservation of scientific data integrity as a function of compression. In this research, a key issue is not just the subjective quality of the resulting images after classification and compression but also the effect of multi-image dimensionality on the complexity of the optimal coder design.
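
    The abstract's key idea is a distortion measure that adds a Bayes risk term to the usual squared error, so that quantisation trades off compression fidelity against classification. The sketch below illustrates that trade-off inside a generalized Lloyd iteration; the risk matrix, the lam weighting, and the per-codevector class labels are illustrative assumptions, not the dissertation's exact WBRVQ formulation.

```python
import numpy as np

def wbrvq_assign(X, cls, codebook, code_cls, risk, lam=0.5):
    """Assign each vector to the codevector minimising MSE + lam * Bayes risk.

    risk[a, b] is the assumed cost of coding a class-a sample with a class-b
    codevector; lam trades compression fidelity against classification."""
    se = ((X[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)  # squared error
    r = risk[cls[:, None], code_cls[None, :]]   # risk per (sample, codevector) pair
    return np.argmin(se + lam * r, axis=1)

def lloyd_update(X, assign, codebook):
    """Centroid step of the generalized Lloyd algorithm."""
    for k in range(len(codebook)):
        members = X[assign == k]
        if len(members):
            codebook[k] = members.mean(axis=0)
    return codebook
```

    Alternating the two functions until the assignments stabilise yields a codebook biased toward whichever priority lam favours.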

    Application of Learning Vector Quantization 3 (LVQ 3) in classifying toddler nutritional status based on anthropometric indices and factors affecting nutrition

    The nutritional status of a toddler reflects the balance between the nutrient intake obtained from food and the body's nutrient needs in children aged one to five years. One way to assess toddler nutritional status is anthropometry. This study builds a system for classifying toddler nutritional status based on the weight-for-age (BB/U) and height-for-age (TB/U) anthropometric indices by applying the LVQ 3 method. The variables used are sex, age, weight, height, measurement method, vitamin A, economic status, mother's education, and father's occupation. The data are primary data from a 2019 mass weighing programme, comprising 500 toddler records. The data were split 90%:10% and 80%:20%. The parameter variations tested were learning rates of 0.05, 0.15, and 0.3 and window values of 0.2, 0.3, and 0.4. Based on the tests performed, LVQ 3 was successfully applied to toddler nutritional status classification: with a 90%:10% split and a learning rate of 0.05, the highest accuracy was 90% for BB/U and 60% for TB/U.
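
    The abstract lists the variables and splits but not the preprocessing; the sketch below shows one plausible way to encode the mixed categorical and numeric variables and form the 90%:10% split before LVQ 3 training. The field names and the encoding are assumptions, not the study's actual pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(records):
    """Turn toddler records (list of dicts) into numeric feature vectors.

    Numeric fields (age, weight, height) are min-max scaled; categorical
    fields (sex, measurement method, vitamin A, economic status, mother's
    education, father's occupation) are one-hot encoded."""
    num = np.array([[r["age"], r["weight"], r["height"]] for r in records], float)
    num = (num - num.min(0)) / (num.max(0) - num.min(0) + 1e-9)
    parts = [num]
    for field in ["sex", "method", "vitamin_a", "economy", "mother_edu", "father_job"]:
        values = sorted({r[field] for r in records})
        parts.append(np.array([[float(r[field] == v) for v in values] for r in records]))
    return np.hstack(parts)

# 90%:10% split of the 500 records, as in the study
idx = rng.permutation(500)
train_idx, test_idx = idx[:450], idx[450:]
```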

    Self-organising maps: statistical analysis, treatment and applications.

    This thesis presents some substantial theoretical analyses and optimal treatments of Kohonen's self-organising map (SOM) algorithm, and explores the practical application potential of the algorithm for vector quantisation, pattern classification, and image processing. It consists of two major parts. In the first part, the SOM algorithm is investigated and analysed from a statistical viewpoint. The proof of its universal convergence for any dimensionality is obtained using a novel and extended form of the Central Limit Theorem. Its feature space is shown to be an approximate multivariate Gaussian process, which will eventually converge and form a mapping that minimises the mean-square distortion between the feature and input spaces. The diminishing effect of the initial states and the implicit effects of the learning rate and neighbourhood function on its convergence and ordering are analysed and discussed. Distinct and meaningful definitions of its ordering, and associated measures, are presented in relation to the map's fault-tolerance. The SOM algorithm is further enhanced by incorporating a proposed constraint, or Bayesian modification, in order to achieve optimal vector quantisation or pattern classification. The second part of this thesis addresses the task of unsupervised texture-image segmentation by means of SOM networks and model-based descriptions. A brief review of texture analysis in terms of definitions, perceptions, and approaches is given. Markov random field (MRF) model-based approaches are discussed in detail. Arising from this, a hierarchical self-organised segmentation structure, which consists of a local MRF parameter estimator, a SOM network, and a simple voting layer, is proposed and is shown, by theoretical analysis and practical experiment, to achieve a maximum likelihood or maximum a posteriori segmentation. A fast, simple, but efficient boundary relaxation algorithm is proposed as a post-processor to further refine the resulting segmentation. The class-number validation problem in a fully unsupervised segmentation is approached by a classical, simple, and on-line minimum mean-square-error method. Experimental results indicate that this method is very efficient for texture segmentation problems. The thesis concludes with some suggestions for further work on SOM neural networks.
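
    For reference, the algorithm analysed in the first part is the standard Kohonen update; a minimal sketch with a Gaussian neighbourhood function and a linearly decaying learning rate follows. Both decay schedules are illustrative assumptions, not the thesis's analysed settings.

```python
import numpy as np

def train_som(X, rows=10, cols=10, epochs=20, a0=0.5, s0=3.0):
    """Plain Kohonen SOM: weights W on a rows x cols grid, trained on X."""
    rng = np.random.default_rng(0)
    W = rng.normal(size=(rows * cols, X.shape[1]))
    grid = np.array([(i, j) for i in range(rows) for j in range(cols)], float)
    T, t = epochs * len(X), 0
    for _ in range(epochs):
        for x in rng.permutation(X):
            alpha = a0 * (1 - t / T)                 # decaying learning rate
            sigma = s0 * (1 - t / T) + 0.5           # shrinking neighbourhood
            bmu = np.argmin(((W - x) ** 2).sum(1))   # best-matching unit
            h = np.exp(-((grid - grid[bmu]) ** 2).sum(1) / (2 * sigma ** 2))
            W += alpha * h[:, None] * (x - W)        # neighbourhood-weighted update
            t += 1
    return W, grid
```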

    Strategies for neural networks in ballistocardiography with a view towards hardware implementation

    A thesis submitted for the degree of Doctor of Philosophy at the University of Luton. The work described in this thesis is based on the results of a clinical trial conducted by the research team at the Medical Informatics Unit of the University of Cambridge, which show that the Ballistocardiogram (BCG) has prognostic value in detecting impaired left ventricular function before it becomes clinically overt as myocardial infarction leading to sudden death. The objective of this study is to develop and demonstrate a framework for realising an on-line BCG signal classification model in a portable device that would have the potential to find pathological signs as early as possible for home health care. Two new on-line automatic models for time-domain BCG classification are proposed. Both systems are based on a two-stage process: input feature extraction followed by a neural classifier. One system uses a principal component analysis neural network, and the other a discrete wavelet transform, to reduce the input dimensionality. Results of the classification, dimensionality reduction, and comparison are presented. It is indicated that the combined wavelet transform and MLP system has a more reliable performance than the combined neural networks system in situations where the data available to determine the network parameters is limited. Moreover, the wavelet transform requires no prior knowledge of the statistical distribution of the data samples, and the computational complexity and training time are reduced. Overall, a methodology for realising an automatic BCG classification system for a portable instrument is presented. A fully parallel neural network design for a low-cost platform using field programmable gate arrays (Xilinx's XC4000 series) is explored. This addresses the potential speed requirements in the biomedical signal processing field. It also demonstrates a flexible hardware design approach so that an instrument's parameters can be updated as data expands with time. To reduce the hardware design complexity and to increase the system performance, a hybrid learning algorithm using random optimisation and the backpropagation rule is developed to achieve an efficient weight update mechanism in low weight precision learning. The simulation results show that the hybrid learning algorithm is effective in solving the network paralysis problem and that convergence is much faster than with the standard backpropagation rule. The hidden and output layer nodes have been mapped onto Xilinx FPGAs with automatic placement and routing tools. The static timing analysis results suggest that the proposed network implementation could achieve a performance of 2.7 billion connections per second.
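
    The hybrid weight-update mechanism described above might look like the sketch below, which combines a quantised gradient step with a random perturbation kept only when it lowers the error. The single linear layer, the fixed-point quantiser, and the acceptance rule are assumptions based on the abstract, not the thesis's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize(w, bits=8, scale=4.0):
    """Round weights onto a low-precision fixed-point grid."""
    step = scale / (2 ** (bits - 1))
    return np.clip(np.round(w / step) * step, -scale, scale)

def hybrid_step(W, X, Y, lr=0.05, noise=0.05):
    """Quantised gradient step; if it fails to reduce the error, try a
    random perturbation and keep it only if it helps -- the escape from
    'network paralysis' mentioned in the abstract."""
    def loss(w):
        return ((X @ w - Y) ** 2).mean()
    grad = 2.0 * X.T @ (X @ W - Y) / len(X)   # MSE gradient for a linear layer
    W_bp = quantize(W - lr * grad)
    if loss(W_bp) < loss(W):
        return W_bp                            # the gradient step helped
    W_rand = quantize(W + rng.normal(0.0, noise, W.shape))
    return W_rand if loss(W_rand) < loss(W) else W
```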

    Vision-based neural network classifiers and their applications

    A thesis submitted for the degree of Doctor of Philosophy of the University of Luton. Visual inspection of defects is an important part of quality assurance in many fields of production. It plays a very useful role in industrial applications, relieving human inspectors, improving inspection accuracy, and hence increasing productivity. Research has previously been done on defect classification of wood veneers using techniques such as neural networks, and a certain degree of success has been achieved. However, improvements in both classification accuracy and running time are necessary if the techniques are to be widely adopted in industry, and this has motivated this research. This research presents a method using a rough-sets-based neural network with fuzzy input (RNNFI). A variable precision rough set (VPRS) method is proposed to remove redundant features, utilising the characteristics of VPRS for data analysis and processing. The reduced data is fuzzified to represent the feature data in a form more suitable as input to an improved BP neural network classifier. The BP neural network classifier is improved in three aspects: additional momentum, self-adaptive learning rates and dynamic error segmenting. Finally, to further refine the classifier, a uniform design (UD) approach is introduced to optimise the key parameters, because UD can generate a minimal set of uniform and representative design points scattered within the experiment domain. Optimal factor settings are achieved using a response surface methodology (RSM) model and the nonlinear quadratic programming algorithm (NLPQL). Experiments have shown that the hybrid method is capable of classifying the defects of wood veneers with a fast convergence speed and high classification accuracy, compared with other methods such as a neural network with fuzzy input and a rough-sets-based neural network. The research has demonstrated a methodology for visual inspection of defects, especially for situations where there is a large amount of data and a fast running speed is required. It is expected that this method can be applied to automatic visual inspection on production lines for other products such as ceramic tiles and strip steel.
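
    Of the three improvements to the BP classifier named above, the first two can be sketched generically as follows; dynamic error segmenting is not specified enough in the abstract to reconstruct, and the update below is a textbook-style assumption rather than the thesis's code.

```python
import numpy as np

def improved_bp_step(w, grad, err, state, mu=0.9, up=1.05, down=0.7):
    """One update with additional momentum and a self-adaptive learning
    rate: accelerate while the error falls, back off when it rises.
    'state' carries the previous velocity, learning rate, and error."""
    if err < state["err"]:
        state["lr"] *= up        # error decreased: speed up
    else:
        state["lr"] *= down      # error increased: damp the step
        state["v"][:] = 0.0      # and drop the stale momentum
    state["err"] = err
    state["v"] = mu * state["v"] - state["lr"] * grad   # momentum term
    return w + state["v"]

# usage: state = {"v": np.zeros_like(w), "lr": 0.1, "err": np.inf}
```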

    Data mining using neural networks

    Data mining is about the search for relationships and global patterns in large databases that are increasing in size. Data mining is beneficial for anyone who has a huge amount of data, for example, customer and business data, transaction, marketing, financial, manufacturing and web data. The results of data mining are also referred to as knowledge in the form of rules, regularities and constraints. Rule mining is one of the popular data mining methods, since rules provide concise statements of potentially important information that is easily understood by end users, as well as actionable patterns. Rule mining has received a good deal of attention and enthusiasm from data mining researchers, since it is capable of solving many data mining problems such as classification, association, customer profiling, summarization, segmentation and many others. This thesis makes several contributions by proposing rule mining methods using genetic algorithms and neural networks. The thesis first proposes rule mining methods using a genetic algorithm. These methods are based on an integrated framework but are capable of mining three major classes of rules. Moreover, the rule mining process in these methods is controlled by tuning two data mining measures, support and confidence. The thesis shows how to build data mining predictive models using the resultant rules of the proposed methods. Another key contribution of the thesis is the proposal of rule mining methods using supervised neural networks. The thesis mathematically analyses the Widrow-Hoff learning algorithm of a single-layered neural network, which results in a foundation for rule mining algorithms using single-layered neural networks. Three rule mining algorithms using single-layered neural networks are proposed for the three major classes of rules on the basis of the proposed theorems. The thesis also looks at the problem of rule mining where user guidance is absent, and proposes a guided rule mining system to overcome this problem. The thesis extends this work further by comparing the performance of the algorithm used in the proposed guided rule mining system with the Apriori data mining algorithm. Finally, the thesis studies the Kohonen self-organization map as an unsupervised neural network for rule mining algorithms. Two approaches are adopted, based on the way self-organization maps are applied in rule mining models. In the first approach, the self-organization map is used for clustering, which provides class information to the rule mining process. In the second approach, automated rule mining takes place on the trained neurons as the map grows in a hierarchical structure.
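
    The two measures that control the proposed rule mining process are the standard ones; for a rule A => B over a boolean transaction table they can be computed as in this small sketch (the data layout is an assumption).

```python
import numpy as np

def support_confidence(data, antecedent, consequent):
    """data: boolean matrix (transactions x items).
    antecedent, consequent: lists of item column indices.
    Returns (support, confidence) of the rule antecedent => consequent."""
    a = data[:, antecedent].all(axis=1)                   # rows satisfying A
    ab = a & data[:, consequent].all(axis=1)              # rows satisfying A and B
    support = ab.mean()                                   # P(A and B)
    confidence = ab.sum() / a.sum() if a.any() else 0.0   # P(B | A)
    return support, confidence
```

    A genetic-algorithm fitness for rule mining could then reward candidate rules whose support and confidence exceed user-supplied thresholds.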

    Sparse machine learning models in bioinformatics

    The meaning of parsimony is twofold in machine learning: either the structure or the parameters of a model, or both, can be sparse. Sparse models have many strengths. First, sparsity is an important regularization principle for reducing model complexity and therefore avoiding overfitting. Second, in many fields, for example bioinformatics, high-dimensional data may be generated by a small number of hidden factors, so it is more reasonable to use a proper sparse model than a dense model. Third, a sparse model is often easy to interpret. In this dissertation, we investigate sparse machine learning models and their applications in high-dimensional biological data analysis. We focus our research on five types of sparse models, as follows. First, sparse representation is a parsimonious principle whereby a sample can be approximated by a sparse linear combination of basis vectors. We explore existing sparse representation models and propose our own sparse representation methods for high-dimensional biological data analysis. We derive different sparse representation models from a Bayesian perspective. Two generic dictionary learning frameworks are proposed. Also, kernel and supervised dictionary learning approaches are devised. Furthermore, we propose fast active-set and decomposition methods for the optimization of sparse coding models. Second, gene-sample-time data are promising in clinical study, but challenging in computation. We propose sparse tensor decomposition methods and kernel methods for the dimensionality reduction and classification of such data. As extensions of matrix factorization, tensor decomposition techniques can reduce the dimensionality of gene-sample-time data dramatically, and the kernel methods can run very efficiently on such data. Third, we explore two sparse regularized linear models for multi-class problems in bioinformatics. Our first method is called the nearest-border classification technique for data with many classes. Our second method is a hierarchical model that can simultaneously select features and classify samples. Our experiment on breast tumor subtyping shows that this model outperforms the one-versus-all strategy in some cases. Fourth, we propose to use spectral clustering approaches for clustering microarray time-series data. The approaches are based on two recently introduced transformations designed especially for gene expression time-series data, namely alignment-based and variation-based transformations. Both transformations have been devised in order to take temporal relationships in the data into account, and have been shown to increase the ability of a clustering method to detect co-expressed genes. We investigate the performance of these transformation methods, when combined with spectral clustering, on two microarray time-series datasets, and discuss their strengths and weaknesses. Our experiments on two well-known real-life datasets show the superiority of the alignment-based over the variation-based transformation for finding meaningful groups of co-expressed genes. Fifth, we propose the max-min high-order dynamic Bayesian network (MMHO-DBN) learning algorithm for reconstructing time-delayed gene regulatory networks. Owing to the small sample size of the training data and the power-law nature of gene regulatory networks, the structure of the network is restricted by sparsity. We also apply qualitative probabilistic networks (QPNs) to interpret the interactions learned. Our experiments on both synthetic and real gene expression time-series data show that MMHO-DBN obtains better precision than some existing methods and runs very fast. The QPN analysis can accurately predict types of influences and synergies. Additionally, since many high-dimensional biological datasets contain missing values, we survey various strategies for learning models from incomplete data. We extend the existing imputation methods, originally designed for two-way data, to methods for gene-sample-time data. We also propose a pair-wise weighting method for computing kernel matrices from incomplete data. Computational evaluations show that both approaches work very robustly.
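
    As a concrete instance of the sparse representation principle stated at the start of the abstract, the sketch below solves the lasso-type sparse coding problem min_a 0.5*||x - Da||^2 + lam*||a||_1 with generic iterative soft-thresholding (ISTA); the dissertation's own fast active-set and decomposition solvers are not reproduced here.

```python
import numpy as np

def sparse_code_ista(D, x, lam=0.1, iters=200):
    """Solve min_a 0.5*||x - D a||_2^2 + lam*||a||_1 by ISTA."""
    L = np.linalg.norm(D, 2) ** 2        # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(iters):
        g = D.T @ (D @ a - x)            # gradient of the smooth part
        z = a - g / L
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold
    return a                             # sparse coefficient vector
```

    Larger lam drives more coefficients to exactly zero, which is what makes the resulting representation both regularized and easy to interpret.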

    New methods for the visual analysis of self-organising maps

    The self-organising map (SOM) is a type of competitive, unsupervised artificial neural network. It has traditionally been used in engineering as an automatic classification (clustering) tool and especially in tasks related to exploratory data analysis and data mining, since its main purpose is the visualisation of non-linear relationships in multidimensional data. However, despite the importance of the visualisation task, graphical techniques for analysing SOMs are not abundant in the literature. This thesis presents several new techniques that complement, improve, and facilitate the visual analysis of Kohonen SOMs, both from the point of view of exploratory data analysis and from the point of view of understanding the SOM's process of adaptation to a data distribution. The motivation for developing new visualisation techniques arises from the following: the relative scarcity of methods devoted to the important task of visualisation, the need to analyse SOMs with different methods, the need to improve several methods described in the literature, and the opportunity to innovate by developing new visualisation strategies. Accordingly, emphasis has been placed on developing techniques generally not used before, in an attempt to overcome the limitations of several methods described in the literature. The first new method, called the "triangle similarity method", is a geometric interpolation strategy in which the patterns of an input distribution are projected onto a continuous observation space. It is based on preserving the geometric similarity between triangles formed by a pattern and two SOM reference vectors in data space, and by a candidate point and the two corresponding neurons in the observation space. The method finds the projection by minimising a cost function that measures distances or errors between triangles. It notably outperforms other interpolation strategies described in the literature: it can project all the data non-linearly, it is suitable when the SOM is small, it is robust, and it can adequately describe certain types of distributions that are difficult to visualise with most visualisation methods. Several SOM visualisation methods generate monochrome images that are analysed individually and provide specific information about the data. A strategy is proposed to ease the analyst's task of combining the information from several methods through a simple superposition of images based on an additive colour model: the images are assigned different colours and combined by simply summing their colour components. The resulting images are more complete and robust, especially when the images being combined convey the same type of information. The study carried out focuses mainly on combining distance matrices with data histograms. An alternative to distance matrices, which generate monochrome images and are the most popular methods for visualising the cluster structure of the data, is to use strategies that depict the different clusters with different colours. One such strategy is to use neuron contraction models. An efficient contraction method is presented, the "neuron grouping algorithm", whose structure and philosophy are similar to those of the SOM training algorithm, with the concepts inverted so as to update the positions of the neurons on a continuous map instead of the SOM's own reference vectors. In this way, neurons are attracted to one another on the map according to the distance between their reference vectors in data space. Its main advantage is its low computational cost, which makes it suitable for analysing large SOMs. Finally, the thesis proposes an alternative technique based on explicitly visualising, on the map or observation space, graphs that join neurons whose reference vectors are close in data space, such as the minimum spanning tree or the "Hebbian graph" created with the competitive Hebbian learning principle. The resulting images help to analyse the intrinsic dimensionality of the data in each region of the map and provide a visual, intuitive measure of the SOM's topology preservation.
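
    The graph-based technique described at the end can be illustrated briefly: compute the minimum spanning tree of the SOM reference vectors in data space and return its edges as pairs of neuron map positions, ready to be drawn on the map. SciPy's MST routine stands in here as an assumption; the thesis does not name an implementation.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform

def som_mst_edges(W, grid):
    """W: (n_neurons, dim) reference vectors; grid: (n_neurons, 2) map positions.
    Returns the MST edges as pairs of map coordinates, for plotting on the map."""
    D = squareform(pdist(W))                 # pairwise data-space distances
    mst = minimum_spanning_tree(D).tocoo()   # MST over the codebook
    return [(grid[i], grid[j]) for i, j in zip(mst.row, mst.col)]
```

    Drawing these edges on the map grid makes regions where data-space neighbours are also map neighbours (good topology preservation) visually distinct from regions where they are not.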