
    Adaptive Resonance Theory and Diffusion Maps for Clustering Applications in Pattern Analysis

    Adaptive Resonance Theory holds that learning is regulated by resonance phenomena in neural circuits. Diffusion maps are a class of kernel methods on edge-weighted graphs. While each of these approaches has demonstrated success in image analysis on its own, their combination is particularly effective. These techniques are reviewed and some example applications are given.
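    For readers unfamiliar with the second ingredient, here is a minimal diffusion-map sketch in Python/NumPy; the Gaussian kernel, epsilon, t, and component counts are illustrative choices, not taken from the paper. A kernel on the data defines an edge-weighted graph, row-normalisation yields a Markov matrix, and its leading non-trivial eigenvectors give the embedding.

```python
# Minimal diffusion-map sketch; epsilon, t, and n_components are illustrative.
import numpy as np
from scipy.spatial.distance import cdist

def diffusion_map(X, epsilon=1.0, n_components=2, t=1):
    """Embed rows of X using eigenvectors of a diffusion (Markov) matrix."""
    K = np.exp(-cdist(X, X, "sqeuclidean") / epsilon)  # edge-weighted graph kernel
    P = K / K.sum(axis=1, keepdims=True)               # row-normalise -> Markov matrix
    vals, vecs = np.linalg.eig(P)
    idx = np.argsort(-vals.real)                       # sort by decreasing eigenvalue
    vals, vecs = vals.real[idx], vecs.real[:, idx]
    # Skip the trivial constant eigenvector; scale coordinates by eigenvalues^t.
    return vecs[:, 1:n_components + 1] * (vals[1:n_components + 1] ** t)

X = np.random.rand(200, 10)
Y = diffusion_map(X, epsilon=0.5, n_components=3)
print(Y.shape)  # (200, 3)
```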

    Prediction of the Thromboembolic Syndrome: an Application of Artificial Neural Networks in Gene Expression Data Analysis

    The aim of this study was to propose a method for improving the recognition and classification of thromboembolic syndrome based on the analysis of gene expression data using artificial neural networks. The method was evaluated on a dataset of 117 patients admitted to a hospital in Durham in 2009. Of the studied patients, 66 suffered from thromboembolic syndrome and 51 were enrolled as the control group. Expression levels of 22,277 genes were measured for all samples and entered into the model as the main variables. Because of the high number of variables, principal component analysis (PCA) and auto-encoder neural networks were used to reduce the dimensionality of the data. The results showed a classification accuracy of 93.12% with auto-encoder networks versus 78.26% with PCA, a significant difference. With the auto-encoder network, sensitivity and specificity were 92.58% and 93.68%; with PCA, they were 77% and 78%, respectively. These results suggest that auto-encoder networks yield higher classification accuracy for thromboembolic syndrome status than PCA.
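    As a rough illustration of the two reduction routes the study compares, here is a hedged sketch in Python using scikit-learn and PyTorch; the layer sizes, epochs, bottleneck width, and synthetic stand-in data are assumptions, since the abstract does not specify the actual network configuration.

```python
# Two dimensionality-reduction routes: PCA vs. a small auto-encoder.
import numpy as np
import torch
import torch.nn as nn
from sklearn.decomposition import PCA

X = np.random.rand(117, 500).astype(np.float32)  # stand-in for the 22,277 genes

# Route 1: PCA to a low-dimensional code.
codes_pca = PCA(n_components=32).fit_transform(X)

# Route 2: auto-encoder with a 32-unit bottleneck, trained on reconstruction.
enc = nn.Sequential(nn.Linear(500, 128), nn.ReLU(), nn.Linear(128, 32))
dec = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 500))
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
x = torch.from_numpy(X)
for _ in range(200):                       # minimise reconstruction error
    opt.zero_grad()
    loss = nn.functional.mse_loss(dec(enc(x)), x)
    loss.backward()
    opt.step()
codes_ae = enc(x).detach().numpy()         # features fed to the classifier
print(codes_pca.shape, codes_ae.shape)     # (117, 32) (117, 32)
```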

    Deep Learning Causal Attributions of Breast Cancer

    In this paper, a deep learning-based approach is applied to high-dimensional, high-volume, and high-sparsity medical data to identify critical causal attributions that might affect the survival of a breast cancer patient. The Surveillance, Epidemiology, and End Results (SEER) breast cancer data is explored in this study. The SEER data set contains accumulated patient-level and treatment-level information, such as cancer site, cancer stage, treatment received, and cause of death. Restricted Boltzmann machines (RBMs) are proposed for dimensionality reduction in the analysis. The RBM is a popular deep learning building block that can extract features from a given data set and transform the data non-linearly into a lower-dimensional space for further modelling. In this study, a group of RBMs is trained to sequentially transform the original data into a very low-dimensional space, and k-means clustering is then conducted in this space. The resulting cluster memberships are mapped back to the original sample space for interpretation and insight creation. The analysis demonstrates that essential features relating to breast cancer survival can be effectively extracted and carried into the much lower-dimensional space formed by the RBMs.
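    A minimal sketch of the described pipeline, assuming scikit-learn's BernoulliRBM as the RBM implementation; the layer sizes, cluster count, and synthetic binary data are illustrative, not SEER specifics.

```python
# Stacked RBMs compress the data, then k-means clusters the low-dim codes.
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.cluster import KMeans

X = (np.random.rand(1000, 200) > 0.7).astype(float)  # stand-in for encoded records

codes = X
for n in (64, 16, 4):                       # sequential non-linear transforms
    rbm = BernoulliRBM(n_components=n, learning_rate=0.05, n_iter=20, random_state=0)
    codes = rbm.fit_transform(codes)        # hidden-unit activation probabilities

labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(codes)
# Cluster membership can now be mapped back to the original records.
print(np.bincount(labels))
```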

    Automatic segmentation of kidney and renal cell carcinoma using deep learning

    Background Clinical imaging provides a wealth of information that can be used for diagnosis, prognosis, and therapy. However, the interpretation of radiological image data in clinical routine is often subjective and time-consuming. The U-Net architecture can make image interpretation more objective and time-efficient. The U-Net is a model that adds skip connections to an encoder-decoder structure: because pixel information is lost during downsampling and upsampling, the skip connections pass high-resolution information from the encoder to the decoder, enabling more accurate predictions and efficient segmentation of medical images. Methods In this study, a U-Net architecture was trained for automatic segmentation to distinguish between tumor and healthy tissue in renal cell carcinoma. To enable training, all slices from 502 computed tomography (CT) scans of patients with renal cell carcinoma, acquired before and after injection of an iodine-based contrast agent, were segmented and then classified by tumor entity using a histopathological reference standard. Results For the segmentation of the kidney and renal cell carcinomas, the deep learning model's Dice similarity coefficient (DSC) lies in the mid-range of previously reported results; for tumor segmentation, this study shows better results than the average of previous studies. Conclusion The deep learning model used in this study for segmenting kidney anatomy and for detecting and segmenting tumors shows great potential for further research in medical image analysis. Automated segmentation and tumor detection can ease the clinical workload and improve future diagnostics and therapy.
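    To make the skip-connection idea concrete, here is a deliberately tiny U-Net-style sketch in PyTorch; the channel counts and depth are illustrative, not the study's actual configuration.

```python
# Tiny U-Net: encoder, bottleneck, decoder, with a skip connection that
# reinjects high-resolution encoder features into the decoder.
import torch
import torch.nn as nn

def block(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU())

class TinyUNet(nn.Module):
    def __init__(self, in_ch=1, n_classes=2):
        super().__init__()
        self.enc = block(in_ch, 16)
        self.down = nn.MaxPool2d(2)
        self.mid = block(16, 32)
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.dec = block(32, 16)            # 32 = 16 upsampled + 16 skipped
        self.head = nn.Conv2d(16, n_classes, 1)

    def forward(self, x):
        e = self.enc(x)                     # high-resolution features
        m = self.mid(self.down(e))          # bottleneck after downsampling
        u = self.up(m)                      # upsample back to input resolution
        u = torch.cat([u, e], dim=1)        # skip connection preserves detail
        return self.head(self.dec(u))

net = TinyUNet()
print(net(torch.randn(1, 1, 64, 64)).shape)  # torch.Size([1, 2, 64, 64])
```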

    A Study of the Thermodynamics of Small Systems and Phase Transition in Bulk Square Well-Hard Disk Binary Mixture

    Under the umbrella of statistical mechanics and particle-based simulations, two distinct problems are discussed in this study. The first part concerns systems of finite clusters of 3 and 13 particles interacting via the Lennard-Jones potential. A machine learning technique, diffusion maps (DMap), was applied to large datasets of thermodynamically small systems from Monte Carlo simulations in order to identify structural and energetic changes in these systems. DMap suggests that at most three dimensions are required to describe and identify systems whose configuration spaces have 9 (N = 3) and 39 (N = 13) dimensions. Finally, a model is proposed that expresses the potential energy as a function of geometric variables identified through a heuristic screening. The thermodynamics of bulk systems was the other major focus of this thesis. The phase diagrams of pure square-well solids and of a binary mixture of square-well and hard-disk particles, under the assumption of a pseudo-single-component model, were constructed, and the phase equilibrium behavior was discussed. These datasets were also created with Monte Carlo simulations. The results showed that the isostructural solid-solid phase transition, previously identified in pure square-well systems with a very short range of attraction, also occurs in the presence of an additional hard-disk component, namely in the binary mixture of square-well and hard-disk particles.
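    For context, here is a hedged sketch of the kind of Metropolis Monte Carlo sampling that produces such cluster datasets, in reduced Lennard-Jones units; N, temperature, and step size are illustrative choices, not the thesis's settings.

```python
# Metropolis Monte Carlo for a small Lennard-Jones cluster (reduced units).
import numpy as np

def lj_energy(pos):
    """Total Lennard-Jones energy over all particle pairs."""
    d = np.linalg.norm(pos[:, None] - pos[None, :], axis=-1)
    r6 = 1.0 / d[np.triu_indices(len(pos), k=1)] ** 6
    return np.sum(4.0 * (r6 ** 2 - r6))

rng = np.random.default_rng(0)
pos = rng.normal(scale=0.5, size=(13, 3))      # N = 13 cluster
T, step = 0.3, 0.1
E = lj_energy(pos)
samples = []
for sweep in range(5000):
    i = rng.integers(13)                       # move one random particle
    trial = pos.copy()
    trial[i] += rng.normal(scale=step, size=3)
    dE = lj_energy(trial) - E
    if dE < 0 or rng.random() < np.exp(-dE / T):  # Metropolis acceptance rule
        pos, E = trial, E + dE
    samples.append(E)                          # energies fed to DMap analysis
print(min(samples))
```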

    ACO-based feature selection algorithm for classification

    A dataset with a small number of records but a large number of attributes exemplifies the phenomenon called the “curse of dimensionality”. Classifying this type of dataset requires Feature Selection (FS) methods to extract the useful information. The modified graph clustering ant colony optimisation (MGCACO) algorithm is an effective FS method developed around grouping highly correlated features. However, the MGCACO algorithm has three main drawbacks in producing a feature subset: its clustering method, its parameter sensitivity, and its final subset determination. An enhanced graph clustering ant colony optimisation (EGCACO) algorithm is proposed to solve these three (3) problems. The proposed improvements include: (i) an ACO feature clustering method to obtain clusters of highly correlated features; (ii) an adaptive selection technique for subset construction from the clusters of features; and (iii) a genetic-based method for producing the final subset of features. The ACO feature clustering method exploits mechanisms such as intensification and diversification for local and global optimisation to provide highly correlated features. The adaptive selection technique enables the parameter to change adaptively based on feedback from the search space. The genetic method determines the final subset automatically, based on crossover and subset quality calculation. The performance of the proposed algorithm was evaluated on 18 benchmark datasets from the University of California Irvine (UCI) repository and nine (9) deoxyribonucleic acid (DNA) microarray datasets against 15 benchmark metaheuristic algorithms. On the UCI datasets, the EGCACO algorithm is superior to the other benchmark optimisation algorithms in terms of the number of selected features for 16 of the 18 datasets (88.89%) and is the best in classification accuracy for eight (8) of them (44.47%). Further, on the nine (9) DNA microarray datasets, the EGCACO algorithm is superior to the benchmark algorithms in terms of classification accuracy (first rank) for seven (7) datasets (77.78%) and yields the lowest number of selected features in six (6) datasets (66.67%). The proposed EGCACO algorithm can be utilised for FS in DNA microarray classification tasks that involve large datasets in various application domains.
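    The EGCACO implementation itself is not reproduced here, but the following simplified sketch conveys the general ACO feature-selection idea the abstract builds on: pheromone trails bias which features each ant picks, and trails are reinforced in proportion to the cross-validated accuracy of the chosen subset. All parameters, the classifier, and the synthetic data are illustrative assumptions.

```python
# Simplified pheromone-guided feature selection (not the EGCACO algorithm).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=100, n_features=50, n_informative=5,
                           random_state=0)
n_feat, n_ants, k = X.shape[1], 10, 8
tau = np.ones(n_feat)                          # pheromone level per feature
best_score, best_subset = 0.0, None
rng = np.random.default_rng(0)

for iteration in range(20):
    for _ in range(n_ants):
        p = tau / tau.sum()                    # pheromones bias the choice
        subset = rng.choice(n_feat, size=k, replace=False, p=p)
        score = cross_val_score(KNeighborsClassifier(), X[:, subset], y, cv=3).mean()
        tau[subset] += score                   # reinforce useful features
        if score > best_score:
            best_score, best_subset = score, np.sort(subset)
    tau *= 0.9                                 # pheromone evaporation
print(best_score, best_subset)
```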

    Probabilistic Prediction Using Embedded Random Projections of High Dimensional Data

    The explosive growth of digital data collection and processing demands a new approach to the historical engineering methods of data correlation and model creation. A new prediction methodology based on high-dimensional data has been developed. Since most high-dimensional data resides on a low-dimensional manifold, the new methodology reduces dimension by embedding the data into a diffusion space that allows optimal distribution along the manifold. The resulting manifold space is then used to produce a probability density function that uses spatial weighting to influence predictions, i.e., data nearer the query have greater importance than data further away. The methodology also allows data of differing phenomenology (e.g., color, shape, temperature) to be handled by regression or clustering classification. The new methodology is first developed and validated, then applied to common engineering situations such as critical heat flux prediction and shuttle pitch angle determination. A number of illustrative examples are given, with a significant focus on the objective identification of two-phase flow regimes. It is shown that the new methodology is robust, producing accurate predictions even with a small number of data points in the diffusion space, and flexible in its ability to handle a wide range of engineering problems.
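    In its simplest form, the spatial-weighting idea reduces to kernel-weighted (Nadaraya-Watson) averaging in the embedded space. The sketch below is a minimal illustration with an assumed Gaussian kernel and illustrative bandwidth, not the dissertation's full methodology; the embedding step could reuse the diffusion-map sketch shown earlier.

```python
# Kernel-weighted prediction in an embedded (e.g. diffusion) space.
import numpy as np

def weighted_predict(Z_train, y_train, z_query, h=0.5):
    """Kernel-weighted average of training targets around the query point."""
    d2 = np.sum((Z_train - z_query) ** 2, axis=1)
    w = np.exp(-d2 / (2 * h ** 2))      # nearer data -> greater importance
    return np.dot(w, y_train) / w.sum()

Z = np.random.rand(100, 3)              # stand-in diffusion-space coordinates
y = Z[:, 0] + 0.1 * np.random.randn(100)
print(weighted_predict(Z, y, Z[0]))
```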

    Clustering of High-Dimensional Gene Expression Data with Feature Filtering Methods and Diffusion Maps (2008 International Conference on BioMedical Engineering and Informatics)

    The importance of gene expression data in cancer diagnosis and treatment is now widely recognized by cancer researchers. However, one of the major challenges in the computational analysis of such data is the curse of dimensionality, due to the overwhelming number of gene expression measurements versus the small number of samples. Here, we use a two-step method to reduce the dimension of gene expression data. First, we extract a subset of genes based on the statistical characteristics of their corresponding expression measurements. For further dimensionality reduction, we then apply diffusion maps, which interpret the eigenfunctions of Markov matrices as a system of coordinates on the original data set in order to obtain an efficient representation of the data's geometry, to the reduced data. A neural network clustering theory, Fuzzy ART, is applied to the resulting data to generate clusters of cancer samples. Experimental results on the small round blue-cell tumor (SRBCT) data set, compared with other widely used clustering algorithms, demonstrate the effectiveness of our proposed method in addressing multidimensional gene expression data.
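    Here is a minimal Fuzzy ART sketch for the final clustering step; the vigilance rho, choice parameter alpha, and learning rate beta are illustrative settings, and inputs are assumed to be scaled to [0, 1] before complement coding.

```python
# Minimal Fuzzy ART: complement coding, choice function, vigilance test.
import numpy as np

def fuzzy_art(X, rho=0.75, alpha=0.001, beta=1.0):
    I = np.hstack([X, 1.0 - X])                 # complement coding
    W, labels = [], []
    for x in I:
        # Choice value of each existing category for this input.
        scores = [np.minimum(x, w).sum() / (alpha + w.sum()) for w in W]
        for j in np.argsort(scores)[::-1]:      # try categories by choice value
            m = np.minimum(x, W[j])             # fuzzy AND of input and weights
            if m.sum() / x.sum() >= rho:        # vigilance (resonance) test
                W[j] = beta * m + (1 - beta) * W[j]   # learn the matched pattern
                labels.append(j)
                break
        else:                                   # no category resonates: add one
            W.append(x.copy())
            labels.append(len(W) - 1)
    return np.array(labels)

X = np.random.rand(50, 3)                       # e.g. diffusion-map coordinates
print(fuzzy_art(X))
```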