21 research outputs found

    Error Metrics for Learning Reliable Manifolds from Streaming Data

    Spectral dimensionality reduction is frequently used to identify low-dimensional structure in high-dimensional data. However, learning manifolds, especially from streaming data, is expensive in both computation and memory. In this paper, we argue that a stable manifold can be learned from only a fraction of the stream, and that the remainder of the stream can be mapped onto the manifold at significantly lower cost. Identifying the transition point at which the manifold becomes stable is the key step. We present error metrics that identify this transition point for a given stream by quantitatively assessing the quality of a manifold learned using Isomap. We further propose an efficient mapping algorithm, called S-Isomap, that maps new samples onto the stable manifold. Experiments on a variety of data sets show that the proposed approach is computationally efficient without sacrificing accuracy.
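
The batch-then-map idea can be sketched in Python with NumPy. This is not the paper's S-Isomap or its error metrics: the sketch fits plain Isomap on an initial batch and places later samples with a standard out-of-sample approximation (geodesics routed through the nearest batch points, then the landmark-MDS projection). The function names, the toy half-cylinder data, and the parameter values are assumptions made for illustration.

```python
import numpy as np

def pairwise_dist(a, b):
    """Euclidean distances between the rows of a and the rows of b."""
    sq = (a ** 2).sum(1)[:, None] + (b ** 2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.sqrt(np.maximum(sq, 0.0))

def fit_isomap(x, k, dim):
    """Batch Isomap: k-NN graph, shortest paths, classical MDS."""
    n = len(x)
    d = pairwise_dist(x, x)
    graph = np.full((n, n), np.inf)              # inf means "no edge"
    nn = np.argsort(d, axis=1)[:, 1:k + 1]       # skip column 0 (the point itself)
    rows = np.repeat(np.arange(n), k)
    graph[rows, nn.ravel()] = d[rows, nn.ravel()]
    graph = np.minimum(graph, graph.T)           # make the graph undirected
    np.fill_diagonal(graph, 0.0)
    for m in range(n):                           # Floyd-Warshall shortest paths
        graph = np.minimum(graph, graph[:, m:m + 1] + graph[m:m + 1, :])
    centering = np.eye(n) - 1.0 / n              # I - (1/n) * ones
    gram = -0.5 * centering @ (graph ** 2) @ centering
    w, v = np.linalg.eigh(gram)
    top = np.argsort(w)[::-1][:dim]              # largest eigenvalues first
    w, v = w[top], v[:, top]
    return {"x": x, "geo": graph, "k": k, "w": w, "v": v, "y": v * np.sqrt(w)}

def map_new(model, p):
    """Map new samples onto the fitted embedding without refitting."""
    d = pairwise_dist(p, model["x"])
    nn = np.argsort(d, axis=1)[:, :model["k"]]   # nearest batch points
    geo_new = np.empty_like(d)
    for r in range(len(p)):
        # Hop to a nearby batch point, then follow the stored geodesics.
        hops = d[r, nn[r]][:, None] + model["geo"][nn[r]]
        geo_new[r] = hops.min(axis=0)
    mean_sq = (model["geo"] ** 2).mean(axis=0)
    return 0.5 * (mean_sq - geo_new ** 2) @ model["v"] / np.sqrt(model["w"])

# Toy data (assumed): a regular grid on a half-cylinder in 3-D.
t, h = np.meshgrid(np.linspace(0.0, np.pi, 12), np.linspace(0.0, 1.0, 5))
batch = np.column_stack([np.cos(t.ravel()), np.sin(t.ravel()), h.ravel()])
model = fit_isomap(batch, k=6, dim=2)

# Later samples from the same surface, mapped without refitting.
ts, hs = np.array([0.4, 1.3, 2.6]), np.array([0.2, 0.6, 0.9])
stream = np.column_stack([np.cos(ts), np.sin(ts), hs])
embedded = map_new(model, stream)
```

In this sketch, fitting costs cubic time in the batch size because of the shortest-path and eigendecomposition steps, whereas mapping one streamed sample needs only its distances to the batch and a small matrix product. That cost gap is what makes it attractive to stop refitting once the manifold is judged stable.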

    Development of Algorithms for Visualizing the Properties of Heterogeneous and Composite Materials

    A large number of different materials are used in human activity today. These materials differ in their mechanical properties, colour, and chemical composition. This variety raises the problem of storing information about materials and visualizing it. This work presents a database of materials and a web application that gives access to information on various metals and their alloys and finds the nearest analogues of a material through an intelligent search function. The database is built on the MySQL database management system and was populated by a purpose-built parser program, which scanned web pages, extracted the properties of materials, and stored them in the database. Dimensionality reduction algorithms are used to visualize the data. These algorithms seek a projection of high-dimensional data into a low-dimensional space that preserves the internal relationships between the data points, which made it possible to plot high-dimensional data on a plane. The Isomap, MDS, and t-SNE algorithms are used for this purpose. The database is accessed through a web application, so it can be used from any device with Internet access and requires no additional software to be installed. The web application is written in Python with the Flask web framework, a choice that greatly simplifies development and allows the data to be managed quickly and flexibly. Visualizations were built for the collected data. Because dimensionality reduction is one of the main tasks in visualization, it requires accessible methods that work effectively in two- or three-dimensional spaces. The visualizations reveal clusters of materials that correspond to already known classes of materials, which supports the correctness of the visualization models and the dimensionality reduction, as well as of the collected information.
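
The abstract does not say how the nearest-analogue search works. One simple possibility, sketched here in Python with NumPy, is to standardize each property and rank materials by Euclidean distance. The material names, the property columns, and every number below are made-up placeholders, and `nearest_analogues` is a hypothetical helper, not part of the described application.

```python
import numpy as np

# Placeholder table (all values invented for illustration).
# Columns: density, tensile strength, melting point.
names = ["alloy_a", "alloy_b", "alloy_c", "alloy_d"]
props = np.array([
    [7.8, 400.0, 1500.0],
    [7.9, 420.0, 1480.0],
    [2.7, 110.0, 660.0],
    [2.8, 130.0, 640.0],
])

def nearest_analogues(props, query_idx, top=2):
    """Indices of the materials closest to the query, nearest first."""
    # Standardize so that no single property dominates the distance.
    z = (props - props.mean(axis=0)) / props.std(axis=0)
    dist = np.linalg.norm(z - z[query_idx], axis=1)
    order = np.argsort(dist)
    return [int(i) for i in order if i != query_idx][:top]

closest = [names[i] for i in nearest_analogues(props, 0, top=1)]
```

Standardizing first matters because the raw columns have very different scales. Without it, the melting point would dominate the ranking.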

    Dimensionality Reduction by Weighted Connections between Neighborhoods

    Dimensionality reduction transforms high-dimensional data into a meaningful representation of reduced dimensionality. This paper introduces a dimensionality reduction technique that uses weighted connections between neighborhoods to improve the K-Isomap method, aiming to preserve the relationships between neighborhoods during dimensionality reduction. The proposal is validated on three typical examples that are widely used to test manifold-based algorithms. The experimental results show that the proposed method preserves the local topology of a dataset well when transforming it from a high-dimensional space into a low-dimensional one.

    Information visualization by dimensionality reduction: a review


    Classification of Distinct Subtypes of Spatial Brain Metabolism Patterns in Alzheimer's Disease on FDG PET Using a Deep Learning-Based Clustering Method

    Doctoral thesis, Seoul National University Graduate School: Graduate School of Convergence Science and Technology, Department of Molecular Medicine and Biopharmaceutical Sciences, February 2022. 이동수.
    Alzheimer's disease (AD) presents a broad spectrum of clinicopathologic profiles, despite common pathologic features including amyloid and tau deposition. Here, we aimed to identify AD subtypes using deep learning-based clustering on FDG PET images to understand distinct spatial patterns of neurodegeneration. We also aimed to investigate the clinicopathologic features of subtypes defined by spatial brain metabolism patterns. A total of 3620 FDG brain PET images of subjects with AD, subjects with mild cognitive impairment (MCI), and cognitively normal controls (CN), acquired at baseline and follow-up visits, were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. To identify representations of brain metabolism patterns distinct from disease progression in AD, a conditional variational autoencoder (cVAE) was used, followed by clustering of the encoded representations. FDG brain PET images of subjects with AD (n=838) and Clinical Dementia Rating Scale Sum of Boxes (CDR-SB) scores were used as inputs to the cVAE model, and the k-means algorithm was applied for clustering. The trained deep learning model was also transferred to FDG brain PET images of subjects with MCI (n=1761) to identify differential trajectories and prognoses of the subtypes. Statistical parametric maps were generated to visualize the spatial patterns of the clusters, and clinical and biological characteristics were compared among the clusters. The conversion rate from MCI to AD was also compared among the subtypes. Four distinct subtypes were identified by deep learning-based FDG PET clustering:
    (i) S1 (angular), showing prominent hypometabolism in the angular gyrus with a diffuse cortical hypometabolism pattern; frequent in males; more amyloid; less tau; more hippocampal atrophy; cognitive decline in the earlier stage.
    (ii) S2 (occipital), showing prominent hypometabolism in the occipital cortex with a posterior-predominant hypometabolism pattern; younger age; more tau; less hippocampal atrophy; lower executive and visuospatial scores; faster conversion from MCI to AD.
    (iii) S3 (orbitofrontal), showing prominent hypometabolism in the orbitofrontal cortex with an anterior-predominant hypometabolism pattern; older age; less amyloid; more hippocampal atrophy; higher executive and visuospatial scores.
    (iv) S4 (minimal), showing minimal hypometabolism; frequent in females; less amyloid; more tau; less hippocampal atrophy; higher cognitive scores.
    In conclusion, we identified distinct subtypes of AD with different brain pathologies and clinical profiles. Our deep learning model was also successfully transferred to MCI to predict the prognosis of subtypes for conversion from MCI to AD. Our results suggest that distinct AD subtypes on FDG PET may have implications for individual clinical outcomes and provide a clue to understanding the broad spectrum of AD in terms of pathophysiology.
    Contents:
    1. Introduction: 1.1 Heterogeneity of Alzheimer's disease; 1.2 FDG PET as a biomarker of Alzheimer's disease; 1.3 Biologic subtypes of Alzheimer's disease; 1.4 Dimensionality reduction methods; 1.5 Variational autoencoder for clustering; 1.6 Final goal of the study
    2. Methods: 2.1 Subjects; 2.2 FDG PET data acquisition and preprocessing; 2.3 Deep learning-based model for representations of FDG PET in AD; 2.4 Clustering method for AD subtypes on FDG PET; 2.5 Transfer of deep learning-based FDG PET cluster model for MCI subtypes; 2.6 Visualization of subtype-specific spatial brain metabolism pattern; 2.7 Clinical and biological characterization; 2.8 Prognosis prediction of MCI subtypes; 2.9 Generation of subtype-specific FDG PET images; 2.10 Statistical analysis
    3. Results: 3.1 Deep learning-based FDG PET clusters; 3.2 Spatial brain metabolism pattern in AD subtypes; 3.3 Clinical and biological characterization in AD subtypes; 3.4 Subtype-specific spatial metabolism patterns resemble in MCI; 3.5 Clinical and biological characterization in MCI subtypes; 3.6 Prognosis prediction of subtypes for conversion from MCI to AD; 3.7 Generating FDG PET images of AD subtypes
    4. Discussion: 4.1 Limitations of previous subtyping approach; 4.2 Interpretation of results; 4.3 Strength of our deep learning-based clustering approach; 4.4 Strength of our deep learning-based AD subtypes; 4.5 Limitations and future directions
    5. Conclusion; References; Supplementary Figures; Abstract in Korean
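
The clustering stage (k-means applied to encoded representations) can be sketched in Python with NumPy. The sketch covers only that stage and assumes the encoder has already produced one latent vector per scan. The latent vectors here are synthetic stand-ins rather than cVAE outputs, and the latent size of 8, the blob layout, and the farthest-point initialization are illustrative assumptions, not details from the thesis.

```python
import numpy as np

def kmeans(z, k, iters=50):
    """Lloyd's k-means with a deterministic farthest-point initialization."""
    centers = [z[0]]
    for _ in range(k - 1):
        # Next center: the point farthest from all centers chosen so far.
        d = np.min([np.linalg.norm(z - c, axis=1) for c in centers], axis=0)
        centers.append(z[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        # Assignment step: each vector joins its nearest center.
        dist = np.linalg.norm(z[:, None] - centers[None], axis=2)
        labels = np.argmin(dist, axis=1)
        # Update step: move each center to the mean of its members.
        new = np.array([
            z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers

# Synthetic stand-in for encoded representations: 4 groups of 30 vectors.
rng = np.random.default_rng(0)
means = 10.0 * np.eye(4, 8)
latent = np.vstack([m + 0.5 * rng.standard_normal((30, 8)) for m in means])

labels, centers = kmeans(latent, k=4)
```

Under these assumptions, new latent vectors (for example, encodings from a second cohort) could be assigned to a subtype by taking the nearest of the fitted `centers`. That mirrors, in a simplified way, the idea of transferring a model trained on one group to another.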

    A Subspace Projection Methodology for Nonlinear Manifold Based Face Recognition

    This dissertation presents a novel feature extraction method that uses a nonlinear mapping from the original data space to the feature space. Feature extraction methods aim to find compact representations of data that are easy to classify: measurements with similar values are grouped into the same category, while those with differing values are assigned to separate categories. For most practical systems, the meaningful features of a pattern class lie in a low-dimensional nonlinear constraint region (manifold) within the high-dimensional data space. A learning algorithm is developed to model this nonlinear region and to project patterns onto this feature space. A least-squares estimation approach that exploits the interdependency between points in the training patterns is used to form the nonlinear region. The proposed feature extraction strategy is employed to improve face recognition accuracy under varying illumination conditions and facial expressions. Although face features vary under these conditions, the features of one individual tend to cluster together and can be considered a neighborhood. Low-dimensional representations of face patterns in the feature space may lie in a nonlinear constraint region which, when modeled, leads to efficient pattern classification. A feature space encompassing multiple pattern classes can be trained by modeling a separate constraint region for each pattern class and obtaining a mean constraint region by averaging all the individual regions. Unlike most other nonlinear techniques, the proposed method provides an easy, intuitive way to place new points onto a nonlinear region in the feature space. The proposed feature extraction and classification method improves accuracy compared with classical linear representations. Face recognition accuracy is further improved by introducing the concepts of modularity, discriminant analysis, and phase congruency into the proposed method. In the modular approach, feature components are extracted from different sub-modules of the images and concatenated into a single vector that represents a face region, which yields features that are more representative of the local features of the face. When projected onto an arbitrary line, samples from well-formed clusters can produce a confused mixture of samples from all the classes, leading to poor recognition. Discriminant analysis aims to find an optimal line orientation for which the data classes are well separated. Experiments on various databases show that the proposed face recognition technique improves recognition accuracy, especially under varying illumination conditions and facial expressions. This shows that the integration of multiple subspaces, each representing a part of a higher-order nonlinear function, can represent a pattern with variability. Research is in progress to investigate the effectiveness of the subspace projection methodology for building manifolds with other nonlinear functions and to identify the optimum nonlinear function from an object classification perspective.
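
The point about line orientation can be illustrated with the classical two-class Fisher discriminant, sketched in Python with NumPy. This is the textbook linear method, shown only to make the idea concrete. It is not the dissertation's nonlinear projection, and the toy data and the name `fisher_direction` are assumptions.

```python
import numpy as np

def fisher_direction(x0, x1):
    """Unit vector along which two classes are best separated (Fisher LDA)."""
    m0, m1 = x0.mean(axis=0), x1.mean(axis=0)
    # Within-class scatter: spread of each class around its own mean.
    sw = (x0 - m0).T @ (x0 - m0) + (x1 - m1).T @ (x1 - m1)
    w = np.linalg.solve(sw, m1 - m0)
    return w / np.linalg.norm(w)

# Toy data (assumed): two long, thin clusters stacked one above the other.
t = np.linspace(-10.0, 10.0, 50)
x0 = np.column_stack([t, 0.1 * np.sin(3.0 * t)])
x1 = np.column_stack([t, 1.0 + 0.1 * np.cos(3.0 * t)])

w = fisher_direction(x0, x1)
p0, p1 = x0 @ w, x1 @ w           # projections onto the discriminant line
a0, a1 = x0[:, 0], x1[:, 0]       # projections onto the first axis instead
```

With this toy data, projecting onto the first axis mixes the two classes completely, because both span the same range there. Projecting onto `w` keeps them apart.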