662 research outputs found

    Methods for Improving UMAP for Fast and Accurate Dimensionality Reduction

    Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, 2021.8. Hyung-Kwon Ko (고형권).

    One effective way of understanding the characteristics of high-dimensional data is to embed it onto a low-dimensional space. Among many existing dimensionality reduction algorithms, Uniform Manifold Approximation and Projection (UMAP) has gained the most attention because of its fast and stable projection results. However, it is still too slow to be adopted in an interactive visual analytics system, as it takes a few minutes to embed even a toy dataset (e.g., MNIST). Moreover, UMAP is vulnerable to different configurations of hyperparameters, especially the initialization method and the number of epochs, which can seriously bias the insights mined from the embedding result. To achieve responsiveness, we propose a progressive algorithm for UMAP, called Progressive UMAP, which supports the exploration of datasets by updating the embedding with a batch of points through progressive computation. Next, to guarantee less bias and more robustness in the embedding, we present a novel dimensionality reduction algorithm called Uniform Manifold Approximation with Two-phase Optimization (UMATO). We discover that the vulnerability comes from the approximation of the cross-entropy loss function. UMATO instead takes a two-phase optimization approach: global optimization to obtain the overall skeleton of the data, and local optimization to identify the regional characteristics of local areas. In our experiment with one synthetic and three real-world datasets, UMATO outperformed widely used baseline algorithms, such as PCA, t-SNE, UMAP, topological autoencoders, and Anchor t-SNE, in terms of global quality metrics and 2D projection results. We further examine a case study of UMATO on real-world biological data and the extension to multi-phase optimization.
    Our work makes original contributions to the field of dimensionality reduction as well as to progressive visual analytics. Lastly, the thesis discusses future research directions for improving the proposed algorithms.

    Table of contents:
    CHAPTER 1 Introduction
        1.1 Motivation
        1.2 Research Questions and Approaches
            1.2.1 Progressive Algorithm for UMAP
            1.2.2 Less Biased and Robust Dimensionality Reduction Algorithm
        1.3 Contributions
        1.4 Thesis Overview
    CHAPTER 2 Background: UMAP
        2.1 Graph Construction
        2.2 Layout Optimization
    CHAPTER 3 Progressive UMAP: A Progressive Algorithm for UMAP
        3.1 Introduction
        3.2 Related Work
            3.2.1 Progressive Visual Analytics
        3.3 Progressive UMAP
            3.3.1 Computing N_i
            3.3.2 Computing ρ_i and σ_i
            3.3.3 Layout Initialization
            3.3.4 Layout Optimization
        3.4 Evaluation and Discussion
        3.5 Summary
    CHAPTER 4 UMATO: A Less Biased and Robust Dimensionality Reduction Algorithm Based on UMAP
        4.1 Introduction
        4.2 Related Work
            4.2.1 Dimensionality Reduction
            4.2.2 Hubs, landmarks, and anchors
        4.3 The Meaning of Using Different Loss Functions in Dimensionality Reduction
            4.3.1 t-SNE
        4.4 UMATO
            4.4.1 Points Classification
            4.4.2 Global Optimization
            4.4.3 Local Optimization
            4.4.4 Outliers Arrangement
        4.5 Experiments
            4.5.1 Quantitative and Qualitative Evaluation of UMATO Compared to Six Baseline Algorithms
            4.5.2 Case Study: UMATO on Real-world Biological Data
        4.6 Discussion
        4.7 Summary
    CHAPTER 5 Discussion
        5.1 Lessons Learned
        5.2 Limitations
    CHAPTER 6 Conclusion
    Abstract (Korean)

    Projection-Based Clustering through Self-Organization and Swarm Intelligence

    The book covers aspects of unsupervised machine learning used for knowledge discovery in data science and introduces a data-driven approach to cluster analysis, the Databionic swarm (DBS). DBS consists of the 3D landscape visualization and clustering of data. The 3D landscape enables 3D printing of high-dimensional data structures. The clustering and the number of clusters, or the absence of cluster structure, can be verified at a glance using the 3D landscape. DBS is the first swarm-based technique that shows emergent properties while exploiting concepts of swarm intelligence, self-organization, and the Nash equilibrium concept from game theory. This eliminates the need for a global objective function and for setting parameters. By downloading the R package, DBS can be applied to data drawn from diverse research fields and used even by non-professionals in the field of data mining.

    Projection-Based Clustering through Self-Organization and Swarm Intelligence: Combining Cluster Analysis with the Visualization of High-Dimensional Data

    Cluster Analysis; Dimensionality Reduction; Swarm Intelligence; Visualization; Unsupervised Machine Learning; Data Science; Knowledge Discovery; 3D Printing; Self-Organization; Emergence; Game Theory; Advanced Analytics; High-Dimensional Data; Multivariate Data; Analysis of Structured Dat

    The World We Want to Live In

    Digitalisation, digital networks, and artificial intelligence are fundamentally changing our lives! We must understand the various developments and assess how they interact and how they affect our regular, analogue lives. What are the consequences of such changes for me personally and for our society? Digital networks and artificial intelligence are seminal innovations that are going to permeate all areas of society and trigger a comprehensive, disruptive structural change that will evoke numerous new advances in research and development in the coming years. Even though there are numerous books on this subject matter, most of them cover only specific aspects of the profound and multifaceted effects of the digital transformation. An overarching assessment is missing. In 2016, the Federation of German Scientists (VDW) founded a study group to assess the technological impacts of digitalisation holistically. Now we present this compendium to you. We address the interrelations and feedbacks of digital innovation on policy, law, economics, science, and society from various scientific perspectives. Please consider this book an invitation to contemplate, with other people and with us, what kind of world we want to live in.

    Optimization and Energy Maximizing Control Systems for Wave Energy Converters

    The book, “Optimization and Energy Maximizing Control Systems for Wave Energy Converters”, presents eleven contributions on the latest scientific advancements of 2020-2021 in wave energy technology optimization and control, including holistic techno-economic optimization, inclusion of nonlinear effects, and real-time implementations of estimation and control algorithms.