ROBUST NONNEGATIVE MATRIX FACTORIZATION WITH DISCRIMINABILITY FOR IMAGE REPRESENTATION
ABSTRACT Owing to its psychological and physiological interpretation of naturally occurring data, Nonnegative Matrix Factorization (NMF) has attracted considerable attention for learning effective image representations. Its graph-regularized extensions have shown promising results by exploiting the low-dimensional manifold structure of data. However, their performance can be further improved, because they still suffer from several important problems: sensitivity to noise in the data, trivial solutions, and neglect of discriminative information. In this paper, we propose a novel method for image representation, referred to as Robust Nonnegative Matrix Factorization with Discriminability (RNMFD), which copes with these problems simultaneously by introducing a sparse noise matrix for data reconstruction and imposing approximate orthogonal constraints. We carried out extensive experiments on five benchmark image datasets, and the results demonstrate the superiority of RNMFD over several state-of-the-art methods.
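For readers unfamiliar with the baseline that this abstract extends, the following is a minimal sketch of plain NMF with the classical multiplicative updates for the Frobenius-norm objective. It is not RNMFD: it has no sparse noise matrix, no graph regularizer, and no orthogonality constraint. The helper names (`matmul`, `transpose`, `frobenius_error`, `nmf`) and the parameters `rank`, `n_iter`, `seed`, and `eps` are illustrative assumptions, not taken from the paper.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def frobenius_error(X, W, H):
    """Frobenius norm of the residual X - W H."""
    WH = matmul(W, H)
    return sum((x - y) ** 2 for rx, ry in zip(X, WH) for x, y in zip(rx, ry)) ** 0.5


def nmf(X, rank, n_iter=200, seed=0, eps=1e-9):
    """Approximate a nonnegative matrix X (n x m) as W (n x rank) times H (rank x m).

    Uses multiplicative updates, which keep W and H nonnegative as long as
    X is nonnegative and the starting values are positive.
    """
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T X) / (W^T W H), elementwise
        Wt = transpose(W)
        num = matmul(Wt, X)
        den = matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        # W <- W * (X H^T) / (W H H^T), elementwise
        Ht = transpose(H)
        num = matmul(X, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
    return W, H


if __name__ == "__main__":
    X = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 0.0, 1.0], [3.0, 2.0, 5.0]]
    W, H = nmf(X, rank=2)
    print("reconstruction error:", frobenius_error(X, W, H))
```

Because every update multiplies the current factor by a ratio of nonnegative quantities, nonnegativity is preserved without an explicit projection step. The small `eps` only guards against division by zero.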
Unsupervised Attributed Graph Learning: Models and Applications
Abstract: Graphs are a ubiquitous data structure that appears in a broad range of real-world scenarios. Accordingly, there has been a surge of research on representing and learning from graphs in order to accomplish various machine learning and graph analysis tasks. However, most of these efforts utilize only the graph structure, while nodes in real-world graphs usually come with a rich set of attributes. Typical examples of such nodes and their attributes are users and their profiles in social networks, scientific articles and their content in citation networks, protein molecules and their gene sets in biological networks, and web pages and their content on the Web. Utilizing node features in such graphs (attributed graphs) can alleviate the graph sparsity problem and help explain various phenomena (e.g., the motives behind the formation of communities in social networks). Therefore, further study of attributed graphs is required to take full advantage of node attributes.
In the wild, attributed graphs are usually unlabeled. Moreover, annotating data is an expensive and time-consuming process, which suffers from many limitations such as annotators' subjectivity, reproducibility, and consistency. The challenges of data annotation and the growing number of unlabeled attributed graphs in various real-world applications create a strong demand for unsupervised learning on attributed graphs.
In this dissertation, I propose a set of novel models to learn from attributed graphs in an unsupervised manner. To better understand and represent nodes and communities in attributed graphs, I present different models at the node and community levels. At the node level, I utilize node features as well as the graph structure in attributed graphs to learn distributed representations of nodes, which can be useful in a variety of downstream machine learning applications. At the community level, with a focus on social media, I take advantage of both node attributes and the graph structure to discover not only communities but also their sentiment-driven profiles and inter-community relations (i.e., alliance, antagonism, or no relation). The discovered community profiles and relations help to better understand the structure and dynamics of social media.
Dissertation/Thesis: Doctoral Dissertation, Computer Science, 201
Easily Parallelizable Statistical Computing Methods and Their Application in Modern High-Performance Computing Environments
Thesis (Ph.D.), Seoul National University Graduate School, College of Natural Sciences, Department of Statistics, August 2020. Joong-Ho Won.
Technological advances in the past decade, hardware and software alike, have made access to high-performance computing (HPC) easier than ever. In this dissertation, easily parallelizable, inversion-free, and variable-separated algorithms and their implementation in statistical computing are discussed. The first part considers statistical estimation problems under structured sparsity posed as minimization of a sum of two or three convex functions, one of which is a composition of non-smooth and linear functions. Examples include graph-guided sparse fused lasso and overlapping group lasso. Two classes of inversion-free primal-dual algorithms are considered and unified from the perspective of monotone operator theory. From this unification, a continuum of preconditioned forward-backward operator splitting algorithms amenable to parallel and distributed computing is proposed. The unification is further exploited to introduce a continuum of accelerated algorithms that attain the theoretically optimal asymptotic rate of convergence. In the second part, easy-to-use distributed matrix data structures in PyTorch and Julia are presented. They enable users to write code once and run it anywhere from a laptop to a workstation with multiple graphics processing units (GPUs) or a supercomputer in a cloud. With these data structures, various parallelizable statistical applications, including nonnegative matrix factorization, positron emission tomography, multidimensional scaling, and ℓ1-regularized Cox regression, are demonstrated. The examples scale up to an 8-GPU workstation and a 720-CPU-core cluster in a cloud. As a case in point, the onset of type-2 diabetes from the UK Biobank with 400,000 subjects and about 500,000 single nucleotide polymorphisms is analyzed using the HPC ℓ1-regularized Cox regression. Fitting a half-million-variate model took about 50 minutes, reconfirming known associations. To my knowledge, this is the first demonstration of the feasibility of a joint genome-wide association analysis of survival outcomes at this scale.
Chapter 1 Prologue
1.1 Introduction
1.2 Accessible High-Performance Computing Systems
1.2.1 Preliminaries
1.2.2 Multiple CPU nodes: clusters, supercomputers, and clouds
1.2.3 Multi-GPU node
1.3 Highly Parallelizable Algorithms
1.3.1 MM algorithms
1.3.2 Proximal gradient descent
1.3.3 Proximal distance algorithm
1.3.4 Primal-dual methods
Chapter 2 Easily Parallelizable and Distributable Class of Algorithms for Structured Sparsity, with Optimal Acceleration
2.1 Introduction
2.2 Unification of Algorithms LV and CV (g ≡ 0)
2.2.1 Relation between Algorithms LV and CV
2.2.2 Unified algorithm class
2.2.3 Convergence analysis
2.3 Optimal acceleration
2.3.1 Algorithms
2.3.2 Convergence analysis
2.4 Stochastic optimal acceleration
2.4.1 Algorithm
2.4.2 Convergence analysis
2.5 Numerical experiments
2.5.1 Model problems
2.5.2 Convergence behavior
2.5.3 Scalability
2.6 Discussion
Chapter 3 Towards Unified Programming for High-Performance Statistical Computing Environments
3.1 Introduction
3.2 Related Software
3.2.1 Message-passing interface and distributed array interfaces
3.2.2 Unified array interfaces for CPU and GPU
3.3 Easy-to-use Software Libraries for HPC
3.3.1 Deep learning libraries and HPC
3.3.2 Case study: PyTorch versus TensorFlow
3.3.3 A brief introduction to PyTorch
3.3.4 A brief introduction to Julia
3.3.5 Methods and multiple dispatch
3.3.6 Multidimensional arrays
3.3.7 Matrix multiplication
3.3.8 Dot syntax for vectorization
3.4 Distributed matrix data structure
3.4.1 Distributed matrices in PyTorch: distmat
3.4.2 Distributed arrays in Julia: MPIArray
3.5 Examples
3.5.1 Nonnegative matrix factorization
3.5.2 Positron emission tomography
3.5.3 Multidimensional scaling
3.5.4 L1-regularized Cox regression
3.5.5 Genome-wide survival analysis of the UK Biobank dataset
3.6 Discussion
Chapter 4 Conclusion
Appendix A Monotone Operator Theory
Appendix B Proofs for Chapter II
B.1 Preconditioned forward-backward splitting
B.2 Optimal acceleration
B.3 Optimal stochastic acceleration
Appendix C AWS EC2 and ParallelCluster
C.1 Overview
C.2 Glossary
C.3 Prerequisites
C.4 Installation
C.5 Configuration
C.6 Creating, accessing, and destroying the cluster
C.7 Installation of libraries
C.8 Running a job
C.9 Miscellaneous
Appendix D Code for memory-efficient L1-regularized Cox proportional hazards model
Appendix E Details of SNPs selected in L1-regularized Cox regression
Bibliography
Abstract in Korean
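The outline above lists proximal gradient descent (Section 1.3.2) and L1-regularized models. As a generic illustration of that building block, here is a minimal sketch of proximal gradient descent (ISTA) for the lasso problem, minimizing 0.5 * ||Ax - b||^2 + lam * ||x||_1. It is not the thesis's primal-dual algorithm, its distributed data structures, or its Cox regression code. The function names and the conservative step size 1 / ||A||_F^2 are assumptions made for this example, and A is assumed to be nonzero.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |.|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def residual(A, x, b):
    """Compute A x - b for a matrix A stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) - bi for row, bi in zip(A, b)]


def lasso_objective(A, b, x, lam):
    """Evaluate 0.5 * ||A x - b||^2 + lam * ||x||_1."""
    r = residual(A, x, b)
    return 0.5 * sum(ri * ri for ri in r) + lam * sum(abs(xj) for xj in x)


def ista(A, b, lam, n_iter=500):
    """Proximal gradient descent for the lasso, starting from x = 0."""
    n_features = len(A[0])
    # The squared Frobenius norm bounds the Lipschitz constant of the gradient
    # from above, so this step size is safe (if not the fastest).
    step = 1.0 / sum(a * a for row in A for a in row)
    x = [0.0] * n_features
    for _ in range(n_iter):
        r = residual(A, x, b)
        # Gradient of the smooth part: A^T (A x - b)
        grad = [sum(row[j] * ri for row, ri in zip(A, r)) for j in range(n_features)]
        # Gradient step followed by the proximal (soft-thresholding) step
        x = [soft_threshold(xj - step * gj, step * lam) for xj, gj in zip(x, grad)]
    return x


if __name__ == "__main__":
    A = [[1.0, 0.0], [0.0, 1.0]]
    b = [3.0, 0.5]
    print(ista(A, b, lam=1.0))
```

Each coordinate of the soft-thresholding step is independent of the others, and the gradient is a pair of matrix-vector products, which is what makes this family of methods easy to parallelize.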
Depth-adaptive methodologies for 3D image categorization.
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London. Image classification is an active topic of computer vision research. This topic
deals with the learning of patterns in order to allow efficient classification of visual
information. However, most research efforts have focused on 2D image classification.
In recent years, advances in 3D imaging have enabled the development of applications and
provided new research directions. In this thesis, we present methodologies and techniques for image classification using 3D image data. We conducted our research focusing on the attributes and
limitations of depth information regarding possible uses. This research led us to the
development of depth feature extraction methodologies that contribute to the representation
of images thus enhancing the recognition efficiency. We proposed a new
classification algorithm that adapts to the need of image representations by implementing
a scale-based decision that exploits discriminant parts of representations.
Learning from the design of image representation methods, we introduce our own method,
which describes each image by its depicted content and provides a more discriminative image
representation. We also propose a dictionary learning method that exploits the
relation of training features by assessing the similarity of features originating from
similar context regions. Finally, we present our research on deep learning algorithms
combined with data and techniques used in 3D imaging. Our novel methods provide
state-of-the-art results, thus contributing to the research of 3D image classification.
No Pattern, No Recognition: a Survey about Reproducibility and Distortion Issues of Text Clustering and Topic Modeling
Extracting knowledge from unlabeled texts using machine learning algorithms
can be complex. Document categorization and information retrieval are two
applications that may benefit from unsupervised learning (e.g., text clustering
and topic modeling), including exploratory data analysis. However, the
unsupervised learning paradigm poses reproducibility issues. The initialization
can lead to variability depending on the machine learning algorithm.
Furthermore, the distortions can be misleading when regarding cluster geometry.
Amongst the causes, the presence of outliers and anomalies can be a determining
factor. Despite the relevance of initialization and outlier issues for text
clustering and topic modeling, the authors did not find an in-depth analysis of
them. This survey provides a systematic literature review (2011-2022) of these
subareas and proposes a common terminology since similar procedures have
different terms. The authors describe research opportunities, trends, and open
issues. The appendices summarize the theoretical background of the text
vectorization, the factorization, and the clustering algorithms that are
directly or indirectly related to the reviewed works.