100 research outputs found

    No Pattern, No Recognition: a Survey about Reproducibility and Distortion Issues of Text Clustering and Topic Modeling

    Extracting knowledge from unlabeled texts using machine learning algorithms can be complex. Document categorization and information retrieval are two applications that may benefit from unsupervised learning (e.g., text clustering and topic modeling), including exploratory data analysis. However, the unsupervised learning paradigm poses reproducibility issues. Depending on the machine learning algorithm, the initialization can lead to variability in the results. Furthermore, distortions can be misleading with regard to cluster geometry. Among the causes, the presence of outliers and anomalies can be a determining factor. Despite the relevance of initialization and outlier issues for text clustering and topic modeling, the authors did not find an in-depth analysis of them. This survey provides a systematic literature review (2011-2022) of these subareas and proposes a common terminology, since similar procedures are described with different terms. The authors describe research opportunities, trends, and open issues. The appendices summarize the theoretical background of the text vectorization, factorization, and clustering algorithms that are directly or indirectly related to the reviewed works.
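The initialization issue mentioned in this abstract can be illustrated with a small sketch. It is not taken from the survey: it assumes plain Lloyd-style k-means on four hypothetical 2-D points, and the helper names (`sq_dist`, `kmeans`, `inertia`) are made up for the illustration. Two different starting centroids converge to two different partitions of the same data.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, centroids, max_iter=50):
    """Lloyd iterations started from explicitly supplied centroids."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


def inertia(points, labels, centroids):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(sq_dist(p, centroids[label]) for p, label in zip(points, labels))


# Hypothetical data: corners of a 4-by-1 rectangle.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

# Start A: one centroid on the left edge, one on the right edge.
labels_a, cents_a = kmeans(points, [(0.0, 0.5), (4.0, 0.5)])
# Start B: one centroid on the bottom edge, one on the top edge.
labels_b, cents_b = kmeans(points, [(2.0, 0.0), (2.0, 1.0)])

print(labels_a, inertia(points, labels_a, cents_a))
print(labels_b, inertia(points, labels_b, cents_b))
```

Both runs stop at a fixed point of the algorithm, but only the first one groups the points by their nearest neighbours; the second is a poorer local optimum reached purely because of where the centroids started.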

    동쒅, 이쒅, 그리고 λ‚˜λ¬΄ ν˜•νƒœμ˜ κ·Έλž˜ν”„λ₯Ό μœ„ν•œ 비지도 ν‘œν˜„ ν•™μŠ΅

    Doctoral dissertation, Seoul National University Graduate School, College of Engineering, Department of Electrical and Computer Engineering, August 2022. Advisor: Jin Young Choi (최진영). The goal of unsupervised graph representation learning is to extract useful node-wise or graph-wise vector representations that are aware of the intrinsic structure of the graph and its attributes. Recently, the design of unsupervised graph representation learning methods based on graph neural networks has attracted growing attention because of their powerful representation ability. Many methods focus on homogeneous graphs, that is, networks with a single type of node and a single type of edge. However, because many types of relationships exist in the world, graphs can also be classified into various types by their structural and semantic properties. For this reason, to learn useful representations from graphs, an unsupervised learning framework must consider the characteristics of the input graph. In this dissertation, we focus on designing unsupervised learning models using graph neural networks for three widely available graph structures: homogeneous graphs, tree-like graphs, and heterogeneous graphs. First, we propose a symmetric graph convolutional autoencoder that produces a low-dimensional latent representation from a homogeneous graph. In contrast to existing graph autoencoders with asymmetric decoder parts, the proposed autoencoder has a newly designed decoder that yields a completely symmetric autoencoder form. For the reconstruction of node features, the decoder is based on Laplacian sharpening as the counterpart of the Laplacian smoothing performed by the encoder, which allows the graph structure to be used throughout the proposed autoencoder architecture.
To prevent the numerical instability caused by introducing Laplacian sharpening, we further propose a numerically stable form of Laplacian sharpening that incorporates signed graphs. The experimental results on clustering, link prediction, and visualization tasks on homogeneous graphs strongly support that the proposed model is stable and outperforms various state-of-the-art algorithms. Second, we analyze how unsupervised tasks can benefit from representations learned in hyperbolic space. To explore how well the hierarchical structure of unlabeled data can be represented in hyperbolic spaces, we design a novel hyperbolic message passing autoencoder whose overall auto-encoding is performed in hyperbolic space. The proposed model auto-encodes networks by fully utilizing hyperbolic geometry in message passing. Through extensive quantitative and qualitative analyses, we validate the properties and benefits of unsupervised hyperbolic representations of tree-like graphs. Third, we propose the novel concept of the metanode for message passing, which learns both heterogeneous and homogeneous relationships between any two nodes without meta-paths or meta-graphs. Unlike conventional methods, metanodes do not require a predetermined step that manipulates the given relations between different types to enrich relational information. Going one step further, we propose a metanode-based message passing layer and a contrastive learning model that uses the proposed layer.
In our experiments, we show the competitive performance of the proposed metanode-based message passing method on node clustering and node classification tasks, when compared to state-of-the-art methods for message passing networks for heterogeneous graphs.

    Contents:
    1 Introduction
    2 Representation Learning on Graph-Structured Data
        2.1 Basic Introduction
            2.1.1 Notations
        2.2 Traditional Approaches
            2.2.1 Graph Statistic
            2.2.2 Neighborhood Overlap
            2.2.3 Graph Kernel
            2.2.4 Spectral Approaches
        2.3 Node Embeddings I: Factorization and Random Walks
            2.3.1 Factorization-based Methods
            2.3.2 Random Walk-based Methods
        2.4 Node Embeddings II: Graph Neural Networks
            2.4.1 Overview of Framework
            2.4.2 Representative Models
        2.5 Learning in Unsupervised Environments
            2.5.1 Predictive Coding
            2.5.2 Contrastive Coding
        2.6 Applications
            2.6.1 Classifications
            2.6.2 Link Prediction
    3 Autoencoder Architecture for Homogeneous Graphs
        3.1 Overview
        3.2 Preliminaries
            3.2.1 Spectral Convolution on Graphs
            3.2.2 Laplacian Smoothing
        3.3 Methodology
            3.3.1 Laplacian Sharpening
            3.3.2 Numerically Stable Laplacian Sharpening
            3.3.3 Subspace Clustering Cost for Image Clustering
            3.3.4 Training
        3.4 Experiments
            3.4.1 Datasets
            3.4.2 Experimental Settings
            3.4.3 Comparing Methods
            3.4.4 Node Clustering
            3.4.5 Image Clustering
            3.4.6 Ablation Studies
            3.4.7 Link Prediction
            3.4.8 Visualization
        3.5 Summary
    4 Autoencoder Architecture for Tree-like Graphs
        4.1 Overview
        4.2 Preliminaries
            4.2.1 Hyperbolic Embeddings
            4.2.2 Hyperbolic Geometry
        4.3 Methodology
            4.3.1 Geometry-Aware Message Passing
            4.3.2 Nonlinear Activation
            4.3.3 Loss Function
        4.4 Experiments
            4.4.1 Datasets
            4.4.2 Compared Methods
            4.4.3 Experimental Details
            4.4.4 Node Clustering and Link Prediction
            4.4.5 Image Clustering
            4.4.6 Structure-Aware Unsupervised Embeddings
            4.4.7 Hyperbolic Distance to Filter Training Samples
            4.4.8 Ablation Studies
        4.5 Further Discussions
            4.5.1 Connection to Contrastive Learning
            4.5.2 Failure Cases of Hyperbolic Embedding Spaces
        4.6 Summary
    5 Contrastive Learning for Heterogeneous Graphs
        5.1 Overview
        5.2 Preliminaries
            5.2.1 Meta-path
            5.2.2 Representation Learning on Heterogeneous Graphs
            5.2.3 Contrastive methods for Heterogeneous Graphs
        5.3 Methodology
            5.3.1 Definitions
            5.3.2 Metanode-based Message Passing Layer
            5.3.3 Contrastive Learning Framework
        5.4 Experiments
            5.4.1 Experimental Details
            5.4.2 Node Classification
            5.4.3 Node Clustering
            5.4.4 Visualization
            5.4.5 Effectiveness of Metanodes
        5.5 Summary
    6 Conclusions
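The smoothing and sharpening pair named in this abstract can be sketched on a toy graph. This is an illustration under stated assumptions, not the dissertation's layer: it uses a common neighbour-averaging form, x_i' = (1 - gamma) * x_i + gamma * mean of neighbours, where a positive gamma smooths and a negative gamma sharpens. The function name `laplacian_step`, the three-node path graph, and the scalar features are all hypothetical, and the numerically stable signed-graph variant is not reproduced here.

```python
def laplacian_step(features, neighbors, gamma):
    """One neighbour-averaging update on scalar node features.

    gamma > 0 pulls each node toward the mean of its neighbours (smoothing);
    gamma < 0 pushes each node away from that mean (sharpening).
    Assumes every node has at least one neighbour.
    """
    updated = []
    for i, x_i in enumerate(features):
        mean_nb = sum(features[j] for j in neighbors[i]) / len(neighbors[i])
        updated.append((1 - gamma) * x_i + gamma * mean_nb)
    return updated


# Hypothetical path graph 0 - 1 - 2 with one scalar feature per node.
neighbors = {0: [1], 1: [0, 2], 2: [1]}
features = [0.0, 1.0, 4.0]

smoothed = laplacian_step(features, neighbors, 0.5)
sharpened = laplacian_step(features, neighbors, -0.5)

print(smoothed)   # neighbouring values move closer together
print(sharpened)  # neighbouring values move further apart
```

Repeated sharpening keeps amplifying differences between neighbours, which gives some intuition for why a naive sharpening decoder could become numerically unstable.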

    Algorithms, applications and systems towards interpretable pattern mining from multi-aspect data

    How do humans move around in urban space, and how does their movement differ when the city undergoes terrorist attacks? How do users behave in Massive Open Online Courses (MOOCs), and how does their behavior differ between those who achieve certificates and those who do not? In which areas of the court do elite players, such as Stephen Curry and LeBron James, like to take their shots over the course of a game? How can we uncover the hidden habits that govern our online purchases? Are there unspoken agendas in how different states pass legislation of certain kinds? At the heart of these seemingly unconnected puzzles is the same mystery of multi-aspect mining, i.e., how can we mine and interpret the hidden patterns in a dataset that simultaneously reveal the associations, or changes in the associations, among various aspects of the data (e.g., a shot could be described with three aspects: player, time of the game, and area of the court)? Solving this problem could open the gates to a deep understanding of the underlying mechanisms of many real-world phenomena. While much of the research in multi-aspect mining contributes a broad scope of innovations to the mining part, the interpretation of patterns from the perspective of users (or domain experts) is often overlooked. Questions such as what users require of patterns, how good the patterns are, or how to read them have barely been addressed. Without efficient and effective ways of involving users in the process of multi-aspect mining, the results are likely to be difficult for them to comprehend. This dissertation proposes the M^3 framework, which consists of multiplex pattern discovery, multifaceted pattern evaluation, and multipurpose pattern presentation, to tackle the challenges of multi-aspect pattern discovery. Based on this framework, we develop algorithms, applications, and analytic systems to enable interpretable pattern discovery from multi-aspect data.
Following the concept of meaningful multiplex pattern discovery, we propose PairFac to close the gap between human information needs and naive mining optimization. We demonstrate its effectiveness in the context of impact discovery in the aftermath of urban disasters. We develop iDisc to target the crossing of multiplex pattern discovery with multifaceted pattern evaluation. iDisc meets the specific information need of understanding multi-level, contrastive behavior patterns. As an example, we use iDisc to predict student performance outcomes in Massive Open Online Courses given users' latent behaviors. FacIt is an interactive visual analytic system that sits at the intersection of all three components and enables interpretable, fine-tunable, and scrutinizable pattern discovery from multi-aspect data. We demonstrate each work's significance and implications in its respective problem context. As a whole, this series of studies is an effort to instantiate the M^3 framework and push the field of multi-aspect mining towards a more human-centric process in real-world applications.

    Temporal Link Prediction: A Unified Framework, Taxonomy, and Review

    Dynamic graphs serve as a generic abstraction and description of the evolutionary behaviors of various complex systems (e.g., social networks and communication networks). Temporal link prediction (TLP) is a classic yet challenging inference task on dynamic graphs, which predicts possible future linkage based on historical topology. The predicted future topology can be used to support advanced applications on real-world systems (e.g., resource pre-allocation) for better system performance. This survey provides a comprehensive review of existing TLP methods. Concretely, we first give the formal problem statements and preliminaries regarding the data models, task settings, and learning paradigms that are commonly used in related research. A hierarchical fine-grained taxonomy is then introduced to categorize existing methods in terms of their data models, learning paradigms, and techniques. From a generic perspective, we propose a unified encoder-decoder framework to formulate all the methods reviewed, in which different approaches differ only in some components of the framework. Moreover, we envision serving the community with an open-source project, OpenTLP, that refactors or implements some representative TLP methods using the proposed unified framework and summarizes other public resources. In conclusion, we discuss advanced topics in recent research and highlight possible future directions.
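The encode-then-decode view of temporal link prediction described in this abstract can be sketched with a deliberately simple baseline. Nothing here comes from the survey or from OpenTLP: the functions `encode` and `decode`, the exponential decay weighting, the threshold, and the toy snapshot history are all assumptions chosen for illustration. The encoder compresses past snapshots into one score per edge, and the decoder turns those scores into a predicted edge set for the next step.

```python
def encode(snapshots, decay=0.5):
    """Summarise past edge sets as per-edge scores.

    The most recent snapshot gets weight 1, the one before it gets `decay`,
    then `decay ** 2`, and so on (a hypothetical weighting choice).
    """
    scores = {}
    last = len(snapshots) - 1
    for t, edges in enumerate(snapshots):
        weight = decay ** (last - t)
        for edge in edges:
            scores[edge] = scores.get(edge, 0.0) + weight
    return scores


def decode(scores, threshold=1.0):
    """Predict the next snapshot as every edge whose score reaches the threshold."""
    return {edge for edge, score in scores.items() if score >= threshold}


# Hypothetical history of three snapshots over nodes 0, 1, 2.
history = [{(0, 1)}, {(0, 1), (1, 2)}, {(1, 2)}]

scores = encode(history)
predicted = decode(scores)

print(scores)
print(predicted)
```

In this framing, swapping the decayed sum for a learned embedding model or the threshold for a learned scoring function changes a component without changing the overall pipeline.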

    Unsupervised Learning of Latent Structure from Linear and Nonlinear Measurements

    University of Minnesota Ph.D. dissertation. June 2019. Major: Electrical Engineering. Advisor: Nicholas Sidiropoulos. 1 computer file (PDF); xii, 118 pages. The past few decades have seen a rapid expansion of our digital world. While early dwellers of the Internet exchanged simple text messages via email, modern citizens of the digital world conduct a much richer set of activities online: entertainment, banking, and booking restaurants and hotels, to name just a few. In our digitally enriched lives, we not only enjoy great convenience and efficiency, but also leave behind massive amounts of data that offer ample opportunities for improving these digital services and creating new ones. Meanwhile, technical advancements have facilitated the emergence of new sensors and networks that can measure, exchange, and log data about real-world events. These technologies have been applied to many different scenarios, including environmental monitoring, advanced manufacturing, healthcare, and scientific research in physics, chemistry, biotechnology, and social science. Leveraging the abundant data, learning-based and data-driven methods have become a dominant paradigm across different areas, with data analytics driving many of the recent developments. However, the massive amount of data also brings considerable challenges for analytics. Among them, the collected data are often high-dimensional, with the true knowledge and signal of interest hidden underneath. It is of great importance to reduce the data dimension and transform the data into the right space. In some cases, the data are generated from certain generative models that are identifiable, making it possible to reduce the data back to the original space.
In addition, we are often interested in performing some analysis on the data after dimensionality reduction (DR), and it is helpful to be mindful of these subsequent analysis steps when performing DR, as latent structures can serve as a valuable prior. Based on this reasoning, we develop two methods, one for the linear generative model case and the other for the nonlinear case. In a related setting, we study parameter estimation under unknown nonlinear distortion. In this case, the unknown nonlinearity in the measurements poses a severe challenge. In practice, various mechanisms can introduce nonlinearity in the measured data. To combat this challenge, we put forth a nonlinear mixture model that is well-grounded in real-world applications. We show that this model is in fact identifiable up to some trivial indeterminacy. We develop an efficient algorithm to recover the latent parameters of this model, and confirm the effectiveness of our theory and algorithm via numerical experiments.

    Structured representation learning from complex data

    This thesis advances several theoretical and practical aspects of the recently introduced restricted Boltzmann machine, a powerful probabilistic and generative framework for modelling data and learning representations. The contributions of this study share a systematic and common theme: learning structured representations from complex data.