1,064 research outputs found

    Interpretable Hyperspectral AI: When Non-Convex Modeling meets Hyperspectral Remote Sensing

    Hyperspectral imaging, also known as image spectrometry, is a landmark technique in geoscience and remote sensing (RS). In the past decade, enormous efforts have been made to process and analyze these hyperspectral (HS) products, mainly by seasoned experts. However, with the ever-growing volume of data, the cost in manpower and material resources makes it necessary to reduce manual labor and improve efficiency. It is therefore urgent to develop more intelligent and automatic approaches for various HS RS applications. Machine learning (ML) tools with convex optimization have successfully undertaken numerous artificial intelligence (AI)-related tasks. However, their ability to handle complex practical problems remains limited, particularly for HS data, due to the effects of various spectral variabilities in the process of HS imaging and the complexity and redundancy of higher-dimensional HS signals. Compared to convex models, non-convex modeling, which is capable of characterizing more complex real scenes and providing model interpretability technically and theoretically, has been proven to be a feasible solution to reduce the gap between challenging HS vision tasks and currently advanced intelligent data processing models.

    Maximum Covariance Unfolding Regression: A Novel Covariate-based Manifold Learning Approach for Point Cloud Data

    Point cloud data are widely used in manufacturing applications for process inspection, modeling, monitoring and optimization. State-of-the-art tensor regression techniques have been used effectively to analyze structured point cloud data, where the measurements on a uniform grid can be formed into a tensor. However, these techniques cannot handle unstructured point cloud data, which often take the form of manifolds. In this paper, we propose a nonlinear dimension reduction approach named Maximum Covariance Unfolding Regression that learns the low-dimensional (LD) manifold of point clouds with the highest correlation with explanatory covariates. This LD manifold is then used for regression modeling and process optimization based on process variables. The performance of the proposed method is evaluated and compared with benchmark methods through simulations and a case study of steel bracket manufacturing.

    High dimensional data analysis for anomaly detection and quality improvement

    Analysis of large-scale high-dimensional data with a complex heterogeneous data structure to extract information or useful features is vital for data fusion in assessing system performance, early detection of system anomalies, intelligent sampling and sensing for data collection, and decision making to achieve optimal system performance. Chapter 3 focuses on detecting anomalies from high-dimensional data. Traditionally, most image-based anomaly detection methods perform denoising and detection sequentially, which affects detection accuracy and efficiency. In this chapter, a novel methodology, named smooth-sparse decomposition (SSD), is proposed to exploit regularized high-dimensional regression to decompose an image and separate anomalous regions simultaneously by solving a large-scale optimization problem. Chapter 4 extends SSD to spatiotemporal functional data as spatiotemporal smooth-sparse decomposition (ST-SSD), with a likelihood ratio test to detect the time of change accurately based on the detected anomaly. To enable real-time implementation of the proposed methodology, recursive estimation procedures for ST-SSD are also developed. The proposed methodology is also applied to tonnage signals, rolling inspection data and solar flare monitoring. Chapter 5 considers the adaptive sampling problem for high-dimensional data. A novel adaptive sampling framework, named Adaptive Kernelized Maximum-Minimum Distance, is proposed to adaptively estimate the sparse anomalous region. The proposed method balances the sampling effort between space-filling sampling (exploration) and focused sampling near the anomalous region (exploitation). The proposed methodology is also applied to a case study of anomaly detection in composite sheets using a guided wave test. Chapter 6 explores penalized tensor regression to model tensor response data with the process variables. 
Regularized Tucker decomposition and regularized tensor regression methods are developed, which model the structured point cloud data as tensors and link the point cloud data with the process variables. The performance of the proposed method is evaluated through simulation and a real case study of turning process optimization. Ph.D.
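
The smooth-plus-sparse idea behind SSD in the abstract above can be illustrated with a minimal one-dimensional sketch. This is not the thesis's method: the abstract describes regularized high-dimensional regression on images, whereas the sketch below swaps in a moving-average smoother and alternates it with soft-thresholding of the residual. The function names, the window width, and the threshold `lam` are all assumptions made for illustration.

```python
def soft_threshold(x, lam):
    """Shrink x toward zero by lam; values within [-lam, lam] become 0."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def moving_average(values, half_width):
    """Simple stand-in smoother (assumption: the thesis uses regularized regression)."""
    n = len(values)
    out = []
    for i in range(n):
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        window = values[lo:hi]
        out.append(sum(window) / len(window))
    return out


def smooth_sparse_decompose(y, lam=1.0, half_width=2, iters=50):
    """Alternate between estimating a smooth background and a sparse anomaly."""
    anomaly = [0.0] * len(y)
    smooth = list(y)
    for _ in range(iters):
        # Smooth the signal after removing the current anomaly estimate.
        smooth = moving_average([yi - ai for yi, ai in zip(y, anomaly)], half_width)
        # Whatever the smooth part cannot explain, beyond lam, is flagged as anomalous.
        anomaly = [soft_threshold(yi - si, lam) for yi, si in zip(y, smooth)]
    return smooth, anomaly


# Toy signal: flat background with a single spike at index 10.
signal = [0.0] * 20
signal[10] = 5.0
smooth, anomaly = smooth_sparse_decompose(signal)
flagged = [i for i, a in enumerate(anomaly) if a != 0.0]
print(flagged)
```

Estimating both parts in one loop, rather than denoising first and detecting afterwards, mirrors the simultaneous decomposition that the abstract contrasts with sequential pipelines.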

    Integrating prior knowledge into factorization approaches for relational learning

    An efficient way to represent domain knowledge is relational data, where information is recorded in the form of relationships between entities. Relational data has become ubiquitous for knowledge representation because much real-world data is inherently interlinked. Some well-known examples of relational data are the World Wide Web (WWW), a system of interlinked hypertext documents; the Linked Open Data (LOD) cloud of the Semantic Web, a collection of published data and their interlinks; and the Internet of Things (IoT), a network of physical objects with internal states and communication abilities. Relational data has been addressed by many different machine learning approaches; the most promising ones are in the area of relational learning, which is the focus of this thesis. While conventional machine learning algorithms consider entities as independent instances randomly sampled from some statistical distribution and represented as data points in a vector space, relational learning takes into account the overall network environment when predicting the label of an entity, an attribute value of an entity or the existence of a relationship between entities. An important feature is that relational learning can exploit contextual information that is more distant in the relational network. As the volume and structural complexity of relational data increase constantly in the era of Big Data, scalability and modeling power become crucial for relational learning algorithms. Previous relational learning algorithms either provide an intuitive representation of the model, such as Inductive Logic Programming (ILP) and Markov Logic Networks (MLNs), or assume a set of latent variables to explain the observed data, such as the Infinite Hidden Relational Model (IHRM), the Infinite Relational Model (IRM) and factorization approaches. 
Models with intuitive representations often involve some form of structure learning, which leads to scalability problems due to a typically large search space. Factorizations are among the best-performing approaches for large-scale relational learning, since the algebraic computations can easily be parallelized and since they can exploit data sparsity. Previous factorization approaches exploit only patterns in the relational data itself. The focus of the thesis is to investigate how additional prior information (comprehensive information), either in the form of unstructured data (e.g., texts) or structured patterns (e.g., in the form of rules), can be considered in factorization approaches. The goal is, on the one hand, to enhance the predictive power of factorization approaches by involving prior knowledge in the learning and, on the other hand, to reduce the model complexity for efficient learning. This thesis contains two main contributions. The first contribution presents a general and novel framework for predicting relationships in multirelational data using a set of matrices describing the various instantiated relations in the network. The instantiated relations, derived or learnt from prior knowledge, are integrated as entities' attributes or entity-pairs' attributes into different adjacency matrices for the learning. All the information available is then combined in an additive way. Efficient learning is achieved using an alternating least squares approach exploiting sparse matrix algebra and low-rank approximation. As an illustration, several algorithms are proposed to include information extraction, deductive reasoning and contextual information in matrix factorizations for the Semantic Web scenario and for recommendation systems. Experiments on various data sets are conducted for each proposed algorithm to show the improvement in predictive power by combining matrix factorizations with prior knowledge in a modular way. 
In contrast to a matrix, a 3-way tensor is a more natural representation for multirelational data, where entities are connected by different types of relations. A 3-way tensor is a three-dimensional array that represents the multirelational data by using the first two dimensions for entities and the third dimension for the types of relations. In the thesis, an analysis of the computational complexity of tensor models shows that the decomposition rank is key to an efficient tensor decomposition algorithm, and that the factorization rank can be reduced by including observable patterns. Based on these theoretical considerations, the second contribution of this thesis develops a novel tensor decomposition approach, the Additive Relational Effects (ARE) model, which combines the strengths of factorization approaches and prior knowledge in an additive way to discover different relational effects from the relational data. As a result, ARE consists of a decomposition part, which derives the strong relational learning effects from the highly scalable tensor decomposition approach RESCAL, and a Tucker-1 tensor, which integrates the prior knowledge as instantiated relations. An efficient least squares approach is proposed to compute the combined model ARE. The additive model contains weights that reflect the degree of reliability of the prior knowledge, as evaluated by the data. Experiments on several benchmark data sets show that the inclusion of prior knowledge can lead to better-performing models at a low tensor rank, with significant benefits for run-time and storage requirements. 
In particular, the results show that ARE outperforms state-of-the-art relational learning algorithms, including intuitive models such as MRC, which is an approach based on Markov Logic with structure learning; factorization approaches such as Tucker, CP, Bayesian Clustered Tensor Factorization (BCTF), the Latent Factor Model (LFM) and RESCAL; and other latent models such as the IRM. A final experiment on a Cora data set for paper topic classification shows the improvement of ARE over RESCAL in both predictive power and runtime performance, since ARE requires a significantly lower rank.
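
The 3-way tensor representation described in this abstract can be sketched in a few lines. The example below builds a binary adjacency tensor from (subject, relation, object) triples and evaluates a RESCAL-style bilinear score, where each relation slice is modeled as X_k ≈ A R_k Aᵀ. The entities, relations, and the hand-picked factor values are hypothetical, nothing here is learned from data, and the additive prior-knowledge term of ARE is not modeled.

```python
def build_tensor(triples, entities, relations):
    """Return tensor[k][i][j] = 1 if relation k holds from entity i to entity j.

    Nested lists index the relation slice first for convenience; the abstract's
    convention (entities on the first two modes, relations on the third) is the
    same data viewed with the axes reordered.
    """
    e_idx = {e: i for i, e in enumerate(entities)}
    r_idx = {r: k for k, r in enumerate(relations)}
    n = len(entities)
    tensor = [[[0] * n for _ in range(n)] for _ in relations]
    for subj, rel, obj in triples:
        tensor[r_idx[rel]][e_idx[subj]][e_idx[obj]] = 1
    return tensor


def bilinear_score(a_i, R_k, a_j):
    """RESCAL-style score a_i^T R_k a_j for one (subject, relation, object) cell."""
    rank = len(a_i)
    return sum(a_i[p] * R_k[p][q] * a_j[q] for p in range(rank) for q in range(rank))


# Hypothetical toy knowledge base.
entities = ["alice", "bob", "paper1"]
relations = ["knows", "authorOf"]
triples = [("alice", "knows", "bob"), ("alice", "authorOf", "paper1")]
X = build_tensor(triples, entities, relations)

# Hand-picked rank-2 latent factors (assumption: in practice these are fitted).
A = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]   # one row per entity
R_knows = [[0.0, 1.0], [0.0, 0.0]]         # interaction matrix for "knows"
score = bilinear_score(A[0], R_knows, A[1])
print(score)
```

Because every entity shares the same factor matrix A across all relation slices, the number of parameters grows with the rank rather than with the number of entity pairs, which is why the abstract stresses keeping the rank low.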

    Easily Parallelizable Statistical Computing Methods and Their Application in Modern High-Performance Computing Environments

    Doctoral dissertation -- Seoul National University Graduate School: College of Natural Sciences, Department of Statistics, August 2020. Joong-Ho Won (원중호). Technological advances in the past decade, hardware and software alike, have made access to high-performance computing (HPC) easier than ever. In this dissertation, easily-parallelizable, inversion-free, and variable-separated algorithms and their implementation in statistical computing are discussed. The first part considers statistical estimation problems under structured sparsity posed as minimization of a sum of two or three convex functions, one of which is a composition of non-smooth and linear functions. Examples include graph-guided sparse fused lasso and overlapping group lasso. Two classes of inversion-free primal-dual algorithms are considered and unified from a perspective of monotone operator theory. From this unification, a continuum of preconditioned forward-backward operator splitting algorithms amenable to parallel and distributed computing is proposed. The unification is further exploited to introduce a continuum of accelerated algorithms on which the theoretically optimal asymptotic rate of convergence is obtained. For the second part, easy-to-use distributed matrix data structures in PyTorch and Julia are presented. They enable users to write code once and run it anywhere from a laptop to a workstation with multiple graphics processing units (GPUs) or a supercomputer in a cloud. With these data structures, various parallelizable statistical applications, including nonnegative matrix factorization, positron emission tomography, multidimensional scaling, and ℓ1-regularized Cox regression, are demonstrated. The examples scale up to an 8-GPU workstation and a 720-CPU-core cluster in a cloud. As a case in point, the onset of type-2 diabetes from the UK Biobank with 400,000 subjects and about 500,000 single nucleotide polymorphisms is analyzed using the HPC ℓ1-regularized Cox regression. Fitting a half-million-variate model took about 50 minutes, reconfirming known associations. 
To my knowledge, this is the first demonstration of the feasibility of a joint genome-wide association analysis of survival outcomes at this scale.
Contents:
Chapter 1 Prologue: 1.1 Introduction; 1.2 Accessible High-Performance Computing Systems (1.2.1 Preliminaries; 1.2.2 Multiple CPU nodes: clusters, supercomputers, and clouds; 1.2.3 Multi-GPU node); 1.3 Highly Parallelizable Algorithms (1.3.1 MM algorithms; 1.3.2 Proximal gradient descent; 1.3.3 Proximal distance algorithm; 1.3.4 Primal-dual methods)
Chapter 2 Easily Parallelizable and Distributable Class of Algorithms for Structured Sparsity, with Optimal Acceleration: 2.1 Introduction; 2.2 Unification of Algorithms LV and CV (g ≡ 0) (2.2.1 Relation between Algorithms LV and CV; 2.2.2 Unified algorithm class; 2.2.3 Convergence analysis); 2.3 Optimal acceleration (2.3.1 Algorithms; 2.3.2 Convergence analysis); 2.4 Stochastic optimal acceleration (2.4.1 Algorithm; 2.4.2 Convergence analysis); 2.5 Numerical experiments (2.5.1 Model problems; 2.5.2 Convergence behavior; 2.5.3 Scalability); 2.6 Discussion
Chapter 3 Towards Unified Programming for High-Performance Statistical Computing Environments: 3.1 Introduction; 3.2 Related Software (3.2.1 Message-passing interface and distributed array interfaces; 3.2.2 Unified array interfaces for CPU and GPU); 3.3 Easy-to-use Software Libraries for HPC (3.3.1 Deep learning libraries and HPC; 3.3.2 Case study: PyTorch versus TensorFlow; 3.3.3 A brief introduction to PyTorch; 3.3.4 A brief introduction to Julia; 3.3.5 Methods and multiple dispatch; 3.3.6 Multidimensional arrays; 3.3.7 Matrix multiplication; 3.3.8 Dot syntax for vectorization); 3.4 Distributed matrix data structure (3.4.1 Distributed matrices in PyTorch: distmat; 3.4.2 Distributed arrays in Julia: MPIArray); 3.5 Examples (3.5.1 Nonnegative matrix factorization; 3.5.2 Positron emission tomography; 3.5.3 Multidimensional scaling; 3.5.4 L1-regularized Cox regression; 3.5.5 Genome-wide survival analysis of the UK Biobank dataset); 3.6 Discussion
Chapter 4 Conclusion
Appendix A Monotone Operator Theory
Appendix B Proofs for Chapter II: B.1 Preconditioned forward-backward splitting; B.2 Optimal acceleration; B.3 Optimal stochastic acceleration
Appendix C AWS EC2 and ParallelCluster: C.1 Overview; C.2 Glossary; C.3 Prerequisites; C.4 Installation; C.5 Configuration; C.6 Creating, accessing, and destroying the cluster; C.7 Installation of libraries; C.8 Running a job; C.9 Miscellaneous
Appendix D Code for memory-efficient L1-regularized Cox proportional hazards model
Appendix E Details of SNPs selected in L1-regularized Cox regression
Bibliography
Abstract in Korean
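
The dissertation entry above lists proximal gradient descent among its highly parallelizable algorithms and applies ℓ1 regularization throughout. As a minimal sketch of that building block, the code below runs plain proximal gradient descent (ISTA) on a small lasso problem, 0.5·‖Xβ − y‖² + λ‖β‖₁. It is not the dissertation's preconditioned forward-backward or primal-dual algorithm, and it is not Cox regression; the data, step size, and function names are assumptions chosen for illustration.

```python
def soft_threshold(x, t):
    """Proximal operator of t * |x|: shrink x toward zero by t."""
    return max(x - t, 0.0) - max(-x - t, 0.0)


def lasso_ista(X, y, lam, step, iters=500):
    """Proximal gradient descent for 0.5 * ||X b - y||^2 + lam * ||b||_1.

    Assumes step <= 1 / L, where L is the largest eigenvalue of X^T X.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        # Gradient of the smooth least-squares term: X^T (X beta - y).
        resid = [sum(X[i][j] * beta[j] for j in range(p)) - y[i] for i in range(n)]
        grad = [sum(X[i][j] * resid[i] for i in range(n)) for j in range(p)]
        # Gradient step followed by the coordinate-wise proximal (shrinkage) step.
        beta = [soft_threshold(beta[j] - step * grad[j], step * lam) for j in range(p)]
    return beta


# Toy design with orthogonal columns, so X^T X = 2 * I and step = 1 / 2.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.1, 2.0, 0.1]
beta = lasso_ista(X, y, lam=0.5, step=0.5)
print(beta)
```

Each update needs only matrix-vector products and an elementwise shrinkage, with no matrix inversion, which is the property that makes this family of methods easy to distribute across devices.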

    Scalable statistical learning for relation prediction on structured data

    Relation prediction seeks to predict unknown but potentially true relations by revealing missing relations in available data, by predicting future events based on historical data, and by making predicted relations retrievable by query. The approach developed in this thesis can be used for a wide variety of purposes, including predicting likely new friends on social networks, attractive points of interest for an individual visiting an unfamiliar city, and associations between genes and particular diseases. In recent years, relation prediction has attracted significant interest in both research and application domains, partially due to the increasing volume of published structured data and background knowledge. In the Linked Open Data initiative of the Semantic Web, for instance, entities are uniquely identified such that the published information can be integrated into applications and services, and the rapid increase in the availability of such structured data creates excellent opportunities as well as challenges for relation prediction. This thesis focuses on the prediction of potential relations by exploiting regularities in data using statistical relational learning algorithms and applying these methods to relational knowledge bases, in particular Linked Open Data. We review representative statistical relational learning approaches, e.g., Inductive Logic Programming and Probabilistic Relational Models. While logic-based reasoning can infer and include new relations via deduction by using ontologies, machine learning can be exploited to predict new relations (with some degree of certainty) via induction, purely based on the data. 
Because the application of machine learning approaches to relation prediction usually requires handling large datasets, we also discuss the scalability of machine learning as a solution to relation prediction, as well as the significant challenge posed by incomplete relational data (such as social network data, which is often much more extensive for some users than others). The main contribution of this thesis is to develop a learning framework called the Statistical Unit Node Set (SUNS) and to propose a multivariate prediction approach used in the framework. We argue that multivariate prediction approaches are most suitable for dealing with large, sparse data matrices. According to the characteristics and intended application of the data, the approach can be extended in different ways. We discuss and test two extensions of the approach--kernelization and a probabilistic method of handling complex n-ary relationships--in empirical studies based on real-world data sets. Additionally, this thesis contributes to the field of relation prediction by applying the SUNS framework to various domains. We focus on three applications: 1. In social network analysis, we present a combined approach of inductive and deductive reasoning for recommending movies to users. 2. In the life sciences, we address the disease gene prioritization problem. 3. In recommendation systems, we describe and investigate the back-end of a mobile app called BOTTARI, which provides personalized location-based recommendations of restaurants.

    Tensor Regression

    Regression analysis is a key area of interest in data analysis and machine learning, devoted to exploring the dependencies between variables, often using vectors. The emergence of high-dimensional data in fields such as neuroimaging, computer vision, climatology and social networks has brought challenges to traditional data representation methods. Tensors, as high-dimensional extensions of vectors, are considered natural representations of high-dimensional data. In this book, the authors provide a systematic study and analysis of tensor-based regression models and their applications in recent years. It groups and illustrates the existing tensor-based regression methods and covers the basics, core ideas, and theoretical characteristics of most of them. In addition, readers can learn how to use existing tensor-based regression methods to solve specific regression tasks with multiway data, what datasets can be selected, and what software packages are available to start related work as soon as possible. Tensor Regression is the first thorough overview of the fundamentals, motivations, popular algorithms, strategies for efficient implementation, related applications, available datasets, and software resources for tensor-based regression analysis. It is essential reading for all students, researchers and practitioners working on high-dimensional data. Comment: 187 pages, 32 figures, 10 tables
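
To make the idea of tensor-based regression concrete, here is a minimal sketch of one common formulation, low-rank regression with a matrix-valued covariate: the response is modeled as y = ⟨W, X⟩ and the coefficient matrix is constrained to rank one, W = u vᵀ, so that y = uᵀ X v. The book's own notation and models are not reproduced here; the function names and numbers are hypothetical and nothing is fitted.

```python
def inner_product(W, X):
    """Frobenius inner product <W, X> between two equally sized matrices."""
    return sum(W[i][j] * X[i][j] for i in range(len(W)) for j in range(len(W[0])))


def outer(u, v):
    """Rank-one matrix u v^T."""
    return [[ui * vj for vj in v] for ui in u]


def rank1_predict(u, X, v):
    """Evaluate u^T X v without ever forming the full coefficient matrix."""
    return sum(u[i] * X[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))


# Hypothetical coefficient factors and a single 2 x 3 covariate matrix.
u = [1.0, 2.0]
v = [0.5, -1.0, 0.0]
X = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]

full = inner_product(outer(u, v), X)   # uses all 2 * 3 = 6 coefficients
low_rank = rank1_predict(u, X, v)      # uses only 2 + 3 = 5 parameters
print(full, low_rank)
```

The saving is negligible at this size but grows quickly: for a covariate of size 100 x 100 x 100, an unconstrained coefficient tensor has a million entries, while a rank-one factorization needs 300 parameters.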