Advances in Deep Generative Modeling With Applications to Image Generation and Neuroscience
Deep generative modeling is an increasingly popular area of machine learning that takes advantage of recent developments in neural networks to estimate the distribution of observed data. In this dissertation we introduce three advances in this area. The first, Maximum Entropy Flow Networks, enables maximum entropy modeling by combining normalizing flows with the augmented Lagrangian optimization method. The second is the continuous Bernoulli, a new [0,1]-supported distribution, which we introduce to fix the pervasive error in variational autoencoders of using a Bernoulli likelihood for non-binary data. The last, Deep Random Splines, is a novel distribution over functions, where samples are obtained by sampling Gaussian noise and transforming it through a neural network to obtain the parameters of a spline. We apply these to model texture images, natural images, and neural population data, respectively, and observe significant improvements over current state-of-the-art alternatives.
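The continuous Bernoulli mentioned above differs from a Bernoulli likelihood applied to [0,1]-valued data by a normalizing constant. The following is a minimal sketch, assuming the standard parameterization with density C(lam) * lam^x * (1 - lam)^(1 - x) and C(lam) = 2 artanh(1 - 2 lam) / (1 - 2 lam) for lam != 0.5; the function names and the cutoff near 0.5 are illustrative choices, not taken from the dissertation.

```python
import math


def cb_log_norm_const(lam: float) -> float:
    """Log of the normalizing constant C(lam) of the continuous Bernoulli."""
    if abs(lam - 0.5) < 1e-6:
        # C(lam) tends to 2 as lam approaches 0.5 (illustrative cutoff).
        return math.log(2.0)
    # C(lam) = 2 * artanh(1 - 2*lam) / (1 - 2*lam) for lam != 0.5
    return math.log(2.0 * math.atanh(1.0 - 2.0 * lam) / (1.0 - 2.0 * lam))


def bernoulli_log_lik(x: float, lam: float) -> float:
    """Unnormalized term: the Bernoulli log-likelihood evaluated at x in [0, 1]."""
    return x * math.log(lam) + (1.0 - x) * math.log(1.0 - lam)


def cb_log_pdf(x: float, lam: float) -> float:
    """Log-density of the continuous Bernoulli at x in [0, 1], lam in (0, 1)."""
    return cb_log_norm_const(lam) + bernoulli_log_lik(x, lam)


def integrate_pdf(lam: float, n: int = 10000) -> float:
    """Midpoint-rule check that the density integrates to about 1 over [0, 1]."""
    h = 1.0 / n
    return sum(math.exp(cb_log_pdf((i + 0.5) * h, lam)) for i in range(n)) * h
```

Calling `integrate_pdf(0.3)` should return a value very close to 1, whereas the same integral of `exp(bernoulli_log_lik(x, 0.3))` alone does not, which is the mismatch the normalizing constant corrects.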
A Study of Bayesian Nonparametric Regression Models Using Overcomplete Systems of B-Splines
Thesis (Ph.D.) -- Seoul National University Graduate School: College of Natural Sciences, Department of Statistics, 2021.8. Jaeyong Lee.
In this dissertation, we propose the Lévy Adaptive B-Spline regression (LABS) model, an extension of the LARK models, to estimate functions with varying degrees of smoothness. The LABS model is a LARK model with B-spline bases as generating kernels. By changing the degrees of the B-spline bases, LABS can systematically adapt to the smoothness of functions, including jump discontinuities and sharp peaks. Simulation studies and real data examples show that the model captures not only smooth regions but also jumps and sharp peaks of functions. The LABS model has the best performance in almost all examples. We also provide theoretical results showing that the mean function of the LABS model belongs to specific Besov spaces determined by the degrees of the B-spline bases, and that the prior of the model has full support on those Besov spaces.
Furthermore, we develop a multivariate version of the LABS model by introducing tensor products of B-spline bases, named Multivariate Lévy Adaptive B-Spline regression (MLABS). The MLABS model performs comparably to state-of-the-art models on both regression and classification problems. In particular, empirical results demonstrate that MLABS has more stable and accurate predictive abilities than state-of-the-art nonparametric regression models on relatively low-dimensional data.
1 Introduction
1.1 Nonparametric regression model
1.2 Literature Review
1.2.1 Literature review of nonparametric function estimation
1.2.2 Literature review of multivariate nonparametric regression
1.3 Outline
2 Bayesian nonparametric function estimation using overcomplete systems with B-spline bases
2.1 Introduction
2.2 Lévy adaptive regression kernels
2.3 Lévy adaptive B-spline regression
2.3.1 B-spline basis
2.3.2 Model specification
2.3.3 Support of LABS model
2.4 Algorithm
2.5 Simulation studies
2.5.1 Simulation 1: DJ test functions
2.5.2 Simulation 2: Smooth functions with jumps and peaks
2.6 Real data applications
2.6.1 Example 1: Minimum legal drinking age
2.6.2 Example 2: Bitcoin prices on Bitstamp
2.6.3 Example 3: Fine particulate matter in Seoul
2.7 Discussion
3 Bayesian multivariate nonparametric regression using overcomplete systems with tensor products of B-spline bases
3.1 Introduction
3.2 Multivariate Lévy adaptive B-spline regression
3.2.1 Model specifications
3.2.2 Comparisons between basis functions of MLABS and MARS
3.2.3 Posterior inference
3.2.4 Binomial regressions for MLABS
3.3 Simulation studies
3.3.1 Surface examples
3.3.2 Friedman's examples
3.4 Real data applications
3.4.1 Regression examples
3.4.2 Classification examples
3.5 Discussion
4 Concluding Remarks
A Appendix
A.1 Appendix for Chapter 2
A.1.1 Proof of Theorem 2.3.1
A.1.2 Proof of Theorem 2.3.2
A.1.3 Proof of Theorem 2.3.3
A.1.4 Full simulation results for Simulation 1
A.1.5 Derivation of the full conditionals for LABS
Bibliography
Abstract in Korean
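The abstract's point that the degree of a B-spline basis controls smoothness can be illustrated with the Cox-de Boor recursion: degree 0 gives a piecewise-constant kernel (a jump), degree 1 a hat function (a sharp peak), and degree 2 a smooth bump. The sketch below is only an illustration under assumptions: `kernel_sum` is a hypothetical helper that adds up weighted B-spline kernels of mixed degrees, in the spirit of a LARK-type expansion, and is not the thesis's model specification or sampler.

```python
def bspline_basis(i, k, knots, x):
    """Value at x of the i-th B-spline basis function of degree k (Cox-de Boor)."""
    if k == 0:
        # Degree 0: indicator of the half-open knot interval [t_i, t_{i+1}).
        return 1.0 if knots[i] <= x < knots[i + 1] else 0.0
    left_den = knots[i + k] - knots[i]
    right_den = knots[i + k + 1] - knots[i + 1]
    left = 0.0
    right = 0.0
    if left_den != 0:
        left = (x - knots[i]) / left_den * bspline_basis(i, k - 1, knots, x)
    if right_den != 0:
        right = (knots[i + k + 1] - x) / right_den * bspline_basis(i + 1, k - 1, knots, x)
    return left + right


def kernel_sum(x, terms):
    """Hypothetical helper: sum of coef * B-spline kernel over (coef, degree, knots) terms.

    Each term uses a single basis function defined on its own knot list,
    which must contain degree + 2 knots.
    """
    return sum(coef * bspline_basis(0, degree, knots, x) for coef, degree, knots in terms)


# One degree-0 kernel (jump at x = 1) plus one degree-1 kernel (peak at x = 1).
example_terms = [
    (2.0, 0, [0.0, 1.0]),
    (1.0, 1, [0.0, 1.0, 2.0]),
]
```

With `example_terms`, the sum drops discontinuously at x = 1 because of the degree-0 kernel, while the degree-1 kernel contributes a continuous but non-differentiable peak at the same point.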
STATISTICAL MACHINE LEARNING BASED MODELING FRAMEWORK FOR DESIGN SPACE EXPLORATION AND RUN-TIME CROSS-STACK ENERGY OPTIMIZATION FOR MANY-CORE PROCESSORS
The complexity of many-core processors continues to grow as a larger number of heterogeneous cores are integrated on a single chip. Such systems-on-chip contain computing structures ranging from complex out-of-order cores, simple in-order cores, digital signal processors (DSPs), graphics processing units (GPUs), application-specific processors, hardware accelerators, I/O subsystems, and network-on-chip interconnects to large caches arranged in complex hierarchies. While the industry focus is on putting a higher number of cores on a single chip, the key challenge is to optimally architect these many-core processors such that performance, energy, and area constraints are satisfied. The traditional approach to processor design through extensive cycle-accurate simulations is ill-suited for designing many-core processors because of the large microarchitecture design space that must be explored. Additionally, it is hard to optimize such complex processors and the applications that run on them statically at design time such that performance and energy constraints are met under dynamically changing operating conditions.
The dissertation establishes a statistical machine learning based modeling framework that enables the efficient design and operation of many-core processors that meet performance, energy, and area constraints. We apply the proposed framework to rapidly design the microarchitecture of a many-core processor for multimedia, computer graphics rendering, finance, and data mining applications derived from the PARSEC benchmark. We further demonstrate the application of the framework in the joint run-time adaptation of both the application and the microarchitecture such that energy availability constraints are met.
Spatio-Temporal Modeling Of Anatomic Motion For Radiation Therapy
In radiation therapy, it is imperative to deliver high doses of radiation to the tumor while reducing radiation to the healthy tissue. Respiratory motion is the most significant source of errors during treatment. Therefore, it is essential to accurately model respiratory motion for precise and effective radiation delivery. Many approaches exist to account for respiratory motion, such as controlled breath hold and respiratory gating, and they have been relatively successful. However, they still present many drawbacks, so research has expanded to tumor tracking.
The overall goal of 4D-CT is to predict tumor motion in real time, and this work attempts to move in that direction. The following work addresses both the temporal and the spatial aspects of four-dimensional CT reconstruction. The aims of the paper are to (1) estimate the temporal parameters of 4D models for anatomy deformation using a novel neural network approach and (2) use intelligently chosen non-uniform, non-separable splines to improve the spatial resolution of the deformation models in image registration.
Learning Bijective Feature Maps for Linear ICA
Separating high-dimensional data like images into independent latent factors,
i.e., independent component analysis (ICA), remains an open research problem. As
we show, existing probabilistic deep generative models (DGMs), which are
tailor-made for image data, underperform on non-linear ICA tasks. To address
this, we propose a DGM which combines bijective feature maps with a linear ICA
model to learn interpretable latent structures for high-dimensional data. Given
the complexities of jointly training such a hybrid model, we introduce novel
theory that constrains linear ICA to lie close to the manifold of orthogonal
rectangular matrices, the Stiefel manifold. By doing so we create models that
converge quickly, are easy to train, and achieve better unsupervised latent
factor discovery than flow-based models, linear ICA, and Variational
Autoencoders on images.
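As a rough illustration of what "close to the Stiefel manifold" can mean, the sketch below measures how far a rectangular matrix is from having orthonormal columns and maps it onto the manifold with Gram-Schmidt. This is an assumption-laden example, not the paper's training procedure: the penalty, the column convention, and all function names are hypothetical.

```python
import math


def gram(W):
    """Return W^T W for a matrix W given as a list of rows."""
    n, p = len(W), len(W[0])
    return [[sum(W[r][i] * W[r][j] for r in range(n)) for j in range(p)] for i in range(p)]


def stiefel_penalty(W):
    """Squared Frobenius norm of W^T W - I; zero exactly when columns are orthonormal."""
    G = gram(W)
    p = len(G)
    return sum((G[i][j] - (1.0 if i == j else 0.0)) ** 2 for i in range(p) for j in range(p))


def orthonormalize_columns(W):
    """Modified Gram-Schmidt on the columns of a full-column-rank matrix W."""
    n, p = len(W), len(W[0])
    columns = [[W[r][c] for r in range(n)] for c in range(p)]
    basis = []
    for v in columns:
        for q in basis:
            dot = sum(a * b for a, b in zip(v, q))
            v = [a - dot * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        basis.append([a / norm for a in v])
    # Convert the list of orthonormal columns back to a list of rows.
    return [[basis[c][r] for c in range(p)] for r in range(n)]


W = [[1.0, 1.0], [0.0, 1.0], [0.0, 0.0]]  # 3 x 2, columns not orthogonal
Q = orthonormalize_columns(W)
```

Here `stiefel_penalty(W)` is positive while `stiefel_penalty(Q)` is zero up to rounding; a penalty of this kind is one simple way to quantify distance from the manifold.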
A feature-based reverse engineering system using artificial neural networks
Reverse Engineering (RE) is the process of reconstructing CAD models from
scanned data of a physical part acquired using 3D scanners. RE has attracted a
great deal of research interest over the last decade. However, a review of the
literature reveals that most research has focused on the creation of free-form
surfaces from point cloud data. Representing geometry in terms of surface patches
is adequate to represent positional information, but cannot capture any of the
higher level structure of the part. Reconstructing solid models is of importance
since the resulting solid models can be directly imported into commercial solid
modellers for various manufacturing activities such as process planning, integral
property computation, assembly analysis, and other applications.
This research discusses the novel methodology of extracting geometric features
directly from a data set of 3D scanned points, which utilises the concepts of
artificial neural networks (ANNs). In order to design and develop a generic
feature-based RE system for prismatic parts, the following five main tasks were
investigated: (1) point data processing algorithms; (2) edge detection strategies;
(3) a feature recogniser using ANNs; (4) a feature extraction module; (5) a CAD
model exchanger into other CAD/CAM systems via IGES.
A key feature of this research is the incorporation of ANNs in feature recognition.
The ANN approach has enabled the development of a flexible feature-based
RE methodology that can be trained to deal with new features. ANNs
require parallel input patterns. In this research, four geometric attributes extracted
from a point set are input to the ANN module for feature recognition: chain codes,
convex/concave, circular/rectangular and open/closed attributes. Recognising each
feature requires the determination of these attributes. New and robust algorithms
are developed for determining these attributes for each of the features.
This feature-based approach currently focuses on solving the feature recognition
problem based on 2.5D shapes such as block pocket, step, slot, hole, and boss,
which are common and crucial in mechanical engineering products. This approach
is validated using a set of industrial components. The test results show that the
strategy for recognising features is reliable.
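Two of the attributes named above, chain codes and the open/closed attribute, can be sketched for a 2D boundary traced on an integer grid. The abstract does not give the thesis's exact encoding, so this example assumes the common Freeman 8-direction chain code with the y axis pointing up; `chain_code` and `is_closed` are hypothetical names.

```python
# Freeman 8-direction codes: 0 = east, then counter-clockwise in 45-degree steps.
DIRECTIONS = {
    (1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
    (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7,
}


def chain_code(points):
    """Encode consecutive unit steps between grid points as Freeman codes."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        step = (x1 - x0, y1 - y0)
        if step not in DIRECTIONS:
            raise ValueError(f"points are not 8-connected neighbours: {step}")
        codes.append(DIRECTIONS[step])
    return codes


def is_closed(points):
    """A boundary is treated as closed when it ends where it started."""
    return len(points) > 1 and points[0] == points[-1]


# Unit square traced counter-clockwise, returning to the start.
square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
square_codes = chain_code(square)
```

A vector of such codes, together with flags like `is_closed(square)`, is the kind of fixed-format input pattern that a neural network classifier could consume.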