Digital scaling of binary images
Thesis (M.S.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1979. MICROFICHE COPY AVAILABLE IN ARCHIVES AND ENGINEERING. Includes bibliographical references. by Robert A. Ulichney. M.S.
Array operators using multiple dispatch: a design methodology for array implementations in dynamic languages
Arrays are such a rich and fundamental data type that they tend to be built
into a language, either in the compiler or in a large low-level library.
Defining this functionality at the user level instead provides greater
flexibility for application domains not envisioned by the language designer.
Only a few languages, such as C++ and Haskell, provide the necessary power to
define n-dimensional arrays, but these systems rely on compile-time
abstraction, sacrificing some flexibility. In contrast, dynamic languages make
it straightforward for the user to define any behavior they might want, but at
the possible expense of performance.
As part of the Julia language project, we have developed an approach that
yields a novel trade-off between flexibility and compile-time analysis. The
core abstraction we use is multiple dispatch. We have come to believe that
while multiple dispatch has not been especially popular in most kinds of
programming, technical computing is its killer application. By expressing key
functions such as array indexing using multi-method signatures, a surprising
range of behaviors can be obtained, in a way that is both relatively easy to
write and amenable to compiler analysis. The compact factoring of concerns
provided by these methods makes it easier for user-defined types to behave
consistently with types in the standard library. Comment: 6 pages, 2 figures, workshop paper for the ARRAY '14 workshop, June
11, 2014, Edinburgh, United Kingdom
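The idea of expressing array indexing through multi-method signatures can be illustrated with a minimal Python sketch. This is not the Julia implementation described in the abstract: the registry `_METHODS`, the decorator `method`, and the function `getindex` are hypothetical names introduced here, and the lookup takes the first registered signature that matches rather than the most specific one.

```python
# Hypothetical registry of (type signature, implementation) pairs.
_METHODS = []


def method(*types):
    """Register an implementation for index arguments of the given types."""
    def register(fn):
        _METHODS.append((types, fn))
        return fn
    return register


def getindex(array, *indices):
    """Choose an implementation from the runtime types of ALL index arguments."""
    for types, fn in _METHODS:
        if len(types) == len(indices) and all(
            isinstance(i, t) for i, t in zip(indices, types)
        ):
            return fn(array, *indices)
    raise TypeError("no indexing method matches the given index types")


@method(int, int)
def _scalar(a, i, j):
    # Two integers select a single element.
    return a[i][j]


@method(int, slice)
def _row_part(a, i, cols):
    # An integer and a slice select part of one row.
    return a[i][cols]


@method(slice, int)
def _column_part(a, rows, j):
    # A slice and an integer select part of one column.
    return [row[j] for row in a[rows]]


@method(slice, slice)
def _block(a, rows, cols):
    # Two slices select a rectangular block.
    return [row[cols] for row in a[rows]]


grid = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(getindex(grid, 1, 2))
print(getindex(grid, slice(0, 2), 1))
```

Each indexing behavior lives in its own small method, so a new index type could be supported by registering one more signature without touching the existing ones.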
Advances in scalable learning and sampling of unnormalised models
We study probabilistic models that are known incompletely, up to an intractable normalising constant. To reap the full benefit of such models, two
tasks must be solved: learning and sampling. These two tasks have been
subject to decades of research, and yet significant challenges still persist.
Traditional approaches often suffer from poor scalability with respect to
dimensionality and model-complexity, generally rendering them inapplicable to models parameterised by deep neural networks. In this thesis, we
contribute a new set of methods for addressing this scalability problem.
We first explore the problem of learning unnormalised models. Our investigation begins with a well-known learning principle, Noise-contrastive
Estimation, whose underlying mechanism is that of density-ratio estimation.
By examining why existing density-ratio estimators scale poorly, we identify a new framework, telescoping density-ratio estimation (TRE), that can
learn ratios between highly dissimilar densities in high-dimensional spaces.
Our experiments demonstrate that TRE not only yields substantial improvements for the learning of deep unnormalised models, but can do the
same for a broader set of tasks including mutual information estimation and
representation learning.
Subsequently, we explore the problem of sampling unnormalised models.
A large literature on Markov chain Monte Carlo (MCMC) can be leveraged here, and in continuous domains, gradient-based samplers such as
Metropolis-adjusted Langevin algorithm (MALA) and Hamiltonian Monte
Carlo are excellent options. However, there has been substantially less
progress in MCMC for discrete domains. To advance this subfield, we introduce several discrete Metropolis-Hastings samplers that are conceptually
inspired by MALA, and demonstrate their strong empirical performance
across a range of challenging sampling tasks.
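For readers unfamiliar with MALA, the continuous-domain sampler that the abstract names as inspiration, a minimal sketch follows. It is not one of the thesis's discrete samplers. The one-dimensional standard Gaussian target, the step size, and all function names are assumptions chosen only for illustration.

```python
import math
import random


def log_density(x):
    """Unnormalised log-density of a standard Gaussian (illustrative target)."""
    return -0.5 * x * x


def grad_log_density(x):
    """Gradient of the log-density above."""
    return -x


def log_proposal(x_to, x_from, step):
    """Log of the Langevin proposal density q(x_to | x_from), up to a constant."""
    mean = x_from + 0.5 * step * step * grad_log_density(x_from)
    return -((x_to - mean) ** 2) / (2.0 * step * step)


def mala(n_samples, step=1.0, x0=0.0, seed=0):
    """Metropolis-adjusted Langevin algorithm for the 1D target above."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        # Langevin proposal: gradient drift plus Gaussian noise.
        mean = x + 0.5 * step * step * grad_log_density(x)
        proposal = mean + step * rng.gauss(0.0, 1.0)
        # Metropolis-Hastings correction for the asymmetric proposal.
        log_accept = (
            log_density(proposal) + log_proposal(x, proposal, step)
            - log_density(x) - log_proposal(proposal, x, step)
        )
        if rng.random() < math.exp(min(0.0, log_accept)):
            x = proposal
        samples.append(x)
    return samples


samples = mala(20000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(mean, var)
```

Only the unnormalised density and its gradient are used, which is why samplers of this kind suit models known up to a normalising constant.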
Neuroanatomy of Individual Differences in Language in Adult Males with Autism.
One potential source of heterogeneity within autism spectrum conditions (ASC) is language development and ability. In 80 high-functioning male adults with ASC, we tested if variations in developmental and current structural language are associated with current neuroanatomy. Groups with and without language delay differed behaviorally in early social reciprocity, current language, but not current autistic features. Language delay was associated with larger total gray matter (GM) volume, smaller relative volume at bilateral insula, ventral basal ganglia, and right superior, middle, and polar temporal structures, and larger relative volume at pons and medulla oblongata in adulthood. Despite this heterogeneity, those with and without language delay showed significant commonality in morphometric features when contrasted with matched neurotypical individuals (n = 57). In ASC, better current language was associated with increased GM volume in bilateral temporal pole, superior temporal regions, dorsolateral fronto-parietal and cerebellar structures, and increased white matter volume in distributed frontal and insular regions. Furthermore, current language-neuroanatomy correlation patterns were similar across subgroups with or without language delay. High-functioning adult males with ASC show neuroanatomical variations associated with both developmental and current language characteristics. This underscores the importance of including both developmental and current language as specifiers for ASC, to help clarify heterogeneity.
This work was supported by the Waterloo Foundation [grant number 921/1247 to S.B-C. and M-C.L.], the UK Medical Research Council [grant number GO 400061 to D.G.M.M., S.B-C. and E.T.B.], and the European Autism Interventions - A Multicentre Study for Developing New Medications (EU-AIMS); EU-AIMS receives support from the Innovative Medicines Initiative Joint Undertaking under grant agreement n° 115300, resources of which are composed of financial contribution from the European Union's Seventh Framework Programme (FP7/2007-2013), from the EFPIA companies in kind contribution and from Autism Speaks. During the period of this work M-C.L. was supported by the Waterloo Foundation, the Ministry of Education, Taiwan, Wolfson College, Cambridge, EU-AIMS, and the William Binks Autism Neuroscience Fellowship; M.V.L. by the Shirley Foundation, the Wellcome Trust, the British Academy, and Jesus College, Cambridge; B.C. by the UK Medical Research Council; S.B-C. by the Wellcome Trust, the UK Medical Research Council, the Waterloo Foundation, the Autism Research Trust, and EU-AIMS. This is the final published version. It was originally published by OUP at http://cercor.oxfordjournals.org/content/early/2014/09/18/cercor.bhu211
Distributed Extra-gradient with Optimal Complexity and Communication Guarantees
We consider monotone variational inequality (VI) problems in multi-GPU
settings where multiple processors/workers/clients have access to local
stochastic dual vectors. This setting includes a broad range of important
problems from distributed convex minimization to min-max and games.
Extra-gradient, which is a de facto algorithm for monotone VI problems, has not
been designed to be communication-efficient. To this end, we propose a
quantized generalized extra-gradient (Q-GenX), which is an unbiased and
adaptive compression method tailored to solve VIs. We provide an adaptive
step-size rule, which adapts to the respective noise profiles at hand and
achieves a fast rate of $\mathcal{O}(1/T)$ under relative noise and an
order-optimal rate of $\mathcal{O}(1/\sqrt{T})$ under absolute noise, and we show that
distributed training accelerates convergence. Finally, we validate our
theoretical results by providing real-world experiments and training generative
adversarial networks on multiple GPUs. Comment: International Conference on Learning Representations (ICLR 2023)
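The basic extra-gradient step that the abstract builds on can be sketched on a toy monotone problem. This is plain, single-worker extra-gradient with a fixed step size. It includes neither the quantization nor the adaptive step-size rule of Q-GenX. The bilinear game min_x max_y x*y, the step size, and the function names are assumptions made for illustration.

```python
import math


def operator(z):
    """Monotone operator F(x, y) = (y, -x) of the game min_x max_y x * y."""
    x, y = z
    return (y, -x)


def extra_gradient(z0, step=0.1, iters=2000):
    """Extra-gradient: extrapolate first, then update from the original point."""
    z = z0
    for _ in range(iters):
        g = operator(z)
        # Extrapolation step to a look-ahead point.
        z_half = (z[0] - step * g[0], z[1] - step * g[1])
        g_half = operator(z_half)
        # Update the original point using the look-ahead operator value.
        z = (z[0] - step * g_half[0], z[1] - step * g_half[1])
    return z


def single_step_method(z0, step=0.1, iters=2000):
    """Baseline without extrapolation (gradient descent-ascent)."""
    z = z0
    for _ in range(iters):
        g = operator(z)
        z = (z[0] - step * g[0], z[1] - step * g[1])
    return z


eg_norm = math.hypot(*extra_gradient((1.0, 1.0)))
baseline_norm = math.hypot(*single_step_method((1.0, 1.0)))
print(eg_norm, baseline_norm)
```

On this problem the unique solution is (0, 0): the extra-gradient iterates contract toward it, while the baseline without extrapolation spirals outward.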
Learning to Predict Combinatorial Structures
The major challenge in designing a discriminative learning algorithm for
predicting structured data is to address the computational issues arising from
the exponential size of the output space. Existing algorithms make different
assumptions to ensure efficient, polynomial time estimation of model
parameters. For several combinatorial structures, including cycles, partially
ordered sets, permutations and other graph classes, these assumptions do not
hold. In this thesis, we address the problem of designing learning algorithms
for predicting combinatorial structures by introducing two new assumptions: (i)
The first assumption is that a particular counting problem can be solved
efficiently. The consequence is a generalisation of the classical ridge
regression for structured prediction. (ii) The second assumption is that a
particular sampling problem can be solved efficiently. The consequence is a new
technique for designing and analysing probabilistic structured prediction
models. These results can be applied to solve several complex learning problems
including but not limited to multi-label classification, multi-category
hierarchical classification, and label ranking. Comment: PhD thesis, Department of Computer Science, University of Bonn
(submitted, December 2009)
AUTOMATING DATA-LAYOUT DECISIONS IN DOMAIN-SPECIFIC LANGUAGES
A long-standing challenge in High-Performance Computing (HPC) is the simultaneous achievement of programmer productivity and hardware computational efficiency. The challenge has been exacerbated by the onset of multi- and many-core CPUs and accelerators. Only a few expert programmers have been able to hand-code domain-specific data transformations and vectorization schemes needed to extract the best possible performance on such architectures. In this research, we examined the possibility of automating these methods by developing a Domain-Specific Language (DSL) framework. Our DSL approach extends C++14 by embedding into it a high-level data-parallel array language, and by using a domain-specific compiler to compile to hybrid-parallel code. We also implemented an array index-space transformation algebra within this high-level array language to manipulate array data-layouts and data-distributions. The compiler introduces a novel method for SIMD auto-vectorization based on array data-layouts. Our new auto-vectorization technique is shown to outperform the default auto-vectorization strategy by up to 40% for stencil computations. The compiler also automates distributed data movement with overlapping of local compute with remote data movement using polyhedral integer set analysis. Along with these main innovations, we developed a new technique using C++ template metaprogramming for developing embedded DSLs using C++. We also proposed a domain-specific compiler intermediate representation that simplifies data flow analysis of abstract DSL constructs. We evaluated our framework by constructing a DSL for the HPC grand-challenge domain of lattice quantum chromodynamics. Our DSL yielded performance gains of up to twice the flop rate over existing production C code for selected kernels. This gain in performance was obtained while using less than one-tenth the lines of code. 
The performance of this DSL was also competitive with the best hand-optimized and hand-vectorized code, and is an order of magnitude better than existing production DSLs. Doctor of Philosophy
Shape Design and Optimization for 3D Printing
In recent years, 3D printing technology has become increasingly popular, with widespread uses in rapid prototyping, design, art, education, medical applications, and the food and fashion industries. It enables distributed manufacturing, allowing users to easily produce customized 3D objects in the office or at home. The investment in 3D printing technology continues to drive down the cost of 3D printers, making them more affordable to consumers.
As 3D printing becomes more available, it also demands better computer algorithms to assist users in quickly and easily generating 3D content for printing. Creating 3D content often requires considerably more effort and skill than creating 2D content. In this work, I will study several aspects of 3D shape design and optimization for 3D printing. I start by discussing my work in geometric puzzle design, which is a popular application of 3D printing in recreational math and art. Given user-provided input figures, the goal is to compute the minimum (or best) set of geometric shapes that can satisfy the given constraints (such as dissection constraints). The puzzle design also has to consider feasibility, such as avoiding interlocking pieces. I present two optimization-based algorithms to automatically generate customized 3D geometric puzzles, which can be directly printed for users to enjoy. They are also great tools for geometry education.
Next, I discuss shape optimization for printing functional tools and parts. Although current 3D modeling software allows a novice user to easily design 3D shapes, the resulting shapes are not guaranteed to meet required physical strength. For example, a poorly designed stool may easily collapse when a person sits on the stool; a poorly designed wrench may easily break under force. I study new algorithms to help users strengthen functional shapes in order to meet specific physical properties. The algorithm uses an optimization-based framework: it performs geometric shape deformation and structural optimization iteratively to minimize mechanical stresses in the presence of forces assuming typical use scenarios. Physically-based simulation is performed at run-time to evaluate the functional properties of the shape (e.g., mechanical stresses based on finite element methods), and the optimizer makes use of this information to improve the shape. Experimental results show that my algorithm can successfully optimize various 3D shapes, such as chairs, tables, utility tools, to withstand higher forces, while preserving the original shape as much as possible.
To improve the efficiency of physics simulation for general shapes, I also introduce a novel, SPH-based sampling algorithm, which can provide better tetrahedralization for use in the physics simulator. My new modeling algorithm can greatly reduce the design time, allowing users to quickly generate functional shapes that meet required physical standards.
Frank-Wolfe Methods for Optimization and Machine Learning
In Chapter 2, we present the Frank-Wolfe algorithm (FW) and all necessary background material. We explain the projection-free and sparsity properties of the algorithm, provide motivation for real-world problems, and analyze the convergence rates and a lower bound on the complexity.
In Chapter 3, we review the complexity bounds of linear minimizations and projections on several sets commonly used in optimization, providing a rigorous support to the use of FW. We also propose two methods for projecting onto the lp-ball and the Birkhoff polytope respectively, and we analyze their complexity. Computational experiments for the l1-ball and the nuclear norm-ball are presented.
In Chapter 4, we identify the well-known drawback in FW, a naive zig-zagging phenomenon that slows down the algorithm. In response to this issue, we propose a boosting procedure generating descent directions better aligned with the negative gradients and preserving the projection-free property. Although the method is relatively simple and intuitive, it provides significant computational speedups over the state of the art on a variety of experiments.
In Chapter 5, we address the large-scale finite-sum optimization setting arising in many tasks of machine learning. Based on a sliding technique, we propose a generic template to integrate adaptive gradients into stochastic Frank-Wolfe algorithms in a practical way. Computational experiments on standard convex optimization problems and on the nonconvex training of neural networks demonstrate that the blend of the two methods is successful.
Both developments in Chapters 4 and 5 are motivated by the projection-free property of FW. In Chapter 6, we leverage the natural sparsity of the iterates generated by FW and study an application to the approximate Carathéodory problem. We show that FW generates a simple solution to the problem and that with no modification of the algorithm, better cardinality bounds can be established using existing convergence analysis of FW in different scenarios. We also consider a nonsmooth variant of FW.
In Chapter 7, we carry on with the sparsity property and we consider an extension of the Frank-Wolfe algorithm to the unconstrained setting. It addresses smooth convex optimization problems over the linear span of a given set and resembles the matching pursuit algorithm. We propose a blending method that combines fast convergence and high sparsity of the iterates. Computational experiments validate the purpose of our method. Ph.D.
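The projection-free and sparsity properties mentioned for Chapter 2 can be seen in a minimal sketch of the vanilla Frank-Wolfe algorithm. It is not taken from the thesis. The objective f(x) = 0.5 * ||x - b||^2, the l1-ball constraint, the step size 2 / (t + 2), and all names are assumptions chosen for illustration.

```python
def lmo_l1_ball(grad, radius=1.0):
    """Linear minimization oracle: the l1-ball vertex minimizing <grad, v>."""
    i = max(range(len(grad)), key=lambda k: abs(grad[k]))
    sign = 1.0 if grad[i] > 0 else -1.0
    vertex = [0.0] * len(grad)
    vertex[i] = -radius * sign
    return vertex


def objective(x, b):
    """f(x) = 0.5 * ||x - b||^2 (illustrative smooth convex objective)."""
    return 0.5 * sum((xi - bi) ** 2 for xi, bi in zip(x, b))


def frank_wolfe(b, radius=1.0, iters=1000):
    """Vanilla Frank-Wolfe with the standard step size 2 / (t + 2)."""
    x = [0.0] * len(b)  # feasible starting point
    for t in range(iters):
        grad = [xi - bi for xi, bi in zip(x, b)]
        vertex = lmo_l1_ball(grad, radius)
        gamma = 2.0 / (t + 2.0)
        # Convex combination keeps the iterate inside the ball: no projection.
        x = [(1.0 - gamma) * xi + gamma * vi for xi, vi in zip(x, vertex)]
    return x


b = [0.3, -0.2, 0.1]
x = frank_wolfe(b)
print(objective(x, b), sum(abs(xi) for xi in x))
```

Each iteration moves toward a single vertex of the ball, so the iterate after t steps is a convex combination of at most t vertices, and feasibility is maintained without ever computing a projection.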