6,656 research outputs found
Neural Task Programming: Learning to Generalize Across Hierarchical Tasks
In this work, we propose a novel robot learning framework called Neural Task
Programming (NTP), which bridges the idea of few-shot learning from
demonstration and neural program induction. NTP takes as input a task
specification (e.g., video demonstration of a task) and recursively decomposes
it into finer sub-task specifications. These specifications are fed to a
hierarchical neural program, where bottom-level programs are callable
subroutines that interact with the environment. We validate our method in three
robot manipulation tasks. NTP achieves strong generalization across sequential
tasks that exhibit hierarchical and compositional structures. The experimental
results show that NTP learns to generalize well towards unseen tasks with
increasing lengths, variable topologies, and changing objectives.
Comment: ICRA 201
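The recursive decomposition described in this abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the task specification is a plain list of step names, `decompose` is a hand-written splitter, and the bottom-level "program" only records a string. In NTP itself these pieces are learned neural components, which this sketch does not attempt to reproduce.

```python
from typing import List, Sequence


def decompose(spec: Sequence[str]) -> List[Sequence[str]]:
    """Hypothetical splitter: cut a specification into two halves."""
    mid = len(spec) // 2
    return [spec[:mid], spec[mid:]]


def run(spec: Sequence[str], trace: List[str], depth: int = 0) -> None:
    """Recursively decompose `spec` and call a primitive at the bottom level."""
    if len(spec) <= 1:
        if spec:
            # Bottom-level program: stands in for a subroutine that would
            # interact with the environment.
            trace.append("  " * depth + f"execute({spec[0]})")
        return
    for sub_spec in decompose(spec):
        run(sub_spec, trace, depth + 1)


# Toy task specification (assumed): four primitive steps of a pick-and-place.
demo = ["reach", "grasp", "move", "release"]
trace: List[str] = []
run(demo, trace)
print("\n".join(trace))
```

The indentation of each printed line reflects the depth at which the primitive was reached, which makes the hierarchy of the decomposition visible.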
Data-Augmented Structure-Property Mapping for Accelerating Computational Design of Advanced Material Systems
Advanced material systems are materials that combine multiple traditional constituents with complex microstructure morphologies, which give them superior properties over conventional materials. This dissertation is motivated by the grand challenge of accelerating the design of advanced material systems through systematic optimization with respect to material microstructures or processing settings. While optimization techniques have mature applications to a large range of engineering systems, their application to material design meets unique challenges due to the high dimensionality of microstructures and the high cost of computing process-structure-property (PSP) mappings. The key to addressing these challenges is learning material representations and predictive PSP mappings while managing a small data acquisition budget. This dissertation thus focuses on developing learning mechanisms that leverage context-specific meta-data and physics-based theories. Two research tasks are conducted. In the first, we develop a statistical generative model that learns to characterize high-dimensional microstructure samples using low-dimensional features. We improve the data efficiency of a variational autoencoder by introducing a morphology loss into the training. We demonstrate that the resultant microstructure generator is morphology-aware when trained on a small set of material samples, and can effectively constrain the microstructure space during material design. In the second task, we investigate an active learning mechanism where new samples are acquired based on their violation of a theory-driven constraint on the physics-based model.
We demonstrate using a topology optimization case that, while data acquisition through the physics-based model is often expensive (e.g., obtaining microstructures through simulation or optimization processes), the evaluation of the constraint can be far more affordable (e.g., checking whether a solution is optimal or at equilibrium). We show that this theory-driven learning algorithm can lead to much improved learning efficiency and generalization performance when such constraints can be derived. The outcome of this research is a better understanding of how physics knowledge about material systems can be integrated into machine learning frameworks to achieve more cost-effective and reliable learning of material representations and predictive models, which are essential to accelerating computational material design.
Dissertation/Thesis. Doctoral Dissertation, Mechanical Engineering 201
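The acquisition rule in the second task (pick new samples by how strongly the current model violates a cheap constraint) can be sketched as follows. This is a toy under stated assumptions, not the dissertation's algorithm: the "expensive" physics solution is taken to be y = sqrt(x), the cheap constraint is the residual |y^2 - x|, and `surrogate` is a crude linear guess. All names are hypothetical.

```python
from typing import Callable, Sequence


def constraint_violation(x: float, y: float) -> float:
    """Cheap check (assumed): a correct solution satisfies y**2 == x."""
    return abs(y * y - x)


def surrogate(x: float) -> float:
    """Hypothetical current model: linearisation of sqrt(x) around x = 1."""
    return 0.5 + 0.5 * x


def pick_next_sample(
    candidates: Sequence[float],
    predict: Callable[[float], float],
    violation: Callable[[float, float], float],
) -> float:
    """Return the candidate whose prediction violates the constraint most."""
    return max(candidates, key=lambda x: violation(x, predict(x)))


candidates = [0.0, 0.5, 1.0, 2.0, 4.0]
next_x = pick_next_sample(candidates, surrogate, constraint_violation)
print(next_x)  # the input where the surrogate is least consistent with the constraint
```

Only the selected input would then be sent to the expensive solver, so the costly model is queried where the surrogate is least trustworthy.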
Large-Scale Optical Neural Networks based on Photoelectric Multiplication
Recent success in deep neural networks has generated strong interest in
hardware accelerators to improve speed and energy consumption. This paper
presents a new type of photonic accelerator based on coherent detection that is
scalable to large () networks and can be operated at high (GHz)
speeds and very low (sub-aJ) energies per multiply-and-accumulate (MAC), using
the massive spatial multiplexing enabled by standard free-space optical
components. In contrast to previous approaches, both weights and inputs are
optically encoded so that the network can be reprogrammed and trained on the
fly. Simulations of the network using models for digit- and
image-classification reveal a "standard quantum limit" for optical neural
networks, set by photodetector shot noise. This bound, which can be as low as
50 zJ/MAC, suggests performance below the thermodynamic (Landauer) limit for
digital irreversible computation is theoretically possible in this device. The
proposed accelerator can implement both fully-connected and convolutional
networks. We also present a scheme for back-propagation and training that can
be performed in the same hardware. This architecture will enable a new class of
ultra-low-energy processors for deep learning.
Comment: Text: 10 pages, 5 figures, 1 table. Supplementary: 8 pages, 5
figures, 2 tables
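To put the 50 zJ/MAC figure in context, the short calculation below evaluates the Landauer bound k_B T ln 2 for erasing one bit. It then asks how many irreversible bit operations a digital MAC would need before its Landauer bound exceeds 50 zJ. The temperature of 300 K is an assumption, and the abstract does not state how many bit operations a digital MAC involves, so the last number is only a threshold and not a claim about any particular digital design.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def landauer_limit_joules(temperature_kelvin: float) -> float:
    """Minimum energy to erase one bit: k_B * T * ln(2)."""
    return K_B * temperature_kelvin * math.log(2)


TEMPERATURE_K = 300.0      # assumed room temperature
OPTICAL_MAC_J = 50e-21     # 50 zJ/MAC, the bound quoted in the abstract

energy_per_bit = landauer_limit_joules(TEMPERATURE_K)

# Smallest number of irreversible bit operations per digital MAC for which
# the Landauer bound is above the quoted optical figure.
threshold_bit_ops = math.ceil(OPTICAL_MAC_J / energy_per_bit)

print(f"Landauer limit at {TEMPERATURE_K:.0f} K: {energy_per_bit * 1e21:.2f} zJ per bit")
print(f"Bit operations needed to exceed 50 zJ: {threshold_bit_ops}")
```

At 300 K the bound is about 2.87 zJ per bit, so 50 zJ corresponds to roughly 17.4 bit erasures.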
Optimisation for Optical Data Centre Switching and Networking with Artificial Intelligence
Cloud and cluster computing platforms have become standard across almost every domain of business, and their scale quickly approaches servers in a single warehouse. However, the tier-based opto-electronically packet-switched network infrastructure that is standard across these systems gives rise to several scalability bottlenecks, including resource fragmentation and high energy requirements. Experimental results show that optical circuit-switched networks are a promising alternative that could avoid these.
However, optimality challenges are encountered at realistic commercial scales. Where exhaustive optimisation techniques are not applicable to Cloud-scale computer network problems, and expert-designed heuristics are performance-limited and typically biased in their design, artificial intelligence can discover more scalable and better-performing optimisation strategies.
This thesis demonstrates these benefits through experimental and theoretical work spanning the component, system and commercial optimisation problems which stand in the way of practical Cloud-scale computer network systems. Firstly, optical components are optimised to gate in and are demonstrated in a proof-of-concept switching architecture for optical data centres with better wavelength and component scalability than previous demonstrations. Secondly, network-aware resource allocation schemes for optically composable data centres are learnt end-to-end with deep reinforcement learning and graph neural networks, where fewer networking resources are required to achieve the same resource efficiency compared to conventional methods. Finally, a method based on deep reinforcement learning for optimising PID-control parameters is presented which generates tailored parameters for unseen devices in . This method is demonstrated on a market-leading optical switching product based on piezoelectric actuation, where switching speed is improved with no compromise to optical loss and the manufacturing yield of actuators is improved. This method was licensed to and integrated within the manufacturing pipeline of this company. As such, crucial public and private infrastructure utilising these products will benefit from this work.
Real-time topology optimization via learnable mappings
In traditional topology optimization, the computing time required to
iteratively update the material distribution within a design domain strongly
depends on the complexity or size of the problem, limiting its application in
real engineering contexts. This work proposes a multi-stage machine learning
strategy that aims to predict an optimal topology and the related stress fields
of interest, either in 2D or 3D, without resorting to any iterative analysis
and design process. The overall topology optimization is treated as a regression
task in a low-dimensional latent space that encodes the variability of the
target designs. First, a fully-connected model is employed to surrogate the
functional link between the parametric input space characterizing the design
problem and the latent space representation of the corresponding optimal
topology. The decoder branch of an autoencoder is then exploited to reconstruct
the desired optimal topology from its latent representation. The deep learning
models are trained on a dataset generated through a standard method of topology
optimization implementing the solid isotropic material with penalization, for
varying boundary and loading conditions. The underlying hypothesis behind the
proposed strategy is that optimal topologies share enough common patterns to be
compressed into small latent space representations without significant
information loss. Results relevant to a 2D Messerschmitt-Bölkow-Blohm beam
and a 3D bridge case demonstrate the capabilities of the proposed framework to
provide accurate optimal topology predictions in a fraction of a second.
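The pipeline in this abstract (compress designs into a latent space, regress from problem parameters to the latent code, decode the code into a topology) can be sketched in pure Python. Simpler stand-ins are swapped in on purpose: a one-dimensional linear autoencoder found by power iteration replaces the deep autoencoder, a least-squares line replaces the fully-connected model, and the "topologies" are made-up four-element density vectors. All names and data are assumptions for illustration.

```python
from typing import List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def fit_linear_autoencoder(designs: List[Vector], iters: int = 50) -> Tuple[Vector, Vector]:
    """Return (mean, direction): a 1-D linear encoder/decoder via power iteration."""
    n, d = len(designs), len(designs[0])
    mean = [sum(x[j] for x in designs) / n for j in range(d)]
    centered = [[x[j] - mean[j] for j in range(d)] for x in designs]
    v = [1.0] * d  # assumes the data are not orthogonal to this start vector
    for _ in range(iters):
        scores = [dot(c, v) for c in centered]
        w = [sum(s * c[j] for s, c in zip(scores, centered)) for j in range(d)]
        norm = dot(w, w) ** 0.5
        v = [x / norm for x in w]
    return mean, v


def encode(x: Vector, mean: Vector, v: Vector) -> float:
    return dot([a - m for a, m in zip(x, mean)], v)


def decode(z: float, mean: Vector, v: Vector) -> Vector:
    return [m + z * a for m, a in zip(mean, v)]


def fit_line(ps: List[float], zs: List[float]) -> Tuple[float, float]:
    """Least-squares fit z = slope * p + intercept (stand-in for the regressor)."""
    p_bar, z_bar = sum(ps) / len(ps), sum(zs) / len(zs)
    slope = sum((p - p_bar) * (z - z_bar) for p, z in zip(ps, zs)) / sum(
        (p - p_bar) ** 2 for p in ps
    )
    return slope, z_bar - slope * p_bar


# Made-up training set: densities vary linearly with one load parameter p.
PATTERN = [0.04, -0.08, 0.12, 0.02]
train_params = [0.0, 1.0, 2.0, 3.0]
train_designs = [[0.5 + p * w for w in PATTERN] for p in train_params]

mean, direction = fit_linear_autoencoder(train_designs)
latents = [encode(x, mean, direction) for x in train_designs]
slope, intercept = fit_line(train_params, latents)

# One-shot prediction for an unseen parameter: regress, then decode.
predicted = decode(slope * 2.5 + intercept, mean, direction)
print([round(x, 3) for x in predicted])
```

Prediction here needs no iterative optimization loop, only one regression evaluation and one decode, which is the property the abstract relies on for its speed.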
Anomaly Detection, Rule Adaptation and Rule Induction Methodologies in the Context of Automated Sports Video Annotation.
Automated video annotation is a topic of considerable interest in computer vision due to its applications in video search, object-based video encoding and enhanced broadcast content. The domain of sport broadcasting is, in particular, the subject of current research attention due to its fixed, rule-governed content. This research work aims to develop, analyze and demonstrate novel methodologies that can be useful in the context of adaptive and automated video annotation systems. In this thesis, we present methodologies for addressing the problems of anomaly detection, rule adaptation and rule induction for court-based sports such as tennis and badminton. We first introduce an HMM induction strategy for a court-model-based method that uses the court structure in the form of a lattice for two related modalities of singles and doubles tennis to tackle the problems of anomaly detection and rectification. We also introduce another anomaly detection methodology that is based on the disparity between the low-level vision-based classifiers and the high-level contextual classifier. An approach to the problem of rule adaptation that employs convex hulling of the anomalous states is also proposed. We also investigate a number of novel hierarchical HMM generating methods for stochastic induction of game rules. These methodologies include Cartesian product Label-based Hierarchical Bottom-up Clustering (CLHBC), which employs prior information within the label structures. A new constrained variant of the classical Chinese Restaurant Process (CRP) that is relevant to sports games is also introduced. We also propose two hybrid methodologies in this context and compare them against the flat Markov model. Finally, we show that these methods generalize to other rule-based environments.
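For readers unfamiliar with the Chinese Restaurant Process mentioned above, the sketch below samples from the classical, unconstrained CRP. The constrained variant introduced in the thesis is not described in the abstract and is not reproduced here. The function name, the concentration value `alpha` and the fixed seed are assumptions chosen for illustration.

```python
import random
from typing import List


def sample_crp(n_customers: int, alpha: float, rng: random.Random) -> List[int]:
    """Sample table assignments from the classical CRP (requires alpha > 0).

    Customer n + 1 joins existing table k with probability count_k / (n + alpha)
    and opens a new table with probability alpha / (n + alpha).
    """
    table_counts: List[int] = []
    assignments: List[int] = []
    for n in range(n_customers):
        r = rng.uniform(0.0, n + alpha)
        cumulative = 0.0
        for k, count in enumerate(table_counts):
            cumulative += count
            if r < cumulative:
                table_counts[k] += 1
                assignments.append(k)
                break
        else:
            # r fell in the final interval of width alpha: open a new table.
            table_counts.append(1)
            assignments.append(len(table_counts) - 1)
    return assignments


rng = random.Random(0)  # fixed seed so repeated runs give the same draw
assignments = sample_crp(20, alpha=1.0, rng=rng)
print(assignments)
```

Larger values of `alpha` make new tables more likely, so the expected number of clusters grows with `alpha`.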
- …