Permutation Equivariant Neural Functionals
This work studies the design of neural networks that can process the weights
or gradients of other neural networks, which we refer to as neural functional
networks (NFNs). Despite a wide range of potential applications, including
learned optimization, processing implicit neural representations, network
editing, and policy evaluation, there are few unifying principles for designing
effective architectures that process the weights of other networks. We approach
the design of neural functionals through the lens of symmetry, in particular by
focusing on the permutation symmetries that arise in the weights of deep
feedforward networks because hidden layer neurons have no inherent order. We
introduce a framework for building permutation equivariant neural functionals,
whose architectures encode these symmetries as an inductive bias. The key
building blocks of this framework are NF-Layers (neural functional layers) that
we constrain to be permutation equivariant through an appropriate parameter
sharing scheme. In our experiments, we find that permutation equivariant neural
functionals are effective on a diverse set of tasks that require processing the
weights of MLPs and CNNs, such as predicting classifier generalization,
producing "winning ticket" sparsity masks for initializations, and classifying
or editing implicit neural representations (INRs). In addition, we provide code
for our models and experiments at https://github.com/AllanYangZhou/nfn.
Comment: To appear in Neural Information Processing Systems (NeurIPS), 202
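The symmetry the abstract appeals to (hidden neurons have no inherent order, so a layer should commute with their permutation) can be illustrated with a minimal equivariant layer in the Deep Sets style. This is a generic sketch of equivariance via parameter sharing, not the paper's actual NF-Layers, which act on full weight-space objects; the weights `lam` and `gam` are illustrative.

```python
import numpy as np

def equivariant_layer(x, lam, gam):
    """Minimal permutation-equivariant layer (Deep Sets style): each row is
    updated with a shared per-element weight plus a pooled summary, so any
    reordering of the rows reorders the output the same way."""
    return lam * x + gam * x.mean(axis=0, keepdims=True)

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))   # 5 interchangeable "neurons", 3 features each
perm = rng.permutation(5)

y = equivariant_layer(x, lam=0.7, gam=0.3)
y_perm = equivariant_layer(x[perm], lam=0.7, gam=0.3)

# Equivariance check: permute-then-apply equals apply-then-permute.
assert np.allclose(y_perm, y[perm])
```

Because the only cross-element interaction is a permutation-invariant pooling, the parameter-sharing scheme enforces equivariance by construction, which is the inductive bias the abstract describes.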
A Genetic Programming Approach to Designing Convolutional Neural Network Architectures
The convolutional neural network (CNN), which is one of the deep learning
models, has seen much success in a variety of computer vision tasks. However,
designing CNN architectures still requires expert knowledge and a lot of trial
and error. In this paper, we attempt to automatically construct CNN
architectures for an image classification task based on Cartesian genetic
programming (CGP). In our method, we adopt highly functional modules, such as
convolutional blocks and tensor concatenation, as the node functions in CGP.
The CNN structure and connectivity represented by the CGP encoding method are
optimized to maximize the validation accuracy. To evaluate the proposed method,
we constructed a CNN architecture for the image classification task with the
CIFAR-10 dataset. The experimental results show that the proposed method can
automatically find CNN architectures that are competitive with
state-of-the-art models.
Comment: This is the revised version of the GECCO 2017 paper. The code of our
method is available at https://github.com/sg-nm/cgp-cn
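The CGP encoding described above, a genome of node genes decoded into an active subgraph, can be sketched as follows. The function names are hypothetical stand-ins for the paper's convolutional blocks and tensor concatenation, and the genome layout is a simplified single-row CGP, not the authors' exact representation.

```python
import random

# Hypothetical node functions standing in for conv blocks / concatenation.
FUNCS = ["conv3x3", "conv5x5", "pool", "concat"]

def random_genome(n_nodes, seed=0):
    """Each node gene: (function index, input1, input2); an input may be
    the image (index 0) or any earlier node. The final gene picks the
    output node."""
    rng = random.Random(seed)
    genome = []
    for i in range(1, n_nodes + 1):
        genome.append((rng.randrange(len(FUNCS)),
                       rng.randrange(i), rng.randrange(i)))
    out = rng.randrange(1, n_nodes + 1)
    return genome, out

def decode(genome, out):
    """Trace back from the output to collect the active subgraph;
    inactive nodes are simply ignored, a hallmark of CGP."""
    active, stack = set(), [out]
    while stack:
        node = stack.pop()
        if node == 0 or node in active:
            continue
        active.add(node)
        _, a, b = genome[node - 1]
        stack += [a, b]
    return [(n, FUNCS[genome[n - 1][0]]) for n in sorted(active)]

genome, out = random_genome(6)
arch = decode(genome, out)   # active nodes only, in evaluation order
```

An evolutionary loop would mutate these integer genes, decode each child into a CNN, train it, and use validation accuracy as the fitness to maximize.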
Recommended from our members
Targeted sequence design within the coarse-grained polymer genome
The chemical design of polymers with target structural and/or functional properties represents a grand challenge in materials science. While data-driven design approaches are promising, success with polymers has been limited, largely due to limitations in data availability. Here, we demonstrate the targeted sequence design of single-chain structure in polymers by combining coarse-grained modeling, machine learning, and model optimization. Nearly 2000 unique coarse-grained polymers are simulated to construct and analyze machine learning models. We find that deep neural networks inexpensively and reliably predict structural properties with limited sequence information as input. By coupling trained ML models with sequential model-based optimization, polymer sequences are proposed to exhibit globular, swollen, or rod-like behaviors, which are verified by explicit simulations. This work highlights the promising integration of coarse-grained modeling with data-driven design and represents a necessary and crucial step toward more complex polymer design efforts.
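The workflow described above (simulate sequences, fit a surrogate, optimize against it, verify by explicit simulation) can be sketched with a toy example. The sequence descriptors, the linear surrogate, and the scoring rule below are illustrative stand-ins for the paper's deep neural networks and coarse-grained simulations.

```python
import numpy as np

rng = np.random.default_rng(1)

def features(seq):
    """Toy sequence descriptors: intercept, monomer fraction, and a
    blockiness measure (fraction of identical adjacent pairs)."""
    frac = seq.mean()
    blocky = np.mean(seq[:-1] == seq[1:])
    return np.array([1.0, frac, blocky])

def simulate(seq):
    """Stand-in for a coarse-grained simulation returning a structural
    property (hidden rule the surrogate must learn)."""
    return 2.0 * seq.mean() - 1.5 * np.mean(seq[:-1] == seq[1:])

# Step 1: "simulate" a training set of random binary copolymer sequences.
train = [rng.integers(0, 2, size=20) for _ in range(200)]
X = np.stack([features(s) for s in train])
y = np.array([simulate(s) for s in train])

# Step 2: fit a surrogate model (a DNN in the paper; linear here).
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# Step 3: model-based optimization -- keep the best-predicted candidate.
cands = [rng.integers(0, 2, size=20) for _ in range(500)]
best = max(cands, key=lambda s: features(s) @ w)
```

Step 4 in the paper closes the loop: the proposed sequence is re-simulated explicitly to verify the targeted behavior before trusting the design.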
Machine learning-guided directed evolution for protein engineering
Machine learning (ML)-guided directed evolution is a new paradigm for
biological design that enables optimization of complex functions. ML methods
use data to predict how sequence maps to function without requiring a detailed
model of the underlying physics or biological pathways. To demonstrate
ML-guided directed evolution, we introduce the steps required to build ML
sequence-function models and use them to guide engineering, making
recommendations at each stage. This review covers basic concepts relevant to
using ML for protein engineering as well as the current literature and
applications of this new engineering paradigm. ML methods accelerate directed
evolution by learning from information contained in all measured variants and
using that information to select sequences that are likely to be improved. We
then provide two case studies that demonstrate the ML-guided directed evolution
process. We also look to future opportunities where ML will enable discovery of
new protein functions and uncover the relationship between protein sequence and
function.
Comment: Made significant revisions to focus on aspects most relevant to
applying machine learning to speed up directed evolution.
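The ML-guided loop the review describes (measure variants, fit a sequence-function model, select sequences likely to be improved) can be sketched minimally. The four-letter alphabet, the additive synthetic fitness landscape, and the ridge model below are illustrative assumptions, not the review's recommended pipeline.

```python
import numpy as np

rng = np.random.default_rng(2)
AA, L = 4, 8    # toy alphabet and sequence length (real proteins: 20 letters)

def one_hot(seq):
    """Flattened one-hot encoding, a common sequence featurization."""
    x = np.zeros((L, AA))
    x[np.arange(L), seq] = 1.0
    return x.ravel()

# Synthetic additive fitness landscape standing in for assay measurements.
true_w = rng.normal(size=L * AA)
def fitness(seq):
    return one_hot(seq) @ true_w

# Step 1: measure an initial library of random variants.
measured = [rng.integers(0, AA, size=L) for _ in range(300)]
X = np.stack([one_hot(s) for s in measured])
y = np.array([fitness(s) for s in measured])

# Step 2: fit a ridge sequence-function model (stand-in for richer models).
w = np.linalg.solve(X.T @ X + 1e-3 * np.eye(L * AA), X.T @ y)

# Step 3: screen a large in-silico library; carry top predictions forward.
library = [rng.integers(0, AA, size=L) for _ in range(2000)]
preds = np.array([one_hot(s) @ w for s in library])
top = [library[i] for i in np.argsort(preds)[-10:]]
best_selected = max(fitness(s) for s in top)
```

In practice the selected sequences would be synthesized and assayed, and the new measurements fed back into the model for the next round, which is what lets ML learn from all measured variants rather than only the current best.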
PowerPlanningDL: Reliability-Aware Framework for On-Chip Power Grid Design using Deep Learning
With the increase in the complexity of chip designs, VLSI physical design has
become a time-consuming task, which is an iterative design process. Power
planning is that part of the floorplanning in VLSI physical design where power
grid networks are designed in order to provide adequate power to all the
underlying functional blocks. Power planning also requires multiple iterative
steps to create the power grid network while satisfying the allowed worst-case
IR drop and Electromigration (EM) margins. For the first time, this paper
introduces a deep learning (DL)-based framework to approximately predict the
initial design of the power grid network under different reliability
constraints. The proposed framework eliminates many iterative design steps and
speeds up the total design cycle. A neural network-based multi-target
regression technique is used to create the DL model. Feature extraction is done, and the
training dataset is generated from the floorplans of some of the power grid
designs extracted from the IBM processor. The DL model is trained using the
generated dataset. The proposed DL-based framework is validated using a new set
of power grid specifications (obtained by perturbing the designs used in the
training phase). The results show that the predicted power grid design is
close to the original design, with minimal prediction error (~2%). The
proposed DL-based approach also improves the design cycle time with a speedup
of ~6X for standard power grid benchmarks.
Comment: Published in proceedings of the IEEE/ACM Design, Automation and Test
in Europe Conference (DATE) 2020, 6 pages.
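The core modeling idea, one model predicting several power-grid design targets at once from floorplan features, can be sketched as plain multi-target regression. The feature and target semantics below are hypothetical (block areas, power demands, per-region wire widths), and a linear map stands in for the paper's neural network.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical setup: 6 floorplan features (e.g. block areas, power
# demands, distances) mapped to 4 design targets (e.g. per-region
# power-grid wire widths). Linear ground truth plus measurement noise.
n_feat, n_targets = 6, 4
X = rng.normal(size=(100, n_feat))
W_true = rng.normal(size=(n_feat, n_targets))
Y = X @ W_true + 0.01 * rng.normal(size=(100, n_targets))

# Fit one model with several outputs at once (multi-target regression);
# a deep network would replace this linear least-squares fit.
W_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Validate on a perturbed specification, mirroring the paper's protocol
# of perturbing the training-phase designs to get new test cases.
x_new = X[0] + 0.05 * rng.normal(size=n_feat)
pred = x_new @ W_hat   # all targets predicted in a single forward pass
```

Predicting all targets jointly is what lets one inference step replace several iterations of the manual IR-drop/EM-constrained power-planning loop.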