Automatic Design of Neural Architectures for Recommendation and Ranking Tasks
Designing the right neural network architecture for a given machine-learning task is critical for performance. For example, the most appropriate neural networks for image classification, speech recognition, and click-through-rate prediction all differ from one another. This disclosure describes a framework for conducting searches for neural architectures that perform recommendation and ranking tasks.
Carbon-Efficient Neural Architecture Search
This work presents a novel approach to neural architecture search (NAS) that
aims to reduce energy costs and increase carbon efficiency during the model
design process. The proposed framework, called carbon-efficient NAS (CE-NAS),
consists of NAS evaluation algorithms with different energy requirements, a
multi-objective optimizer, and a heuristic GPU allocation strategy. CE-NAS
dynamically balances energy-efficient sampling and energy-consuming evaluation
tasks based on current carbon emissions. Using a recent NAS benchmark dataset
and two carbon traces, our trace-driven simulations demonstrate that CE-NAS
achieves better carbon and search efficiency than three baseline strategies.
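
As a rough illustration of the balancing idea above, the minimal Python
sketch below alternates between cheap architecture sampling and expensive
evaluation depending on the current grid carbon intensity. Every name in it
(the synthetic carbon trace, the threshold, the placeholder sampling and
evaluation steps) is an illustrative assumption, not CE-NAS's actual
interface.

import random

LOW_CARBON_THRESHOLD = 200.0  # gCO2/kWh; hypothetical cutoff


def carbon_intensity(hour: int) -> float:
    # Stand-in for a real carbon trace (grid gCO2/kWh by hour of day).
    return 150.0 + 100.0 * abs((hour % 24) - 12) / 12.0


def sample_architectures(n: int) -> list:
    # Cheap, energy-efficient sampling step (placeholder).
    return [{"arch_id": random.random()} for _ in range(n)]


def evaluate_architecture(arch: dict) -> float:
    # Expensive, energy-consuming evaluation step; a placeholder for a
    # full training-and-validation run.
    return random.random()


def schedule(num_gpus: int, hours: int) -> list:
    # Allocate GPUs between sampling and evaluation by carbon intensity.
    queue, results = [], []
    for hour in range(hours):
        if carbon_intensity(hour) >= LOW_CARBON_THRESHOLD:
            # Dirty-energy period: stick to cheap sampling, grow the queue.
            queue.extend(sample_architectures(num_gpus))
        else:
            # Clean-energy period: spend GPUs on expensive evaluations.
            for _ in range(min(num_gpus, len(queue))):
                arch = queue.pop(0)
                results.append((arch, evaluate_architecture(arch)))
    return results


if __name__ == "__main__":
    print(f"evaluated {len(schedule(num_gpus=4, hours=48))} architectures")

A real system would replace the placeholders with a supernet-based sampler,
full training runs, and the paper's multi-objective optimizer and GPU
allocation heuristic.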
Towards Automated Neural Interaction Discovery for Click-Through Rate Prediction
Click-Through Rate (CTR) prediction is one of the most important machine
learning tasks in recommender systems, driving personalized experiences for
billions of consumers. Neural architecture search (NAS), as an emerging field,
has demonstrated its capabilities in discovering powerful neural network
architectures, which motivates us to explore its potential for CTR prediction.
Due to 1) diverse unstructured feature interactions, 2) heterogeneous feature
space, and 3) high data volume and intrinsic data randomness, it is challenging
to construct, search, and compare different architectures effectively for
recommendation models. To address these challenges, we propose an automated
interaction architecture discovering framework for CTR prediction named
AutoCTR. By modularizing simple yet representative interactions as virtual
building blocks and wiring them into a space of directed acyclic graphs,
AutoCTR performs evolutionary architecture exploration with learning-to-rank
guidance at the architecture level and achieves acceleration using a
low-fidelity model. Empirical analysis demonstrates the effectiveness of
AutoCTR on different datasets compared to human-crafted architectures. The
discovered architectures also generalize and transfer across different
datasets.
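
To make the block-DAG idea concrete, the following minimal evolutionary
search sketch in Python wires hypothetical interaction blocks into a
directed acyclic graph and evolves a population under a cheap proxy fitness.
The block names, mutation rule, and fitness placeholder are assumptions for
illustration; the learning-to-rank guidance used by AutoCTR is omitted.

import random

BLOCK_TYPES = ["mlp", "dot_product", "fm"]  # hypothetical building blocks


def random_dag(num_nodes: int) -> list:
    # Edges (i, j) with i < j, so the wiring is acyclic by construction.
    return [(i, j) for j in range(1, num_nodes)
            for i in range(j) if random.random() < 0.5]


def random_candidate(num_nodes: int = 4) -> dict:
    return {"blocks": [random.choice(BLOCK_TYPES) for _ in range(num_nodes)],
            "edges": random_dag(num_nodes)}


def mutate(cand: dict) -> dict:
    # Swap one block type; a real search would also rewire edges.
    child = {"blocks": list(cand["blocks"]), "edges": list(cand["edges"])}
    idx = random.randrange(len(child["blocks"]))
    child["blocks"][idx] = random.choice(BLOCK_TYPES)
    return child


def low_fidelity_fitness(cand: dict) -> float:
    # Placeholder for a cheap proxy, e.g. training on a data subsample.
    return random.random()


def evolve(generations: int = 20, population_size: int = 8) -> dict:
    population = [random_candidate() for _ in range(population_size)]
    for _ in range(generations):
        ranked = sorted(population, key=low_fidelity_fitness, reverse=True)
        parents = ranked[:population_size // 2]
        children = [mutate(random.choice(parents))
                    for _ in range(population_size - len(parents))]
        population = parents + children
    return max(population, key=low_fidelity_fitness)


if __name__ == "__main__":
    print(evolve())

Restricting edges to pairs (i, j) with i < j is one simple way to realize a
space of directed acyclic graphs, since no backward edge can ever close a
cycle.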
BANANAS: Bayesian Optimization with Neural Architectures for Neural Architecture Search
Over the past half-decade, many methods have been considered for neural
architecture search (NAS). Bayesian optimization (BO), which has long had
success in hyperparameter optimization, has recently emerged as a very
promising strategy for NAS when it is coupled with a neural predictor. Recent
work has proposed different instantiations of this framework, for example,
using Bayesian neural networks or graph convolutional networks as the
predictive model within BO. However, the analyses in these papers often focus
on the full-fledged NAS algorithm, so it is difficult to tell which individual
components of the framework lead to the best performance.
In this work, we give a thorough analysis of the "BO + neural predictor"
framework by identifying five main components: the architecture encoding,
neural predictor, uncertainty calibration method, acquisition function, and
acquisition optimization strategy. We test several different methods for each
component and also develop a novel path-based encoding scheme for neural
architectures, which we show theoretically and empirically scales better than
other encodings. Using all of our analyses, we develop a final algorithm called
BANANAS, which achieves state-of-the-art performance on NAS search spaces. We
adhere to the NAS research checklist (Lindauer and Hutter 2019) to facilitate
best practices, and our code is available at
https://github.com/naszilla/naszilla.
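
The path-based encoding is the most self-contained component described
above: each cell (a DAG) maps to a binary vector with one feature per
possible sequence of operations along an input-to-output path. The Python
sketch below implements that idea for a NAS-Bench-101-style cell; the helper
names and the toy cell are assumptions for illustration, not code from the
naszilla repository.

from itertools import product

OPS = ["conv3x3", "conv1x1", "maxpool3x3"]  # NAS-Bench-101 operation set


def paths(adjacency: list, ops: list) -> set:
    # All op sequences along paths from node 0 (input) to the last node
    # (output); the input and output nodes themselves carry no operation.
    n = len(adjacency)
    found = set()

    def dfs(node, trail):
        if node == n - 1:
            found.add(trail)
            return
        for nxt in range(n):
            if adjacency[node][nxt]:
                step = () if nxt == n - 1 else (ops[nxt],)
                dfs(nxt, trail + step)

    dfs(0, ())
    return found


def path_encoding(adjacency: list, ops: list, max_len: int = 3) -> list:
    # One binary feature per possible op sequence of length 0..max_len;
    # a feature is 1 iff that sequence occurs as a path in the cell.
    present = paths(adjacency, ops)
    return [int(seq in present)
            for length in range(max_len + 1)
            for seq in product(OPS, repeat=length)]


# Toy 4-node cell: input -> conv3x3 -> output, plus an input -> output skip.
adj = [[0, 1, 0, 1],
       [0, 0, 0, 1],
       [0, 0, 0, 0],
       [0, 0, 0, 0]]
ops = ["input", "conv3x3", "unused", "output"]
encoding = path_encoding(adj, ops)
print(sum(encoding), "of", len(encoding), "path features are set")

Because the feature count grows exponentially with the maximum path length,
practical uses of the encoding typically truncate it; the scaling behavior
claimed in the abstract concerns exactly this trade-off.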