Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery
The strength of modern generative models lies in their ability to be
controlled through text-based prompts. Typical "hard" prompts are made from
interpretable words and tokens, and must be hand-crafted by humans. There are
also "soft" prompts, which consist of continuous feature vectors. These can be
discovered using powerful optimization methods, but they cannot be easily
interpreted, re-used across models, or plugged into a text-based interface.
We describe an approach to robustly optimize hard text prompts through
efficient gradient-based optimization. Our approach automatically generates
hard text-based prompts for both text-to-image and text-to-text applications.
In the text-to-image setting, the method creates hard prompts for diffusion
models, allowing API users to easily generate, discover, and mix and match
image concepts without prior knowledge on how to prompt the model. In the
text-to-text setting, we show that hard prompts can be automatically discovered
that are effective in tuning LMs for classification.
Comment: 15 pages, 12 figures. Code is available at
https://github.com/YuxinWenRick/hard-prompts-made-eas
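The abstract does not spell out the update rule, so the following is only a minimal sketch of one generic way to optimize a discrete prompt with gradients: keep a continuous copy of the prompt, project it onto the nearest vocabulary embedding, evaluate the loss and gradient at that projected (hard) point, and apply the gradient to the continuous copy. The names `nearest_token`, `optimize_hard_prompt`, and `toy_loss_and_grad`, the squared Euclidean distance, and the toy quadratic loss are all assumptions made for illustration; they are not taken from the paper or its code.

```python
def nearest_token(vec, vocab):
    """Index of the vocabulary embedding closest to vec (squared Euclidean distance)."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(vocab)), key=lambda i: sqdist(vec, vocab[i]))


def optimize_hard_prompt(soft, vocab, loss_and_grad, lr=0.1, steps=20):
    """Projected-gradient sketch: gradients are taken at the projected hard
    prompt but applied to the continuous copy. Returns the best hard prompt seen."""
    best_ids, best_loss = None, float("inf")
    for _ in range(steps):
        ids = [nearest_token(v, vocab) for v in soft]   # projection step
        hard = [vocab[i] for i in ids]
        loss, grads = loss_and_grad(hard)               # evaluated on the hard prompt
        if loss < best_loss:
            best_ids, best_loss = ids, loss
        soft = [[x - lr * g for x, g in zip(v, gv)]     # update the continuous copy
                for v, gv in zip(soft, grads)]
    return best_ids, best_loss


# Hypothetical toy setup: a 4-token vocabulary in 2-D and a quadratic loss
# that pulls every prompt position toward TARGET.
VOCAB = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
TARGET = [0.9, 0.8]


def toy_loss_and_grad(hard):
    loss = sum((x - t) ** 2 for e in hard for x, t in zip(e, TARGET))
    grads = [[2.0 * (x - t) for x, t in zip(e, TARGET)] for e in hard]
    return loss, grads


best_ids, best_loss = optimize_hard_prompt([[0.1, 0.0]], VOCAB, toy_loss_and_grad)
```

In this toy run the continuous copy drifts from token 0 toward token 3, the vocabulary entry closest to `TARGET`. Tracking the best projected prompt matters because the projection can oscillate between neighbouring tokens in later steps.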
A Comparison of Discrete and Continuous Neural Network Approaches to Solve the Class/Teacher Timetabling Problem
This study explores the application of neural network-based heuristics to the class/teacher timetabling problem (CTTP). The paper begins by presenting the basic CTTP characteristics in terms of hard and soft constraints and proposing a formulation for the energy function required to map the problem within the artificial neural network model. There follow two distinct approaches to simulating neural network evolution. The first uses a Potts mean-field annealing simulation based on continuous Potts neurons, which has obtained favorable results in various combinatorial optimization problems. Afterwards, a discrete neural network simulation, based on discrete winner-take-all neurons, is proposed. The paper concludes with a comparison of the computational results taken from the application of both heuristics to hard hypothetical and real CTTP instances. This experiment demonstrates that the discrete approach performs better, in terms of solution quality as well as execution time.
A Continuous Relaxation of Beam Search for End-to-end Training of Neural Sequence Models
Beam search is a desirable choice of test-time decoding algorithm for neural
sequence models because it potentially avoids search errors made by simpler
greedy methods. However, typical cross entropy training procedures for these
models do not directly consider the behaviour of the final decoding method. As
a result, for cross-entropy trained models, beam decoding can sometimes yield
reduced test performance when compared with greedy decoding. In order to train
models that can more effectively make use of beam search, we propose a new
training procedure that focuses on the final loss metric (e.g. Hamming loss)
evaluated on the output of beam search. While well-defined, this "direct loss"
objective is itself discontinuous and thus difficult to optimize. Hence, in our
approach, we form a sub-differentiable surrogate objective by introducing a
novel continuous approximation of the beam search decoding procedure. In
experiments, we show that optimizing this new training objective yields
substantially better results on two sequence tasks (Named Entity Recognition
and CCG Supertagging) when compared with both cross entropy trained greedy
decoding and cross entropy trained beam decoding baselines.
Comment: Updated for clarity and notational consistency
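To illustrate why the "direct loss" objective is hard to optimize and what a continuous surrogate buys, here is a minimal sketch. It contrasts the discrete Hamming loss with an expected Hamming loss under a temperature-scaled softmax. This is a simplified per-position relaxation chosen for illustration, not the paper's relaxation of beam search; the function names and the `temperature` parameter are assumptions.

```python
import math


def hamming_loss(pred, gold):
    """Number of positions where the predicted label differs from the gold label."""
    if len(pred) != len(gold):
        raise ValueError("sequences must have equal length")
    return sum(p != g for p, g in zip(pred, gold))


def softmax(scores, temperature=1.0):
    """Temperature-scaled softmax; approaches a one-hot argmax as temperature -> 0."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - m) / temperature) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def soft_hamming_loss(score_rows, gold, temperature=1.0):
    """Expected Hamming loss when each position's label is drawn from the softmax
    of its scores. Unlike hamming_loss, this varies smoothly with the scores."""
    return sum(1.0 - softmax(row, temperature)[g]
               for row, g in zip(score_rows, gold))


# Hypothetical scores for a 2-position sequence with 2 labels per position.
scores = [[5.0, 0.0], [0.0, 5.0]]
gold = [0, 0]
hard_pred = [max(range(len(r)), key=r.__getitem__) for r in scores]  # greedy argmax

hard = hamming_loss(hard_pred, gold)
soft = soft_hamming_loss(scores, gold, temperature=0.1)
```

At low temperature the soft loss approaches the hard one. At higher temperature it is smoother, but it is a looser approximation of the decoder's actual output.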
Optimization of Excitation in FDTD Method and Corresponding Source Modeling
Source and excitation modeling in the FDTD formulation has a significant impact on the method's performance and the required simulation time. Since introducing the source abruptly produces strong numerical variations throughout the computational domain, a generally accepted solution is to introduce the source slowly, using appropriate shaping functions in time. The main goal of the optimization presented in this paper is to find a balance between two opposing demands: minimal computation time and acceptable degradation of simulation performance. Reducing the time necessary for source activation and deactivation is an important issue, especially in the design of microwave structures, where the simulation is repeated many times during device parameter optimization. The proposed optimized source models are implemented and tested within an FDTD simulation environment developed by the authors.
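The idea of introducing a source slowly through a shaping function can be sketched as follows. The raised-cosine envelope is one common choice and is used here only as an assumption; the abstract does not say which shaping functions the authors optimize. The names `raised_cosine_ramp`, `ramped_source`, and `t_ramp` are hypothetical.

```python
import math


def raised_cosine_ramp(t, t_ramp):
    """Envelope that rises smoothly from 0 to 1 over the interval [0, t_ramp]."""
    if t <= 0.0:
        return 0.0
    if t >= t_ramp:
        return 1.0
    return 0.5 * (1.0 - math.cos(math.pi * t / t_ramp))


def ramped_source(t, freq, t_ramp):
    """Sinusoidal excitation multiplied by the ramp, avoiding an abrupt turn-on."""
    return raised_cosine_ramp(t, t_ramp) * math.sin(2.0 * math.pi * freq * t)


# Example: a 1 GHz source ramped up over three periods, sampled every 50 ps.
freq = 1.0e9
t_ramp = 3.0 / freq
samples = [ramped_source(n * 5.0e-11, freq, t_ramp) for n in range(100)]
```

A shorter `t_ramp` reduces the number of time steps spent on activation but makes the turn-on more abrupt, which is the trade-off the abstract describes.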