Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery

Author: Geiping Jonas
Goldblum Micah
Goldstein Tom
Jain Neel
Kirchenbauer John
Wen Yuxin
Publication date: 01/06/2023

The strength of modern generative models lies in their ability to be controlled through text-based prompts. Typical "hard" prompts are made from interpretable words and tokens, and must be hand-crafted by humans. There are also "soft" prompts, which consist of continuous feature vectors. These can be discovered using powerful optimization methods, but they cannot be easily interpreted, re-used across models, or plugged into a text-based interface. We describe an approach to robustly optimize hard text prompts through efficient gradient-based optimization. Our approach automatically generates hard text-based prompts for both text-to-image and text-to-text applications. In the text-to-image setting, the method creates hard prompts for diffusion models, allowing API users to easily generate, discover, and mix and match image concepts without prior knowledge of how to prompt the model. In the text-to-text setting, we show that hard prompts can be automatically discovered that are effective in tuning LMs for classification.

Comment: 15 pages, 12 figures. Code is available at https://github.com/YuxinWenRick/hard-prompts-made-eas
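The abstract does not spell out the optimization procedure, so the following is only a minimal sketch of one way gradient-based optimization can yield discrete tokens: keep a continuous vector, project it onto the nearest vocabulary embedding, evaluate the gradient at that projected point, and apply the update to the continuous vector. The toy loss (squared distance to a target vector) stands in for a real model loss, and the names `nearest_token`, `optimize_hard_token`, `vocab`, and `target` are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_token(vec, vocab):
    """Index of the vocabulary embedding closest to vec."""
    return min(range(len(vocab)), key=lambda i: squared_distance(vec, vocab[i]))


def optimize_hard_token(vocab, target, start, lr=0.3, steps=50):
    """Projected gradient descent on a toy loss ||e - target||^2.

    The gradient is taken at the projected (discrete) embedding but applied
    to the continuous vector. The best discrete token seen so far is kept,
    because the projected iterate can oscillate between tokens.
    """
    soft = list(start)
    best_idx, best_loss = None, float("inf")
    for _ in range(steps):
        idx = nearest_token(soft, vocab)          # projection step
        hard = vocab[idx]
        loss = squared_distance(hard, target)     # toy stand-in for a model loss
        if loss < best_loss:
            best_idx, best_loss = idx, loss
        grad = [2.0 * (h - t) for h, t in zip(hard, target)]
        soft = [s - lr * g for s, g in zip(soft, grad)]
    return best_idx


# Toy vocabulary of four 2-D "token embeddings" (assumed for illustration).
vocab = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
best = optimize_hard_token(vocab, target=[0.9, 0.8], start=[0.0, 0.0])
```

Returning the best token seen, instead of the last iterate, is a design choice of this sketch: with a fixed step size the projection can cycle through several tokens.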

arXiv.org e-Print Archive

A Comparison of Discrete and Continuous Neural Network Approaches to Solve the Class/Teacher Timetabling Problem

Author: Carrasco Marco Paulo
Pato Margarida Vaz
Publication venue: Centro de Investigação Operacional - Universidade de Lisboa
Publication date: 01/01/2001

This study explores the application of neural network-based heuristics to the class/teacher timetabling problem (CTTP). The paper begins by presenting the basic CTTP characteristics in terms of hard and soft constraints and proposing a formulation for the energy function required to map the problem within the artificial neural network model. There follow two distinct approaches to simulating neural network evolution. The first uses a Potts mean-field annealing simulation based on continuous Potts neurons, which has obtained favorable results in various combinatorial optimization problems. Afterwards, a discrete neural network simulation, based on discrete winner-take-all neurons, is proposed. The paper concludes with a comparison of the computational results taken from the application of both heuristics to hard hypothetical and real CTTP instances. This experiment demonstrates that the discrete approach performs better, in terms of solution quality as well as execution time.
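The difference between the two neuron types can be sketched for a single multi-state neuron. This is a minimal illustration, assuming the usual mean-field form in which a continuous Potts neuron outputs a normalized exponential of its negated local energy costs at temperature T, while a winner-take-all neuron puts all its activity on the lowest-cost state. The paper's energy function and update schedule are not given in the abstract, and the names `potts_update`, `wta_update`, and `costs` are hypothetical.

```python
import math


def potts_update(costs, temperature):
    """Continuous Potts neuron: activities are positive and sum to 1."""
    lowest = min(costs)  # shift by the minimum for numerical stability
    weights = [math.exp(-(c - lowest) / temperature) for c in costs]
    total = sum(weights)
    return [w / total for w in weights]


def wta_update(costs):
    """Discrete winner-take-all neuron: one state is on, all others are off."""
    winner = min(range(len(costs)), key=costs.__getitem__)
    return [1.0 if i == winner else 0.0 for i in range(len(costs))]


# Assumed local energy costs of three candidate states (e.g. three time slots).
costs = [3.0, 1.0, 2.0]
hot = potts_update(costs, temperature=100.0)   # close to uniform
cold = potts_update(costs, temperature=0.01)   # close to the WTA output
discrete = wta_update(costs)
```

As the temperature is lowered during annealing, the Potts output approaches the winner-take-all output, which is why the two simulations can be compared on the same energy function.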

UTL Repository

A Continuous Relaxation of Beam Search for End-to-end Training of Neural Sequence Models

Author: Berg-Kirkpatrick Taylor
Dyer Chris
Goyal Kartik
Neubig Graham
Publication date: 06/10/2017

Beam search is a desirable choice of test-time decoding algorithm for neural sequence models because it potentially avoids search errors made by simpler greedy methods. However, typical cross entropy training procedures for these models do not directly consider the behaviour of the final decoding method. As a result, for cross-entropy trained models, beam decoding can sometimes yield reduced test performance when compared with greedy decoding. In order to train models that can more effectively make use of beam search, we propose a new training procedure that focuses on the final loss metric (e.g. Hamming loss) evaluated on the output of beam search. While well-defined, this "direct loss" objective is itself discontinuous and thus difficult to optimize. Hence, in our approach, we form a sub-differentiable surrogate objective by introducing a novel continuous approximation of the beam search decoding procedure. In experiments, we show that optimizing this new training objective yields substantially better results on two sequence tasks (Named Entity Recognition and CCG Supertagging) when compared with both cross entropy trained greedy decoding and cross entropy trained beam decoding baselines.

Comment: Updated for clarity and notational consistenc
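Why the direct loss is discontinuous, and what a continuous surrogate looks like, can be sketched for the simplest case of greedy decoding (a beam of size 1). This is not the paper's relaxation of beam search; it only replaces each per-position argmax with a temperature-controlled softmax, so that the Hamming loss becomes an expected mismatch count that varies smoothly with the scores. The names `greedy_decode`, `soft_hamming_loss`, and `score_rows` are hypothetical.

```python
import math


def hamming_loss(pred, gold):
    """Number of positions where the predicted tag differs from the gold tag."""
    return sum(p != g for p, g in zip(pred, gold))


def greedy_decode(score_rows):
    """Pick the highest-scoring tag at every position (piecewise constant in the scores)."""
    return [max(range(len(row)), key=row.__getitem__) for row in score_rows]


def softmax(scores, temperature=1.0):
    """Numerically stable softmax with a temperature."""
    top = max(scores)
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_hamming_loss(score_rows, gold, temperature=1.0):
    """Expected mismatches when each argmax is replaced by a softmax distribution."""
    return sum(1.0 - softmax(row, temperature)[g] for row, g in zip(score_rows, gold))


# Assumed toy scores: two positions, two tags each.
score_rows = [[2.0, 0.0], [0.0, 1.0]]
gold = [0, 0]
direct = hamming_loss(greedy_decode(score_rows), gold)
smooth = soft_hamming_loss(score_rows, gold, temperature=1.0)
sharp = soft_hamming_loss(score_rows, gold, temperature=0.01)
```

At low temperature the surrogate approaches the direct loss, while at higher temperature it gives useful gradients at the cost of a looser approximation.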

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Optimization of Excitation in FDTD Method and Corresponding Source Modeling

Author: Aleksic S.
Dimitrijevic B.
Nikolic B.
Raicevic N.
Publication venue: Brno University of Technology
Publication date: 01/04/2015

Source and excitation modeling in the FDTD formulation has a significant impact on the method's performance and the required simulation time. Since abrupt introduction of the source produces strong numerical variations throughout the computational domain, a generally accepted solution is to introduce the source slowly, using appropriate shaping functions in time. The main goal of the optimization presented in this paper is to find a balance between two opposing demands: minimal computation time and acceptable degradation of simulation performance. Reducing the time needed for source activation and deactivation is important, especially in the design of microwave structures, where the simulation is repeated many times while device parameters are optimized. The proposed optimized source models are implemented and tested in an in-house FDTD simulation environment.
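Gradual source introduction can be sketched as a sinusoidal excitation multiplied by a smooth envelope. This is a minimal illustration, assuming a raised-cosine ramp as the shaping function; the abstract does not say which shaping functions the authors optimize, and the names `raised_cosine_ramp`, `soft_source`, and `ramp_steps` are hypothetical.

```python
import math


def raised_cosine_ramp(n, ramp_steps):
    """Smooth envelope rising from 0 to 1 over ramp_steps time steps."""
    if n <= 0:
        return 0.0
    if n >= ramp_steps:
        return 1.0
    return 0.5 * (1.0 - math.cos(math.pi * n / ramp_steps))


def soft_source(n, dt, freq, ramp_steps):
    """Sinusoidal source value at time step n, scaled by the ramp envelope."""
    return raised_cosine_ramp(n, ramp_steps) * math.sin(2.0 * math.pi * freq * n * dt)


# Assumed example values: 1 GHz source, 1 ps time step, 200-step activation ramp.
dt, freq, ramp_steps = 1e-12, 1e9, 200
excitation = [soft_source(n, dt, freq, ramp_steps) for n in range(1000)]
```

In this sketch `ramp_steps` is the quantity being traded off: a shorter ramp reaches full amplitude sooner and saves time steps, but moves the excitation closer to an abrupt switch-on.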

Directory of Open Access Journals

Digital library of Brno University of Technology