126,434 research outputs found

    Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery

    The strength of modern generative models lies in their ability to be controlled through text-based prompts. Typical "hard" prompts are made from interpretable words and tokens, and must be hand-crafted by humans. There are also "soft" prompts, which consist of continuous feature vectors. These can be discovered using powerful optimization methods, but they cannot be easily interpreted, re-used across models, or plugged into a text-based interface. We describe an approach to robustly optimize hard text prompts through efficient gradient-based optimization. Our approach automatically generates hard text-based prompts for both text-to-image and text-to-text applications. In the text-to-image setting, the method creates hard prompts for diffusion models, allowing API users to easily generate, discover, and mix and match image concepts without prior knowledge on how to prompt the model. In the text-to-text setting, we show that hard prompts can be automatically discovered that are effective in tuning LMs for classification.
    Comment: 15 pages, 12 figures, Code is available at https://github.com/YuxinWenRick/hard-prompts-made-eas
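The abstract does not spell out the optimization loop, so the following is only a toy sketch of one common way to do gradient-based discrete optimization: keep a continuous ("soft") vector, project it onto the nearest vocabulary embedding to get a "hard" token, evaluate the gradient at that projected point, and apply the update to the soft vector. The function names, the two-dimensional vocabulary, the squared-distance loss, and the choice to return the best token seen are all assumptions made for illustration, not the authors' implementation.

```python
import math


def nearest_token(vec, vocab):
    """Index of the vocabulary embedding closest to vec (Euclidean distance)."""
    return min(range(len(vocab)), key=lambda i: math.dist(vec, vocab[i]))


def loss(embedding, target):
    """Toy objective (an assumption): squared distance to a target vector."""
    return sum((e - t) ** 2 for e, t in zip(embedding, target))


def optimize_hard_token(vocab, target, start, lr=0.2, steps=50):
    """Projected-gradient search for a single discrete token.

    The gradient is evaluated at the projected (hard) embedding but applied
    to the continuous (soft) vector. Returns the index of the best hard
    token encountered during the search.
    """
    soft = list(start)
    best_idx, best_loss = None, float("inf")
    for _ in range(steps):
        idx = nearest_token(soft, vocab)
        hard = vocab[idx]
        current = loss(hard, target)
        if current < best_loss:
            best_idx, best_loss = idx, current
        # Gradient of the toy loss with respect to the hard embedding.
        grad = [2.0 * (h - t) for h, t in zip(hard, target)]
        soft = [s - lr * g for s, g in zip(soft, grad)]
    return best_idx


# Hypothetical four-token vocabulary with 2-D embeddings.
vocab = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
best = optimize_hard_token(vocab, target=(0.9, 0.8), start=(0.0, 0.0))
```

Because the result is always a vocabulary index, it can be read as an interpretable token, which is the property the abstract contrasts with soft prompts.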

    A Comparison of Discrete and Continuous Neural Network Approaches to Solve the Class/Teacher Timetabling Problem

    This study explores the application of neural network-based heuristics to the class/teacher timetabling problem (CTTP). The paper begins by presenting the basic CTTP characteristics in terms of hard and soft constraints and proposing a formulation for the energy function required to map the problem within the artificial neural network model. There follow two distinct approaches to simulating neural network evolution. The first uses a Potts mean-field annealing simulation based on continuous Potts neurons, which has obtained favorable results in various combinatorial optimization problems. Afterwards, a discrete neural network simulation, based on discrete winner-take-all neurons, is proposed. The paper concludes with a comparison of the computational results taken from the application of both heuristics to hard hypothetical and real CTTP instances. This experiment demonstrates that the discrete approach performs better, in terms of solution quality as well as execution time.

    A Continuous Relaxation of Beam Search for End-to-end Training of Neural Sequence Models

    Beam search is a desirable choice of test-time decoding algorithm for neural sequence models because it potentially avoids search errors made by simpler greedy methods. However, typical cross entropy training procedures for these models do not directly consider the behaviour of the final decoding method. As a result, for cross-entropy trained models, beam decoding can sometimes yield reduced test performance when compared with greedy decoding. In order to train models that can more effectively make use of beam search, we propose a new training procedure that focuses on the final loss metric (e.g. Hamming loss) evaluated on the output of beam search. While well-defined, this "direct loss" objective is itself discontinuous and thus difficult to optimize. Hence, in our approach, we form a sub-differentiable surrogate objective by introducing a novel continuous approximation of the beam search decoding procedure. In experiments, we show that optimizing this new training objective yields substantially better results on two sequence tasks (Named Entity Recognition and CCG Supertagging) when compared with both cross entropy trained greedy decoding and cross entropy trained beam decoding baselines.
    Comment: Updated for clarity and notational consistenc

    Optimization of Excitation in FDTD Method and Corresponding Source Modeling

    Source and excitation modeling in the FDTD formulation has a significant impact on the method's performance and the required simulation time. Since abrupt introduction of the source produces strong numerical variations throughout the computational domain, a generally accepted solution is to introduce the source slowly, using appropriate shaping functions in time. The main goal of the optimization presented in this paper is to find a balance between two opposing demands: minimal computation time and acceptable degradation of simulation performance. Reducing the time needed for source activation and deactivation is important, especially in the design of microwave structures, where the simulation is repeated many times during device parameter optimization. The proposed optimized source models are implemented and tested within an in-house FDTD simulation environment.
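To make the idea of a time-domain shaping function concrete, here is a minimal sketch that multiplies a sinusoidal source by a raised-cosine ramp. The abstract does not say which shaping functions the paper uses or optimizes, so the raised-cosine form, the function names, and the parameter `ramp_steps` are assumptions chosen for illustration.

```python
import math


def raised_cosine_ramp(n, ramp_steps):
    """Smooth envelope rising from 0 at step 0 to 1 at step ramp_steps."""
    if n <= 0:
        return 0.0
    if n >= ramp_steps:
        return 1.0
    return 0.5 * (1.0 - math.cos(math.pi * n / ramp_steps))


def ramped_source(n, dt, freq, ramp_steps):
    """Sinusoidal source value at time step n, scaled by the ramp envelope."""
    return raised_cosine_ramp(n, ramp_steps) * math.sin(2.0 * math.pi * freq * n * dt)


# Envelope samples for a hypothetical 100-step activation window.
envelope = [raised_cosine_ramp(n, 100) for n in range(0, 121, 10)]
```

In this sketch `ramp_steps` is the knob behind the trade-off the abstract describes: a smaller value shortens source activation and therefore the run, while a larger value introduces the source more gently.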