KL-Divergence Guided Two-Beam Viterbi Algorithm on Factorial HMMs
This thesis addresses the high computational complexity that arises when decoding hidden Markov models (HMMs) with a large number of states. A novel approach, the two-beam Viterbi algorithm, which adds an extra forward beam, is implemented on a system that uses a factorial HMM to simultaneously recognize a pair of isolated digits on one audio channel. The two-beam Viterbi algorithm uses KL-divergence and hierarchical clustering to reduce overall decoding complexity. This approach requires 60% less computation than the baseline Viterbi beam search while maintaining 82.5% recognition accuracy.
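The clustering step mentioned above — grouping HMM states whose emission distributions are close in KL-divergence — can be sketched as follows. This is a minimal illustration, not the thesis's implementation; the toy emission distributions, the symmetrized form of the KL-divergence, and the cluster count are all assumptions made for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

def sym_kl(p, q, eps=1e-12):
    """Symmetrized KL divergence between two discrete distributions."""
    p, q = p + eps, q + eps
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

# Toy emission distributions for 4 HMM states (each row sums to 1).
states = np.array([
    [0.70, 0.20, 0.10],
    [0.65, 0.25, 0.10],   # close to state 0
    [0.10, 0.20, 0.70],
    [0.10, 0.25, 0.65],   # close to state 2
])

# Pairwise symmetrized-KL distance matrix.
n = len(states)
dist = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        dist[i, j] = dist[j, i] = sym_kl(states[i], states[j])

# Agglomerative (hierarchical) clustering on that distance matrix;
# states sharing a cluster could then share work during beam decoding.
labels = fcluster(linkage(squareform(dist), method="average"),
                  t=2, criterion="maxclust")
print(labels)  # states 0/1 and 2/3 land in the same clusters
```

Once states are clustered, a decoder can evaluate one representative per cluster and expand only promising clusters, which is one plausible route to the computation savings reported.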
On relational homomorphisms of automata
This paper investigates relational homomorphisms and the closely associated concept of generalized congruence relations on automata that are, in general, incomplete, nondeterministic, and infinite. The concept of a generalized isomorphism, a natural extension of the isomorphism concept to nondeterministic automata, is also studied.
IMMA: Immunizing text-to-image Models against Malicious Adaptation
Advancements in text-to-image models and fine-tuning methods have led to an
increasing risk of malicious adaptation, i.e., fine-tuning to generate harmful
unauthorized content. Recent works, e.g., Glaze and MIST, have developed
data-poisoning techniques that protect data against adaptation methods. In
this work, we consider an alternative paradigm for protection: we propose to
``immunize'' the model (IMMA, for short) by learning model parameters on which
adaptation methods struggle when fine-tuning on malicious content. Empirical
results show IMMA's effectiveness against malicious adaptations, including
mimicry of artistic styles and learning of inappropriate/unauthorized
content, across three adaptation methods: LoRA, Textual Inversion, and
DreamBooth.
AmbiGen: Generating Ambigrams from Pre-trained Diffusion Model
Ambigrams are calligraphic designs that have different meanings depending on
the viewing orientation. Creating ambigrams is a challenging task even for
skilled artists, as it requires maintaining the meaning under two different
viewpoints at the same time. In this work, we propose to generate ambigrams by
distilling a large-scale vision and language diffusion model, namely DeepFloyd
IF, to optimize the letters' outline for legibility in the two viewing
orientations. Empirically, we demonstrate that our approach outperforms
existing ambigram generation methods. On the 500 most common words in English,
our method achieves more than an 11.6% increase in word accuracy and at least a
41.9% reduction in edit distance.
Comment: Project page: https://raymond-yeh.com/AmbiGen
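The core geometric constraint in the abstract above — a design must stay readable both right-side up and rotated 180 degrees — can be illustrated on binary rasters. This is only a toy check of the constraint, not the paper's diffusion-distillation method; the glyph bitmaps and the IoU "legibility" proxy are invented for the example.

```python
import numpy as np

def rot180(img):
    """Rotate a raster glyph by 180 degrees (the ambigram viewing flip)."""
    return np.rot90(img, 2)

def overlap(a, b):
    """Crude proxy for dual-orientation consistency: IoU of binary rasters."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union

# An 'N'-like glyph is invariant under a 180-degree rotation,
# so it matches itself perfectly when viewed upside down ...
glyph_n = np.array([
    [1, 0, 0, 1],
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [1, 0, 0, 1],
])
print(overlap(rot180(glyph_n), glyph_n))  # 1.0

# ... while an 'L'-like glyph is not.
glyph_l = np.array([
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
])
print(overlap(rot180(glyph_l), glyph_l))
```

An ambigram generator must satisfy a constraint of this kind for *two* target readings simultaneously, which is what makes the optimization over letter outlines hard.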
Stable and symmetric convolutional neural network
First, we present a proof that convolutional neural networks (CNNs) with max-norm regularization, max-pooling, and ReLU non-linearity are stable to additive noise. Second, we explore the use of symmetric and antisymmetric filters in a baseline CNN model on digit classification, which enjoys this stability to additive noise. Experimental results indicate that the symmetric CNN outperforms the baseline model for nearly all training-set sizes and matches the state-of-the-art deep net when training examples are limited.
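Any filter can be split exactly into a symmetric part and an antisymmetric part, which is the decomposition underlying symmetric/antisymmetric filter banks like the one mentioned above. The abstract does not specify which symmetry its filters use; the sketch below assumes symmetry under 180-degree rotation as one plausible choice, purely for illustration.

```python
import numpy as np

def symmetrize(w):
    """Split a 2-D filter into symmetric and antisymmetric parts
    under 180-degree rotation, so that w = w_sym + w_anti exactly."""
    flipped = np.rot90(w, 2)
    w_sym = 0.5 * (w + flipped)
    w_anti = 0.5 * (w - flipped)
    return w_sym, w_anti

rng = np.random.default_rng(0)
w = rng.standard_normal((3, 3))
w_sym, w_anti = symmetrize(w)

assert np.allclose(w_sym + w_anti, w)             # exact decomposition
assert np.allclose(np.rot90(w_sym, 2), w_sym)     # symmetric part is invariant
assert np.allclose(np.rot90(w_anti, 2), -w_anti)  # antisymmetric part flips sign
```

Constraining a CNN's filters to such symmetric/antisymmetric forms halves the free parameters per filter, which is consistent with the reported gains in the limited-training-data regime.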