Few-Shot Bayesian Imitation Learning with Logical Program Policies
Humans can learn many novel tasks from a very small number (1--5) of
demonstrations, in stark contrast to the data requirements of nearly tabula
rasa deep learning methods. We propose an expressive class of policies, a
strong but general prior, and a learning algorithm that, together, can learn
interesting policies from very few examples. We represent policies as logical
combinations of programs drawn from a domain-specific language (DSL), define a
prior over policies with a probabilistic grammar, and derive an approximate
Bayesian inference algorithm to learn policies from demonstrations. In
experiments, we study five strategy games played on a 2D grid with one shared
DSL. After a few demonstrations of each game, the inferred policies generalize
to new game instances that differ substantially from the demonstrations. Our
policy learning is 20--1,000x more data efficient than convolutional and fully
convolutional policy learning and many orders of magnitude more computationally
efficient than vanilla program induction. We argue that the proposed method is
an apt choice for tasks that have scarce training data and feature significant,
structured variation between task instances. Comment: AAAI 202
Spoof detection using time-delay shallow neural network and feature switching
Detecting spoofed utterances is a fundamental problem in voice-based
biometrics. Spoofing can be performed either through logical access, such as
speech synthesis or voice conversion, or through physical access, such as
replaying a pre-recorded utterance. Inspired by the state-of-the-art x-vector based
speaker verification approach, this paper proposes a time-delay shallow neural
network (TD-SNN) for spoof detection for both logical and physical access. The
novelty of the proposed TD-SNN system vis-a-vis conventional DNN systems is
that it can handle variable length utterances during testing. Performance of
the proposed TD-SNN systems and the baseline Gaussian mixture models (GMMs) is
analyzed on the ASV-spoof-2019 dataset. The performance of the systems is
measured in terms of the minimum normalized tandem detection cost function
(min-t-DCF). When studied with individual features, the TD-SNN system
consistently outperforms the GMM system for physical access. For logical
access, GMM surpasses TD-SNN systems for certain individual features. When
combined with the decision-level feature switching (DLFS) paradigm, the best
TD-SNN system outperforms the best baseline GMM system on evaluation data with
a relative improvement of 48.03% and 49.47% for logical and physical
access, respectively.
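The relative improvements quoted above can be read with a small worked example. The abstract does not state the formula, so this sketch assumes the usual definition for a cost metric where lower is better (such as min-t-DCF): the reduction in cost divided by the baseline cost. The function name and the numeric inputs are hypothetical and are not results from the paper.

```python
def relative_improvement(baseline_cost: float, proposed_cost: float) -> float:
    """Percentage reduction of a lower-is-better cost relative to the baseline.

    Assumed definition (not stated in the abstract):
        100 * (baseline - proposed) / baseline
    """
    if baseline_cost <= 0:
        raise ValueError("baseline_cost must be positive")
    return 100.0 * (baseline_cost - proposed_cost) / baseline_cost


# Hypothetical costs chosen only to illustrate the arithmetic.
example = relative_improvement(baseline_cost=0.2, proposed_cost=0.1)
```

Under this definition, halving the cost corresponds to a relative improvement of 50%, so figures close to 48% to 49% would mean the proposed system roughly halves the baseline cost.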
Speech vocoding for laboratory phonology
Using phonological speech vocoding, we propose a platform for exploring
relations between phonology and speech processing, and in broader terms, for
exploring relations between the abstract and physical structures of a speech
signal. Our goal is to make a step towards bridging phonology and speech
processing and to contribute to the program of Laboratory Phonology. We show
three application examples for laboratory phonology: compositional phonological
speech modelling, a comparison of phonological systems and an experimental
phonological parametric text-to-speech (TTS) system. The featural
representations of the following three phonological systems are considered in
this work: (i) Government Phonology (GP), (ii) the Sound Pattern of English
(SPE), and (iii) the extended SPE (eSPE). Comparing GP- and eSPE-based vocoded
speech, we conclude that the latter achieves slightly better results than the
former. However, GP - the most compact phonological speech representation -
performs comparably to the systems with a higher number of phonological
features. The parametric TTS based on phonological speech representation, and
trained from an unlabelled audiobook in an unsupervised manner, achieves
intelligibility of 85% of the state-of-the-art parametric speech synthesis. We
envision that the presented approach paves the way for researchers in both
fields to form meaningful hypotheses that are explicitly testable using the
concepts developed and exemplified in this paper. On the one hand, laboratory
phonologists might test the applied concepts of their theoretical models, and
on the other hand, the speech processing community may utilize the concepts
developed for the theoretical phonological models for improvements of the
current state-of-the-art applications.
Process Mining of Programmable Logic Controllers: Input/Output Event Logs
This paper presents an approach to model an unknown Ladder Logic based
Programmable Logic Controller (PLC) program consisting of Boolean logic and
counters using Process Mining techniques. First, we tap the inputs and outputs
of a PLC to create a data flow log. Second, we propose a method to translate
the obtained data flow log to an event log suitable for Process Mining. In a
third step, we propose a hybrid Petri net (PN) and neural network approach to
approximate the logic of the actual underlying PLC program. We demonstrate the
applicability of our proposed approach on a case study with three simulated
scenarios.
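The second step described above, translating a data flow log into an event log, might look roughly like the following. This is a minimal sketch under stated assumptions and is not the authors' method: it assumes the data flow log is a time-ordered list of sampled Boolean I/O states and that an event is emitted whenever a signal changes value. The signal names (`I0`, `Q0`), the `rise`/`fall` labels, and the function name are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Sample = Tuple[float, Dict[str, bool]]  # (timestamp, signal name -> state)
Event = Tuple[float, str]               # (timestamp, activity label)


def to_event_log(samples: List[Sample]) -> List[Event]:
    """Emit one event per signal transition between consecutive samples."""
    events: List[Event] = []
    previous: Optional[Dict[str, bool]] = None
    for timestamp, signals in samples:
        if previous is not None:
            # Sorted names give a deterministic order for simultaneous changes.
            for name in sorted(signals):
                if name in previous and signals[name] != previous[name]:
                    edge = "rise" if signals[name] else "fall"
                    events.append((timestamp, f"{name}_{edge}"))
        previous = signals
    return events


# Hypothetical tapped I/O samples: input I0 toggles, output Q0 follows.
samples = [
    (0.0, {"I0": False, "Q0": False}),
    (0.1, {"I0": True, "Q0": False}),
    (0.2, {"I0": True, "Q0": True}),
    (0.3, {"I0": False, "Q0": True}),
]
event_log = to_event_log(samples)
```

Encoding transitions rather than raw samples keeps the log compact and yields discrete activities, which is the form that process mining tools generally expect.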