5,171 research outputs found
Learning to Create and Reuse Words in Open-Vocabulary Neural Language Modeling
Fixed-vocabulary language models fail to account for one of the most
characteristic statistical facts of natural language: the frequent creation and
reuse of new word types. Although character-level language models offer a
partial solution in that they can create word types not attested in the
training corpus, they do not capture the "bursty" distribution of such words.
In this paper, we augment a hierarchical LSTM language model that generates
sequences of word tokens character by character with a caching mechanism that
learns to reuse previously generated words. To validate our model we construct
a new open-vocabulary language modeling corpus (the Multilingual Wikipedia
Corpus, MWC) from comparable Wikipedia articles in 7 typologically diverse
languages and demonstrate the effectiveness of our model across this range of
languages.Comment: ACL 201
Parametric Modelling of Multivariate Count Data Using Probabilistic Graphical Models
Multivariate count data are defined as the number of items of different
categories issued from sampling within a population, which individuals are
grouped into categories. The analysis of multivariate count data is a recurrent
and crucial issue in numerous modelling problems, particularly in the fields of
biology and ecology (where the data can represent, for example, children counts
associated with multitype branching processes), sociology and econometrics. We
focus on I) Identifying categories that appear simultaneously, or on the
contrary that are mutually exclusive. This is achieved by identifying
conditional independence relationships between the variables; II)Building
parsimonious parametric models consistent with these relationships; III)
Characterising and testing the effects of covariates on the joint distribution
of the counts. To achieve these goals, we propose an approach based on
graphical probabilistic models, and more specifically partially directed
acyclic graphs
Staging Transformations for Multimodal Web Interaction Management
Multimodal interfaces are becoming increasingly ubiquitous with the advent of
mobile devices, accessibility considerations, and novel software technologies
that combine diverse interaction media. In addition to improving access and
delivery capabilities, such interfaces enable flexible and personalized dialogs
with websites, much like a conversation between humans. In this paper, we
present a software framework for multimodal web interaction management that
supports mixed-initiative dialogs between users and websites. A
mixed-initiative dialog is one where the user and the website take turns
changing the flow of interaction. The framework supports the functional
specification and realization of such dialogs using staging transformations --
a theory for representing and reasoning about dialogs based on partial input.
It supports multiple interaction interfaces, and offers sessioning, caching,
and co-ordination functions through the use of an interaction manager. Two case
studies are presented to illustrate the promise of this approach.Comment: Describes framework and software architecture for multimodal web
interaction managemen
- …