An Intelligent Text Extraction and Navigation System
We present sppc, a high-performance system for intelligent text extraction and navigation from German free text documents. The main purpose of sppc is to extract as much linguistic structure as possible for performing domain-specific processing. sppc consists of a set of domain-independent shallow core components which are realized by means of cascaded weighted finite state machines and generic dynamic tries. All extracted information is represented uniformly in one data structure (called the text chart) in a highly compact and linked form in order to support indexing and navigation through the set of solutions. Germa
Working with Constrained Systems: A Review of A. K. Joshi's IJCAI-97 Research Excellence Award Acceptance Lecture
This is a brief review of Joshi's award acceptance lecture published in AI Magazine. This review appeared in the AI Watch column in Computers and Society, a quarterly magazine
Effective Theories for Circuits and Automata
Abstracting an effective theory from a complicated process is central to the
study of complexity. Even when the underlying mechanisms are understood, or at
least measurable, the presence of dissipation and irreversibility in
biological, computational and social systems makes the problem harder. Here we
demonstrate the construction of effective theories in the presence of both
irreversibility and noise, in a dynamical model with underlying feedback. We
use the Krohn-Rhodes theorem to show how the composition of underlying
mechanisms can lead to innovations in the emergent effective theory. We show
how dissipation and irreversibility fundamentally limit the lifetimes of these
emergent structures, even though, on short timescales, the group properties may
be enriched compared to their noiseless counterparts. Comment: 11 pages, 9 figure
SgpDec: Cascade (de)compositions of finite transformation semigroups and permutation groups
We describe how the SgpDec computer algebra package can be used for composing and decomposing permutation groups and transformation semigroups hierarchically by directly constructing substructures of wreath products, the so-called cascade products. Final Accepted Versio
A Deep Representation for Invariance And Music Classification
Representations in the auditory cortex might be based on mechanisms similar
to the visual ventral stream: modules for building invariance to
transformations and multiple layers for compositionality and selectivity. In
this paper we propose the use of such computational modules for extracting
invariant and discriminative audio representations. Building on a theory of
invariance in hierarchical architectures, we propose a novel, mid-level
representation for acoustical signals, using the empirical distributions of
projections on a set of templates and their transformations. Under the
assumption that, by construction, this dictionary of templates is composed from
similar classes, and samples the orbit of variance-inducing signal
transformations (such as shift and scale), the resulting signature is
theoretically guaranteed to be unique, invariant to transformations and stable
to deformations. Modules of projection and pooling can then constitute layers
of deep networks, for learning composite representations. We present the main
theoretical and computational aspects of a framework for unsupervised learning
of invariant audio representations, empirically evaluated on music genre
classification. Comment: 5 pages, CBMM Memo No. 002, (to appear) IEEE 2014 International
Conference on Acoustics, Speech, and Signal Processing (ICASSP 2014
Normalization of Dutch user-generated content
This paper describes a phrase-based machine translation approach to normalize Dutch user-generated content (UGC). We compiled a corpus of three different social media genres (text messages, message board posts and tweets) to have a sample of this recent domain. We describe the various characteristics of this noisy text material and explain how it has been manually normalized using newly developed guidelines. For the automatic normalization task we focus on text messages, and find that a cascaded SMT system where a token-based module is followed by a translation at the character level gives the best word error rate reduction. After these initial experiments, we investigate the system's robustness on the complete domain of UGC by testing it on the other two social media genres, and find that the cascaded approach performs best on these genres as well. To our knowledge, we deliver the first proof-of-concept system for Dutch UGC normalization, which can serve as a baseline for future work