Generative grammar
Generative Grammar is the label of the most influential research program in linguistics and related fields in the second half of the 20th century. Initiated by a short book, Noam Chomsky's Syntactic Structures (1957), it became one of the driving forces among the disciplines jointly called the cognitive sciences. The term generative grammar refers to an explicit, formal characterization of the (largely implicit) knowledge determining the formal aspect of all kinds of language behavior. The program had a strong mentalist orientation right from the beginning, documented e.g. in a fundamental critique of Skinner's Verbal Behavior (1957) by Chomsky (1959), arguing that behaviorist stimulus-response theories could in no way account for the complexities of ordinary language use. The "Generative Enterprise", as the program was called in 1982, went through a number of stages, each of which was accompanied by discussions of specific problems and consequences within the narrower domain of linguistics as well as the wider range of related fields, such as ontogenetic development, psychology of language use, or biological evolution. Four stages of the Generative Enterprise can be marked off for expository purposes.
Vowel duration issue in Civili
The main goal of this article is to define the problem of vowel duration in Civili (H12a). It shows that so-called Civili vowel length needs to be re-examined, because previous work on the sound system of this language hardly explains a number of phonological phenomena, such as vowel lengthening, on the basis of the data at hand. To demonstrate the problem, the author first reviews previous works, all of which identify vowel lengthening in Civili. The complexity of the phenomenon emerges from the differences between one analysis and another and from the difficulties the various phonologists encountered. The problem is then also shown through the weaknesses in the results of each analysis. This reveals further aspects of the vowel duration issue and leads the author to draw a clear distinction between vowel length and vowel lengthening, both of which can be regarded simply as vowel duration. Finally, the article proposes a possible route to a solution through an experimental approach to the Civili sound system.
The Unsupervised Acquisition of a Lexicon from Continuous Speech
We present an unsupervised learning algorithm that acquires a
natural-language lexicon from raw speech. The algorithm is based on the optimal
encoding of symbol sequences in an MDL framework, and uses a hierarchical
representation of language that overcomes many of the problems that have
stymied previous grammar-induction procedures. The forward mapping from symbol
sequences to the speech stream is modeled using features based on articulatory
gestures. We present results on the acquisition of lexicons and language models
from raw speech, text, and phonetic transcripts, and demonstrate that our
algorithm compares very favorably to other reported results with respect to
segmentation performance and statistical efficiency. Comment: 27 page technical repor
A Formal Framework for Linguistic Annotation
'Linguistic annotation' covers any descriptive or analytic notations applied
to raw language data. The basic data may be in the form of time functions --
audio, video and/or physiological recordings -- or it may be textual. The added
notations may include transcriptions of all sorts (from phonetic features to
discourse structures), part-of-speech and sense tagging, syntactic analysis,
'named entity' identification, co-reference annotation, and so on. While there
are several ongoing efforts to provide formats and tools for such annotations
and to publish annotated linguistic databases, the lack of widely accepted
standards is becoming a critical problem. Proposed standards, to the extent
they exist, have focussed on file formats. This paper focuses instead on the
logical structure of linguistic annotations. We survey a wide variety of
existing annotation formats and demonstrate a common conceptual core, the
annotation graph. This provides a formal framework for constructing,
maintaining and searching linguistic annotations, while remaining consistent
with many alternative data structures and file formats. Comment: 49 page
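As a rough illustration of the annotation-graph idea named in this abstract, the following minimal sketch models annotations as labeled arcs between nodes that may carry time offsets. The class names, the `tier` field, and the example labels and times are assumptions made for illustration; the abstract does not specify the data model at this level of detail.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Node:
    id: int
    time: Optional[float] = None  # offset into the signal; None if unanchored


@dataclass(frozen=True)
class Arc:
    src: int
    dst: int
    tier: str   # hypothetical annotation type, e.g. "word" or "phone"
    label: str


@dataclass
class AnnotationGraph:
    nodes: dict = field(default_factory=dict)
    arcs: list = field(default_factory=list)

    def add_node(self, node_id, time=None):
        self.nodes[node_id] = Node(node_id, time)

    def add_arc(self, src, dst, tier, label):
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must be added before the arc")
        self.arcs.append(Arc(src, dst, tier, label))

    def labels(self, tier):
        """Labels on one tier, ordered by the start time of each arc."""
        on_tier = [a for a in self.arcs if a.tier == tier]
        on_tier.sort(key=lambda a: self.nodes[a.src].time or 0.0)
        return [a.label for a in on_tier]

    def within(self, start, end, tier):
        """Labels of time-anchored arcs on a tier lying inside [start, end]."""
        hits = []
        for a in self.arcs:
            t0, t1 = self.nodes[a.src].time, self.nodes[a.dst].time
            if (a.tier == tier and t0 is not None and t1 is not None
                    and start <= t0 and t1 <= end):
                hits.append(a.label)
        return hits


def build_example():
    """Two tiers over the same (made-up) stretch of audio."""
    g = AnnotationGraph()
    for node_id, t in [(0, 0.0), (1, 0.15), (2, 0.40), (3, 0.70)]:
        g.add_node(node_id, t)
    g.add_arc(0, 2, "word", "she")
    g.add_arc(2, 3, "word", "had")
    g.add_arc(0, 1, "phone", "sh")
    g.add_arc(1, 2, "phone", "iy")
    return g
```

Because word and phone arcs share nodes, a query over a time span can return annotations from any tier without committing to a particular file format.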
Prosody and melody in vowel disorder
The paper explores the syllabic and segmental dimensions of phonological vowel disorder. The independence of the two dimensions is illustrated by the case study of an English-speaking child presenting with an impairment which can be shown to have a specifically syllabic basis. His production of adult long vowels displays three main patterns of deviance - shortening, bisyllabification and the hardening of a target off-glide to a stop. Viewed phonemically, these patterns appear as unconnected substitutions and distortions. Viewed syllabically, however, they can be traced to a single underlying deficit, namely a failure to secure the complex nuclear structure necessary for the coding of vowel length contrasts
Multilayer Network of Language: a Unified Framework for Structural Analysis of Linguistic Subsystems
Recently, the focus of complex networks research has shifted from the
analysis of isolated properties of a system toward a more realistic modeling of
multiple phenomena: multilayer networks. Motivated by the success of the
multilayer approach in social, transport or trade systems, we propose the
introduction of multilayer networks for language. The multilayer network of
language is a unified framework for modeling linguistic subsystems and their
structural properties enabling the exploration of their mutual interactions.
Various aspects of natural language systems can be represented as complex
networks, whose vertices depict linguistic units, while links model their
relations. The multilayer network of language is defined by three aspects: the
network construction principle, the linguistic subsystem and the language of
interest. More precisely, we construct word-level (syntax, co-occurrence and
its shuffled counterpart) and subword-level (syllables and graphemes) network
layers from five variations of the original text (in the modeled language). The
obtained results suggest that there are substantial differences between the
network structures of different language subsystems, which are hidden during
the exploration of an isolated layer. The word-level layers share structural
properties regardless of the language (e.g. Croatian or English), while the
syllabic subword level expresses more language-dependent structural properties.
The preserved weighted overlap quantifies the similarity of word-level layers
in weighted and directed networks. Moreover, the analysis of motifs reveals a
close topological structure of the syntactic and syllabic layers for both
languages. The findings corroborate that the multilayer network framework is a
powerful, consistent and systematic approach to model several linguistic
subsystems simultaneously and hence to provide a more unified view of language.
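A minimal sketch of how some of the layers described in this abstract might be built from a token list, assuming adjacency-based co-occurrence and plain whitespace tokenization. The syntax and syllable layers are omitted because they would need a parser and a syllabifier. The Jaccard overlap below is a simple stand-in chosen for illustration; it is not the preserved weighted overlap measure the abstract mentions, which is not defined there.

```python
from collections import Counter
import random


def cooccurrence_layer(tokens):
    """Directed, weighted edges between adjacent tokens."""
    return Counter(zip(tokens, tokens[1:]))


def shuffled_layer(tokens, seed=0):
    """Co-occurrence layer built from a randomly shuffled copy of the text."""
    shuffled = list(tokens)
    random.Random(seed).shuffle(shuffled)
    return cooccurrence_layer(shuffled)


def grapheme_layer(tokens):
    """Directed, weighted edges between adjacent characters inside each word."""
    edges = Counter()
    for word in tokens:
        edges.update(zip(word, word[1:]))
    return edges


def shared_edge_fraction(layer_a, layer_b):
    """Jaccard overlap of two edge sets (illustrative stand-in measure)."""
    a, b = set(layer_a), set(layer_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


if __name__ == "__main__":
    tokens = "the cat saw the dog".split()
    layers = {
        "co-occurrence": cooccurrence_layer(tokens),
        "shuffled": shuffled_layer(tokens),
        "grapheme": grapheme_layer(tokens),
    }
    for name, edges in layers.items():
        print(name, len(edges), "distinct edges")
    print(shared_edge_fraction(layers["co-occurrence"], layers["shuffled"]))
```

Keeping each layer as a `Counter` of edges makes it easy to compare layers pairwise, which is the kind of cross-layer comparison the abstract argues an isolated layer cannot provide.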
Review of H. Marquardt, Hethitische Logogramme. Funktion und Verwendung. (DBH 34, Wiesbaden, 2011).
A review of H. Marquardt's book on the function and use of logograms in Hittite cuneiform, which isolates two main motivations for logogram-use: tachygraphy and the avoidance of varying syllabic writings. Broad agreement is found with these results. A slightly different statistical model for interpretation of results relating to the chronological aspects of logogram-use in Hittite texts is suggested
Research on speech understanding and related areas at SRI
Research capabilities on speech understanding, speech recognition, and voice control are described. Research activities and the activities which involve text input rather than speech are discussed