1,492 research outputs found
Syntactic annotation of non-canonical linguistic structures
This paper deals with the syntactic annotation of corpora that contain both “canonical” and “non-canonical” sentences
Information structure
The guidelines for Information Structure include instructions for the annotation of Information Status (or “givenness”), Topic, and Focus, building upon a basic syntactic annotation of nominal phrases and sentences. A procedure for the annotation of these features is proposed
Modelling Discourse-related terminology in OntoLingAnnot's ontologies
Recently, computational linguists have shown great interest in discourse annotation in an attempt to capture the internal relations in texts. With this aim, we have formalized the linguistic knowledge associated with discourse into different linguistic ontologies. In this paper, we present the most prominent discourse-related terms and concepts included in the ontologies of the OntoLingAnnot annotation model. They show the different units, values, attributes, relations, layers and strata included in the discourse annotation level of the OntoLingAnnot model, within which these ontologies are included, used and evaluated
RDF(S)/XML Linguistic Annotation of Semantic Web Pages
Although much research on the semantic annotation of web pages has already been done by AI researchers as part of the Semantic Web initiative, linguistic text annotation, including semantic annotation, was originally developed in Corpus Linguistics, and its results have been somewhat neglected by AI. ...
What linguists always wanted to know about German and did not know how to estimate
This paper profiles significant differences in syntactic distribution and differences in word class frequencies for two treebanks of spoken and written German: the TüBa-D/S, a treebank of transliterated spontaneous dialogues, and the TüBa-D/Z treebank of newspaper articles published in the German daily newspaper die tageszeitung (taz). The approach can be used more generally as a means of distinguishing and classifying language corpora of different genres