11 research outputs found

    Pumping lemma and Ogden lemma for displacement context-free grammars

    The pumping lemma and the Ogden lemma offer a powerful method for proving that a particular language is not context-free. In 2008, Kanazawa proved an analogue of the pumping lemma for well-nested multiple context-free languages. However, the statement of that lemma is too weak for practical use. We prove a stronger variant of the pumping lemma and an analogue of the Ogden lemma for this language family. We also use these statements to prove that some natural context-sensitive languages cannot be generated by tree-adjoining grammars. Comment: Shortened version accepted to the DLT 2014 conference.

    Descriptional Succinctness of Some Grammatical Formalisms for Natural Language

    We investigate the problem of describing languages compactly in different grammatical formalisms for natural language. In particular, the problem is studied from the point of view of some newly developed natural language formalisms such as linear control grammars (LCGs) and tree adjoining grammars (TAGs); these formalisms not only generate non-context-free languages that capture a wide variety of syntactic phenomena found in natural language, but also have computationally efficient polynomial-time recognition algorithms. We prove that the formalisms enjoy the property of unbounded succinctness over the family of context-free grammars, i.e. they are, in general, able to provide more compact representations of natural languages than standard context-free grammars.

    Dependency structures and lexicalized grammars

    In this dissertation, we show that both the generative capacity and the parsing complexity of lexicalized grammar formalisms are systematically related to structural properties of the dependency structures that these formalisms can induce. Dependency structures model the syntactic dependencies among the words of a sentence. We identify three empirically relevant classes of dependency structures, and show how they can be characterized both in terms of restrictions on the relation between dependency and word order and within an algebraic framework. In the second part of the dissertation, we develop natural notions of automata and grammars for dependency structures, show how these yield infinite hierarchies of ever more powerful dependency languages, and classify several grammar formalisms with respect to the languages in these hierarchies that they are able to characterize. Our results provide fundamental insights into the relation between dependency structures and lexicalized grammars.

    Lambda-calculus and formal language theory

    Formal and symbolic approaches have offered computer science many application fields. The rich and fruitful connection between logic, automata and algebra is one such approach. It has been used to model natural languages as well as in program verification. In the mathematics of language, it is able to model phenomena ranging from syntax to phonology, while in verification it gives model checking algorithms for a wide family of programs. This thesis extends this approach to the simply typed lambda-calculus by providing a natural extension of recognizability to programs that are representable by simply typed terms. This notion is then applied to both the mathematics of language and program verification. In the case of the mathematics of language, it is used to generalize parsing algorithms and to propose high-level methods to describe languages. Concerning program verification, it is used to describe methods for verifying the behavioral properties of higher-order programs. In both cases, the link that is drawn between finite state methods and denotational semantics provides the means to mix powerful tools coming from the two worlds.

    The formal properties of natural language syntax.

    by Li, Chi Ho. Thesis (M.Phil.)--Chinese University of Hong Kong, 1997. Includes bibliographical references (leaves 39-40). Contents: Abstract; Introduction; Mathematical Linguistics in a Nutshell; Two Classical Arguments; The Arguments from Sluicing and Doubling Relative Constructions; The Argument from the English such that constructions; The Argument from German constructions; The Argument from Feature Agreement; The Argument from Unbounded Dependency; Conclusion; Glossary; Bibliography.

    Grammars with Restricted Derivation Trees

    This doctoral thesis studies theoretical properties of grammars with restricted derivation trees. After presenting the state of the art in this area, the research focuses on three main kinds of restrictions placed upon derivation trees. First, it introduces a completely new area of investigation, the cut-based restriction, and examines the generative power of grammars restricted in this way. Second, it investigates several new properties of the path-based restriction placed upon derivation trees. Specifically, it studies the impact of erasing productions on the generative power of grammars with a restricted path and introduces two corresponding normal forms. Then, it describes a new relation between grammars with a restricted path and some pseudoknots. Next, it presents a counterargument concerning the generative power of grammars with a controlled path, which has so far been considered a well-known property. Finally, it introduces a generalization of the path-based restriction to not just one but several paths. The model generalized in this way is studied, namely its pumping, closure, and parsing properties.

    The communicative theory of Terminology (CTT) applied to the development of a corpus-based specialised dictionary of the ceramics industry

    This PhD dissertation is the result of an ongoing project aimed at the creation of a bilingual (Spanish-English; English-Spanish), corpus-based, specialised active dictionary of the ceramic and tile industry, with the Communicative Theory of Terminology (CTT) as its mainstay. Following the grounding principles of the CTT, this research has departed from a corpus-based approach (with a corpus compiled ad hoc) in which terms have been analysed in vivo and characterised from the natural habitat in which they occur in specialised communication and discourse. In this light, it has been shown how the study of terms, made possible by the activity of compiling and describing them, called terminography, may be complemented by the wider projection of specialised lexicography for the compilation and elaboration of user-oriented and user-friendly LSP products in the form of dictionaries. This specialised lexicographical dimension of the work has implied the need to renew the concept of speciality language dictionaries applied to the ceramic industry and has given way to the creation of a (prospective) active dictionary in this field with a marked emphasis on context, on the natural associations of terms (collocations), and on the communicative nature of terminology. Accordingly, the importance of pragmatic aspects in a work of this sort has made it necessary to undertake an in-depth revision and analysis of the socio-economic context for the research in order to be able to establish and solve the specific terminological needs that the ceramic industrial discourse community may have. On the basis of this theoretical framework, the method of study followed for the development of the prospective dictionary has comprised 8 broad stages: the stage of work preparation and corpus compilation, the elaboration of the field diagram, the stage of documentary corpus management, term extraction, data processing, revision and normalisation and, finally, the edition stage. The methodology relies heavily on software, both for corpus exploitation (WordSmith Tools 4.0) and for the creation of the terminological database (TermStar XV) and the generation of final entries (GENDIC). Two main types of results have been presented: those obtained through work in progress in the different stages of the method, and final ones strictly speaking, that is, 4,000 English-Spanish entries in their final format (as they will appear in the prospective dictionary) belonging to the letters A, B, N, O, U and V of a complete dictionary which will include a total of 26,000 entries.
