
    Efficient Normal-Form Parsing for Combinatory Categorial Grammar

    Under categorial grammars that have powerful rules like composition, a simple n-word sentence can have exponentially many parses. Generating all parses is inefficient and obscures whatever true semantic ambiguities are in the input. This paper addresses the problem for a fairly general form of Combinatory Categorial Grammar, by means of an efficient, correct, and easy to implement normal-form parsing technique. The parser is proved to find exactly one parse in each semantic equivalence class of allowable parses; that is, spurious ambiguity (as carefully defined) is shown to be both safely and completely eliminated.
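
    To make the notion of spurious ambiguity concrete, here is a minimal sketch that counts derivations of a category sequence with and without a normal-form restriction. It is not the paper's parser. It assumes only forward application and forward composition, and it assumes the restriction takes the form "a constituent produced by forward composition may not act as the left (functor) input of another forward rule". The names `fwd`, `count_parses`, and the tags `"FC"`/`"OT"` are hypothetical.

```python
from collections import defaultdict

def fwd(left, right):
    """Yield (result, tag) pairs for the two forward rules.

    Complex categories are tuples (result, "/", argument); atoms are strings.
    Tag "FC" marks the output of forward composition, "OT" anything else.
    """
    if isinstance(left, tuple) and left[1] == "/":
        res, _, arg = left
        if arg == right:                          # X/Y  Y    =>  X
            yield res, "OT"
        if isinstance(right, tuple) and right[1] == "/" and right[0] == arg:
            yield (res, "/", right[2]), "FC"      # X/Y  Y/Z  =>  X/Z

def count_parses(cats, goal, normal_form):
    """Count derivations of `goal` spanning all of `cats` (CKY-style chart).

    With normal_form=True, a constituent tagged "FC" is never used as the
    left input of a forward rule (the assumed restriction, see lead-in).
    """
    n = len(cats)
    chart = defaultdict(lambda: defaultdict(int))  # (i, j) -> {(cat, tag): count}
    for i, cat in enumerate(cats):
        chart[i, i + 1][cat, "OT"] = 1             # lexical items
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for (lc, ltag), ln in list(chart[i, k].items()):
                    if normal_form and ltag == "FC":
                        continue                   # blocked by the restriction
                    for (rc, _rtag), rn in list(chart[k, j].items()):
                        for res, tag in fwd(lc, rc):
                            chart[i, j][res, tag] += ln * rn
    return sum(c for (cat, _), c in chart[0, n].items() if cat == goal)

# Example sequence: a/b  b/c  c/d  d
seq = [("a", "/", "b"), ("b", "/", "c"), ("c", "/", "d"), "d"]
print(count_parses(seq, "a", normal_form=False))
print(count_parses(seq, "a", normal_form=True))
```

    For this four-category sequence the sketch counts 5 derivations without the restriction (one per binary bracketing) and 1 with it. All five yield the same category, which is the kind of redundancy the abstract calls spurious ambiguity.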

    Categorial Grammar

    The paper is a review article comparing a number of approaches to natural language syntax and semantics that have been developed using categorial frameworks. It distinguishes two related but distinct varieties of categorial theory: one related to Natural Deduction systems and the axiomatic calculi of Lambek, and another that involves more specialized combinatory operations.

    Comparing and evaluating extended Lambek calculi

    Lambek's Syntactic Calculus, commonly referred to as the Lambek calculus, was innovative in many ways, notably as a precursor of linear logic. But it also showed that we could treat our grammatical framework as a logic (as opposed to a logical theory). However, though it was successful in giving at least a basic treatment of many linguistic phenomena, it was also clear that a slightly more expressive logical calculus was needed for many other cases. Therefore, many extensions and variants of the Lambek calculus have been proposed, from the 1980s up to the present day. As a result, there is now a large class of calculi, each with its own empirical successes and theoretical results, but also each with its own logical primitives. This raises the question: how do we compare and evaluate these different logical formalisms? To answer this question, I present two unifying frameworks for these extended Lambek calculi. Both are proof net calculi with graph contraction criteria. The first calculus is a very general system: you specify the structure of your sequents and it gives you the connectives and contractions which correspond to it. The calculus can be extended with structural rules, which translate directly into graph rewrite rules. The second calculus is first-order (multiplicative intuitionistic) linear logic, which turns out to have several other, independently proposed extensions of the Lambek calculus as fragments. I will illustrate the use of each calculus in building bridges between analyses proposed in different frameworks, in highlighting differences and in helping to identify problems.

    Principles and Implementation of Deductive Parsing

    We present a system for generating parsers based directly on the metaphor of parsing as deduction. Parsing algorithms can be represented directly as deduction systems, and a single deduction engine can interpret such deduction systems so as to implement the corresponding parser. The method generalizes easily to parsers for augmented phrase structure formalisms, such as definite-clause grammars and other logic grammar formalisms, and has been used for rapid prototyping of parsing algorithms for a variety of formalisms including variants of tree-adjoining grammars, categorial grammars, and lexicalized context-free grammars.
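
    The "parsing as deduction" idea can be illustrated with a minimal sketch: a generic agenda-driven engine closes a set of axioms under inference rules, and a particular parsing algorithm is supplied as data to that engine. This is not the system described in the abstract. The function names (`deduce`, `cky_system`, `recognize`), the item shape `(category, start, end)`, and the toy grammar are all hypothetical, and the deduction system shown is a CKY-style recognizer for binary rules.

```python
from collections import deque

def deduce(axioms, infer):
    """Generic deduction engine: close `axioms` under the rule function `infer`.

    `infer(item, chart)` must yield every item derivable from `item`
    together with items already in `chart`.
    """
    chart = set()
    agenda = deque(axioms)
    while agenda:
        item = agenda.popleft()
        if item in chart:
            continue                      # already proved
        chart.add(item)
        for new_item in infer(item, chart):
            if new_item not in chart:
                agenda.append(new_item)
    return chart

def cky_system(words, lexicon, rules):
    """CKY recognition stated as a deduction system.

    Items are (category, start, end). Axioms come from the lexicon; the
    single inference rule combines two adjacent items via a binary rule.
    """
    axioms = [(cat, i, i + 1)
              for i, word in enumerate(words)
              for cat in lexicon.get(word, ())]

    def infer(item, chart):
        cat, i, j = item
        for other_cat, k, l in list(chart):   # snapshot of proved items
            if k == j:                        # item is the left child
                for parent in rules.get((cat, other_cat), ()):
                    yield (parent, i, l)
            if l == i:                        # item is the right child
                for parent in rules.get((other_cat, cat), ()):
                    yield (parent, k, j)

    return axioms, infer

def recognize(words, lexicon, rules, goal="S"):
    """True if the goal item (goal, 0, len(words)) is derivable."""
    axioms, infer = cky_system(words, lexicon, rules)
    return (goal, 0, len(words)) in deduce(axioms, infer)

# Toy grammar (illustrative only)
lexicon = {"Kim": ["NP"], "sees": ["V"], "Lee": ["NP"]}
rules = {("V", "NP"): ["VP"], ("NP", "VP"): ["S"]}
print(recognize(["Kim", "sees", "Lee"], lexicon, rules))
```

    Swapping in a different `axioms`/`infer` pair would give a different parsing strategy over the same engine, which is the separation the abstract describes.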

    Grammatical structures and logical deductions

    The three essays presented here concern natural connections between grammatical derivations and structures provided by certain standard grammar formalisms, on the one hand, and deductions in logical systems, on the other hand. In the first essay we analyse the adequacy of Polish notation for higher-order languages. The Ajdukiewicz algorithm (Ajdukiewicz 1935) is discussed in terms of generalized MP-deductions. We exhibit a failure in Ajdukiewicz’s original version of the algorithm and give a correct one; we prove that generalized MP-deductions have the frontier property, which is essential for the plausibility of Polish notation. The second essay deals with logical systems corresponding to different grammar formalisms, such as Finite State Acceptors, Context-Free Grammars, Categorial Grammars, and others. We show how logical methods can be used to establish certain linguistically significant properties of formal grammars. The third essay discusses the interplay between Natural Deduction proofs in grammar-oriented logics and semantic structures expressible by typed lambda terms and combinators.

    Da linguística gerativa à gramática categorial: sujeitos lexicais em infinitivos controlados (From generative linguistics to categorial grammar: lexical subjects in controlled infinitives)

    Advisors: Marcelo Esteban Coniglio, Sonia Maria Lazzarini Cyrino. Doctoral thesis, Universidade Estadual de Campinas, Instituto de Filosofia e Ciências Humanas.

    Abstract: The present thesis lies at the interface of logic and linguistics; its object of study is control sentences with overt pronouns in Romance languages (European and Brazilian Portuguese, Italian and Spanish). This is a topic that has received considerably more attention on the part of linguists, especially in recent years, than from logicians. Perhaps for this reason, much remains to be understood about these linguistic structures and their underlying logical properties. This thesis seeks to fill the lacunas in the literature, or at least take steps in this direction, by addressing a number of issues that have so far been under-explored. To this end we put forward two key questions, one linguistic and the other logical. These are, respectively: What is the syntactic status of the surface pronoun? And: What are the available mechanisms to reuse semantic resources in a contraction-free logical grammar? Accordingly, the thesis is divided into two parts: generative linguistics and categorial grammar. Part I starts by reviewing the recent discussion within the generative literature on infinitive clauses with overt subjects, paying detailed attention to the main accounts in the field. Part II does the same on the logical grammar front, addressing in particular the issues of control and of anaphoric pronouns. Ultimately, the leading accounts from both camps will be found wanting. The closing chapters of Part I and Part II thus put forward alternative candidates that we contend are more successful than their predecessors. More specifically, in Part I we offer a linguistic account along the lines of Landau's T/Agr theory of control. In Part II we present two alternative categorial accounts: one based on Combinatory Categorial Grammar, the other on Type-Logical Grammar. Each of these accounts offers an improved, more fine-grained perspective on control infinitives featuring overt pronominal subjects. Finally, we include an Appendix in which our type-logical proposal is implemented in a categorial parser/theorem-prover.

    Degree: Doctorate in Philosophy (Doutora em Filosofia). Funding: 2013/08115-1, 2015/09699-2; FAPESP; CAPE

    Type-driven natural language analysis

    The purpose of this thesis is to show how recent developments in logic programming can be exploited to encode, in a computational environment, the features of certain linguistic theories. In this way we make available for natural language processing sophisticated capabilities of linguistic analysis that are directly justified by well-developed grammatical frameworks. More specifically, we exploit hypothetical reasoning, recently proposed as one possible direction for extending logic programming, to account for the syntax of filler-gap dependencies along the lines of linguistic theories such as Generalized Phrase Structure Grammar and Categorial Grammar. Moreover, for the semantic analysis of the same kind of phenomena, we make use of another recently proposed extension, interestingly related to the previous one: the idea of replacing first-order terms with the more expressive λ-terms of the λ-calculus.

    Multi-dimensional Type Theory: Rules, Categories, and Combinators for Syntax and Semantics

    We investigate the possibility of modelling the syntax and semantics of natural language by constraints, or rules, imposed by the multi-dimensional type theory Nabla. The only multiplicity we explicitly consider is two, namely one dimension for the syntax and one dimension for the semantics, but the general perspective is important. For example, issues of pragmatics could be handled as additional dimensions. One of the main problems addressed is the rather complicated repertoire of operations that exists besides the notion of categories in traditional Montague grammar. For the syntax we use a categorial grammar along the lines of Lambek. For the semantics we use so-called lexical and logical combinators inspired by work in natural logic. Nabla provides a concise interpretation and a sequent calculus as the basis for implementations.