365 research outputs found
Local grammars and their representation by finite automata
In the study of collocations and of frozen sentences (idioms, clichés, many metaphors and figurative meanings, etc.) one often encounters sets of similar forms that cannot be related by formal rules of either type, phrase structure or transformational. We present examples of such situations and show how the formalism of finite automata can be used to represent them in a natural way.
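As an illustration of the idea, here is a minimal sketch of a word-level finite automaton that accepts a small family of similar frozen forms. The expressions, the state numbering, and the names `TRANSITIONS` and `accepts` are assumptions made for this example; they are not taken from the paper.

```python
# Hypothetical local grammar for a family of similar frozen expressions.
# States are integers; each transition is labelled with a word.
TRANSITIONS = {
    (0, "lose"): 1,
    (0, "lost"): 1,
    (1, "his"): 2,
    (1, "her"): 2,
    (1, "their"): 2,
    (2, "mind"): 3,
    (2, "head"): 3,
    (2, "temper"): 3,
}
START = 0
FINAL = {3}


def accepts(words):
    """Return True if the word sequence is a path from START to a final state."""
    state = START
    for word in words:
        state = TRANSITIONS.get((state, word))
        if state is None:  # no transition for this word: reject
            return False
    return state in FINAL


print(accepts("lost her temper".split()))
print(accepts("lost her keys".split()))
```

The variants share one graph instead of being listed one by one or derived from each other by rule, which is the kind of representation the abstract describes.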
Une grammaire locale de l'expression des sentiments (A local grammar of the expression of feelings)
Lexicon-Based Algorithms for the Automatic Analysis of Natural Language
Let us examine the following discourse (D) from the point of view of elementary grammatical analysis:
    (D) Two men cleaned the offices, then, they waited for the janitor
This discourse is composed of two members: two simple sentences connected by the conjunction 'then'. One of the elements needed for the interpretation of (D) lies in the nature of the antecedent of the pronoun 'they'. In principle, the pronoun 'they' refers to the noun phrase 'two men' of the first member, but it might indicate a group of persons different from these two men, if (D) is attached to an appropriate context or background. Whether the scene which constitutes the interpretation of (D) includes 3 persons (2 men and 1 janitor) or more depends entirely on the analysis of 'they'.
Such questions of pronoun resolution are trivial for a native speaker of English, but they become of paramount importance when one attempts a computer analysis of texts, and also when a reader who does not know the language of (D) well tries to understand it. In both situations, in order to interpret (D), detailed dictionaries and grammars must be available which account for the relations occurring between the terms of (D).
In this article, we simulate the computation of the search for the antecedent(s) of 'they'. We simplify this procedure by omitting its clerical aspects. In this way, we throw into relief the nature and the amount of information that must be stored in a lexicon-grammar, since, as we will see, we do not draw the usual line of demarcation between these two components of a language.
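The search for the antecedent of 'they' in (D) can be sketched as a filter over candidate noun phrases. This is only a toy under stated assumptions: the features `plural` and `human`, the `subject_human` restriction attached to the verbs, and the function `antecedents` are hypothetical stand-ins for the lexicon-grammar information the article argues must be stored; they are not the article's own procedure.

```python
# Noun phrases of (D) with hypothetical lexical features.
NOUN_PHRASES = [
    {"text": "two men", "plural": True, "human": True, "position": 0},
    {"text": "the offices", "plural": True, "human": False, "position": 1},
    {"text": "the janitor", "plural": False, "human": True, "position": 3},
]

# Hypothetical selectional restrictions stored with each verb.
VERBS = {
    "clean": {"subject_human": True},
    "wait": {"subject_human": True},
}

PRONOUN = {"text": "they", "plural": True, "position": 2, "verb": "wait"}


def antecedents(pronoun, noun_phrases, verbs):
    """Keep noun phrases that precede the pronoun, agree in number,
    and satisfy the restriction the verb places on its subject."""
    needs_human = verbs[pronoun["verb"]]["subject_human"]
    return [
        np["text"]
        for np in noun_phrases
        if np["position"] < pronoun["position"]
        and np["plural"] == pronoun["plural"]
        and (np["human"] or not needs_human)
    ]


print(antecedents(PRONOUN, NOUN_PHRASES, VERBS))
```

Number agreement alone leaves both 'two men' and 'the offices' as candidates; only the information attached to the verb 'wait' rules out 'the offices'. The sketch does not model the case where 'they' refers to a group outside the discourse.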
ON COUNTING MEANINGFUL UNITS IN TEXTS
We analyse a sample text. By identifying compounds and other sequences of words between which strong dependencies hold, we embed simple words that have no meaning by themselves into larger units that do carry specific meaning. Hence, the counts of simple words and those of the units marked by our method become quite different. The analysis presented is operational to a large extent.
Automatic syntactic analysis, the first step in a procedure for the fine-grained interpretation of texts by computer, relies on tools such as grammars and dictionaries. These tools, as currently available, are not sufficient. They must take an electronic form, which requires major revisions of their form and content. We present a linguistic methodology that has made it possible to build wide-coverage electronic tools for languages. These new tools bring out meaningful linguistic units, which leads to a substantial change in the analysis of the content of texts.
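The difference between counting simple words and counting meaningful units can be sketched with a greedy longest-match segmentation against a compound lexicon. The tiny lexicon, the sample sentence, and the function `segment` are assumptions made for illustration; the paper's own method and data are not reproduced here.

```python
# Hypothetical compound lexicon: each entry is a tuple of simple words.
COMPOUNDS = {
    ("machine", "translation"),
    ("in", "spite", "of"),
    ("as", "well", "as"),
}
MAX_LEN = max(len(c) for c in COMPOUNDS)


def segment(tokens):
    """Group tokens into units, preferring the longest known compound."""
    units = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first, down to two-word compounds.
        for n in range(min(MAX_LEN, len(tokens) - i), 1, -1):
            candidate = tuple(tokens[i:i + n])
            if candidate in COMPOUNDS:
                units.append(" ".join(candidate))
                i += n
                break
        else:
            # No compound starts here: the simple word is its own unit.
            units.append(tokens[i])
            i += 1
    return units


tokens = "in spite of the cost machine translation was funded".split()
units = segment(tokens)
print(len(tokens), "simple words;", len(units), "units")
```

Words such as 'of' or 'spite' disappear from the count as independent items once they are absorbed into a compound, so frequency tables built on units differ from those built on simple words.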
Early Machine Translation in France
When the ALPAC report was published (Pierce 1966), I was deeply convinced that MT made no sense in the absence of detailed and formalized language descriptions. MT development was an engineering task, combining computer programming and linguistics, two fields that had an autonomous life and from which MT developers had to start. For computer specialists, several new tasks were clear: new programming tools should be helpful, as well as new algorithmic tools and new types of memories. For linguists, several subfields of linguistics were involved: the synchronic description of each language, namely, its morphology and lexicon, its syntax and possibly its semantics. No inventory of the needs and of the resources had been made seriously. But it should have been obvious that ambiguity was the major problem, and that only a detailed exploration of the contexts of ambiguous words could bring a solution.
Authors such as L. Bloomfield, N. Chomsky and Z.S. Harris have provided the methodology for building cumulative lexicons and grammars. There is a price to pay: descriptions should be limited to reproducible phenomena, which is precisely what the above-mentioned authors have attempted to clarify.
Lexicon-Grammar and the syntactic analysis of French
A lexicon-grammar is constituted by the elementary sentences of a language. Instead of considering words as basic syntactic units to which grammatical information is attached, we use simple sentences (subject-verb-objects) as dictionary entries. Hence, a full dictionary item is a simple sentence with a description of the corresponding distributional and transformational properties.
The systematic study of French has led to an organization of its lexicon-grammar based on three main components:
- the lexicon-grammar of free sentences, that is, of sentences whose verb imposes selectional restrictions on its subject and complements (e.g. 'to fall', 'to eat', 'to watch'),
- the lexicon-grammar of frozen or idiomatic expressions (e.g. 'N takes N into account', 'N raises a question'),
- the lexicon-grammar of support verbs. These verbs do not have the common selectional restrictions, but more complex dependencies between subject and complement (e.g. 'to have', 'to make' in 'N has an impact on N', 'N makes a certain impression on N').
These three components interact in specific ways. We present the structure of the lexicon-grammar built for French and we discuss its algorithmic implications for parsing.
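A dictionary whose entries are simple sentences rather than words can be sketched as follows. The `Entry` class, the property names (`passive`, `nominal`), and their values are illustrative assumptions; only the example sentences come from the abstract, apart from 'N takes N', which is added here to show one verb form appearing in two components.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    pattern: str      # elementary sentence; N marks a noun-phrase slot
    component: str    # "free", "frozen" or "support"
    properties: dict = field(default_factory=dict)  # placeholder properties


LEXICON = [
    Entry("N eats N", "free", {"passive": True}),
    Entry("N takes N", "free", {"passive": True}),
    Entry("N takes N into account", "frozen", {"passive": True}),
    Entry("N raises a question", "frozen"),
    Entry("N has an impact on N", "support", {"nominal": "impact"}),
    Entry("N makes a certain impression on N", "support",
          {"nominal": "impression"}),
]


def entries_for_verb(verb_form):
    """Return every entry whose verb (second token of the pattern) matches."""
    return [e for e in LEXICON if e.pattern.split()[1] == verb_form]


for entry in entries_for_verb("takes"):
    print(entry.component, "|", entry.pattern)
```

A lookup on 'takes' returns both a free and a frozen entry, so a parser working from such a lexicon has to consider candidates from several components for the same verb form.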
Lexique-grammaire et adverbes : deux exemples (Lexicon-grammar and adverbs: two examples)
We analyze two constructions of French as sources of time adverbials such as il y a dix ans (ten years ago) and dates such as le 6 septembre (the 6th of September).
These examples illustrate the use of two general notions in the method of analysis:
The two constructions account in a natural fashion for the peculiarities of the adverbs studied.
The Construction of Local Grammars
Our programme is to use the model of W. Woods (1970) to attempt a full-scale analysis of the language. It could be viewed as an attempt to revive the Markovian model, but this would be wrong, because previous Markovian models were aimed at giving a global description of a language, whereas the model we advocate, which we call finite-state for short, is of a strictly local nature. In this perspective, the global nature of language results from the interaction of a multiplicity of local finite-state schemes, which we call finite-state local automata. To start with, we give elementary examples where the finite constraints can be exhaustively described in a local way, that is, without interference from the rest of the grammar.
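The notion of several independent local automata applied to a text can be sketched as a scanner that tries each small automaton at every position and keeps its longest match. This is a plain finite-state toy, not Woods's model; the two automata, the `<NUM>` and `<MONTH>` categories, the truncated month list, and the function names are all assumptions made for this example.

```python
MONTHS = {"january", "september", "december"}  # truncated sample list


def labels(word):
    """Transition labels a word can match: itself plus any category label."""
    out = {word}
    if word.isdigit():
        out.add("<NUM>")
    if word in MONTHS:
        out.add("<MONTH>")
    return out


# Each local automaton starts in state 0 and is described on its own.
LOCAL_AUTOMATA = {
    "date": {"final": {2},
             "delta": {(0, "<NUM>"): 1, (1, "<MONTH>"): 2}},
    "ago": {"final": {3},
            "delta": {(0, "<NUM>"): 1, (1, "year"): 2, (1, "years"): 2,
                      (2, "ago"): 3}},
}


def longest_match(automaton, tokens, start):
    """Return the end index of the longest match from `start`, or None."""
    state, best = 0, None
    for i in range(start, len(tokens)):
        nxt = None
        for lab in labels(tokens[i]):
            nxt = automaton["delta"].get((state, lab))
            if nxt is not None:
                break
        if nxt is None:  # no transition on this token: stop scanning
            break
        state = nxt
        if state in automaton["final"]:
            best = i + 1
    return best


def scan(tokens):
    """Apply every local automaton at every position of the text."""
    found = []
    for name, automaton in LOCAL_AUTOMATA.items():
        for start in range(len(tokens)):
            end = longest_match(automaton, tokens, start)
            if end is not None:
                found.append((name, " ".join(tokens[start:end])))
    return found


print(scan("they met on 6 september 10 years ago".split()))
```

Neither automaton knows about the other or about the rest of the sentence; each describes its own constraints exhaustively and locally, and an overall analysis would come from combining many such schemes.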
- …