
    A Word Sense-Oriented User Interface for Interactive Multilingual Text Retrieval

    In this paper we present an interface for supporting a user in an interactive cross-language search process using semantic classes. In order to enable users to access multilingual information, different problems have to be solved: disambiguating and translating the query words, as well as categorizing and presenting the results appropriately. Therefore, we first give a brief introduction to word sense disambiguation, cross-language text retrieval and document categorization and finally describe recent achievements of our research towards an interactive multilingual retrieval system. We focus especially on the problem of browsing and navigation of the different word senses in one source and possibly several target languages. In the last part of the paper, we discuss the developed user interface and its functionalities in more detail

    Semantic Representation of Context for Description of Named Rivers in a Terminological Knowledge Base

    The description of named entities in terminological knowledge bases has never been addressed in any depth in terminology. Firm preconceptions, rooted in philosophy, about the purely referential function of proper names have presumably led to their inclusion in terminology resources being disparaged, despite the relevance of named entities having been highlighted by prominent figures in the discipline of terminology. Scholars from different branches of linguistics depart from the conservative stance on proper names and have foregrounded the need for a novel approach, more linguistic than philosophical, to describing proper names. Therefore, this paper proposed a linguistic and terminological approach to the study of named entities when used in scientific discourse, with the purpose of representing them in EcoLexicon, an environmental knowledge base designed according to the premises of Frame-based Terminology. We focused more specifically on named rivers (or potamonyms) mentioned in a coastal engineering corpus. Inclusion of named entities in terminological knowledge bases requires analyzing the context that surrounds them in specialized texts because these contexts convey specialized knowledge about named entities. For the semantic representation of context, this paper thus analyzed the local syntactic and semantic contexts that surrounded potamonyms in coastal engineering texts and described the semantic annotation of the predicate-argument structure of sentences where a potamonym was mentioned. The semantic variables annotated were the following: (1) semantic category of the arguments; (2) semantic role of the arguments; (3) semantic relation between the arguments; and (4) lexical domain of the verbs. This method yielded valuable insight into the different semantic roles that named rivers played, the entities and processes that participated in the events educed by potamonyms through verbs, and how they all interacted. Furthermore, since arguments are specialized terms and verbs are relational constructs, the analysis of argument structure led to the construction of semantic networks that depicted specialized knowledge about named rivers. These conceptual networks were then used to craft the thematic description of potamonyms. Accordingly, the semantic network and the thematic description not only constituted the representation of a potamonym in EcoLexicon, but also allowed the geographic contextualization of specialized concepts in the terminological resource.
    Funding: PID2020-118369GB-I00, Spanish Ministry of Science and Innovation; A-HUM-600-UGR20, Andalusian Ministry of Economy; FPU grant given by the Spanish Ministry of Educatio

    Secondary predication in Russian

    The paper makes two contributions to the semantic typology of secondary predicates. It provides an explanation of the fact that Russian has no resultative secondary predicates, relating this explanation to the interpretation of secondary predicates in English. And it relates depictive secondary predicates in Russian, which usually occur in the instrumental case, to other uses of the instrumental case in Russian, establishing here, too, a difference from English concerning the scope of the secondary predication phenomenon

    The Composite Nature of Interlanguage as a Developing System

    This paper explores the nature of interlanguage (IL) as a developing system with a focus on the abstract lexical structure underlying IL construction. The developing system of IL is assumed to be ‘composite’ in that in second language acquisition (SLA) several linguistic systems are in contact, each of which may contribute different amounts to the developing system. The lexical structure is assumed to be ‘abstract’ in that the mental lexicon contains abstract elements called ‘lemmas’, which contain information about individual lexemes, and lemmas in the bilingual mental lexicon are language-specific and are in contact in IL production. Based on the research findings, it concludes that language transfer in IL production should be understood as lemma transfer of the learner's first language (L1) lexical structure at three abstract levels: lexical-conceptual structure, predicate-argument structure, and morphological realization patterns, and IL construction is driven by an incompletely acquired abstract lexical structure of a target language (TL) item

    A Theme-Rewriting Approach for Generating Algebra Word Problems

    Texts present coherent stories that have a particular theme or overall setting, for example science fiction or western. In this paper, we present a text generation method called rewriting that edits existing human-authored narratives to change their theme without changing the underlying story. We apply the approach to math word problems, where it might help students stay more engaged by quickly transforming all of their homework assignments to the theme of their favorite movie without changing the math concepts that are being taught. Our rewriting method uses a two-stage decoding process, which proposes new words from the target theme and scores the resulting stories according to a number of factors defining aspects of syntactic, semantic, and thematic coherence. Experiments demonstrate that the final stories typically represent the new theme well while still testing the original math concepts, outperforming a number of baselines. We also release a new dataset of human-authored rewrites of math word problems in several themes.
    Comment: To appear EMNLP 201

    Verb similarity: comparing corpus and psycholinguistic data

    Similarity, which plays a key role in fields like cognitive science, psycholinguistics and natural language processing, is a broad and multifaceted concept. In this work we analyse how two approaches that belong to different perspectives, the corpus view and the psycholinguistic view, articulate similarity between verb senses in Spanish. Specifically, we compare the similarity between verb senses based on their argument structure, which is captured through semantic roles, with their similarity defined by word associations. We address the question of whether verb argument structure, which reflects the expression of the events, and word associations, which are related to the speakers' organization of the mental lexicon, shape similarity between verbs in a congruent manner, a topic which has not been explored previously. While we find significant correlations between verb sense similarities obtained from these two approaches, our findings also highlight some discrepancies between them and the importance of the degree of abstraction of the corpus annotation and psycholinguistic representations.

    What does semantic tiling of the cortex tell us about semantics?

    Recent use of voxel-wise modeling in cognitive neuroscience suggests that semantic maps tile the cortex. Although this impressive research establishes distributed cortical areas active during the conceptual processing that underlies semantics, it tells us little about the nature of this processing. While mapping concepts between Marr's computational and implementation levels to support neural encoding and decoding, this approach ignores Marr's algorithmic level, central for understanding the mechanisms that implement cognition, in general, and conceptual processing, in particular. Following decades of research in cognitive science and neuroscience, what do we know so far about the representation and processing mechanisms that implement conceptual abilities? Most basically, much is known about the mechanisms associated with: (1) features and frame representations, (2) grounded, abstract, and linguistic representations, (3) knowledge-based inference, (4) concept composition, and (5) conceptual flexibility. Rather than explaining these fundamental representation and processing mechanisms, semantic tiles simply provide a trace of their activity over a relatively short time period within a specific learning context. Establishing the mechanisms that implement conceptual processing in the brain will require more than mapping it to cortical (and sub-cortical) activity, with process models from cognitive science likely to play central roles in specifying the intervening mechanisms. More generally, neuroscience will not achieve its basic goals until it establishes algorithmic-level mechanisms that contribute essential explanations to how the brain works, going beyond simply establishing the brain areas that respond to various task conditions

    METAPHORICAL SWITCHING: A LINGUISTIC REPERTOIRE OF MUSLIM JAVANESE PRIESTS

    Metaphorical switching is one area of study in sociolinguistics. The term refers to a speaker using more than one language in an utterance with no obvious explanatory factor for doing so. It is mostly done by skilled bilinguals. Linguistic repertoire refers to a speaker's movement from one language variety to other varieties during an utterance event. It is commonly found where the speaker takes into account the appropriate setting, topic, addressee and other social factors. Metaphorical switching in a linguistic repertoire can be identified by using code-switching and code-mixing analysis. Applying this kind of analysis to a sermon is interesting to explore since a sole speaker fully dominates the whole speech event. A sermon is a monologue, where the audience

    The GENIE System: classifying documents by combining mixed-techniques

    Today, automatic text classification is still an open problem, and its implementation in companies and organizations with large volumes of data in text format is not a trivial matter. To achieve optimum results many parameters come into play, such as the language, the context, the level of knowledge of the issues discussed, the format of the documents, or the type of language that has been used in the documents to be classified. In this paper we describe a multi-language rule-based pipeline system, called GENIE, used for automatic document categorisation. We have used several business corpora in order to test the real capabilities of our proposal, and we have studied the results of applying different stages of the pipeline to the same data to test the influence of each step on the categorisation process. The results obtained by this system are very promising, and in fact, the GENIE system is already being used in real production environments with very good results