
    Web ontology representation and reasoning via fragments of set theory

    In this paper we use results from Computable Set Theory as a means to represent and reason about description logics and rule languages for the semantic web. Specifically, we introduce the description logic $\mathcal{DL}\langle 4LQS^R\rangle(\mathbf{D})$, admitting features such as min/max cardinality constructs on the left-hand/right-hand side of inclusion axioms, role chain axioms, and datatypes, which turns out to be quite expressive if compared with $\mathcal{SROIQ}(\mathbf{D})$, the description logic underpinning the Web Ontology Language OWL. Then we show that the consistency problem for $\mathcal{DL}\langle 4LQS^R\rangle(\mathbf{D})$-knowledge bases is decidable by reducing it, through a suitable translation process, to the satisfiability problem of the stratified fragment $4LQS^R$ of set theory, involving variables of four sorts and a restricted form of quantification. We also prove that, under suitable and not very restrictive constraints, the consistency problem for $\mathcal{DL}\langle 4LQS^R\rangle(\mathbf{D})$-knowledge bases is NP-complete. Finally, we provide a $4LQS^R$-translation of rules belonging to the Semantic Web Rule Language (SWRL).
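    For orientation, the axiom types named above have the following shapes in standard description-logic notation. These are illustrative examples of the constructs mentioned, not axioms taken from the paper:

        \[ {\geq}2\,R.C \sqsubseteq D \qquad \text{(min-cardinality construct on the left-hand side)} \]
        \[ C \sqsubseteq {\leq}1\,R.D \qquad \text{(max-cardinality construct on the right-hand side)} \]
        \[ R_1 \circ R_2 \sqsubseteq S \qquad \text{(role chain axiom)} \]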

    City of Lions

    In the course of the past two decades, the city of Lviv has enjoyed close attention as well as a "close reading" in literary and scholarly texts on the city. This attention falls easily into two categories: (a) scholars producing academic studies of the city and (b) classical literary works on the city, composed in various languages, finally becoming available to a broader readership through translation into English. The book under discussion falls into the second category. It must be pointed out right away that this is an unusual book, a truly successful combination of two essays, that deserves proper attention. Under one cover, the reader has the opportunity to enjoy two pieces linked together by the image of the city of Lviv. The first is an elegiac essay, Mój Lwów (My Lviv), by the Polish writer Józef Wittlin. (This first English-language translation is by Antonina Lloyd-Jones.) The second essay, by the internationally recognized British author Philippe Sands, looks at Lviv after 2010; it uses Wittlin's work as a springboard for Sands' own explorations of the city, or of what is left of the city from that period, mixed with a personal narrative.

    Masked Language Model Scoring

    Pretrained masked language models (MLMs) require finetuning for most NLP tasks. Instead, we evaluate MLMs out of the box via their pseudo-log-likelihood scores (PLLs), which are computed by masking tokens one by one. We show that PLLs outperform scores from autoregressive language models like GPT-2 in a variety of tasks. By rescoring ASR and NMT hypotheses, RoBERTa reduces an end-to-end LibriSpeech model's WER by 30% relative and adds up to +1.7 BLEU on state-of-the-art baselines for low-resource translation pairs, with further gains from domain adaptation. We attribute this success to PLL's unsupervised expression of linguistic acceptability without a left-to-right bias, greatly improving on scores from GPT-2 (+10 points on island effects, NPI licensing in BLiMP). One can finetune MLMs to give scores without masking, enabling computation in a single inference pass. In all, PLLs and their associated pseudo-perplexities (PPPLs) enable plug-and-play use of the growing number of pretrained MLMs; e.g., we use a single cross-lingual model to rescore translations in multiple languages. We release our library for language model scoring at https://github.com/awslabs/mlm-scoring. (Comment: ACL 2020 camera-ready, presented July 2020.)
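    The mask-one-token-at-a-time procedure described above is straightforward to reproduce. Below is a minimal sketch of PLL computation with the Hugging Face transformers library; it is an independent reconstruction of the idea, not the authors' mlm-scoring code, and the choice of roberta-base is an assumption:

        # Minimal pseudo-log-likelihood (PLL) sketch using Hugging Face transformers.
        # Independent reconstruction of the idea, not the authors' mlm-scoring library.
        import torch
        from transformers import AutoTokenizer, AutoModelForMaskedLM

        tokenizer = AutoTokenizer.from_pretrained("roberta-base")
        model = AutoModelForMaskedLM.from_pretrained("roberta-base")
        model.eval()

        def pseudo_log_likelihood(sentence: str) -> float:
            """Sum of log P(token | rest of sentence), masking one token at a time."""
            ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
            total = 0.0
            # Skip the special tokens at positions 0 and -1 (<s> and </s>).
            for i in range(1, len(ids) - 1):
                masked = ids.clone()
                masked[i] = tokenizer.mask_token_id
                with torch.no_grad():
                    logits = model(masked.unsqueeze(0)).logits[0, i]
                log_probs = torch.log_softmax(logits, dim=-1)
                total += log_probs[ids[i]].item()
            return total

        # Higher PLL = more "acceptable" to the MLM; usable for rescoring hypotheses.
        print(pseudo_log_likelihood("The cat sat on the mat."))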

    Transition-based Semantic Role Labeling with Pointer Networks

    Semantic role labeling (SRL) focuses on recognizing the predicate-argument structure of a sentence and plays a critical role in many natural language processing tasks such as machine translation and question answering. Practically all available methods do not perform full SRL, since they rely on pre-identified predicates, and most of them follow a pipeline strategy, using specific models for undertaking one or several SRL subtasks. In addition, previous approaches have a strong dependence on syntactic information to achieve state-of-the-art performance, even though accurate syntactic trees are themselves hard to produce. These simplifications and requirements make the majority of SRL systems impractical for real-world applications. In this article, we propose the first transition-based SRL approach that is capable of completely processing an input sentence in a single left-to-right pass, neither leveraging syntactic information nor resorting to additional modules. Thanks to our implementation based on Pointer Networks, full SRL can be done accurately and efficiently in $O(n^2)$ time, achieving the best performance to date on the majority of languages from the CoNLL-2009 shared task. (Comment: Final peer-reviewed manuscript accepted for publication in Knowledge-Based Systems.)
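    To make the pointer-network mechanism concrete, here is a minimal sketch of the core pointing step, in which a decoder state attends over encoder states and selects one input position. The class name, dimensions, and usage are illustrative assumptions, not the paper's implementation:

        # Minimal pointer-network "pointing" step (illustrative, not the paper's code).
        import torch
        import torch.nn as nn

        class PointerAttention(nn.Module):
            def __init__(self, hidden: int):
                super().__init__()
                self.w_enc = nn.Linear(hidden, hidden, bias=False)
                self.w_dec = nn.Linear(hidden, hidden, bias=False)
                self.v = nn.Linear(hidden, 1, bias=False)

            def forward(self, dec_state, enc_states):
                # dec_state: (hidden,), enc_states: (seq_len, hidden)
                scores = self.v(torch.tanh(
                    self.w_enc(enc_states) + self.w_dec(dec_state)
                )).squeeze(-1)                      # (seq_len,)
                return scores.argmax().item()       # index of the word pointed at

        # One pointing step per decoder decision, each attending over all n
        # positions, yields the O(n^2) single left-to-right pass noted above.
        attn = PointerAttention(hidden=128)
        enc = torch.randn(10, 128)   # encoder states for a 10-word sentence
        dec = torch.randn(128)       # current decoder state
        print(attn(dec, enc))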

    Direction of Translation from English to Japanese

    Translation of written texts from English to Japanese, which poses numerous challenges because of the contrasting word order of the two languages, may be facilitated by right-to-left processing of the original sentence. One drawback of this method is that students may become reliant on processing English in this manner and therefore be less inclined to read English in its natural order. A further disadvantage of processing English from right to left is that it encourages learners to process the text visually rather than phonologically. Their mental representation of the text may be in Japanese rather than English. A questionnaire was given to 115 Japanese university students to elicit their experiences of translation and their preferences for the direction of translation. Many of them believe right-to-left translation helps achieve a detailed and accurate understanding of the text but limits their acquisition of other essential English skills.

    Because the word order of English and Japanese differs so greatly, reversing the word order is considered an easy method of translation. This method has a drawback, however: students become so preoccupied with reversing the word order of the English text that they no longer attend to English's natural order. Moreover, this word-order-reversing method of translation reinforces the tendency to understand by reading and hinders understanding English by ear; that is, the text is mentally represented in Japanese rather than in English. From the results of a survey of 115 university students about their translation experience, we can draw implications for the teaching of translation. The students believe that translating with reversed word order leads to a more accurate and more detailed understanding of the text, even though it limits their acquisition of essential English language skills.

    Neural overlap of L1 and L2 semantic representations across visual and auditory modalities: a decoding approach

    This study investigated whether brain activity in Dutch-French bilinguals during semantic access to concepts from one language could be used to predict neural activation during access to the same concepts from another language, in different language modalities/tasks. This was tested using multi-voxel pattern analysis (MVPA), within and across language comprehension (word listening and word reading) and production (picture naming). It was possible to identify the picture or word named, read or heard in one language (e.g. maan, meaning moon) based on the brain activity in a distributed bilateral brain network while, respectively, naming, reading or listening to the picture or word in the other language (e.g. lune). The brain regions identified differed across tasks. During picture naming, brain activation in the occipital and temporal regions allowed concepts to be predicted across languages. During word listening and word reading, across-language predictions were observed in the rolandic operculum and several motor-related areas (pre- and postcentral regions, the cerebellum). In addition, across-language predictions during reading were identified in regions typically associated with semantic processing (left inferior frontal, middle temporal cortex, right cerebellum and precuneus) and visual processing (inferior and middle occipital regions and calcarine sulcus). Furthermore, across modalities and languages, the left lingual gyrus showed semantic overlap across production and word reading. These findings support the idea of at least partially language- and modality-independent semantic neural representations.
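    The cross-language decoding logic can be sketched in a few lines of scikit-learn: train a classifier on voxel patterns from trials in one language, then test on trials of the same concepts in the other. The variable names, data shapes, and synthetic data below are illustrative assumptions, not the study's pipeline:

        # Cross-language MVPA decoding sketch (illustrative, not the study's pipeline).
        # X_dutch / X_french: (n_trials, n_voxels) activation patterns per trial;
        # y_*: concept labels, e.g. the same label for trials of "maan" and "lune".
        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler

        rng = np.random.default_rng(0)
        n_trials, n_voxels, n_concepts = 60, 500, 6
        y_dutch = rng.integers(0, n_concepts, n_trials)
        y_french = rng.integers(0, n_concepts, n_trials)
        X_dutch = rng.normal(size=(n_trials, n_voxels)) + y_dutch[:, None] * 0.1
        X_french = rng.normal(size=(n_trials, n_voxels)) + y_french[:, None] * 0.1

        # Train on Dutch trials, predict concepts of French trials: above-chance
        # accuracy indicates shared (language-independent) semantic patterns.
        clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
        clf.fit(X_dutch, y_dutch)
        print("cross-language accuracy:", clf.score(X_french, y_french))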

    Tree transducers, L systems, and two-way machines

    A relationship between parallel rewriting systems and two-way machines is investigated. Restrictions on the "copying power" of these devices endow them with rich structuring and give insight into the issues of determinism, parallelism, and copying. Among the parallel rewriting systems considered are the top-down tree transducer, the generalized syntax-directed translation scheme, and the ET0L system; among the two-way machines are the tree-walking automaton, the two-way finite-state transducer, and (generalizations of) the one-way checking stack automaton. The relationship of these devices to macro grammars is also considered. An effort is made to provide a systematic survey of a number of existing results.
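    For readers unfamiliar with the formalism, here is a minimal sketch of a top-down tree transducer. The tuple-based tree encoding and the example rules are illustrative assumptions, not drawn from the survey:

        # Minimal top-down tree transducer sketch (illustrative encoding).
        # Trees are tuples: ("f", child1, child2, ...); leaves are ("a",).

        def transduce(state, tree, rules):
            """Apply rules of the form (state, symbol) -> template.

            A template is a tree whose leaves may be ("q", i), meaning
            'continue in state q on the i-th subtree'. The copying power
            discussed above comes from a template mentioning the same
            subtree index more than once.
            """
            symbol, children = tree[0], tree[1:]
            template = rules[(state, symbol)]

            def expand(t):
                if isinstance(t, tuple) and len(t) == 2 and isinstance(t[1], int):
                    q, i = t
                    return transduce(q, children[i], rules)
                return (t[0],) + tuple(expand(c) for c in t[1:])

            return expand(template)

        # Example: a rule that copies its first subtree twice.
        rules = {
            ("q", "f"): ("g", ("q", 0), ("q", 0)),
            ("q", "a"): ("a",),
        }
        print(transduce("q", ("f", ("a",)), rules))  # -> ('g', ('a',), ('a',))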

    Limitations of Cross-Lingual Learning from Image Search

    Cross-lingual representation learning is an important step in making NLP scale to all the world's languages. Recent work on bilingual lexicon induction suggests that it is possible to learn cross-lingual representations of words based on similarities between images associated with these words. However, that work focused on the translation of selected nouns only. In our work, we investigate whether the meaning of other parts of speech, in particular adjectives and verbs, can be learned in the same way. We also experiment with combining the representations learned from visual data with embeddings learned from textual data. Our experiments across five language pairs indicate that previous work does not scale to the problem of learning cross-lingual representations beyond simple nouns.
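    The image-based representation idea can be sketched as follows: represent each word by the mean CNN feature of images retrieved for it, then compare words across languages by cosine similarity. The model choice (ResNet-18) and the assumption that images have already been downloaded per word are mine, not the paper's setup:

        # Image-grounded cross-lingual word similarity sketch (illustrative setup).
        # Assumes a handful of image files per word, fetched from image search.
        import numpy as np
        import torch
        from PIL import Image
        from torchvision import models, transforms

        preprocess = transforms.Compose([
            transforms.Resize(256), transforms.CenterCrop(224),
            transforms.ToTensor(),
            transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225]),
        ])
        cnn = models.resnet18(weights="IMAGENET1K_V1")
        cnn.fc = torch.nn.Identity()  # use penultimate-layer features
        cnn.eval()

        def word_vector(image_paths):
            """Mean CNN feature over the images retrieved for one word."""
            feats = []
            with torch.no_grad():
                for p in image_paths:
                    x = preprocess(Image.open(p).convert("RGB")).unsqueeze(0)
                    feats.append(cnn(x).squeeze(0).numpy())
            return np.mean(feats, axis=0)

        def cosine(u, v):
            return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

        # High cosine similarity between, e.g., English "dog" images and German
        # "Hund" images suggests the words are translations; the abstract's point
        # is that this works far better for nouns than for adjectives and verbs.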

    Word Representation Models for Morphologically Rich Languages in Neural Machine Translation

    Dealing with the complex word forms in morphologically rich languages is an open problem in language processing, and is particularly important in translation. In contrast to most modern neural systems of translation, which discard the identities of rare words, in this paper we propose several architectures for learning word representations from character- and morpheme-level word decompositions. We incorporate these representations in a novel machine translation model which jointly learns word alignments and translations via a hard attention mechanism. Evaluating on translation from several morphologically rich languages into English, we show consistent improvements over strong baseline methods, of between 1 and 1.5 BLEU points.
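    One of the decomposition strategies the abstract describes, composing a word representation from its characters, can be sketched with a BiLSTM: read the word character by character and concatenate the final hidden states into a word vector, so rare and unseen word forms still get meaningful representations. The class name and dimensions below are illustrative assumptions, not the paper's exact architecture:

        # Character-level word representation sketch (illustrative architecture).
        import torch
        import torch.nn as nn

        class CharWordEncoder(nn.Module):
            def __init__(self, n_chars: int, char_dim: int = 32, word_dim: int = 128):
                super().__init__()
                self.embed = nn.Embedding(n_chars, char_dim)
                self.lstm = nn.LSTM(char_dim, word_dim // 2,
                                    bidirectional=True, batch_first=True)

            def forward(self, char_ids: torch.Tensor) -> torch.Tensor:
                # char_ids: (1, word_length); returns a (1, word_dim) word vector
                # built from the final forward and backward LSTM states.
                _, (h, _) = self.lstm(self.embed(char_ids))
                return torch.cat([h[0], h[1]], dim=-1)

        # Toy usage: encode "unhappiness" as a sequence of character ids.
        chars = sorted(set("unhappiness"))
        ids = torch.tensor([[chars.index(c) for c in "unhappiness"]])
        enc = CharWordEncoder(n_chars=len(chars))
        print(enc(ids).shape)  # torch.Size([1, 128])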