Bilingual Knowledge Extraction Using Chunk Alignment
In this paper, we propose a new method for effectively acquiring bilingual knowledge by exploiting the dependency relations among aligned chunks and words. We use a monolingual dependency parser together with chunk and word alignment to automatically obtain dependency parses of the target language. To reduce the computational complexity of structural alignment, we use a bilingual dictionary and adopt a divide-and-conquer strategy. By sharing the dependency relations of a given source sentence, we automatically obtain a dependency parse of the target sentence that is structurally consistent with the source sentence. Moreover, we extract bilingual knowledge, ranging from translation correspondences of singletons to surface verb subcategorization patterns, by exploiting the bilingual dependency relations. To retain only reliable entries, we apply a stepwise filtering method based on statistical tests.
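The core "sharing the dependency relations" step described above can be illustrated with a minimal sketch: given a source-side dependency parse and a word alignment, project each dependency edge whose endpoints are both aligned onto the target sentence. This is purely illustrative and not the paper's implementation; all names here are hypothetical.

```python
# Illustrative sketch of dependency projection through a word alignment.
# Not the paper's actual algorithm; function and variable names are invented.

def project_dependencies(src_deps, alignment):
    """Project source dependencies onto the target side.

    src_deps:  list of (head_idx, dep_idx) pairs over source tokens.
    alignment: dict mapping source token index -> target token index.
    Returns target-side (head, dep) pairs for edges whose endpoints
    are both aligned; unaligned edges are simply dropped.
    """
    tgt_deps = []
    for head, dep in src_deps:
        if head in alignment and dep in alignment:
            tgt_deps.append((alignment[head], alignment[dep]))
    return tgt_deps

# Toy example: "the cat sleeps" -> "le chat dort", one-to-one alignment.
src_deps = [(2, 0), (2, 1)]        # sleeps -> the, sleeps -> cat
alignment = {0: 0, 1: 1, 2: 2}
print(project_dependencies(src_deps, alignment))  # [(2, 0), (2, 1)]
```

In practice a projection like this must also handle one-to-many and null alignments, which is where the chunk-level grouping and the divide-and-conquer strategy in the abstract come in.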
Recommended from our members
Learning to Live with Machine Translation
Rapid advancements in technologies of text and image generation have increasingly put the perceived autonomy of human creativity under threat. Even before ChatGPT and other large language models sent such anxieties into overdrive, literary critics were arguing for a hermeneutics of automatic writing and revisiting long-held assumptions about artistic originality. Few, however, gave much thought to these models' quirky cousins—a family branch that once ruled over the utopian dreams invested in AI: machine translation (MT). This essay reflects on why translation has been lost in all the recent talk about these models and offers a necessary corrective. It considers what a critical response to MT might look like when reframed around an understanding of current technologies and a vision of MT as potential collaborator rather than human replacement. First, it offers an overview of current neural-based MT and the theories of translation that underwrite it. It then uses literary texts as a limit case for surveying the technology's most visible gaps, providing a deep, qualitative analysis of Japanese literary texts machine translated into English. Finally, it takes a speculative turn and considers what "good enough" machine translation of a large corpus of world literature might be good for in a future of ubiquitous and ever more accessible MT. The results hint at more immediate ways that MT invites inquiry into the present conditions of world literature, but also point to a future where the entanglement of human translation and agency with the material agency of the technology brings forth potentials in both.
Modeling information structure in a cross-linguistic perspective
This study makes substantial contributions to both the theoretical and computational treatment of information structure, with a specific focus on creating natural language processing applications such as multilingual machine translation systems. The study first provides cross-linguistic findings with regard to information structure meanings and markings. Building upon these findings, the proposed model represents information structure within the HPSG/MRS framework using Individual Constraints. The primary goal is to create a multilingual grammar model of information structure for the LinGO Grammar Matrix system. The study explores the construction of a grammar library for creating customized grammars incorporating information structure and illustrates how the information structure-based model improves the performance of transfer-based machine translation.
Automatic generation of parallel treebanks: an efficient unsupervised system
The need for syntactically annotated data for use in natural language processing has increased dramatically in recent years. This is true especially for parallel treebanks, of which very few exist. The ones that exist are mainly hand-crafted and too small for reliable use in data-oriented applications. In this work I introduce a novel open-source platform for the fast and robust automatic generation of parallel treebanks through sub-tree alignment, using a limited amount of external resources. The intrinsic and extrinsic evaluations that I undertook demonstrate that my system is a feasible alternative to the manual annotation of parallel treebanks. Therefore, I expect the presented platform to help boost research in the field of syntax-augmented machine translation and lead to advancements in other fields where parallel treebanks can be employed.
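One simple criterion a sub-tree aligner of the kind described above might use is lexical overlap between candidate source and target spans, scored through a bilingual dictionary. The sketch below is a hypothetical illustration of that idea, not the thesis's actual system; all names are invented.

```python
# Hypothetical lexical-overlap score for a candidate sub-tree (span) pair,
# one of several signals a sub-tree aligner could combine. Illustrative only.

def span_overlap_score(src_words, tgt_words, bilingual_dict):
    """Fraction of source words whose dictionary translation appears
    in the target span. Returns a value in [0.0, 1.0]."""
    if not src_words:
        return 0.0
    hits = sum(
        1 for w in src_words
        if any(t in tgt_words for t in bilingual_dict.get(w, []))
    )
    return hits / len(src_words)

# Toy English-French example.
bilingual_dict = {"cat": ["chat"], "black": ["noir", "noire"]}
print(span_overlap_score(["black", "cat"], ["chat", "noire"], bilingual_dict))  # 1.0
```

A real aligner would combine such scores over tree spans with structural constraints (e.g. that aligned sub-trees nest consistently), which is what makes the task harder than plain word alignment.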
Korean Grammar Using TAGs
This paper addresses various issues related to representing the Korean language using Tree Adjoining Grammars. Topics covered include Korean grammar using TAGs, machine translation between Korean and English using Synchronous Tree Adjoining Grammars (STAGs), handling scrambling using Multi-Component TAGs (MC-TAGs), and recovering empty arguments. The data for the parsing come from US military communication messages.
JTEC panel report on machine translation in Japan
The goal of this report is to provide an overview of the state of the art of machine translation (MT) in Japan and to provide a comparison between Japanese and Western technology in this area. The term 'machine translation', as used here, includes both the science and technology required for automating the translation of text from one human language to another. Machine translation is viewed in Japan as an important strategic technology that is expected to play a key role in Japan's increasing participation in the world economy. MT is seen in Japan as important both for assimilating information into Japanese as well as for disseminating Japanese information throughout the world. Most of the MT systems now available in Japan are transfer-based systems. The majority of them exploit a case-frame representation of the source text as the basis of the transfer process. There is a gradual movement toward the use of deeper semantic representations, and some groups are beginning to look at interlingua-based systems.
CLiFF Notes: Research In Natural Language Processing at the University of Pennsylvania
CLiFF is the Computational Linguists' Feedback Forum. We are a group of students and faculty who gather once a week to hear a presentation and discuss work currently in progress. The 'feedback' in the group's name is important: we are interested in sharing ideas, in discussing ongoing research, and in bringing together work done by the students and faculty in Computer Science and other departments.
However, there are only so many presentations which we can have in a year. We felt that it would be beneficial to have a report which would have, in one place, short descriptions of the work in Natural Language Processing at the University of Pennsylvania. This report, then, is a collection of abstracts from both faculty and graduate students, in Computer Science, Psychology and Linguistics. We want to stress the close ties between these groups, as one of the things that we pride ourselves on here at Penn is the communication among different departments and the inter-departmental work.
Rather than try to summarize the varied work currently underway at Penn, we suggest reading the abstracts to see how the students and faculty themselves describe their work. The report illustrates the diversity of interests among the researchers here, as well as explaining the areas of common interest. In addition, since it was our intent to put together a document that would be useful both inside and outside of the university, we hope that this report will explain to everyone some of what we are about.
A Survey on Semantic Processing Techniques
Semantic processing is a fundamental research domain in computational linguistics. In the era of powerful pre-trained language models and large language models, the advancement of research in this domain appears to be decelerating. However, the study of semantics is multi-dimensional in linguistics, and the research depth and breadth of computational semantic processing can be largely improved with new technologies. In this survey, we analyze five semantic processing tasks, namely word sense disambiguation, anaphora resolution, named entity recognition, concept extraction, and subjectivity detection. We study relevant theoretical research in these fields, advanced methods, and downstream applications. We connect the surveyed tasks with downstream applications because this may inspire future scholars to fuse these low-level semantic processing tasks with high-level natural language processing tasks. The review of theoretical research may also inspire new tasks and technologies in the semantic processing domain. Finally, we compare the different semantic processing techniques and summarize their technical trends, application trends, and future directions.
Comment: Published at Information Fusion, Volume 101, 2024, 101988, ISSN 1566-2535. The equal contribution mark is missing in the published version due to the publication policies. Please contact Prof. Erik Cambria for details.