
780 research outputs found

PRIME: A System for Multi-lingual Patent Retrieval

Authors: Fujii Atsushi; Fukui Masatoshi; Higuchi Shigeto; Ishikawa Tetsuya
Publication date: 01/01/2001

Given the growing number of patents filed in multiple countries, users are interested in retrieving patents across languages. We propose a multi-lingual patent retrieval system that translates a user query into the target language, searches a multilingual database for patents relevant to the query, and improves browsing efficiency by way of machine translation and clustering. Our system also extracts new translations from patent families consisting of comparable patents, to enhance the translation dictionary.

arXiv.org e-Print Archive

CiteSeerX

Mapping Science Based on Research Content Similarity

Authors: Egami Shusaku; Kawamura Takahiro; Matsumoto Naoya; Watanabe Katsutaro
Publication venue: IntechOpen
Publication date: 18/07/2018

Maps of science, which represent the structure of science, help us understand science and technology development. Research in scientometrics has therefore developed techniques for analyzing research activities and measuring their relationships; however, navigating the recent scientific landscape is still challenging, since conventional inter-citation and co-citation analysis is difficult to apply to recently published articles and ongoing projects. To characterize what is being attempted in the current scientific landscape, this article proposes a content-based method of locating research articles and projects in a multi-dimensional space using word/paragraph embedding. Specifically, to address the problem of an unclustered map, we introduced cluster vectors based on the information entropies of technical concepts. The experimental results showed that our method formed a clustered map from approximately 300k IEEE articles and NSF projects from 2012 to 2016. Finally, we confirmed that the formation of specific research areas can be captured as changes in the network structure.

IntechOpen

Crossref

Surrounding Word Sense Model for Japanese All-words Word Sense Disambiguation

Authors: Komiya Kanako; Kotani Yoshiyuki; Morita Hajime; Sasaki Minoru; Sasaki Yuto; Shinnou Hiroyuki
Publication date: 01/01/2015

Waseda University Repository

A survey on thesauri application in automatic natural language processing

Authors: Andrey Vasilyev; Ilya Paramonov; Ivan Shchitov; Ksenia Lagutina; Nadezhda Lagutina
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/11/2017

This paper investigates the efficiency of thesaurus use in popular natural language processing (NLP) fields: information retrieval and the analysis of texts and subject areas. A thesaurus is a natural language resource that models a subject area and can reflect a human expert's knowledge in many NLP tasks. The main goal of this survey is to determine how much thesauri affect processing quality and where they can provide better performance. We describe studies that use different types of thesauri, discuss the contribution of the thesaurus to the achieved results, and propose directions for future research in the thesaurus field.

Directory of Open Access Journals

A Survey of Multilingual Text Retrieval

Authors: Dorr Bonnie J.; Oard Douglas W.
Publication date: 15/10/1998

This report reviews the present state of the art in selection of texts in one language based on queries in another, a problem we refer to as "multilingual" text retrieval. Present applications of multilingual text retrieval systems are limited by the cost and complexity of developing and using the multilingual thesauri on which they are based and by the level of user training that is required to achieve satisfactory search effectiveness. A general model for multilingual text retrieval is used to review the development of the field and to describe modern production and experimental systems. The report concludes with some observations on the present state of the art and an extensive bibliography of the technical literature on multilingual text retrieval. The research reported herein was supported, in part, by Army Research Office contract DAAL03-91-C-0034 through Battelle Corporation, NSF NYI IRI-9357731, Alfred P. Sloan Research Fellow Award BR3336, and a General Research Board Semester Award. (Also cross-referenced as UMIACS-TR-96-19.)

Digital Repository at the University of Maryland

Theory and Applications for Advanced Text Mining

Publication venue: IntechOpen
Publication date: 20/04/2021

Thanks to advances in computer and web technologies, large amounts of text data can now be collected and stored easily, and such data are likely to contain useful knowledge. Text mining techniques for extracting that knowledge have been studied intensively since the late 1990s. Although many important techniques have been developed, the field continues to expand to meet needs arising from various application areas. This book consists of 9 chapters introducing advanced text mining techniques, ranging from relation extraction to techniques for under-resourced or less-resourced languages. I believe this book will offer new knowledge in the text mining field and help many readers open up new research directions.

Directory of Open Access Books (DOAB)

Using Case Prototypicality as a Semantic Primitive

Authors: Lee Ik-Hwan; Song Mansuk; Yang Dan-Hee
Publication venue: Chinese and Oriental Languages Information Processing Society
Publication date: 01/01/1998

Waseda University Repository