20,947 research outputs found

    PRIME: A System for Multi-lingual Patent Retrieval

    Given the growing number of patents filed in multiple countries, users are interested in retrieving patents across languages. We propose a multi-lingual patent retrieval system, which translates a user query into the target language, searches a multilingual database for patents relevant to the query, and improves browsing efficiency by way of machine translation and clustering. Our system also extracts new translations from patent families consisting of comparable patents, to enhance the translation dictionary.

    Semantic Matching Using Ontology in Multilingual Environment

    The tremendous increase in data usage over the past few years, and the ease with which things are available across the globe at any time, have led to the advancement of multilingual databases. Storage, retrieval and archiving of data for multilingual systems has been a challenge. Our research project, “Semantic Matching using Ontology in Multilingual Environment”, is an extension that looks into addressing multi-lingual data. The report focuses on the design and implementation of a multilingual system. It comprises two main components: (i) Cross Lingual Information Retrieval, and (ii) Indian Language to Indian Language Machine Translation. The context is large scale natural language processing applications in the areas of Cross Lingual IR and Machine Translation, for which a model of a multilingual dictionary is established. In contrast to a traditional monolingual or bilingual dictionary, the model uses the core concept of Synonym Groupings (synsets) as a way to connect different languages in a crisp and efficient manner.
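The synset idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the synset identifiers, the language codes, and the `translate` helper are all hypothetical, and each synset is simplified to hold a single word per language, whereas a real synset groups several synonyms.

```python
from typing import Dict, Optional, Tuple

# Hypothetical synset table: one concept ID links the words for that
# concept in every language (simplified to one word per language).
SYNSETS: Dict[str, Dict[str, str]] = {
    "water.n.01": {"en": "water", "hi": "पानी", "mr": "पाणी"},
    "house.n.01": {"en": "house", "hi": "घर", "mr": "घर"},
}


def build_index(synsets: Dict[str, Dict[str, str]]) -> Dict[Tuple[str, str], str]:
    """Map each (language, word) pair back to its synset ID."""
    index: Dict[Tuple[str, str], str] = {}
    for synset_id, words in synsets.items():
        for lang, word in words.items():
            index[(lang, word)] = synset_id
    return index


INDEX = build_index(SYNSETS)


def translate(word: str, source: str, target: str) -> Optional[str]:
    """Translate by pivoting through the shared synset, not a bilingual table."""
    synset_id = INDEX.get((source, word))
    if synset_id is None:
        return None  # word not covered by any synset
    return SYNSETS[synset_id].get(target)
```

Because every language attaches to the same concept ID, adding a language means adding one entry per synset rather than one bilingual dictionary per language pair, which is presumably what makes the approach attractive for Indian-language-to-Indian-language translation.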

    New Directions In Database-Systems Research and Development

    Prepared for: Chief of Naval Research, Arlington, VA 22217
    In this paper, three new directions in database-systems research and development are identified. One new direction is the emergence of multi-lingual database systems, in which a single database system can execute many transactions written in different data languages and support many databases structured in correspondingly different data models. Thus, a multi-lingual database system allows old transactions and existing databases to be migrated to the new system, the user to explore the strong features of the various data languages and data models in the same system, hardware upgrades to be focused on a single system instead of a heterogeneous collection of database systems, and the database application to cover wider types of transactions and interaction in the same environment. A second direction is the emphasis on multi-backend database systems, in which the database system is configured with a number of microprocessor-based processing units and their disk subsystems. These processing units and disk subsystems are called database backends. The unique characteristics of the backends are that the number of backends is variable, the system software in all of the backends is identical, and the performance and capacity of the system grow in proportion to the number of backends. Thus, for the first time, a multi-backend database system enables the user to relate the amount of hardware used (i.e., the number of backends) to the degree of performance gain and capacity growth of the system. The third direction is the possibility of multi-host database systems, in which a single database system can communicate with a variable number and heterogeneous collection of mainframes in several different data languages and allow the mainframes to share the common database store and access. This paper attempts to articulate the background, benefits, requirements and architectures of these new types of database system, namely, the multi-lingual, the multi-backend, and the multi-host database systems.
    DoD STARS Program and from the Office of Naval Research.
    Approved for public release; distribution is unlimited.

    MycoBank gearing up for new horizons.

    MycoBank, a registration system for fungi established in 2004 to capture all taxonomic novelties, acts as a coordination hub between repositories such as Index Fungorum and Fungal Names. Since January 2013, registration of fungal names has been a mandatory requirement for valid publication under the International Code of Nomenclature for algae, fungi and plants (ICN). This review explains the database innovations that have been implemented over the past few years, and discusses new features such as advanced queries, registration of typification events (MBT numbers for lecto-, epi- and neotypes), the multi-lingual database interface, the nomenclature discussion forum, annotation system, and web services with links to third parties. MycoBank has also introduced novel identification services, linking DNA sequence data to numerous related databases to enable intelligent search queries. Although MycoBank fills an important void for taxon registration, challenges remain for the future: to improve links between taxonomic names and DNA data, and to introduce a formal system for naming fungi known from DNA sequence data only. To further improve the quality of MycoBank data, remote access will now allow registered mycologists to act as MycoBank curators, using Citrix software.

    Application of Out-Of-Language Detection To Spoken-Term Detection

    This paper investigates the detection of English spoken terms in a conversational multi-language scenario. The speech is processed using a large vocabulary continuous speech recognition system. The recognition output is represented in the form of word recognition lattices, which are then searched for the required terms. Because the input may contain multi-lingual speech segments, the spoken term detection (STD) system is combined with a module performing out-of-language detection to adjust its confidence scores. First, experimental results of spoken term detection are provided on the conversational telephone speech database distributed by NIST in 2006. Then, the system is evaluated on a multi-lingual database with and without the out-of-language detection module, where we are only interested in detecting English terms (stored in the index database). Several strategies to combine these two systems in an efficient way are proposed and evaluated. Around 7% relative improvement over a stand-alone STD system is achieved.
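The abstract above does not say which combination strategies were evaluated, so the following is only one plausible sketch of how an out-of-language score might adjust a detection confidence. The function name, the assumption that both scores lie in [0, 1], and the multiplicative rule itself are assumptions for illustration, not the paper's method.

```python
def adjust_confidence(std_score: float, p_out_of_language: float) -> float:
    """Scale an STD confidence by the estimated chance the segment is in-language.

    Assumed rule (not from the paper): adjusted = score * (1 - P(out-of-language)).
    Both inputs are assumed to be probabilities in [0, 1].
    """
    if not 0.0 <= std_score <= 1.0:
        raise ValueError("std_score must be in [0, 1]")
    if not 0.0 <= p_out_of_language <= 1.0:
        raise ValueError("p_out_of_language must be in [0, 1]")
    return std_score * (1.0 - p_out_of_language)
```

Under this rule a detection in a segment judged certainly English keeps its score, while one in a segment judged certainly non-English is suppressed to zero.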

    Cross model access in the multi-lingual, multi-model database management system.

    Relational, hierarchical, network, functional, and object oriented databases each support a corresponding query language: SQL, DL/I, CODASYL-DML, DAPLEX, and OO-DML, respectively. However, each database type may be accessed only by its own language. The goal of M2DBMS is to provide a heterogeneous environment in which any supported database is accessible by any supported query language. This is known as cross model access capability. In this thesis, relational to object oriented database cross model access is successfully implemented for a test database. Data from the object oriented database EWIROODB is accessed and retrieved, using an SQL query from the relational database EWIROODB. One problem is that the two interfaces (object oriented and relational) create catalog files with different formats, which initially makes cross model access impossible. In this thesis the catalog file created by the relational interface is used, and the cross model access capability is achieved. The object oriented catalog file must be identical to the relational one. Therefore, work yet to be done is to write a program that automatically reformats the object oriented catalog file into an equivalent relational catalog file.
    http://archive.org/details/crossmodelaccess00anas
    Lt, Hellenic Navy
    Approved for public release; distribution is unlimited.

    XL-NBT: A Cross-lingual Neural Belief Tracking Framework

    Task-oriented dialog systems are becoming pervasive, and many companies heavily rely on them to complement human agents for customer service in call centers. With globalization, the need for providing cross-lingual customer support becomes more urgent than ever. However, cross-lingual support poses great challenges: it requires a large amount of additional annotated data from native speakers. In order to bypass the expensive human annotation and achieve the first step towards the ultimate goal of building a universal dialog system, we set out to build a cross-lingual state tracking framework. Specifically, we assume that there exists a source language with dialog belief tracking annotations while the target languages have no annotated dialog data of any form. Then, we pre-train a state tracker for the source language as a teacher, which is able to exploit easy-to-access parallel data. We then distill and transfer its knowledge to the student state tracker in target languages. We specifically discuss two types of common parallel resources, bilingual corpora and bilingual dictionaries, and design different transfer learning strategies accordingly. Experimentally, we successfully use an English state tracker as the teacher to transfer its knowledge to both Italian and German trackers and achieve promising results.
    Comment: 13 pages, 5 figures, 3 tables, accepted to EMNLP 2018 conference