917 research outputs found

    Knowledge Representation and WordNets

    Knowledge itself is a representation of “real facts”. Knowledge is a logical model that presents facts from “the real world” which can be expressed in a formal language. Representation means the construction of a model of some part of reality. Knowledge representation is relevant to both cognitive science and artificial intelligence. In cognitive science it expresses the way people store and process information. In the AI field the goal is to store knowledge in a way that permits intelligent programs to represent information as closely as possible to human intelligence. Knowledge Representation refers to the formal representation of knowledge intended to be stored and processed by computers and used to draw conclusions. Examples of applications are expert systems, machine translation systems, computer-aided maintenance systems and information retrieval systems (including database front-ends). Keywords: knowledge, representation, ai models, databases, cams

    A Word Sense-Oriented User Interface for Interactive Multilingual Text Retrieval

    In this paper we present an interface for supporting a user in an interactive cross-language search process using semantic classes. To give users access to multilingual information, several problems have to be solved: disambiguating and translating the query words, as well as categorizing and presenting the results appropriately. We therefore first give a brief introduction to word sense disambiguation, cross-language text retrieval and document categorization, and then describe recent achievements of our research towards an interactive multilingual retrieval system. We focus especially on the problem of browsing and navigating the different word senses in one source language and possibly several target languages. In the last part of the paper, we discuss the developed user interface and its functionalities in more detail.

    Search techniques in electronic dictionaries: a classification for translators

    Translators, and language professionals in general, have long claimed that dictionaries are deficient, especially regarding access and updating of content. Some authors have also noted that these deficiencies are compounded by the fact that language professionals do not receive (proper) training in dictionary use, and therefore do not fully benefit from them. Electronic dictionaries include new search capabilities, not found in traditional dictionaries, that could meet users’ needs. However, the diversity of search options in electronic dictionaries makes their classification difficult, and consequently hinders training in their use. Systematization of search techniques in electronic dictionaries would favor the teaching and learning process, and could also facilitate the task of lexicographers and terminographers in the creation of new and more standardized electronic dictionaries. In this paper we classify search techniques in electronic dictionaries by focusing on three elements that are common to every search and that, taken together, encompass all the search possibilities we have observed in electronic dictionaries.

    Natural language processing

    Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST Chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems (text summarization, information extraction, information retrieval, etc.), including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of the WWW and digital libraries; and (iv) evaluation of NLP systems.

    Xu: An Automated Query Expansion and Optimization Tool

    The exponential growth of information on the Internet makes it a big challenge for information retrieval systems to generate relevant results. Novel approaches are required to reformat or expand user queries to generate a satisfactory response and increase recall and precision. Query expansion (QE) is a technique to broaden users' queries by introducing additional tokens or phrases based on some semantic similarity metrics. The tradeoff is the added computational complexity of finding semantically similar words and a possible increase in noise in information retrieval. Despite several research efforts on this topic, QE has not yet been explored enough, and more work is needed on similarity matching and composition of query terms with the objective of retrieving a small set of the most appropriate responses. QE should be scalable, fast, and robust in handling complex queries with a good response time and noise ceiling. In this paper, we propose Xu, an automated QE technique, using high dimensional clustering of word vectors and Datamuse API, an open source query engine to find semantically similar words. We implemented Xu as a command line tool and evaluated its performance using datasets containing news articles and human-generated QEs. The evaluation results show that Xu was better than Datamuse by achieving about 88% accuracy with reference to the human-generated QE.
    Comment: Accepted to IEEE COMPSAC 201
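    The general idea of query expansion described in this abstract can be illustrated with a minimal sketch. This is not the Xu tool itself: the `RELATED_TERMS` table, the `expand_query` function, and the `max_per_term` parameter are hypothetical names introduced here, and the hand-written table stands in for whatever similarity source (word vectors, a thesaurus service) a real system would use.

```python
from typing import Dict, List

# Hypothetical similarity table (an assumption for illustration only).
# A real system would derive related terms from word vectors or an
# external lexical service rather than a hard-coded dictionary.
RELATED_TERMS: Dict[str, List[str]] = {
    "car": ["automobile", "vehicle"],
    "price": ["cost"],
}


def expand_query(
    query: str,
    related: Dict[str, List[str]],
    max_per_term: int = 2,
) -> List[str]:
    """Return the query tokens followed by related terms, without duplicates.

    At most `max_per_term` related terms are added per original token,
    which is one simple way to limit the noise that expansion can add.
    """
    tokens = query.lower().split()
    # Deduplicate the original tokens while keeping their order.
    expanded = list(dict.fromkeys(tokens))
    seen = set(expanded)
    for token in tokens:
        for candidate in related.get(token, [])[:max_per_term]:
            if candidate not in seen:
                expanded.append(candidate)
                seen.add(candidate)
    return expanded
```

    Calling `expand_query("car price", RELATED_TERMS)` yields the two original tokens followed by their related terms. Lowering `max_per_term` trades recall for less noise, which is the tradeoff the abstract mentions.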

    Smart Search in Newspaper Archives Using Topic Maps

    The OmniPaper project has implemented three information retrieval prototypes in the area of electronic news publishing. One prototype uses SOAP as the communication protocol between the central system and a number of distributed news archives. The second prototype uses an RDF metadata database, enabling direct metadata queries to the central system. Finally, the Topic Map prototype uses query expansion and semantic linking for smart metadata search. The Topic Map prototype enhances the search experience by implementing a knowledge layer that combines the semantic content of a lexical database, consisting of concepts and keywords, with a metadata set of newspaper articles. The linking between both is currently implemented at the level of keywords but will be developed at the level of concepts in the final prototype. The knowledge layer has been designed from a Topic Map point of view, although the XTM syntax has not been used, to avoid performance issues. The consortium’s adopted view on information publishing and retrieval considers querying and navigation as two closely related actions that can both be captured under the name “search for relevant information”. Navigation forces the user to follow predefined paths, whereas querying enables the user to look freely for a suitable starting point. The query and navigation functionality is provided through a web engine and is built on top of the information structure of the knowledge layer.

    Effectiveness of Title-Search vs. Full-Text Search in the Web

    Search engines sometimes search the full text of documents or web pages, and sometimes only selected parts of the documents, e.g. their titles. Full-text search may consume a lot of computing resources and time. It may be possible to save resources by searching the titles of documents only, assuming that the title of a document provides a concise representation of its content. We tested this assumption using the Google search engine. We ran search queries that had been defined by users, distinguishing between two types of queries/users: queries of users who are familiar with the area of the search, and queries of users who are not familiar with the area of the search. We found that searches which use titles provide similar and sometimes even (slightly) better results compared to searches which use the full text. These results hold for both types of queries/users. Moreover, we found an advantage in title search when searching in unfamiliar areas, because the general terms used in queries in unfamiliar areas match better with the general terms which tend to be used in document titles.

    Improving Search and Discovery of Geospatial Information in Australia and New Zealand using Semantic Web Techniques

    This thesis proposes a set of techniques to make it easier for end users of spatial catalogue systems to locate datasets which they can then use for their own purposes. While other methods are used to locate spatial datasets, catalogue systems continue to be a common choice and are actively supported by those with jurisdiction over datasets in both the public and private sectors.

    Intelligent Query Answering Through Rule Learning and Generalization

    The Department of Defense (DoD) relies heavily on information systems to complete a myriad of tasks, from day-to-day personnel actions to mission critical imagery retrieval, intelligence analysis, and mission planning. The astronomical growth in size and performance of data storage systems leads to problems in processing the amount of data returned on any given query. Typical relational database systems return a set of unordered records. This approach is acceptable in small information systems, but in large systems, such as military image retrieval systems with more than 1 million records, it requires considerable time (often hours to days) to sort through thousands of records and select the relevant ones for analysis. This research introduces Intelligent Query Answering (IQA) as a novel approach to information retrieval. IQA implements the FOIL algorithm to learn rules based upon user feedback [QUI90]. The Winnow algorithm adjusts rule weights based on user classification, for improved document orderings [BLU97]. A semantic tree specific to the domain allows rule generalization across the domain. Testing shows a document sort accuracy rate of 63-93% against a controlled test dataset and a 78-89% accuracy rate on a subset of declassified National Air Intelligence Center imagery metadata. These results demonstrate that this research provides groundwork for future efforts in rule learning and rule generalization in the information retrieval field.
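    The weight adjustment step mentioned in this abstract can be sketched with the textbook Winnow update, in which weights of active features are multiplied or divided by a constant whenever the prediction disagrees with the label. This is a generic sketch, not the IQA implementation: the function names, the binary feature encoding (a 1 meaning "this rule fires on the document"), and the values of `threshold` and `alpha` are assumptions made for illustration.

```python
from typing import List, Sequence


def winnow_predict(
    weights: Sequence[float], x: Sequence[int], threshold: float
) -> int:
    """Predict 1 if the summed weight of active features reaches the threshold."""
    total = sum(w for w, xi in zip(weights, x) if xi)
    return 1 if total >= threshold else 0


def winnow_update(
    weights: Sequence[float],
    x: Sequence[int],
    label: int,
    threshold: float,
    alpha: float = 2.0,
) -> List[float]:
    """Return new weights after one Winnow step on example (x, label).

    Weights change only on a mistake: active features are promoted
    (multiplied by alpha) after a missed positive and demoted
    (divided by alpha) after a false positive.
    """
    if winnow_predict(weights, x, threshold) == label:
        return list(weights)  # correct prediction: no change
    factor = alpha if label == 1 else 1.0 / alpha
    return [w * factor if xi else w for w, xi in zip(weights, x)]
```

    In a setting like the one described, the label would come from the user marking a document as relevant or not, and the resulting weights could be used to order documents by their weighted rule score.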