Visualizing the semantic content of large text databases using text maps
A methodology for generating text map representations of the semantic content of text databases is presented. Text maps provide a graphical metaphor for conceptualizing and visualizing the contents and data interrelationships of large text databases. A set of experiments conducted against the TIPSTER corpora of Wall Street Journal articles is described. These experiments provide an introduction to current work in the representation and visualization of documents by way of their semantic content
Signature Files: An Integrated Access Method for Formatted and Unformatted Databases
The signature file approach is one of the most powerful information storage and retrieval techniques used for finding the data objects that are relevant to user queries. The main idea of all signature-based schemes is to reflect the essence of the data items in bit patterns (descriptors or signatures) and store them in a separate file, which acts as a filter to eliminate the non-qualifying data items for an information request. It provides an integrated access method for both formatted and unformatted databases. A comparative overview and discussion of the proposed signature generation methods and the major signature file organization schemes are presented. Applications of the signature techniques to formatted and unformatted databases, single and multiterm query cases, serial and parallel architectures, static and dynamic environments are provided, with a special emphasis on multimedia databases, where the pioneering prototype systems using signatures yield highly encouraging results
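The filtering idea in this abstract can be illustrated with a minimal sketch of one common signature generation method, superimposed coding. This is not necessarily the scheme the paper evaluates: the signature width, the number of bits set per term, the SHA-256-based hashing, and all function names below are assumptions made for illustration. Each record's terms are hashed into a fixed-width bit pattern, the patterns are kept in a separate list standing in for the signature file, and a query first checks signatures before verifying candidates against the actual text, since a matching signature may be a false drop.

```python
import hashlib

SIGNATURE_BITS = 64  # assumed signature width
BITS_PER_TERM = 3    # assumed number of bits set per term


def term_signature(term: str) -> int:
    """Hash one term to a bit pattern with up to BITS_PER_TERM bits set."""
    sig = 0
    for i in range(BITS_PER_TERM):
        digest = hashlib.sha256(f"{i}:{term.lower()}".encode("utf-8")).digest()
        position = int.from_bytes(digest[:4], "big") % SIGNATURE_BITS
        sig |= 1 << position
    return sig


def record_signature(text: str) -> int:
    """Superimpose (bitwise OR) the signatures of all terms in a record."""
    sig = 0
    for term in text.split():
        sig |= term_signature(term)
    return sig


def build_signature_file(records):
    """Return one signature per record, stored apart from the records."""
    return [record_signature(r) for r in records]


def search(records, signature_file, query):
    """Return indices of records that contain every query term."""
    query_terms = query.lower().split()
    query_sig = record_signature(query)
    hits = []
    for idx, rec_sig in enumerate(signature_file):
        # Filter step: skip records whose signature lacks a query bit.
        if (rec_sig & query_sig) != query_sig:
            continue
        # Verification step: a matching signature can be a false drop,
        # so the candidate is checked against the real data.
        words = set(records[idx].lower().split())
        if all(t in words for t in query_terms):
            hits.append(idx)
    return hits
```

A record that contains all query terms always passes the filter, because its signature includes every bit of the query signature; the verification step only removes false drops caused by overlapping bits.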
Parallel computing in information retrieval - An updated review
The progress of parallel computing in Information Retrieval (IR) is reviewed. In particular we stress the importance of the motivation in using parallel computing for Text Retrieval. We analyse parallel IR systems using a classification due to Rasmussen [1] and describe some parallel IR systems. We give a description of the retrieval models used in parallel Information Processing. We describe areas of research which we believe are needed
XML content warehousing: Improving sociological studies of mailing lists and web data
In this paper, we present the guidelines for an XML-based approach for the
sociological study of Web data such as the analysis of mailing lists or
databases available online. The use of an XML warehouse is a flexible solution
for storing and processing this kind of data. We propose an implemented
solution and show possible applications with our case study of profiles of
experts involved in W3C standard-setting activity. We illustrate the
sociological use of semi-structured databases by presenting our XML Schema for
mailing-list warehousing. An XML Schema allows many adjunctions or crossings of
data sources, without modifying existing data sets, while allowing possible
structural evolution. We also show that the existence of hidden data implies
increased complexity for traditional SQL users. XML content warehousing allows
altogether exhaustive warehousing and recursive queries through contents, with
far less dependence on the initial storage. We finally present the possibility
of exporting the data stored in the warehouse to commonly-used advanced
software devoted to sociological analysis
Application of Information Retrieval Techniques to Heterogeneous Databases in the Virtual Distributed Laboratory
The Department of Defense (DoD) maintains thousands of Synthetic Aperture Radar (SAR), Infrared (IR), Hyper-Spectral intelligence imagery and Electro-Optical (EO) target signature data. These images are essential to evaluating and testing individual algorithm methodologies and development techniques within the Automatic Target Recognition (ATR) community. The Air Force Research Laboratory Sensors Directorate (AFRL/SN) has proposed the Virtual Distributed Laboratory (VDL) to maintain a central collection of the associated imagery metadata and a query mechanism to retrieve the desired imagery. All imagery metadata is stored in relational database format for access from agencies throughout the federal government and large civilian universities. Each set of imagery is independently maintained at each agency's location along with a local copy of the associated metadata that is periodically updated and sent to the VDL. This research focuses on applying information retrieval techniques to the multiple heterogeneous imagery metadata databases to present users the most relevant images based on user defined search criteria. More specifically, it defines a hierarchical concept thesaurus development methodology to handle the complexities of heterogeneous databases and the application of two classic information retrieval models. The results indicate this type of thesaurus-based approach can significantly increase the precision and recall levels of retrieving relevant documents
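For readers unfamiliar with the two measures named at the end of this abstract, the following is a minimal sketch of how precision and recall are conventionally computed for a single query. The function name and the image identifiers in the usage note are hypothetical and are not taken from the VDL work.

```python
def precision_recall(retrieved, relevant):
    """Compute precision and recall for one query.

    precision = relevant items retrieved / all items retrieved
    recall    = relevant items retrieved / all relevant items
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

For example, if a query returns the hypothetical images `sar_01`, `sar_02`, `eo_07` and `ir_03`, and the relevant set is `sar_01`, `sar_02` and `sar_09`, two of the four retrieved images are relevant (precision 0.5) and two of the three relevant images were found (recall 2/3).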
XML Security in Certificate Management - XML Certificator
The rapidly growing use of the XML format in data/document management systems shows that security measures urgently need to be considered in next-generation data/document systems. This paper presents a new certificate management system developed on the basis of XML security mechanisms. The system is supported by the theories of XML security as well as object-oriented technology and databases. Finally, it has been successfully implemented using C#, SQL, XML signature and XML encryption. Implementation metrics are also presented