288,119 research outputs found

    Searching for text documents

    Evaluation of clustering techniques for efficient searching in JXTA-based P2P systems

    Efficient file searching is an essential feature of P2P systems. While many current approaches use brute-force techniques to search files by meta-information (file names, extensions, or user-provided tags), there is growing interest in techniques that allow content-based search in P2P systems. Recently, clustering techniques have been used for searching text documents to increase the efficiency of document discovery and retrieval. Integrating such techniques into P2P systems is important to enhance searching in P2P file-sharing systems. While some effort has gone into content-based searching for text documents in P2P systems, little research has applied these techniques to multimedia content in P2P systems. In this paper we introduce two P2P content-based clustering techniques for multimedia documents. These techniques are an adaptation of the existing Class-based Semantic Search (CSS) algorithm for text documents. The proposed algorithms have been integrated into a JXTA-based Overlay P2P platform, and some initial evaluation results are provided. The JXTA-Overlay, together with the considered clustering techniques, is thus useful for developing P2P multimedia applications that require efficient searching of multimedia content in peer nodes. Peer reviewed. Postprint (published version).

    Building a domain-specific document collection for evaluating metadata effects on information retrieval

    This paper describes the development of a structured document collection containing user-generated text and numerical metadata for exploring the use of metadata in information retrieval (IR). The collection consists of more than 61,000 documents extracted from YouTube video pages on basketball in general and the NBA (National Basketball Association) in particular, together with a set of 40 topics and their relevance judgements. In addition, a collection of nearly 250,000 user profiles related to the NBA collection is available. Several baseline IR experiments report the effect of using video-associated metadata on retrieval effectiveness. Surprisingly, the results show that searching only the video titles performs significantly better than also searching additional metadata text fields of the videos, such as the tags or the description.

    CMFRI launches Open Access Institutional Repository

    'E-prints@CMFRI' lets users search articles by year, author, subject, document type, or division. Interested users can freely download the full text, as most of the documents are directly accessible. 'Request Copy' forms can be used for documents whose direct full-text download is restricted due to publishers' embargoes.

    PDF Text Searching System

    This project develops a simple PDF text-searching system that is capable of searching and processing the information in text files on the user's PC and in local networks. The main purpose of the project is to assist users in finding PDF documents and files within their local drives, where the appropriate documents can be found by entering the desired search terms (keywords) in the PDF Text Searching System. Two objectives have been set for this project. The first is to study and better understand the software that will be used to develop the PDF text-searching system. The second is to develop the PDF text-searching system itself. For the methodology, the Rapid Application Development (RAD) approach has been employed. It was chosen because it is effective and suitable for short-duration projects, and it is designed for developers and users to work together intensively toward their goal. By using the RAD methodology, the project could be completed within the time allocated. The results and discussion part covers the outcomes obtained on completion of the project, which are based on the surveys and questionnaires conducted. The findings in that chapter determine whether the proposed system is acceptable and meets the users' needs. Some suggestions for future enhancement are also given, which could make the current system more efficient and effective.

    A software tool for searching in binary text images

    In this paper we present a software tool for searching word images in scanned text documents. We assume that the document pages are represented as files in tif, jpg, gif, png, bmp, and other graphic file formats. Our experiments demonstrate the efficiency of the proposed approach and show that this type of searching can be successful. Examples using various languages are presented. Our software is user-oriented and can be applied to any collection of scanned documents.