4 research outputs found

    Semantic Learning and Web Image Mining with Image Recognition and Classification

    Image mining is more than just an extension of data mining to the image domain. Web image mining is a technique commonly used to extract knowledge directly from images on the WWW. Since the main targets of conventional Web mining are numerical and textual data, Web mining for image data is in demand. The Web holds huge amounts of image data as well as text data. However, mining image data from the Web has received less attention than mining text data, since treating the semantics of images is much more difficult. This paper proposes a novel image recognition and image classification technique that uses a large number of images automatically gathered from the Web as learning images. For classification, the system uses image-feature-based search as exploited in content-based image retrieval (CBIR), which, unlike conventional image recognition methods, does not restrict the target images, together with a support vector machine (SVM), one of the most efficient and widely used statistical methods for generic image classification and one that fits the learning tasks. The experiments show that the proposed system outperforms some existing search systems.

    Automatic Caption Localization for Photographs on World Wide Web Pages

    http://faculty.nps.edu/ncrowe/marie/webpics.html
    A variety of software tools index text of the World Wide Web, but little attention has been paid to the many photographs. We explore the indirect method of locating for indexing the likely explicit and implicit captions of photographs. We use multimodal clues including the specific words used, the syntax, the surrounding layout of the Web page, and the general appearance of the associated image. Our MARIE-3 system thus avoids full image processing and full natural-language processing, but shows a surprising degree of success. Experiments with a semi-random set of Web pages showed 41% recall with 41% precision for the task of distinguishing captions from other text, and 70% recall with 30% precision. This is much better than chance since actual captions were only 1.4% of the text on pages with photographs.
    Supported by the U.S. Army Artificial Intelligence Center, and by the U.S. Naval Postgraduate School. Chief for Naval Operations. Approved for public release; distribution is unlimited.

    WAQS : a web-based approximate query system

    The Web is often viewed as a gigantic database that holds vast stores of information and provides ubiquitous accessibility to end-users. Since its inception, the Internet has experienced explosive growth both in the number of users and in the amount of content available on it. However, searching for information on the Web has become increasingly difficult. Although query languages have long been part of database management systems, the standard query language, Structured Query Language (SQL), is not suitable for Web content retrieval. In this dissertation, a new technique for document retrieval on the Web is presented. This technique is designed to allow more detailed retrieval and hence reduce the number of matches returned by typical search engines. The main objective of this technique is to allow a query to be based not just on keywords but also on the location of the keywords within the logical structure of a document. In addition, the technique provides approximate search capabilities based on the notions of Distance and Variable Length Don't Cares. The proposed techniques have been implemented in a system called Web-Based Approximate Query System, which contains an SQL-like query language called Web-Based Approximate Query Language. Web-Based Approximate Query Language has also been integrated with EnviroDaemon, an environmental domain-specific search engine, providing EnviroDaemon with more detailed searching capabilities than keyword-based search alone. Implementation details, technical results, and future work are presented in this dissertation.