178 research outputs found

    A Natural Language Processing Based Internet Agent

    Searching for useful information is difficult because of information overload. Technological advances, notably the World-Wide Web (WWW), allow every ordinary information owner to offer information online for others to access and retrieve. However, this also creates a global information system that is extremely large-scale, diverse and dynamic. Internet agents and Internet search engines have been developed to deal with these problems, but their results are usually not very relevant to what a user wants, since most rely on simple keyword matching. In this paper, we propose a natural language processing based agent (NIAGENT) that understands a user's natural query. NIAGENT not only cooperates with a meta Internet search engine to increase the recall of web pages, but also analyzes the contents of the referenced documents to increase precision. Moreover, the proposed agent is autonomous, light-weight, and multithreaded. The architectural design also represents an interesting application of a distributed and cooperative computing paradigm. A prototype of NIAGENT, implemented in Java, shows its promise in finding more useful information than keyword-based searching.
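    The combination described above, querying a suite of engines to raise recall and then filtering by content to raise precision, can be sketched as below. This is a minimal Python sketch; the engine functions and the crude content check are hypothetical stand-ins, not NIAGENT's actual Java components.

    ```python
    # Sketch: multithreaded meta-search (recall) plus content filtering (precision).
    # engine_a/engine_b and relevant() are hypothetical stand-ins.
    from concurrent.futures import ThreadPoolExecutor

    def engine_a(query):
        # stand-in for a call to a real search engine
        return ["http://example.org/nlp", "http://example.org/cats"]

    def engine_b(query):
        return ["http://example.org/nlp", "http://example.org/agents"]

    def relevant(url, terms):
        # crude content check: keep pages whose text mentions every query term
        page_text = url.rsplit("/", 1)[-1]   # stand-in for the fetched page content
        return all(t in page_text for t in terms)

    def meta_search(query):
        terms = query.lower().split()
        engines = [engine_a, engine_b]
        # query all engines in parallel, as a multithreaded agent would
        with ThreadPoolExecutor(max_workers=len(engines)) as pool:
            result_lists = list(pool.map(lambda e: e(query), engines))
        merged = []
        for results in result_lists:
            for url in results:
                if url not in merged:        # union of engines increases recall
                    merged.append(url)
        # content analysis of each referenced document increases precision
        return [u for u in merged if relevant(u, terms)]
    ```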

    Evaluating the retrieval effectiveness of Web search engines using a representative query sample

    Search engine retrieval effectiveness studies are usually small-scale and use only limited query samples; furthermore, the queries are selected by the researchers. We address these issues by taking a random, representative sample of 1,000 informational and 1,000 navigational queries from a major German search engine and comparing Google's and Bing's results on this sample. Jurors were recruited through crowdsourcing, and data was collected using specialised software, the Relevance Assessment Tool (RAT). We found that while Google outperforms Bing on both query types, the difference in performance for informational queries was rather low. However, for navigational queries, Google found the correct answer in 95.3 per cent of cases, whereas Bing found it only 76.6 per cent of the time. We conclude that search engine performance on navigational queries is of great importance, as users in this case can clearly identify queries that have returned correct results; performance on this query type may therefore help explain user satisfaction with search engines.
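    The two measures implied by the study design, precision over judged top results for informational queries and a per-query success rate for navigational ones, can be sketched as follows. The function names are ours, not taken from the RAT software.

    ```python
    # Sketch of the two effectiveness measures used in such evaluations.

    def precision_at_k(judgments, k=10):
        # judgments: relevance booleans for one query's top results, in rank order
        top = judgments[:k]
        return sum(top) / len(top) if top else 0.0

    def navigational_success_rate(found_flags):
        # found_flags: one boolean per navigational query
        # ("was the single correct page returned?"), reported as a percentage
        return 100.0 * sum(found_flags) / len(found_flags)
    ```

    With 953 of 1,000 navigational queries answered correctly, `navigational_success_rate` yields the 95.3 per cent figure quoted above.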

    Integrated Filtered Web-Search Engine

    The WWW has become one of the most important sources of information. The WWW is not an indexed information warehouse where people can easily look up specified data; it is instead a large network of computers that contains the information. Finding information on the WWW can be as easy as it can be hard. Search engines were developed to assist users in searching for information on the net. A number of effective search engines are available nowadays, but where humans are concerned there is always something they are not satisfied with. The mass of information supplied may exhaust users as they browse through each and every one of the returned results. Even so, many users habitually look only at the top 10 results and move on to another search engine if they are still not satisfied with the information. This project aims to reduce the users' dilemma over the mass of information supplied, as well as to combine the major search engines normally used by most users nowadays. The benefit is that users can obtain more results from various search engines with one single click, without any redundant results.
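    The promised removal of redundant results can be sketched as URL normalisation plus a seen-set applied while merging the engines' result lists. This is an illustrative sketch; the project's actual matching rules are not stated in the abstract.

    ```python
    # Sketch: merge result lists from several engines, dropping redundant entries.
    # The same page is often reported with cosmetic URL differences, so we
    # normalise before comparing.
    from urllib.parse import urlsplit

    def normalise(url):
        parts = urlsplit(url)
        host = parts.netloc.lower()              # hostnames are case-insensitive
        path = parts.path.rstrip("/") or "/"     # treat /a and /a/ as one page
        return (host, path, parts.query)

    def merge_without_redundancy(result_lists):
        seen, merged = set(), []
        for results in result_lists:             # one list per search engine
            for url in results:
                key = normalise(url)
                if key not in seen:
                    seen.add(key)
                    merged.append(url)
        return merged
    ```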

    Relevance feedback and query expansion for searching the web: a model for searching a digital library

    A fully operational large-scale digital library is likely to be based on a distributed architecture, and because of this a number of independent search engines may be used to index different, overlapping portions of the library's entire contents. In any case, different media (text, audio, image, etc.) will be indexed for retrieval by different search engines, so techniques which provide a coherent and unified search over a suite of underlying independent search engines are likely to be an important part of navigating a digital library. In this paper we present an architecture and a system for searching the world's largest digital library, the World Wide Web. What makes our system novel is that we use a suite of underlying web search engines to do the bulk of the work while our system orchestrates them in parallel to provide a higher level of information retrieval functionality. Thus it is our meta search engine, and not the underlying direct search engines, that provides the relevance feedback and query expansion options for the user. The paper presents the design and architecture of the system, which has been implemented; describes an initial version which has been operational for almost a year; and outlines the operation of the advanced version.
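    One standard way to realise the query-expansion option that such a meta engine layers on top of the underlying engines is Rocchio-style relevance feedback: terms from documents the user marked relevant are weighted and folded into the query. The sketch below is a generic illustration, not the paper's stated formula.

    ```python
    # Sketch: Rocchio-style query expansion from user relevance feedback.
    # alpha weights the original query terms, beta the feedback terms.
    from collections import Counter

    def rocchio_expand(query_terms, relevant_docs, alpha=1.0, beta=0.75, top_n=2):
        # query_terms: list of words; relevant_docs: list of word lists the
        # user judged relevant
        weights = Counter({t: alpha for t in query_terms})
        for doc in relevant_docs:
            for term, tf in Counter(doc).items():
                # average feedback term frequency across the relevant documents
                weights[term] += beta * tf / len(relevant_docs)
        expansion = [t for t, _ in weights.most_common() if t not in query_terms]
        return query_terms + expansion[:top_n]
    ```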

    Combining information seeking services into a meta supply chain of facts

    The World Wide Web has become a vital supplier of information that allows organizations to carry out such tasks as business intelligence, security monitoring, and risk assessment. Having a quick and reliable supply of correct facts is often mission critical. Following design science guidelines, we have explored ways to recombine facts from multiple sources, each with possibly different levels of responsiveness and accuracy, into one robust supply chain. Inspired by prior research on keyword-based meta-search engines (e.g., metacrawler.com), we have adapted existing question answering algorithms for the task of analysis and triangulation of facts. We present a first prototype of a meta approach to fact seeking. Our meta engine sends a user's question to several fact seeking services that are publicly available on the Web (e.g., ask.com, brainboost.com, answerbus.com, NSIR, etc.) and analyzes the returned results jointly to identify and present to the user those that are most likely to be factually correct. The results of our evaluation on the standard test sets widely used in prior research provide evidence for the following: 1) the value added by the meta approach: its performance surpasses the performance of each supplier; 2) the importance of using fact seeking services as suppliers to the meta engine rather than keyword-driven search portals; and 3) the resilience of the meta approach: eliminating a single service does not noticeably impact overall performance. We show that these properties make the meta approach a more reliable supplier of facts than any of the currently available stand-alone services.
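    The joint analysis of returned results can be sketched as weighted voting over the candidate-answer lists each service returns: an answer confirmed by several independent services, especially near the top of their rankings, wins. The inputs below are hypothetical; the actual triangulation algorithm is more elaborate.

    ```python
    # Sketch: triangulate one factual answer from several fact-seeking services.
    from collections import Counter

    def triangulate(answers_per_service):
        # answers_per_service: one ranked candidate-answer list per service
        votes = Counter()
        for rank_list in answers_per_service:
            for rank, answer in enumerate(rank_list):
                votes[answer.lower()] += 1.0 / (rank + 1)   # top ranks weigh more
        # the answer most services agree on is most likely factually correct
        return max(votes, key=votes.get)
    ```

    Because every service contributes only a weighted vote, dropping a single service shifts the totals but rarely changes the winner, which is the resilience property claimed above.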

    Intelligent search for distributed information sources using heterogeneous neural networks

    As the number and diversity of distributed information sources on the Internet increase exponentially, various search services have been developed to help users locate relevant information. However, these services still have drawbacks, such as the difficulty of mathematically modeling the retrieval process, the lack of adaptivity, and the indiscriminate nature of search. This paper shows how heterogeneous neural networks can be used in the design of an intelligent distributed information retrieval (DIR) system. In particular, three typical neural network models (Kohonen's SOFM network, the Hopfield network, and the feed-forward network with the back-propagation algorithm) are introduced to overcome the above drawbacks in current DIR research by exploiting their unique properties. This preliminary investigation suggests that neural networks are useful tools for intelligent search across distributed information sources.
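    As a much-simplified stand-in for the feed-forward/back-propagation component, the sketch below trains a single logistic unit to score document features for relevance; the models named above (SOFM, Hopfield, multi-layer back-propagation networks) are considerably richer, and the feature names here are our own illustration.

    ```python
    # Sketch: a one-unit logistic "network" trained by gradient descent to
    # score documents for relevance. A stand-in for the paper's richer models.
    import math
    import random

    def train_relevance_scorer(examples, epochs=2000, lr=0.5, seed=0):
        # examples: list of (feature_vector, label), label 1=relevant, 0=not
        rng = random.Random(seed)
        n = len(examples[0][0])
        w = [rng.uniform(-0.1, 0.1) for _ in range(n)]
        b = 0.0
        for _ in range(epochs):
            for x, y in examples:
                z = sum(wi * xi for wi, xi in zip(w, x)) + b
                p = 1.0 / (1.0 + math.exp(-z))      # sigmoid activation
                g = p - y                           # log-loss gradient at output
                w = [wi - lr * g * xi for wi, xi in zip(w, x)]
                b -= lr * g
        def score(x):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            return 1.0 / (1.0 + math.exp(-z))
        return score

    # illustrative features: (query-term overlap, source authority)
    score = train_relevance_scorer([([1.0, 1.0], 1), ([1.0, 0.0], 1),
                                    ([0.0, 1.0], 0), ([0.0, 0.0], 0)])
    ```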

    Ontology-based specific and exhaustive user profiles for constraint information fusion for multi-agents

    Intelligent agents are an advanced technology utilized in Web intelligence. When information is sought in a distributed Web environment, it is retrieved by multi-agents on the client side and fused on the broker side. Current information fusion techniques rely on the cooperation of agents to provide statistics. Such techniques are computationally expensive and unrealistic in the real world. In this paper, we introduce a model that uses a world ontology constructed from the Dewey Decimal Classification to acquire user profiles. By searching with specific and exhaustive user profiles, information fusion techniques no longer rely on the statistics provided by agents. The model has been successfully evaluated using the large INEX data set, simulating the distributed Web environment.
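    Fusing results against an ontology-based profile, with no per-agent statistics, can be sketched as follows: the profile assigns interest weights to subject categories, and each retrieved document is scored by the categories it falls under. The Dewey Decimal classes and weights below are hypothetical.

    ```python
    # Sketch: rank retrieved documents against an ontology-based user profile.
    # The profile maps subject categories (here hypothetical DDC classes) to
    # interest weights learned from the user.
    profile = {"004": 0.9,   # computer science
               "020": 0.6,   # library & information science
               "330": 0.1}   # economics

    def profile_score(doc_categories, profile):
        # doc_categories: the ontology classes assigned to one document
        return sum(profile.get(c, 0.0) for c in doc_categories)

    def rank(docs, profile):
        # docs: mapping doc-id -> list of ontology classes; no agent
        # statistics are needed, only the shared ontology and the profile
        return sorted(docs, key=lambda d: profile_score(docs[d], profile),
                      reverse=True)
    ```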

    Advanced web searching for the information professional

    pp. 29-4