5 research outputs found

    A Web Searching Guide: Internet Search Engines & Autonomous Interface Agents Collaboration

    The Internet is the largest communication medium, and it grows every day. This continuous growth of information makes the Internet more and more interesting, but it also makes the task of finding specific information more complex and difficult. Finding exactly what a user needs is not always easy: for example, common search engines return thousands of links for every search, and not all of these links are related to what the user really needs. In this paper, we present a Collaborative Autonomous Interface Agent (CAIA) that collaborates with Internet search engines and supports the user in finding exactly the information consistent with his/her interests. A system has been designed, fully implemented, and tested. The test results show a large improvement in the relevance of the retrieved links and in user satisfaction when using CAIA+Google compared with using Google alone.

    MRDTL: a multi-relational decision tree learning algorithm

    Many real-world data sets are organized in relational databases consisting of multiple tables and associations. Other types of data, such as those in bioinformatics and computational biology, or HTML and XML documents, require reasoning about the structure of the objects. However, most existing approaches to machine learning assume that the data are stored in a single table, and use a propositional (as opposed to relational) language for discovering predictive models. Hence, there is a need for data mining algorithms that discover a priori unknown relationships from multi-relational data. This thesis explores a new framework for multi-relational data mining. It describes experiments with an implementation of a Multi-Relational Decision Tree Learning (MRDTL) algorithm for induction of decision trees from relational databases, based on an approach suggested by Knobbe et al. (1999). Our experiments with widely used benchmark data sets (e.g., the carcinogenesis data) show that the performance of MRDTL is competitive with that of other algorithms for learning classifiers from multiple relations, including Progol (Muggleton, 1995), FOIL (Quinlan, 1993), and Tilde (Blockeel, 1998). Preliminary results indicate that MRDTL, when augmented with principled methods for handling missing attribute values, is likely to be competitive with the state-of-the-art algorithms for learning classifiers from multiple relations on real-world data sets drawn from bioinformatics applications (prediction of gene localization and gene function) used in the KDD Cup 2001 data mining competition (Cheng et al., 2002).