
    Towards the Automatic Classification of Documents in User-generated Classifications

    There is a huge amount of information scattered across the World Wide Web. Because information flows through the Web at high speed, it must be organized so that users can access it easily. Previously, information was generally organized manually, by matching document contents to pre-defined categories. There are two approaches to this text-based categorization: manual and automatic. In the manual approach, a human expert performs the classification task; in the automatic approach, supervised classifiers are used to classify resources. Supervised classification still requires manual interaction to create training data before the automatic classification task takes place. In our new approach, we propose to classify documents automatically through semantic keywords and through formulas generated from these keywords. We can thus reduce human participation by combining the knowledge of a given classification with the knowledge extracted from the data. The main focus of this PhD thesis, supervised by Prof. Fausto Giunchiglia, is the automatic classification of documents into user-generated classifications. The key benefits foreseen from this automatic document classification relate not only to search engines but also to many other fields, such as document organization, text filtering, semantic index managing

    The Open Graph Archive: A Community-Driven Effort

    To evaluate, compare, and tune graph algorithms, experiments on well-designed benchmark sets have to be performed. Together with the goal of reproducible experimental results, this creates a demand for a public archive to gather and store graph instances. Such an archive would ideally allow instances or sets of graphs to be annotated with additional information, such as graph properties and references to the respective experiments and results. Here we examine the requirements and introduce a new community project that aims to produce an easily accessible library of graphs. Through successful community involvement, the archive is expected to contain a representative selection of both real-world and generated graph instances, covering significant application areas as well as interesting classes of graphs. Comment: 10 page

    Datamining for Web-Enabled Electronic Business Applications

    Web-enabled electronic business is generating massive amounts of data on customer purchases, browsing patterns, usage times, and preferences at an increasing rate. Data mining techniques can be applied to the collected data to obtain useful information. This chapter attempts to present issues associated with data mining for web-enabled electronic-business