2 research outputs found

    Automatic Classification and Intelligent Clustering for WWWeb Information Retrieval Systems

    In this paper we present some aspects of an intelligent interface for a WWWeb legal information retrieval system. Our system keeps the context of the user interaction in order to suggest further refinements of the user's queries. The set of documents retrieved by the user queries is dynamically organised in clusters labeled with keywords from a juridical thesaurus. Since some of the texts were not previously classified, we have developed an automatic juridical classifier based on a neural network. The classifier receives a legal text as input and proposes a set of juridical terms that characterize it.

    Automatisches Klassifizieren : Verfahren zur Erschliessung elektronischer Dokumente

    Automatic classification of text documents refers to the computerized allocation of class numbers from existing classification schemes to natural language texts by means of suitable algorithms. Based upon a comprehensive literature review, this thesis establishes an informed and up-to-date view of the applicability of automatic classification for the subject approach to electronic documents, particularly to Web resources. Both methodological aspects and the experiences drawn from relevant projects and applications are covered. Concerning methodology, the present state of the art comprises a number of statistical approaches that rely on machine learning; these methods use pre-classified example documents to establish a model, the "classifier", which is then used to classify new documents. However, the four large-scale projects conducted in the 1990s by the Universities of Lund, Wolverhampton and Oldenburg, and by OCLC (Dublin, OH), still used rather simple and more traditional methodological approaches. These projects are described and analyzed in detail. As they made use of traditional library classifications, their results are significant for LIS, even though no permanent quality services have resulted from these endeavours. The analysis of other relevant applications and projects reveals a number of attempts to use automatic classification for document processing in the fields of patent and media documentation. Here, semi-automatic solutions that support human classifiers are preferred, because the classification results obtained by fully automated systems are as yet unsatisfactory. Other interesting implementations include Web portals, search engines and (commercial) information services, whereas little interest has been shown in the automatic classification of books and bibliographic records.
    In the concluding part of the study the author discusses the most significant applications and projects, and also addresses several problems and issues in the context of automatic classification.