    Automatic Text Classification Through Point of Cultural Interest Digital Identifiers

    This work addresses the automatic classification and representation of unstructured texts in the Cultural Heritage domain. The methodology relies on machine-readable dictionaries of terminological simple words and multiword expressions. The paper discusses the design and population of a domain ontology, which interacts with the electronic dictionaries and a network of local grammars. A Max-Ent classifier, based on the ontology schema, assigns to each analyzed text an object identifier related to the semantic dimension of the text. In this activity, the unstructured texts are processed with the semantically annotated dictionaries in order to discover the underlying structure that facilitates classification. The final purpose is the automatic attribution of Point of Cultural Interest Digital Identifiers (POIds) to texts on the basis of the semantic features extracted from the texts through NLP strategies.
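
The pipeline described in the abstract (dictionary lookup, then a Max-Ent classifier that assigns an identifier) can be illustrated with a minimal sketch. It is not the authors' implementation: the dictionary entries, the semantic tags, the `POI:...` identifier strings, and all function names are hypothetical, the ontology and local grammars are omitted, and the Max-Ent model is written as a plain softmax (multinomial logistic) classifier trained by gradient ascent.

```python
import math
from collections import Counter

# Hypothetical electronic dictionary: term (simple or multiword) -> semantic tag.
DICTIONARY = {
    "doric column": "architecture",
    "amphitheatre": "architecture",
    "fresco": "painting",
    "oil on canvas": "painting",
}

def extract_features(text):
    """Count semantic tags of dictionary entries found in the text.

    Naive substring matching; a real system would tokenize and apply grammars.
    """
    lowered = text.lower()
    feats = Counter()
    for term, tag in DICTIONARY.items():
        n = lowered.count(term)
        if n:
            feats[tag] += n
    return feats

def predict_proba(weights, feats, labels):
    """Softmax over label scores, i.e. the Max-Ent conditional distribution."""
    scores = {
        y: sum(weights.get((y, f), 0.0) * v for f, v in feats.items())
        for y in labels
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exps.values())
    return {y: e / z for y, e in exps.items()}

def train(samples, labels, epochs=200, lr=0.5):
    """Stochastic gradient ascent on the conditional log-likelihood."""
    weights = {}
    for _ in range(epochs):
        for text, gold in samples:
            feats = extract_features(text)
            probs = predict_proba(weights, feats, labels)
            for y in labels:
                err = (1.0 if y == gold else 0.0) - probs[y]
                for f, v in feats.items():
                    weights[(y, f)] = weights.get((y, f), 0.0) + lr * err * v
    return weights

def classify(weights, text, labels):
    """Return the most probable identifier for the text."""
    probs = predict_proba(weights, extract_features(text), labels)
    return max(probs, key=probs.get)

# Hypothetical identifiers and toy training texts.
LABELS = ["POI:temple", "POI:gallery"]
TRAIN = [
    ("The Doric column stands beside the amphitheatre.", "POI:temple"),
    ("A fresco and an oil on canvas hang in the hall.", "POI:gallery"),
]
WEIGHTS = train(TRAIN, LABELS)
```

Calling `classify(WEIGHTS, "A restored fresco", LABELS)` then selects the identifier whose training text shared the same semantic tag. In this sketch the features are tag counts only; features derived from an ontology schema would replace them in a fuller version.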