1 research outputs found

    A belief networks-based generative model for Structured Documents. An application to the XML Categorization

    No full text
    We present a generarive Bayesian model for the modeling of structured (e.g. XML) documents. This model allows us to simultaneously take into account structure and content information. It is used here for classifying XML documents. We adopt a machine learning approach and the model parameters are learned from a labeled training set of rep- resentative documents. We discuss the role of structural information for classification and describe experiments on a small collection of class labeled structured documents. We also present preliminary results showing how this model could classify documents with DTDs not represented in the training set
    corecore