3 research outputs found

    Hierarchical Metadata-Aware Document Categorization under Weak Supervision

    Categorizing documents into a given label hierarchy is intuitively appealing due to the ubiquity of hierarchical topic structures in massive text corpora. Although related studies have achieved satisfying performance in fully supervised hierarchical document classification, they usually require massive human-annotated training data and utilize only text information. However, in many domains, (1) annotations are so expensive that very few training samples can be acquired, and (2) documents are accompanied by metadata. Hence, this paper studies how to integrate the label hierarchy, metadata, and text signals for document categorization under weak supervision. We develop HiMeCat, an embedding-based generative framework for our task. Specifically, we propose a novel joint representation learning module that allows simultaneous modeling of category dependencies, metadata information, and textual semantics, and we introduce a data augmentation module that hierarchically synthesizes training documents to complement the original, small-scale training set. Our experiments demonstrate a consistent improvement of HiMeCat over competitive baselines and validate the contribution of our representation learning and data augmentation modules.
    Comment: 9 pages; Accepted to WSDM 202

    A Variational Bayesian Superresolution Approach Using Adaptive Image Prior Model

    The objective of superresolution (SR) is to reconstruct a high-resolution (HR) image from the information in a set of low-resolution images. Recently, the variational Bayesian superresolution approach has been widely used. However, existing methods of this kind cannot preserve edges well while removing noise. For this reason, we propose a new image prior model and establish a Bayesian superresolution reconstruction algorithm. In the proposed prior model, the degree of interaction between pixels is adjusted adaptively by an adaptive norm, which is derived from the local image features. Moreover, a monotonically decreasing function is used to calculate and update the single parameter that controls how severely image gradients are penalized in the proposed prior model. Thus, the proposed prior model is fully adaptive to the local image features. With the proposed prior model, edge details are preserved and noise is reduced simultaneously. Variational Bayesian inference is employed, and the formulas for calculating all the variables, including the HR image, motion parameters, and hyperparameters, are derived. These variables are refined progressively in an iterative manner. Experimental results show that the proposed SR approach is very efficient compared with existing approaches.