    MicroConceptBERT: concept-relation based document information extraction framework.

    Extracting information from documents is a crucial task in natural language processing research. Existing information extraction methodologies often focus on specific domains, such as medicine, education, or finance, and are limited by language constraints. More comprehensive approaches that transcend document types, languages, contexts, and structures, as proposed in recent research, would significantly advance the field. This study addresses this challenge by introducing microConceptBERT, a concept-relations-based framework for document information extraction that offers flexibility for various document processing tasks while accounting for hierarchical, semantic, and heuristic features. The proposed framework has been applied to a question-answering task on two benchmark datasets, SQuAD 2.0 and DocVQA. Notably, it attains an F1 score of 87.01 on the SQuAD 2.0 dataset, outperforming the baseline BERT-base and BERT-large models.