8 research outputs found

    A Joint Model for Definition Extraction with Syntactic Connection and Semantic Consistency

    Definition Extraction (DE) is a well-known Information Extraction task that aims to identify terms and their corresponding definitions in unstructured text. It can be formalized either as sentence classification (i.e., deciding whether a sentence contains a term-definition pair) or as sequence labeling (i.e., identifying the boundaries of the terms and definitions). Previous work on DE has focused on only one of the two formulations and so fails to model the inter-dependencies between the two tasks. In this work, we propose a novel model for DE that performs both tasks simultaneously in a single framework to benefit from their inter-dependencies. Our model uses deep learning architectures to exploit the global structures of the input sentences as well as the semantic consistencies between the terms and the definitions, thereby improving the quality of the representation vectors for DE. Beyond the joint inference between sentence classification and sequence labeling, the proposed model differs fundamentally from prior work on DE, which has employed only the local structures of the input sentences (i.e., word-to-word relations) and has not considered the semantic consistencies between terms and definitions. To implement these ideas, our model presents a multi-task learning framework that employs graph convolutional neural networks and predicts the dependency paths between the terms and the definitions. We also seek to enforce the consistency between the representations of the terms and definitions both globally (i.e., increasing semantic consistency between the representations of the entire sentences and the terms/definitions) and locally (i.e., promoting the similarity between the representations of the terms and the definitions).
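The multi-task idea in this abstract can be illustrated with a toy objective that adds a sentence-classification loss, a sequence-labeling loss, and a term/definition similarity penalty. This is only a minimal sketch under stated assumptions, not the authors' model: the function names, the weights `alpha` and `beta`, and the use of cosine similarity are hypothetical choices made for illustration, and the graph convolutional encoder, the dependency-path prediction, and the global (sentence-level) consistency term are left out.

```python
import math

EPS = 1e-12  # avoids log(0) for extreme probabilities


def binary_cross_entropy(prob, label):
    """Sentence-level task: does the sentence contain a term-definition pair?"""
    return -(label * math.log(prob + EPS) + (1 - label) * math.log(1 - prob + EPS))


def token_cross_entropy(tag_probs, gold_tags):
    """Token-level task: mean negative log-likelihood of the gold tag per token.

    tag_probs is a list of dicts mapping tag name -> predicted probability.
    """
    losses = [-math.log(probs[gold] + EPS) for probs, gold in zip(tag_probs, gold_tags)]
    return sum(losses) / len(losses)


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def joint_loss(sent_prob, sent_label, tag_probs, gold_tags, term_vec, def_vec,
               alpha=1.0, beta=0.1):
    """Weighted sum of the two task losses plus a local consistency penalty.

    alpha and beta are hypothetical trade-off weights, not values from the paper.
    """
    classification = binary_cross_entropy(sent_prob, sent_label)
    labeling = token_cross_entropy(tag_probs, gold_tags)
    # Penalize term and definition vectors that point in different directions.
    consistency = 1.0 - cosine_similarity(term_vec, def_vec)
    return classification + alpha * labeling + beta * consistency


if __name__ == "__main__":
    loss = joint_loss(
        sent_prob=0.8, sent_label=1,
        tag_probs=[{"B-TERM": 0.7, "O": 0.3}, {"B-DEF": 0.6, "O": 0.4}],
        gold_tags=["B-TERM", "B-DEF"],
        term_vec=[0.2, 0.9], def_vec=[0.3, 0.8],
    )
    print(f"joint loss: {loss:.4f}")
```

Because the three terms share one scalar objective, gradients from the sentence-level and token-level tasks would both reach a shared encoder during training, which is the inter-dependency the abstract refers to.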

    Extração de contextos definitórios do Corpus COVID-19 com CQL

    Terms represent the concepts of a domain, and comprehending them gives readers access to the knowledge contained in specialized texts. Understanding the meaning of terms is therefore of great importance not only for researchers sharing the results of their studies, but also for professionals and students from various areas who apply specialized information in their learning and working contexts. Because knowledge evolves quickly, the terminology created to designate new concepts is not always entered in dictionaries promptly, which may represent a great challenge for those who need access to specialized knowledge. After presenting approaches used in the last twenty years for the automatic extraction of definition traits (DT) and definition contexts (DC), we propose the use of the Corpus Query Language (CQL) tool to retrieve information that helps in understanding the terminology used in specialized texts. In particular, we verified the usefulness of search syntaxes built with CQL for this purpose, applying them to the COVID-19 Corpus. The path presented in this study can help not only specialists in the medical field, but also translators, lexicographers and teachers to process the knowledge contained in specialized texts faster and more accurately.
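The abstract does not reproduce the study's CQL search syntaxes, so none are shown here. As a rough analogue of pattern-based definition context extraction, the sketch below swaps in plain Python regular expressions over raw sentences instead of CQL over an annotated corpus. The connector phrases and the `extract_definition` helper are hypothetical, chosen only to illustrate the idea.

```python
import re

# Hypothetical definitional connectors (an assumption for illustration;
# these are not the patterns used in the study).
DEFINITION_PATTERN = re.compile(
    r"^(?P<term>.+?)\s+"
    r"(?:is defined as|refers to|is a|is an|is the)\s+"
    r"(?P<definition>.+?)\.?$",
    re.IGNORECASE,
)


def extract_definition(sentence):
    """Return (term, definition) if the sentence matches a connector, else None."""
    match = DEFINITION_PATTERN.match(sentence.strip())
    if match is None:
        return None
    return match.group("term"), match.group("definition")


if __name__ == "__main__":
    sentences = [
        "Quarantine refers to the separation of people exposed to a contagious disease.",
        "Cases rose sharply in March.",
    ]
    for sentence in sentences:
        print(sentence, "->", extract_definition(sentence))
```

A surface pattern like this over-matches (any "X is a Y" sentence qualifies), which is one reason corpus query tools that can constrain lemmas and part-of-speech tags are attractive for this task.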

    Definition context extraction from the COVID-19 corpus with CQL

    Terms represent the concepts of a domain, and comprehending them gives readers access to the knowledge contained in specialized texts. Understanding the meaning of terms is therefore of great importance not only for researchers sharing the results of their studies, but also for professionals and students from various areas who apply specialized information in their learning and working contexts. Because knowledge evolves quickly, the terminology created to designate new concepts is not always entered in dictionaries promptly, which may represent a great challenge for those who need access to specialized knowledge. After presenting approaches used in the last twenty years for the automatic extraction of definition traits (DT) and definition contexts (DC), we propose the use of the Corpus Query Language (CQL) tool to retrieve information that helps in understanding the terminology used in specialized texts. In particular, we verified the usefulness of search syntaxes built with CQL for this purpose, applying them to the COVID-19 Corpus. The path presented in this study can help not only specialists in the medical field, but also translators, lexicographers and teachers to process the knowledge contained in specialized texts faster and more accurately.

    Generic soft pattern models for definitional question answering

    DOI: 10.1145/1076034.1076101. SIGIR 2005 - Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 384-39

    Relation extraction for information extraction from free text

    Retrieving questions and answers in community-based question answering services
