    The details of decision rule module.

    Finding relations between genes and diseases is essential for developing clinical diagnoses, treatments, and drug designs for diseases. One successful approach to mining the literature is the document-based relation extraction method. Despite recent advances in document-level extraction of entity-entity relations, it remains difficult to capture relations between distant words in a document. To overcome this limitation, we propose an AI-based text-mining model that learns document-level relations between genes and diseases using an attention mechanism. Furthermore, we show that including a direct edge (DE) and indirect edges between genetic targets and diseases during training improves the model’s performance. Such relation edges can be visualized as graphs, enhancing the interpretability of the model. The model achieved an F1-score of 0.875, outperforming state-of-the-art document-level extraction models. In summary, SCREENER identifies biological connections between target genes and diseases with superior performance by leveraging direct and indirect target-disease relations. We also developed a web service platform named SCREENER (Streamlined CollaboRativE lEarning of NEr and Re), which extracts gene-disease relations from the biomedical literature in real time. We believe this interactive platform will help users uncover unknown gene-disease relations amid fast-paced literature publication, with interpretation supported by graph visualizations. The interactive website is available at: https://ican.standigm.com.

    SCREENER web service workflow.

    A simple schematic diagram of SCREENER’s architecture is shown. Modules A to E correspond to the actions taken for each input document, and module F integrates results across documents to provide collective relations when SCREENER receives multiple documents as input.
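    The cross-document integration step attributed to module F can be illustrated with a minimal sketch. The caption does not say how results are combined, so the rule used here (counting how many documents support each gene-disease pair) is an assumption, and the function name `integrate_relations` and the entity names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Relation = Tuple[str, str]  # (gene, disease); hypothetical representation


def integrate_relations(per_document: Iterable[List[Relation]]) -> Counter:
    """Count the number of documents that support each (gene, disease) pair.

    Assumption: a pair counts once per document, however often it occurs there.
    """
    support: Counter = Counter()
    for relations in per_document:
        support.update(set(relations))  # de-duplicate within one document
    return support


# Hypothetical per-document outputs (one list per input document).
doc_results = [
    [("GENE_A", "DISEASE_X"), ("GENE_A", "DISEASE_X")],
    [("GENE_A", "DISEASE_X"), ("GENE_B", "DISEASE_Y")],
]
support = integrate_relations(doc_results)
```

    Counting supporting documents is only one plausible way to form collective relations; a score-weighted merge would fit the same interface.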

    Example of related entities.

    The predefined relation (Linked Of) is represented by the arrows above.

    The details of evaluation method.

    The number of each entity in 1,377 documents of SCREENER dataset.

    The SCREENER dataset consists of 1,377 annotated document files for extracting gene-disease associations. It has 52,709 entities (Gene, Disease, NEGREG, CPA, MPA, REG, VAR, POSREG, PROTEIN, PATHWAY, INTERACTION, ENZYME) and 43,601 relations (LinkedOf). The detailed count of each entity is shown in Table 1, and the distribution of entities is shown in Fig 1. (TIF)

    Indirect edges and direct edge.

    Before the direct edge between a gene and a disease is added, only indirect edges relate a given gene to a disease. Adding the direct edge improves performance.
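    The distinction between indirect and direct edges can be made concrete with a small graph sketch. The entity names, the directed adjacency-set representation, and the helper functions below are assumptions for illustration only; they are not taken from the SCREENER code.

```python
from collections import defaultdict, deque
from typing import DefaultDict, Set

Graph = DefaultDict[str, Set[str]]


def add_edge(graph: Graph, source: str, target: str) -> None:
    """Add a directed relation edge from source to target."""
    graph[source].add(target)


def has_direct_edge(graph: Graph, source: str, target: str) -> bool:
    """True if source links to target without an intermediate entity."""
    return target in graph.get(source, set())


def is_reachable(graph: Graph, start: str, goal: str) -> bool:
    """Breadth-first search over relation edges, direct or indirect."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for neighbor in graph.get(node, set()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


graph: Graph = defaultdict(set)
# Indirect path: the gene reaches the disease only through another entity.
add_edge(graph, "GENE_X", "POSREG_1")
add_edge(graph, "POSREG_1", "DISEASE_Y")

direct_before = has_direct_edge(graph, "GENE_X", "DISEASE_Y")
reachable_before = is_reachable(graph, "GENE_X", "DISEASE_Y")

# Add the direct edge (DE) between the gene and the disease.
add_edge(graph, "GENE_X", "DISEASE_Y")
direct_after = has_direct_edge(graph, "GENE_X", "DISEASE_Y")
```

    Before the extra edge is added, the gene and disease are connected only through the intermediate entity; afterwards the pair is also linked explicitly, which is the additional training signal the caption describes.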

    SCREENER architecture diagram.

    SCREENER consists of two modules: Named Entity Recognition (NER) and Relation Extraction (RE). The NER module uses a BERT model pre-trained on scientific texts to predict the entity types of spans. The RE module concatenates four vectors: the two span-based entity vectors (red and green), a max-pooled vector over the embedded tokens positioned between the two entities (yellow), and the attention score vector (blue). Together, the model is designed to learn the context features of the input sentences, enhancing its performance in predicting the link between two queried entities from the NER module.
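    The four-vector concatenation described for the RE module can be sketched as follows. This is a minimal illustration with plain Python lists and toy two-dimensional vectors; the function names, the zero-filling for adjacent entities, and the ordering of the concatenated parts are assumptions, not the authors' implementation.

```python
from typing import List, Sequence

Vector = List[float]


def max_pool(token_vectors: Sequence[Vector], dim: int) -> Vector:
    """Element-wise maximum over token vectors.

    Assumption: if no tokens lie between the entities, return zeros.
    """
    if not token_vectors:
        return [0.0] * dim
    return [max(column) for column in zip(*token_vectors)]


def build_pair_features(
    head: Vector,
    tail: Vector,
    between: Sequence[Vector],
    attention: Vector,
) -> Vector:
    """Concatenate two entity vectors, the pooled context, and attention scores."""
    context = max_pool(between, dim=len(head))
    return list(head) + list(tail) + context + list(attention)


# Toy inputs (hypothetical values).
head = [0.1, 0.9]                                  # first entity span vector
tail = [0.7, 0.2]                                  # second entity span vector
between = [[0.3, 0.5], [0.6, 0.1], [0.2, 0.4]]     # tokens between the entities
attention = [0.25, 0.75]                           # attention score vector

features = build_pair_features(head, tail, between, attention)
```

    In a real model these would be tensors produced by the encoder, and the concatenated vector would feed a classifier that predicts whether the two entities are linked.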

    SCREENER dataset entities distribution.

    The distribution of each entity in the 1,377 documents of the SCREENER dataset. (TIF)