15 research outputs found

    Assessing mortality prediction through different representation models based on concepts extracted from clinical notes

    Recent years have seen particular interest in using electronic medical records (EMRs) for secondary purposes to enhance the quality and safety of healthcare delivery. EMRs tend to contain large amounts of valuable clinical notes. Embedding learning is a method for converting notes into a format that makes them comparable. Transformer-based representation models have recently made a great leap forward: these models are pre-trained on large online datasets to understand natural language texts effectively. The quality of a learned embedding is influenced by how clinical notes are used as input to representation models. A clinical note has several sections with different levels of information value, and it is common for healthcare providers to use different expressions for the same concept. Existing methods feed clinical notes to representation models directly or after an initial preprocessing step. In contrast, to learn a good embedding, we first identified the most informative sections of clinical notes. We then mapped the concepts extracted from the selected sections to standard names in the Unified Medical Language System (UMLS) and used the standard phrases corresponding to the unique concepts as input for clinical models. We performed experiments measuring the usefulness of the learned embedding vectors for hospital mortality prediction on a subset of the publicly available Medical Information Mart for Intensive Care (MIMIC-III) dataset. According to the experiments, clinical transformer-based representation models produced better results when given input built from the standard names of the extracted unique concepts than with other input formats. The best-performing models were BioBERT, PubMedBERT, and UmlsBERT, respectively.
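    A minimal sketch of the pipeline this abstract describes, assuming a toy concept-to-UMLS map (a real system would use a concept extractor such as MetaMap or QuickUMLS, which the abstract does not name) and a public BioBERT checkpoint:

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Toy stand-in for UMLS concept normalization; a real pipeline would
# link note spans to UMLS concepts with a dedicated extractor.
UMLS_STANDARD_NAMES = {
    "htn": "hypertensive disease",
    "sob": "dyspnea",
    "mi": "myocardial infarction",
}

def to_standard_phrases(section_text: str) -> str:
    """Replace concept mentions with their UMLS standard names."""
    return " ".join(UMLS_STANDARD_NAMES.get(tok, tok)
                    for tok in section_text.lower().split())

# Public BioBERT checkpoint (one of the paper's best-performing model
# families; the exact checkpoint used is an assumption).
tokenizer = AutoTokenizer.from_pretrained("dmis-lab/biobert-v1.1")
model = AutoModel.from_pretrained("dmis-lab/biobert-v1.1")

section = "pt with htn and sob"
inputs = tokenizer(to_standard_phrases(section),
                   return_tensors="pt", truncation=True)
with torch.no_grad():
    # Mean-pool token states into one note-level embedding that a
    # downstream mortality classifier would consume.
    embedding = model(**inputs).last_hidden_state.mean(dim=1)
print(embedding.shape)  # torch.Size([1, 768])
```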

    Analysis of Text Reviews from Users of Distance Learning Courses Based on Machine Learning Algorithms

    The PositionRank keyphrase extraction algorithm is described and evaluated as a way to extract useful information from user reviews of massive open online courses. Results are reported on a dataset of real reviews from Coursera.
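    For illustration, the open-source pke library ships a PositionRank implementation; whether the authors used it is not stated, so this is only a plausible reconstruction of the setup:

```python
import pke  # pip install pke; requires a spaCy English model

review = ("Great course, but the programming assignments were poorly "
          "graded and the discussion forums were rarely moderated.")

extractor = pke.unsupervised.PositionRank()
extractor.load_document(input=review, language="en")
# PositionRank keeps noun-phrase candidates and biases a random walk
# toward words that occur early in the text.
extractor.candidate_selection(maximum_word_number=3)
extractor.candidate_weighting(window=10)
for phrase, score in extractor.get_n_best(n=5):
    print(f"{score:.4f}  {phrase}")
```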

    A Hybrid Weighting Method for Arabic Keyphrase Extraction

    The sheer volume of available information makes indexing documents and finding their core content a difficult problem. Most available documents come without associated keywords, which forces readers to read an entire document to get a full picture of its content. Automatic keyphrase extraction with the YAKE algorithm offers a fast solution based on local features of a single document. However, using local features alone makes the extracted phrases less relevant, because significant terms that appear in other documents are also needed. A further problem is that some local features cannot be used for Arabic, capitalization being one example. This study proposes a term-weighting method that integrates local statistical features of a document with external features drawn from other documents for keyphrase extraction. The method works effectively for Arabic, can be applied to other languages that lack capitalization, and handles unstructured documents such as news articles and scientific papers. Experiments show that it outperforms the baseline methods, YAKE and TF-IDF.
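    A toy sketch of the hybrid idea, combining YAKE's local, single-document score with an IDF-style weight from an external corpus; the combination and the choice of IDF are illustrative assumptions, not the authors' exact formula (yake's Arabic stopword support is also assumed):

```python
import math
import yake  # pip install yake

def hybrid_keyphrases(document, background_docs, top=10):
    """Rank phrases by a YAKE score adjusted with an IDF-style external
    weight (lower combined score = better, matching YAKE's convention)."""
    n_docs = len(background_docs)
    extractor = yake.KeywordExtractor(lan="ar", n=3, top=50)
    scored = []
    for phrase, local_score in extractor.extract_keywords(document):
        df = sum(1 for doc in background_docs if phrase in doc)
        idf = math.log((n_docs + 1) / (df + 1)) + 1.0
        # Dividing by idf favours phrases that are rare across the
        # external corpus while keeping YAKE's local evidence.
        scored.append((phrase, local_score / idf))
    return sorted(scored, key=lambda item: item[1])[:top]
```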

    Applying Transformer-based Text Summarization for Keyphrase Generation

    Keyphrases are crucial for searching and systematizing scholarly documents. Most current methods for keyphrase extraction aim to extract the most significant words in the text. In practice, however, the list of keyphrases often includes words that do not appear in the text explicitly; in this case, the list of keyphrases represents an abstractive summary of the source text. In this paper, we experiment with popular transformer-based models for abstractive text summarization using four benchmark datasets for keyphrase extraction, and we compare the results with those of common unsupervised and supervised methods for keyphrase extraction. Our evaluation shows that summarization models are quite effective at generating keyphrases in terms of the full-match F1-score and BERTScore. However, they produce many words that are absent from the author's list of keyphrases, which makes summarization models ineffective in terms of ROUGE-1. We also investigate several ordering strategies for concatenating target keyphrases; the results show that the choice of strategy affects the performance of keyphrase generation. Comment: 15 pages, 4 figures. DAMDID-202
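    A minimal sketch of the approach: the keyphrase list is treated as the target "summary" of a seq2seq model. The generic BART summarizer below stands in for the fine-tuned models; in the paper's setting, the decoder is trained to emit a delimited keyphrase sequence rather than sentences:

```python
from transformers import pipeline

# Off-the-shelf summarizer as a stand-in; the actual experiments
# fine-tune such models on (document, concatenated-keyphrases) pairs.
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

abstract = ("We propose a graph-based ranking model for keyphrase "
            "extraction that scores candidate noun phrases by their "
            "position and co-occurrence statistics.")
out = summarizer(abstract, max_length=24, min_length=4, do_sample=False)
# After fine-tuning, this output would be split on the delimiter to
# recover the predicted keyphrase list.
print(out[0]["summary_text"])
```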

    BibRank: Automatic Keyphrase Extraction Platform Using Metadata

    Automatic keyphrase extraction involves identifying essential phrases in a document. These keyphrases are crucial in various tasks such as document classification, clustering, recommendation, indexing, searching, summarization, and text simplification. This paper introduces a platform that integrates keyphrase datasets and facilitates the evaluation of keyphrase extraction algorithms. The platform includes BibRank, an automatic keyphrase extraction algorithm that leverages a rich dataset obtained by parsing bibliographic data in BibTeX format. BibRank combines innovative weighting techniques with positional, statistical, and word co-occurrence information to extract keyphrases from documents. The platform proves valuable for researchers and developers seeking to enhance their keyphrase extraction algorithms and advance the field of natural language processing. Comment: 12 pages, 4 figures, 8 tables
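    An illustrative scorer combining the three signal types the abstract names (position, frequency statistics, word co-occurrence); the actual BibRank weighting, including its use of BibTeX metadata, is defined by the paper and platform and is not reproduced here:

```python
from collections import Counter
from itertools import combinations

def score_candidates(tokens, candidates, window=5):
    """Score candidate terms by first position, relative frequency,
    and windowed co-occurrence with other candidates (illustrative)."""
    freq = Counter(t for t in tokens if t in candidates)
    first_pos = {c: tokens.index(c) for c in candidates if c in tokens}
    cooc = Counter()
    for i in range(max(1, len(tokens) - window + 1)):
        present = set(tokens[i:i + window]) & set(candidates)
        for a, b in combinations(sorted(present), 2):
            cooc[a] += 1
            cooc[b] += 1
    scores = {}
    for c in candidates:
        positional = 1.0 / (1 + first_pos.get(c, len(tokens)))
        statistical = freq[c] / max(len(tokens), 1)
        scores[c] = positional + statistical + 0.1 * cooc[c]
    return sorted(scores.items(), key=lambda kv: -kv[1])

tokens = ("bibrank parses bibtex metadata and ranks keyphrases "
          "using positional and co-occurrence weights").split()
print(score_candidates(tokens, {"metadata", "keyphrases", "weights"}))
```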

    Survey on Insurance Claim Analysis Using Natural Language Processing and Machine Learning

    In today's insurance industry, data is a major asset and plays a key role; insurance carriers now have a wealth of information available to them. Three major eras can be identified in the industry's more than 700-year history: the manual era from the 15th century to 1960, the systems era from 1960 to 2000, and the current digital era (2001-20X0). Across all three eras, the core insurance sector has relied on data analytics and on adopting new technologies to improve and sustain existing practices while preserving capital; this has been the highest corporate objective. In recent years, AI techniques have been progressively applied to a variety of insurance activities. In this study, we give a comprehensive general assessment of the existing research that incorporates multiple artificial intelligence (AI) methods into all essential insurance tasks. Although a number of surveys on applying AI to particular insurance tasks have already been published, our work provides a more comprehensive review of this research. We study learning algorithms, big data, blockchain, data mining, and conversational theory, and their applications in insurance policy, claim prediction, risk estimation, and other fields, in order to comprehensively integrate existing work in the insurance sector using AI approaches.