Applications of Mining Arabic Text: A Review

Author: Al-Radaideh Qasem
Publication venue: 'IntechOpen'
Publication date: 14/02/2020

Since the emergence of text mining, there has been growing interest in applying text mining tasks to text written in Arabic, and researchers face several challenges in doing so. These tasks include Arabic text summarization, which is one of the challenging open research areas in natural language processing (NLP) and text mining, as well as Arabic text categorization and Arabic sentiment analysis. This chapter reviews past and current research and trends in these areas, along with future challenges that need to be tackled. It also presents case studies for two of the reviewed approaches.

IntechOpen

Crossref

MultiGBS: A multi-layer graph approach to biomedical summarization

Authors: Davoodijam Ensieh; Ghadiri Nasser; Rinaldi Fabio; Shahreza Maryam Lotfi
Publication venue: 'Elsevier BV'
Publication date: 01/01/2021

Automatic text summarization methods generate a shorter version of the input text to assist the reader in gaining a quick yet informative gist. Existing text summarization methods generally focus on a single aspect of text when selecting sentences, causing the potential loss of essential information. In this study, we propose a domain-specific method that models a document as a multi-layer graph to enable multiple features of the text to be processed at the same time. The features we used in this paper are word similarity, semantic similarity, and co-reference similarity, which are modelled as three different layers. The unsupervised method selects sentences from the multi-layer graph based on the MultiRank algorithm and the number of concepts. The proposed MultiGBS algorithm employs UMLS and extracts the concepts and relationships using different tools such as SemRep, MetaMap, and OGER. Extensive evaluation by ROUGE and BERTScore shows increased F-measure values
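
To make the multi-layer idea concrete, here is a minimal sketch that ranks sentences over several similarity layers. It is not the MultiRank algorithm used in the paper: as a simplifying assumption, the layers are averaged into one weighted graph and scored with a PageRank-style iteration. The function name `rank_sentences`, the damping value, and the toy similarity matrices are all illustrative, not taken from the paper.

```python
def rank_sentences(layers, damping=0.85, iterations=50):
    """Score sentences from several similarity layers.

    layers: list of square matrices (lists of lists); layers[k][i][j] is the
    similarity between sentences i and j in layer k (e.g. word, semantic,
    co-reference). Averaging the layers is a simplification; MultiRank
    weights layers adaptively instead.
    """
    n = len(layers[0])
    combined = [
        [sum(layer[i][j] for layer in layers) / len(layers) for j in range(n)]
        for i in range(n)
    ]
    # Total edge weight leaving each sentence, ignoring self-loops.
    out_weight = [
        sum(combined[j][i] for i in range(n) if i != j) for j in range(n)
    ]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            incoming = sum(
                scores[j] * combined[j][i] / out_weight[j]
                for j in range(n)
                if j != i and out_weight[j] > 0
            )
            updated.append((1.0 - damping) / n + damping * incoming)
        scores = updated
    return scores


# Toy example: three sentences, three layers (values are made up).
word_layer = [[0, 0.8, 0.6], [0.8, 0, 0.1], [0.6, 0.1, 0]]
semantic_layer = [[0, 0.7, 0.5], [0.7, 0, 0.2], [0.5, 0.2, 0]]
coref_layer = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
scores = rank_sentences([word_layer, semantic_layer, coref_layer])
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

In this toy input the first sentence is strongly connected to the other two in every layer, so it ends up at the top of `ranking`. A real system would also need the concept counts the abstract mentions when choosing the final sentences.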

arXiv.org e-Print Archive

Western Sydney ResearchDirect

SemPCA-Summarizer: Exploiting Semantic Principal Component Analysis for Automatic Summary Generation

Authors: Alcón Óscar; Lloret Elena
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 21/11/2018

Text summarization is the task of condensing a document while keeping its relevant information. Integrated into wider information systems, it can help users access key information without having to read everything, which improves efficiency. In this research work, we have developed and evaluated a single-document extractive summarization approach, named SemPCA-Summarizer, which reduces the dimension of a document using the Principal Component Analysis (PCA) technique enriched with semantic information. A concept-sentence matrix is built from the input document, and PCA is then used to identify and rank the relevant concepts, which are used to select the most important sentences through different heuristics, leading to various types of summaries. The results show that the generated summaries are very competitive, both quantitatively and qualitatively, indicating that the proposed approach is appropriate for briefly providing key information and thus helps users cope more quickly and efficiently with the huge amount of information available.
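
The PCA step can be sketched as follows, under stated assumptions. The matrix orientation (rows are sentences, columns are concepts), the use of only the first principal component, and the scoring heuristic (sum of concept weights per sentence) are illustrative choices; the paper describes several selection heuristics that are not reproduced here. The names `rank_concepts` and `select_sentences` are hypothetical.

```python
import numpy as np


def rank_concepts(matrix):
    """Weight each concept by its absolute loading on the first principal component.

    matrix: sentences x concepts occurrence counts (assumed orientation).
    Needs at least two concepts so that the covariance is a 2-D matrix.
    """
    data = np.asarray(matrix, dtype=float)
    covariance = np.cov(data, rowvar=False)  # columns (concepts) are the variables
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    leading = eigenvectors[:, np.argmax(eigenvalues)]
    return np.abs(leading)


def select_sentences(matrix, k):
    """Return the indices of the k highest-scoring sentences, in document order."""
    weights = rank_concepts(matrix)
    scores = np.asarray(matrix, dtype=float) @ weights
    top = np.argsort(scores)[::-1][:k]
    return sorted(int(i) for i in top)


# Toy concept counts for four sentences and three concepts (made-up values).
toy_matrix = [
    [4, 1, 0],
    [0, 1, 0],
    [3, 0, 1],
    [0, 1, 0],
]
summary_indices = select_sentences(toy_matrix, k=2)
```

Here the first concept varies most across sentences, so it dominates the leading component, and the two sentences that mention it most often are selected.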

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

Automatic Text Summarization for Hindi Using Real Coded Genetic Algorithm

Authors: Arora Anuja; Jain Arti; Kumar Kumar Vimal; Morato Lara Jorge Luis; Yadav Divakar
Publication venue: MDPI
Publication date: 01/06/2022

Automatic Text Summarization (ATS) is in great demand as a way to find relevant information faster in the ever-growing volume of text available online. In this research, an ATS methodology is proposed for the Hindi language using a Real Coded Genetic Algorithm (RCGA) over a health corpus available as a Kaggle dataset. The methodology comprises five phases: preprocessing, feature extraction, processing, sentence ranking, and summary generation. Rigorous experimentation on varied feature sets is performed, in which two distinguishing features, sentence similarity and named entity features, are combined with others for computing the evaluation metrics. The top 14 feature combinations are evaluated with the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) measure. RCGA computes appropriate feature weights through strings of features, chromosome selection, and the reproduction operators Simulated Binary Crossover and Polynomial Mutation. Different compression rates are tested when extracting the highest-scored sentences as the corpus summary. In comparison with existing summarization tools, the ATS extractive method gives a summary reduction of 65%.
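
The two reproduction operators named in the abstract can be sketched in their commonly used textbook forms. This is an illustration of the operators on a chromosome of feature weights, not the paper's implementation: the distribution indices (`eta`), the mutation rate, the `[0, 1]` weight bounds, and the example parent vectors are all assumptions.

```python
import random


def sbx_crossover(parent1, parent2, eta=15.0, rng=random):
    """Simulated binary crossover on two real-valued chromosomes.

    Children are not clipped, so they may fall slightly outside the
    parents' range.
    """
    child1, child2 = [], []
    for x1, x2 in zip(parent1, parent2):
        u = rng.random()
        if u <= 0.5:
            beta = (2.0 * u) ** (1.0 / (eta + 1.0))
        else:
            beta = (1.0 / (2.0 * (1.0 - u))) ** (1.0 / (eta + 1.0))
        child1.append(0.5 * ((1.0 + beta) * x1 + (1.0 - beta) * x2))
        child2.append(0.5 * ((1.0 - beta) * x1 + (1.0 + beta) * x2))
    return child1, child2


def polynomial_mutation(weights, lower=0.0, upper=1.0, eta=20.0, rate=0.1, rng=random):
    """Perturb each gene with probability `rate`, then clip it to [lower, upper]."""
    mutated = []
    for x in weights:
        if rng.random() < rate:
            u = rng.random()
            if u < 0.5:
                delta = (2.0 * u) ** (1.0 / (eta + 1.0)) - 1.0
            else:
                delta = 1.0 - (2.0 * (1.0 - u)) ** (1.0 / (eta + 1.0))
            x = min(upper, max(lower, x + delta * (upper - lower)))
        mutated.append(x)
    return mutated


# Two hypothetical parents: one weight per sentence feature.
rng = random.Random(42)
mother = [0.2, 0.8, 0.5]
father = [0.6, 0.1, 0.5]
child_a, child_b = sbx_crossover(mother, father, rng=rng)
child_a = polynomial_mutation(child_a, rng=rng)
```

A useful property of this crossover is that each pair of child genes keeps the same mean as the parent genes, so the search stays centred on the parents while `eta` controls how far the children spread.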

Directory of Open Access Journals

Universidad Carlos III de Madrid e-Archivo

Abstract Creation of Research Paper Using Feature Specific Sentence Extraction based Summarization

Author: Ms. Jagtap Jayanti, Prof. Patel H.H
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 31/07/2015

Several techniques for identifying important content for text summarization have been developed to date. Topic representation techniques first derive an intermediate representation of the text that captures the topics discussed in the input. Based on these topic representations, the sentences in the input document are scored for relevance. In our proposed system, sentence relevance detection assigns a score to each sentence based on its importance, and an extractive summary is produced by selecting the highest-scoring sentences. This summary is then used to generate an abstract with an enhanced summarization technique that takes the sentences from the summary one by one and builds a word graph. An enhanced edge-weighting strategy is applied to strengthen the connections between words in the generated graph. To find a few shortest-path sentences, the proposed method uses Dijkstra's algorithm. Before the best shortest-path sentences are chosen, the system checks the grammatical structure of each sentence. The results show that the extractive and abstractive-oriented summaries produced by the improved COMPENDIUM outperform the existing summarization system. We used feature-specific sentence extraction techniques, which improve the effectiveness of the summarization strategy. DOI: 10.17762/ijritcc2321-8169.15074
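
The word-graph and shortest-path steps can be sketched as below. This is a simplified illustration under assumptions, not the paper's system: words are merged by lowercase surface form, an edge's weight is the inverse of how often the word pair occurs, and no grammaticality check or minimum path length is applied. The `START`/`END` markers and both function names are hypothetical.

```python
import heapq
from collections import defaultdict

START, END = "<s>", "</s>"  # hypothetical sentence boundary markers


def build_word_graph(sentences):
    """Build a directed word graph; frequent word pairs get cheaper edges."""
    pair_counts = defaultdict(int)
    for sentence in sentences:
        tokens = [START] + sentence.lower().split() + [END]
        for left, right in zip(tokens, tokens[1:]):
            pair_counts[(left, right)] += 1
    graph = defaultdict(dict)
    for (left, right), count in pair_counts.items():
        graph[left][right] = 1.0 / count  # assumed weighting, not the paper's
    return graph


def shortest_path(graph, source=START, target=END):
    """Dijkstra's algorithm; returns the words on the cheapest path."""
    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    if target not in best:
        return []
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    path.reverse()
    return path[1:-1]  # drop the boundary markers


example_sentences = [
    "the model selects key sentences",
    "the model ranks key sentences quickly",
    "the model selects key sentences",
]
compressed = shortest_path(build_word_graph(example_sentences))
```

Because the cheapest path follows the most frequently repeated word pairs, the output favours wording shared across the input sentences. A plain shortest path also tends to prefer very short outputs, which is one reason a grammatical check of the kind the abstract mentions is needed before accepting a path.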

International Journal on Recent and Innovation Trends in Computing and Communication