30 research outputs found

    Multi Document Summarization Based On Cross-Document Relation Using Voting Technique

    News articles available through online search often present readers with a large collection of texts. In the case of news stories especially, a reader's search usually returns multiple articles from different news sources reporting on the same event. In this work, we first identify cross-document relations from un-annotated texts using a Genetic-CBR approach. We then develop a new sentence scoring model based on a voting technique over the identified cross-document relations. Our experiments show that incorporating the proposed methods in the summarization process yields substantial improvement over the mainstream methods. The performance of all methods was evaluated using ROUGE, a standard evaluation metric in text summarization.

    Deep sequential pattern mining for readability enhancement of Indonesian summarization

    In text summarization research, readability is a major issue that must be addressed. Our hypothesis is that readability can be achieved by using text representations that keep the meaning of text documents intact. Therefore, this study combines sequential pattern mining (SPM), which produces sequences of words as the text representation, with unsupervised deep learning to produce Indonesian text summaries; the resulting method is called DeepSPM. This research uses PrefixSpan as the SPM algorithm and a deep belief network (DBN) as the unsupervised deep learning method. The data consist of 18,774 Indonesian news texts from IndoSum. Readability is evaluated by recall-oriented understudy for gisting evaluation (ROUGE) as a co-selection-based analysis; by Dwiyanto Djoko Pranowo metrics, Gunning fog index (GFI), and Flesch-Kincaid grade level (FKGL) as content-based analysis; and by human readability evaluation with two experts. The experimental results show that DeepSPM yields better results than DBN, with F-measure values of 0.462 for ROUGE-1, 0.37 for ROUGE-2, and 0.41 for ROUGE-L. The significance of the ROUGE results is also tested using a t-test. The findings of the content-based analysis and the human readability evaluation are consistent with those of the co-selection-based analysis: the generated summaries are only partially readable, or have a medium level of readability.
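    As an illustration of the co-selection metric mentioned above, the following is a minimal sketch of a ROUGE-N F-measure. It is not the evaluation code used in the study: it assumes a single reference summary, lowercase whitespace tokenization, and no stemming or stopword handling, and the function names `ngrams` and `rouge_n` are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Simplified ROUGE-N against a single reference summary.

    Assumption: lowercase whitespace tokenization, no stemming.
    Returns (precision, recall, f_measure).
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram,
    # so repeated n-grams are only credited as often as the reference has them.
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


if __name__ == "__main__":
    summary = "the cat sat on the mat"
    gold = "the cat lay on the mat"
    print(rouge_n(summary, gold, n=1))  # unigram overlap (ROUGE-1)
    print(rouge_n(summary, gold, n=2))  # bigram overlap (ROUGE-2)
```

    Published ROUGE scores typically come from a full toolkit with its own tokenization and options, so values from this sketch are not directly comparable with the figures reported in the abstract.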

    Feature-based approach and sequential pattern mining to enhance quality of Indonesian automatic text summarization

    Research on Indonesian automatic text summarization is developing rapidly. The quality of a text summary, especially its readability, can be achieved if the meaning of the text is properly maintained. Therefore, this research aims to enhance the quality of extractive Indonesian automatic text summarization by considering the quality of the structured representation of text. This research uses sequential pattern mining (SPM) with the PrefixSpan algorithm to produce sequences of words (SoW) as the structured text representation. SPM is then combined with a feature-based approach that uses a sentence scoring method to produce the summary. Experimental results on the IndoSum dataset show that the combination of SPM and sentence scoring increases the precision values of recall-oriented understudy for gisting evaluation (ROUGE)-1, ROUGE-2, and ROUGE-L from 0.68 to 0.76, 0.54 to 0.69, and 0.51 to 0.72, respectively. In particular, the combination of SPM and sentence scoring enhances the precision, recall, and F-measure of ROUGE-L, which takes the order of word occurrence into account. SPM increases the ROUGE-L F-measure of sentence scoring from 0.32 to 0.36. Moreover, the combination of sentence scoring and SPM performs better than SumBasic, which was used as the feature-based approach in previous Indonesian text summarization research.
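    ROUGE-L differs from ROUGE-1 and ROUGE-2 in that it is sensitive to word order, because it is built on the longest common subsequence (LCS) of the candidate and reference summaries. The sketch below shows a simplified sentence-level version. It assumes a single reference, whitespace tokenization, and an unweighted F-measure (equal weight on precision and recall); the names `lcs_length` and `rouge_l` are hypothetical and this is not the evaluation code used in the study.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """Simplified sentence-level ROUGE-L.

    Assumption: whitespace tokenization and an unweighted F-measure.
    Returns (precision, recall, f_measure).
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0, 0.0, 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return precision, recall, 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    reference = "police kill the gunman"
    # Same vocabulary overlap, different word order, different ROUGE-L.
    print(rouge_l("police killed the gunman", reference))
    print(rouge_l("the gunman kill police", reference))
```

    The second candidate shares all four of its words with the reference, yet its LCS covers only two of them, which is why ROUGE-L penalizes reordered output where a unigram metric would not.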

    A Review On Automatic Text Summarization Approaches

    It has been more than 50 years since the initial investigation into automatic text summarization began. Various techniques have been successfully used to extract the important content from a text document to represent the document summary. In this study, we review some of the studies that have been conducted in this still-developing research area. The review covers the basics of text summarization, the types of summarization, the methods that have been used, and some areas in which text summarization has been applied. Furthermore, this paper reviews the significant efforts that have been put into studies concerning sentence extraction, domain-specific summarization, and multi-document summarization, and provides the theoretical explanation and the fundamental concepts related to them. In addition, the advantages and limitations of the approaches commonly used for text summarization are highlighted.

    An improved deepfake detection method based on CNNs

    Today's image generation technology can produce high-quality face images, and it is difficult to judge the authenticity of the generated images by eye. This study aims to improve the detection of deepfakes, a form of face-swapping forgery, by drawing on the advantages of deep learning technologies. The study generates a unified and enhanced data set from multiple sources using spatial enhancement technology to address the problem of poor detection performance across data sets. Taking advantage of the Inception and ResNet networks, a new architecture composed of 20 network layers is proposed as the deepfake detection model. To further improve the proposed model, its hyperparameter values are optimized. The experimental results show that the proposed network significantly improves on mainstream methods such as ResNeXt50, ResNet101, XceptionNet, and VGG19 in terms of accuracy, loss value, AUC, number of parameters, and FLOPs. Overall, the methods introduced in this study can help to expand the data set, better detect deepfake content, and effectively optimize the network model.

    Ad Hoc On-Demand Distance Vector (AODV) Routing Protocol in Vehicular Ad Hoc Network (VANET): An Analysis Study

    One variation of the Mobile Ad Hoc Network (MANET) is the Vehicular Ad Hoc Network (VANET). A VANET is also part of Intelligent Transportation Systems (ITS) and uses cars as the nodes of a mobile network. Communication in VANET is categorized into three types: vehicle-to-vehicle communication (V2V), vehicle-to-roadside communication (V2R), and vehicle-to-infrastructure communication (V2I) [2]. The routing protocol investigated in this research is a topology-based ad hoc routing protocol, namely Ad hoc On-Demand Distance Vector (AODV). The protocol is evaluated based on throughput and packet drop. This research investigates the latest trends in routing protocols used in VANET, evaluates the routing protocol in VANET using two performance parameters, and implements the routing protocol in VANET in a network simulator. The research was conducted using Network Simulator (NS-3) simulation.

    Cross-document Structural Relationship Identification Using Supervised Machine Learning

    Multi-document analysis has been a field of interest for decades and is still actively researched today. One example of such analysis is multi-document summarization, which aims to produce a concise description of the original documents. In this paper, we focus on some special properties of multi-document collections, specifically news articles. Information across news articles reporting on the same story is often related. Cross-document Structure Theory (CST) defines several relationships between pairs of sentences from different documents. Among them, we focus on four relations, namely “Identity”, “Overlap”, “Subsumption”, and “Description”. Our aim is to identify these CST relationships automatically. We applied three machine learning techniques: SVM, neural network, and our proposed case-based reasoning (CBR) model. A comparison of these techniques shows that the proposed CBR model yields better results.

    A Genetic-CBR Approach for Cross-Document Relationship Identification

    Various applications involving multiple documents have emerged recently. Information across topically related documents can often be linked. Cross-document Structure Theory (CST) analyzes the relationships that exist between sentences across related documents. However, most existing work relies on human experts to identify the CST relationships. In this work, we aim to identify some of the CST relations automatically using a supervised learning method. We propose a Genetic-CBR approach, which incorporates a genetic algorithm (GA) to improve case-based reasoning (CBR) classification. The GA is used to scale the weights of the data features used by the CBR classifier. We perform the experiments on datasets obtained from the CSTBank corpus. Comparison with other learning methods shows that the proposed method yields better results.
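    To make the role of feature weights concrete, the sketch below shows how a weight vector can change which stored case a CBR classifier retrieves, and how a leave-one-out accuracy score could serve as the fitness that a GA ranks candidate weight vectors by. Everything here is an assumption for illustration: the abstract does not specify the similarity function, the features, or the fitness measure, the toy feature values are invented, and the GA search itself is not shown.

```python
def weighted_similarity(a, b, weights):
    """Similarity in (0, 1] derived from a weighted Euclidean distance."""
    dist = sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)) ** 0.5
    return 1.0 / (1.0 + dist)


def classify(query, case_base, weights):
    """Retrieve the most similar stored case and reuse its label."""
    best = max(case_base, key=lambda case: weighted_similarity(query, case[0], weights))
    return best[1]


def fitness(weights, case_base):
    """Leave-one-out accuracy of the retrieval step for one weight vector.

    Assumption: a GA would evaluate each candidate weight vector this way.
    """
    hits = 0
    for i, (features, label) in enumerate(case_base):
        rest = case_base[:i] + case_base[i + 1:]
        hits += classify(features, rest, weights) == label
    return hits / len(case_base)


if __name__ == "__main__":
    # Hypothetical sentence-pair features: the first is informative,
    # the second is noise with respect to the label.
    cases = [
        ((0.9, 0.1), "Overlap"),
        ((0.8, 0.9), "Overlap"),
        ((0.2, 0.2), "Description"),
        ((0.1, 0.8), "Description"),
    ]
    print(fitness((1.0, 0.0), cases))  # weight on the informative feature
    print(fitness((0.0, 1.0), cases))  # weight on the noisy feature
```

    On this toy case base the two weight vectors give very different leave-one-out accuracy, which is the kind of signal a GA could exploit when searching for feature weights.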

    Using SVMs for Classification of Cross-Document Relationships

    Cross-document Structure Theory (CST) has recently been proposed to facilitate tasks related to multi-document analysis. Classifying and identifying the CST relationships between sentences across topically related documents have since proven necessary. However, few studies in the literature address the automatic identification of these CST relationships. In this study, a supervised machine learning technique, Support Vector Machines (SVMs), was applied to identify four types of CST relationships, namely “Identity”, “Overlap”, “Subsumption”, and “Description”, on datasets obtained from the CSTBank corpus. The performance of the SVM classification was measured using precision, recall, and F-measure. In addition, the results obtained using SVMs were compared with those reported in the previous literature using a boosting classification algorithm. SVMs were found to yield better results in classifying the four CST relationships.
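    The precision, recall, and F-measure used to evaluate the classifier can be computed per relation type from gold and predicted labels. The following is a minimal sketch under assumptions: the function name `per_class_prf` is hypothetical, the label sequences are invented for illustration, and the F-measure weights precision and recall equally.

```python
def per_class_prf(gold, predicted, label):
    """Precision, recall, and F-measure for one class label."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == label and p == label for g, p in pairs)  # true positives
    fp = sum(g != label and p == label for g, p in pairs)  # false positives
    fn = sum(g == label and p != label for g, p in pairs)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Invented labels for six sentence pairs, for illustration only.
    gold = ["Identity", "Overlap", "Overlap", "Subsumption", "Description", "Overlap"]
    pred = ["Identity", "Overlap", "Subsumption", "Subsumption", "Overlap", "Overlap"]
    for relation in ("Identity", "Overlap", "Subsumption", "Description"):
        print(relation, per_class_prf(gold, pred, relation))
```

    Reporting the three values per relation, rather than a single accuracy figure, shows which CST relationships a classifier confuses with one another.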