
90 research outputs found

Machine Learning of Generic and User-Focused Summarization

Author: Bloedorn Eric
Mani Inderjeet
Publication date: 01/01/1998

A key problem in text summarization is finding a salience function which determines what information in the source should be included in the summary. This paper describes the use of machine learning on a training corpus of documents and their abstracts to discover salience functions which describe what combination of features is optimal for a given summarization task. The method addresses both "generic" and user-focused summaries. Comment: In Proceedings of the Fifteenth National Conference on AI (AAAI-98), p. 821-82

arXiv.org e-Print Archive

CiteSeerX

Sentiment Analysis: State of the Art

Author: Chalothorn Tawunrat
Ellman Jeremy
Publication venue: Institute of Research Engineers and Doctors
Publication date: 01/08/2013

We present the state of the art in sentiment analysis, covering the purpose of sentiment analysis, the levels of sentiment analysis, and the processes that can be used to measure polarity and classify labels. Brief details about some sentiment analysis resources are also included.

Northumbria Research Link

The Today Tendency of Sentiment Classification

Author: Phu Vo Ngoc
Tran Vo Thi Ngoc
Publication venue: 'IntechOpen'
Publication date: 27/06/2018

Sentiment classification has been studied for many years because it has made crucial contributions to many fields of everyday life, such as political activities, commodity production, and commercial activities. Many kinds of sentiment analysis, such as machine learning approaches and lexicon-based approaches, have been developed over the years. The current tendency of sentiment classification is as follows: (1) processing many big data sets with shorter execution times; (2) achieving high accuracy; (3) integrating flexibly and easily into many small machines or many different approaches. We will present each category in more detail.

IntechOpen

Crossref

Fault-Tolerant Learning for Term Extraction

Author: Lu Yingliang
Meng Yao
Xia Yingju
Yang Yuhang
Yu Hao
Publication venue: Institute of Digital Enhancement of Cognitive Processing, Waseda University
Publication date: 01/01/2011

Waseda University Repository

Web news mining in an evolving framework

Author: Iglesias Martínez José Antonio
Ledezma Espino Agapito Ismael
Sanchis de Miguel María Araceli
Tiemblo Alexandra
Publication venue: 'Elsevier BV'
Publication date: 01/03/2016

Online news has become one of the major channels for Internet users to get news. News websites are overwhelmed daily with news articles. Huge amounts of online news articles are generated and updated every day, and the processing and analysis of this large corpus of data is an important challenge. This challenge needs to be tackled by using big data techniques which process large volumes of data within limited run times. Also, since we are heading into a social-media data explosion, techniques such as text mining or social network analysis need to be seriously taken into consideration. In this work we focus on one of the most common daily activities: web news reading. News websites produce thousands of articles covering a wide spectrum of topics or categories, which can be considered a big data problem. In order to extract useful information, these news articles need to be processed by using big data techniques. In this context, we present an approach for classifying huge amounts of different news articles into various categories (topic areas) based on the text content of the articles. Since these categories are constantly updated with new articles, our approach is based on Evolving Fuzzy Systems (EFS). The EFS can update in real time the model that describes a category according to the changes in the content of the corresponding articles. The novelty of the proposed system lies in the treatment of the web news articles to be used by these systems and the implementation and adjustment of them for this task. Our proposal not only classifies news articles, but it also creates human-interpretable models of the different categories. This approach has been successfully tested using real on-line news. (C) 2015 Elsevier B.V. All rights reserved. This work has been supported by the Spanish Government under i-Support (Intelligent Agent Based Driver Decision Support) Project (TRA2011-29454-C03-03).

Universidad Carlos III de Madrid e-Archivo

A Hierarchical Emotion Classification Technique for Thai Reviews

Author: Charoensuk Jirawan
Sornil Ohm
Publication venue: LPPM ITBis Lembah Dempo
Publication date: 01/12/2018

Emotion classification is an interesting problem in affective computing that can be applied in various tasks, such as speech synthesis, image processing and text processing. The amount of textual data on the Internet is increasing, especially customer reviews that express opinions and emotions about products. These reviews are important feedback for companies. Emotion classification aims to identify an emotion label for each review. This research investigated three approaches for emotion classification of opinions in the Thai language, written in unstructured format, free form or informal style. Different sets of features were studied in detail and analyzed. The experimental results showed that a hierarchical approach, where the subjectivity of the review is determined first, then the polarity of opinion is identified and finally the emotional label is calculated, yielded the highest performance, with precision, recall and F-measure at 0.691, 0.743 and 0.709, respectively.

Journal of ICT Research and Applications

Directory of Open Access Journals

ITB Journal

Word segmentation of Vietnamese texts: a comparison of approaches

Author: Dinh Quang Thang
Le Hong Phuong
Nguyen Cam Tu
Nguyen Thi Minh Huyen
Rossignol Mathias
Vu Xuan Luong
Publication venue: HAL CCSD
Publication date: 28/05/2008

We present in this paper a comparison between three segmentation systems for the Vietnamese language. The majority of Vietnamese words are built by semantic composition from about 7,000 syllables, which also have a meaning as isolated words. The identification of word boundaries in a text is therefore not a simple task, and ambiguities often appear. Beyond the presentation of the tested systems, we also propose a standard definition for word segmentation in Vietnamese, and introduce a reference corpus developed for the purpose of evaluating such a task. The results observed confirm that it can be handled relatively well by automatic means, although a solution needs to be found to take into account out-of-vocabulary words.

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Survey on Insurance Claim analysis using Natural Language Processing and Machine Learning

Author: Sapana Kolambe et al.
Publication venue: Auricle Global Society of Education and Research
Publication date: 02/11/2023

In the insurance industry today, data is a major asset and plays a key role. A wealth of information is available to insurance carriers. We can identify three major eras in the insurance industry's more than 700-year history: the manual era from the 15th century to 1960, the systems era from 1960 to 2000, and the current digital era, i.e., 2001-20X0. The core insurance sector has been shaped by trusting data analytics and implementing new technologies to improve and maintain existing practices and preserve capital. This has been the highest corporate objective in all three periods. AI techniques have been progressively utilized for a variety of insurance activities in recent years. In this study, we give a comprehensive general assessment of the existing research that incorporates multiple artificial intelligence (AI) methods into all essential insurance tasks. Although a number of reviews have already been published on the use of artificial intelligence for certain insurance tasks, our work provides a more comprehensive review of this research. We study algorithms for learning, big data, blockchain, data mining, and conversational theory, and their applications in insurance policy, claim prediction, risk estimation, and other fields in order to comprehensively integrate existing work in the insurance sector using AI approaches.

International Journal on Recent and Innovation Trends in Computing and Communication

Satellite Workshop On Language, Artificial Intelligence and Computer Science for Natural Language Processing Applications (LAICS-NLP): Discovery of Meaning from Text

Author: Kulathuramaiyer Narayanan
Ong Siou Chin
Yeo Alvin Wee
Publication venue: Faculty of Engineering, Kasetsart University, Bangkok, Thailand
Publication date: 01/01/2006

This paper proposes a novel method to disambiguate important words from a collection of documents. The hypothesis that underlies this approach is that there is a minimal set of senses that are significant in characterizing a context. We extend Yarowsky’s one sense per discourse hypothesis [13] to a collection of related documents rather than a single document. We perform distributed clustering on a set of features representing each of the top ten categories of documents in the Reuters-21578 dataset. Groups of terms that have a similar distributional pattern across documents were identified. A WordNet-based similarity measure was then computed for terms within each cluster. Aggregating the WordNet associations used to ascertain term similarity within clusters provided a means of identifying the clusters’ root senses.

Unimas Institutional Repository