
3 research outputs found

Text segmentation for analysing different languages

Author: Pak Irina *
Teh Phoey Lee *
Publication date: 11/11/2016

Over the past several years, researchers have applied different methods of text segmentation. Text segmentation is defined as a method of splitting a document into smaller segments, each assumed to carry its own relevant meaning. These segments can be classified as tags, words, sentences, topics, phrases, or any other information unit. This study first reviews the different types of text segmentation methods used in different types of documents, and then discusses the various reasons for using them in opinion mining. The main contribution of this study is a summary of research papers from the past 10 years that applied text segmentation as their main approach to text analysis. Results show that word segmentation was successfully and widely used for processing different languages.
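As a rough illustration of the sentence-level and word-level segmentation the abstract describes, the following is a minimal sketch using only Python's standard `re` module. The function names `split_sentences` and `split_words` and the sample text are assumptions made for this example and are not taken from the paper. The rules are deliberately naive: they do not handle abbreviations, and whitespace-based word splitting does not apply to languages written without spaces between words.

```python
import re


def split_sentences(text):
    """Split text after '.', '!' or '?' when followed by whitespace.

    Naive rule: abbreviations such as "e.g." would be split incorrectly.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def split_words(sentence):
    """Return lowercase word tokens, keeping apostrophes inside words."""
    return re.findall(r"[a-z0-9]+(?:'[a-z0-9]+)*", sentence.lower())


# Hypothetical sample document, invented for this sketch.
document = "Text segmentation is widely used. Each segment has its own meaning!"

sentences = split_sentences(document)        # sentence-level segments
words = [split_words(s) for s in sentences]  # word-level segments per sentence
```

Here `sentences` holds the two sentence segments and `words` holds one list of word segments per sentence, which shows how the same document yields segments of different granularity.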

Crossref

Sunway Institutional Repository

Text segmentation techniques: A critical review

Author: Pak Irina *
Teh Phoey Lee *
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 22/11/2017

Text segmentation is widely used for processing text. It is a method of splitting a document into smaller parts, which are usually called segments. Each segment has its own relevant meaning. These segments are categorized as word, sentence, topic, phrase, or any other information unit, depending on the task of the text analysis. This study presents various reasons for using text segmentation in different analytical approaches. We categorized the types of documents and languages used. The main contribution of this study is a summary of 50 research papers and an illustration of the past decade (January 2007 to January 2017) of research that applied text segmentation as the main approach for analysing text. Results revealed the popularity of text segmentation across different languages. In addition, the “word” seems to be the most practical and usable segment, as it is a smaller unit than the phrase, sentence, or line.

Crossref

Sunway Institutional Repository

Optimization of Sentiment Analysis Methods for classifying text comments of bank customers

Author: Komotskiy E.
Lutfullaeva M.
Medvedeva M.
Spasov K.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2018

A method for sentiment analysis of text is presented and validated on the problem of analyzing text comments left by a bank's customers. The proposed method combines three approaches: rule-based, dictionary-based, and supervised machine learning. A new method of text vectorization, tonal vectorization, is proposed in place of classical ones such as “bag-of-words” and TF-IDF. The text was classified by logistic regression with regularization. A series of experiments was carried out, and the optimal value of the regularization parameter in terms of classification accuracy was found. © 201
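For context on the classical baselines the abstract names, the following is a minimal sketch of bag-of-words and TF-IDF vectorization in plain Python. It does not implement the authors' tonal vectorization, which the abstract does not specify. The helper names `bag_of_words` and `tf_idf`, the toy comments, and the unsmoothed `log(N / df)` form of IDF are assumptions made for this example; other IDF conventions are also in common use.

```python
import math
from collections import Counter


def bag_of_words(tokens, vocabulary):
    """Count how often each vocabulary term occurs in one tokenized text."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


def tf_idf(docs):
    """Return (vocabulary, vectors) using tf = count / length, idf = log(N / df)."""
    vocabulary = sorted({term for doc in docs for term in doc})
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n_docs / df[term]) for term in vocabulary}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        length = max(len(doc), 1)  # guard against empty documents
        vectors.append([counts[t] / length * idf[t] for t in vocabulary])
    return vocabulary, vectors


# Hypothetical tokenized customer comments, invented for this sketch.
comments = [["good", "service"], ["bad", "service"], ["good", "app"]]

vocabulary, vectors = tf_idf(comments)
bow_first = bag_of_words(comments[0], vocabulary)
```

Vectors like these could then be passed to a regularized classifier such as the logistic regression mentioned in the abstract, with the regularization strength chosen by comparing accuracy on held-out data.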

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin