Search CORE

293 research outputs found

A survey on Data Extraction and Data Duplication Detection

Author: Yashika A. Shah, Snehal S. Zade, Smita M. Raut, Shraddha P. Shirbhate, Vijeta U. Khadse, Anup P. Date
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 31/05/2018
Field of study

Text mining, also known as Intelligent Text Analysis is an important research area. It is very difficult to focus on the most appropriate information due to the high dimensionality of data. Feature Extraction is one of the important techniques in data reduction to discover the most important features. Processing massive amount of data stored in a unstructured form is a challenging task. Several pre-processing methods and algorithms are needed to extract useful features from huge amount of data. Dealing with collection of text documents, it is also very important to filter out duplicate data. Once duplicates are deleted, it is recommended to replace the removed duplicates. This Paper review the literature on duplicate detection and data fusion (remov e and replace duplicates).The survey provides existing text mining techniques to extract relevant features, detect duplicates and to replace the duplicate data to get fine grained knowledge to the user

International Journal on Recent and Innovation Trends in Computing and Communication

Advances in Emotion Recognition: Link to Depressive Disorder

Author: Cheng Xiaotong
Feng Zhengzhi
Ouyang Tante
Wang Xiaoxia
Publication venue: 'IntechOpen'
Publication date: 17/04/2020
Field of study

Emotion recognition enables real-time analysis, tagging, and inference of cognitive affective states from human facial expression, speech and tone, body posture and physiological signal, as well as social text on social network platform. Recognition of emotion pattern based on explicit and implicit features extracted through wearable and other devices could be decoded through computational modeling. Meanwhile, emotion recognition and computation are critical to detection and diagnosis of potential patients of mood disorder. The chapter aims to summarize the main findings in the area of affective recognition and its applications in major depressive disorder (MDD), which have made rapid progress in the last decade

IntechOpen

Crossref

Role of sentiment classification in sentiment analysis: a survey

Author: J Prabhu
M R Pavan Kumar
Publication venue: Annals of Library and Information Studies (ALIS)
Publication date: 06/11/2018
Field of study

Through a survey of literature, the role of sentiment classification in sentiment analysis has been reviewed. The review identifies the research challenges involved in tackling sentiment classification. A total of 68 articles during 2015 – 2017 have been reviewed on six dimensions viz., sentiment classification, feature extraction, cross-lingual sentiment classification, cross-domain sentiment classification, lexica and corpora creation and multi-label sentiment classification. This study discusses the prominence and effects of sentiment classification in sentiment evaluation and a lot of further research needs to be done for productive results

Online Publishing @ NISCAIR

Unsupervised learning on social data

Author: Borutta Felix
Publication venue: Ludwig-Maximilians-Universität München
Publication date: 11/03/2020
Field of study

Feature Extraction and Duplicate Detection for Text Mining: A Survey

Author: Ramya R S
Venugopal K R
Publication venue: Global Journals Inc. (US)
Publication date: 22/04/2016
Field of study

Text mining, also known as Intelligent Text Analysis is an important research area. It is very difficult to focus on the most appropriate information due to the high dimensionality of data. Feature Extraction is one of the important techniques in data reduction to discover the most important features. Proce- ssing massive amount of data stored in a unstructured form is a challenging task. Several pre-processing methods and algo- rithms are needed to extract useful features from huge amount of data. The survey covers different text summarization, classi- fication, clustering methods to discover useful features and also discovering query facets which are multiple groups of words or phrases that explain and summarize the content covered by a query thereby reducing time taken by the user. Dealing with collection of text documents, it is also very important to filter out duplicate data. Once duplicates are deleted, it is recommended to replace the removed duplicates. Hence we also review the literature on duplicate detection and data fusion (remove and replace duplicates).The survey provides existing text mining techniques to extract relevant features, detect duplicates and to replace the duplicate data to get fine grained knowledge to the user

Global Journal of Computer Science and Technology (GJCST)

A probabilistic method for emerging topic tracking in Microblog stream

Author: Cao J
Gao W
Huang J
Peng M
Wang Hua
Zhang X
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 21/04/2016
Field of study

Crossref

Victoria University Eprints Repository

Air Quality Prediction in Smart Cities Using Machine Learning Technologies Based on Sensor Data: A Review

Author: Iskandaryan Ditsuhi
Ramos Francisco
Trilles Sergio
Publication venue: 'MDPI AG'
Publication date: 01/04/2020
Field of study

The influence of machine learning technologies is rapidly increasing and penetrating almost in every field, and air pollution prediction is not being excluded from those fields. This paper covers the revision of the studies related to air pollution prediction using machine learning algorithms based on sensor data in the context of smart cities. Using the most popular databases and executing the corresponding filtration, the most relevant papers were selected. After thorough reviewing those papers, the main features were extracted, which served as a base to link and compare them to each other. As a result, we can conclude that: (1) instead of using simple machine learning techniques, currently, the authors apply advanced and sophisticated techniques, (2) China was the leading country in terms of a case study, (3) Particulate matter with diameter equal to 2.5 micrometers was the main prediction target, (4) in 41% of the publications the authors carried out the prediction for the next day, (5) 66% of the studies used data had an hourly rate, (6) 49% of the papers used open data and since 2016 it had a tendency to increase, and (7) for efficient air quality prediction it is important to consider the external factors such as weather conditions, spatial characteristics, and temporal features

Multidisciplinary Digital Publishing Institute

Repositori Institucional de la Universitat Jaume I

Unsupervised learning on social data

Author: Borutta Felix
Publication venue: Ludwig-Maximilians-Universität München
Publication date: 11/03/2020
Field of study

Digitale Hochschulschriften der LMU