Twitter gender classification using user unstructured information

Author: Batista F.
Carvalho J.
Vicente M.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015

This paper describes an approach to automatically detect the gender of Twitter users, based only on clues provided by their profile information in an unstructured form. A number of features that capture phenomena specific to Twitter users are proposed and evaluated on a dataset of about 242K English-language users. Different supervised and unsupervised approaches are used to assess the performance of the proposed features, including Naive Bayes variants, Logistic Regression, Support Vector Machines, Fuzzy c-Means clustering, and K-means. An unsupervised approach based on Fuzzy c-Means proved to be very suitable for this task, returning the correct gender for about 96% of the users.
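As a rough illustration of the Fuzzy c-Means clustering named in this abstract, the following is a minimal sketch of the textbook algorithm, not the authors' implementation. The function name `fuzzy_c_means`, the fuzzifier default `m=2.0`, the fixed iteration count, and the toy 2-D points are all assumptions made for the example; the paper's actual features are not reproduced here.

```python
import math

def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Textbook fuzzy c-means on a list of equal-length numeric tuples.

    `centers` is a caller-supplied initial guess and `m` > 1 is the fuzzifier.
    Returns (centers, memberships); memberships[j][i] is the degree to which
    point j belongs to cluster i, as computed in the last iteration.
    """
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iterations):
        # Membership update: the closer a center, the larger its share.
        memberships = []
        for p in points:
            dists = [math.dist(p, c) for c in centers]
            if any(d == 0.0 for d in dists):
                # Point sits exactly on a center: share membership among exact matches.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                memberships.append([h / sum(hits) for h in hits])
            else:
                memberships.append([
                    1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
                    for d_i in dists
                ])
        # Center update: mean of the points weighted by membership ** m.
        new_centers = []
        for i, old_center in enumerate(centers):
            weights = [row[i] ** m for row in memberships]
            total = sum(weights)
            if total == 0.0:
                new_centers.append(old_center)  # no point supports this cluster
                continue
            new_centers.append(tuple(
                sum(w * p[dim] for w, p in zip(weights, points)) / total
                for dim in range(len(points[0]))
            ))
        centers = new_centers
    return centers, memberships

# Toy data (hypothetical): two well-separated groups of 2-D points.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centers, memberships = fuzzy_c_means(points, centers=[(0.0, 0.0), (1.0, 1.0)])
# Harden the soft memberships into one label per point.
labels = [max(range(len(row)), key=lambda i: row[i]) for row in memberships]
```

Unlike K-means, each point keeps a graded membership in every cluster, and a hard label is taken only at the end by choosing the largest membership.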

Crossref

Repositório Institucional do ISCTE-IUL

A Comparative Study of Machine Learning Models for Tabular Data Through Challenge of Monitoring Parkinson's Disease Progression Using Voice Recordings

Author: Arabnia Hamid Reza
Giuntini Amy
Iman Mohammadreza
Rasheed Khaled
Publication date: 27/05/2020

People with Parkinson's disease must be regularly monitored by their physician to observe how the disease is progressing and potentially adjust treatment plans to mitigate the symptoms. Monitoring the progression of the disease through a voice recording captured by the patient at home can make the process faster and less stressful. Using a dataset of voice recordings of 42 people with early-stage Parkinson's disease over a time span of 6 months, we applied multiple machine learning techniques to find a correlation between the voice recording and the patient's motor UPDRS score. We approached this problem using both regression and classification techniques. Much of this paper is dedicated to mapping the voice data to motor UPDRS scores using regression techniques in order to obtain a more precise value for unknown instances. Through this comparative study of various machine learning methods, we found that some older machine learning methods, such as trees, outperform cutting-edge deep learning models on numerous tabular datasets.

Comment: Accepted at "HIMS'20 - The 6th Int'l Conf on Health Informatics and Medical Systems"; https://americancse.org/events/csce2020/conferences/hims2

arXiv.org e-Print Archive

Automatic classification of speaker characteristics

Author: Huang Xu
Nguyen Phuoc
Sharma Dharmendra
Tran Dat
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2010

Crossref

Deakin Research Online

University of Canberra Research Repository

A new tool for the evaluation of the rehabilitation outcomes in older persons: a machine learning model to predict functional status 1 year ahead

Author: Cacciafesta Mauro
Dellepiane Umberto
Gueli Nicolò
Renzi Alessia
Renzi Stefania
Verrusio Walter
Zaccone Mariagrazia
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

Purpose: To date, the assessment of disability in older people is obtained using a Comprehensive Geriatric Assessment (CGA). However, it is often difficult to understand which areas of the CGA are most predictive of disability. The aim of this study is to evaluate whether a patient's disability level can be predicted 1 year ahead using machine learning models. Methods: Community-dwelling older people were enrolled in this study. CGA was performed at baseline and at 1-year follow-up. After collecting input/independent variables (i.e., age, gender, schooling followed, body mass index, information on smoking, polypharmacy, functional status, cognitive performance, depression, nutritional status), we built two distinct Support Vector Machine (SVM) models to predict functional status 1 year ahead. To validate the choice of the model, the results achieved with the SVMs were compared with the output produced by simple linear regression models. Results: 218 patients (mean age = 78.01; SD = 7.85; male = 39%) were recruited. The combination of the two SVMs achieves a higher prediction accuracy (exceeding 80% of instances correctly classified vs 67% of instances correctly classified by the combination of the two linear regression models). Furthermore, the SVMs are able to classify all three categories (self-sufficiency, disability risk, and disability), while the linear regression models separate the population into only two groups (self-sufficiency and disability) without identifying the intermediate category (disability risk), which turns out to be the most critical one. Conclusions: The development of such a model can contribute to the early detection of patients at risk of self-sufficiency loss.

Archivio della ricerca- Università di Roma La Sapienza

The comparison study of kernel KC-means and support vector machines for classifying schizophrenia

Author: Hartini Sri
Rustam Zuherman
Publication venue: 'Universitas Ahmad Dahlan'
Publication date: 01/06/2020

Schizophrenia is a mental disorder that affects the mind, feelings, and behavior. Its treatment is usually permanent and quite complicated; therefore, early detection is important. Kernel KC-means and support vector machines are methods known to be good classifiers. This research, therefore, aims to compare kernel KC-means and support vector machines, using data obtained from Northwestern University, which consists of 171 schizophrenia and 221 non-schizophrenia samples. Accuracy, F1-score, and running time were examined using the 10-fold cross-validation method. In the experiments, kernel KC-means with the sixth-order polynomial kernel gives 87.18 percent accuracy and 93.15 percent F1-score with a faster running time than support vector machines. However, with the same kernel, support vector machines perform better, with an accuracy of 88.78 percent and an F1-score of 94.05 percent.
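To make the evaluation protocol in this abstract concrete, here is a minimal sketch of 10-fold splitting together with accuracy and F1-score computation. It is an illustration only, not the authors' code: the helper names `k_fold_indices` and `accuracy_and_f1` are hypothetical, the folds are contiguous and unshuffled for simplicity, and no classifier or real data is included. The sample count 392 is simply 171 + 221 from the abstract.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for k contiguous, near-equal folds."""
    # The first (n_samples % k) folds get one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels, treating `positive` as the target class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1

# 171 + 221 = 392 samples, split into 10 folds.
folds = list(k_fold_indices(392, k=10))

# Tiny hand-made example (hypothetical labels): one positive is missed.
acc, f1 = accuracy_and_f1([1, 1, 0, 0], [1, 0, 0, 0])
```

In practice the sample order would normally be shuffled (and often stratified by class) before splitting, and each model would be trained on `train` and scored on `test` in every fold, with the per-fold scores averaged.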

Journal of Education and Learning (EduLearn)

TELKOMNIKA (Telecommunication Computing Electronics and Control)

UAD Journal Management System

Gender Classification by Information Fusion of Hair and Face

Author: Bao-Liang Lu
Xiao-Chen Lian
Zheng Ji
Publication venue: 'IntechOpen'
Publication date: 01/01/2009

IntechOpen

CiteSeerX

Crossref

Using unstructured profile information for gender classification of Portuguese and English

Author: B Heil
H Halteren van
JC Bezdek
S Cessie Le
S Keerthi
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

This paper reports experiments on automatically detecting the gender of Twitter users, based on unstructured information found on their Twitter profile. A set of previously proposed features is evaluated on two datasets of English and Portuguese users, and their performance is assessed using several supervised and unsupervised approaches, including Naive Bayes variants, Logistic Regression, Support Vector Machines, Fuzzy c-Means clustering, and k-means. Results show that the features perform well in each language separately, but even better results were achieved when combining both languages. Supervised approaches reached 97.9% accuracy, but Fuzzy c-Means also proved suitable for this task, achieving 96.4% accuracy.

Crossref

Repositório Institucional do ISCTE-IUL

Survey of data mining approaches to user modeling for adaptive hypermedia

Author: Chen SY
Frias-Martinez E
Liu X
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2006

The ability of an adaptive hypermedia system to create tailored environments depends mainly on the amount and accuracy of information stored in each user model. Some of the difficulties that user modeling faces are the amount of data available to create user models, the adequacy of the data, the noise within that data, and the necessity of capturing the imprecise nature of human behavior. Data mining and machine learning techniques have the ability to handle large amounts of data and to process uncertainty. These characteristics make these techniques suitable for automatic generation of user models that simulate human decision making. This paper surveys different data mining techniques that can be used to efficiently and accurately capture user behavior. The paper also presents guidelines that show which techniques may be used more efficiently according to the task implemented by the application.

CiteSeerX

Crossref

Brunel University Research Archive