54,957 research outputs found

Blog Analysis with Fuzzy TFIDF

Author: Ho Chi-Shu
Publication venue: SJSU ScholarWorks
Publication date: 01/01/2007

These days blogs are becoming increasingly popular because they allow anyone to share personal diaries, opinions, and comments on the World Wide Web. Many blogs contain valuable information, but extracting it from a large number of blog comments is difficult. The goal is to analyze a large number of blog comments by clustering them into smaller groups according to their similarity, based on keyword relevance. TF-IDF weighting has been used to classify documents by measuring how frequently each keyword appears in a document, but it is not effective at differentiating semantic similarities between words. Applying fuzzy semantics to TF-IDF yields fuzzy TF-IDF, which can rank semantic relevancy. By adopting fuzzy TF-IDF and fuzzy semantics, the Vector Space Model (VSM) is extended to a fuzzy VSM, which can be effective in exploring hidden relationships between blog comments. Fuzzy VSM can therefore cluster a large number of blog comments into a small number of groups based on document similarity and semantic relevancy.
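The limitation of plain TF-IDF mentioned in this abstract can be illustrated with a minimal sketch. It assumes whitespace tokenization, the common weighting tf * log(N / df), and cosine similarity between the resulting vectors; the sample comments and function names are hypothetical, and the fuzzy extension itself is not shown because the abstract does not specify it.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, with weight = tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical blog comments: the first two mean the same thing
# but use different words ("film" vs. "movie").
comments = [
    "great film loved the film",
    "great movie loved the movie",
    "terrible traffic this morning",
]
vecs = tfidf_vectors(comments)
sim_synonyms = cosine(vecs[0], vecs[1])   # low: "film" and "movie" never match
sim_unrelated = cosine(vecs[0], vecs[2])  # zero: no shared terms at all
```

Because "film" and "movie" are treated as unrelated dimensions, the first two comments score only slightly above the unrelated pair. This is the gap that a semantic (for example fuzzy) weighting is meant to close.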

SJSU ScholarWorks

Adaptive kNN using Expected Accuracy for Classification of Geo-Spatial Data

Author: Atzmueller Martin
Becker Martin
Hotho Andreas
Kibanov Mark
Mueller Juergen
Stumme Gerd
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 14/12/2017

The k-Nearest Neighbor (kNN) classification approach is conceptually simple, yet widely applied since it often performs well in practical applications. However, using a global constant k does not always provide an optimal solution, e.g., for datasets with an irregular density distribution of data points. This paper proposes an adaptive kNN classifier where k is chosen dynamically for each instance (point) to be classified, such that the expected accuracy of classification is maximized. We define the expected accuracy as the accuracy of a set of structurally similar observations. An arbitrary similarity function can be used to find these observations. We introduce and evaluate different similarity functions. For the evaluation, we use five different classification tasks based on geo-spatial data. Each classification task consists of (tens of) thousands of items. We demonstrate that the presented expected accuracy measures can be a good estimator for kNN performance, and that the proposed adaptive kNN classifier outperforms common kNN and previously introduced adaptive kNN algorithms. We also show that the range of considered k can be significantly reduced to speed up the algorithm without a negative influence on classification accuracy.
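The idea of choosing k per instance can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm: the "structurally similar observations" are taken to be simply the nearest training points under Euclidean distance, expected accuracy is estimated by leave-one-out kNN on those points, and all function names, parameters, and data are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train, query, k, skip=None):
    """Majority vote among the k nearest training points.

    `train` is a list of (point, label) pairs; `skip` optionally excludes
    one training index (used for leave-one-out evaluation).
    """
    neighbors = sorted(
        (math.dist(x, query), i) for i, (x, _) in enumerate(train) if i != skip
    )[:k]
    votes = Counter(train[i][1] for _, i in neighbors)
    return votes.most_common(1)[0][0]

def adaptive_knn_predict(train, query, candidate_ks=(1, 3, 5), n_similar=5):
    """Pick the k with the best estimated accuracy near `query`, then classify."""
    # Assumption: "similar observations" = the n_similar nearest training points.
    similar = sorted(
        range(len(train)), key=lambda i: math.dist(train[i][0], query)
    )[:n_similar]

    def expected_accuracy(k):
        # Leave-one-out accuracy of plain kNN on the similar observations.
        hits = sum(
            knn_predict(train, train[i][0], k, skip=i) == train[i][1]
            for i in similar
        )
        return hits / len(similar)

    best_k = max(candidate_ks, key=expected_accuracy)
    return knn_predict(train, query, best_k), best_k

# Hypothetical 2-D data with two well-separated classes.
train = [
    ((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"), ((1, 1), "a"),
    ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b"), ((6, 6), "b"),
]
label_a, k_a = adaptive_knn_predict(train, (0.5, 0.5))
label_b, k_b = adaptive_knn_predict(train, (5.5, 5.5))
```

Restricting `candidate_ks` to a short list mirrors the abstract's remark that the range of considered k can be reduced to speed up the search.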

arXiv.org e-Print Archive

Tilburg University Repository

SciRecSys: A Recommendation System for Scientific Publication by Discovering Keyword Relationships

Author: D. Sánchez
F. Fouss
I. Dagan
J. Singthongchai
J.P. Keener
L.T. Kien
P. Lops
S.H. Cha
V. Anh Le
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

In this work, we propose a new approach, based on a Markov Chain model, for discovering various relationships among keywords in scientific publications. This is an important problem since keywords are the basic elements for representing abstract objects such as documents, user profiles, topics, and many other things. Our model is very effective since it combines four important factors in scientific publications: content, publicity, impact and randomness. In particular, a recommendation system (called SciRecSys) is presented to help users efficiently find relevant articles.
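As a rough illustration of how a Markov chain over keywords can surface related terms, the sketch below runs a random walk with restart on a keyword co-occurrence graph. This is a generic technique swapped in for illustration, not the SciRecSys model: it uses only co-occurrence (a stand-in for "content") plus a restart probability (a stand-in for "randomness"), and the keyword lists, function names, and parameters are all hypothetical.

```python
from collections import defaultdict
from itertools import permutations

def keyword_transitions(papers):
    """Build row-normalized transition probabilities from keyword co-occurrence."""
    counts = defaultdict(lambda: defaultdict(float))
    for keywords in papers:
        for a, b in permutations(set(keywords), 2):
            counts[a][b] += 1.0
    return {
        a: {b: c / sum(nbrs.values()) for b, c in nbrs.items()}
        for a, nbrs in counts.items()
    }

def related_keywords(transitions, start, restart=0.15, steps=50):
    """Random walk with restart from `start`; a higher score means more related."""
    score = {k: 0.0 for k in transitions}
    score[start] = 1.0
    for _ in range(steps):
        nxt = {k: 0.0 for k in transitions}
        nxt[start] += restart  # with probability `restart`, jump back to start
        for a, mass in score.items():
            for b, p in transitions[a].items():
                nxt[b] += (1.0 - restart) * mass * p
        score = nxt
    ranked = [(k, s) for k, s in score.items() if k != start]
    return sorted(ranked, key=lambda kv: -kv[1])

# Hypothetical keyword lists, one per paper.
papers = [
    ["tf-idf", "clustering", "blogs"],
    ["tf-idf", "vector space model"],
    ["clustering", "knn"],
    ["knn", "classification"],
]
transitions = keyword_transitions(papers)
ranked = related_keywords(transitions, "tf-idf")
```

In this toy graph, keywords that co-occur directly with the start keyword rank highest, while those reachable only through several intermediate keywords rank lowest.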

arXiv.org e-Print Archive

Crossref

A Combined Representation Learning Approach for Better Job and Skill Recommendation

Author: Almalis Nikolaos D
Guo Xingsheng
Gupta A.
Rendle Steffen
Zhao Meng
Zhou W.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/10/2018

Job recommendation is an important task for the modern recruitment industry. An excellent job recommender system not only recommends a higher-paying job that is maximally aligned with the skill set of the current job, but also suggests a few additional skills that must be acquired to assume the new position. In this work, we created three types of information networks from the historical job data: (i) job transition network, (ii) job-skill network, and (iii) skill co-occurrence network. We provide a representation learning model which can utilize the information from all three networks to jointly learn the representation of the jobs and skills in a shared k-dimensional latent space. In our experiments, we show that by jointly learning the representation for the jobs and skills, our model provides better recommendations for both jobs and skills. Additionally, we show some case studies which validate our claims.

Crossref

IUPUIScholarWorks