    SciRecSys: A Recommendation System for Scientific Publication by Discovering Keyword Relationships

    In this work, we propose a new approach, based on a Markov Chain model, for discovering various relationships among keywords in scientific publications. This is an important problem, since keywords are the basic elements for representing abstract objects such as documents, user profiles, and topics. Our model is effective because it combines four important factors in scientific publications: content, publicity, impact, and randomness. In particular, we present a recommendation system, called SciRecSys, that helps users efficiently find relevant articles.
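
    As a rough illustration of the Markov-chain idea (not the authors' four-factor model, whose details the abstract does not give), the sketch below estimates transition probabilities between keywords purely from within-document co-occurrence counts:

```python
from collections import defaultdict

def build_transition_matrix(docs):
    """Row-normalized transition probabilities between keywords that
    co-occur in the same document. Co-occurrence only: the paper's
    model also weighs publicity, impact, and randomness."""
    counts = defaultdict(lambda: defaultdict(float))
    for keywords in docs:
        for a in keywords:
            for b in keywords:
                if a != b:
                    counts[a][b] += 1.0
    matrix = {}
    for a, row in counts.items():
        total = sum(row.values())
        matrix[a] = {b: c / total for b, c in row.items()}
    return matrix

docs = [["markov", "chain", "recommendation"],
        ["recommendation", "keyword"],
        ["markov", "keyword"]]
P = build_transition_matrix(docs)
print(P["markov"])  # each of 'chain', 'recommendation', 'keyword' gets 1/3
```

    Random-walk scores over such a matrix could then rank keywords, and the articles they index, relative to a query keyword.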

    Finding Support Documents with a Logistic Regression Approach

    Entity retrieval finds the relevant results for a user's information need at a finer-grained unit called the "entity". To retrieve such entities, one usually first locates a small set of support documents that contain answer entities, and then detects the answer entities within this set. In the literature, support documents have been viewed as relevant documents, and finding them has been treated as a conventional document retrieval problem. In this paper, we argue that finding support documents and finding relevant documents, although they sound similar, differ in important ways. We then propose a logistic regression approach to finding support documents. Our experimental results show that the logistic regression method performs significantly better than a baseline system that treats support document finding as a conventional document retrieval problem.
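
    A minimal sketch of the general approach, assuming a few illustrative per-document features (the abstract does not spell out the paper's actual feature set):

```python
# Rank candidate documents by the probability that they contain an
# answer entity, using logistic regression over illustrative features.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical features per document:
# [retrieval_score, num_candidate_entities, query_term_overlap]
X_train = np.array([[2.1, 5, 0.8],
                    [0.4, 0, 0.1],
                    [1.7, 3, 0.6],
                    [0.2, 1, 0.0]])
y_train = np.array([1, 0, 1, 0])  # 1 = document contains an answer entity

model = LogisticRegression().fit(X_train, y_train)

X_new = np.array([[1.9, 4, 0.7]])
print(f"P(support document) = {model.predict_proba(X_new)[0, 1]:.2f}")
```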

    Improving Sentiment Analysis of Short Informal Indonesian Product Reviews using Synonym Based Feature Expansion

    Sentiment analysis of short informal texts such as product reviews is especially challenging: short texts are sparse, noisy, and lack context. Given these difficulties, traditional text classification methods may not be suitable for analyzing the sentiment of short texts. A common approach to overcoming these problems is to enrich the original texts with additional semantics so that they resemble larger documents, after which traditional classification methods can be applied. In this study, we developed an automatic sentiment analysis system for short informal Indonesian texts using Naïve Bayes and synonym-based feature expansion. The system consists of three main stages: preprocessing and normalization, feature expansion, and classification. After preprocessing and normalization, we use Kateglo to find synonyms of every word in the original text and append them to it. Finally, the text is classified using Naïve Bayes. The experiments show that the proposed method improves the performance of sentiment analysis of short informal Indonesian product reviews; the best classification performance with the proposed feature expansion is an accuracy of 98%. The experiments also show that feature expansion yields a larger improvement with small amounts of training data than with large amounts.
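
    A toy sketch of this pipeline, with a small inline synonym dictionary standing in for Kateglo:

```python
# Expand each short review with synonyms of its words, then classify
# with Naive Bayes. The SYNONYMS dict is a stand-in for Kateglo.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

SYNONYMS = {"bagus": ["baik", "mantap"],   # "good"
            "jelek": ["buruk", "parah"]}   # "bad"

def expand(text):
    """Append the synonyms of each word to the original short text."""
    tokens = text.split()
    extra = [syn for t in tokens for syn in SYNONYMS.get(t, [])]
    return " ".join(tokens + extra)

train_texts = ["produk bagus", "produk jelek", "mantap sekali", "parah banget"]
train_labels = ["pos", "neg", "pos", "neg"]

clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit([expand(t) for t in train_texts], train_labels)
print(clf.predict([expand("barang bagus")]))  # ['pos']
```

    The expansion gives the sparse review extra features ("baik", "mantap") that also occur in the training data, which is what makes the short text classifiable.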

    SEWordSim: Software-Specific Word Similarity Database

    Measuring the similarity of words is important for accurately representing and comparing documents, and thus improves the results of many natural language processing (NLP) tasks. The NLP community has proposed various measures based on WordNet, a lexical database that contains relationships between many pairs of words. Recently, a number of techniques have been proposed to address software engineering problems, such as code search and fault localization, that require understanding natural language documents, and a measure of word similarity could improve their results. However, WordNet only contains information about word senses in general-purpose conversation, which often differ from word senses in a software engineering context, and the software-specific word similarity resources that have been developed rely on data sources containing only a limited range of words and word uses. In recent work, we proposed a word similarity resource based on information collected automatically from StackOverflow. We found that the results of this resource are given scores, on a 3-point Likert scale, that are over 50% higher than the results of a resource based on WordNet. In this demo paper, we review our data collection methodology and propose a Java API that makes the resulting word similarity resource useful in practice. The SEWordSim database and related information can be found at http://goo.gl/BVEAs8. A demo video is available at http://goo.gl/dyNwyb.
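
    The abstract describes SEWordSim as a database with a Java API; the sketch below uses Python and a purely hypothetical table layout to show how such a word-similarity store might be queried:

```python
# Query a software-specific word-similarity table. The schema here is
# illustrative only, not SEWordSim's actual one.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE similarity (word1 TEXT, word2 TEXT, score REAL)")
conn.executemany("INSERT INTO similarity VALUES (?, ?, ?)",
                 [("bug", "defect", 0.92), ("bug", "issue", 0.85),
                  ("array", "list", 0.78)])

def most_similar(word, k=2):
    """Return the k words most similar to `word`, highest score first."""
    rows = conn.execute(
        "SELECT word2, score FROM similarity WHERE word1 = ? "
        "ORDER BY score DESC LIMIT ?", (word, k))
    return rows.fetchall()

print(most_similar("bug"))  # [('defect', 0.92), ('issue', 0.85)]
```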

    Perception based User Profiles for Web Personalization

    Personalized web services reduce the burden of information overload by collecting facts that match the needs of the user. An important aspect of personalized web services is the creation of user profiles that contain user information and settings. This article introduces a method called Perception-Based User Profiles (PUP) that builds and updates user profiles based on perception and browsing order. First, user profiles include perceptions and relationships, which helps guarantee that user interests are represented semantically. Second, when calculating the duration of each perception and relationship, the user's browsing order is considered for each site in a session. Third, a cognitive psychometric memory model is used to update the profile's perceptions and relationships at the end of each session, keeping the user profile dynamic. Test results suggest that this strategy works well for building and updating user profiles.
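
    A speculative sketch of a session-end update in this spirit: simple exponential decay stands in for the cognitive psychometric memory model, and a linear position weight stands in for the paper's (unspecified) treatment of browsing order:

```python
# Update a {perception: weight} profile at the end of a session.
DECAY = 0.9  # retention factor: older interests fade each session

def update_profile(profile, session):
    """session lists the perceptions of visited sites in browsing order."""
    for perception in profile:
        profile[perception] *= DECAY  # forget a little of the old profile
    for order, perception in enumerate(session, start=1):
        # Sites visited later in the session contribute more weight.
        profile[perception] = profile.get(perception, 0.0) + order / len(session)
    return profile

profile = {"sports": 1.0}
print(update_profile(profile, ["news", "sports", "sports"]))
# {'sports': 0.9 + 2/3 + 3/3 = 2.566..., 'news': 1/3 = 0.333...}
```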

    An overview of textual semantic similarity measures based on web intelligence

    Computing the semantic similarity between terms (or short text expressions) that have the same meaning but are not lexicographically similar is a key challenge in many computer-related fields. The problem is that traditional approaches to semantic similarity measurement are not suitable for all situations; for example, many of them fail to deal with terms not covered by synonym dictionaries, or cannot cope with acronyms, abbreviations, buzzwords, brand names, proper nouns, and so on. In this paper, we present and evaluate a collection of emerging techniques developed to avoid this problem. These techniques use various kinds of web intelligence to determine the degree of similarity between text expressions, implementing a variety of paradigms including co-occurrence analysis, text snippet comparison, frequent pattern finding, and search log analysis. The goal is to replace the traditional techniques where necessary.
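
    One well-known instance of the co-occurrence paradigm is the Normalized Google Distance of Cilibrasi and Vitányi, which derives relatedness from search-engine hit counts. A sketch with made-up counts:

```python
# Normalized Google Distance: 0 means the terms always co-occur,
# larger values mean they rarely appear on the same page.
import math

def ngd(hits_x, hits_y, hits_xy, total_pages):
    fx, fy, fxy = map(math.log, (hits_x, hits_y, hits_xy))
    n = math.log(total_pages)
    return (max(fx, fy) - fxy) / (n - min(fx, fy))

# Illustrative counts only, not real search-engine results.
print(ngd(hits_x=9_000_000, hits_y=8_500_000,
          hits_xy=6_000_000, total_pages=25_000_000_000))  # ~0.05 (related)
```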

    How Short is a Piece of String?: the Impact of Text Length and Text Augmentation on Short-text Classification Accuracy

    Recent increases in the use and availability of short messages have created opportunities to harvest vast amounts of information through machine-based classification. However, traditional classification methods have failed to yield accuracies on short texts comparable to classification accuracies on longer texts. Several approaches have previously been employed to extend traditional methods to overcome this problem, including enhancing the original texts by constructing associations with external data supplementation sources. The existing literature does not precisely describe the impact of text length on classification performance. This work quantitatively examines the changes in accuracy of a small selection of classifiers, using a variety of enhancement methods, as text length progressively decreases. The findings, based on ANOVA testing at a 95% confidence level, suggest that the performance of classifiers using simple enhancements decreases with decreasing text length, but that more sophisticated enhancements risk over-supplementing the text, with consequent concept drift and decreasing classification performance as text length increases.
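
    A minimal sketch of this kind of experiment: progressively truncate the texts and record how classification accuracy changes (the dataset and classifier below are placeholders, not the paper's setup):

```python
# Measure accuracy of a bag-of-words classifier as texts get shorter.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

cats = ["sci.med", "sci.space"]
train = fetch_20newsgroups(subset="train", categories=cats)
test = fetch_20newsgroups(subset="test", categories=cats)

for max_words in (200, 50, 10):  # progressively shorter texts
    def truncate(texts, n=max_words):
        return [" ".join(t.split()[:n]) for t in texts]
    clf = make_pipeline(TfidfVectorizer(), MultinomialNB())
    clf.fit(truncate(train.data), train.target)
    acc = accuracy_score(test.target, clf.predict(truncate(test.data)))
    print(f"max {max_words:>3} words: accuracy = {acc:.3f}")
```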