60 research outputs found
Exploring the Semantic Meaning of Constructs that Lead to Human Decisions
This study examines automated approaches to discovering behavioral knowledge that is encoded as constructs in the social and behavioral sciences. To date, construct relationships have ordinarily been revealed through laborious psychometric methods, but this study shows that it is possible to extract these relationships through automated computational approaches. By building on text similarity measures from prior literature, we are able to predict construct relationships from construct names, definitions, and items. The predicted relationships were woven into an interlock system to reveal interplays between constructs, including ones that have not yet been studied together. The construct interlock can be seen as a theory map for understanding human decision-making. Two use cases demonstrate the efficacy of the proposed measures: measuring the root constructs in UTAUT and visualizing the network around the construct perceived usefulness. The encouraging results show that the proposed measures could dramatically expedite theory development and, with it, the progression of the human sciences.
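The abstract does not specify which text similarity measures are used, so the sketch below only illustrates the general idea: score construct relatedness with TF-IDF cosine similarity over each construct's concatenated name, definition, and items. The constructs and items shown are illustrative stand-ins, not the paper's data.

```python
# Minimal sketch: predicting construct relatedness from construct text.
# TF-IDF cosine similarity is used here as a stand-in for the paper's
# (unspecified) text similarity measures.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical constructs: name -> concatenated definition and items.
constructs = {
    "perceived usefulness": (
        "The degree to which a person believes that using a system "
        "would enhance his or her job performance. "
        "Using the system improves my performance."
    ),
    "performance expectancy": (
        "The degree to which an individual believes that using the "
        "system will help him or her attain gains in job performance. "
        "I find the system useful in my job."
    ),
}

names = list(constructs)
matrix = TfidfVectorizer(stop_words="english").fit_transform(constructs.values())
scores = cosine_similarity(matrix)

for i, a in enumerate(names):
    for j, b in enumerate(names):
        if i < j:
            print(f"{a} <-> {b}: {scores[i, j]:.2f}")
```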
A Hybrid Question Answering System based on Ontology and Topic Modeling
A Question Answering (QA) system is an application that provides accurate answers in response to natural language questions. However, some QA systems have weaknesses, especially those built on a knowledge-based approach, which requires various triple patterns to be pre-defined in order to handle different question types. The goal of this paper is to propose an automated QA system using a hybrid approach, a combination of the knowledge-based and text-based approaches. Our approach requires only two SPARQL queries to retrieve candidate answers from the ontology, without defining any question patterns, and then uses a topic model to find the most related candidate answers. We also investigate and evaluate different language models (unigram and bigram). Our results show that the proposed QA system performs beyond the random baseline and solves up to 44 out of 80 questions with a Mean Reciprocal Rank (MRR) of 38.73% using bigram LDA.
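The ranking pipeline itself (two SPARQL queries plus LDA topic similarity) is specific to the paper, but the reported evaluation metric is standard. A minimal sketch of Mean Reciprocal Rank, using illustrative ranked candidate lists rather than the paper's 80 questions:

```python
# Mean Reciprocal Rank (MRR): for each question, take the reciprocal of the
# rank at which the first correct answer appears in the ranked candidate list
# (0 if absent), then average over all questions.
def mean_reciprocal_rank(ranked_answers, gold_answers):
    total = 0.0
    for candidates, gold in zip(ranked_answers, gold_answers):
        for rank, answer in enumerate(candidates, start=1):
            if answer in gold:
                total += 1.0 / rank
                break
    return total / len(ranked_answers)

# Illustrative data, not from the paper.
ranked = [["Paris", "Lyon"], ["Berlin", "Munich"], ["Rome"]]
gold = [{"Paris"}, {"Munich"}, {"Madrid"}]
print(f"MRR = {mean_reciprocal_rank(ranked, gold):.2%}")  # (1 + 0.5 + 0) / 3 = 50.00%
```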
An empirical study on CO2 emissions in ASEAN countries
Joint Distance and Information Content Word Similarity Measure
Measuring semantic similarity between words is very important to many applications in information retrieval and natural language processing. In this paper, we observe that word similarity metrics suffer from the drawback of assigning equal similarity to two word pairs that have the same path length and depth values in WordNet. Likewise, information content methods, which depend on word probabilities from a corpus, exhibit the same drawback. This paper proposes a new hybrid semantic similarity measure that overcomes these drawbacks by exploiting the advantages of the Li and Lin methods. On a benchmark set of human judgments, the Miller-Charles and Rubenstein-Goodenough data sets, the proposed approach outperforms existing distance-based and information content-based methods.
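The Li measure combines WordNet shortest path length and subsumer depth, while the Lin measure uses the information content of the least common subsumer. The abstract does not state how the two are fused, so the weighted average below is only an illustrative assumption, with hypothetical WordNet statistics as inputs.

```python
# Sketch of the two components the hybrid measure builds on.
import math

def li_similarity(path_length, depth, alpha=0.2, beta=0.6):
    # Li et al.: sim = exp(-alpha * l) * tanh(beta * h),
    # where l is the shortest path length and h the subsumer depth.
    return math.exp(-alpha * path_length) * math.tanh(beta * depth)

def lin_similarity(ic_lcs, ic_word1, ic_word2):
    # Lin: sim = 2 * IC(lcs) / (IC(w1) + IC(w2)).
    return 2.0 * ic_lcs / (ic_word1 + ic_word2)

def hybrid_similarity(path_length, depth, ic_lcs, ic_w1, ic_w2, weight=0.5):
    # Illustrative fusion only; the paper's actual combination may differ.
    return weight * li_similarity(path_length, depth) + \
        (1 - weight) * lin_similarity(ic_lcs, ic_w1, ic_w2)

# Hypothetical WordNet statistics for a word pair.
print(round(hybrid_similarity(path_length=2, depth=7,
                              ic_lcs=6.2, ic_w1=7.1, ic_w2=7.5), 3))
```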
An empirical study of feature selection for text categorization based on term weightage
This paper proposes a local feature selection (FS) measure, Categorical Descriptor Term (CTD), for text categorization. It is derived from the classic term weighting scheme TF-IDF. The method explicitly chooses a feature set for each category by selecting terms only from the relevant category. Although past literature has suggested that using features from irrelevant categories can improve text categorization, we believe that incorporating only relevant features can be highly effective. An experimental comparison is carried out between CTD and five well-known feature selection measures: Information Gain, Chi-Square, Correlation Coefficient, Odds Ratio, and GSS Coefficient. The results show that our proposed method performs comparably to the other FS measures, especially on collections with highly overlapping topics.
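The abstract does not give the CTD formula itself; the sketch below only shows the general flavor of local, per-category feature selection, assuming terms are scored by TF-IDF computed over the relevant category's documents alone (function and data names are illustrative).

```python
# Local (per-category) feature selection in the spirit of CTD: score terms
# using only documents of the relevant category and keep the top-k terms
# as that category's feature set.
from sklearn.feature_extraction.text import TfidfVectorizer
import numpy as np

def select_category_features(docs, labels, category, k=10):
    # Restrict to documents belonging to the relevant category only.
    relevant = [d for d, y in zip(docs, labels) if y == category]
    vec = TfidfVectorizer(stop_words="english")
    tfidf = vec.fit_transform(relevant)
    # Rank terms by their summed TF-IDF weight within the category.
    scores = np.asarray(tfidf.sum(axis=0)).ravel()
    terms = np.array(vec.get_feature_names_out())
    return terms[np.argsort(scores)[::-1][:k]].tolist()

docs = ["the striker scored a late goal", "interest rates rose again",
        "the keeper saved the penalty", "the central bank cut rates"]
labels = ["sport", "finance", "sport", "finance"]
print(select_category_features(docs, labels, "sport", k=3))
```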
A Framework to Predict Software “Quality in Use” from Software Reviews
Software reviews have been verified to be a good source of users’ experience. Software “quality in use” concerns meeting users’ needs. Current software quality models, such as McCall and Boehm, are built to support the software development process rather than users’ perspectives. In this paper, opinion mining is used to extract and summarize software “quality in use” from software reviews. A framework to detect software “quality in use” as defined by the ISO/IEC 25010 standard is presented here. The framework employs opinion-feature double propagation to expand predefined lists of software “quality in use” features into domain-specific features. Clustering is used to learn groups of software-feature “quality in use” characteristics. Preliminary extraction of software features shows promising results in this direction.
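As a rough sketch of the clustering step, the snippet below groups illustrative opinion-bearing feature phrases with k-means over TF-IDF vectors; the paper's actual features, vectorization, and cluster structure are not described in the abstract, so everything here is an assumption.

```python
# Group extracted software-feature phrases (e.g., from double propagation)
# into characteristic clusters. Phrases and cluster count are illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

feature_phrases = [
    "crashes on startup", "freezes constantly", "unexpected errors",
    "fast and responsive", "quick to load", "saves me time",
    "pleasant interface", "enjoyable to use", "love the design",
]

vectors = TfidfVectorizer().fit_transform(feature_phrases)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(vectors)

for cluster in range(3):
    group = [p for p, c in zip(feature_phrases, labels) if c == cluster]
    print(f"cluster {cluster}: {group}")
```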
Assessing Malaysian University English Test (MUET) Essay on Language and Semantic Features Using Intelligent Essay Grader (IEG)
Automated Essay Scoring (AES) refers to Artificial Intelligence (AI) applications with the “intelligence” to assess and score essays. There are several well-known commercial AES systems adopted in Western countries, as well as many research works investigating automated essay scoring. However, most of these products and research works are not related to the Malaysian English test context. The AES products tend to score essays based on the scoring rubrics of a particular English test context (e.g., TOEFL, GMAT) by employing proprietary scoring algorithms that are not accessible to users. In Malaysia, research and development of AES is scarce. This paper formulates a Malaysia-based AES, the Intelligent Essay Grader (IEG), for the Malaysian English test environment using our collection of two Malaysian University English Test (MUET) essay datasets. We propose an essay scoring rubric based on language and semantic features. We analyze the correlation of the proposed language and semantic features with the essay grade using the Pearson Correlation Coefficient, and we construct an essay scoring model to predict the essay grades. In our results, we found that language features such as vocabulary count and advanced part of speech were highly correlated with the essay grades, and that language features showed a greater influence on essay grades than semantic features. From our prediction model, we observed that the model yielded better accuracy when built on the selected highly correlated essay features, followed by the language features.
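The correlation analysis described here is standard: compute the Pearson coefficient between each candidate essay feature and the essay grade. A minimal sketch with illustrative (non-MUET) numbers:

```python
# Pearson correlation between candidate essay features and grades.
import numpy as np
from scipy.stats import pearsonr

grades = np.array([3, 4, 6, 5, 2, 6, 4, 5])  # illustrative band scores
features = {
    "vocabulary_count": np.array([180, 220, 340, 300, 150, 360, 240, 290]),
    "advanced_pos_count": np.array([12, 18, 31, 27, 9, 33, 20, 25]),
}

for name, values in features.items():
    r, p = pearsonr(values, grades)
    print(f"{name}: r = {r:.2f}, p = {p:.3f}")
```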
- …