An Approach for Mining Top-k High Utility Item Sets (HUI)
High utility itemset (HUI) mining extracts itemsets that provide high benefit to the consumer; it is a significant domain in data mining and is useful for several real-time applications. Although modern HUI mining algorithms can identify itemsets that meet a minimum utility threshold, fixing that threshold value is not a simple task and is often intricate for consumers: when the minimum utility value is set low, a massive number of itemsets may be generated, and when it is set high, only a small number of itemsets may be found. To avoid these issues, top-k HUI mining, where k represents the number of HUIs to be identified, has been proposed. In this manuscript, the authors propose an algorithm called top-k exact utility (TKEU), which works without computing and comparing transaction-weighted utilisation (TWU) values and instead considers the individual item utility values to derive the top-k HUIs. The proposed algorithm pre-processes the datasets to reduce system memory usage and to provide optimal outcomes for condensed datasets.
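The exact-utility idea the abstract describes can be illustrated with a tiny brute-force sketch. This is not the TKEU algorithm itself (which adds pre-processing and pruning); the toy transaction database and item names are hypothetical. It ranks itemsets by their exact summed utility, with no TWU upper bound involved:

```python
from itertools import combinations

# Toy transaction database: each transaction maps item -> utility
# (e.g., quantity * unit profit). Values are illustrative.
transactions = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 3, "c": 6},
    {"b": 4, "c": 2, "d": 8},
    {"a": 2, "b": 1, "d": 3},
]

def itemset_utility(itemset, db):
    """Exact utility of `itemset`: sum its items' utilities over every
    transaction that contains all of them (no TWU estimate involved)."""
    total = 0
    for t in db:
        if all(i in t for i in itemset):
            total += sum(t[i] for i in itemset)
    return total

def top_k_hui(db, k):
    """Enumerate every itemset by brute force and return the k itemsets
    with the highest exact utility, so no minimum-utility threshold is
    needed from the user."""
    items = sorted({i for t in db for i in t})
    results = []
    for r in range(1, len(items) + 1):
        for combo in combinations(items, r):
            results.append((frozenset(combo), itemset_utility(combo, db)))
    results.sort(key=lambda x: -x[1])
    return results[:k]

for itemset, utility in top_k_hui(transactions, 3):
    print(sorted(itemset), utility)
```

The appeal of the top-k formulation is visible here: the caller supplies only k, and the miner ranks by exact utility, sidestepping the threshold-tuning problem the abstract describes. A practical miner replaces the exponential enumeration with pruning.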
EmoTxt: A Toolkit for Emotion Recognition from Text
We present EmoTxt, a toolkit for emotion recognition from text, trained and
tested on a gold standard of about 9K questions, answers, and comments from
online interactions. We provide empirical evidence of the performance of
EmoTxt. To the best of our knowledge, EmoTxt is the first open-source toolkit
supporting both emotion recognition from text and training of custom emotion
classification models.

Comment: In Proc. 7th Affective Computing and Intelligent Interaction (ACII'17), San Antonio, TX, USA, Oct. 23-26, 2017, p. 79-80, ISBN: 978-1-5386-0563-
Logistic Knowledge Tracing: A Constrained Framework for Learner Modeling
Adaptive learning technology solutions often use a learner model to trace
learning and make pedagogical decisions. The present research introduces a
formalized methodology for specifying learner models, Logistic Knowledge
Tracing (LKT), that consolidates many extant learner modeling methods. The
strength of LKT is the specification of a symbolic notation system for
alternative logistic regression models that is powerful enough to specify many
extant models in the literature and many new models. To demonstrate the
generality of LKT, we fit 12 models, some variants of well-known models and
some newly devised, to 6 learning technology datasets. The results indicated
that no single learner model was best in all cases, further justifying a broad
approach that considers multiple learner model features and the learning
context. The models presented here avoid student-level fixed parameters to
increase generalizability, and we introduce features that stand in for these
student intercepts. We argue that to be maximally applicable, a learner model
needs to adapt to student differences rather than be pre-parameterized with
each student's ability level.
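The family of logistic models that LKT consolidates can be sketched with a Performance-Factors-style instance: the log-odds of a correct response are a linear function of prior practice counts on a skill. The coefficients below are illustrative, not fitted, and this is one simple member of the family rather than the LKT framework itself:

```python
import math

def sigmoid(x):
    """Logistic link: maps log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def predict_correct(successes, failures, beta0=-1.0, beta_s=0.4, beta_f=0.1):
    """Performance-Factors-style logistic learner model: the log-odds of
    a correct response grow with prior successes and (more slowly) with
    prior failures on the skill. Coefficients are illustrative only."""
    return sigmoid(beta0 + beta_s * successes + beta_f * failures)

# Predicted probability of success rises as practice accumulates.
for s, f in [(0, 0), (1, 1), (3, 2), (6, 2)]:
    print(f"successes={s} failures={f} -> P(correct)={predict_correct(s, f):.3f}")
```

Note the model carries no per-student intercept: everything is driven by observable practice-count features, matching the abstract's point about avoiding student-level fixed parameters for generalizability.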
Is "Better Data" Better than "Better Data Miners"? (On the Benefits of Tuning SMOTE for Defect Prediction)
We report and fix an important systematic error in prior studies that ranked
classifiers for software analytics. Those studies did not (a) assess
classifiers on multiple criteria and did not (b) study how variations in the
data affect the results. Hence, this paper applies (a) multi-criteria tests
while (b) fixing the weaker regions of the training data (using SMOTUNED, a
self-tuning version of SMOTE). This approach leads to dramatic improvements in
software defect prediction. When applied in a 5*5 cross-validation study for
3,681 JAVA classes (containing over a million lines of code) from open source
systems, SMOTUNED increased AUC and recall by 60% and 20%, respectively. These
improvements are independent of the classifier used to predict quality. The
same pattern of improvement was observed when SMOTE and SMOTUNED were compared
against the most recent class-imbalance technique. In conclusion, for software
analytics tasks like defect prediction, (1) data pre-processing can be more
important than classifier choice, (2) ranking studies are incomplete without
such pre-processing, and (3) SMOTUNED is a promising candidate for
pre-processing.

Comment: 10 pages + 2 references. Accepted to International Conference on
Software Engineering (ICSE), 201
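The core SMOTE step that SMOTUNED tunes can be sketched in a few lines: pick a minority-class sample, find one of its k nearest minority neighbours, and interpolate a synthetic point between them. The data and parameter values are hypothetical, and the self-tuning loop that distinguishes SMOTUNED is omitted:

```python
import random

def smote(minority, n_new, k=2, seed=0):
    """Minimal SMOTE sketch: each synthetic point is a random convex
    combination of a minority sample and one of its k nearest minority
    neighbours. (SMOTUNED, per the paper, tunes parameters like k
    automatically; that tuning loop is not shown here.)"""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours by squared Euclidean distance,
        # excluding x itself.
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic

# Hypothetical 2-D minority-class samples (e.g., defect-prone modules).
minority = [(1.0, 1.0), (1.2, 0.8), (2.0, 1.5), (1.8, 1.1)]
print(smote(minority, n_new=4))
```

Because every synthetic point lies on a segment between two real minority samples, oversampling fills in the weaker regions of the minority class rather than duplicating points, which is the "better data" effect the abstract argues for.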