4,422 research outputs found
What Works Better? A Study of Classifying Requirements
Classifying requirements into functional requirements (FR) and non-functional
ones (NFR) is an important task in requirements engineering. However, automated
classification of requirements written in natural language is not
straightforward, due to the variability of natural language and the absence of
a controlled vocabulary. This paper investigates how automated classification
of requirements into FR and NFR can be improved and how well several machine
learning approaches work in this context. We contribute an approach for
preprocessing requirements that standardizes and normalizes requirements before
applying classification algorithms. Further, we report on how well several
existing machine learning methods perform for automated classification of NFRs
into sub-categories such as usability, availability, or performance. Our study
is performed on 625 requirements provided by the OpenScience tera-PROMISE
repository. We found that our preprocessing improved the performance of an
existing classification method. We further found significant differences in the
performance of approaches such as Latent Dirichlet Allocation, Biterm Topic
Modeling, or Naive Bayes for the sub-classification of NFRs.
Comment: 7 pages, the 25th IEEE International Conference on Requirements Engineering (RE'17)
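The abstract above compares several classifiers for requirement text. As a purely illustrative sketch (not the authors' pipeline), a bag-of-words Naive Bayes classifier for FR/NFR labels can be built in a few lines; the toy requirements and labels below are invented for demonstration and are not from the tera-PROMISE dataset:

```python
# Minimal Naive Bayes FR/NFR classifier sketch with Laplace smoothing.
# Toy data only; not the paper's preprocessing or corpus.
from collections import Counter, defaultdict
import math

train = [
    ("The system shall export reports as PDF", "FR"),
    ("The user shall be able to search by keyword", "FR"),
    ("The system shall respond within 2 seconds", "NFR"),
    ("The interface shall be usable by novice users", "NFR"),
]

def tokenize(text):
    return text.lower().split()

# Per-class word frequencies and class priors.
word_counts = defaultdict(Counter)
class_counts = Counter()
for text, label in train:
    class_counts[label] += 1
    word_counts[label].update(tokenize(text))

vocab = {w for c in word_counts.values() for w in c}

def classify(text):
    scores = {}
    for label in class_counts:
        # Log prior plus Laplace-smoothed log likelihoods.
        score = math.log(class_counts[label] / len(train))
        total = sum(word_counts[label].values())
        for w in tokenize(text):
            score += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

print(classify("The system shall respond to queries within 2 seconds"))
```

A real study would, as the abstract notes, standardize and normalize the requirement text before classification; this sketch skips that step.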
A Preliminary Study on Why Second Language Learners Accept Ungrammatical Sentences: Its Theoretical Implications
Why do second language learners sometimes accept ungrammatical sentences in the target language? In the
present study, we focus on Japanese-speaking learners of English as a Foreign Language (EFL) and investigate
whether such a “grammatical illusion” effect would be observed in them and whether the effect could be
dependent on their proficiency. The results of one acceptability judgment questionnaire experiment and of
one preliminary self-paced reading experiment are reported. The results of the questionnaire experiment
showed that the lower-proficiency Japanese EFL learners were more likely to accept ungrammatical sentences
in English compared to the higher-proficiency learners. The results of the self-paced reading experiment
indicated that the reading time difference between ungrammatical sentences and their grammatical
counterparts was significant for one native English speaker but not for two Japanese EFL learners. It is
suggested that the “grammatical illusion” effect (i.e., erroneous acceptance of ungrammatical sentences) in
second language learners is more likely to be observed when their proficiency is lower, and possibly that
second language learners can accept ungrammatical sentences during their real-time processing. We discuss
a new approach to second language acquisition from the perspective of the grammatical illusion
phenomenon.
Automatic acquisition of LFG resources for German - as good as it gets
We present data-driven methods for the acquisition of LFG resources from two German treebanks. We discuss problems specific to semi-free word order languages as well as problems arising from the data structures determined
by the design of the different treebanks. We compare two ways of encoding semi-free word order, as done in the two German treebanks, and argue that the design of the TiGer treebank is more adequate for the acquisition of LFG
resources. Furthermore, we describe an architecture for LFG grammar acquisition for German, based on the two German treebanks, and compare our results with a hand-crafted German LFG grammar.
A MDL-based Model of Gender Knowledge Acquisition
This paper presents an iterative model of knowledge acquisition of gender information associated with word endings in French. Gender knowledge is represented as a set of rules containing exceptions. Our model takes noun-gender pairs as input and constantly maintains a list of rules and exceptions which is both coherent with the input data and minimal with respect to a minimum description length criterion. This model was compared to human data at various ages and showed a good fit. We also compared the kind of rules discovered by the model with rules usually extracted by linguists and found interesting discrepancies.
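The two-part minimum description length idea in this abstract can be sketched concretely: gender knowledge is a set of ending-to-gender rules plus an explicit exception list, and the preferred hypothesis minimizes total description length. The costs and toy lexicon below are invented for illustration and are not the paper's values:

```python
# MDL sketch: description length = cost of the rules + cost of every
# noun-gender pair the rules mispredict (stored as an exception).

# Toy French noun-gender pairs (gender: 'm' or 'f').
lexicon = [
    ("maison", "f"), ("raison", "f"), ("chanson", "f"),
    ("poisson", "m"),            # exception to an "-on is feminine" rule
    ("table", "f"), ("voiture", "f"),
]

RULE_COST = 5        # bits to encode one ending->gender rule (assumed)
EXCEPTION_COST = 4   # bits to encode one listed exception (assumed)

def description_length(rules, lexicon):
    """Total DL = rule cost + cost of every pair a rule mispredicts."""
    exceptions = 0
    for word, gender in lexicon:
        predicted = next((g for end, g in rules if word.endswith(end)), None)
        if predicted != gender:
            exceptions += 1  # must be stored explicitly
    return len(rules) * RULE_COST + exceptions * EXCEPTION_COST

# Compare two hypotheses: no rules (list everything) vs. ending-based rules.
h0 = []
h1 = [("on", "f"), ("e", "f")]
print(description_length(h0, lexicon), description_length(h1, lexicon))
```

Under these assumed costs, the ending-based hypothesis h1 wins: it pays for two rules and one exception (poisson) instead of listing all six nouns, which mirrors the paper's coherent-and-minimal criterion.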
#Bieber + #Blast = #BieberBlast: Early Prediction of Popular Hashtag Compounds
Compounding of natural language units is a very common phenomenon. In this
paper, we show, for the first time, that Twitter hashtags, which can be
considered correlates of such linguistic units, undergo compounding. We
identify reasons for this compounding and propose a prediction model that can
determine with 77.07% accuracy whether a pair of compounding hashtags will
become popular in the near future (i.e., 2 months after compounding). At longer
times T = 6 and 10 months, the accuracies are 77.52% and 79.13% respectively. This
technique has strong implications to trending hashtag recommendation since
newly formed hashtag compounds can be recommended early, even before the
compounding has taken place. Further, humans can predict compounds with an
overall accuracy of only 48.7% (treated as baseline). Notably, while humans can
discriminate the relatively easier cases, the automatic framework is successful
in classifying the relatively harder cases.
Comment: 14 pages, 4 figures, 9 tables, published in CSCW (Computer-Supported Cooperative Work and Social Computing) 2016, in Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW 2016)
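The early-prediction setup described above can be sketched at a very high level: score a candidate compound from signals available before compounding, such as how heavily the component hashtags are used and how often they co-occur. The features, weights, and threshold below are invented for demonstration; the paper's actual model and features are not reproduced here:

```python
# Purely illustrative heuristic for hashtag-compound popularity prediction.
import math

def compound_score(count_a, count_b, cooccurrence):
    """Heuristic: popular components that often co-occur are more likely
    to yield a popular compound (#a + #b -> #ab)."""
    # Log-scale raw counts so one huge hashtag does not dominate.
    return (math.log1p(count_a) + math.log1p(count_b)
            + 2.0 * math.log1p(cooccurrence))

def predict_popular(count_a, count_b, cooccurrence, threshold=20.0):
    return compound_score(count_a, count_b, cooccurrence) >= threshold

# e.g. two heavily used hashtags that are frequently tweeted together
print(predict_popular(50000, 8000, 1200))
```

The actual paper trains a classifier on richer signals; this sketch only illustrates the shape of the decision (pre-compounding features in, popular/not-popular out).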