
    What Works Better? A Study of Classifying Requirements

    Classifying requirements into functional requirements (FR) and non-functional requirements (NFR) is an important task in requirements engineering. However, automated classification of requirements written in natural language is not straightforward, due to the variability of natural language and the absence of a controlled vocabulary. This paper investigates how automated classification of requirements into FR and NFR can be improved, and how well several machine learning approaches work in this context. We contribute an approach for preprocessing requirements that standardizes and normalizes them before classification algorithms are applied. Further, we report on how well several existing machine learning methods perform for the automated classification of NFRs into sub-categories such as usability, availability, or performance. Our study is performed on 625 requirements provided by the OpenScience tera-PROMISE repository. We found that our preprocessing improved the performance of an existing classification method. We further found significant differences in the performance of approaches such as Latent Dirichlet Allocation, Biterm Topic Modeling, and Naive Bayes for the sub-classification of NFRs.
    Comment: 7 pages, the 25th IEEE International Conference on Requirements Engineering (RE'17)
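The Naive Bayes approach the abstract names can be sketched as a small multinomial classifier. This is a minimal sketch only: the four toy requirements and the simple tokenizer below are illustrative stand-ins for the tera-PROMISE data and the paper's preprocessing step, not the authors' pipeline.

```python
# Minimal multinomial Naive Bayes sketch for FR/NFR classification.
from collections import Counter, defaultdict
import math

def tokenize(text):
    return [w.strip(".,").lower() for w in text.split()]

def train(docs):
    """docs: list of (text, label). Returns class priors, word counts, vocab."""
    priors, counts, vocab = Counter(), defaultdict(Counter), set()
    for text, label in docs:
        priors[label] += 1
        for w in tokenize(text):
            counts[label][w] += 1
            vocab.add(w)
    return priors, counts, vocab

def predict(text, priors, counts, vocab):
    total = sum(priors.values())
    best, best_lp = None, -math.inf
    for label in priors:
        lp = math.log(priors[label] / total)
        n = sum(counts[label].values())
        for w in tokenize(text):
            # Laplace smoothing over the joint vocabulary
            lp += math.log((counts[label][w] + 1) / (n + len(vocab)))
        if lp > best_lp:
            best, best_lp = label, lp
    return best

docs = [
    ("The system shall allow users to create an account", "FR"),
    ("The system shall export reports as PDF", "FR"),
    ("The interface must be easy to use for novice users", "NFR"),
    ("The system must respond to queries within two seconds", "NFR"),
]
model = train(docs)
print(predict("The system shall send a confirmation email", *model))
```

With more training data, the same smoothed log-probability scoring is what off-the-shelf implementations such as scikit-learn's MultinomialNB compute.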

    A Preliminary Study on Why Second Language Learners Accept Ungrammatical Sentences: Its Theoretical Implications

    Why do second language learners sometimes accept ungrammatical sentences in the target language? In the present study, we focus on Japanese-speaking learners of English as a Foreign Language (EFL) and investigate whether such a “grammatical illusion” effect would be observed in them, and whether the effect could depend on their proficiency. We report the results of one acceptability judgment questionnaire experiment and one preliminary self-paced reading experiment. The questionnaire experiment showed that lower-proficiency Japanese EFL learners were more likely to accept ungrammatical sentences in English than higher-proficiency learners. The self-paced reading experiment indicated that the reading time difference between ungrammatical sentences and their grammatical counterparts was significant for one native English speaker but not for two Japanese EFL learners. This suggests that the “grammatical illusion” effect (i.e., erroneous acceptance of ungrammatical sentences) in second language learners is more likely to be observed when their proficiency is lower, and possibly that second language learners can accept ungrammatical sentences during real-time processing. We discuss a new approach to second language acquisition from the perspective of the grammatical illusion phenomenon.

    Automatic acquisition of LFG resources for German - as good as it gets

    We present data-driven methods for the acquisition of LFG resources from two German treebanks. We discuss problems specific to semi-free word order languages as well as problems arising from the data structures determined by the design of the different treebanks. We compare two ways of encoding semi-free word order, as done in the two German treebanks, and argue that the design of the TiGer treebank is more adequate for the acquisition of LFG resources. Furthermore, we describe an architecture for LFG grammar acquisition for German, based on the two German treebanks, and compare our results with a hand-crafted German LFG grammar.

    A MDL-based Model of Gender Knowledge Acquisition

    This paper presents an iterative model of knowledge acquisition of gender information associated with word endings in French. Gender knowledge is represented as a set of rules containing exceptions. Our model takes noun-gender pairs as input and constantly maintains a list of rules and exceptions which is both coherent with the input data and minimal with respect to a minimum description length criterion. This model was compared to human data at various ages and showed a good fit. We also compared the kind of rules discovered by the model with rules usually extracted by linguists and found interesting discrepancies.
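The minimum description length criterion the abstract relies on can be illustrated with a two-part code: the cost of stating the ending-to-gender rules plus the cost of listing the exceptions those rules leave uncovered. The fixed-bits-per-symbol cost model and the toy noun list below are simplifying assumptions for illustration, not the paper's exact coding scheme.

```python
# Two-part MDL sketch: rules + exceptions vs. listing every pair.
import math

def code_len(n_symbols, alphabet_size):
    # Fixed-length code: each symbol costs log2(alphabet_size) bits.
    return n_symbols * math.log2(alphabet_size)

def mdl(pairs, rules, alphabet_size=27):
    """pairs: list of (noun, gender); rules: dict ending -> gender."""
    # Part 1: encode the rules themselves (ending plus one gender symbol).
    rule_bits = sum(code_len(len(end) + 1, alphabet_size) for end in rules)
    # Part 2: encode the data given the rules; uncovered pairs are
    # spelled out in full as exceptions.
    data_bits = 0.0
    for noun, gender in pairs:
        covered = any(noun.endswith(end) and g == gender
                      for end, g in rules.items())
        if not covered:
            data_bits += code_len(len(noun) + 1, alphabet_size)
    return rule_bits + data_bits

pairs = [("voiture", "f"), ("culture", "f"), ("nature", "f"), ("mur", "m")]
with_rule = mdl(pairs, {"ure": "f"})   # one rule, one exception ("mur")
no_rules = mdl(pairs, {})              # every pair listed explicitly
print(with_rule, no_rules)
```

Under this criterion the "-ure is feminine" rule with "mur" as an exception is cheaper than memorizing all four pairs, which is the kind of trade-off the model searches over as pairs arrive.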

    #Bieber + #Blast = #BieberBlast: Early Prediction of Popular Hashtag Compounds

    Compounding of natural language units is a very common phenomenon. In this paper, we show, for the first time, that Twitter hashtags, which could be considered correlates of such linguistic units, undergo compounding. We identify reasons for this compounding and propose a prediction model that can identify with 77.07% accuracy whether a pair of compounding hashtags will become popular in the near future (i.e., 2 months after compounding). At longer horizons (T = 6 and 10 months), the accuracies are 77.52% and 79.13%, respectively. This technique has strong implications for trending hashtag recommendation, since newly formed hashtag compounds can be recommended early, even before the compounding has taken place. Further, humans can predict compounds with an overall accuracy of only 48.7% (treated as a baseline). Notably, while humans can discriminate the relatively easier cases, the automatic framework is successful in classifying the relatively harder cases.
    Comment: 14 pages, 4 figures, 9 tables, published in Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW 2016)
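A prediction model of this kind reduces to binary classification over features of the candidate hashtag pair. The sketch below is purely illustrative: the single co-occurrence-ratio feature, the threshold value, and the example counts are assumptions standing in for the paper's richer feature set and trained classifier.

```python
# Illustrative single-feature classifier for hashtag-compound popularity.

def compound_score(count_a, count_b, count_together):
    """Fraction of the rarer hashtag's uses that co-occur with the other."""
    return count_together / min(count_a, count_b)

def predict_popular(count_a, count_b, count_together, threshold=0.3):
    """True if the pair's co-occurrence ratio clears the (assumed) threshold."""
    return compound_score(count_a, count_b, count_together) >= threshold

# e.g. #Bieber used 10,000 times, #Blast 500 times, 200 tweets with both
print(predict_popular(10_000, 500, 200))
```

In the paper's setting such features are computed before compounding occurs, which is what makes early recommendation of compounds like #BieberBlast possible.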