79 research outputs found

    Evaluating prediction systems in software project estimation

    This is the pre-print version of the article. Copyright © 2012 Elsevier. Context: Software engineering has a problem in that when we empirically evaluate competing prediction systems we obtain conflicting results. Objective: To reduce the inconsistency amongst validation study results and provide a more formal foundation to interpret results, with a particular focus on continuous prediction systems. Method: A new framework is proposed for evaluating competing prediction systems based upon (1) an unbiased statistic, Standardised Accuracy, (2) testing the result likelihood relative to the baseline technique of random ‘predictions’, that is guessing, and (3) calculation of effect sizes. Results: Previously published empirical evaluations of prediction systems are re-examined and the original conclusions shown to be unsafe. Additionally, even the strongest results are shown to have no more than a medium effect size relative to random guessing. Conclusions: Biased accuracy statistics such as MMRE are deprecated. By contrast, this new empirical validation framework leads to meaningful results. Such steps will assist in performing future meta-analyses and in providing more robust and usable recommendations to practitioners. Martin Shepperd was supported by the UK Engineering and Physical Sciences Research Council (EPSRC) under Grant EP/H050329.
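    As a rough illustration of the framework's ingredients, the sketch below computes a Standardised-Accuracy-style statistic and an effect size against a random-guessing baseline. It is a minimal sketch, assuming the commonly cited definition SA = 1 − MAR_Pi / mean(MAR_P0) and a Glass's-delta-style effect size; the function name, the Monte Carlo estimation of the guessing baseline and the number of runs are illustrative choices, not taken from the paper.

```python
import numpy as np

def standardised_accuracy(actual, predicted, n_runs=1000, rng=None):
    """Standardised Accuracy (SA) and an effect size against random guessing.

    SA = 1 - MAR_Pi / mean(MAR_P0), where MAR_Pi is the mean absolute residual
    of the prediction system and MAR_P0 is the mean absolute residual obtained
    by 'predicting' each case with the actual value of another, randomly
    chosen case (i.e. guessing).
    """
    rng = np.random.default_rng() if rng is None else rng
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    n = len(actual)

    mar_pi = np.mean(np.abs(actual - predicted))

    # Monte Carlo estimate of the random-guessing baseline.
    baseline_mars = []
    for _ in range(n_runs):
        # For each case, guess the actual value of a different, randomly drawn case.
        idx = np.array([rng.choice([j for j in range(n) if j != i]) for i in range(n)])
        baseline_mars.append(np.mean(np.abs(actual - actual[idx])))
    baseline_mars = np.array(baseline_mars)

    sa = 1.0 - mar_pi / baseline_mars.mean()
    # Glass's-delta-style effect size: improvement over guessing, in baseline spreads.
    delta = (baseline_mars.mean() - mar_pi) / baseline_mars.std(ddof=1)
    return sa, delta
```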

    Quantitative Argumentation Debates with Votes for Opinion Polling

    Opinion polls are used in a variety of settings to assess the opinions of a population, but they mostly conceal the reasoning behind these opinions. Argumentation, as understood in AI, can be used to evaluate opinions in dialectical exchanges, transparently articulating the reasoning behind the opinions. We give a method integrating argumentation within opinion polling to empower voters to add new statements that render their opinions in the polls individually rational while at the same time justifying them. We then show how these poll results can be amalgamated to give a collectively rational set of voters in an argumentation framework. Our method relies upon Quantitative Argumentation Debate for Voting (QuAD-V) frameworks, which extend QuAD frameworks (a form of bipolar argumentation frameworks in which arguments have an intrinsic strength) with votes expressing individuals’ opinions on arguments.
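    As an illustration of how strengths can be propagated in a QuAD-style bipolar framework, the sketch below evaluates an acyclic argument graph using a probabilistic-sum aggregation of attackers and supporters followed by a base-score adjustment. This is a minimal sketch of one commonly described QuAD formulation, assuming a tree-shaped graph; the data structures and names are illustrative, and the vote-integration part of QuAD-V is not reproduced here.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Argument:
    name: str
    base_score: float                       # intrinsic strength in [0, 1]
    attackers: List["Argument"] = field(default_factory=list)
    supporters: List["Argument"] = field(default_factory=list)

def _aggregate(strengths: List[float]) -> float:
    """Probabilistic-sum aggregation of child strengths; 0 if there are no children."""
    acc = 0.0
    for s in strengths:
        acc = acc + s - acc * s
    return acc

def strength(arg: Argument) -> float:
    """Recursively evaluate an argument in an acyclic (tree-shaped) graph."""
    va = _aggregate([strength(a) for a in arg.attackers])
    vs = _aggregate([strength(s) for s in arg.supporters])
    v0 = arg.base_score
    if va > vs:    # attack dominates: pull the score towards 0
        return v0 - v0 * (va - vs)
    if vs > va:    # support dominates: pull the score towards 1
        return v0 + (1 - v0) * (vs - va)
    return v0      # balanced attack and support leave the base score unchanged

# Example: a debated statement with one attacker and one supporter (toy scores).
claim = Argument("claim", 0.5,
                 attackers=[Argument("counterexample", 0.8)],
                 supporters=[Argument("evidence", 0.4)])
print(round(strength(claim), 3))  # 0.3 under these assumed scores
```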

    Automatic Concept Extraction in Semantic Summarization Process

    The Semantic Web offers a generic infrastructure for interchange, integration and creative reuse of structured data, which can help to cross some of the boundaries that Web 2.0 is facing. Currently, Web 2.0 offers poor query possibilities apart from searching by keywords or tags. There has been a great deal of interest in the development of semantic-based systems to facilitate knowledge representation and extraction and content integration [1], [2]. A semantic-based approach to retrieving relevant material can be useful for addressing issues such as determining the type or the quality of the information suggested by a personalized environment. In this context, standard keyword search has very limited effectiveness: for example, it cannot filter by the type, level or quality of information. Potentially, one of the biggest application areas of content-based exploration is the personalized searching framework (e.g., [3], [4]). Whereas search engines nowadays provide largely anonymous information, such a framework might highlight or recommend web pages related to key concepts.
    We consider semantic information representation an important step towards efficient, large-scale manipulation and retrieval of information [5], [6], [7]. In the digital library community a flat list of attribute/value pairs is often assumed to be available; in the Semantic Web community, annotations are often assumed to be instances of an ontology. Through ontologies the system can express key entities and relationships describing resources in a formal, machine-processable representation. An ontology-based knowledge representation can be used for content analysis and object recognition, for reasoning processes, and for enabling user-friendly and intelligent multimedia content search and retrieval.
    Text summarization has been an interesting and active research area since the 1960s. The underlying assumption is that a small portion, or a few keywords, of the original long document can represent the whole informatively and/or indicatively, so reading or processing this shorter version saves time and other resources [8]. This is especially true, and urgently needed, given the vast amount of information now available. A concept-based approach to representing dynamic and unstructured information can be useful for determining the key concepts and summarizing the information exchanged within a personalized environment. In this context, a concept is represented by a Wikipedia article; with millions of articles and thousands of contributors, this online repository of knowledge is the largest and fastest-growing encyclopedia in existence. The problem described above can then be divided into three steps (a skeleton of these steps is sketched after this abstract):
    • Mapping each of a series of terms to the most appropriate Wikipedia article (disambiguation).
    • Assigning a score to each identified item on the basis of its importance in the given context.
    • Extracting the n items with the highest score.
    Text summarization can be applied to many fields, from information retrieval to text mining and text display, and it could also be very useful within a personalized searching framework.
    The chapter is organized as follows: the next Section introduces the personalized searching framework as one of the possible application areas of automatic concept extraction systems. Section three describes the summarization process, providing details on system architecture, methodology and tools. Section four provides an overview of document summarization approaches that have been developed recently. Section five summarizes a number of real-world applications which might benefit from word sense disambiguation (WSD). Section six introduces Wikipedia and WordNet as used in our project. Section seven describes the logical structure of the project, its software components and databases. Finally, Section eight offers some concluding considerations.
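    The skeleton referred to above ties the three steps together. It is a minimal sketch: `disambiguate` and `score` are hypothetical stand-ins for the real Wikipedia-based disambiguation and context-importance scoring described in the chapter.

```python
from typing import Callable, Dict, List, Tuple

def extract_key_concepts(
    terms: List[str],
    disambiguate: Callable[[str, List[str]], str],
    score: Callable[[str, List[str]], float],
    n: int = 10,
) -> List[Tuple[str, float]]:
    """Skeleton of the three-step concept extraction process.

    1. Map each term to the most appropriate Wikipedia article title
       (`disambiguate` stands in for the real word sense disambiguation step).
    2. Score each identified article for its importance in the given context.
    3. Return the n highest-scoring concepts.
    """
    # Step 1: term -> article, skipping terms with no acceptable mapping.
    articles: Dict[str, str] = {}
    for term in terms:
        article = disambiguate(term, terms)
        if article:
            articles[term] = article

    # Step 2: score each distinct article in the context of all input terms.
    scored = {a: score(a, terms) for a in set(articles.values())}

    # Step 3: keep the n concepts with the highest scores.
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)[:n]
```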

    Risk information recommendation for engineering workers.

    Within any sufficiently expertise-reliant and work-driven domain there is a requirement to understand the similarities between specific work tasks. Though mechanisms to develop similarity models for these areas do exist, in practice they have been criticised within various domains by experts who feel that the output is not indicative of their viewpoint. In field service provision for telecommunication organisations, it can be particularly challenging to understand task similarity from the perspective of an expert engineer. With that in mind, this paper demonstrates a similarity model built from text recorded by engineers themselves, yielding a metric directly indicative of expert opinion. We evaluate several methods of learning text representations on a classification task developed from engineers' notes. Furthermore, we introduce a means of exploiting the complex and multi-faceted nature of the notes to recommend additional information to support engineers in the field.
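    As a rough illustration of the recommendation step, the sketch below ranks risk-information documents by similarity to an engineer's note. It uses a plain TF-IDF representation with cosine similarity purely as an example; the paper itself compares several text representation methods, and this is not presented as its chosen model.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def recommend_risk_info(engineer_note: str, risk_documents: list[str], top_k: int = 3) -> list[str]:
    """Rank a corpus of risk-information documents by similarity to a new note."""
    vectorizer = TfidfVectorizer(stop_words="english")
    doc_matrix = vectorizer.fit_transform(risk_documents)   # learn vocabulary from the corpus
    note_vector = vectorizer.transform([engineer_note])     # embed the note in the same space
    sims = cosine_similarity(note_vector, doc_matrix).ravel()
    ranked = sims.argsort()[::-1][:top_k]
    return [risk_documents[i] for i in ranked]

# Usage with a toy corpus of risk notices:
docs = ["Working at height near overhead cables", "Manual handling of heavy cabinets",
        "Excavation near buried power lines"]
print(recommend_risk_info("replace pole-mounted cable near power lines", docs, top_k=2))
```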

    GramError: a quality metric for machine generated songs.

    This paper explores whether a simple grammar-based metric can accurately predict human opinion of the quality of machine-generated song lyrics. The proposed metric considers the percentage of words written in natural English and the number of grammatical errors to rate the quality of machine-generated lyrics. We use a state-of-the-art Recurrent Neural Network (RNN) model and adapt it to lyric generation by re-training on the lyrics of 5,000 songs. For our initial user trial, we use a small sample of songs generated by the RNN to calibrate the metric. Songs selected on the basis of this metric are further evaluated using “Turing-like” tests to establish whether there is a correlation between metric score and human judgment. Our results show that there is a strong correlation with human opinion, especially at lower levels of song quality. They also show that 75% of the RNN-generated lyrics passed for human-generated over 30% of the time.
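    The abstract specifies the two ingredients of the metric (the percentage of natural-English words and the number of grammatical errors) but not how they are combined, so the sketch below assumes a simple penalty formula; `count_grammar_errors` is a hypothetical stand-in for a real grammar checker.

```python
import re

def gramerror_score(lyrics: str, english_words: set[str], count_grammar_errors) -> float:
    """Illustrative GramError-style score in [0, 1].

    Combines (a) the fraction of tokens found in an English word list with
    (b) a penalty per grammatical error; the exact weighting used in the
    paper is not given in the abstract, so this formula is an assumption.
    """
    tokens = re.findall(r"[a-zA-Z']+", lyrics.lower())
    if not tokens:
        return 0.0
    english_fraction = sum(t in english_words for t in tokens) / len(tokens)
    errors_per_word = count_grammar_errors(lyrics) / len(tokens)
    return max(0.0, english_fraction - errors_per_word)

# Usage with a toy word list and a trivial stand-in grammar checker:
words = {"the", "night", "falls", "and", "we", "sing"}
print(gramerror_score("The night falls and we sing", words, lambda text: 0))  # 1.0
```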