
    User Story Extraction from Online News with FeatureBased and Maximum Entropy Method for Software Requirements Elicitation

    Software requirements elicitation is the first stage in software requirements engineering. Elicitation is the process of identifying software requirements from various sources such as interviews with resource persons, questionnaires, and document analysis. The user story is easy to adapt as system requirements change. A user story is written in a semi-structured language, because its composition must follow the syntax used as the standard for writing features in agile software development methods. In addition, user stories are easily understood by end users who do not have an information technology background, because they describe system requirements in natural language. A user story has three aspects: the who aspect (actor), the what aspect (activity), and the why aspect (reason). This study proposes extracting user stories consisting of the who and what aspects from online news sites, using feature extraction and maximum entropy as the classification method. The systems analyst can use the current information reported in online news to obtain the required software requirements. The expected result of the extraction method in this research is user stories relevant to the software requirements, to assist systems analysts in generating requirements. The proposed method achieves average precision and recall of 98.21% and 95.16% for the who aspect, 87.14% and 87.50% for the what aspect, and 81.21% and 78.60% for user stories. These results suggest that the proposed method generates user stories relevant to functional software requirements.
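The who/what/why aspects follow the standard agile user story template. A minimal illustrative sketch (not the paper's feature-based/maximum-entropy classifier; the example story is invented) that parses that template into its three aspects:

```python
import re

# Parse the standard agile template "As a <who>, I want <what> so that <why>"
# into its three aspects. The "so that" (why) clause is optional.
STORY_RE = re.compile(
    r"As an? (?P<who>.+?), I want (?P<what>.+?)"
    r"(?: so that (?P<why>.+?))?\.?$",
    re.IGNORECASE,
)

def parse_user_story(text):
    """Return a dict of the who/what/why aspects, or None if not a user story."""
    m = STORY_RE.match(text.strip())
    if not m:
        return None
    # Drop the why aspect when the optional clause is absent.
    return {k: v for k, v in m.groupdict().items() if v is not None}

story = ("As a systems analyst, I want to extract actors from news "
         "so that requirements stay current.")
parsed = parse_user_story(story)
```

The paper's contribution is the reverse direction: classifying free-form news text into the who and what aspects so that stories in this template can be generated.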

    Machine Understandable Contracts with Deep Learning

    This research investigates the automatic translation of contracts into computer-understandable rules through Natural Language Processing. The most challenging aspect, studied throughout this paper, is understanding the meaning of a contract and expressing it in a structured format. This problem can be reduced to the Named Entity Recognition and Rule Extraction tasks, the latter handling the extraction of terms and conditions. These two problems are difficult, but deep learning models can tackle them. We believe this paper is the first work to approach Rule Extraction with deep learning. This method is data-hungry, so the research also introduces data sets for these two tasks. Additionally, it contributes to the literature by introducing Law-Bert, a model based on BERT which is pre-trained on unlabelled contracts. The results obtained on Named Entity Recognition and Rule Extraction show that pre-training on contracts has a positive effect on performance for the downstream tasks.
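As a hedged illustration of the Named Entity Recognition half of the task (not Law-Bert itself), a BERT-style tagger typically emits BIO labels per token, which are then collapsed into entity spans; the party names and PARTY label below are invented examples:

```python
def bio_to_entities(tokens, tags):
    """Collapse per-token BIO tags into (entity_type, text) spans."""
    entities, current, ctype = [], [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):          # begin a new entity
            if current:
                entities.append((ctype, " ".join(current)))
            current, ctype = [tok], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == ctype:
            current.append(tok)           # continue the current entity
        else:                             # "O" tag or inconsistent "I-" tag
            if current:
                entities.append((ctype, " ".join(current)))
            current, ctype = [], None
    if current:                           # flush a trailing entity
        entities.append((ctype, " ".join(current)))
    return entities

tokens = ["Acme", "Corp", "shall", "pay", "Beta", "LLC"]
tags   = ["B-PARTY", "I-PARTY", "O", "O", "B-PARTY", "I-PARTY"]
parties = bio_to_entities(tokens, tags)
```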

    Comprehensive Review of Opinion Summarization

    The abundance of opinions on the web has kindled the study of opinion summarization over the last few years. Researchers have introduced various techniques and paradigms for solving this special task. This survey attempts to systematically investigate the different techniques and approaches used in opinion summarization. We provide a multi-perspective classification of the approaches used and highlight some of their key weaknesses. This survey also covers evaluation techniques and data sets used in studying the opinion summarization problem. Finally, we provide insights into some of the challenges that remain to be addressed, as this will help set the trend for future research in this area.
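One common extractive paradigm in this area can be sketched minimally as follows; the tiny opinion lexicon and the sample reviews are illustrative only, not taken from any system the survey covers:

```python
# Score each review sentence by how many opinion-bearing words it contains,
# then keep the top-k as an extractive summary. A real system would use a
# full sentiment lexicon or a learned model rather than this toy word list.
POSITIVE = {"great", "excellent", "love", "fast"}
NEGATIVE = {"poor", "slow", "broken", "hate"}

def summarize(sentences, k=1):
    def score(s):
        words = {w.strip(".,!?").lower() for w in s.split()}
        return len(words & POSITIVE) + len(words & NEGATIVE)
    # sorted() is stable, so ties keep the original sentence order.
    return sorted(sentences, key=score, reverse=True)[:k]

reviews = [
    "The battery life is great and charging is fast.",
    "It arrived on Tuesday.",
    "The screen is poor.",
]
summary = summarize(reviews, k=1)
```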

    Unsupervised Extraction of Representative Concepts from Scientific Literature

    This paper studies the automated categorization and extraction of scientific concepts from the titles of scientific articles, in order to gain a deeper understanding of their key contributions and to facilitate the construction of a generic academic knowledge base. Towards this goal, we propose an unsupervised, domain-independent, and scalable two-phase algorithm to type and extract key concept mentions into aspects of interest (e.g., Techniques, Applications, etc.). In the first phase of our algorithm we propose PhraseType, a probabilistic generative model which exploits textual features and limited POS tags to broadly segment text snippets into aspect-typed phrases. We extend this model to simultaneously learn aspect-specific features and identify academic domains in multi-domain corpora, since the two tasks mutually enhance each other. In the second phase, we propose an approach based on adaptor grammars to extract fine-grained concept mentions from the aspect-typed phrases in a purely data-driven manner, without the need for any external resources or human effort. We apply our technique to literature from diverse scientific domains and show significant gains over state-of-the-art concept extraction techniques. We also present a qualitative analysis of the results obtained.
    Comment: Published as a conference paper at CIKM 201
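The "purely data-driven, no external resources" spirit of the second phase can be illustrated very loosely (this is not the PhraseType or adaptor-grammar model) by mining repeated word n-grams from a title corpus as crude candidate concept mentions; the titles below are invented:

```python
from collections import Counter

def candidate_concepts(titles, min_count=2):
    """Return bigrams that recur across titles as candidate concept mentions."""
    grams = Counter()
    for t in titles:
        words = [w.lower().strip(":,") for w in t.split()]
        grams.update(zip(words, words[1:]))
    return {" ".join(g) for g, c in grams.items() if c >= min_count}

titles = [
    "Neural Machine Translation with Attention",
    "Attention Models for Neural Machine Translation",
    "A Survey of Machine Translation",
]
concepts = candidate_concepts(titles)
```

Adaptor grammars generalize this idea by caching whole reusable subtrees of variable length under a learned grammar, rather than fixed-length n-grams.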

    Web Data Extraction, Applications and Techniques: A Survey

    Web Data Extraction is an important problem that has been studied by means of different scientific tools and in a broad range of applications. Many approaches to extracting data from the Web have been designed to solve specific problems and operate in ad-hoc domains. Other approaches, instead, heavily reuse techniques and algorithms developed in the field of Information Extraction. This survey aims to provide a structured and comprehensive overview of the literature in the field of Web Data Extraction. We provide a simple classification framework in which existing Web Data Extraction applications are grouped into two main classes, namely applications at the Enterprise level and at the Social Web level. At the Enterprise level, Web Data Extraction techniques emerge as a key tool for performing data analysis in Business and Competitive Intelligence systems, as well as for business process re-engineering. At the Social Web level, Web Data Extraction techniques allow the gathering of large amounts of structured data continuously generated and disseminated by Web 2.0, Social Media, and Online Social Network users, which offers unprecedented opportunities to analyze human behavior at a very large scale. We also discuss the potential of cross-fertilization, i.e., the possibility of reusing Web Data Extraction techniques originally designed to work in a given domain in other domains.
    Comment: Knowledge-based System
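A hand-written "wrapper", the basic building block of many Web Data Extraction systems, can be sketched as follows; the HTML template and record format are invented for illustration (production systems induce or learn such rules automatically, and use a real HTML parser rather than a regex):

```python
import re

# A wrapper exploits the repeated record structure of a template-generated
# page: each record matches the same pattern, from which fields are captured.
ROW = re.compile(r'<li class="item">(?P<name>[^<]+) - \$(?P<price>[\d.]+)</li>')

def extract_records(html):
    """Apply the wrapper rule to a page and return (name, price) records."""
    return [(m["name"], float(m["price"])) for m in ROW.finditer(html)]

page = ('<ul><li class="item">Widget - $9.99</li>'
        '<li class="item">Gadget - $12.50</li></ul>')
records = extract_records(page)
```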