User Story Extraction from Online News with Feature-Based and Maximum Entropy Method for Software Requirements Elicitation
Software requirements elicitation is the first stage in software requirements engineering. Elicitation is the process of identifying software requirements from various sources such as interviews with resource persons, questionnaires, and document analysis. User stories are easy to adapt as system requirements change. A user story is written in a semi-structured language: it must follow a fixed syntax that serves as the standard for writing features in agile software development methods. In addition, user stories are easily understood by end users without an information technology background because they describe system requirements in natural language. A user story has three aspects: the who aspect (actor), the what aspect (activity), and the why aspect (reason). This study proposes extracting user stories consisting of the who and what aspects from online news sites, using feature extraction and maximum entropy as the classification method. A systems analyst can use factual information reported in online news to derive the required software requirements. The extraction method in this research is expected to produce user stories relevant to the software requirements and so assist systems analysts in generating requirements. The proposed method achieves average precision and recall of 98.21% and 95.16% for the who aspect, 87.14% and 87.50% for the what aspect, and 81.21% and 78.60% for user stories. These results suggest that the proposed method generates user stories relevant to functional software requirements.
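A maximum entropy classifier of the kind the abstract describes is equivalent to multinomial logistic regression over hand-crafted token features. The sketch below is purely illustrative: the feature set, label scheme (WHO/WHAT/O), and training examples are invented for demonstration and are not the paper's actual features or data.

```python
# Hypothetical sketch: maximum-entropy (multinomial logistic regression)
# token classification into who/what aspects over simple features.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

def token_features(sentence, i):
    """Illustrative feature template for the token at position i."""
    word = sentence[i]
    return {
        "word": word.lower(),
        "is_capitalized": word[0].isupper(),
        "prev": sentence[i - 1].lower() if i > 0 else "<s>",
        "suffix3": word[-3:].lower(),
    }

# Toy training data: tokens labeled WHO (actor), WHAT (activity), or O.
train_sents = [
    (["The", "minister", "announced", "a", "new", "policy"],
     ["O", "WHO", "WHAT", "O", "O", "O"]),
    (["A", "teacher", "reported", "the", "incident"],
     ["O", "WHO", "WHAT", "O", "O"]),
]

X, y = [], []
for sent, labels in train_sents:
    for i, label in enumerate(labels):
        X.append(token_features(sent, i))
        y.append(label)

vec = DictVectorizer()
clf = LogisticRegression(max_iter=1000)  # maxent = multinomial logistic model
clf.fit(vec.fit_transform(X), y)

test_sent = ["The", "mayor", "announced", "a", "plan"]
pred = clf.predict(vec.transform(
    [token_features(test_sent, i) for i in range(len(test_sent))]))
print(list(pred))
```

In practice, classified who/what spans would then be slotted into the user-story template ("As a <who>, I want to <what>").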
Machine Understandable Contracts with Deep Learning
This research investigates the automatic translation of contracts into computer-understandable rules through Natural Language Processing. The most challenging aspect, studied throughout this paper, is understanding the meaning of a contract and expressing it in a structured format. The problem can be reduced to the Named Entity Recognition and Rule Extraction tasks, the latter handling the extraction of terms and conditions. Both problems are difficult, but deep learning models can tackle them. We believe this paper is the first work to approach Rule Extraction with deep learning. Such methods are data-hungry, so the research also introduces data sets for the two tasks. Additionally, it contributes to the literature by introducing Law-Bert, a BERT-based model pre-trained on unlabelled contracts. The results obtained on Named Entity Recognition and Rule Extraction show that pre-training on contracts has a positive effect on performance for the downstream tasks.
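To make "expressing a contract in a structured format" concrete, the sketch below shows one possible target representation for the NER and Rule Extraction outputs. The schema, field names, and example clause are illustrative assumptions, not the paper's actual format.

```python
# Hypothetical target schema for contract-to-rule translation:
# named entities plus a machine-readable rule for one clause.
from dataclasses import dataclass, field

@dataclass
class Entity:
    text: str
    label: str          # e.g. PARTY, DATE, AMOUNT (illustrative tag set)

@dataclass
class Rule:
    condition: str      # triggering condition, in normalized form
    obligation: str     # who must do what when the condition holds
    entities: list = field(default_factory=list)

clause = ("If the Tenant fails to pay rent by the 5th day of the month, "
          "the Tenant shall pay a late fee of $50.")

# The kind of structured output an NER + Rule Extraction pipeline
# might produce for the clause above:
rule = Rule(
    condition="rent not paid by day 5 of month",
    obligation="Tenant pays late fee",
    entities=[Entity("Tenant", "PARTY"), Entity("$50", "AMOUNT")],
)
print(rule.obligation, [e.label for e in rule.entities])
```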
Comprehensive Review of Opinion Summarization
The abundance of opinions on the web has kindled the study of opinion summarization over the last few years. People have introduced various techniques and paradigms to solve this special task. This survey attempts to systematically investigate the different techniques and approaches used in opinion summarization. We provide a multi-perspective classification of the approaches used and highlight some of the key weaknesses of these approaches. This survey also covers evaluation techniques and data sets used in studying the opinion summarization problem. Finally, we provide insights into some of the challenges that remain to be addressed, which will help set the trend for future research in this area.
Unsupervised Extraction of Representative Concepts from Scientific Literature
This paper studies the automated categorization and extraction of scientific
concepts from titles of scientific articles, in order to gain a deeper
understanding of their key contributions and facilitate the construction of a
generic academic knowledgebase. Towards this goal, we propose an unsupervised,
domain-independent, and scalable two-phase algorithm to type and extract key
concept mentions into aspects of interest (e.g., Techniques, Applications,
etc.). In the first phase of our algorithm we propose PhraseType, a
probabilistic generative model which exploits textual features and limited POS
tags to broadly segment text snippets into aspect-typed phrases. We extend this
model to simultaneously learn aspect-specific features and identify academic
domains in multi-domain corpora, since the two tasks mutually enhance each
other. In the second phase, we propose an approach based on adaptor grammars to
extract fine-grained concept mentions from the aspect-typed phrases without the
need for any external resources or human effort, in a purely data-driven
manner. We apply our technique to study literature from diverse scientific
domains and show significant gains over state-of-the-art concept extraction
techniques. We also present a qualitative analysis of the results obtained.
Comment: Published as a conference paper at CIKM 201
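The first phase described above segments text into aspect-typed phrases using limited POS tags. The toy function below illustrates the general idea of POS-based phrase chunking on a pre-tagged title; the simplified tag set and the greedy ADJ/NOUN grouping rule are assumptions for demonstration, not the PhraseType model itself.

```python
# Toy stand-in for POS-guided phrase segmentation: group maximal runs of
# adjectives and nouns in a pre-tagged title into candidate concept mentions.

# Pre-tagged title tokens as (word, coarse POS tag) pairs.
title = [("Unsupervised", "ADJ"), ("Extraction", "NOUN"), ("of", "ADP"),
         ("Representative", "ADJ"), ("Concepts", "NOUN"), ("from", "ADP"),
         ("Scientific", "ADJ"), ("Literature", "NOUN")]

def chunk_phrases(tagged):
    """Greedily collect maximal ADJ/NOUN runs as candidate phrases."""
    phrases, current = [], []
    for word, tag in tagged:
        if tag in ("ADJ", "NOUN"):
            current.append(word)
        elif current:
            phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases

print(chunk_phrases(title))
# → ['Unsupervised Extraction', 'Representative Concepts', 'Scientific Literature']
```

The paper's second phase would then mine finer-grained concept mentions from such phrases with adaptor grammars, which this sketch does not attempt.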
Web Data Extraction, Applications and Techniques: A Survey
Web Data Extraction is an important problem that has been studied by means of
different scientific tools and in a broad range of applications. Many
approaches to extracting data from the Web have been designed to solve specific
problems and operate in ad-hoc domains. Other approaches, instead, heavily
reuse techniques and algorithms developed in the field of Information
Extraction.
This survey aims at providing a structured and comprehensive overview of the
literature in the field of Web Data Extraction. We provide a simple
classification framework in which existing Web Data Extraction applications are
grouped into two main classes, namely applications at the Enterprise level and
at the Social Web level. At the Enterprise level, Web Data Extraction
techniques emerge as a key tool to perform data analysis in Business and
Competitive Intelligence systems as well as for business process
re-engineering. At the Social Web level, Web Data Extraction techniques make it
possible to gather large amounts of structured data continuously generated and
disseminated by Web 2.0, Social Media and Online Social Network users, which
offers unprecedented opportunities to analyze human behavior at a very large
scale. We also discuss the potential of cross-fertilization, i.e., the
possibility of re-using Web Data Extraction techniques originally designed to
work in a given domain in other domains.
Comment: Knowledge-Based Systems
OBOME - Ontology based opinion mining in UBIPOL
Ontologies play a special role in the UBIPOL system: they help structure the policy-related context, provide a conceptualization of the policy domain, and are used in the opinion mining process. In this work we present a system called Ontology Based Opinion Mining Engine (OBOME) for analyzing a domain-specific opinion corpus, first assisting the user with the creation of a domain ontology from the corpus and then determining the polarity of opinion on the various domain aspects. In the former step, the policy domain aspects are identified (namely, which policy category is represented by each concept). This identification is supported by the policy modelling ontology, which describes the most important policy-related classes and their structure. The most informative documents are then extracted from the corpus, and the user is asked to create a set of aspects and related keywords from these documents. In the latter step, we use the corpus-specific ontology to model the domain and extract aspect-polarity associations using grammatical dependencies between words. The summarized results are then shown to the user to analyze and store. Finally, in an offline process, the policy modelling ontology is updated.
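The aspect-polarity association step can be illustrated with a much simpler heuristic: attach each opinion word to the nearest known aspect term by token distance. This is a deliberate simplification of OBOME's use of grammatical dependencies, and the aspect ontology and polarity lexicon below are made up for the example.

```python
# Illustrative aspect-polarity association via a small lexicon and a
# token-distance heuristic (simplified stand-in for dependency parsing).
ASPECTS = {"transport": {"bus", "train", "traffic"},
           "healthcare": {"hospital", "clinic"}}
POLARITY = {"good": 1, "excellent": 1, "slow": -1, "crowded": -1}

def aspect_polarities(text):
    tokens = text.lower().replace(".", "").split()
    pairs = []
    for i, tok in enumerate(tokens):
        if tok in POLARITY:
            # Attach the opinion word to the closest aspect term.
            best = None
            for j, other in enumerate(tokens):
                for aspect, words in ASPECTS.items():
                    if other in words and (best is None or abs(j - i) < best[1]):
                        best = ((aspect, POLARITY[tok]), abs(j - i))
            if best:
                pairs.append(best[0])
    return pairs

print(aspect_polarities(
    "The train was crowded but the hospital staff were excellent."))
# → [('transport', -1), ('healthcare', 1)]
```

Dependency-based association, as in OBOME, replaces the distance heuristic with actual grammatical links between the opinion word and the aspect term, which is more robust to intervening clauses.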