582 research outputs found

    Automatic generation of business process models from user stories

    In this paper, we propose an automated approach to extracting business process models from requirements presented as user stories. In agile software development, a user story is a brief, natural-language description of a software feature, written from the user's point of view; acceptance criteria are a list of specifications of how the new feature is expected to operate. Our approach analyzes the set of acceptance criteria accompanying a user story, first to automatically generate the components of the business process model, and then to produce the model as an activity diagram, a behavioral diagram of the Unified Modeling Language (UML). We begin by using natural language processing (NLP) techniques to extract the elements needed to define rules for retrieving the artifacts of the business process model. These rules are then developed in Prolog and loaded into Python code. The proposed approach was evaluated on a set of use cases using different performance measures, and the results indicate that our method is capable of generating correct and accurate process models
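
    The abstract does not name the tooling that links the Prolog rules to Python; a minimal sketch of one plausible integration, assuming SWI-Prolog with the pyswip bridge and hypothetical predicates activity/2 and flow/2 exported by the generated rule file:

        # Sketch only: querying Prolog extraction rules from Python via pyswip.
        # Assumes SWI-Prolog is installed and that rules.pl defines activity/2
        # and flow/2 (hypothetical predicate names used for illustration).
        from pyswip import Prolog

        prolog = Prolog()
        prolog.consult("rules.pl")  # load the rules produced from the NLP step

        # Candidate activities of the process model, one per solution.
        activities = [s["Name"] for s in prolog.query("activity(Story, Name)")]

        # Control-flow edges between activities.
        edges = [(s["From"], s["To"]) for s in prolog.query("flow(From, To)")]

        print(activities, edges)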

    Comparing knowledge sources for nominal anaphora resolution

    We compare two ways of obtaining lexical knowledge for antecedent selection in other-anaphora and definite noun phrase coreference. Specifically, we compare an algorithm that relies on links encoded in the manually created lexical hierarchy WordNet and an algorithm that mines corpora by means of shallow lexico-semantic patterns. As corpora we use the British National Corpus (BNC), as well as the Web, which has not been previously used for this task. Our results show that (a) the knowledge encoded in WordNet is often insufficient, especially for anaphor-antecedent relations that exploit subjective or context-dependent knowledge; (b) for other-anaphora, the Web-based method outperforms the WordNet-based method; (c) for definite NP coreference, the Web-based method yields results comparable to those obtained using WordNet over the whole dataset and outperforms the WordNet-based method on subsets of the dataset; (d) in both case studies, the BNC-based method is worse than the other methods because of data sparseness. Thus, in our studies, the Web-based method alleviated the lexical knowledge gap often encountered in anaphora resolution, and handled examples with context-dependent relations between anaphor and antecedent. Because it is inexpensive and needs no hand-modelling of lexical knowledge, it is a promising knowledge source to integrate in anaphora resolution systems
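
    As an illustration of the corpus-mining idea, the following minimal sketch scores candidate antecedents for an other-anaphor by instantiating a shallow pattern of the form "<candidate> and other <anaphor head>" and counting its occurrences; the tiny in-memory corpus and the exact pattern shape are simplifications standing in for BNC or Web counts:

        # Sketch only: pattern-based antecedent selection for other-anaphora.
        # A toy in-memory corpus stands in for BNC or Web frequency counts.
        corpus = [
            "aspirin and other drugs are cheap",
            "aspirin and other drugs were recalled",
            "the merger and other risks were discussed",
        ]

        def pattern_count(candidate: str, head: str) -> int:
            # Frequency of the instantiated lexico-semantic pattern.
            pattern = f"{candidate} and other {head}"
            return sum(doc.count(pattern) for doc in corpus)

        def best_antecedent(candidates: list[str], head: str) -> str:
            # Choose the candidate whose pattern is most frequent.
            return max(candidates, key=lambda c: pattern_count(c, head))

        # Anaphor "other drugs" with head noun "drugs":
        print(best_antecedent(["aspirin", "the merger"], "drugs"))  # aspirin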

    Advanced Knowledge Technologies at the Midterm: Tools and Methods for the Semantic Web

    In a celebrated essay on the new electronic media, Marshall McLuhan wrote in 1962:

        Our private senses are not closed systems but are endlessly translated into each other in that experience which we call consciousness. Our extended senses, tools, technologies, through the ages, have been closed systems incapable of interplay or collective awareness. Now, in the electric age, the very instantaneous nature of co-existence among our technological instruments has created a crisis quite new in human history. Our extended faculties and senses now constitute a single field of experience which demands that they become collectively conscious. Our technologies, like our private senses, now demand an interplay and ratio that makes rational co-existence possible. As long as our technologies were as slow as the wheel or the alphabet or money, the fact that they were separate, closed systems was socially and psychically supportable. This is not true now when sight and sound and movement are simultaneous and global in extent. (McLuhan 1962, p. 5, emphasis in original)

    Over forty years later, the seamless interplay that McLuhan demanded between our technologies is still barely visible. McLuhan's predictions of the spread, and increased importance, of electronic media have of course been borne out, and the worlds of business, science and knowledge storage and transfer have been revolutionised. Yet the integration of electronic systems as open systems remains in its infancy.

    Advanced Knowledge Technologies (AKT) aims to address this problem, to create a view of knowledge and its management across its lifecycle, and to research and create the services and technologies that such unification will require. Halfway through its six-year span, the results are beginning to come through, and this paper will explore some of the services, technologies and methodologies that have been developed. We hope to give a sense of the potential for the next three years, to discuss the insights and lessons learnt in the first phase of the project, and to articulate the challenges and issues that remain.

    The WWW provided the original context that made the AKT approach to knowledge management (KM) possible. When AKT was initially proposed in 1999, it brought together an interdisciplinary consortium with the technological breadth and complementarity to create the conditions for a unified approach to knowledge across its lifecycle. The combination of this expertise, and the time and space afforded the consortium by the IRC structure, suggested the opportunity for a concerted effort to develop an approach to advanced knowledge technologies based on the WWW as a basic infrastructure.

    The technological context of AKT altered for the better in the short period between the development of the proposal and the beginning of the project itself, with the emergence of the semantic web (SW), which foresaw much more intelligent manipulation and querying of knowledge. The opportunities that the SW provided, for example for more intelligent retrieval, put AKT at the centre of innovation in information technology and knowledge management services; the AKT skill set would clearly be central to the exploitation of those opportunities.

    The SW, as an extension of the WWW, provides an interesting set of constraints on the knowledge management services AKT tries to provide. As a medium for the semantically informed coordination of information, it has suggested a number of ways in which the objectives of AKT can be achieved, most obviously through the provision of knowledge management services delivered over the web, as opposed to the creation and provision of technologies to manage knowledge.

    AKT is working on the assumption that many web services will be developed and provided for users. The KM problem in the near future will be one of deciding which services are needed and of coordinating them. Many of these services will be largely or entirely legacies of the WWW, and so the capabilities of the services will vary. As well as providing useful KM services in their own right, AKT will aim to exploit this opportunity by reasoning over services, brokering between them, and providing essential meta-services for SW knowledge service management.

    Ontologies will be a crucial tool for the SW. The AKT consortium brings together a great deal of expertise on ontologies, and ontologies were always going to be a key part of the strategy. All kinds of knowledge sharing and transfer activities will be mediated by ontologies, and ontology management will be an important enabling task. Different applications will need to cope with inconsistent ontologies, or with the problems that will follow the automatic creation of ontologies (e.g. the merging of pre-existing ontologies to create a third). Ontology mapping, and the elimination of conflicts of reference, will be important tasks. All of these issues are discussed along with our proposed technologies.

    Similarly, specifications of tasks will be used for the deployment of knowledge services over the SW, but in general it cannot be expected that in the medium term there will be standards for task (or service) specifications. The brokering meta-services that are envisaged will have to deal with this heterogeneity.

    The emerging picture of the SW is one of great opportunity, but it will not be a well-ordered, certain or consistent environment. It will comprise many repositories of legacy data, outdated and inconsistent stores, and requirements for common understandings across divergent formalisms. There is clearly a role for standards to play in bringing much of this context together, and AKT is playing a significant role in these efforts. But standards take time to emerge, they take political power to enforce, and they have been known to stifle innovation (in the short term). AKT is keen to understand the balance between principled inference and statistical processing of web content. Logical inference on the Web is tough: complex queries using traditional AI inference methods bring most distributed computer systems to their knees. Do we set up semantically well-behaved areas of the Web? Is any part of the Web in which semantic hygiene prevails interesting enough to reason in? These and many other questions need to be addressed if we are to provide effective knowledge technologies for our content on the web

    Model-agnostic process modelling

    Modeling techniques in Business Process Management (BPM) often suffer from low adoption due to the variety of profiles found in organizations. This project aims to provide a novel alternative to BPM documentation, ATD, based on annotated process descriptions in natural language

    One, no one and one hundred thousand events: Defining and processing events in an inter-disciplinary perspective

    We present an overview of event definition and processing spanning 25 years of research in NLP. We first provide linguistic background to the notion of event, and then present past attempts to formalize this concept in annotation standards to foster the development of benchmarks for event extraction systems, ranging from MUC-3 in 1991 to the Time and Space Track challenge at SemEval 2015. We also shed light on other disciplines in which the notion of event plays a crucial role, with a focus on the historical domain. Our goal is to provide a comprehensive study of event definitions and to investigate what potential past efforts in the NLP community may have in a different research domain. We present the results of a questionnaire in which the notion of event for historians is put in relation to the NLP perspective

    Broad-coverage automatic event analysis of general-domain Estonian texts

    Due to large-scale digitisation and the shift from traditional written communication to digital written communication, vast amounts of natural language text are becoming machine-readable. Machine-readability holds the potential to ease the human effort of searching and organising large text collections, enabling applications such as automatic text summarisation and question answering. However, current tools for automatic text analysis fall short of the text understanding required to make such applications generic. It is hypothesised that the automatic analysis of events in texts brings us closer to this goal: as many texts can be interpreted as stories or narratives that decompose into events, extracting events from texts and representing them formally should provide a basis for language technology applications that require "text understanding". This thesis explores event analysis as a broad-coverage, general-domain automatic language analysis problem for Estonian, approaching the problem from a novel perspective centred on the temporal semantics of events and moving towards generic event analysis. We adapt the TimeML framework to Estonian, and create an automatic temporal expression tagger and a news corpus manually annotated for temporal semantics (event mentions, temporal expressions, and temporal relations); we analyse the consistency of human annotation of event mentions and temporal relations, and finally provide a preliminary study of event coreference resolution in Estonian news. The work also makes suggestions on how future research can improve Estonian event and temporal semantic annotation, and the language resources developed here will allow experimentation with end-user applications (such as automatic answering of temporal questions) as well as provide a basis for developing automatic semantic analysis tools
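
    The abstract mentions an automatic temporal expression tagger built on the TimeML framework; as a toy illustration of the target representation only, the sketch below wraps four-digit years in TimeML-style TIMEX3 tags (an English example with a deliberately minimal pattern; the actual tagger targets Estonian and covers far more expression types):

        # Sketch only: toy TimeML-style TIMEX3 tagging of four-digit years.
        import re

        def tag_years(text: str) -> str:
            # TIMEX3 with type and value attributes, as defined by TimeML.
            repl = lambda m: (f'<TIMEX3 type="DATE" value="{m.group(0)}">'
                              f'{m.group(0)}</TIMEX3>')
            return re.sub(r"\b(?:19|20)\d{2}\b", repl, text)

        print(tag_years("The corpus was annotated in 2014."))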

    Bridging the gap between textual and formal business process representations

    In the era of digital transformation, an increasing number of organizations are starting to think in terms of business processes. Processes are at the very heart of each business, and must be understood and carried out by a wide range of actors, from both technical and non-technical backgrounds alike. When embracing digital transformation practices, there is a need for all involved parties to be aware of the underlying business processes in an organization. However, the representational complexity and biases of the state-of-the-art modeling notations pose a challenge to understandability. On the other hand, plain language representations, accessible by nature and easily understood by everyone, are often frowned upon by technical specialists due to their ambiguity. The aim of this thesis is precisely to bridge this gap: between the world of technical, formal languages and the world of simpler, accessible natural languages. Structured as an article compendium, in this thesis we present four main contributions that address specific problems at the intersection of the fields of natural language processing and business process management

    BioNLP Shared Task - The Bacteria Track

    Background: We present the BioNLP 2011 Shared Task Bacteria Track, the first information extraction challenge entirely dedicated to bacteria. It includes three tasks that cover different levels of biological knowledge. The Bacteria Gene Renaming supporting task aims at extracting gene renamings and gene name synonymy from PubMed abstracts. The Bacteria Gene Interaction task addresses gene/protein interaction extraction from individual sentences; the interactions are categorized into ten sub-types, giving a detailed account of genetic regulation at the molecular level. Finally, the Bacteria Biotopes task focuses on the localization and environment of bacteria mentioned in textbook articles. We describe the creation of the three corpora, including document acquisition and manual annotation, as well as the metrics used to evaluate the participants' submissions. Results: Three teams submitted to the Bacteria Gene Renaming task; the best team achieved an F-score of 87%. For the Bacteria Gene Interaction task, the single participating system reached a global F-score of 77%, although its performance varies significantly from one sub-type to another. Three teams submitted to the Bacteria Biotopes task with very different approaches; the best team achieved an F-score of 45%. A detailed study of the participating systems nevertheless reveals the strengths and weaknesses of each. Conclusions: The three tasks of the Bacteria Track offer participants a chance to address a wide range of issues in information extraction, including entity recognition, semantic typing and coreference resolution. We found common trends in the most effective systems: the systematic use of syntactic dependencies and machine learning. Nevertheless, the originality of the Bacteria Biotopes task encouraged the use of interesting novel methods and techniques, such as term compositionality and scopes wider than the sentence
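
    For reference, the F-scores quoted above are the balanced harmonic mean of precision and recall; a minimal sketch of the computation from true positive (tp), false positive (fp) and false negative (fn) counts, with purely illustrative numbers:

        # Sketch only: the balanced F-score used to rank submissions.
        def f_score(tp: int, fp: int, fn: int) -> float:
            precision = tp / (tp + fp)
            recall = tp / (tp + fn)
            return 2 * precision * recall / (precision + recall)

        # Illustrative counts, not taken from the shared-task results.
        print(round(f_score(tp=87, fp=13, fn=13), 2))  # 0.87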
    • 

    corecore