4 research outputs found

    Temporal expression normalisation in natural language texts

    Automatic annotation of temporal expressions is a research challenge of great interest in the field of information extraction. In this report, I describe a novel rule-based architecture, built on top of a pre-existing system, which is able to normalise temporal expressions detected in English texts. Gold standard temporally-annotated resources are limited in size, which makes research difficult. The proposed system outperforms state-of-the-art systems on the TempEval-2 Shared Task (value attribute) and achieves substantially better results than the pre-existing system on which it was built. I also introduce a new free corpus consisting of 2822 unique annotated temporal expressions. Both the corpus and the system are freely available online. Comment: 7 pages, 1 figure, 5 tables
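To make the idea of rule-based normalisation concrete, here is a minimal Python sketch. It is not the system described in the abstract: the rule table, the `normalise` function, and the two supported patterns are hypothetical choices made for illustration. It maps a detected expression to an ISO 8601 date string relative to a reference date, similar in spirit to filling a value attribute.

```python
import re
from datetime import date, timedelta

# Hypothetical rule table: relative-day words mapped to day offsets.
RELATIVE_DAYS = {"today": 0, "yesterday": -1, "tomorrow": 1}

# Month names mapped to month numbers (1-12).
MONTHS = {name: number for number, name in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"], start=1)}

# Hypothetical pattern for absolute dates written as "5 March 2010".
ABSOLUTE = re.compile(r"^(\d{1,2}) ([a-z]+) (\d{4})$")


def normalise(expression, reference):
    """Return an ISO 8601 date string, or None if no rule applies."""
    text = expression.strip().lower()

    # Rule 1: relative expressions are resolved against the reference date.
    if text in RELATIVE_DAYS:
        return (reference + timedelta(days=RELATIVE_DAYS[text])).isoformat()

    # Rule 2: absolute "day month year" expressions.
    match = ABSOLUTE.match(text)
    if match and match.group(2) in MONTHS:
        day = int(match.group(1))
        month = MONTHS[match.group(2)]
        year = int(match.group(3))
        try:
            return date(year, month, day).isoformat()
        except ValueError:
            # Impossible calendar dates (for example 31 February) are rejected.
            return None

    return None
```

Calling `normalise("yesterday", date(2010, 3, 1))` resolves the expression against the supplied reference date, while an expression that matches no rule yields `None` rather than a guess.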

    An Analysis and Reasoning Framework for Project Data Software Repositories

    As the requirements for software systems increase, their size, complexity and functionality increase as well. This has a direct impact on the complexity of numerous artifacts related to the system, such as specification, design, implementation, and testing models. Furthermore, as the software market becomes more and more competitive, the need for software products that are of high quality and require the least monetary, time and human resources for their development and maintenance becomes evident. Therefore, it is important that project managers and software engineers are given the necessary tools to obtain a more holistic and accurate perspective of the status of their projects, in order to identify early the potential risks, flaws, and quality issues that may arise during each stage of the software project life cycle. In this respect, practitioners and academics alike have recognized the significance of investigating new methods for supporting software management operations with respect to large software projects. The main target of this M.A.Sc. thesis is the design of a framework in terms of, first, a reference architecture for mining and analyzing software project data repositories according to specific objectives and analytic knowledge; second, the techniques to model such analytic knowledge; and, third, a reasoning methodology for verifying or denying hypotheses related to analysis objectives. Such a framework could assist project managers, team leaders and development teams towards more accurate prediction of project traits such as quality analysis, risk assessment, cost estimation and progress evaluation. More specifically, the framework utilizes goal models to specify analysis objectives as well as possible ways by which these objectives can be achieved. Examples of such analysis objectives for a project could be to yield high code quality, achieve low production cost, or cope with tight delivery deadlines.
Such goal models are subsequently transformed into collections of Markov Logic Network rules, which are then applied to the repository data in order to verify or deny, with a degree of probability, whether the particular project objectives can be met as the project evolves. The proposed framework has been applied, as a proof of concept, to a repository pertaining to three industrial projects with more than one hundred development tasks.

    Requirement-based Root Cause Analysis Using Log Data

    Root Cause Analysis for software systems is a challenging diagnostic task due to the complexity emanating from the interactions between system components. Furthermore, the sheer size of the logged data often makes it difficult for human operators and administrators to perform problem diagnosis and root cause analysis. The diagnostic task is further complicated by the lack of models that could be used to support the diagnostic process. Traditionally, this diagnostic task is conducted by human experts who create mental models of systems in order to generate hypotheses and conduct the analysis, even in the presence of incomplete logged data. A challenge in this area is to provide the necessary concepts, tools, and techniques for operators to focus their attention on specific parts of the logged data and ultimately to automate the diagnostic process. The work described in this thesis proposes a framework that includes techniques, formalisms, and algorithms aimed at automating the process of root cause analysis. In particular, this work uses annotated requirement goal models to represent the monitored systems' requirements and runtime behavior. The goal models are used in combination with log data to generate a ranked set of diagnostics that represent the combination of tasks that failed, leading to the observed failure. In addition, the framework uses a combination of word-based and topic-based information retrieval techniques to reduce the size of the log data by filtering out a subset of it to facilitate the diagnostic process. The process of log data filtering and reduction is based on goal model annotations and generates a sequence of logical literals that represent the possible system observations. A second level of investigation consists of looking for evidence of any malicious (i.e., intentionally caused by a third party) activity leading to task failures.
This analysis uses annotated anti-goal models that denote possible actions that can be taken by an external user to threaten a given system task. The framework uses a novel probabilistic approach based on Markov Logic Networks. Our experiments show that our approach improves over existing proposals by handling uncertainty in observations, using natively generated log data, and providing ranked diagnoses. The proposed framework has been evaluated using a test environment based on commercial off-the-shelf software components, a publicly available Java-based ATM machine, and a large publicly available dataset (DARPA 2000).
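The word-based part of the log filtering step can be illustrated with a minimal Python sketch. This is an assumption-laden simplification, not the thesis's method: the task names, keyword sets, log lines, and the `filter_log` function are all hypothetical, and plain keyword overlap stands in for the word-based and topic-based retrieval techniques the abstract mentions.

```python
import re


def tokenize(line):
    """Split a log line into a set of lowercase alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", line.lower()))


def filter_log(lines, annotations):
    """Keep, per task, the log lines sharing at least one keyword with it.

    `annotations` maps a hypothetical goal-model task name to a set of
    lowercase keywords; lines matching no task are dropped entirely.
    """
    relevant = {task: [] for task in annotations}
    for line in lines:
        tokens = tokenize(line)
        for task, keywords in annotations.items():
            if tokens & keywords:
                relevant[task].append(line)
    return relevant


# Hypothetical annotations and log lines, for illustration only.
annotations = {
    "authenticate": {"login", "password"},
    "withdraw": {"withdraw", "cash"},
}
log_lines = [
    "10:01 login accepted for card 1234",
    "10:02 heartbeat ok",
    "10:03 cash dispenser error",
]
relevant = filter_log(log_lines, annotations)
```

In this sketch the heartbeat line matches no annotation and is discarded, which is the sense in which filtering reduces the volume of log data that later diagnostic reasoning has to consider.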

    Real-World Learning with Markov Logic Networks

    No full text