18 research outputs found

    A knowledge based reengineering approach via ontology and description logic.

    Get PDF
    Traditional software reengineering often involves a great deal of manual effort by software maintainers. This is time-consuming and error-prone. Because software reengineering is knowledge intensive, this thesis proposes a knowledge-based solution to semi-automate some of that manual effort. The thesis explores the principal research question: “How can software systems be described by knowledge representation techniques in order to semi-automate the manual effort in software reengineering?” The underlying research procedure is the scientific method, consisting of observation, proposition, test and conclusion. Ontology and description logic are employed to model and represent the knowledge in different software systems, integrated with domain knowledge. Model transformation is used to support ontology development. Description logic is used to implement ontology mapping algorithms, in which the problem of detecting semantic relationships is converted into the problem of deducing the satisfiability of logical formulae. An operating system ontology has been built with a top-down approach and deployed to support platform-specific software migration [132] and portable software development [18]. A data-dominant software ontology has been built with a bottom-up approach and deployed to support program comprehension [131] and modularisation [130]. This thesis suggests that software systems can be represented by ontology and description logic, and that doing so helps to semi-automate some of the manual tasks in software reengineering. There are, however, limitations: bottom-up ontology development may sacrifice some of a system's complexity, while top-down ontology development may become time-consuming and complicated. As future work, a greater number of diverse software system categories could be involved and different kinds of software system knowledge could be explored.
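
    To make the reduction from relationship detection to a subsumption/satisfiability test concrete, here is a minimal Python sketch of structural subsumption over conjunctions of atomic features; the concept names and feature sets are illustrative assumptions, not the ontologies built in the thesis.

        # A toy sketch (not the thesis's actual algorithm): detecting semantic
        # relationships between concepts from two ontologies by reducing the test
        # to a subsumption check over conjunctions of atomic features.

        def subsumes(general: frozenset, specific: frozenset) -> bool:
            """In this conjunctive fragment, C subsumes D iff every atom of C occurs in D."""
            return general <= specific

        # Concepts from a hypothetical "source platform" ontology ...
        src = {
            "SourceFile":  frozenset({"artefact", "textual", "compilable"}),
            "BinaryImage": frozenset({"artefact", "executable"}),
        }
        # ... and from a hypothetical "target platform" ontology.
        tgt = {
            "CompilationUnit": frozenset({"artefact", "textual", "compilable"}),
            "Artefact":        frozenset({"artefact"}),
        }

        for s_name, s_def in src.items():
            for t_name, t_def in tgt.items():
                if subsumes(t_def, s_def) and subsumes(s_def, t_def):
                    rel = "equivalent to"
                elif subsumes(t_def, s_def):
                    rel = "subsumed by"
                elif subsumes(s_def, t_def):
                    rel = "subsumes"
                else:
                    continue
                print(f"{s_name} is {rel} {t_name}")

    A full description-logic reasoner would instead test whether the concept C ⊓ ¬D is unsatisfiable, but the mapping decision it supports is the same kind of output shown here.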

    The Role of the Environment in Tissue P Systems with Cell Division

    Get PDF
    Classical tissue P systems with cell division have a special alphabet whose elements appear in the initial configuration of the system in an arbitrarily large number of copies. These objects are shared in a distinguished place of the system, called the environment. The ability of these computing devices to have infinitely many copies of some objects has been widely exploited in the design of efficient solutions to computationally hard problems. This paper deals with computational aspects of tissue P systems with cell division in which there is no environment with the property mentioned above. Specifically, we establish the relationships between the polynomial complexity classes associated with tissue P systems with cell division with and without environment. As a consequence, we prove that it is not necessary to have infinitely many copies of some objects in the initial configuration in order to solve NP-complete problems efficiently. Funding: Ministerio de Ciencia e Innovación TIN2009-13192; Junta de Andalucía P08 – TIC 0420.
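
    For readers unfamiliar with these devices, the following toy Python sketch simulates one simplified flavour of a tissue-like system in which a trigger object causes cell division; the rules and semantics are assumptions for illustration only, not the paper's formal model, but they show how division lets the number of cells (the working space) grow exponentially in linearly many steps.

        # A minimal, simplified sketch (assumed semantics, not the paper's model):
        # one evolution step where cells may take an object from the environment
        # and a trigger object causes a cell to divide.
        from collections import Counter

        environment = Counter({"a": 10**6})   # stands in for "arbitrarily many" copies
        cells = [Counter({"b": 1})]

        def step(cells, environment):
            new_cells = []
            for cell in cells:
                # Communication: a cell holding b takes one copy of a from the environment.
                if cell["b"] and environment["a"]:
                    environment["a"] -= 1
                    cell = cell + Counter({"a": 1})
                # Division: a cell holding a divides into two cells, rewriting a to b.
                if cell["a"]:
                    child = Counter(cell)
                    child["a"] -= 1
                    child["b"] += 1
                    new_cells.extend([Counter(child), Counter(child)])
                else:
                    new_cells.append(cell)
            return new_cells

        for _ in range(3):
            cells = step(cells, environment)
        print(len(cells), "cells after 3 steps")   # doubles each step: 8 cells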

    LearnFCA: A Fuzzy FCA and Probability Based Approach for Learning and Classification

    Get PDF
    Formal Concept Analysis (FCA) is a mathematical theory based on lattice and order theory, used for data analysis and knowledge representation. Over the past several years, many of its extensions have been proposed and applied in several domains, including data mining, machine learning, knowledge management, the semantic web, software development, chemistry, biology, medicine, data analytics and ontology engineering. This thesis reviews the state of the art of FCA theory and the various extensions that have been developed and well studied in the past several years. We discuss their historical roots and reproduce the original definitions and derivations with illustrative examples. Further, we provide a literature review of its applications and of the approaches adopted by researchers in data analysis and knowledge management, with emphasis on data-learning and classification problems. We propose LearnFCA, a novel approach based on FuzzyFCA and probability theory for learning and classification problems. LearnFCA uses an enhanced version of FuzzyLattice, developed to store class labels and probability vectors, and capable of classifying instances with encoded and unlabelled features. We evaluate LearnFCA on encodings from three datasets - MNIST, Omniglot and cancer images - with interesting results and varying degrees of success. Adviser: Dr Jitender Deogu
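
    As background, here is a small Python sketch of fuzzy derivation operators, the kind of machinery fuzzy FCA builds on; the incidence matrix is an illustrative assumption, and this is not the thesis's FuzzyLattice, which additionally attaches class labels and probability vectors to concepts.

        # A toy sketch of fuzzy derivation operators over a fuzzy formal context.
        import numpy as np

        # Fuzzy incidence relation I(object, attribute) with degrees in [0, 1].
        I = np.array([
            [1.0, 0.7, 0.2],
            [0.9, 0.8, 0.1],
            [0.2, 0.3, 1.0],
        ])

        def up(extent_mask):
            """Attributes shared (to the minimum degree) by the chosen objects."""
            return I[extent_mask].min(axis=0)

        def down(intent_degrees):
            """Objects possessing every attribute at least to the required degree."""
            return (I >= intent_degrees).all(axis=1)

        extent = np.array([True, True, False])
        intent = up(extent)          # e.g. [0.9, 0.7, 0.1]
        closure = down(intent)       # the same two objects: a fuzzy concept
        print(intent, closure)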

    Universal OWL Axiom Enrichment for Large Knowledge Bases

    Full text link
    The Semantic Web has seen a rise in the availability and usage of knowledge bases over the past years, in particular in the Linked Open Data initiative. Despite this growth, there is still a lack of knowledge bases that consist of high-quality schema information and instance data adhering to this schema. Several knowledge bases only consist of schema information, while others are, to a large extent, a mere collection of facts without a clear structure. The combination of rich schema and instance data would allow powerful reasoning, consistency checking, and improved querying possibilities, as well as provide more generic ways to interact with the underlying data. In this article, we present a lightweight method to enrich knowledge bases accessible via SPARQL endpoints with almost all types of OWL 2 axioms. This allows schemata to be created semi-automatically, which we evaluate and discuss using DBpedia.
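
    A hedged sketch of the general idea, not the article's implementation: probe a public SPARQL endpoint for instance-data evidence supporting a candidate OWL axiom, here a range axiom for dbo:author on DBpedia; the property, class and acceptance threshold are illustrative assumptions.

        # Count how often the values of dbo:author are typed as dbo:Person and,
        # if the support is high enough, suggest a range axiom.
        from SPARQLWrapper import SPARQLWrapper, JSON

        PREFIX = "PREFIX dbo: <http://dbpedia.org/ontology/>\n"

        def count(query):
            endpoint = SPARQLWrapper("https://dbpedia.org/sparql")
            endpoint.setReturnFormat(JSON)
            endpoint.setQuery(PREFIX + query)
            result = endpoint.query().convert()
            return int(result["results"]["bindings"][0]["n"]["value"])

        total = count("SELECT (COUNT(*) AS ?n) WHERE { ?s dbo:author ?o }")
        typed = count("SELECT (COUNT(*) AS ?n) WHERE { ?s dbo:author ?o . ?o a dbo:Person }")
        support = typed / total if total else 0.0

        if support > 0.9:   # assumed acceptance threshold
            print(f"Suggest: dbo:author rdfs:range dbo:Person  (support {support:.2%})")

    In a semi-automatic workflow, suggestions like this one would be presented to a knowledge engineer for confirmation rather than added to the schema unconditionally.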

    Learning Description Logic Knowledge Bases from Data Using Methods from Formal Concept Analysis

    Get PDF
    Description Logics (DLs) are a class of knowledge representation formalisms that can represent terminological and assertional knowledge with a well-defined semantics. Often, knowledge engineers are experts in their own fields, but not in logics, and require assistance in the process of ontology design. This thesis presents three methods that can extract terminological knowledge from existing data and thereby assist in the design process. They are based on similar formalisms from Formal Concept Analysis (FCA), in particular the Next-Closure algorithm and Attribute Exploration. The first of the three methods computes terminological knowledge from the data without any expert interaction. The other two methods use expert interaction, where a human expert can confirm each terminological axiom or refute it by providing a counterexample. These two methods differ only in the way counterexamples are provided.
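
    Since the Next-Closure algorithm is central here, a compact Python sketch that enumerates all intents (closed attribute sets) of a small formal context in lectic order; the context itself is an illustrative assumption, not data from the thesis.

        # Ganter's Next-Closure over a tiny formal context.
        context = {                       # object -> set of attributes it has
            "duck":    {"flies", "lays_eggs"},
            "ostrich": {"lays_eggs", "runs"},
            "bat":     {"flies"},
        }
        attributes = sorted({m for ms in context.values() for m in ms})

        def closure(attrs):
            """Double-prime operator: attributes common to all objects having `attrs`."""
            objs = [o for o, ms in context.items() if attrs <= ms]
            if not objs:
                return set(attributes)
            return set.intersection(*(context[o] for o in objs))

        def next_closure(A):
            for i in reversed(range(len(attributes))):
                m = attributes[i]
                if m in A:
                    continue
                B = closure({a for a in A if attributes.index(a) < i} | {m})
                if all(attributes.index(b) >= i for b in B - A):
                    return B
            return None

        intent = closure(set())
        while intent is not None:
            print(sorted(intent))
            intent = next_closure(intent)

    Running this prints the six intents of the toy context, from the empty intent up to the full attribute set; Attribute Exploration wraps the same enumeration in a loop that asks an expert to accept each implied axiom or supply a counterexample.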

    Using Knowledge Anchors to Facilitate User Exploration of Data Graphs

    Get PDF
    This paper investigates how to facilitate users' exploration through data graphs for knowledge expansion. Our work focuses on knowledge utility: increasing users' domain knowledge while exploring a data graph. We introduce a novel exploration support mechanism underpinned by the subsumption theory of meaningful learning, which postulates that new knowledge is grasped by starting from familiar concepts in the graph, which serve as knowledge anchors from which links to new knowledge are made. A core algorithmic component of operationalising the subsumption theory for meaningful learning to generate exploration paths for knowledge expansion is the automatic identification of knowledge anchors in a data graph (KADG). We present several metrics for identifying KADG, which are evaluated against familiar concepts in human cognitive structures. A subsumption algorithm that utilises KADG to generate exploration paths for knowledge expansion is presented and applied in the context of a semantic data browser in the music domain. The resultant exploration paths are evaluated in a task-driven experimental user study against free data graph exploration. The findings show that exploration paths based on subsumption and using knowledge anchors lead to a significantly higher increase in the users' conceptual knowledge and to better usability than free exploration of data graphs. The work opens a new avenue in semantic data exploration that investigates the link between learning and knowledge exploration. This extends the value of exploration and enables broader applications of data graphs in systems where the end users are not experts in the specific domain.
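
    The paper's KADG metrics are not reproduced here; the following Python sketch merely illustrates the shape of the idea with an assumed stand-in metric: score candidate anchors in a toy music data graph by how many entities they subsume, then derive an exploration path from the best anchor to an unfamiliar entity.

        # Illustrative anchor selection and path generation over a small data graph.
        import networkx as nx

        g = nx.DiGraph()   # edge u -> v means "v is a narrower / subsumed concept of u"
        g.add_edges_from([
            ("MusicGenre", "Jazz"), ("MusicGenre", "Blues"),
            ("Jazz", "Bebop"), ("Jazz", "CoolJazz"), ("Bebop", "CharlieParker"),
        ])

        # Stand-in anchor score: number of concepts reachable below the candidate
        # (a proxy for a familiar concept that subsumes much of the graph).
        scores = {n: len(nx.descendants(g, n)) for n in g.nodes}
        anchor = max(scores, key=scores.get)

        path = nx.shortest_path(g, source=anchor, target="CharlieParker")
        print("anchor:", anchor, "exploration path:", " -> ".join(path))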

    Dealing with inconsistent and incomplete data in a semantic technology setting

    Get PDF
    Semantic and traditional databases are vulnerable to inconsistent or incomplete data (IID). A data set stored in a traditional or semantic database is queried to retrieve records in a tabular format. Such retrieved records can consist of many rows, where each row contains an object and its associated fields (columns). However, a large set of records retrieved from a noisy data set may be wrongly analysed: a data analyst may treat inconsistent data as consistent, or incomplete data as complete, when the inconsistency or incompleteness in the data has not been identified. Analysis of a large data set can thus be undermined by the presence of IID, and reliance is placed on the data analyst to identify and visualise the IID in the data set. The IID issues are heightened under the open world assumption, as found in semantic or Resource Description Framework (RDF) databases. Unlike the closed world assumption in traditional databases, where data are assumed to be complete (with its own issues), under the open world assumption the data may be unknown and IID has to be tolerated at the outset. Formal Concept Analysis (FCA) can be used to deal with IID in such databases, because FCA is a mathematical method that uses a lattice structure to reveal the associations among objects and attributes in a data set. The existing FCA approaches that can be used to deal with IID in RDF databases include fault tolerance, Dau's approach, and the CUBIST approaches. The new FCA approaches include association rules and the semi-automated and automated methods in FcaBedrock; these new approaches were developed in the course of this study. To underpin this work, a series of empirical studies was carried out based on the single case study methodology. The case study, namely the Edinburgh Mouse Atlas Gene Expression Database (EMAGE), provided the real-life context according to that methodology. The existing and the new FCA approaches were used to identify and visualise the IID in the EMAGE RDF data set. The empirical studies revealed that the existing approaches used to deal with IID in EMAGE are tedious and do not allow the IID to be easily visualised in the database. They also revealed that the existing FCA approaches for dealing with IID do not exclusively visualise the IID in a data set, unlike the new FCA approaches, notably the semi-automated and automated FcaBedrock, which can separate out and thus exclusively visualise IID in objects associated with the many-valued attributes that characterise such data sets. The exclusive visualisation of IID in a data set enables the data analyst to identify the IID in the investigated data set holistically, thereby avoiding mistaken conclusions. The aim was to discover how effective each FCA approach is in identifying and visualising IID, answering the research question: "How can FCA tools and techniques be used in identifying and visualising IID in RDF data?" The automated FcaBedrock approach emerged as the best means of visually identifying IID in an RDF data set. The CUBIST approaches and the semi-automated approach were ranked 2nd and 3rd, respectively, whilst Dau's approach ranked 4th. Whilst the subject of IID in a semantic technology setting could be explored further, it can be concluded that the automated FcaBedrock approach best identifies and visualises the IID in an RDF, and thus semantic, data set.
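
    A simplified sketch of the kind of check involved, not FcaBedrock itself: scan many-valued records (the object and attribute names are hypothetical) and flag attributes that reveal IID, either no value at all (incomplete) or several competing values (inconsistent), which an FCA tool could then scale into a formal context and visualise.

        # Flag incomplete and inconsistent attribute values in toy many-valued records.
        records = {                     # object -> attribute -> set of asserted values
            "gene_1": {"stage": {"TS7"}, "strength": {"strong"}},
            "gene_2": {"stage": {"TS7", "TS9"}, "strength": {"weak"}},
            "gene_3": {"stage": {"TS8"}, "strength": set()},
        }
        attributes = ["stage", "strength"]

        for obj, fields in records.items():
            for attr in attributes:
                values = fields.get(attr, set())
                if not values:
                    print(f"{obj}: '{attr}' is incomplete (no value)")
                elif len(values) > 1:
                    print(f"{obj}: '{attr}' has multiple values {sorted(values)} "
                          "and may be inconsistent")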