134 research outputs found

    Linked data and online classifications to organise mined patterns in patient data

    In this paper, we investigate the use of web data resources in medicine, especially medical classifications made available using the principles of Linked Data, to support the interpretation of patterns mined from patient care trajectories. Interpreting such patterns is naturally a challenge for an analyst, as it requires going through large amounts of results and having access to sufficient background knowledge. We employ linked data, especially as exposed through the BioPortal system, to create a navigation structure within the patterns obtained from sequential pattern mining. We show how this approach provides a flexible way to explore data about trajectories of diagnoses and treatments according to different medical classifications.

    Document management and retrieval for specialised domains : an evolutionary user-based approach

    Browsing marked-up documents by traversing hyperlinks has become probably the most important means by which documents are accessed, both via the World Wide Web (WWW) and organisational Intranets. However, there is a pressing demand for document management and retrieval systems to deal appropriately with the massive number of documents available. There are two classes of solution: general search engines, whether for the WWW or an Intranet, which make little use of specific domain knowledge, or hand-crafted specialised systems, which are costly to build and maintain. The aim of this thesis was to develop a document management and retrieval system suitable for small communities as well as individuals in specialised domains on the Web. The aim was to allow users to easily create and maintain their own organisation of documents while ensuring continual improvement in the retrieval performance of the system as it evolves. The system developed is based on the free annotation of documents by users and is browsed using the concept lattice of Formal Concept Analysis (FCA). A number of annotation support tools were developed to aid the annotation process so that a suitable system evolved. Experiments were conducted using the system to help find staff and student home pages at the School of Computer Science and Engineering, University of New South Wales. Results indicated that the annotation tools provided a good level of assistance, so that documents were easily organised, and that a lattice-based browsing structure that evolves in an ad hoc fashion provided good retrieval performance. An interesting result suggested that although an established external taxonomy can be useful in proposing annotation terms, users appear to be very selective in their use of the terms proposed. Results also supported the hypothesis that the concept lattice of FCA helped take users beyond a narrow search to find other useful documents. In general, lattice-based browsing was considered a more helpful method than Boolean queries or hierarchical browsing for searching a specialised domain. We conclude that the concept lattice of Formal Concept Analysis, supported by annotation techniques, is a useful way of supporting the flexible, open management of documents required by individuals and small communities, and in specialised domains. It seems likely that this approach can be readily integrated with other developments, such as further improvements in search engines and the use of semantically marked-up documents, and can provide a unique advantage in supporting autonomous management of documents by individuals and groups, in a way that is closely aligned with the autonomy of the WWW.
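
To make the browsing structure described in the abstract above more concrete, here is a minimal sketch of how a concept lattice's nodes can be derived from user annotations. It is not the thesis's implementation: the function name `formal_concepts`, the page names, and the annotation terms are all hypothetical, and the enumeration is a naive one that only suits very small collections.

```python
from itertools import combinations


def formal_concepts(context):
    """Return all formal concepts (extent, intent) of a small formal context.

    `context` maps each document (object) to its set of annotation terms
    (attributes). This is a naive enumeration over every subset of documents,
    used here only for illustration; it is exponential in the number of documents.
    """
    objects = sorted(context)
    all_terms = set().union(*context.values()) if context else set()

    def intent(objs):
        # Terms shared by every document in objs (all terms for the empty set).
        if not objs:
            return frozenset(all_terms)
        return frozenset(set.intersection(*(set(context[o]) for o in objs)))

    def extent(terms):
        # Documents annotated with every term in terms.
        return frozenset(o for o in objects if terms <= context[o])

    concepts = set()
    for size in range(len(objects) + 1):
        for objs in combinations(objects, size):
            shared = intent(objs)
            concepts.add((extent(shared), shared))
    return concepts


# Hypothetical annotations of three home pages.
annotations = {
    "page_a": {"staff", "ai"},
    "page_b": {"staff", "databases"},
    "page_c": {"student", "ai"},
}
lattice = formal_concepts(annotations)
```

Each pair in `lattice` is one node a user could browse to: for example, the node with intent `{"staff"}` groups `page_a` and `page_b`, and from there a user can move down to the narrower nodes for `{"staff", "ai"}` or `{"staff", "databases"}`.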

    From Domain Models to Components - A Formal Transformation Approach Towards Dependable Software Development

    Many academic, industrial, and government research units have unanimously acknowledged the importance of developing dependable software systems. At the same time, they have also concurred on the difficulties and challenges to be surmounted in achieving that goal. The importance of domain analysis and of linking domain models to software artifacts has also been recognized by various researchers. However, no formal approach to domain analysis has been attempted. The primary motivation for this thesis stems from this context. Component-based software engineering offers some attractive mechanisms to tackle the inherent complexity of developing dependable systems. Recently, a formal approach has been put forth for such development. This thesis provides a formal approach to domain analysis and transforms the domain model into the components required by this development process. Formal Concept Analysis (FCA) is a mathematical theory for identifying and classifying concepts. This thesis taps its potential to formally analyze the domain in a software development context. It turns out that the approach presented in this thesis cannot be fully automated; nevertheless, several useful contributions have been made. These include (1) capturing formal concepts and defining them in FCA; (2) defining composition rules to categorize formal concepts and their trustworthy properties; (3) integrating partial formal context tables to build the concept lattice; (4) specifying and developing a model transformation approach to construct a trustworthy OWL ontology; (5) implementing a model transformation technique to generate the TADL specification of the reusable component-based system. The proposed approach is applied to CoCoME, a benchmark case study in the domain of component-based development.

    Query-Based Multicontexts for Knowledge Base Browsing


    Discovering Lexical Generalisations. A Supervised Machine Learning Approach to Inheritance Hierarchy Construction

    Institute for Communicating and Collaborative Systems. Grammar development over the last decades has seen a shift away from large inventories of grammar rules to richer lexical structures. Many modern grammar theories are highly lexicalised. But simply listing lexical entries typically results in an undesirable amount of redundancy. Lexical inheritance hierarchies, on the other hand, make it possible to capture linguistic generalisations and thereby reduce redundancy. Inheritance hierarchies are usually constructed by hand, but this is time-consuming and often impractical if a lexicon is very large. Constructing hierarchies automatically or semi-automatically facilitates a more systematic analysis of the lexical data. In addition, lexical data is often extracted automatically from corpora, and this is likely to increase over the coming years. Therefore it makes sense to go a step further and automate the hierarchical organisation of lexical data too. Previous approaches to automatic lexical inheritance hierarchy construction tended to focus on minimality criteria, aiming for hierarchies that minimised one or more criteria such as the number of path-value pairs, the number of nodes or the number of inheritance links (Petersen 2001, Barg 1996a, and in a slightly different context: Light 1994). Aiming for minimality is motivated by the fact that the conciseness of inheritance hierarchies is a main reason for their use. However, I will argue that there are several problems with minimality-based approaches. First, minimality is not well defined in the context of lexical inheritance hierarchies, as there is a tension between different minimality criteria. Second, minimality-based approaches tend to underestimate the importance of linguistic plausibility. While such approaches start with a definition of minimal redundancy and then try to prove that this leads to plausible hierarchies, the approach suggested here takes the opposite direction. It starts with a manually built hierarchy to which a supervised machine learning algorithm is applied with the aim of finding a set of formal criteria that can guide the construction of plausible hierarchies. Taking this direction means that it is more likely that the selected criteria do in fact lead to plausible hierarchies. Using a machine learning technique also has the advantage that the set of criteria can be much larger than in hand-crafted definitions. Consequently, one can define conciseness in very broad terms, taking into account interdependencies in the data as well as simple minimality criteria. This leads to a more fine-grained model of hierarchy quality. In practice, the method proposed here consists of two components. Galois lattices are used to define the search space as the set of all generalisations over the input lexicon. Maximum entropy models, which have been trained on a manually built hierarchy, are then applied to the lattice of the input lexicon to distinguish between plausible and implausible generalisations based on the formal criteria that were found in the training step. An inheritance hierarchy is then derived by pruning implausible generalisations. The hierarchy is automatically evaluated by matching it to a manually built hierarchy for the input lexicon. Automatically constructing lexical hierarchies is a hard task, partly because what is considered the best hierarchy for a lexicon is to some extent subjective. Supervised learning methods also suffer from a lack of suitable training data. Hence, a semi-automatic architecture may be best suited for the task. Therefore, the performance of the system has been tested using a semi-automatic as well as an automatic architecture, and it has also been compared to the performance achieved by the pruning algorithm suggested by Petersen (2001). The findings show that the method proposed here is well suited for semi-automatic hierarchy construction.

    Ontology Learning Using Formal Concept Analysis and WordNet

    Manual ontology construction takes time, resources, and domain specialists. Automating or semi-automating part of this process would therefore be valuable. This project and dissertation provide a framework based on Formal Concept Analysis and WordNet for learning concept hierarchies from free text. The process consists of several steps. First, the document is part-of-speech tagged and then parsed to produce sentence parse trees. Verb/noun dependencies are then derived from the parse trees. After lemmatizing, pruning, and filtering the word pairs, the formal context is created. The formal context may contain erroneous and uninteresting pairs, because the parser output may be erroneous and not all derived pairs are interesting, and it may be large because it is constructed from a large free-text corpus. Deriving the concept lattice from the formal context may take a long time, depending on the size and complexity of the data. Reducing the formal context may therefore both eliminate erroneous and uninteresting pairs and speed up concept lattice derivation. WordNet-based and frequency-based reduction approaches are tested. Finally, we compute the formal concept lattice and create a classical concept hierarchy. The reduced concept lattice is compared to the original to evaluate the outcomes. Despite several system constraints and component discrepancies that may prevent firm conclusions, the results suggest that the concept hierarchies obtained in this project and dissertation are promising. First, the reduced concept lattice and the original concept lattice have commonalities. Second, alternative linguistic or statistical methods can reduce the size of the formal context. Finally, the WordNet-based and frequency-based approaches reduce the formal context differently, and the order in which they are applied is examined in order to reduce the context efficiently.
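
As an illustration of the context-building and frequency-based reduction steps described in the abstract above, the following sketch turns already-lemmatized verb/noun pairs into a formal context. It is an assumption-laden toy, not the dissertation's method: treating nouns as objects and verbs as attributes, the function name `build_context`, the `min_count` threshold, and the example pairs are all hypothetical choices made for this example.

```python
from collections import Counter


def build_context(pairs, min_count=2):
    """Build a formal context (noun -> set of verbs) from (verb, noun) pairs.

    Pairs seen fewer than `min_count` times are dropped. This stands in for a
    frequency-based reduction; the threshold value is an assumption.
    """
    counts = Counter(pairs)
    context = {}
    for (verb, noun), count in counts.items():
        if count >= min_count:
            context.setdefault(noun, set()).add(verb)
    return context


# Hypothetical lemmatized verb/noun dependencies extracted from parse trees.
pairs = [
    ("book", "hotel"), ("book", "hotel"),
    ("rent", "car"), ("rent", "car"),
    ("drive", "car"), ("drive", "car"),
    ("eat", "hotel"),  # occurs once, so it is filtered out as likely noise
]
context = build_context(pairs)
```

The resulting `context` is smaller than the unfiltered one, which is the property the abstract relies on: fewer incidence pairs mean a smaller lattice to derive.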

    24th International Conference on Information Modelling and Knowledge Bases

    In the last three decades, information modelling and knowledge bases have become essentially important subjects, not only in academic communities related to information systems and computer science but also in the business area where information technology is applied. The series of European – Japanese Conference on Information Modelling and Knowledge Bases (EJC) originally started as a co-operation initiative between Japan and Finland in 1982. The practical operations were then organised by Professor Ohsuga in Japan and Professors Hannu Kangassalo and Hannu Jaakkola in Finland (Nordic countries). The geographical scope has expanded to cover Europe and also other countries. A workshop character (discussion, enough time for presentations, and a limited number of participants (50) and papers (30)) is typical for the conference. Suggested topics include, but are not limited to: 1. Conceptual modelling: Modelling and specification languages; Domain-specific conceptual modelling; Concepts, concept theories and ontologies; Conceptual modelling of large and heterogeneous systems; Conceptual modelling of spatial, temporal and biological data; Methods for developing, validating and communicating conceptual models. 2. Knowledge and information modelling and discovery: Knowledge discovery, knowledge representation and knowledge management; Advanced data mining and analysis methods; Conceptions of knowledge and information; Modelling information requirements; Intelligent information systems; Information recognition and information modelling. 3. Linguistic modelling: Models of HCI; Information delivery to users; Intelligent informal querying; Linguistic foundation of information and knowledge; Fuzzy linguistic models; Philosophical and linguistic foundations of conceptual models. 4. Cross-cultural communication and social computing: Cross-cultural support systems; Integration, evolution and migration of systems; Collaborative societies; Multicultural web-based software systems; Intercultural collaboration and support systems; Social computing, behavioral modeling and prediction. 5. Environmental modelling and engineering: Environmental information systems (architecture); Spatial, temporal and observational information systems; Large-scale environmental systems; Collaborative knowledge base systems; Agent concepts and conceptualisation; Hazard prediction, prevention and steering systems. 6. Multimedia data modelling and systems: Modelling multimedia information and knowledge; Content-based multimedia data management; Content-based multimedia retrieval; Privacy and context enhancing technologies; Semantics and pragmatics of multimedia data; Metadata for multimedia information systems. Overall we received 56 submissions. After careful evaluation, 16 papers have been selected as long papers, 17 papers as short papers, 5 papers as position papers, and 3 papers for presentation of perspective challenges. We thank all colleagues for their support of this issue of the EJC conference, especially the program committee, the organising committee, and the programme coordination team. The long and the short papers presented at the conference are revised after the conference and published in the series “Frontiers in Artificial Intelligence” by IOS Press (Amsterdam). The books “Information Modelling and Knowledge Bases” are edited by the Editing Committee of the conference. We believe that the conference will be productive and fruitful in advancing the research and application of information modelling and knowledge bases. Bernhard Thalheim, Hannu Jaakkola, Yasushi Kiyok
