    Authorization schema for electronic health-care records: for Uganda

    This thesis discusses how to design an authorization schema that ensures each patient's data privacy within a hospital information system.

    Head to head: Semantic similarity of multi-word terms

    Terms are linguistic signifiers of domain-specific concepts. Semantic similarity between terms refers to the corresponding distance in the conceptual space. In this study, we use lexico-syntactic information to define a vector space representation in which cosine similarity closely approximates semantic similarity between the corresponding terms. Given a multi-word term, each word is weighted in terms of its defining properties. In this context, the head noun is given the highest weight. Other words are weighted depending on their relations to the head noun. We formalized the problem as that of determining a topological ordering of a directed acyclic graph, which is based on constituency and dependency relations within a noun phrase. To counteract the errors associated with automatically inferred constituency and dependency relations, we implemented a heuristic approach to approximating the topological ordering. Different weights are assigned to different words based on their positions. Clustering experiments performed on such a vector space representation showed considerable improvement over the conventional bag-of-words representation. Specifically, it more consistently reflected semantic similarity between the terms. This was established by analyzing the differences between automatically generated dendrograms and manually constructed taxonomies. In conclusion, our method can be used to semi-automate taxonomy construction.
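
    The weighting idea above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes English noun phrases whose head noun is the last word, and it replaces the heuristic topological ordering with a simple distance-from-head decay. The names term_vector, bag_of_words, HEAD_WEIGHT and DECAY are hypothetical.

```python
import math

# Assumed values: the abstract only says the head noun gets the highest weight.
HEAD_WEIGHT = 1.0
DECAY = 0.5  # hypothetical factor applied per step away from the head noun


def term_vector(term: str) -> dict[str, float]:
    """Weight each word by its distance from the (assumed final) head noun."""
    words = term.lower().split()
    vec: dict[str, float] = {}
    for dist, word in enumerate(reversed(words)):
        vec[word] = vec.get(word, 0.0) + HEAD_WEIGHT * DECAY ** dist
    return vec


def bag_of_words(term: str) -> dict[str, float]:
    """Conventional unweighted representation, for comparison."""
    return {word: 1.0 for word in term.lower().split()}


def cosine(u: dict[str, float], v: dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


a, b, c = "white blood cell", "red blood cell", "blood cell count"
# Head-weighted vectors rank the term sharing the head noun ("cell") as closer.
weighted_ab = cosine(term_vector(a), term_vector(b))
weighted_ac = cosine(term_vector(a), term_vector(c))
# Bag-of-words cannot separate the two: both pairs share two of three words.
bow_ab = cosine(bag_of_words(a), bag_of_words(b))
bow_ac = cosine(bag_of_words(a), bag_of_words(c))
```

    In this toy example the weighted representation places "white blood cell" nearer to "red blood cell" than to "blood cell count", whereas the unweighted one scores both pairs identically.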

    A critical review of PASBio's argument structures for biomedical verbs

    BACKGROUND: Propositional representations of biomedical knowledge are a critical component of most aspects of semantic mining in biomedicine. However, the proper set of propositions has yet to be determined. Recently, the PASBio project proposed a set of propositions and argument structures for biomedical verbs. This initial set of representations presents an opportunity for evaluating the suitability of predicate-argument structures as a scheme for representing verbal semantics in the biomedical domain. Here, we quantitatively evaluate several dimensions of the initial PASBio propositional structure repository. RESULTS: We propose a number of metrics and heuristics related to arity, role labelling, argument realization, and corpus coverage for evaluating large-scale predicate-argument structure proposals. We evaluate the metrics and heuristics by applying them to PASBio 1.0. CONCLUSION: PASBio demonstrates the suitability of predicate-argument structures for representing aspects of the semantics of biomedical verbs. Metrics related to theta-criterion violations and to the distribution of arguments are able to detect flaws in semantic representations, given a set of predicate-argument structures and a relatively small corpus annotated with them.

    Ontology Enrichment from Free-text Clinical Documents: A Comparison of Alternative Approaches

    While the biomedical informatics community widely acknowledges the utility of domain ontologies, there remain many barriers to their effective use. One important requirement of domain ontologies is that they achieve a high degree of coverage of the domain concepts and concept relationships. However, the development of these ontologies is typically a manual, time-consuming, and often error-prone process. Limited resources result in missing concepts and relationships, as well as difficulty in updating the ontology as domain knowledge changes. Methodologies developed in the fields of Natural Language Processing (NLP), Information Extraction (IE), Information Retrieval (IR), and Machine Learning (ML) provide techniques for automating the enrichment of ontologies from free-text documents. In this dissertation, I extended these methodologies into biomedical ontology development. First, I reviewed existing methodologies and systems developed in the fields of NLP, IR, and IE, and discussed how existing methods can benefit the development of biomedical ontologies. This review, which had not been conducted before, was published in the Journal of Biomedical Informatics. Second, I compared the effectiveness of three methods from two different approaches, the symbolic (the Hearst method) and the statistical (the Church and Lin methods), using clinical free-text documents. Third, I developed a methodological framework for Ontology Learning (OL) evaluation and comparison. This framework permits evaluation of both types of OL approach, covering all three OL methods. The significance of this work is as follows: 1) The results from the comparative study showed the potential of these methods for biomedical ontology enrichment. For the two targeted domains (NCIT and RadLex), the Hearst method yielded average new-concept acceptance rates of 21% and 11%, respectively. The Lin method produced a 74% acceptance rate for NCIT; the Church method, 53%. As a result of this study (published in the journal Methods of Information in Medicine), many suggested candidates have been incorporated into the NCIT; 2) The evaluation framework is flexible and general enough that it can analyze the performance of ontology enrichment methods for many domains, thus expediting the process of automation and minimizing the likelihood that key concepts and relationships would be missed as domain knowledge evolves.
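
    For readers unfamiliar with the symbolic approach named above, the following is a minimal sketch of one classic Hearst lexico-syntactic pattern ("X such as A, B and C"), which proposes each listed item as a kind of X. It is an illustration only, not the dissertation's system: the function name hearst_such_as is hypothetical, the hypernym is assumed to be the single word before "such as", and the list is assumed to run to the end of the clause. A realistic version would use noun-phrase chunking instead.

```python
import re

# "<hypernym> such as <list>", where the list runs to the end of the clause.
# Assumption: the hypernym is the single word immediately before "such as".
SUCH_AS = re.compile(r"(\w+),? such as ([^.;]+)")

# Separators between list items: commas (optionally followed by and/or)
# or a bare "and"/"or" between two items.
SEPARATOR = re.compile(r",\s*(?:and\s+|or\s+)?|\s+(?:and|or)\s+")


def hearst_such_as(sentence: str) -> list[tuple[str, str]]:
    """Return (hyponym, hypernym) candidate pairs found in the sentence."""
    pairs = []
    for match in SUCH_AS.finditer(sentence):
        hypernym = match.group(1).lower()
        for item in SEPARATOR.split(match.group(2)):
            item = item.strip().lower()
            if item:
                pairs.append((item, hypernym))
    return pairs


# Invented example sentence, not taken from the clinical corpus in the study.
candidates = hearst_such_as(
    "The scan excluded tumors such as glioma, meningioma and lymphoma."
)
```

    Each resulting pair is only a candidate; in the study described above, candidates were subject to an acceptance step before being added to the ontology.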

    Partially automated reconstruction and documentation of software development methods

    Software development is a complex and creative process. In contrast to a typical business process, it tends to be more dynamic and dependent on a number of circumstances. Empirical studies show that companies still do not document their development practices, or if they do, the documentation is not up to date and does not reflect how they really develop software. On the other hand, developers and project managers use various supporting tools in their work, such as issue tracking systems, revision control systems, and document management systems, which capture a vast body of knowledge about how a software development process has been performed. The main objective of this dissertation is to propose an approach that can help companies document their real development practice. Compared to existing approaches, which require substantial effort from project members, our approach extracts information on development practice directly from software repositories. Five companies were studied to identify the information that can be retrieved from software repositories. Based on this, an approach to reconstructing development practice was developed. The approach was evaluated on a real software repository shared by an additional company. The results confirm that software repository information suffices for reconstructing various aspects of the development process, i.e. disciplines, activities, user roles, and artifacts.

    BNAIC 2008: Proceedings of BNAIC 2008, the twentieth Belgian-Dutch Artificial Intelligence Conference

    A Semantic Framework for Declarative and Procedural Knowledge

    In any scientific domain, the full set of data and programs has reached an "-ome" status, i.e. it has grown massively. The original article on the Semantic Web describes the evolution of a Web of actionable information, i.e. information derived from data through a semantic theory for interpreting the symbols. In a Semantic Web, methodologies are studied for describing, managing and analyzing both resources (domain knowledge) and applications (operational knowledge) - without any restriction on what and where they are respectively suitable and available in the Web - as well as for realizing automatic and semantic-driven workflows of Web applications elaborating Web resources. This thesis attempts to provide a synthesis among Semantic Web technologies, Ontology Research, Knowledge and Workflow Management. Such a synthesis is represented by Resourceome, a Web-based framework consisting of two components which strictly interact with each other: an ontology-based and domain-independent knowledge manager system (Resourceome KMS) - relying on a knowledge model where resource and operational knowledge are contextualized in any domain - and a semantic-driven workflow editor, manager and agent-based execution system (Resourceome WMS). The Resourceome KMS and the Resourceome WMS are exploited in order to realize semantic-driven formulations of workflows, where activities are semantically linked to any involved resource. On the whole, combining the use of domain ontologies and workflow techniques, Resourceome provides a flexible domain and operational knowledge organization, a powerful engine for semantic-driven workflow composition, and a distributed, automatic and transparent environment for workflow execution.

    Working Notes from the 1992 AAAI Workshop on Automating Software Design. Theme: Domain Specific Software Design

    The goal of this workshop is to identify different architectural approaches to building domain-specific software design systems and to explore issues unique to domain-specific (vs. general-purpose) software design. Some general issues that cut across the particular software design domain include: (1) knowledge representation, acquisition, and maintenance; (2) specialized software design techniques; and (3) user interaction and user interface.

    A novel and validated agile Ontology Engineering methodology for the development of ontology-based applications

    The goal of this thesis is to investigate the status of Ontology Engineering, highlighting the key issues that still characterize this discipline. Among these issues, the problem of reconciling macro-level methodologies with authoring techniques is pivotal in supporting novice ontology engineers. The latest ontology engineering methodologies leverage the agile paradigm to support collaborative ontology development and deliver efficient ontologies. However, the support these methodologies actually provide, and whether they deliver efficient ontologies, has not yet been investigated. Thus, this work proposes a novel framework for the investigation of agile methodologies, with the objective of identifying the strengths and limitations of each. Building on the findings of this analysis, the thesis introduces a novel agile methodology, AgiSCOnt, aimed at tackling some of the key issues characterizing Ontology Engineering and the weaknesses identified in existing agile approaches. The novel methodology is then put to the test in the development of two new domain ontologies in the field of health: the first is dedicated to patients struggling with dysphagia, while the second addresses patients affected by chronic obstructive pulmonary disease.
