148 research outputs found

    Optimizing Ontology Alignments through NSGA-II without Using Reference Alignment

    Ontologies are widely used to solve data heterogeneity problems on the semantic web, but the available ontologies can themselves introduce heterogeneity. To reconcile these ontologies and achieve semantic interoperability, we need to find the relationships among the entities in different ontologies; the process of identifying them is called ontology alignment. In all existing matching systems that use evolutionary approaches to optimize their parameters, a reference alignment between the two ontologies to be aligned must be given in advance, which can be very expensive to obtain, especially when the ontologies are large. To address this issue, in this paper we propose a novel approach that uses NSGA-II to optimize ontology alignments without a reference alignment. In our approach, an adaptive aggregation strategy is presented to improve the efficiency of the optimization process, and two approximate evaluation measures, namely match coverage and match ratio, are introduced to replace the classic recall and precision on a reference alignment when evaluating the quality of the alignments. Experimental results show that our approach is effective and can find solutions that are very close to those obtained by approaches using a reference alignment, and that the quality of the alignments is in general better than that of state-of-the-art ontology matching systems such as GOAL and SAMBO.
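
As a rough illustration of the bi-objective idea in this abstract, the sketch below filters candidate alignments down to their non-dominated (Pareto) set, which is the ranking step at the core of NSGA-II. It is not the authors' implementation: the abstract does not define match coverage or match ratio, so the score tuples here are hypothetical placeholder values, and both objectives are simply assumed to be maximized.

```python
from typing import List, Tuple

# Hypothetical (match_coverage, match_ratio) pair for one candidate alignment.
# Both values are assumed to be "higher is better".
Score = Tuple[float, float]


def dominates(a: Score, b: Score) -> bool:
    """True if a is at least as good as b on both objectives and strictly better on one."""
    return a[0] >= b[0] and a[1] >= b[1] and (a[0] > b[0] or a[1] > b[1])


def pareto_front(scores: List[Score]) -> List[int]:
    """Return the indices of candidates that no other candidate dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Placeholder scores for four candidate alignments (illustrative values only).
candidates = [(0.9, 0.5), (0.7, 0.8), (0.6, 0.6), (0.9, 0.4)]
front = pareto_front(candidates)
```

A full NSGA-II run would repeat this ranking over successive fronts and add crowding-distance selection, crossover, and mutation; only the dominance check is shown here.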

    Semantic Biclustering

    This thesis focuses on the problem of finding interpretable and predictive patterns, expressed in the form of biclusters, with an orientation to biological data. The presented methods are collectively called semantic biclustering, a subfield of data mining. The term semantic biclustering is used because it covers both the process of finding coherent subsets of rows and columns (biclusters) in a 2-dimensional binary matrix and the semantic meaning of the elements in such biclusters. Although the work was motivated by biological data, the developed algorithms are generally applicable to any other research field; the only requirement concerns the format of the input data. The thesis introduces two novel, and in that context basic, approaches for finding semantic biclusters: Bicluster enrichment analysis and Rule and tree learning. Since these methods do not exploit the native hierarchical order of terms in the input ontologies, their run-time is generally long, and an induced hypothesis may contain redundant terms. For this reason, a new refinement operator was created. The operator was incorporated into the well-known CN2 algorithm and uses two reduction procedures, Redundant Generalization and Redundant Non-potential, both of which help to dramatically prune the rule space and consequently speed up the entire process of rule induction in comparison with the traditional refinement operator as originally presented in CN2. The whole algorithm, together with the reduction procedures, was published as an R package that we called sem1R. To show the practical use of semantic biclustering on real biological problems, the thesis also describes and specifically adapts the sem1R algorithm for two tasks.
Firstly, we studied a practical application of the sem1R algorithm in an analysis of E-3 ubiquitin ligase in the gastrointestinal tract with respect to tissue regeneration potential. Secondly, besides discovering biclusters in gene expression data, we adapted the sem1R algorithm for a different task, namely finding potentially pathogenic genetic variants in a cohort of patients.

    A Step Toward Improving Healthcare Information Integration & Decision Support: Ontology, Sustainability and Resilience

    The healthcare industry is a complex system with numerous stakeholders, including patients, providers, insurers, and government agencies. To improve healthcare quality and population well-being, there is a growing need to leverage data and IT (Information Technology) to support better decision-making. Healthcare information systems (HIS) are developed to store, process, and disseminate healthcare data. One of the main challenges with HIS is effectively managing the large amounts of data to support decision-making. This requires integrating data from disparate sources, such as electronic health records, clinical trials, and research databases. Ontology is one approach to address this challenge. However, understanding ontology in the healthcare domain is complex and difficult. Another challenge is to use HIS on scheduling and resource allocation in a sustainable and resilient way that meets multiple conflicting objectives. This is especially important in times of crisis when demand for resources may be high, and supply may be limited. This research thesis aims to explore ontology theory and develop a methodology for constructing HIS that can effectively support better decision-making in terms of scheduling and resource allocation while considering system resiliency and social sustainability. The objectives of the thesis are: (1) studying the theory of ontology in healthcare data and developing a deep model for constructing HIS; (2) advancing our understanding of healthcare system resiliency and social sustainability; (3) developing a methodology for scheduling with multi-objectives; and (4) developing a methodology for resource allocation with multi-objectives. 
The following conclusions can be drawn from the research results: (1) A data model for rich semantics and easy data integration can be created with a clearer definition of the scope and applicability of ontology; (2) A healthcare system's resilience and sustainability can be significantly increased by the suggested design principles; (3) Through careful consideration of both efficiency and patients' experiences and a novel optimization algorithm, a scheduling problem can be made more patient-accessible; (4) A systematic approach to evaluating efficiency, sustainability, and resilience enables the simultaneous optimization of all three criteria at the system design stage, leading to more efficient distributions of resources and locations for healthcare facilities. The contributions of the thesis can be summarized as follows. Scientifically, this thesis work has expanded our knowledge of ontology and data modelling, as well as our comprehension of the healthcare system's resilience and sustainability. Technologically or methodologically, the work has advanced the state of knowledge for system modelling and decision-making. Overall, this thesis examines the characteristics of healthcare systems from a system viewpoint. Three ideas in this thesis (the ontology-based data modelling approach, multi-objective optimization models, and the algorithms for solving the models) can be adapted and used to affect different aspects of disparate systems.

    Assessing the Quality of Mobile Graphical User Interfaces Using Multi-Objective Optimization

    Aesthetic defects are violations of quality attributes that are symptoms of poor interface design and programming decisions. They degrade the perceived usability of mobile user interfaces and negatively impact the User eXperience (UX) with the mobile app. Most existing studies relied on a subjective evaluation of aesthetic defects based on end-users' feedback, which makes the manual evaluation of mobile user interfaces human-centric, time-consuming, and error-prone. Therefore, recent studies have focused on defining mathematical formulas that each target a specific structural quality of the interface. As the UX is tightly dependent on the user profile, the combination and calibration of quality attributes, formulas, and users' characteristics when defining a defect is not straightforward. In this context, we propose a fully automated framework which combines quality attributes from the literature with the user's profile to identify aesthetic defects of mobile user interfaces (MUI). More precisely, we consider mobile user interface evaluation as a multi-objective optimization problem where the goal is to maximize the number of detected violations while minimizing the complexity of the detection rules and enhancing the interfaces' overall quality in means

    Owl ontology quality assessment and optimization in the cybersecurity domain

    The purpose of this dissertation is to assess the quality of ontologies in patterns perceived in the cybersecurity context. A content analysis between ontologies indicated that there were more pronounced differences in OWL ontologies in the cybersecurity field. Results showed an increase of relevance from expressivity to variability. Additionally, no differences were found in the strategies used in most of the incidents. The ontology background needs to be emphasized to understand the quality of the phenomena. In addition, ontologies are a means of representing an area of knowledge through their semantic structure. They facilitate the search for information and the integration of data from different sources, because they provide a common base that guarantees the coherence of the data, which is categorized and described in a normative way. The unification of information with the world that surrounds us makes it possible to create synergies between entities and relationships. However, cybersecurity is one of the real-world domains where knowledge is uncertain. It is therefore necessary to analyze the challenges of choosing the appropriate representation of unstructured information. Vulnerabilities are identified, but incident response is not an automatic mechanism for understanding and processing unstructured text found on the web.

    An interactive metaheuristic search framework for software service identification from business process models

    In recent years, the Service-Oriented Architecture (SOA) model of computing has become widely used and has provided efficient and agile business solutions in response to inevitable and rapid changes in business requirements. Software service identification is a crucial component in the production of a service-oriented architecture and subsequent successful software development, yet current service identification methods have limitations. For example, service identification methods are either not sufficiently comprehensive to handle the totality of service identification activities, or they lack computational support, or they pay insufficient attention to quality checks of resulting services. To address these limitations, comprehensive computationally intelligent support for software engineers when deriving software services from an organisation’s business process models shows great potential, especially when the impact of human preference on the quality of the resulting solutions can be incorporated. Accordingly, this research attempts to apply interactive metaheuristic search to effectively bridge the gap between business and SOA technology and so increase business agility. A novel, comprehensive framework is introduced that is driven by domain-independent role-based business process models, and uses an interactive metaheuristic search-based service identification approach based on a genetic algorithm, while adhering to SOA principles. Termed BPMiSearch, the framework is composed of three main layers. The first layer is concerned with processing inputs from business process models into search space elements by modelling input data and presenting them at an appropriate level of granularity. The second layer focuses on identifying software services from the specified search space. The third layer refines the resulting services to map the business elements in the resulting candidate services to the corresponding service components.
The proposed BPMiSearch framework has been evaluated by applying it to a healthcare domain case study, specifically, Cancer Care and Registration (CCR) business processes at the King Hussein Cancer Centre, Amman, Jordan. Experiments assess the impact of software engineer interaction on the quality of the outcomes in terms of search effectiveness, efficiency, and level of user satisfaction. Results show that BPMiSearch has rapid search performance to positively support software engineers in the identification of services from role-based business process models while adhering to SOA principles. High-quality services are identified that might not have been arrived at manually by software engineers. Furthermore, it is found that BPMiSearch is sensitive and responsive to software engineer interaction, resulting in a positive level of user trust, acceptance, and satisfaction with the candidate services.

    Real-time Multi-scale Smart Energy Management and Optimisation (REMO) for buildings and their district

    Energy management systems in buildings and their district today use automation systems and artificial intelligence (AI) solutions for smart energy management, but they fail to achieve the desired results due to the lack of holistic and optimised decision-making. A reason for this is the silo-oriented approach to decision-making, which fails to consider cross-domain data. Ontologies, as a new way of processing domain knowledge, have been increasingly applied to different domains, using formal and explicit knowledge representation to conduct smart decision-making. In this PhD research, the Real-time Multiscale Smart Energy Management and Optimisation (REMO) ontology was developed as a cross-domain knowledge base that can be used to support holistic real-time energy management in districts, considering both demand-side and supply-side optimisation. The ontology is also presented as the core of a proposed framework which facilitates the running of AI solutions and automation systems, aiming to minimise energy use, emissions, and costs while maintaining comfort for users. The state-of-the-art AI solutions for prediction and optimisation were identified through the authors' involvement in European Union research projects. The AI techniques were independently validated through action research and achieved a reduction of about 30-40% in the energy demand of the buildings, and a 36% reduction in carbon emissions through optimisation of the generation mix in the district. The research also identifies a smart way to capture the generic knowledge behind AI models in ontologies through rule axiom features, which means this knowledge can be used to replicate these AI models at future sites. Both semantic and syntactic validation were performed on the ontology before demonstrating how the ontology supports the various use cases of the framework for holistic energy management.
Further development of the framework is recommended so that it can facilitate real-time energy management and optimisation in buildings and their district.

    A Digital Twin framework for multi-objective optimization

    This thesis represents the culmination of the MSc civil engineering course at the University of Agder. It aims to define a framework for implementing digital twins in an investment cost/energy consumption optimization process. The methodology applied is a complex software hierarchy. The original dataset rests on randomly generated values of thermal transmittance, which are analysed in IDA ICE simulations and compared to existing materials identified in the Norsk Prisbok for cost estimation. The results are optimized using a combination of Artificial Neural Networks and a multi-objective optimization algorithm, the elitist non-dominated sorting algorithm NSGA-II. The research question this thesis attempts to answer is: How can digital twins be implemented to reduce energy consumption and costs in buildings? This thesis concludes that “A digital twin may be implemented to translate energy consumption and cost-optimization into an easily interpreted result that serves as a foundation for efficient decision-making.” This conclusion is based on the functionality of the various steps in the framework: accuracy of the ANN models, NSGA-II performance, and visual presentation. The thesis presents a functional framework with a high degree of automation. Furthermore, applying the framework to a case study identified a potential reduction in energy consumption of 35% and a reduction in investment costs of 5%.