
    Optimizing Ontology Alignments through NSGA-II without Using Reference Alignment

    Ontology is widely used to solve data heterogeneity problems on the semantic web, but the available ontologies can themselves introduce heterogeneity. To reconcile these ontologies and achieve semantic interoperability, we need to find the relationships among the entities of the various ontologies; the process of identifying them is called ontology alignment. All existing matching systems that use evolutionary approaches to optimize their parameters require a reference alignment between the two ontologies to be given in advance, which can be very expensive to obtain, especially when the ontologies are large. To address this issue, this paper proposes a novel approach that uses NSGA-II to optimize ontology alignments without a reference alignment. In our approach, an adaptive aggregation strategy is presented to improve the efficiency of the optimization process, and two approximate evaluation measures, match coverage and match ratio, are introduced to replace the classic recall and precision on a reference alignment when evaluating alignment quality. Experimental results show that our approach is effective: it finds solutions very close to those obtained by approaches that use a reference alignment, and the quality of its alignments is generally better than that of state-of-the-art ontology matching systems such as GOAL and SAMBO.
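    The abstract does not give the exact formulas for match coverage and match ratio; the Python sketch below uses simplified, hypothetical definitions, coverage as the fraction of entities involved in at least one correspondence and ratio as correspondences per matched entity, purely to illustrate how reference-free measures can stand in for recall and precision.

```python
# Illustrative sketch only: simplified, hypothetical definitions of the
# two reference-free measures named in the abstract; the paper's exact
# formulas may differ. A correspondence is (source, target, confidence).

def match_coverage(alignment, source_entities, target_entities):
    """Fraction of all entities that take part in at least one correspondence."""
    matched = {s for s, t, _ in alignment} | {t for s, t, _ in alignment}
    total = len(source_entities) + len(target_entities)
    return len(matched) / total if total else 0.0

def match_ratio(alignment):
    """Correspondences per matched entity; values near 0.5 suggest a
    mostly one-to-one alignment, larger values suggest multi-mappings."""
    matched = {s for s, t, _ in alignment} | {t for s, t, _ in alignment}
    return len(alignment) / len(matched) if matched else 0.0

alignment = [("Person", "Human", 0.9), ("Car", "Automobile", 0.8)]
print(match_coverage(alignment, ["Person", "Car", "Dog"], ["Human", "Automobile"]))
print(match_ratio(alignment))
```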

    Comparison of ontology alignment systems across single matching task via the McNemar's test

    Ontology alignment is widely used to find the correspondences between different ontologies in diverse fields. Once the alignments are discovered, several performance scores are available to evaluate them. These scores typically require the identified alignment and a reference containing the actual underlying correspondences of the given ontologies. The current trend in alignment evaluation is to put forward a new score (e.g., precision, weighted precision, etc.) and to compare various alignments by juxtaposing the obtained scores. However, selecting one measure among others for comparison is contentious, and the claim that one system performs better than another cannot be substantiated solely by comparing two scalars. In this paper, we propose statistical procedures that make it possible to favor one system over another on theoretical grounds. McNemar's test is the statistical means by which two ontology alignment systems are compared over one matching task. The test applies to a 2×2 contingency table, which can be constructed from the alignments in two different ways, each with its own merits and pitfalls. The ways of constructing the contingency table and various apposite statistics from McNemar's test are elaborated in detail. When more than two alignment systems are compared, family-wise error is expected to occur, so ways of preventing such error are also discussed. A directed graph visualizes the outcome of McNemar's test in the presence of multiple alignment systems; from this graph, it is readily understood whether one system is better than another or whether their differences are imperceptible. The proposed statistical methodologies are applied to the systems that participated in the OAEI 2016 anatomy track, and several well-known similarity metrics are also compared on the same matching problem.
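    For concreteness, here is a minimal Python sketch of one common way to run such a comparison: score each reference correspondence as found or missed by each system, build the 2×2 table from the off-diagonal disagreements, and apply the continuity-corrected McNemar statistic. The abstract mentions two table-construction variants; this shows only one, with invented data.

```python
# Minimal sketch: McNemar's test for two alignment systems over one
# matching task, using the continuity-corrected chi-square statistic.
from scipy.stats import chi2

def mcnemar(correct_a, correct_b):
    """Paired 0/1 outcomes -> (statistic, p-value)."""
    n01 = sum(1 for a, b in zip(correct_a, correct_b) if a and not b)
    n10 = sum(1 for a, b in zip(correct_a, correct_b) if not a and b)
    if n01 + n10 == 0:
        return 0.0, 1.0  # the systems agree on every correspondence
    stat = (abs(n01 - n10) - 1) ** 2 / (n01 + n10)
    return stat, chi2.sf(stat, df=1)

# Invented data: 1 = the system found this reference correspondence.
sys_a = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
sys_b = [1, 0, 0, 1, 1, 0, 1, 0, 0, 0]
stat, p = mcnemar(sys_a, sys_b)
print(f"chi2 = {stat:.3f}, p = {p:.3f}")  # small p favors one system
```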

    Semantic Biclustering

    This thesis focuses on the problem of finding interpretable and predictive patterns, expressed in the form of biclusters, with an orientation to biological data. The presented methods are collectively called semantic biclustering, a subfield of data mining. The term is used here because it reflects both the process of finding coherent subsets of rows and columns, i.e. biclusters, in a 2-dimensional binary matrix and the simultaneous consideration of the mutual semantic meaning of the elements in such biclusters. Although the work was motivated by biologically oriented data, the developed algorithms are generally applicable to any other research field; the only limitations concern the format of the input data. The thesis introduces two novel, and in this context foundational, approaches for finding semantic biclusters, namely Bicluster enrichment analysis and Rule and tree learning. Since these methods do not exploit the native hierarchical order of terms in the input ontologies, their run-time is generally long, and an induced hypothesis may contain redundant terms. For this reason, a new refinement operator was invented.
    The refinement operator was incorporated into the well-known CN2 algorithm and introduces two reduction procedures: Redundant Generalization and Redundant Non-potential. Both procedures help to dramatically prune the rule search space and consequently speed up the entire process of rule induction in comparison with the traditional refinement operator as originally presented in CN2. The entire algorithm, together with the reduction procedures, was published as an R package that we called sem1R. To show a possible practical usage of semantic biclustering on real biological problems, the thesis also describes and specifically adapts the sem1R algorithm for two tasks. Firstly, we studied a practical application of the sem1R algorithm in an analysis of E3 ubiquitin ligases in the gastrointestinal tract with respect to tissue regeneration potential. Secondly, besides discovering biclusters in gene expression data, we adapted the sem1R algorithm for a different task, namely finding potentially pathogenic genetic variants in a cohort of patients.
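    The abstract does not define the refinement operator itself; the Python sketch below only illustrates the general idea of redundancy-aware refinement over an is-a hierarchy: a candidate term is pruned when it is an ancestor or descendant of a term already in the rule, so semantically redundant conjunctions are never generated. The hierarchy, names, and pruning rule are hypothetical, not the sem1R implementation.

```python
# Hypothetical sketch of redundancy-aware refinement in the spirit of
# the Redundant Generalization procedure; not the sem1R code.
PARENTS = {  # toy is-a hierarchy: term -> direct parents
    "apoptotic_process": ["biological_process"],
    "cell_death": ["biological_process"],
    "apoptosis_regulation": ["apoptotic_process"],
}

def ancestors(term):
    """All transitive parents of a term in the toy hierarchy."""
    seen, stack = set(), [term]
    while stack:
        for p in PARENTS.get(stack.pop(), []):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def refine(rule, candidates):
    """Extend a rule (a set of terms) by one term, skipping candidates
    that are ancestors or descendants of terms already in the rule."""
    for c in candidates:
        if not any(c == t or c in ancestors(t) or t in ancestors(c)
                   for t in rule):
            yield rule | {c}

for r in refine({"apoptotic_process"},
                ["biological_process", "apoptosis_regulation", "cell_death"]):
    print(r)  # only {'apoptotic_process', 'cell_death'} survives pruning
```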

    Asset selection and optimisation for robotic assembly cell reconfiguration

    With the development of Industry 4.0, the manufacturing industry has changed dramatically, and product manufacturing has become increasingly customized. This trend is enabled by innovative techniques such as the reconfigurable manufacturing system, which is designed from the outset for rapid change in its structure, as well as in its software and hardware components, so that it can respond quickly to market changes. Robots are important in these systems because they provide the agility and precision required to adapt rapidly to new manufacturing processes and customization demands. Applying robots in these systems, however, raises several challenges. First, data comes from multiple sources, such as technical manuals and sensor data. Second, robot applications must react quickly to ever-changing process requirements to meet customers' requirements. Third, further optimization, especially layout optimization, is needed to ensure production efficiency after adaptation to the current process requirements. To address these challenges, this doctoral thesis presents a framework for reconfiguring robotic assembly cells in manufacturing. The framework consists of three parts: the experience databank, a methodology for optimal manufacturing asset selection, and a methodology for layout optimization. The experience databank confronts the challenge of assimilating and processing heterogeneous data from numerous manufacturing sources through a vendor-neutral ontology model. This model is specifically designed to encapsulate information about robotic assembly cells and is subsequently applied to a knowledge graph. The resulting knowledge graph, constituting the experience databank, facilitates the effective organization and interpretation of the diverse data. The optimal manufacturing asset selection methodology adapts to shifting process and product requirements by identifying potential assets and then evaluating them. It integrates a modular evaluation framework that considers multiple criteria, such as cost, energy consumption, and robot maneuverability, ensuring that the selection process remains robust under changing market demands and product requirements. A scalable methodology for layout optimization within the reconfigurable robotic assembly cells resolves the need for further optimization post-adaptation. It introduces a scalable, multi-decision modular optimization framework that combines a simulation environment, an optimization environment, and robust optimization algorithms. This strategy uses the insights gathered in the experience databank to support informed decision-making, enabling the robotic assembly cells not only to meet immediate production exigencies but also to keep pace with the evolving dynamics of the manufacturing landscape. The validation of the three methodologies encompasses both software development and practical application through three distinct use cases. For the experience databank, an interface was developed using Protégé, Neo4j, and Py2neo, allowing for the effective organization and processing of varied manufacturing data. The programming interface for the asset selection methodology was built in Python, integrating with the experience databank via Py2neo and Neo4j to facilitate dynamic and informed decision-making in asset selection.
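    As a rough illustration of the Py2neo/Neo4j integration described above, the sketch below stores a toy robotic-cell asset in a knowledge graph and queries it against a process requirement. The node labels, property names, and credentials are invented for illustration and are not the thesis's actual ontology model.

```python
# Hypothetical sketch: persisting and querying robotic-cell assets in
# Neo4j via py2neo. Labels, properties, and credentials are invented.
from py2neo import Graph, Node, Relationship

graph = Graph("bolt://localhost:7687", auth=("neo4j", "password"))

cell = Node("AssemblyCell", name="Cell-A")
robot = Node("Robot", name="UR10e", payload_kg=12.5, reach_mm=1300)
gripper = Node("EndEffector", name="2F-85", stroke_mm=85)

graph.create(Relationship(robot, "INSTALLED_IN", cell))
graph.create(Relationship(gripper, "MOUNTED_ON", robot))

# Find robots that satisfy a process requirement (payload >= 10 kg).
rows = graph.run(
    "MATCH (r:Robot)-[:INSTALLED_IN]->(c:AssemblyCell) "
    "WHERE r.payload_kg >= $p RETURN r.name AS robot, c.name AS cell",
    p=10,
).data()
print(rows)
```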
    For the layout optimization framework, two different applications were developed to demonstrate its scalability and adaptability. The first, combining Python and C# programming with Siemens Tecnomatix Process Simulate, is geared towards optimizing layouts involving multiple machines. The second uses Python alongside the RoboDK API and RoboDK software, and is tailored to layout optimization in scenarios involving a single robot. Complementing these software developments, the methodologies were further validated through three use cases, each addressing a unique aspect of the framework. Use Case 1 implemented asset selection and system layout optimization based on a single objective, leveraging the experience databank: the required assets were selected, and the cycle time for executing the whole robotic assembly operation was reduced by 15.6%, from 47.17 seconds to 39.83 seconds. Use Case 2 extended the layout optimization to single-robot operations with an emphasis on multi-criteria decision-making. Energy consumption was minimized to 5613.59 Wh, an 8.9% reduction against the baseline of 6164.98 Wh. Cycle time was reduced by 6.0%, from a baseline of 57.11 seconds to 53.15 seconds. The robot maneuverability index was increased by 140.8%, from a baseline of 0.4891235 to a maximized value of 1.1786125. Lastly, Use Case 3 tested the modular, multi-objective asset selection methodology, demonstrating its efficacy across diverse operational scenarios. Evaluations conducted with two multi-objective optimization algorithms, Non-Dominated Sorting Genetic Algorithm II and Strength Pareto Evolutionary Algorithm II, revealed practical implications for selecting and optimizing robotic assets in response to new customer requests. Specifically, Strength Pareto Evolutionary Algorithm II identified a Pareto solution that was more cost-effective (£20,920) than that of Non-Dominated Sorting Genetic Algorithm II (£21,090) while maintaining a competitive specification efficiency score (0.865 vs. 0.879). Strength Pareto Evolutionary Algorithm II is therefore preferred for scenarios prioritizing cost, whereas Non-Dominated Sorting Genetic Algorithm II is the more suitable choice when the requirement shifts towards maximizing specification efficiency. These use cases not only showcased the practical applicability of the developed software but also underlined the robustness and adaptability of the proposed methodologies in real-world manufacturing environments. In conclusion, this doctoral thesis presents a methodology for reconfiguring robotic assembly cells in manufacturing. By harnessing artificial intelligence, knowledge graphs, and simulation methodologies, it addresses the challenges of processing data from diverse sources, adapting to fluctuating market demands, and establishing further optimizations for enhanced operational efficiency in the modern manufacturing landscape.
    To affirm the viability of this framework, the thesis integrates software development procedures tailored to the proposed methodologies and furnishes evidence through three use cases, which are evaluated against well-defined criteria.
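    The Use Case 3 comparison above rests on Pareto dominance over the cost and specification-efficiency objectives; the short Python sketch below shows that dominance check applied to the two reported solutions plus one invented, clearly inferior candidate.

```python
# Minimal sketch of Pareto dominance for the two Use Case 3 objectives:
# minimize cost, maximize specification efficiency. The third candidate
# is invented; the first two values are those reported in the abstract.
def dominates(a, b):
    """a, b = (cost, spec_efficiency); lower cost, higher efficiency."""
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better

solutions = {
    "SPEA2": (20920, 0.865),
    "NSGA-II": (21090, 0.879),
    "invented_candidate": (21500, 0.850),  # worse on both objectives
}
front = [name for name, s in solutions.items()
         if not any(dominates(o, s) for o in solutions.values() if o != s)]
print(front)  # SPEA2 and NSGA-II are mutually non-dominated
```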

    A Step Toward Improving Healthcare Information Integration & Decision Support: Ontology, Sustainability and Resilience

    The healthcare industry is a complex system with numerous stakeholders, including patients, providers, insurers, and government agencies. To improve healthcare quality and population well-being, there is a growing need to leverage data and information technology (IT) to support better decision-making. Healthcare information systems (HIS) are developed to store, process, and disseminate healthcare data. One of the main challenges with HIS is effectively managing large amounts of data to support decision-making, which requires integrating data from disparate sources such as electronic health records, clinical trials, and research databases. Ontology is one approach to addressing this challenge; however, understanding ontology in the healthcare domain is complex and difficult. Another challenge is to use HIS for scheduling and resource allocation in a sustainable and resilient way that meets multiple conflicting objectives. This is especially important in times of crisis, when demand for resources may be high and supply may be limited. This research thesis aims to explore ontology theory and develop a methodology for constructing HIS that can effectively support better decision-making in scheduling and resource allocation while considering system resiliency and social sustainability. The objectives of the thesis are: (1) studying the theory of ontology in healthcare data and developing a deep model for constructing HIS; (2) advancing our understanding of healthcare system resiliency and social sustainability; (3) developing a methodology for scheduling with multiple objectives; and (4) developing a methodology for resource allocation with multiple objectives. The following conclusions can be drawn from the research results: (1) a data model with rich semantics and easy data integration can be created given a clearer definition of the scope and applicability of ontology; (2) a healthcare system's resilience and sustainability can be significantly increased by the suggested design principles; (3) through careful consideration of both efficiency and patients' experiences, together with a novel optimization algorithm, scheduling can be made more accessible to patients; (4) a systematic approach to evaluating efficiency, sustainability, and resilience enables the simultaneous optimization of all three criteria at the system design stage, leading to more efficient distributions of resources and locations for healthcare facilities. The contributions of the thesis can be summarized as follows. Scientifically, this work has expanded our knowledge of ontology and data modelling, as well as our comprehension of healthcare system resilience and sustainability. Technologically and methodologically, it has advanced the state of knowledge in system modelling and decision-making. Overall, this thesis examines the characteristics of healthcare systems from a systems viewpoint. Three ideas in this thesis (the ontology-based data modelling approach, the multi-objective optimization models, and the algorithms for solving them) can be adapted and used to affect different aspects of disparate systems.
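    As a rough illustration of ontology-based data modelling for healthcare integration, the Python sketch below defines a toy OWL model with the owlready2 library and populates it from two notional sources; all class and property names are invented and are not the thesis's actual model.

```python
# Hypothetical sketch: a toy healthcare ontology with owlready2,
# showing how records from disparate sources can share one semantic
# model. Classes and properties are invented for illustration only.
from owlready2 import get_ontology, Thing, ObjectProperty, DataProperty

onto = get_ontology("http://example.org/his.owl")

with onto:
    class Patient(Thing): pass
    class Provider(Thing): pass
    class ClinicalTrial(Thing): pass

    class treated_by(ObjectProperty):
        domain = [Patient]; range = [Provider]

    class enrolled_in(ObjectProperty):
        domain = [Patient]; range = [ClinicalTrial]

    class has_mrn(DataProperty):  # medical record number
        domain = [Patient]; range = [str]

# An EHR record and a trial roster reconcile under the shared model.
p = Patient("p001")
p.has_mrn = ["MRN-42"]
p.treated_by = [Provider("dr_smith")]
p.enrolled_in = [ClinicalTrial("trial_07")]
onto.save(file="his.owl")
```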

    Real-time Multi-scale Smart Energy Management and Optimisation (REMO) for buildings and their district

    Energy management systems in buildings and their districts today use automation systems and artificial intelligence (AI) solutions for smart energy management, but they fail to achieve the desired results due to a lack of holistic and optimised decision-making. One reason for this is a silo-oriented approach to decision-making that fails to consider cross-domain data. Ontologies, as a way of processing domain knowledge, have been increasingly applied to different domains, using formal and explicit knowledge representation to support smart decision-making. In this PhD research, the Real-time Multi-scale Smart Energy Management and Optimisation (REMO) ontology was developed as a cross-domain knowledge base that can support holistic real-time energy management in districts, considering both demand-side and supply-side optimisation. The ontology is also presented as the core of a proposed framework that facilitates the running of AI solutions and automation systems, aiming to minimise energy use, emissions, and costs while maintaining comfort for users. The state-of-the-art AI solutions for prediction and optimisation were identified through the author's involvement in European Union research projects. The AI techniques were independently validated through action research and achieved roughly a 30-40% reduction in the energy demand of the buildings and a 36% reduction in carbon emissions through optimisation of the generation mix in the district. The research also establishes a way to capture the generic knowledge behind AI models in ontologies through rule axiom features, meaning this knowledge can be reused to replicate the AI models at future sites. Both semantic and syntactic validation were performed on the ontology before demonstrating how it supports the various use cases of the framework for holistic energy management. Further development of the framework is recommended so that it can facilitate real-time energy management and optimisation in buildings and their districts.
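    The "rule axiom features" mentioned above suggest SWRL-style rules attached to the ontology. A minimal sketch of how one such rule might be encoded with the owlready2 library follows; the classes, property, threshold, and rule are invented for illustration and are not the REMO ontology itself.

```python
# Hypothetical sketch: encoding a rule axiom with owlready2's SWRL
# support. All names and the rule body are invented; REMO's actual
# rules are not reproduced here.
from owlready2 import get_ontology, Thing, DataProperty, Imp

onto = get_ontology("http://example.org/remo_demo.owl")

with onto:
    class Zone(Thing): pass
    class Overheated(Zone): pass

    class has_temperature(DataProperty):
        domain = [Zone]; range = [float]

    # Rule axiom: a zone warmer than 26 C is classified as Overheated.
    rule = Imp()
    rule.set_as_rule(
        "Zone(?z), has_temperature(?z, ?t), greaterThan(?t, 26.0) "
        "-> Overheated(?z)"
    )

office = Zone("office_1")
office.has_temperature = [27.5]
# Running a reasoner bundled with owlready2 (e.g. sync_reasoner_pellet)
# would now reclassify office_1 as Overheated.
```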

    Assessing the Quality of Mobile Graphical User Interfaces Using Multi-Objective Optimization

    Aesthetic defects are violations of quality attributes that are symptoms of bad interface design and programming decisions. They deteriorate the perceived usability of mobile user interfaces and negatively impact the User eXperience (UX) with the mobile app. Most existing studies relied on a subjective evaluation of aesthetic defects based on end-users' feedback, which makes the manual evaluation of mobile user interfaces human-centric, time-consuming, and error-prone. Therefore, recent studies have focused on defining mathematical formulas, each targeting a specific structural quality of the interface. As the UX is tightly dependent on the user profile, the combination and calibration of quality attributes, formulas, and user characteristics when defining a defect is not straightforward. In this context, we propose a fully automated framework that combines quality attributes from the literature with the user's profile to identify aesthetic defects of mobile user interfaces (MUIs). More precisely, we treat mobile user interface evaluation as a multi-objective optimization problem whose goal is to maximize the number of detected violations while minimizing the complexity of the detection rules and enhancing the interface's overall quality.
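    A minimal sketch of how such a bi-objective search could be set up with the pymoo library follows: a binary vector encodes which structural metrics a detection rule uses, one objective maximizes detected violations, and the other minimizes rule complexity. The data and both objective formulas are invented placeholders, since the abstract does not specify them.

```python
# Hypothetical sketch: defect-rule search as a bi-objective problem
# solved with pymoo's NSGA-II. Data and objectives are placeholders.
import numpy as np
from pymoo.core.problem import ElementwiseProblem
from pymoo.algorithms.moo.nsga2 import NSGA2
from pymoo.operators.sampling.rnd import BinaryRandomSampling
from pymoo.operators.crossover.pntx import TwoPointCrossover
from pymoo.operators.mutation.bitflip import BitflipMutation
from pymoo.optimize import minimize

rng = np.random.default_rng(0)
n_metrics, n_interfaces = 12, 50
# Invented data: violations[i, j] = 1 if metric j flags interface i.
violations = rng.integers(0, 2, size=(n_interfaces, n_metrics))

class RuleSearch(ElementwiseProblem):
    def __init__(self):
        super().__init__(n_var=n_metrics, n_obj=2, xl=0, xu=1, vtype=bool)

    def _evaluate(self, x, out, *args, **kwargs):
        mask = np.asarray(x, dtype=bool)
        detected = int(np.any(violations[:, mask], axis=1).sum())
        # Objective 1: maximize detections (negated for minimization);
        # objective 2: minimize rule complexity (number of metrics used).
        out["F"] = [-detected, int(mask.sum())]

res = minimize(
    RuleSearch(),
    NSGA2(pop_size=40, sampling=BinaryRandomSampling(),
          crossover=TwoPointCrossover(), mutation=BitflipMutation()),
    ("n_gen", 30), seed=1, verbose=False,
)
print(res.F)  # Pareto front over (-detected violations, rule size)
```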