9,880 research outputs found
DeepOnto: A Python Package for Ontology Engineering with Deep Learning
Applying deep learning techniques, particularly language models (LMs), in
ontology engineering has raised widespread attention. However, deep learning
frameworks like PyTorch and Tensorflow are predominantly developed for Python
programming, while widely-used ontology APIs, such as the OWL API and Jena, are
primarily Java-based. To facilitate seamless integration of these frameworks
and APIs, we present Deeponto, a Python package designed for ontology
engineering. The package encompasses a core ontology processing module founded
on the widely-recognised and reliable OWL API, encapsulating its fundamental
features in a more "Pythonic" manner and extending its capabilities to include
other essential components including reasoning, verbalisation, normalisation,
projection, and more. Building on this module, Deeponto offers a suite of
tools, resources, and algorithms that support various ontology engineering
tasks, such as ontology alignment and completion, by harnessing deep learning
methodologies, primarily pre-trained LMs. In this paper, we also demonstrate
the practical utility of Deeponto through two use-cases: the Digital Health
Coaching in Samsung Research UK and the Bio-ML track of the Ontology Alignment
Evaluation Initiative (OAEI).Comment: under review at Semantic Web Journa
Using knowledge graphs to infer gene expression in plants
IntroductionClimate change is already affecting ecosystems around the world and forcing us to adapt to meet societal needs. The speed with which climate change is progressing necessitates a massive scaling up of the number of species with understood genotype-environment-phenotype (GĂEĂP) dynamics in order to increase ecosystem and agriculture resilience. An important part of predicting phenotype is understanding the complex gene regulatory networks present in organisms. Previous work has demonstrated that knowledge about one species can be applied to another using ontologically-supported knowledge bases that exploit homologous structures and homologous genes. These types of structures that can apply knowledge about one species to another have the potential to enable the massive scaling up that is needed through in silico experimentation.MethodsWe developed one such structure, a knowledge graph (KG) using information from Planteome and the EMBL-EBI Expression Atlas that connects gene expression, molecular interactions, functions, and pathways to homology-based gene annotations. Our preliminary analysis uses data from gene expression studies in Arabidopsis thaliana and Populus trichocarpa plants exposed to drought conditions.ResultsA graph query identified 16 pairs of homologous genes in these two taxa, some of which show opposite patterns of gene expression in response to drought. As expected, analysis of the upstream cis-regulatory region of these genes revealed that homologs with similar expression behavior had conserved cis-regulatory regions and potential interaction with similar trans-elements, unlike homologs that changed their expression in opposite ways.DiscussionThis suggests that even though the homologous pairs share common ancestry and functional roles, predicting expression and phenotype through homology inference needs careful consideration of integrating cis and trans-regulatory components in the curated and inferred knowledge graph
Machine Learning Approaches for the Prioritisation of Cardiovascular Disease Genes Following Genome- wide Association Study
Genome-wide association studies (GWAS) have revealed thousands of genetic loci, establishing itself as a valuable method for unravelling the complex biology of many diseases. As GWAS has grown in size and improved in study design to detect effects, identifying real causal signals, disentangling from other highly correlated markers associated by linkage disequilibrium (LD) remains challenging. This has severely limited GWAS findings and brought the methodâs value into question. Although thousands of disease susceptibility loci have been reported, causal variants and genes at these loci remain elusive. Post-GWAS analysis aims to dissect the heterogeneity of variant and gene signals. In recent years, machine learning (ML) models have been developed for post-GWAS prioritisation. ML models have ranged from using logistic regression to more complex ensemble models such as random forests and gradient boosting, as well as deep learning models (i.e., neural networks). When combined with functional validation, these methods have shown important translational insights, providing a strong evidence-based approach to direct post-GWAS research. However, ML approaches are in their infancy across biological applications, and as they continue to evolve an evaluation of their robustness for GWAS prioritisation is needed. Here, I investigate the landscape of ML across: selected models, input features, bias risk, and output model performance, with a focus on building a prioritisation framework that is applied to blood pressure GWAS results and tested on re-application to blood lipid traits
Serving to secure "Global Korea": Gender, mobility, and flight attendant labor migrants
This dissertation is an ethnography of mobility and modernity in contemporary South Korea (the Republic of Korea) following neoliberal restructuring precipitated by the Asian Financial Crisis (1997). It focuses on how comparative âservice,â âsecurity,â and âsafetyâ fashioned âGlobal Koreaâ: an ongoing state-sponsored project aimed at promoting the economic, political, and cultural maturation of South Korea from a once notoriously inhospitable, âbackwardâ country (hujinâguk) to a now welcoming, âadvanced countryâ (sĆnjinâguk). Through physical embodiments of the culturally-specific idiom of âsuperiorâ service (sĆbisĆ), I argue that aspiring, current, and former Korean flight attendants have driven the production and maintenance of this national project.
More broadly, as a driver of this national project, this occupation has emerged out of the countryâs own aspirational flights from an earlier history of authoritarian rule, labor violence, and xenophobia. Against the backdrop of the Korean stateâs aggressive neoliberal restructuring, globalization efforts, and current âHell Chosunâ (HelchosĆn) economy, a group of largely academically and/or class disadvantaged young women have been able secure individualized modes of pleasure, self-fulfillment, and class advancement via what I deem âservice mobilities.â Service mobilities refers to the participation of mostly women in a traditionally devalued but growing sector of the global labor market, the âpink collarâ economy centered around âfeminineâ care labor. Korean female flight attendants share labor skills resembling those of other foreign labor migrants (chiefly from the âGlobal Southâ), who perform care work deemed less desirable. Yet, Korean female flight attendants elude the stigmatizing, classed, and racialized category of âlabor migrant.â Moreover, within the context of South Koreaâs unique history of rapid modernization, the flight attendant occupation also commands considerable social prestige.
Based on ethnographic and archival research on aspiring, current, and former Korean flight attendants, this dissertation asks how these unique care laborers negotiate a metaphorical and literal series of sustained border crossings and inspections between Korean flight attendantsâ contingent status as lowly care-laboring migrants, on the one hand, and ostensibly glamorous, globetrotting elites, on the other. This study contends the following: first, the flight attendant occupation in South Korea represents new politics of pleasure and pain in contemporary East Asia. Second, Korean female flight attendantsâ enactments of soft, sanitized, and glamorous (hwaryĆhada) service help to purify South Koreaâs less savory past. In so doing, Korean flight attendants reconstitute the historical role of female laborers as burden bearers and caretakers of the Korean state.U of I OnlyAuthor submitted a 2-year U of I restriction extension request
Knowledge Graph Building Blocks: An easy-to-use Framework for developing FAIREr Knowledge Graphs
Knowledge graphs and ontologies provide promising technical solutions for
implementing the FAIR Principles for Findable, Accessible, Interoperable, and
Reusable data and metadata. However, they also come with their own challenges.
Nine such challenges are discussed and associated with the criterion of
cognitive interoperability and specific FAIREr principles (FAIR + Explorability
raised) that they fail to meet. We introduce an easy-to-use, open source
knowledge graph framework that is based on knowledge graph building blocks
(KGBBs). KGBBs are small information modules for knowledge-processing, each
based on a specific type of semantic unit. By interrelating several KGBBs, one
can specify a KGBB-driven FAIREr knowledge graph. Besides implementing semantic
units, the KGBB Framework clearly distinguishes and decouples an internal
in-memory data model from data storage, data display, and data access/export
models. We argue that this decoupling is essential for solving many problems of
knowledge management systems. We discuss the architecture of the KGBB Framework
as we envision it, comprising (i) an openly accessible KGBB-Repository for
different types of KGBBs, (ii) a KGBB-Engine for managing and operating FAIREr
knowledge graphs (including automatic provenance tracking, editing changelog,
and versioning of semantic units); (iii) a repository for KGBB-Functions; (iv)
a low-code KGBB-Editor with which domain experts can create new KGBBs and
specify their own FAIREr knowledge graph without having to think about semantic
modelling. We conclude with discussing the nine challenges and how the KGBB
Framework provides solutions for the issues they raise. While most of what we
discuss here is entirely conceptual, we can point to two prototypes that
demonstrate the principle feasibility of using semantic units and KGBBs to
manage and structure knowledge graphs
MulCNN: An efficient and accurate deep learning method based on gene embedding for cell type identification in single-cell RNA-seq data
Advancements in single-cell sequencing research have revolutionized our understanding of cellular heterogeneity and functional diversity through the analysis of single-cell transcriptomes and genomes. A crucial step in single-cell RNA sequencing (scRNA-seq) analysis is identifying cell types. However, scRNA-seq data are often high dimensional and sparse, and manual cell type identification can be time-consuming, subjective, and lack reproducibility. Consequently, analyzing scRNA-seq data remains a computational challenge. With the increasing availability of well-annotated scRNA-seq datasets, advanced methods are emerging to aid in cell type identification by leveraging this information. Deep learning neural networks have great potential for analyzing single-cell data. This paper proposes MulCNN, a multi-level convolutional neural network that uses a unique cell type-specific gene expression feature extraction method. This method extracts critical features through multi-scale convolution while filtering noise. Extensive testing using datasets from various species and comparisons with popular classification methods show that MulCNN has outstanding performance and offers a new and scalable direction for scRNA-seq analysis
Knowledge-based Modelling of Additive Manufacturing for Sustainability Performance Analysis and Decision Making
Additiivista valmistusta on pidetty kÀyttökelpoisena monimutkaisissa geometrioissa, topologisesti optimoiduissa kappaleissa ja kappaleissa joita on muuten vaikea valmistaa perinteisillÀ valmistusprosesseilla. Eduista huolimatta, yksi additiivisen valmistuksen vallitsevista haasteista on ollut heikko kyky tuottaa toimivia osia kilpailukykyisillÀ tuotantomÀÀrillÀ perinteisen valmistuksen kanssa. Mallintaminen ja simulointi ovat tehokkaita työkaluja, jotka voivat auttaa lyhentÀmÀÀn suunnittelun, rakentamisen ja testauksen sykliÀ mahdollistamalla erilaisten tuotesuunnitelmien ja prosessiskenaarioiden nopean analyysin. Perinteisten ja edistyneiden valmistusteknologioiden mahdollisuudet ja rajoitukset mÀÀrittelevÀt kuitenkin rajat uusille tuotekehityksille. Siksi on tÀrkeÀÀ, ettÀ suunnittelijoilla on kÀytettÀvissÀÀn menetelmÀt ja työkalut, joiden avulla he voivat mallintaa ja simuloida tuotteen suorituskykyÀ ja siihen liittyvÀn valmistusprosessin suorituskykyÀ, toimivien korkea arvoisten tuotteiden toteuttamiseksi. Motivaation tÀmÀn vÀitöstutkimuksen tekemiselle on, meneillÀÀn oleva kehitystyö uudenlaisen korkean lÀmpötilan suprajohtavan (high temperature superconducting (HTS)) magneettikokoonpanon kehittÀmisessÀ, joka toimii kryogeenisissÀ lÀmpötiloissa. Sen monimutkaisuus edellyttÀÀ monitieteisen asiantuntemuksen lÀhentymistÀ suunnittelun ja prototyyppien valmistuksen aikana. Tutkimus hyödyntÀÀ tietopohjaista mallinnusta valmistusprosessin analysoinnin ja pÀÀtöksenteon apuna HTS-magneettien mekaanisten komponenttien suunnittelussa. TÀmÀn lisÀksi, tutkimus etsii mahdollisuuksia additiivisen valmistuksen toteutettavuuteen HTS-magneettikokoonpanon tuotannossa. Kehitetty lÀhestymistapa kÀyttÀÀ fysikaalisiin kokeisiin perustuvaa tuote-prosessi-integroitua mallinnusta tuottamaan kvantitatiivista ja laadullista tietoa, joka mÀÀrittelee prosessi-rakenne-ominaisuus-suorituskyky-vuorovaikutuksia tietyille materiaali-prosessi-yhdistelmille. Tuloksina saadut vuorovaikutukset integroidaan kaaviopohjaiseen malliin, joka voi auttaa suunnittelutilan tutkimisessa ja tÀten auttaa varhaisessa suunnittelu- ja valmistuspÀÀtöksenteossa. TÀtÀ varten testikomponentit valmistetaan kÀyttÀmÀllÀ kahta metallin additiivista valmistus prosessia: lankakaarihitsaus additiivista valmistusta (wire arc additive manufacturing) ja selektiivistÀ lasersulatusta (selective laser melting). Rakenteellisissa sovelluksissa yleisesti kÀytetyistÀ metalliseoksista (ruostumaton terÀs, pehmeÀ terÀs, luja niukkaseosteinen terÀs, alumiini ja kupariseokset) testataan niiden mekaaniset, lÀmpö- ja sÀhköiset ominaisuudet. LisÀksi tehdÀÀn metalliseosten mikrorakenteen karakterisointi, jotta voidaan ymmÀrtÀÀ paremmin valmistusprosessin parametrien vaikutusta materiaalin ominaisuuksiin. Integroitu mallinnustapa yhdistÀÀ kerÀtyn kokeellisen tiedon, olemassa olevat analyyttiset ja empiiriset vuorovaikutus suhteet, sekÀ muut tietopohjaiset mallit (esim. elementtimallit, koneoppimismallit) pÀÀtöksenteon tukijÀrjestelmÀn muodossa, joka mahdollistaa optimaalisen materiaalin, valmistustekniikan, prosessiparametrien ja muitten ohjausmuuttujien valinnan, lopullisen 3d-tulosteun komponentin halutun rakenteen, ominaisuuksien ja suorituskyvyn saavuttamiseksi. ValmistuspÀÀtöksenteko tapahtuu todennÀköisyysmallin, eli Bayesin verkkomallin toteuttamisen kautta, joka on vankka, modulaarinen ja sovellettavissa muihin valmistusjÀrjestelmiin ja tuotesuunnitelmiin. VÀitöstyössÀ esitetyn mallin kyky parantaa additiivisien valmistusprosessien suorituskykyÀ ja laatua, tÀten edistÀÀ kestÀvÀn tuotannon tavoitteita.Additive manufacturing (AM) has been considered viable for complex geometries, topology optimized parts, and parts that are otherwise difficult to produce using conventional manufacturing processes. Despite the advantages, one of the prevalent challenges in AM has been the poor capability of producing functional parts at production volumes that are competitive with traditional manufacturing. Modelling and simulation are powerful tools that can help shorten the design-build-test cycle by enabling rapid analysis of various product designs and process scenarios. Nevertheless, the capabilities and limitations of traditional and advanced manufacturing technologies do define the bounds for new product development. Thus, it is important that the designers have access to methods and tools that enable them to model and simulate product performance and associated manufacturing process performance to realize functional high value products. The motivation for this dissertation research stems from ongoing development of a novel high temperature superconducting (HTS) magnet assembly, which operates in cryogenic environment. Its complexity requires the convergence of multidisciplinary expertise during design and prototyping. The research applies knowledge-based modelling to aid manufacturing process analysis and decision making in the design of mechanical components of the HTS magnet. Further, it explores the feasibility of using AM in the production of the HTS magnet assembly. The developed approach uses product-process integrated modelling based on physical experiments to generate quantitative and qualitative information that define process-structure-property-performance interactions for given material-process combinations. The resulting interactions are then integrated into a graph-based model that can aid in design space exploration to assist early design and manufacturing decision-making. To do so, test components are fabricated using two metal AM processes: wire and arc additive manufacturing and selective laser melting. Metal alloys (stainless steel, mild steel, high-strength low-alloyed steel, aluminium, and copper alloys) commonly used in structural applications are tested for their mechanical-, thermal-, and electrical properties. In addition, microstructural characterization of the alloys is performed to further understand the impact of manufacturing process parameters on material properties. The integrated modelling approach combines the collected experimental data, existing analytical and empirical relationships, and other data-driven models (e.g., finite element models, machine learning models) in the form of a decision support system that enables optimal selection of material, manufacturing technology, process parameters, and other control variables for attaining desired structure, property, and performance characteristics of the final printed component. The manufacturing decision making is performed through implementation of a probabilistic model i.e., a Bayesian network model, which is robust, modular, and can be adapted for other manufacturing systems and product designs. The ability of the model to improve throughput and quality of additive manufacturing processes will boost sustainable manufacturing goals
Meta-ontology fault detection
Ontology engineering is the field, within knowledge representation, concerned with using logic-based formalisms to represent knowledge, typically moderately sized knowledge bases called ontologies. How to best develop, use and maintain these ontologies has produced relatively large bodies of both formal, theoretical and methodological research.
One subfield of ontology engineering is ontology debugging, and is concerned with preventing, detecting and repairing errors (or more generally pitfalls, bad practices or faults) in ontologies. Due to the logical nature of ontologies and, in particular, entailment, these faults are often both hard to prevent and detect and have far reaching consequences. This makes ontology debugging one of the principal challenges to more widespread adoption of ontologies in applications.
Moreover, another important subfield in ontology engineering is that of ontology alignment: combining multiple ontologies to produce more powerful results than the simple sum of the parts. Ontology alignment further increases the issues, difficulties and challenges of ontology debugging by introducing, propagating and exacerbating faults in ontologies.
A relevant aspect of the field of ontology debugging is that, due to the challenges and difficulties, research within it is usually notably constrained in its scope, focusing on particular aspects of the problem or on the application to only certain subdomains or under specific methodologies. Similarly, the approaches are often ad hoc and only related to other approaches at a conceptual level. There are no well established and widely used formalisms, definitions or benchmarks that form a foundation of the field of ontology debugging.
In this thesis, I tackle the problem of ontology debugging from a more abstract than usual point of view, looking at existing literature in the field and attempting to extract common ideas and specially focussing on formulating them in a common language and under a common approach. Meta-ontology fault detection is a framework for detecting faults in ontologies that utilizes semantic fault patterns to express schematic entailments that typically indicate faults in a systematic way. The formalism that I developed to represent these patterns is called existential second-order query logic (abbreviated as ESQ logic). I further reformulated a large proportion of the ideas present in some of the existing research pieces into this framework and as patterns in ESQ logic, providing a pattern catalogue.
Most of the work during my PhD has been spent in designing and implementing
an algorithm to effectively automatically detect arbitrary ESQ patterns in arbitrary ontologies. The result is what we call minimal commitment resolution for ESQ logic, an extension of first-order resolution, drawing on important ideas from higher-order unification and implementing a novel approach to unification problems using dependency graphs. I have proven important theoretical properties about this algorithm such as its soundness, its termination (in a certain sense and under certain conditions) and its fairness or completeness in the enumeration of infinite spaces of solutions.
Moreover, I have produced an implementation of minimal commitment resolution for ESQ logic in Haskell that has passed all unit tests and produces non-trivial results on small examples. However, attempts to apply this algorithm to examples of a more realistic size have proven unsuccessful, with computation times that exceed our tolerance levels.
In this thesis, I have provided both details of the challenges faced in this regard,
as well as other successful forms of qualitative evaluation of the meta-ontology fault detection approach, and discussions about both what I believe are the main causes of the computational feasibility problems, ideas on how to overcome them, and also ideas on other directions of future work that could use the results in the thesis to contribute to the production of foundational formalisms, ideas and approaches to ontology debugging that can properly combine existing constrained research. It is unclear to me whether minimal commitment resolution for ESQ logic can, in its current shape, be implemented efficiently or not, but I believe that, at the very least, the theoretical and conceptual underpinnings that I have presented in this thesis will be useful to produce more
foundational results in the field
Identifying and modelling polysemous senses of spatial prepositions in referring expressions
In this paper we analyse the issue of reference using spatial language and examine how the polysemy exhibited by spatial prepositions can be incorporated into semantic models for situated dialogue. After providing a brief overview of polysemy in spatial language and a review of related work, we describe an experimental study we used to collect data on a set of relevant spatial prepositions. We then establish a semantic model in which to integrate polysemy (the Baseline Prototype Model), which we test against a Simple Relation Model and a Perceptron Model. To incorporate polysemy into the baseline model we introduce two methods of identifying polysemes in grounded settings. The first is based on âideal meaningsâ and a modification of the âprincipled polysemyâ framework and the second is based on âobject-specific featuresâ. In order to compare polysemes and aid typicality judgements we then introduce a notion of âpolyseme hierarchyâ. Finally, we test the performance of the polysemy models against the Baseline Prototype Model and Perceptron Model and discuss the improvements shown by the polysemy models
- âŠ