Avoiding Representation Heterogeneities in Real-World Knowledge Graphs (Vermeidung von Repräsentationsheterogenitäten in realweltlichen Wissensgraphen)

Author: Kalo Jan-Christoph
Publication date: 01/01/2021

Knowledge graphs are repositories providing factual knowledge about entities. They are a great source of knowledge to support modern AI applications for Web search, question answering, digital assistants, and online shopping. Advances in machine learning techniques and the Web's growth have led to colossal knowledge graphs with billions of facts about hundreds of millions of entities collected from a large variety of sources. While integrating independent knowledge sources promises rich information, it inherently leads to heterogeneities in representation due to a large variety of different conceptualizations. Thus, real-world knowledge graphs are threatened in their overall utility. Due to their sheer size, they can hardly be curated manually anymore. Automatic and semi-automatic methods are needed to cope with these vast knowledge repositories. We first address the general topic of representation heterogeneity by surveying the problem throughout various data-intensive fields: databases, ontologies, and knowledge graphs. Different techniques for automatically resolving heterogeneity issues are presented and discussed, while several open problems are identified. Next, we focus on entity heterogeneity. We show that automatic matching techniques may run into quality problems when working in a multi-knowledge graph scenario due to incorrect transitive identity links. We present four techniques that can be used to improve the quality of arbitrary entity matching tools significantly. Concerning relation heterogeneity, we show that synonymous relations in knowledge graphs pose several difficulties in querying. Therefore, we resolve these heterogeneities with knowledge graph embeddings and by Horn rule mining. All methods detect synonymous relations in knowledge graphs with high quality. Furthermore, we present a novel technique for avoiding heterogeneity issues at query time using implicit knowledge storage.
We show that large neural language models are a valuable source of knowledge that is queried similarly to knowledge graphs and that already resolves several heterogeneity issues internally.

Knowledge graphs are an important data source of entity knowledge. They support many modern AI applications, including Web search, automatic question answering, digital assistants, and online shopping. Recent advances in machine learning and the extraordinary growth of the Internet have led to huge knowledge graphs. These often comprise billions of facts about hundreds of millions of entities, frequently drawn from many different sources. While integrating independent knowledge sources can yield a great variety of information, it inherently leads to heterogeneities in knowledge representation. This heterogeneity in the data threatens the practical utility of knowledge graphs. Because of their size, however, knowledge graphs can no longer be cleaned manually. Automatic and semi-automatic methods are therefore frequently needed today. In this thesis, we address the topic of representation heterogeneity. We classify heterogeneity along several dimensions and explain heterogeneity problems in databases, ontologies, and knowledge graphs. We also give a brief overview of various techniques for automatically resolving heterogeneity problems. In the next chapter, we turn to entity heterogeneity. We point out problems that arise in a multi-knowledge-graph scenario due to erroneous transitive links. To solve these problems, we present four techniques that significantly improve the quality of arbitrary entity alignment tools. We show that relation heterogeneity in knowledge graphs can cause problems in query answering. We therefore develop several methods for finding synonymous relations. One method works with high-dimensional knowledge graph embeddings, the other with a rule mining approach. Both methods can detect synonymous relations in knowledge graphs with high quality. In addition, we present a novel technique for avoiding heterogeneity problems that uses an implicit knowledge representation. We show that large neural language models are a valuable source of knowledge that can be queried much like knowledge graphs. Many heterogeneity problems are already resolved within the language model itself, which makes it possible to query heterogeneous knowledge graphs.
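The abstract's point about incorrect transitive identity links can be illustrated with a small sketch. It is not one of the four techniques the thesis presents; it only shows, under the assumption that identity links are closed transitively with a union-find structure, how a single wrong match between two knowledge graphs can merge clusters that were otherwise correct. The entity IDs and the function name `identity_clusters` are hypothetical.

```python
from collections import defaultdict


def identity_clusters(links):
    """Group entity IDs into clusters by transitively closing pairwise identity links."""
    parent = {}

    def find(x):
        # Register unseen IDs as their own root, then walk up to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # union the two clusters

    clusters = defaultdict(set)
    for node in list(parent):
        clusters[find(node)].add(node)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical entity IDs from three knowledge graphs (kgA, kgB, kgC).
correct_links = [
    ("kgA:Paris", "kgB:Paris"),
    ("kgA:Paris_Texas", "kgC:Paris_TX"),
]
one_bad_link = [("kgB:Paris", "kgC:Paris_TX")]  # a single incorrect match

clean = identity_clusters(correct_links)
merged = identity_clusters(correct_links + one_bad_link)

print(len(clean), "clusters without the bad link")
print(len(merged), "cluster(s) once the bad link is closed transitively")
```

In this toy setting, every pairwise matcher could be highly precise on its own, yet the transitive closure still conflates the two cities, which is the kind of quality problem the abstract attributes to the multi-knowledge-graph scenario.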

Digitale Bibliothek Braunschweig

Proceedings of the 1st Computer Science Student Workshop: Koc University Istinye Campus, Istanbul, Turkey, February 21, 2010

Publication venue: Sabancı University
Publication date: 01/01/2010

Sabanci University Research Database

Identification of evolutionary trajectories shared across human betacoronaviruses

Authors: Bowden Thomas A; Castelán-Sánchez Hugo G; Escalera-Zamudio Marina; Gutiérrez Bernardo; Hulswit Ruben JG; Inward Rhys PD; Kosakovsky Pond Sergei L; Martínez de la Viña Natalia; Pybus Oliver G; Thézé Julien; van Dorp Lucy
Publication venue: Oxford University Press (OUP)
Publication date: 23/05/2023

Comparing the evolution of distantly related viruses can provide insights into common adaptive processes related to shared ecological niches. Phylogenetic approaches, coupled with other molecular evolution tools, can help identify mutations informative about adaptation, whilst the structural contextualization of these to functional sites of proteins may help gain insight into their biological properties. Two zoonotic betacoronaviruses capable of sustained human-to-human transmission have caused pandemics in recent times (SARS-CoV-1 and SARS-CoV-2), whilst a third virus (MERS-CoV) is responsible for sporadic outbreaks linked to animal infections. Moreover, two other betacoronaviruses have circulated endemically in humans for decades (HKU1 and OC43). To search for evidence of adaptive convergence between established and emerging betacoronaviruses capable of sustained human-to-human transmission (HKU1, OC43, SARS-CoV-1 and SARS-CoV-2), we developed a methodological pipeline to classify shared non-synonymous mutations as putatively denoting homoplasy (repeated mutations that do not share direct common ancestry) or stepwise evolution (sequential mutations leading towards a novel genotype). In parallel, we look for evidence of positive selection, and draw upon protein structure data to identify potential biological implications. We find 30 candidate mutations, of which four [codon sites 18121 (nsp14/residue 28), 21623 (spike/21), 21635 (spike/25) and 23948 (spike/796); SARS-CoV-2 genome numbering] further display evolution under positive selection and proximity to functional protein regions. Our findings shed light on potential mechanisms underlying betacoronavirus adaptation to the human host and pinpoint common mutational pathways that may occur during establishment of human endemicity.

UCL Discovery

Inference of Evolutionary Forces Acting on Human Biological Pathways.

Authors: Daub J.T.; Dupanloup I.; Excoffier L.; Robinson-Rechavi M.
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2015

Because natural selection is likely to act on multiple genes underlying a given phenotypic trait, we study here the potential effect of ongoing and past selection on the genetic diversity of human biological pathways. We first show that genes included in gene sets are generally under stronger selective constraints than other genes and that their evolutionary response is correlated. We then introduce a new procedure to detect selection at the pathway level based on a decomposition of the classical McDonald-Kreitman test extended to multiple genes. This new test, called 2DNS, detects outlier gene sets and takes into account past demographic effects and evolutionary constraints specific to gene sets. Selective forces acting on gene sets can be easily identified by a mere visual inspection of the position of the gene sets relative to their two-dimensional null distribution. We thus find several outlier gene sets that show signals of positive, balancing, or purifying selection, as well as others showing an ancient relaxation of selective constraints. The principle of the 2DNS test can also be applied to other genomic contrasts. For instance, the comparison of patterns of polymorphisms private to African and non-African populations reveals that most pathways show a higher proportion of nonsynonymous mutations in non-Africans than in Africans, potentially due to different demographic histories and selective pressures.
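For readers unfamiliar with the classical McDonald-Kreitman test that the abstract builds on, the following is a minimal sketch of its usual summary statistics. It is not the 2DNS procedure, whose decomposition and null distribution are not described in this record. The function names, the made-up counts, and the naive pooling of counts across the genes of a set are assumptions for illustration only.

```python
def mk_summary(pn, ps, dn, ds):
    """Return (neutrality index, alpha) for one McDonald-Kreitman table.

    pn, ps: nonsynonymous / synonymous polymorphisms within a species
    dn, ds: nonsynonymous / synonymous fixed differences between species
    """
    if min(ps, dn, ds) <= 0:
        raise ValueError("ps, dn and ds must be positive to form the ratios")
    ni = (pn / ps) / (dn / ds)  # neutrality index: polymorphism ratio over divergence ratio
    alpha = 1.0 - ni            # common estimate of the adaptive share of substitutions
    return ni, alpha


def pooled_counts(genes):
    """Sum (pn, ps, dn, ds) over the genes of a set (naive pooling, an assumption)."""
    return tuple(sum(gene[i] for gene in genes) for i in range(4))


# Made-up counts for a hypothetical two-gene set: (pn, ps, dn, ds) per gene.
gene_set = [(4, 10, 12, 8), (6, 10, 18, 12)]

pn, ps, dn, ds = pooled_counts(gene_set)
ni, alpha = mk_summary(pn, ps, dn, ds)
print(f"NI = {ni:.3f}, alpha = {alpha:.3f}")
```

Under the standard reading, a neutrality index below 1 indicates an excess of nonsynonymous divergence, as expected under positive selection, while a value above 1 indicates an excess of nonsynonymous polymorphism. Significance would normally be assessed with a contingency-table test on the four counts, which is omitted here.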

Serveur académique lausannois

PubMed Central

Digital.CSIC

Bern Open Repository and Information System (BORIS)

Using Artificial Intelligence for the Specification of m-Health and e-Health Systems

Authors: Alwakeel Lyan; Lano Kevin; Tehrani Sobhan Y.; Umar Mohammad
Publication venue: Springer Nature
Publication date: 01/01/2022

Artificial intelligence (AI) techniques such as machine learning (ML) have wide application in medical informatics systems. In this chapter, we employ AI techniques to assist in deriving software specifications of e-Health and m-Health systems from informal requirements statements. We use natural language processing (NLP), optical character recognition (OCR), and machine learning to identify required data and behaviour elements of systems from textual and graphical requirements documents. Heuristic rules are used to extract formal specification models of the systems from these documents. The extracted specifications can then be used as the starting point for automated software production using model-driven engineering (MDE). We illustrate the process using an example of a stroke recovery assistant app and evaluate the techniques on several representative systems.

UCL Discovery

Intrahost Selection Pressures Drive Rapid Dengue Virus Microevolution in Acute Human Infections.

Authors: Balmaseda Angel; Eswarappa Meghana; Harris Eva; Montoya Magelda; Parameswaran Poornima; Trivedi Surbhi Bharat; Wang Chunling
Publication venue: eScholarship, University of California
Publication date: 01/09/2017

Dengue, caused by four dengue virus serotypes (DENV-1 to DENV-4), is a highly prevalent mosquito-borne viral disease in humans. Yet, selection pressures driving DENV microevolution within human hosts (intrahost) remain unknown. We employed a whole-genome segmented amplification approach coupled with deep sequencing to profile DENV-3 intrahost diversity in peripheral blood mononuclear cell (PBMC) and plasma samples from 77 dengue patients. DENV-3 intrahost diversity appears to be driven by immune pressures as well as replicative success in PBMCs and potentially other replication sites. Hotspots for intrahost variation were detected in 59%-78% of patients in the viral Envelope and pre-Membrane/Membrane proteins, which together form the virion surface. Dominant variants at the hotspots arose via convergent microevolution, appear to be immune-escape variants, and were evolutionarily constrained at the macro level due to viral replication defects. Dengue is thus an example of an acute infection in which selection pressures within infected individuals drive rapid intrahost virus microevolution.

eScholarship - University of California