31 research outputs found

Knowledge Discovery and interactive Data Mining in Bioinformatics - State-of-the-Art, future challenges and research directions

Author: Dehmer Matthias
Holzinger Andreas
Jurisica Igor
Publication date: 01/01/2014

Crossref

Springer - Publisher Connector

PubMed Central

TUGraz OPEN Library

Completing and Debugging Ontologies: state of the art and challenges

Author: Lambrix Patrick
Publication date: 02/11/2020

As semantically-enabled applications require high-quality ontologies, developing and maintaining ontologies that are as correct and complete as possible is an important, although difficult, task in ontology engineering. A key step is ontology debugging and completion. In general, there are two steps: detecting defects and repairing defects. In this paper we discuss the state of the art regarding the repairing step. We do this by formalizing the repairing step as an abduction problem and situating the state of the art with respect to this framework. We show that there are still many open research problems and show opportunities for further work and advancing the field. Comment: 56 pages

arXiv.org e-Print Archive

Knowledge Discovery and interactive Data Mining in Bioinformatics - State-of-the-Art, future challenges and research directions

Author: A Blandford
A Holzinger
A Inselberg
A Saad
AB Olshen
AJ Williams
Andreas Holzinger
B Allen
B Shneiderman
BLW Wong
C Patrello
C Ware
CE D'Negri
CE Shannon
CG Begley
CJ Baker
CW Lin
D Butler
D Furniss
D Keim
D Koslicki
D Schroeder
DJ Cook
DR Catchpoole
E Kolker
E Weippl
EH Shortliffe
F Jeanquartier
F Nake
F Wang
G Bell
G Beslon
G Gigerenzer
G Petz
G Singh
H Akil
H Gao
H Hauser
H Hirsh
H Müller
HA Simon
I Fischer
I Jurisica
Igor Jurisica
J Barrera
J Bleiholder
J Zhou
JH Bullard
JP Lee
K Morik
KR Popper
L Hood
M Bloice
M Boisot
M Dehmer
M Dugas
M Elloumi
M Jarke
M Kreuzthaler
M Ouzzani
M Polanyi
M Randic
M Sultan
M Tory
M Viceconti
M Wiltgen
MA Hall
MA Hernández
Matthias Dehmer
ML Lee
ML Raymer
MM Reeder
N Gehlenborg
N Pržulj
P Berka
P Shelokar
P Yildirim
PA Kiberstis
R Beale
R Kosara
R Kruse
R Ponzielli
R Todeschini
S Jurack
S Ranganathan
T Fawcett
T Munzner
U Fayyad
V Dhar
V Garg
V Pascucci
VL Patel
W Aigner
W Kim
W Li
Y Kotseruba
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Geo-L: Topological Link Discovery for Geospatial Linked Data Made Easy

Author: Kirschenbaum Amit
Zinke-Wehlmann Christian
Publication venue: 'MDPI AG'
Publication date: 04/05/2023

Geospatial linked data are an emerging domain, with growing interest in research and industry. There is an increasing number of publicly available geospatial linked data resources, which can also be interlinked and easily integrated with private and industrial linked data on the web. The present paper introduces Geo-L, a system for the discovery of RDF spatial links based on topological relations. Experiments show that the proposed system improves on state-of-the-art spatial linking processes in terms of mapping time and accuracy, as well as resource retrieval efficiency and robustness.

Qucosa - Publikationsserver der Universität Leipzig

Discovering relations between indirectly connected biomedical concepts

Author: A McCallum
B Goertzel
C Knox
D Hristovski
DM Blei
DM Mimno
DR Swanson
FM Suchanek
H Liu
I Pohl
J Nivre
K Ravikumar
L Yao
M Collins
M Craven
ME Vidal
N Lao
P Srinivasan
R Frijters
R Hoffmann
RC Bunescu
S Deerwester
T Cohen
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Dance performance in cyberspace - transfer and transformation

Author: Varanda P.
Publication date: 01/01/2015

The aim of this research undertaking is to understand the potential development of dance performance in the context of cyberculture, by examining the way practitioners use new media to create artworks that include audience participation, and by endeavouring to theorize them. With specific reference to cyberspace as a concept of electronic, networked and navigable space, the enquiry traces the connections such practices have with conventions of the medium of dance, which operate in its widely known condition as a live performing art. However, acknowledging that new media and new contexts of production and reception inform the characteristics of these artworks and their discursive articulation, in terms of the way people and digital technologies interact in contemporary culture, is a major principle of their analysis and evaluation. This qualitative research is based on case-study design as a means of finding pragmatic evidence in particulars, to illustrate abstract concepts, technological processes and aesthetic values that are underway in a new area of knowledge. The field within which this research operates is located by mapping the published literature that informs a theoretical interdisciplinary framework, which contextualizes the interpretation of artworks. The selected case studies have been subject to a process of systematic and detailed analysis, using a model devised for the purpose of this enquiry. From this undertaking it can be claimed that while an extensive array of technologies, media and interactive models is available in this field, the artists pursue a commitment to demonstrate their worth for specifically developing (new media) dance performance, and for dance performance to articulate technological and critical issues for cyberculture studies. The results of this enquiry also contribute to conceptual understanding of what dance can be, today, in the light of technological changes.

Middlesex University Research Repository

Recent advances of wearable antennas in materials, fabrication methods, designs, and their applications: state-of-the-art

Author: Abbasi Qammer H.
Abidin Zuhairiah Zainal
Ali Shahid M.
Imran Muhammad A.
Socheatra Soeung
Sovuthy Cheab
Publication venue: 'MDPI AG'
Publication date: 24/09/2020

The demand for wearable technologies has grown tremendously in recent years. Wearable antennas are used for various applications, in many cases within the context of wireless body area networks (WBAN). In WBAN, the presence of the human body poses a significant challenge to wearable antennas. Specifically, requirements such as structural deformation, precision and accuracy in fabrication methods, and size must be considered on a priority basis in wearable antennas. Various researchers are active in this field and, accordingly, significant progress has been achieved recently. This article attempts to critically review wearable antennas, especially in light of new materials and fabrication methods, novel designs such as miniaturized button antennas and miniaturized single- and multi-band antennas, and their unique smart applications in WBAN. Finally, conclusions are drawn with respect to some future directions.

Enlighten

Federated Query Processing over Heterogeneous Data Sources in a Semantic Data Lake

Author: Endris Kemele M.
Publication venue: Universitäts- und Landesbibliothek Bonn

Data provides the basis for emerging scientific and interdisciplinary data-centric applications with the potential of improving the quality of life for citizens. Big Data plays an important role in promoting both manufacturing and scientific development through industrial digitization and emerging interdisciplinary research. Open data initiatives have encouraged the publication of Big Data by exploiting the decentralized nature of the Web, allowing for the availability of heterogeneous data generated and maintained by autonomous data providers. Consequently, the growing volume of data consumed by different applications raises the need for effective data integration approaches able to process a large volume of data that is represented in different formats, schemas, and models, and which may also include sensitive data, e.g., financial transactions, medical procedures, or personal data. Data Lakes are composed of heterogeneous data sources in their original format, which reduces the overhead of materialized data integration. Query processing over Data Lakes requires the semantic description of data collected from heterogeneous data sources. A Data Lake with such semantic annotations is referred to as a Semantic Data Lake. Transforming Big Data into actionable knowledge demands novel and scalable techniques for enabling not only Big Data ingestion and curation to the Semantic Data Lake, but also efficient large-scale semantic data integration, exploration, and discovery. Federated query processing techniques utilize source descriptions to find relevant data sources and efficient execution plans that minimize the total execution time and maximize the completeness of answers. Existing federated query processing engines employ a coarse-grained description model where the semantics encoded in data sources are ignored.
Such descriptions may lead to the erroneous selection of data sources for a query and unnecessary retrieval of data, thus degrading the performance of the query processing engine. In this thesis, we address the problem of federated query processing against heterogeneous data sources in a Semantic Data Lake. First, we tackle the challenge of knowledge representation and propose a novel source description model, RDF Molecule Templates, which describes the knowledge available in a Semantic Data Lake. RDF Molecule Templates (RDF-MTs) describe data sources in terms of an abstract description of entities belonging to the same semantic concept. Then, we propose a technique for data source selection and query decomposition, the MULDER approach, and query planning and optimization techniques, Ontario, that exploit the characteristics of heterogeneous data sources described using RDF-MTs and provide uniform access to heterogeneous data sources. We then address the challenge of enforcing privacy and access control requirements imposed by data providers. We introduce a privacy-aware federated query technique, BOUNCER, able to enforce privacy and access control regulations during query processing over data sources in a Semantic Data Lake. In particular, BOUNCER exploits RDF-MT-based source descriptions in order to express privacy and access control policies as well as their automatic enforcement during source selection, query decomposition, and planning. Furthermore, BOUNCER implements query decomposition and optimization techniques able to identify query plans over data sources that not only contain the relevant entities to answer a query, but also are regulated by policies that allow for accessing these relevant entities. Finally, we tackle the problem of interest-based update propagation and co-evolution of data sources.
We present a novel approach for interest-based RDF update propagation that consistently maintains a full or partial replication of large datasets and deals with co-evolution.

bonndoc – Der Publikationsserver der Universität Bonn

Cross-Domain information extraction from scientific articles for research knowledge graphs

Author: Brack Arthur
Publication venue: Hannover : Institutionelles Repositorium der Leibniz Universität Hannover
Publication date: 01/01/2022

Today’s scholarly communication is a document-centred process and as such, rather inefficient. Fundamental contents of research papers are not accessible by computers since they are only present in unstructured PDF files. Therefore, current research infrastructures are not able to assist scientists appropriately in their core research tasks. This thesis addresses this issue and proposes methods to automatically extract relevant information from scientific articles for Research Knowledge Graphs (RKGs) that represent scholarly knowledge structured and interlinked. First, this thesis conducts a requirements analysis for an Open Research Knowledge Graph (ORKG). We present literature-related use cases of researchers that should be supported by an ORKG-based system and their specific requirements for the underlying ontology and instance data. Based on this analysis, the identified use cases are categorised into two groups: The first group of use cases needs manual or semi-automatic approaches for knowledge graph (KG) construction since they require high correctness of the instance data. The second group requires high completeness and can tolerate noisy instance data. Thus, this group needs automatic approaches for KG population. This thesis focuses on the second group of use cases and provides contributions for machine learning tasks that aim to support them. To assess the relevance of a research paper, scientists usually skim through titles, abstracts, introductions, and conclusions. An organised presentation of the articles' essential information would make this process more time-efficient. The task of sequential sentence classification addresses this issue by classifying sentences in an article in categories like research problem, used methods, or obtained results. To address this problem, we propose a novel unified cross-domain multi-task deep learning approach that makes use of datasets from different scientific domains (e.g. 
biomedicine and computer graphics) and varying structures (e.g. datasets covering either only abstracts or full papers). Our approach outperforms the state of the art on full paper datasets significantly while being competitive for datasets consisting of abstracts. Moreover, our approach enables the categorisation of sentences in a domain-independent manner. Furthermore, we present the novel task of domain-independent information extraction to extract scientific concepts from research papers in a domain-independent manner. This task aims to support the use cases "find related work" and "get recommended articles". For this purpose, we introduce a set of generic scientific concepts that are relevant across ten domains in Science, Technology, and Medicine (STM) and release an annotated dataset of 110 abstracts from these domains. Since the annotation of scientific text is costly, we suggest an active learning strategy based on a state-of-the-art deep learning approach. The proposed method enables us to nearly halve the amount of required training data. Then, we extend this domain-independent information extraction approach with the task of coreference resolution. Coreference resolution aims to identify mentions that refer to the same concept or entity. Baseline results on our corpus with current state-of-the-art approaches for coreference resolution showed that current approaches perform poorly on scientific text. Therefore, we propose a sequential transfer learning approach that exploits annotated datasets from non-academic domains. Our experimental results demonstrate that our approach noticeably outperforms the state-of-the-art baselines. Additionally, we investigate the impact of coreference resolution on KG population. We demonstrate that coreference resolution has a small impact on the number of resulting concepts in the KG, but improves its quality significantly.
Consequently, using our domain-independent information extraction approach, we populate an RKG from 55,485 abstracts of the ten investigated STM domains. We show that every domain mainly uses its own terminology and that the populated RKG contains useful concepts. Moreover, we propose a novel approach for the task of citation recommendation. This task can help researchers improve the quality of their work by finding or recommending relevant related work. Our approach exploits RKGs that interlink research papers based on mentioned scientific concepts. Using our automatically populated RKG, we demonstrate that the combination of information from RKGs with existing state-of-the-art approaches is beneficial. Finally, we conclude the thesis and sketch possible directions of future work.

Institutionelles Repositorium der Leibniz Universität Hannover

Novel SMART Textiles

Publication venue: 'MDPI AG'
Publication date: 01/04/2020

Heriot Watt Pure