31 research outputs found

    Completing and Debugging Ontologies: state of the art and challenges

    As semantically-enabled applications require high-quality ontologies, developing and maintaining ontologies that are as correct and complete as possible is an important although difficult task in ontology engineering. A key step is ontology debugging and completion. In general, there are two steps: detecting defects and repairing defects. In this paper we discuss the state of the art regarding the repairing step. We do this by formalizing the repairing step as an abduction problem and situating the state of the art with respect to this framework. We show that there are still many open research problems and identify opportunities for further work to advance the field. Comment: 56 pages.

    Geo-L: Topological Link Discovery for Geospatial Linked Data Made Easy

    Geospatial linked data are an emerging domain, with growing interest in research and industry. There is an increasing number of publicly available geospatial linked data resources, which can also be interlinked and easily integrated with private and industrial linked data on the web. The present paper introduces Geo-L, a system for the discovery of RDF spatial links based on topological relations. Experiments show that the proposed system improves on state-of-the-art spatial linking processes in terms of mapping time and accuracy, as well as resource retrieval efficiency and robustness.

    Dance performance in cyberspace - transfer and transformation

    The aim of this research is to understand the potential development of dance performance in the context of cyberculture, by examining the way practitioners use new media to create artworks that include audience participation, and by contributing to their theorization. With specific reference to cyberspace as a concept of electronic, networked and navigable space, the enquiry traces the connections such practices have with conventions of the medium of dance, which operate in its widely known condition as a live performing art. A major principle of their analysis and evaluation, however, is the acknowledgement that new media and new contexts of production and reception inform the characteristics of these artworks and their discursive articulation, in terms of the way people and digital technologies interact in contemporary culture. This qualitative research is based on case-study design as a means of finding pragmatic evidence in particulars, to illustrate abstract concepts, technological processes and aesthetic values that are emerging in a new area of knowledge. The field within which this research operates is located by a mapping of published literature that informs an interdisciplinary theoretical framework, which contextualizes the interpretation of artworks. The selected case studies have been subjected to systematic and detailed analysis, using a model devised for the purpose of this enquiry. From this undertaking it can be claimed that while an extensive array of technologies, media and interactive models is available in this field, the artists pursue a commitment to demonstrate their worth for specifically developing (new media) dance performance, and for dance performance to articulate technological and critical issues for cyberculture studies. The results of this enquiry also contribute to a conceptual understanding of what dance can be today, in the light of technological changes.

    Recent advances of wearable antennas in materials, fabrication methods, designs, and their applications: state-of-the-art

    The demand for wearable technologies has grown tremendously in recent years. Wearable antennas are used for various applications, in many cases within the context of wireless body area networks (WBAN). In WBAN, the presence of the human body poses a significant challenge to wearable antennas. Specifically, requirements such as structural deformation, precision and accuracy in fabrication methods, and antenna size need to be considered on a priority basis. Various researchers are active in this field and, accordingly, significant progress has been achieved recently. This article critically reviews wearable antennas, especially in light of new materials and fabrication methods, novel designs such as miniaturized button antennas and miniaturized single- and multi-band antennas, and their unique smart applications in WBAN. Finally, conclusions are drawn with respect to some future directions.

    Federated Query Processing over Heterogeneous Data Sources in a Semantic Data Lake

    Data provides the basis for emerging scientific and interdisciplinary data-centric applications with the potential of improving the quality of life for citizens. Big Data plays an important role in promoting both manufacturing and scientific development through industrial digitization and emerging interdisciplinary research. Open data initiatives have encouraged the publication of Big Data by exploiting the decentralized nature of the Web, allowing for the availability of heterogeneous data generated and maintained by autonomous data providers. Consequently, the growing volume of data consumed by different applications raises the need for effective data integration approaches able to process a large volume of data that is represented in different formats, schemas and models, and which may also include sensitive data, e.g., financial transactions, medical procedures, or personal data. Data Lakes are composed of heterogeneous data sources in their original format, which reduces the overhead of materialized data integration. Query processing over Data Lakes requires the semantic description of data collected from heterogeneous data sources. A Data Lake with such semantic annotations is referred to as a Semantic Data Lake. Transforming Big Data into actionable knowledge demands novel and scalable techniques for enabling not only Big Data ingestion and curation to the Semantic Data Lake, but also efficient large-scale semantic data integration, exploration, and discovery. Federated query processing techniques utilize source descriptions to find relevant data sources and efficient execution plans that minimize the total execution time and maximize the completeness of answers. Existing federated query processing engines employ a coarse-grained description model where the semantics encoded in data sources are ignored. 
Such descriptions may lead to the erroneous selection of data sources for a query and unnecessary retrieval of data, thus affecting the performance of the query processing engine. In this thesis, we address the problem of federated query processing against heterogeneous data sources in a Semantic Data Lake. First, we tackle the challenge of knowledge representation and propose a novel source description model, RDF Molecule Templates, that describes knowledge available in a Semantic Data Lake. RDF Molecule Templates (RDF-MTs) describe data sources in terms of an abstract description of entities belonging to the same semantic concept. Then, we propose a technique for data source selection and query decomposition, the MULDER approach, and query planning and optimization techniques, Ontario, that exploit the characteristics of heterogeneous data sources described using RDF-MTs and provide uniform access to heterogeneous data sources. We then address the challenge of enforcing privacy and access control requirements imposed by data providers. We introduce a privacy-aware federated query technique, BOUNCER, able to enforce privacy and access control regulations during query processing over data sources in a Semantic Data Lake. In particular, BOUNCER exploits RDF-MT-based source descriptions in order to express privacy and access control policies as well as their automatic enforcement during source selection, query decomposition, and planning. Furthermore, BOUNCER implements query decomposition and optimization techniques able to identify query plans over data sources that not only contain the relevant entities to answer a query, but also are regulated by policies that allow for accessing these relevant entities. Finally, we tackle the problem of interest-based update propagation and co-evolution of data sources. 
We present a novel approach for interest-based RDF update propagation that consistently maintains a full or partial replication of large datasets and deals with co-evolution.

    Cross-Domain information extraction from scientific articles for research knowledge graphs

    Today’s scholarly communication is a document-centred process and as such, rather inefficient. Fundamental contents of research papers are not accessible by computers since they are only present in unstructured PDF files. Therefore, current research infrastructures are not able to assist scientists appropriately in their core research tasks. This thesis addresses this issue and proposes methods to automatically extract relevant information from scientific articles for Research Knowledge Graphs (RKGs) that represent scholarly knowledge structured and interlinked. First, this thesis conducts a requirements analysis for an Open Research Knowledge Graph (ORKG). We present literature-related use cases of researchers that should be supported by an ORKG-based system and their specific requirements for the underlying ontology and instance data. Based on this analysis, the identified use cases are categorised into two groups: The first group of use cases needs manual or semi-automatic approaches for knowledge graph (KG) construction since they require high correctness of the instance data. The second group requires high completeness and can tolerate noisy instance data. Thus, this group needs automatic approaches for KG population. This thesis focuses on the second group of use cases and provides contributions for machine learning tasks that aim to support them. To assess the relevance of a research paper, scientists usually skim through titles, abstracts, introductions, and conclusions. An organised presentation of the articles' essential information would make this process more time-efficient. The task of sequential sentence classification addresses this issue by classifying sentences in an article in categories like research problem, used methods, or obtained results. To address this problem, we propose a novel unified cross-domain multi-task deep learning approach that makes use of datasets from different scientific domains (e.g. 
biomedicine and computer graphics) and varying structures (e.g. datasets covering either only abstracts or full papers). Our approach outperforms the state of the art on full paper datasets significantly while being competitive for datasets consisting of abstracts. Moreover, our approach enables the categorisation of sentences in a domain-independent manner. Furthermore, we present the novel task of domain-independent information extraction to extract scientific concepts from research papers in a domain-independent manner. This task aims to support the use cases "find related work" and "get recommended articles". For this purpose, we introduce a set of generic scientific concepts that are relevant over ten domains in Science, Technology, and Medicine (STM) and release an annotated dataset of 110 abstracts from these domains. Since the annotation of scientific text is costly, we suggest an active learning strategy based on a state-of-the-art deep learning approach. The proposed method enables us to nearly halve the amount of required training data. Then, we extend this domain-independent information extraction approach with the task of coreference resolution. Coreference resolution aims to identify mentions that refer to the same concept or entity. Baseline results on our corpus with current state-of-the-art approaches for coreference resolution showed that current approaches perform poorly on scientific text. Therefore, we propose a sequential transfer learning approach that exploits annotated datasets from non-academic domains. Our experimental results demonstrate that our approach noticeably outperforms the state-of-the-art baselines. Additionally, we investigate the impact of coreference resolution on KG population. We demonstrate that coreference resolution has a small impact on the number of resulting concepts in the KG, but improves its quality significantly. 
Consequently, using our domain-independent information extraction approach, we populate an RKG from 55,485 abstracts of the ten investigated STM domains. We show that every domain mainly uses its own terminology and that the populated RKG contains useful concepts. Moreover, we propose a novel approach for the task of citation recommendation. This task can help researchers improve the quality of their work by finding or recommending relevant related work. Our approach exploits RKGs that interlink research papers based on mentioned scientific concepts. Using our automatically populated RKG, we demonstrate that the combination of information from RKGs with existing state-of-the-art approaches is beneficial. Finally, we conclude the thesis and sketch possible directions of future work.

    Novel SMART Textiles
