1,654 research outputs found

    Semantically Resolving Type Mismatches in Scientific Workflows

    No full text
    Scientists are increasingly utilizing Grids to manage large data sets and execute scientific experiments on distributed resources. Scientific workflows are used as a means for modeling and enacting scientific experiments. Windows Workflow Foundation (WF) is a major component of Microsoft's .NET technology that offers lightweight support for long-running workflows. It provides a comfortable graphical and programmatic environment for the development of extended BPEL-style workflows. WF's visual features ease the syntactic composition of Web services into scientific workflows but do nothing to ensure that information passed between services has consistent semantic types or representations, or that deviant flows, errors and compensations are handled meaningfully. In this paper we introduce SAWSDL-compliant annotations for WF and use them with a semantic reasoner to guarantee semantic type correctness in scientific workflows. Examples from bioinformatics are presented.
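    The approach sketched above boils down to checking that the semantic type a service produces is subsumed by the type the next service expects. Below is a minimal, self-contained illustration of that subsumption check; the ontology terms, service names and workflow edges are invented for the example, and the paper's actual mechanism relies on SAWSDL annotations on WF activities evaluated by a semantic reasoner rather than this toy hierarchy.

```python
# Toy sketch of semantic type checking between workflow steps.
# ONTOLOGY, the term names and the services are hypothetical illustrations.

# Subsumption hierarchy: term -> set of direct superclasses.
ONTOLOGY = {
    "ProteinSequence": {"Sequence"},
    "DNASequence": {"Sequence"},
    "Sequence": {"BioData"},
    "BioData": set(),
}

def is_subtype(term: str, expected: str) -> bool:
    """True if `term` equals `expected` or is transitively subsumed by it."""
    if term == expected:
        return True
    return any(is_subtype(parent, expected) for parent in ONTOLOGY.get(term, ()))

# Each edge connects one service's annotated output to another's annotated input.
workflow_edges = [
    ("BlastSearch.output", "ProteinSequence", "AlignmentService.input", "Sequence"),
    ("BlastSearch.output", "ProteinSequence", "PhylogenyService.input", "DNASequence"),
]

for src, out_type, dst, in_type in workflow_edges:
    ok = is_subtype(out_type, in_type)
    print(f"{src} ({out_type}) -> {dst} ({in_type}): {'OK' if ok else 'TYPE MISMATCH'}")
```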

    Semantically defined Analytics for Industrial Equipment Diagnostics

    Get PDF
    In this age of digitalization, industries everywhere accumulate massive amounts of data, to the point that data has become the lifeblood of the global economy. This data may come from various heterogeneous systems, equipment, components, sensors and applications in many varieties (diversity of sources), velocities (high rate of change) and volumes (sheer data size). Despite significant advances in the ability to collect, store, manage and filter data, the real value lies in the analytics. Raw data is meaningless unless it is properly processed into actionable (business) insights. Those who know how to harness data effectively have a decisive competitive advantage: they raise performance by making faster and smarter decisions, improve short- and long-term strategic planning, offer more user-centric products and services, and foster innovation. Two distinct paradigms can be discerned in the practice of analytics: semantic-driven (deductive) and data-driven (inductive). The first emphasizes logic as a way of representing domain knowledge encoded in rules or ontologies, which are often carefully curated and maintained. However, these models are often highly complex and require intensive knowledge processing capabilities. Data-driven analytics employs machine learning (ML) to learn a model directly from the data with minimal human intervention. However, such models are tuned to the training data and context, making them difficult to adapt. Industries that want to create value from data must master these paradigms in combination. There is thus a great need in data analytics to seamlessly combine semantic-driven and data-driven processing techniques in an efficient and scalable architecture that allows extracting actionable insights from an extreme variety of data. In this thesis, we address these needs by providing:
    • A unified representation of domain-specific and analytical semantics, in the form of ontology models called the TechOnto Ontology Stack. It is a highly expressive, platform-independent formalism that captures the conceptual semantics of industrial systems, such as technical system hierarchies and component partonomies, together with their analytical functional semantics.
    • A new ontology language, Semantically defined Analytical Language (SAL), on top of the ontology model, which extends DatalogMTL (a Horn fragment of Metric Temporal Logic) with analytical functions as first-class citizens.
    • A method to generate semantic workflows using our SAL language, which helps in authoring, reusing and maintaining complex analytical tasks and workflows in an abstract fashion.
    • A multi-layer architecture that fuses knowledge-driven and data-driven analytics into a federated and distributed solution.
    To our knowledge, the work in this thesis is one of the first to introduce and investigate the use of semantically defined analytics in an ontology-based data access setting for industrial analytical applications. We focus our work and evaluation on industrial data because of (i) the general adoption of semantic technology by industry, and (ii) the common need, in the literature and in practice, to let domain expertise drive data analytics on semantically interoperable sources while still harnessing the power of analytics to enable real-time data insights.
    Given the evaluation results of three use-case studies, our approach surpasses state-of-the-art approaches in most application scenarios.
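    The abstract does not give SAL's concrete syntax, so the sketch below only illustrates the underlying idea in plain Python: a rule-style derivation over equipment sensor data in which an analytical function (here a sliding-window average) acts as a first-class building block of the rule. The readings, window, threshold and predicate names are assumptions made for this example, not the thesis's actual language, ontology or data.

```python
# Conceptual sketch only: a Datalog-like rule whose body calls an analytical
# function, loosely in the spirit of SAL (analytics as first-class citizens).
# All values below are invented for illustration.

from statistics import mean

# Hypothetical per-minute temperature readings for one component.
readings = {"turbine_bearing_1": [71, 73, 78, 85, 92, 95]}

def moving_avg(series, window):
    """Analytical function: average of the most recent `window` samples."""
    return mean(series[-window:])

# Rule, informally: overheating(C) <- component(C), moving_avg(temp(C), 3) > 90.
def derive_overheating(data, window=3, threshold=90):
    return {c for c, series in data.items() if moving_avg(series, window) > threshold}

print(derive_overheating(readings))   # {'turbine_bearing_1'}
```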

    Minibix: Item banking with web services

    Get PDF
    The Minibix system was developed from an existing prototype item bank system in use for high-stakes testing at the University of Cambridge. The system has been developed over the last year with support from the JISC e-Learning Programme. This project has redeveloped the system based on version 2 of the IMS Question and Test Interoperability (QTI) specification and is publishing the resulting system under an open source license. In this paper, we propose a simple service model for describing the authoring, banking, test construction and delivery of assessment content. The item banking model is implemented by the Minibix system and will be demonstrated in conjunction with the authoring, test construction and delivery systems developed by the sister projects AQuRate (Kingston University) and AsDel (University of Southampton). These services, as part of a wider e-Framework, could enable tool integration on a scale suitable for interacting with large-scale item banks. Private banks are already used routinely in high-stakes summative assessment, but open repositories of items for formative use are now becoming available, for example the E3AN item bank for Electrical and Electronic Engineering or the item bank for the Physical Sciences recently announced by the HEA.
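    As a rough illustration of the banking role in the proposed service model, the sketch below shows an in-memory item bank that an authoring tool could deposit into and a test-construction tool could query. The class and method names are hypothetical and do not mirror Minibix's actual web-service interface, which the abstract does not detail; QTI items are treated as opaque XML payloads.

```python
# Hypothetical item-bank sketch; not Minibix's real API.
from itertools import count

class ItemBank:
    def __init__(self):
        self._items = {}          # item_id -> {"qti": xml, "meta": metadata}
        self._ids = count(1)

    def deposit(self, qti_xml, metadata):
        """Bank a QTI 2.x item (treated as an opaque XML payload) with metadata."""
        item_id = next(self._ids)
        self._items[item_id] = {"qti": qti_xml, "meta": metadata}
        return item_id

    def search(self, **criteria):
        """Return ids of banked items whose metadata matches every criterion."""
        return [i for i, rec in self._items.items()
                if all(rec["meta"].get(k) == v for k, v in criteria.items())]

    def retrieve(self, item_id):
        return self._items[item_id]["qti"]

# A test-construction tool (such as AsDel) could then pull matching items:
bank = ItemBank()
item_id = bank.deposit('<assessmentItem identifier="q1"/>',
                       {"subject": "physics", "level": "intro"})
print(bank.search(subject="physics"))   # [1]
print(bank.retrieve(item_id))           # the banked QTI XML
```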

    Algorithm-aided Information Design: Hybrid Design approach on the edge of associative methodologies in AEC

    Get PDF
    Master's dissertation in the European Master in Building Information Modelling. The last three decades have brought colossal progress to design methodologies in the common pursuit of a seamless fusion between the digital and physical worlds, augmenting it with computation power and network coverage. Over this historically short period, two generations of methodologies and tools have emerged: the Additive generation and the parametric Associative generation of CAD. Designers worldwide are currently engaged in new forms of design exploration. From this race, two prominent methodologies have developed from the Associative Design approach: Object-Oriented Design (OOD) and Algorithm-Aided Design (AAD). The primary research objective is to investigate, examine, and push the boundaries between OOD and AAD to determine a new design space in which the advantages of both design methods are fused into a new-generation methodology, called in the present study AID (Algorithm-aided Information Design). The study methodology is structured into two flows. In the first flow, existing CAD methodologies are investigated, a conceptual framework is extracted from the state-of-the-art analysis, and the analysed data is synthesized into the subject proposal. In the second flow, tools and workflows are elaborated and examined in practice to confirm the subject proposal. Accordingly, the research consists of a theoretical and a practical part. In the theoretical part, a literature review is conducted and assumptions are made about the AID methodology, its tools, and its possible advantages and drawbacks. Next, case studies are performed along the sequential stages of digital design through the lens of practical AID methodology implementation. The case studies cover design aspects such as model and documentation generation, design automation, interoperability, manufacturing control, performance analysis and optimization. Ultimately, a set of test projects is developed with the AID methodology applied. After the practical part, the research returns to theory, gathering analytical information from the literature review, the conceptual framework, and the experimental practice reports. In summary, the study synthesizes the AID methodology as part of Hybrid Design, which enables the creative use of tools and the elaboration of agile design systems that integrate the additive and associative methodologies of Digital Design. In general, the study is based on agile methods and cyclic research development alternating between practice and theory to achieve a comprehensive vision of the subject.

    ImageJ2: ImageJ for the next generation of scientific image data

    Full text link
    ImageJ is an image analysis program extensively used in the biological sciences and beyond. Due to its ease of use, recordable macro language, and extensible plug-in architecture, ImageJ enjoys contributions from non-programmers, amateur programmers, and professional developers alike. Enabling such a diversity of contributors has resulted in a large community that spans the biological and physical sciences. However, a rapidly growing user base, diverging plugin suites, and technical limitations have revealed a clear need for a concerted software engineering effort to support emerging imaging paradigms and to ensure the software's ability to handle the requirements of modern science. Due to these new and emerging challenges in scientific imaging, ImageJ is at a critical development crossroads. We present ImageJ2, a total redesign of ImageJ offering a host of new functionality. It separates concerns, fully decoupling the data model from the user interface. It emphasizes integration with external applications to maximize interoperability. Its robust new plugin framework allows everything from image formats to scripting languages to visualization to be extended by the community. The redesigned data model supports arbitrarily large, N-dimensional datasets, which are increasingly common in modern image acquisition. Despite the scope of these changes, backwards compatibility is maintained such that this new functionality can be seamlessly integrated with the classic ImageJ interface, allowing users and developers to migrate to these new methods at their own pace. ImageJ2 provides a framework engineered for flexibility, intended to support these requirements as well as accommodate future needs.
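    One way to picture the decoupled data model and plugin-driven extensibility described above is the toy registry below. This is a conceptual sketch only, not ImageJ2's real API: the decorator, registry and reader names are invented, and the data model is reduced to a plain N-dimensional NumPy array.

```python
# Conceptual sketch of format readers as plugins over a UI-independent data model.
# None of these names correspond to ImageJ2's actual classes or services.
import os
import numpy as np

FORMAT_READERS = {}   # file extension -> reader function (the "plugin" registry)

def reader(extension):
    """Decorator that registers a format-reader plugin for a file extension."""
    def register(func):
        FORMAT_READERS[extension] = func
        return func
    return register

@reader(".npy")
def read_npy(path):
    # The data model is just an N-dimensional array, independent of any UI.
    return np.load(path)

@reader(".raw16")
def read_raw16(path, shape=(64, 64)):
    # A community-contributed reader for a hypothetical raw 16-bit format.
    return np.fromfile(path, dtype=np.uint16).reshape(shape)

def open_image(path):
    """Core dispatch: the core never needs to know about individual formats."""
    ext = os.path.splitext(path)[1]
    try:
        return FORMAT_READERS[ext](path)
    except KeyError:
        raise ValueError(f"no reader plugin registered for {ext!r}") from None
```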

    Document Automation Architectures: Updated Survey in Light of Large Language Models

    Full text link
    This paper surveys the current state of the art in document automation (DA). The objective of DA is to reduce the manual effort during the generation of documents by automatically creating and integrating input from different sources and assembling documents conforming to defined templates. There have been reviews of commercial DA solutions, particularly in the legal domain, but to date there has been no comprehensive review of the academic research on DA architectures and technologies. The current survey of DA reviews the academic literature, provides a clearer definition and characterization of DA and its features, identifies state-of-the-art DA architectures and technologies in academic research, and provides ideas that can lead to new research opportunities within the DA field in light of recent advances in generative AI and large language models.
    Comment: The current paper is the updated version of an earlier survey on document automation [Ahmadi Achachlouei et al. 2021]. Updates in the current paper are as follows: we shortened almost all sections to reduce the size of the main paper (without references) from 28 pages to 10 pages, added a review of selected papers on large language models, and removed certain sections and most of the diagrams. arXiv admin note: substantial text overlap with arXiv:2109.1160
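    At its core, the DA task characterized above is to gather input from different sources and assemble a document conforming to a defined template. The following is a minimal sketch of that idea using only the Python standard library; the template text and data sources are invented for illustration. In the LLM-based pipelines the survey points to, a language model could additionally draft or refine the free-text parts while a template of this kind still fixes the overall structure.

```python
# Minimal template-based document assembly sketch (invented data and template).
from string import Template

# "Different sources": e.g., a CRM record and a pricing service, mocked here.
crm_record = {"client": "Acme Corp", "contact": "J. Doe"}
pricing = {"total": "12,400 EUR"}

template = Template(
    "Dear $contact,\n\n"
    "Please find attached the renewal offer for $client.\n"
    "The total for the coming period is $total.\n"
)

document = template.substitute(**crm_record, **pricing)
print(document)
```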

    Leveraging workflow control patterns in the domain of clinical practice guidelines

    Get PDF
    Background: Clinical practice guidelines (CPGs) include recommendations describing appropriate care for the management of patients with a specific clinical condition. A number of representation languages have been developed to support executable CPGs, with associated authoring/editing tools. Even with tool assistance, authoring of CPG models is a labor-intensive task. We aim at facilitating the early stages of the CPG modeling task. In this context, we propose to support the authoring of CPG models based on a set of suitable procedural patterns described in an implementation-independent notation that can then be semi-automatically transformed into one of the alternative executable CPG languages. Methods: We have started with the workflow control patterns that have been identified in the fields of workflow systems and business process management. We have analyzed the suitability of these patterns by means of a qualitative analysis of CPG texts. Following our analysis, we have implemented a selection of workflow patterns in the Asbru and PROforma CPG languages. As the implementation-independent notation for the description of patterns, we have chosen BPMN 2.0. Finally, we have developed XSLT transformations to convert the BPMN 2.0 version of the patterns into the Asbru and PROforma languages. Results: We showed that although a significant number of workflow control patterns are suitable to describe CPG procedural knowledge, not all of them are applicable in the context of CPGs due to their focus on single-patient care. Moreover, CPGs may require additional patterns not included in the set of workflow control patterns. We also showed that nearly all the CPG-suitable patterns can be conveniently implemented in the Asbru and PROforma languages. Finally, we demonstrated that individual patterns can be semi-automatically transformed from a process specification in BPMN 2.0 to executable implementations in these languages. Conclusions: We propose a pattern- and transformation-based approach for the development of CPG models. Such an approach can form the basis of a valid framework for the authoring of CPG models. The identification of adequate patterns and the implementation of transformations to convert patterns from a process specification into different executable implementations are the first necessary steps for our approach. This research has been supported by: 1) the Austrian Science Fund (FWF) through project TRP71-N23; 2) the Spanish Ministry of Education through grant PR2010-0279; and by Universitat Jaume I through project P11B2009-38.
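    The last step of the method, converting a BPMN 2.0 pattern specification into an executable representation via XSLT, can be pictured with the minimal sketch below. It applies an XSLT stylesheet with lxml to a deliberately simplified, namespace-free BPMN-like sequence pattern and emits an invented target notation; the project's real stylesheets target Asbru and PROforma and handle the full BPMN 2.0 schema.

```python
# Simplified illustration of pattern transformation via XSLT (lxml).
# The input, stylesheet and output element names are invented; real BPMN 2.0
# uses namespaces, and the real targets are Asbru and PROforma.
from lxml import etree

bpmn = etree.XML(
    '<process id="sequence-pattern">'
    '  <task name="assess_risk"/>'
    '  <task name="prescribe_treatment"/>'
    '</process>'
)

stylesheet = etree.XML("""<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/process">
    <plan name="{@id}" order="sequential">
      <xsl:for-each select="task">
        <step ref="{@name}"/>
      </xsl:for-each>
    </plan>
  </xsl:template>
</xsl:stylesheet>""")

transform = etree.XSLT(stylesheet)
print(etree.tostring(transform(bpmn), pretty_print=True).decode())
```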