
    BlogForever D2.6: Data Extraction Methodology

    This report outlines an inquiry into the area of web data extraction, conducted within the context of blog preservation. The report reviews theoretical advances and practical developments for implementing data extraction. The inquiry is extended through an experiment that demonstrates the effectiveness and feasibility of implementing some of the suggested approaches. More specifically, the report discusses an approach based on unsupervised machine learning that employs the RSS feeds and HTML representations of blogs. It outlines the possibilities of extracting semantics available in blogs and demonstrates the benefits of exploiting available standards such as microformats and microdata. The report proceeds to propose a methodology for extracting and processing blog data to further inform the design and development of the BlogForever platform.
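    The combination of feed and HTML data described above can be illustrated with a short sketch (this is not the BlogForever implementation): pair each RSS entry with its HTML page and harvest the HTML5 microdata properties embedded in it. The feed URL and the choice of feedparser and BeautifulSoup below are assumptions made for the example.

```python
# Illustrative sketch only: pair RSS entries with their HTML pages and pull
# out microdata properties. The feed URL below is hypothetical.
import feedparser                 # pip install feedparser
import requests
from bs4 import BeautifulSoup     # pip install beautifulsoup4

FEED_URL = "https://example-blog.org/feed"   # hypothetical blog feed


def microdata_properties(html):
    """Collect (itemprop, text) pairs exposed as HTML5 microdata."""
    soup = BeautifulSoup(html, "html.parser")
    return {tag["itemprop"]: tag.get_text(strip=True)
            for tag in soup.find_all(attrs={"itemprop": True})}


def extract_posts(feed_url=FEED_URL):
    """Walk the RSS feed and enrich each entry with microdata from its page."""
    feed = feedparser.parse(feed_url)
    posts = []
    for entry in feed.entries:
        html = requests.get(entry.link, timeout=10).text
        posts.append({
            "title": entry.get("title"),
            "published": entry.get("published"),
            "url": entry.link,
            "microdata": microdata_properties(html),  # e.g. author, datePublished
        })
    return posts


if __name__ == "__main__":
    for post in extract_posts():
        print(post["title"], "->", sorted(post["microdata"]))
```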

    Semantically defined Analytics for Industrial Equipment Diagnostics

    In this age of digitalization, industries everywhere accumulate massive amounts of data, which have become the lifeblood of the global economy. These data may come from heterogeneous equipment, components, sensors, systems and applications, in many varieties (diversity of sources), velocities (high rate of change) and volumes (sheer data size). Despite significant advances in the ability to collect, store, manage and filter data, the real value lies in the analytics: raw data are meaningless unless properly processed into actionable (business) insights. Those who know how to harness data effectively have a decisive competitive advantage, raising performance through faster and smarter decisions, improving short- and long-term strategic planning, offering more user-centric products and services, and fostering innovation. Two distinct paradigms can be discerned within the field of analytics: semantic-driven (deductive) and data-driven (inductive). The first emphasizes logic as a way of representing domain knowledge encoded in rules or ontologies, which are often carefully curated and maintained; however, these models are frequently highly complex and require intensive knowledge-processing capabilities. Data-driven analytics employs machine learning (ML) to learn a model directly from the data with minimal human intervention; however, such models are tuned to their training data and context, making them difficult to adapt. Industries that want to create value from data must master these paradigms in combination. There is therefore a great need in data analytics to seamlessly combine semantic-driven and data-driven processing techniques in an efficient and scalable architecture that allows actionable insights to be extracted from an extreme variety of data. In this thesis, we address these needs by providing:
    ‱ A unified representation of domain-specific and analytical semantics, in the form of ontology models called the TechOnto Ontology Stack. It is a highly expressive, platform-independent formalism that captures the conceptual semantics of industrial systems, such as technical system hierarchies and component partonomies, together with their analytical functional semantics.
    ‱ A new ontology language, Semantically defined Analytical Language (SAL), built on top of the ontology model, which extends DatalogMTL (a Horn fragment of Metric Temporal Logic) with analytical functions as first-class citizens.
    ‱ A method to generate semantic workflows using the SAL language, which helps in authoring, reusing and maintaining complex analytical tasks and workflows in an abstract fashion.
    ‱ A multi-layer architecture that fuses knowledge-driven and data-driven analytics into a federated and distributed solution.
    To our knowledge, this thesis is one of the first works to introduce and investigate semantically defined analytics in an ontology-based data access setting for industrial analytical applications. We focus our work and evaluation on industrial data because of (i) the adoption of semantic technology by industry in general, and (ii) the common need, in the literature and in practice, to let domain expertise drive data analytics over semantically interoperable sources while still harnessing the power of analytics to enable real-time data insights.
    Given the evaluation results of three use-case studies, our approach surpasses state-of-the-art approaches for most application scenarios.
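    As an illustration of the kind of combination the thesis argues for, the sketch below couples a semantic-driven step (a SPARQL query over a miniature equipment ontology selects the sensors mounted on pumps) with a data-driven step (an IsolationForest flags anomalous readings for those sensors). It does not reproduce TechOnto or SAL; all IRIs, sensor names and readings are invented for the example.

```python
# Illustrative sketch of ontology-based data access feeding a data-driven model.
import numpy as np
from rdflib import Graph, Namespace, RDF      # pip install rdflib
from sklearn.ensemble import IsolationForest  # pip install scikit-learn

EX = Namespace("http://example.org/plant#")   # invented namespace

# 1) Semantic layer: a miniature equipment ontology (component partonomy).
g = Graph()
g.add((EX.pump1, RDF.type, EX.Pump))
g.add((EX.vib1, RDF.type, EX.VibrationSensor))
g.add((EX.vib1, EX.mountedOn, EX.pump1))
g.add((EX.temp1, RDF.type, EX.TemperatureSensor))
g.add((EX.temp1, EX.mountedOn, EX.pump1))

# Deductive (semantic-driven) step: which sensors are mounted on pumps?
sensors = [row.s for row in g.query("""
    PREFIX ex: <http://example.org/plant#>
    SELECT ?s WHERE { ?s ex:mountedOn ?p . ?p a ex:Pump . }
""")]

# 2) Data layer: per-sensor readings (invented numbers).
readings = {
    str(EX.vib1):  [0.9, 1.0, 1.1, 0.95, 6.5, 1.05],
    str(EX.temp1): [40.1, 40.3, 39.8, 40.0, 40.2, 40.4],
}

# 3) Inductive (data-driven) step: flag anomalies for the selected sensors.
for s in sensors:
    x = np.array(readings[str(s)]).reshape(-1, 1)
    labels = IsolationForest(contamination=0.2, random_state=0).fit_predict(x)
    anomalies = x[labels == -1].ravel().tolist()
    print(f"{s}: anomalous readings {anomalies}")
```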

    A review of approaches to supply chain communications: from manufacturing to construction

    With the increasing importance of computer-based communication technologies, communication networks are becoming crucial in supply chain management. Given the objectives of the supply chain (to have the right products, in the right quantities, at the right place, at the right moment, and at minimal cost), supply chain management sits at the intersection of different professional sectors. This is particularly the case in construction, since a building requires the incorporation of a number of industrial products for its fabrication. This paper provides a review of the main approaches to supply chain communications as used mainly in the manufacturing industries, analyses the extent to which these have been applied to construction, and reviews ongoing developments and research activities in this domain.

    The use of TRAO to manage evolution risks in e-government

    The need to develop and provide more efficient ways of delivering Electronic Government Services to key stakeholders has brought about varying degrees of evolution in government. This evolution is seen in different ways, such as the merging of government departments or the merging of assets and their components with legacy assets. It has involved adopting several practices geared towards eliminating repetitive, manual processes while progressively encouraging interaction between the different stakeholders. However, some of these practices have further complicated government processes, creating avenues for vulnerabilities which, if exploited, expose government and government assets to risks and threats. Focusing on ways to manage the issues that accompany evolution can better prepare governments for managing the associated vulnerabilities, risks and threats. The basis of a conceptual framework is provided to establish the relationships that exist between the E-Government, asset and security domains. Thus, this thesis presents a design research project used in the management of evolution-related risks. The first part of the project focusses on the development of a generic ontology known as TRAO and a scenario ontology, TRAOSc, made up of different hypothetical scenarios. The efficiency gained in developing these ontologies has facilitated the development of an intelligent tool, TRAOSearch, that supports high-level, semantically enriched queries. Results from a case study show that there are evolution-related issues for which governments may not be fully prepared. Furthermore, an ontological approach to the management of evolution-related risks showed that government stakeholders were interested in the use of intelligent processes that could improve government effectiveness, while analysing the risks associated with doing so. Of particular importance to this research was the ability to make inferences from the ontology about the complex relationships that exist in the form of dependencies and interdependencies between Stakeholders and Assets. Thus, this thesis contributes to advancing stakeholders' understanding of the types of relationships that exist in government and the effect these relationships may have on service provisioning. Another novel contribution is the correction of the ambiguity associated with the terms Service, IT Service and E-Government. Furthermore, the feedback obtained from the use of an ontology-based tool during the evaluation phase of the project provides insight into whether governments must always keep pace with technological evolution.
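    The kind of inference described above, dependencies and interdependencies between stakeholders and assets, can be sketched with a transitive SPARQL property-path query over a toy ontology. The sketch below is not TRAO or TRAOSearch; all classes, properties and individuals are invented for the example.

```python
# Illustrative sketch: infer transitive stakeholder-asset dependencies and the
# risks attached to the assets, using a SPARQL 1.1 property path.
from rdflib import Graph, Namespace  # pip install rdflib

GOV = Namespace("http://example.org/egov#")  # invented namespace

g = Graph()
# A stakeholder depends on a service, which depends on a legacy asset.
g.add((GOV.TaxOffice, GOV.dependsOn, GOV.OnlineFilingService))
g.add((GOV.OnlineFilingService, GOV.dependsOn, GOV.LegacyTaxDatabase))
g.add((GOV.LegacyTaxDatabase, GOV.hasRisk, GOV.UnpatchedSoftware))

# Transitive closure of dependsOn: which assets does each stakeholder
# ultimately rely on, and which risks do those assets carry?
results = g.query("""
    PREFIX gov: <http://example.org/egov#>
    SELECT ?who ?asset ?risk WHERE {
        ?who gov:dependsOn+ ?asset .
        OPTIONAL { ?asset gov:hasRisk ?risk }
    }
""")

for who, asset, risk in results:
    print(f"{who} depends on {asset}" + (f" (risk: {risk})" if risk else ""))
```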

    Invest to Save: Report and Recommendations of the NSF-DELOS Working Group on Digital Archiving and Preservation

    Digital archiving and preservation are important areas for research and development, but there is no agreed-upon set of priorities or coherent plan for research in this area. Research projects in this area tend to be small and driven by particular institutional problems or concerns. As a consequence, proposed solutions from experimental projects and prototypes tend not to scale to millions of digital objects, nor do the results from disparate projects readily build on each other. It is also unclear whether it is worthwhile to seek general solutions or whether different strategies are needed for different types of digital objects and collections. The lack of coordination in both research and development means that researchers are reinventing the wheel in some areas while other areas are neglected. Digital archiving and preservation is an area that will benefit from an exercise in analysis, priority setting, and planning for future research. The Working Group aims to survey current research activities, identify gaps, and develop a white paper proposing future research directions in the area of digital preservation. Potential areas for research include repository architectures and interoperability among digital archives; automated tools for capture, ingest, and normalization of digital objects; and harmonization of preservation formats and metadata. There may also be opportunities for the development of commercial products in the areas of mass storage systems, repositories and repository management systems, and data management software and tools.

    ONTODL+: an ontology description language and its compiler

    Master's dissertation in Informatics Engineering. Ontologies are very powerful tools when it comes to handling knowledge. They offer a good solution for exchanging, storing, searching and inferring large volumes of information. Over the years, various solutions for knowledge-based systems have used ontologies at their core. OntoDL was developed as a Domain-Specific Language using ANTLR4 to allow for the specification of ontologies. The language has already been used by experts in various fields as a way to apply computer-based solutions to their problems. In this thesis, included in the second year of the Master's degree in Informatics Engineering, OntoDL+ was created as an expansion of the original OntoDL. Both the language and its compiler have been improved: the language was extended to improve usability and productivity for its users while remaining easy to learn and understand, and the compiler was expanded to translate language specifications into a wider array of target languages, increasing the potential uses of the DSL through the features provided by those languages. The compiler and some examples of the DSL can be downloaded from the website https://epl.di.uminho.pt/~gepl/GEPL DS/OntoDL/ created for the application and presented in the final chapters of the thesis.
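    A minimal sketch of the kind of translation such a compiler performs is given below: a toy ontology description (concepts, an is-a hierarchy, attributes) is compiled to RDF and serialized as Turtle. The input structure and all names are stand-ins for illustration, not OntoDL+ syntax.

```python
# Minimal sketch of an ontology-description-to-RDF translation step.
from rdflib import Graph, Literal, Namespace, RDF, RDFS  # pip install rdflib

# Toy ontology description: concepts, an is-a hierarchy, and attributes.
spec = {
    "concepts": ["Vehicle", "Car", "Wheel"],
    "is_a": [("Car", "Vehicle")],
    "attributes": [("Car", "numberOfWheels", 4)],
}


def compile_to_rdf(spec, base="http://example.org/onto#"):
    """Translate the toy description into an RDF graph."""
    ns = Namespace(base)
    g = Graph()
    for concept in spec["concepts"]:
        g.add((ns[concept], RDF.type, RDFS.Class))
    for child, parent in spec["is_a"]:
        g.add((ns[child], RDFS.subClassOf, ns[parent]))
    for subject, prop, value in spec["attributes"]:
        g.add((ns[subject], ns[prop], Literal(value)))
    return g


if __name__ == "__main__":
    print(compile_to_rdf(spec).serialize(format="turtle"))
```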
    • 

    corecore