5 research outputs found

    Concevoir des applications internet des objets sémantiques

    Get PDF
    According to Cisco's predictions, there will be more than 50 billions of devices connected to the Internet by 2020.The devices and produced data are mainly exploited to build domain-specific Internet of Things (IoT) applications. From a data-centric perspective, these applications are not interoperable with each other.To assist users or even machines in building promising inter-domain IoT applications, main challenges are to exploit, reuse, interpret and combine sensor data.To overcome interoperability issues, we designed the Machine-to-Machine Measurement (M3) framework consisting in:(1) generating templates to easily build Semantic Web of Things applications, (2) semantically annotating IoT data to infer high-level knowledge by reusing as much as possible the domain knowledge expertise, and (3) a semantic-based security application to assist users in designing secure IoT applications.Regarding the reasoning part, stemming from the 'Linked Open Data', we propose an innovative idea called the 'Linked Open Rules' to easily share and reuse rules to infer high-level abstractions from sensor data.The M3 framework has been suggested to standardizations and working groups such as ETSI M2M, oneM2M, W3C SSN ontology and W3C Web of Things. Proof-of-concepts of the flexible M3 framework have been developed on the cloud (http://www.sensormeasurement.appspot.com/) and embedded on Android-based constrained devices.Selon les prĂ©visions de Cisco , il y aura plus de 50 milliards d'appareils connectĂ©s Ă  Internet d'ici 2020. Les appareils et les donnĂ©es produites sont principalement exploitĂ©es pour construire des applications « Internet des Objets (IdO) ». D'un point de vue des donnĂ©es, ces applications ne sont pas interopĂ©rables les unes avec les autres. Pour aider les utilisateurs ou mĂȘme les machines Ă  construire des applications 'Internet des Objets' inter-domaines innovantes, les principaux dĂ©fis sont l'exploitation, la rĂ©utilisation, l'interprĂ©tation et la combinaison de ces donnĂ©es produites par les capteurs. Pour surmonter les problĂšmes d'interopĂ©rabilitĂ©, nous avons conçu le systĂšme Machine-to-Machine Measurement (M3) consistant Ă : (1) enrichir les donnĂ©es de capteurs avec les technologies du web sĂ©mantique pour dĂ©crire explicitement leur sens selon le contexte, (2) interprĂ©ter les donnĂ©es des capteurs pour en dĂ©duire des connaissances supplĂ©mentaires en rĂ©utilisant autant que possible la connaissance du domaine dĂ©finie par des experts, et (3) une base de connaissances de sĂ©curitĂ© pour assurer la sĂ©curitĂ© dĂšs la conception lors de la construction des applications IdO. Concernant la partie raisonnement, inspirĂ© par le « Web de donnĂ©es », nous proposons une idĂ©e novatrice appelĂ©e le « Web des rĂšgles » afin de partager et rĂ©utiliser facilement les rĂšgles pour interprĂ©ter et raisonner sur les donnĂ©es de capteurs. Le systĂšme M3 a Ă©tĂ© suggĂ©rĂ© Ă  des normalisations et groupes de travail tels que l'ETSI M2M, oneM2M, W3C SSN et W3C Web of Things. Une preuve de concept de M3 a Ă©tĂ© implĂ©mentĂ©e et est disponible sur le web (http://www.sensormeasurement.appspot.com/) mais aussi embarqu

    Machine learning for managing structured and semi-structured data

    Get PDF
    As the digitalization of private, commercial, and public sectors advances rapidly, an increasing amount of data is becoming available. In order to gain insights or knowledge from these enormous amounts of raw data, a deep analysis is essential. The immense volume requires highly automated processes with minimal manual interaction. In recent years, machine learning methods have taken on a central role in this task. In addition to the individual data points, their interrelationships often play a decisive role, e.g. whether two patients are related to each other or whether they are treated by the same physician. Hence, relational learning is an important branch of research, which studies how to harness this explicitly available structural information between different data points. Recently, graph neural networks have gained importance. These can be considered an extension of convolutional neural networks from regular grids to general (irregular) graphs. Knowledge graphs play an essential role in representing facts about entities in a machine-readable way. While great efforts are made to store as many facts as possible in these graphs, they often remain incomplete, i.e., true facts are missing. Manual verification and expansion of the graphs is becoming increasingly difficult due to the large volume of data and must therefore be assisted or substituted by automated procedures which predict missing facts. The field of knowledge graph completion can be roughly divided into two categories: Link Prediction and Entity Alignment. In Link Prediction, machine learning models are trained to predict unknown facts between entities based on the known facts. Entity Alignment aims at identifying shared entities between graphs in order to link several such knowledge graphs based on some provided seed alignment pairs. In this thesis, we present important advances in the field of knowledge graph completion. For Entity Alignment, we show how to reduce the number of required seed alignments while maintaining performance by novel active learning techniques. We also discuss the power of textual features and show that graph-neural-network-based methods have difficulties with noisy alignment data. For Link Prediction, we demonstrate how to improve the prediction for unknown entities at training time by exploiting additional metadata on individual statements, often available in modern graphs. Supported with results from a large-scale experimental study, we present an analysis of the effect of individual components of machine learning models, e.g., the interaction function or loss criterion, on the task of link prediction. We also introduce a software library that simplifies the implementation and study of such components and makes them accessible to a wide research community, ranging from relational learning researchers to applied fields, such as life sciences. Finally, we propose a novel metric for evaluating ranking results, as used for both completion tasks. It allows for easier interpretation and comparison, especially in cases with different numbers of ranking candidates, as encountered in the de-facto standard evaluation protocols for both tasks.Mit der rasant fortschreitenden Digitalisierung des privaten, kommerziellen und öffentlichen Sektors werden immer grĂ¶ĂŸere Datenmengen verfĂŒgbar. Um aus diesen enormen Mengen an Rohdaten Erkenntnisse oder Wissen zu gewinnen, ist eine tiefgehende Analyse unerlĂ€sslich. Das immense Volumen erfordert hochautomatisierte Prozesse mit minimaler manueller Interaktion. In den letzten Jahren haben Methoden des maschinellen Lernens eine zentrale Rolle bei dieser Aufgabe eingenommen. Neben den einzelnen Datenpunkten spielen oft auch deren ZusammenhĂ€nge eine entscheidende Rolle, z.B. ob zwei Patienten miteinander verwandt sind oder ob sie vom selben Arzt behandelt werden. Daher ist das relationale Lernen ein wichtiger Forschungszweig, der untersucht, wie diese explizit verfĂŒgbaren strukturellen Informationen zwischen verschiedenen Datenpunkten nutzbar gemacht werden können. In letzter Zeit haben Graph Neural Networks an Bedeutung gewonnen. Diese können als eine Erweiterung von CNNs von regelmĂ€ĂŸigen Gittern auf allgemeine (unregelmĂ€ĂŸige) Graphen betrachtet werden. Wissensgraphen spielen eine wesentliche Rolle bei der Darstellung von Fakten ĂŒber EntitĂ€ten in maschinenlesbaren Form. Obwohl große Anstrengungen unternommen werden, so viele Fakten wie möglich in diesen Graphen zu speichern, bleiben sie oft unvollstĂ€ndig, d. h. es fehlen Fakten. Die manuelle ÜberprĂŒfung und Erweiterung der Graphen wird aufgrund der großen Datenmengen immer schwieriger und muss daher durch automatisierte Verfahren unterstĂŒtzt oder ersetzt werden, die fehlende Fakten vorhersagen. Das Gebiet der WissensgraphenvervollstĂ€ndigung lĂ€sst sich grob in zwei Kategorien einteilen: Link Prediction und Entity Alignment. Bei der Link Prediction werden maschinelle Lernmodelle trainiert, um unbekannte Fakten zwischen EntitĂ€ten auf der Grundlage der bekannten Fakten vorherzusagen. Entity Alignment zielt darauf ab, gemeinsame EntitĂ€ten zwischen Graphen zu identifizieren, um mehrere solcher Wissensgraphen auf der Grundlage einiger vorgegebener Paare zu verknĂŒpfen. In dieser Arbeit stellen wir wichtige Fortschritte auf dem Gebiet der VervollstĂ€ndigung von Wissensgraphen vor. FĂŒr das Entity Alignment zeigen wir, wie die Anzahl der benötigten Paare reduziert werden kann, wĂ€hrend die Leistung durch neuartige aktive Lerntechniken erhalten bleibt. Wir erörtern auch die LeistungsfĂ€higkeit von Textmerkmalen und zeigen, dass auf Graph-Neural-Networks basierende Methoden Schwierigkeiten mit verrauschten Paar-Daten haben. FĂŒr die Link Prediction demonstrieren wir, wie die Vorhersage fĂŒr unbekannte EntitĂ€ten zur Trainingszeit verbessert werden kann, indem zusĂ€tzliche Metadaten zu einzelnen Aussagen genutzt werden, die oft in modernen Graphen verfĂŒgbar sind. GestĂŒtzt auf Ergebnisse einer groß angelegten experimentellen Studie prĂ€sentieren wir eine Analyse der Auswirkungen einzelner Komponenten von Modellen des maschinellen Lernens, z. B. der Interaktionsfunktion oder des Verlustkriteriums, auf die Aufgabe der Link Prediction. Außerdem stellen wir eine Softwarebibliothek vor, die die Implementierung und Untersuchung solcher Komponenten vereinfacht und sie einer breiten Forschungsgemeinschaft zugĂ€nglich macht, die von Forschern im Bereich des relationalen Lernens bis hin zu angewandten Bereichen wie den Biowissenschaften reicht. Schließlich schlagen wir eine neuartige Metrik fĂŒr die Bewertung von Ranking-Ergebnissen vor, wie sie fĂŒr beide Aufgaben verwendet wird. Sie ermöglicht eine einfachere Interpretation und einen leichteren Vergleich, insbesondere in FĂ€llen mit einer unterschiedlichen Anzahl von Kandidaten, wie sie in den de-facto Standardbewertungsprotokollen fĂŒr beide Aufgaben vorkommen

    Bridging the gap between textual and formal business process representations

    Get PDF
    Tesi en modalitat de compendi de publicacionsIn the era of digital transformation, an increasing number of organizations are start ing to think in terms of business processes. Processes are at the very heart of each business, and must be understood and carried out by a wide range of actors, from both technical and non-technical backgrounds alike. When embracing digital transformation practices, there is a need for all involved parties to be aware of the underlying business processes in an organization. However, the representational complexity and biases of the state-of-the-art modeling notations pose a challenge in understandability. On the other hand, plain language representations, accessible by nature and easily understood by everyone, are often frowned upon by technical specialists due to their ambiguity. The aim of this thesis is precisely to bridge this gap: Between the world of the techni cal, formal languages and the world of simpler, accessible natural languages. Structured as an article compendium, in this thesis we present four main contributions to address specific problems in the intersection between the fields of natural language processing and business process management.A l’era de la transformaciĂł digital, cada vegada mĂ©s organitzacions comencen a pensar en termes de processos de negoci. Els processos sĂłn el nucli principal de tota empresa i, com a tals, han de ser fĂ cilment comprensibles per un ampli ventall de rols, tant perfils tĂšcnics com no-tĂšcnics. Quan s’adopta la transformaciĂł digital, Ă©s necessari que totes les parts involucrades estiguin ben informades sobre els protocols implantats com a part del procĂ©s de digitalitzaciĂł. Tot i aixĂČ, la complexitat i biaixos de representaciĂł dels llenguatges de modelitzaciĂł que actualment conformen l’estat de l’art sovint en dificulten la seva com prensiĂł. D’altra banda, les representacions basades en documentaciĂł usant llenguatge natural, accessibles per naturalesa i fĂ cilment comprensibles per tothom, moltes vegades sĂłn vistes com un problema pels perfils mĂ©s tĂšcnics a causa de la presĂšncia d’ambigĂŒitats en els textos. L’objectiu d’aquesta tesi Ă©s precisament el de superar aquesta distĂ ncia: La distĂ ncia entre el mĂłn dels llenguatges tĂšcnics i formals amb el dels llenguatges naturals, mĂ©s accessibles i senzills. Amb una estructura de compendi d’articles, en aquesta tesi presentem quatre grans lĂ­nies de recerca per adreçar problemes especĂ­fics en aquesta intersecciĂł entre les tecnologies d’anĂ lisi de llenguatge natural i la gestiĂł dels processos de negoci.Postprint (published version

    Exploiting general-purpose background knowledge for automated schema matching

    Full text link
    The schema matching task is an integral part of the data integration process. It is usually the first step in integrating data. Schema matching is typically very complex and time-consuming. It is, therefore, to the largest part, carried out by humans. One reason for the low amount of automation is the fact that schemas are often defined with deep background knowledge that is not itself present within the schemas. Overcoming the problem of missing background knowledge is a core challenge in automating the data integration process. In this dissertation, the task of matching semantic models, so-called ontologies, with the help of external background knowledge is investigated in-depth in Part I. Throughout this thesis, the focus lies on large, general-purpose resources since domain-specific resources are rarely available for most domains. Besides new knowledge resources, this thesis also explores new strategies to exploit such resources. A technical base for the development and comparison of matching systems is presented in Part II. The framework introduced here allows for simple and modularized matcher development (with background knowledge sources) and for extensive evaluations of matching systems. One of the largest structured sources for general-purpose background knowledge are knowledge graphs which have grown significantly in size in recent years. However, exploiting such graphs is not trivial. In Part III, knowledge graph em- beddings are explored, analyzed, and compared. Multiple improvements to existing approaches are presented. In Part IV, numerous concrete matching systems which exploit general-purpose background knowledge are presented. Furthermore, exploitation strategies and resources are analyzed and compared. This dissertation closes with a perspective on real-world applications
    corecore