7 research outputs found
Proposing a Methodology for Designing an Enterprise Knowledge Graph to Ensure Interoperability Between Heterogeneous Data Sources
The main research goal of this thesis is to propose an approach for designing an Enterprise Knowledge Graph (EKG) to ensure interoperability between different heterogeneous sources, taking into account the already existing efforts and automation processes developed by ENGIE in their attempt to build a domain-based EKG. Reaching this goal demands a deep understanding of the existing state-of-the-art EKG approaches and their technologies, with a focus on data transformation and query methods, and finally a comparative presentation of any new findings in this new challenge of defining an end-to-end formula for EKG construction. The criteria for evaluating the different works were chosen to cover the following questions: (i) What are the implications and practical expectations of different design strategies to realize an EKG? (ii) How do those strategies affect semantic complexity and decrease or increase performance? (iii) Is it possible to maintain low latency and permanent updates?
Furthermore, our work was limited to one use case defined by ENGIE, exploring open accident data and road-map data as a starting point for EKG construction. We experiment with transforming data from heterogeneous data sources into a final unified RDF datastore ready to be used as the foundation of an EKG. Afterwards, we present the technical challenges, the vocabulary and the methods used to achieve a solution to the EKG definition problem.
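To make this transformation step concrete, the sketch below lifts rows of a hypothetical accidents CSV file into RDF with rdflib. The column names, namespace URI and file paths are illustrative assumptions, not ENGIE's actual pipeline.

    # Minimal sketch: lift rows of a hypothetical accidents CSV into RDF triples.
    # Column names, the namespace URI and the file paths are illustrative assumptions.
    import csv
    from rdflib import Graph, Literal, Namespace, RDF, URIRef

    EX = Namespace("http://example.org/accident/")          # hypothetical vocabulary
    GEO = Namespace("http://www.opengis.net/ont/geosparql#")

    g = Graph()
    g.bind("ex", EX)
    g.bind("geo", GEO)

    with open("accidents.csv", newline="") as f:             # hypothetical input file
        for row in csv.DictReader(f):
            acc = URIRef(EX[f"acc-{row['id']}"])
            g.add((acc, RDF.type, EX.Accident))
            g.add((acc, EX.severity, Literal(row["severity"])))
            # Store the location as a WKT literal so it can later be queried with GeoSPARQL
            # (simplified: the literal is attached directly to the feature).
            wkt = f"POINT({row['lon']} {row['lat']})"
            g.add((acc, GEO.asWKT, Literal(wkt, datatype=GEO.wktLiteral)))

    g.serialize(destination="accidents.ttl", format="turtle")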
Finally, a side goal of the thesis is to practically test and compare technical methods for data integration, enrichment and transformation. Most importantly, we test the ability to query geospatial information, which is a key element for this domain-based EKG. In this work, we present the different implementations of RDF stores that support a Geographic Query Language for RDF Data (GeoSPARQL), an OGC standard for the geographic and semantic representation of geographical data. Furthermore, we have formed our test data as a subset of ENGIE's big data and benchmarked them across the knowledge transformation and linkage phases, using state-of-the-art Semantic Web tools.
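As a hedged illustration of the kind of geospatial query evaluated here, the snippet below sends a GeoSPARQL distance filter to a GeoSPARQL-capable RDF store via SPARQLWrapper. The endpoint URL and the ex: vocabulary terms are assumptions made for the sketch, not the stores actually benchmarked.

    # Sketch: ask a GeoSPARQL-capable endpoint for accidents within ~500 m of a point.
    # The endpoint URL and the ex: property names are assumptions for illustration.
    from SPARQLWrapper import SPARQLWrapper, JSON

    sparql = SPARQLWrapper("http://localhost:7200/repositories/ekg")  # hypothetical endpoint
    sparql.setQuery("""
    PREFIX geo:  <http://www.opengis.net/ont/geosparql#>
    PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
    PREFIX uom:  <http://www.opengis.net/def/uom/OGC/1.0/>
    PREFIX ex:   <http://example.org/accident/>

    SELECT ?acc ?wkt WHERE {
      ?acc a ex:Accident ;
           geo:asWKT ?wkt .
      FILTER(geof:distance(?wkt, "POINT(2.35 48.85)"^^geo:wktLiteral, uom:metre) < 500)
    }
    """)
    sparql.setReturnFormat(JSON)
    for row in sparql.query().convert()["results"]["bindings"]:
        print(row["acc"]["value"], row["wkt"]["value"])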
Towards a Decentralized, Trusted, Intelligent and Linked Public Sector: A Report from the Greek Trenches
This paper is a progress report on our recent work on two applications that use Linked Data and Distributed Ledger technologies and aim to transform the Greek public sector into a decentralized, trusted, intelligent and linked organization. The first application is a re-engineering of Diavgeia, the Greek government portal for open and transparent public administration. The second application is Nomothesia, a new portal that we have built, which makes Greek legislation available on the Web as linked data to enable its effective use by citizens, legal professionals and software developers who would like to build new applications that utilize Greek legislation. The presented applications have been implemented without funding from any source and are available for free to any part of the Greek public sector that may want to use them. An important goal of this paper is to present the lessons learned from this effort.
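To illustrate what "available as linked data" means for a developer, the sketch below queries a legislation endpoint with SPARQLWrapper. The endpoint URL is hypothetical and the European Legislation Identifier (ELI) ontology is used purely as a stand-in for the portal's own vocabulary.

    # Sketch of how a developer might consume legislation published as linked data.
    # Hypothetical endpoint; the ELI ontology stands in for the portal's own vocabulary.
    from SPARQLWrapper import SPARQLWrapper, JSON

    sparql = SPARQLWrapper("http://example.org/nomothesia/sparql")  # hypothetical endpoint
    sparql.setQuery("""
    PREFIX eli: <http://data.europa.eu/eli/ontology#>

    SELECT ?act ?title ?date WHERE {
      ?act a eli:LegalResource ;
           eli:title ?title ;
           eli:date_document ?date .
    }
    ORDER BY DESC(?date)
    LIMIT 10
    """)
    sparql.setReturnFormat(JSON)
    for b in sparql.query().convert()["results"]["bindings"]:
        print(b["date"]["value"], b["title"]["value"])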
Study of the Suitability of Corncob Biochar as Electrocatalyst for Zn–Air Batteries
There has been a recent increasing interest in Zn–air batteries as an alternative to Li-ion batteries. Zn–air batteries possess some significant advantages; however, there are still problems to solve, especially related to tuning the properties of the air cathode, which should carry an inexpensive but efficient bifunctional oxygen reduction reaction (ORR) and oxygen evolution reaction (OER) electrocatalyst. Biochar can be an alternative, since it is a low-cost material, it exhibits electrical conductivity, and it can be used as a support for transition metal ions. Although there is a significant number of publications on biochars, there is a lack of data about biochar from raw biomass rich in hemicellulose, and biochar with a small number of heteroatoms, in order to report the pristine activity of the carbon phase. In this work, activated biochar has been made from corncobs. The biomass was first dried, minced into small pieces and pyrolyzed. Then, it was mixed with KOH and pyrolyzed a second time. The final product was characterized by various techniques and its electroactivity as a cathode was determined. Physicochemical characterization revealed that the biochar had a hierarchical pore structure, a moderate surface area of 92 m² g⁻¹, a carbon phase with a relatively low sp²/sp³ ratio close to one, and a limited amount of N and S, but a high number of oxygen groups. The graphitization was not complete, while the biochar had an ordered structure and contained significant O species. This biochar was used as an electrocatalyst for the ORR and OER in Zn–air batteries, where it demonstrated satisfactory performance. More specifically, it reached an open-circuit voltage of about 1.4 V, which was stable over a period of several hours, with a short-circuit current density of 142 mA cm⁻² and a maximum power density of 55 mW cm⁻². Charge–discharge cycling of the battery was achieved between 1.2 and 2.1 V at a constant current of 10 mA. These data show that corncob biochar demonstrated good performance as an electrocatalyst in Zn–air batteries, despite its low specific surface area and low sp²/sp³ ratio, owing to its rich oxygen sites, thus showing that electrocatalysis is a complex phenomenon and can be served by biochars of various origins.
The Phase-2 Upgrade of the CMS Data Acquisition
The High Luminosity LHC (HL-LHC) will start operating in 2027 after the third Long Shutdown (LS3), and is designed to provide an ultimate instantaneous luminosity of 7.5 × 10³⁴ cm⁻² s⁻¹, at the price of extreme pileup of up to 200 interactions per crossing. The number of overlapping interactions in HL-LHC collisions, their density, and the resulting intense radiation environment warrant an almost complete upgrade of the CMS detector. The upgraded CMS detector will be read out by approximately fifty thousand high-speed front-end optical links at an unprecedented data rate of up to 80 Tb/s, for an average expected total event size of approximately 8–10 MB. Following the present established design, the CMS trigger and data acquisition system will continue to feature two trigger levels, with only one synchronous hardware-based Level-1 Trigger (L1), consisting of custom electronic boards and operating on dedicated data streams, and a second level, the High Level Trigger (HLT), using software algorithms running asynchronously on standard processors and making use of the full detector data to select events for offline storage and analysis. The upgraded CMS data acquisition system will collect data fragments for Level-1 accepted events from the detector back-end modules at a rate up to 750 kHz, aggregate fragments corresponding to individual Level-1 accepts into events, and distribute them to the HLT processors where they will be filtered further. Events accepted by the HLT will be stored permanently at a rate of up to 7.5 kHz. This paper describes the baseline design of the DAQ and HLT systems for Phase-2 of CMS.
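The headline numbers can be cross-checked with simple arithmetic; the sketch below multiplies the quoted rates and event sizes to recover the event-building and storage bandwidths implied by the abstract. It is only a back-of-envelope consistency check, not a design figure.

    # Back-of-envelope check of the Phase-2 DAQ bandwidth figures quoted above.
    l1_rate_hz = 750e3                 # Level-1 accept rate
    hlt_rate_hz = 7.5e3                # HLT output (storage) rate
    event_sizes_bytes = (8e6, 10e6)    # quoted average event size, 8-10 MB

    for size in event_sizes_bytes:
        building = l1_rate_hz * size   # bytes/s into the event builder
        storage = hlt_rate_hz * size   # bytes/s to permanent storage
        print(f"event size {size/1e6:.0f} MB: "
              f"event building ~{building/1e12:.1f} TB/s (~{building*8/1e12:.0f} Tb/s), "
              f"storage ~{storage/1e9:.0f} GB/s")
    # -> roughly 6.0-7.5 TB/s (48-60 Tb/s) of event building and 60-75 GB/s to storage,
    #    consistent with the 80 Tb/s front-end readout figure quoted above.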
40 MHz Level-1 Trigger Scouting for CMS
The CMS experiment will be upgraded for operation at the High-Luminosity LHC to maintain and extend its physics performance under extreme pileup conditions. Upgrades will include an entirely new tracking system, supplemented by a track finder processor providing tracks at Level-1, as well as a high-granularity calorimeter in the endcap region. New front-end and back-end electronics will also provide the Level-1 trigger with high-resolution information from the barrel calorimeter and the muon systems. The upgraded Level-1 processors, based on powerful FPGAs, will be able to carry out sophisticated feature searches with resolutions often similar to the offline ones, while keeping pileup effects under control. In this paper, we discuss the feasibility of a system capturing Level-1 intermediate data at the beam-crossing rate of 40 MHz and carrying out online analyses based on these limited-resolution data. This 40 MHz scouting system would provide fast and virtually unlimited statistics for detector diagnostics, alternative luminosity measurements and, in some cases, calibrations. It has the potential to enable the study of otherwise inaccessible signatures, either too common to fit in the Level-1 accept budget, or with requirements which are orthogonal to “mainstream” physics, such as long-lived particles. We discuss the requirements and possible architecture of a 40 MHz scouting system, as well as some of the physics potential, and results from a demonstrator operated at the end of Run-2 using the Global Muon Trigger data from CMS. Plans for further demonstrators envisaged for Run-3 are also discussed.
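To give a feel for the scale of a 40 MHz scouting stream, the sketch below estimates the raw bandwidth for a purely hypothetical per-crossing payload; the payload size is an assumption for illustration only and is not taken from the paper.

    # Rough scale estimate for a 40 MHz scouting stream.
    # The per-crossing payload size is a purely hypothetical assumption.
    crossing_rate_hz = 40e6
    payload_bytes = 100        # hypothetical bytes of Level-1 objects per bunch crossing

    throughput = crossing_rate_hz * payload_bytes
    print(f"~{throughput/1e9:.0f} GB/s (~{throughput*8/1e9:.0f} Gb/s) sustained, "
          f"~{throughput*3600/1e12:.0f} TB per hour of data taking")
    # With 100 bytes per crossing this is already ~4 GB/s (~32 Gb/s), which is why the
    # feasibility and architecture of such a system are discussed in the paper.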
DAQExpert – the service to increase CMS data-taking efficiency
The Data Acquisition (DAQ) system of the Compact Muon Solenoid (CMS) experiment at the LHC is a complex system responsible for the data readout, event building and recording of accepted events. Its proper functioning plays a critical role in the data-taking efficiency of the CMS experiment. In order to ensure high availability and recover promptly in the event of hardware or software failures of the subsystems, an expert system, the DAQ Expert, has been developed. It aims at improving the data-taking efficiency, reducing human error in the operations and minimising the on-call expert demand. Introduced at the beginning of 2017, it assists the shift crew and the system experts in recovering from operational faults, streamlining the post-mortem analysis and, at the end of Run 2, triggering fully automatic recovery without human intervention. DAQ Expert analyses the real-time monitoring data originating from the DAQ components and the high-level trigger, updated every few seconds. It pinpoints data flow problems and recovers from them automatically or after operator approval. We analyse the CMS downtime in the 2018 run, focusing on what was improved with the introduction of automated recovery, and present the challenges and design of encoding the expert knowledge into automated recovery jobs. Furthermore, we demonstrate the web-based ReactJS interfaces that ensure an effective cooperation between the human operators in the control room and the automated recovery system. We report on the operational experience with automated recovery.
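The idea of encoding expert knowledge as automated checks can be pictured with a toy rule over monitoring snapshots. The snippet below is a conceptual sketch in Python, not the actual DAQExpert implementation; the monitoring fields, thresholds and recovery actions are invented for illustration.

    # Conceptual sketch of an expert-system rule over DAQ monitoring data.
    # Field names, thresholds and recovery actions are invented for illustration;
    # this is not the actual DAQExpert code.
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Snapshot:
        """One cycle of real-time monitoring data (simplified)."""
        l1_rate_hz: float
        dead_time_fraction: float
        stuck_subsystem: Optional[str]

    def diagnose(s: Snapshot) -> Optional[str]:
        """Return a suggested recovery action, or None if everything looks healthy."""
        if s.stuck_subsystem is not None:
            return f"issue a resync of {s.stuck_subsystem}"   # backpressure from one subsystem
        if s.l1_rate_hz > 0 and s.dead_time_fraction > 0.05:
            return "investigate dead time: check trigger throttling"
        return None

    action = diagnose(Snapshot(l1_rate_hz=100e3, dead_time_fraction=0.12, stuck_subsystem=None))
    if action:
        print("suggested recovery:", action)   # applied automatically or after operator approval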
Easing the Control System Application Development for CMS Detector Control System with Automatic Production Environment Reproduction
The Detector Control System (DCS) is one of the main pieces involved in the operation of the Compact Muon Solenoid (CMS) experiment at the LHC. The system is built using WinCC Open Architecture (WinCC OA) and the Joint Controls Project (JCOP) framework, which was developed on top of WinCC OA at CERN. Following the JCOP paradigm, CMS has developed its own framework, which is structured as a collection of more than 200 individually installable components, each providing a different feature. Every one of the systems that make up the CMS DCS is created by installing a different set of these components. By automating this process, we are able to quickly and efficiently create new systems in production or recreate problematic ones, but also to create development environments that are identical to the production ones. The latter results in smoother development and integration processes, as new or reworked components are developed and tested in production-like environments. Moreover, it allows the central DCS support team to easily reproduce systems that users and developers report as problematic, reducing the response time for bug fixing and improving the support quality.
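The automation described here can be pictured as replaying a recorded component manifest onto a fresh system. The sketch below is a conceptual Python illustration; the manifest format, file names and installer hook are assumptions, not the actual WinCC OA / JCOP tooling.

    # Conceptual sketch: recreate a DCS-like system from a recorded component manifest.
    # The manifest format, file names and the install step are assumptions for illustration;
    # the real CMS tooling is built on WinCC OA and the JCOP framework.
    import json

    def load_manifest(path: str) -> list:
        """A manifest records which framework components (and versions) a system uses."""
        with open(path) as f:
            return json.load(f)   # e.g. [{"name": "exampleComponent", "version": "3.2"}, ...]

    def reproduce(manifest: list, target: str) -> None:
        """Install the same set of components on a new (e.g. development) system."""
        for comp in manifest:
            print(f"[{target}] installing {comp['name']} {comp['version']}")
            # install_component(target, comp)  # placeholder for the actual installer hook

    reproduce(load_manifest("production_system.json"), target="dev-replica")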