9 research outputs found

    Ontology Generation with the help of Chatbot

    Knowledge graphs facilitate a shared and unambiguous understanding of knowledge across applications and people. Their formal structure allows reasoning techniques to derive implicit knowledge from the explicitly stated knowledge. For these reasons, they have been adopted in various domains, ranging from manufacturing to the health sciences. Several big-tech companies, such as Google, Microsoft and Amazon, develop their own knowledge graphs to structure their data. However, to leverage the full potential of a knowledge graph, a well-designed schema, i.e., an ontology, is a prerequisite. Achieving this requires good communication between the domain experts and the ontology developers during the ontology development process, which is not always feasible. In this thesis, I have developed an editor for ontology management that allows a domain expert to produce an RDFS ontology without the contribution of an ontology developer. To accomplish this, the system first asks the domain expert for the competency questions that they would like the knowledge graph to be able to answer. Based on these competency questions, and by posing a set of additional clarification questions to the user, the system automatically generates the ontology. During this process, it also gives the user the option to reuse terms from external ontologies (currently, only from EMBL's European Bioinformatics Institute), respecting the conservative extension constraint, i.e., without affecting their semantics. The project makes use of a wide range of technologies: through a user-friendly UI, the user's questions are processed and the ontology is created according to the user's needs. The software has also been evaluated by postgraduate students of the Knowledge Technologies course at the National and Kapodistrian University of Athens. The overall aim is to contribute the first steps toward future studies on how users can create their ontologies correctly, regardless of their depth of knowledge.
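    As a rough, hypothetical sketch of the kind of RDFS output such an editor might emit, the snippet below builds a tiny ontology with rdflib; the class and property names, the example namespace, and the helper function are all invented for illustration, since the abstract does not describe the system's internals.

        # Minimal, hypothetical sketch of emitting an RDFS ontology with rdflib.
        # It assumes the clarification dialogue has already yielded class and
        # property candidates; the names and the EX namespace are illustrative.
        from rdflib import Graph, Literal, Namespace, RDF, RDFS

        EX = Namespace("http://example.org/onto#")

        def build_rdfs_ontology(classes, properties):
            """classes: iterable of labels; properties: (label, domain, range) tuples."""
            g = Graph()
            g.bind("ex", EX)
            for label in classes:
                cls = EX[label.replace(" ", "")]
                g.add((cls, RDF.type, RDFS.Class))
                g.add((cls, RDFS.label, Literal(label)))
            for label, dom, rng in properties:
                prop = EX[label]
                g.add((prop, RDF.type, RDF.Property))
                g.add((prop, RDFS.domain, EX[dom]))
                g.add((prop, RDFS.range, EX[rng]))
                g.add((prop, RDFS.label, Literal(label)))
            return g

        # E.g., derived from the competency question "Which proteins interact with a gene?"
        g = build_rdfs_ontology(
            classes=["Protein", "Gene"],
            properties=[("interactsWith", "Protein", "Gene")],
        )
        print(g.serialize(format="turtle"))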

    Geospatial crowdsourced data fitness analysis for spatial data infrastructure based disaster management actions

    The reporting of disasters has changed from official media reports to citizen reporters who are at the disaster scene. This kind of crowd-based reporting, related to disasters or any other events, is often identified as 'Crowdsourced Data' (CSD). CSD is freely and widely available thanks to current technological advancements, but its quality is often problematic because it is created by citizens of varying skills and backgrounds. CSD is generally unstructured, and its quality remains poorly defined. Moreover, location information may be missing from CSD, and the quality of any available locations is uncertain. Traditional data quality assessment methods and parameters are also often incompatible with the unstructured nature of CSD, owing to its undocumented provenance and missing metadata. Although other research has identified credibility and relevance as possible CSD quality assessment indicators, the available assessment methods for these indicators are still immature. In the 2011 Australian floods, citizens and disaster management administrators used the Ushahidi Crowdmap platform and the Twitter social media platform to extensively communicate flood-related information, including hazards, evacuations, help services, road closures and property damage. This research designed a CSD quality assessment framework and tested the quality of the 2011 Australian floods' Ushahidi Crowdmap and Twitter data. In particular, it explored location availability and location quality assessment, semantic extraction of hidden location toponyms, and the analysis of the credibility and relevance of reports. The research was conducted using a Design Science (DS) research method, which is often utilised in Information Science (IS) research. The location availability assessment compared the locations in the Ushahidi Crowdmap and Twitter data against three reference datasets: Google Maps, OpenStreetMap (OSM) and the Queensland Department of Natural Resources and Mines' (QDNRM) road data. Missing locations were semantically extracted using Natural Language Processing (NLP) and gazetteer lookup techniques. The credibility of the Ushahidi Crowdmap dataset was assessed using a naive Bayesian Network (BN) model commonly utilised in spam email detection. CSD relevance was assessed by adapting Geographic Information Retrieval (GIR) relevance assessment techniques, which are also utilised in the IT sector. Thematic and geographic relevance were assessed using a Term Frequency-Inverse Document Frequency Vector Space Model (TF-IDF VSM) and NLP based on semantic gazetteers. Results of the CSD location comparison showed that the combined use of non-authoritative and authoritative data improved location determination. The semantic location analysis indicated some improvement in the location availability of the tweets and Crowdmap data; however, the quality of new locations remained uncertain. The credibility analysis revealed that spam email detection approaches are feasible for CSD credibility detection, although it was critical to train the model in a controlled environment using structured training, including modified training samples. The use of GIR techniques for CSD relevance analysis provided promising results: a separate relevance-ranked list of the same CSD, prepared through manual analysis, generally agreed with the system's ranking, which indicated the system's potential to analyse relevance in a similar way to humans. This research showed that CSD fitness analysis can potentially improve the accuracy, reliability and currency of CSD and may be utilised to fill information gaps in authoritative sources. The integrated and autonomous CSD qualification framework presented provides a guide for flood disaster first responders and could be adapted to support other forms of emergencies.
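    As an informal illustration of the TF-IDF VSM relevance step, the sketch below ranks report texts against a flood-related query with scikit-learn; the sample reports and the query are invented, and the thesis pipeline (gazetteer-based NLP, geographic relevance) is not reproduced here.

        # Hypothetical sketch of thematic relevance ranking with a TF-IDF
        # vector space model, in the spirit of the GIR step described above.
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.metrics.pairwise import cosine_similarity

        reports = [  # invented stand-ins for Ushahidi/Twitter reports
            "Bruce Highway closed near Gympie due to rising floodwater",
            "Great coffee this morning at the local cafe",
            "Evacuation centre open at the showgrounds, sandbags available",
        ]
        query = "flood road closure evacuation help"

        vectorizer = TfidfVectorizer(stop_words="english")
        doc_matrix = vectorizer.fit_transform(reports)   # one row per report
        query_vec = vectorizer.transform([query])
        scores = cosine_similarity(query_vec, doc_matrix)[0]

        # Rank reports by thematic relevance, most relevant first.
        for score, text in sorted(zip(scores, reports), reverse=True):
            print(f"{score:.3f}  {text}")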

    Probabilistic techniques in semantic mapping for mobile robotics

    Semantic maps are representations of the world that allow a robot to understand not only the spatial aspects of its workspace, but also the meaning of its elements (objects, rooms, etc.) and how humans interact with them (e.g. functionalities, events and relations). To achieve this, a semantic map adds to purely spatial representations, such as geometric or topological maps, meta-information about the types of elements and relations that can be found in the working environment. This meta-information, called semantic or common-sense knowledge, is typically encoded in Knowledge Bases. An example of this kind of information could be: "refrigerators are large, rectangular objects, usually located in kitchens, which can contain perishable food and medication". Encoding and handling this semantic knowledge allows the robot to reason about the information obtained from a given workspace, as well as to infer new information, in order to efficiently execute high-level tasks such as "hey robot! please take grandma her medication". This thesis proposes the use of probabilistic techniques to build and maintain semantic maps, which offers three main advantages over traditional approaches: i) it handles uncertainty (arising from the robot's imprecise sensors and from the models employed), ii) it provides coherent representations of the environment by exploiting the contextual relations among the observed elements (e.g. refrigerators are usually found in kitchens) from a holistic point of view, and iii) it produces certainty values that reflect how accurate the robot's understanding of its environment is. Specifically, the contributions presented can be grouped into two main topics. The first set of contributions addresses the problem of object and/or room recognition, since semantic mapping systems must rely on reliable recognition algorithms to build valid representations. To that end, the use of Probabilistic Graphical Models (PGMs) has been explored in order to exploit the contextual relations among objects and/or rooms while handling the uncertainty inherent to the recognition problem, along with the use of Knowledge Bases to improve their performance in different ways, e.g., detecting incoherent results, providing prior information, reducing the complexity of probabilistic inference algorithms, generating synthetic training samples, enabling learning from past experiences, etc. The second group of contributions accommodates the probabilistic results of the developed recognition algorithms in a novel semantic representation, called the Multiversal Semantic Map (MvSmap). This map manages multiple interpretations of the robot's workspace, called universes, which are annotated with the probability of being the correct one according to the robot's current knowledge. Thus, this approach provides a grounded belief about the accuracy of the robot's understanding of its environment, allowing it to operate in a more efficient and coherent way. The proposed probabilistic algorithms have been thoroughly tested and compared with other current, innovative approaches using state-of-the-art datasets. Additionally, this thesis contributes two datasets, UMA-Offices and Robot@Home, which contain sensory information captured in different office and home environments, as well as two software tools, the Undirected Probabilistic Graphical Models in C++ library (UPGMpp) and the Object Labeling Toolkit (OLT), for working with Probabilistic Graphical Models and for processing datasets, respectively.
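    To make the multiverse idea concrete, here is a toy sketch (not the thesis code) that scores every joint labeling of two detected objects using per-object recognition scores and a pairwise context compatibility table, then normalizes the scores into probabilities over "universes"; all numbers are invented.

        # Toy illustration of maintaining multiple weighted interpretations
        # ("universes") of a scene; the scores below are invented.
        from itertools import product

        labels = ["fridge", "cabinet"]

        # Unary recognition scores per detected object (e.g., from a classifier).
        unary = {
            "obj1": {"fridge": 0.7, "cabinet": 0.3},
            "obj2": {"fridge": 0.2, "cabinet": 0.8},
        }

        # Pairwise context compatibility (e.g., two adjacent fridges are rare).
        pairwise = {
            ("fridge", "fridge"): 0.1, ("fridge", "cabinet"): 0.9,
            ("cabinet", "fridge"): 0.9, ("cabinet", "cabinet"): 0.6,
        }

        # Score each universe as the product of unary and pairwise factors.
        universes = {}
        for l1, l2 in product(labels, repeat=2):
            universes[(l1, l2)] = unary["obj1"][l1] * unary["obj2"][l2] * pairwise[(l1, l2)]

        # Normalize so the universes' probabilities sum to one.
        total = sum(universes.values())
        for assignment, score in sorted(universes.items(), key=lambda kv: -kv[1]):
            print(assignment, round(score / total, 3))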

    Strategies for Managing Linked Enterprise Data

    Data, information and knowledge have become key assets of our 21st-century economy. As a result, data and knowledge management have become key tasks with regard to sustainable development and business success. Often, knowledge is not explicitly represented: it resides in the minds of people or is scattered among a variety of data sources. Knowledge is inherently associated with semantics that convey its meaning to a human or machine agent. The Linked Data concept facilitates the semantic integration of heterogeneous data sources. However, we still lack an effective knowledge integration strategy applicable to enterprise scenarios, one which balances large amounts of data stored in legacy information systems and data lakes with tailored domain-specific ontologies that formally describe real-world concepts. In this thesis we investigate strategies for managing linked enterprise data, analyzing how actionable knowledge can be derived from enterprise data by leveraging knowledge graphs. Actionable knowledge provides valuable insights, supports decision makers with clear, interpretable arguments, and keeps its inference processes explainable. The benefits of employing actionable knowledge and a coherent strategy for managing it span from a holistic semantic representation layer of enterprise data, i.e., representing numerous data sources as one consistent and integrated knowledge source, to unified interaction mechanisms with other systems that are able to effectively and efficiently leverage such actionable knowledge. Pursuing this goal, several challenges have to be addressed on different conceptual levels, i.e., means for representing knowledge, semantic data integration of raw data sources and subsequent knowledge extraction, communication interfaces, and implementation. In order to tackle those challenges we present the concept of Enterprise Knowledge Graphs (EKGs) and describe their characteristics and advantages compared to existing approaches. We study each challenge with regard to using EKGs and demonstrate their efficiency. In particular, EKGs are able to reduce the semantic data integration effort when processing large-scale heterogeneous datasets. Having built a consistent logical integration layer that hides this heterogeneity behind the scenes, EKGs then unify query processing and enable effective communication interfaces for other enterprise systems. The achieved results allow us to conclude that strategies for managing linked enterprise data based on EKGs exhibit reasonable performance, comply with enterprise requirements, and ensure integrated data and knowledge management throughout the data life cycle.
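    As an informal sketch of the "numerous data sources as one consistent knowledge source" idea, the snippet below merges two hypothetical RDF sources into a single rdflib graph and answers one SPARQL query over both; the vocabulary and the data are invented, not the thesis' actual model.

        # Hypothetical sketch: two heterogeneous sources merged into one
        # logical graph that is queried uniformly (invented vocabulary/data).
        from rdflib import Graph

        crm_data = """
        @prefix ex: <http://example.org/ekg#> .
        ex:acme a ex:Customer ; ex:name "ACME Corp" .
        """
        erp_data = """
        @prefix ex: <http://example.org/ekg#> .
        ex:order42 a ex:Order ; ex:placedBy ex:acme .
        """

        ekg = Graph()
        ekg.parse(data=crm_data, format="turtle")   # integrate source 1
        ekg.parse(data=erp_data, format="turtle")   # integrate source 2

        # One query spanning facts that originated in different systems.
        query = """
        PREFIX ex: <http://example.org/ekg#>
        SELECT ?order ?name WHERE {
            ?order a ex:Order ; ex:placedBy ?c .
            ?c ex:name ?name .
        }
        """
        for order, name in ekg.query(query):
            print(order, name)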

    A multi-strategy methodology for ontology integration and reuse. Integrating large and heterogeneous knowledge bases in the rise of Big Data

    Today's revolutionary web, i.e., the Semantic Web, has augmented the previous one by promoting common data formats and exchange protocols in order to provide a framework that allows data to be shared and reused across application, enterprise, and community boundaries. This revolution, along with the increasing digitization of the world, has led to a high availability of knowledge models, viz., formal representations of the concepts, and the relations between concepts, underlying a certain universe of discourse or knowledge domain. These models span a wide range of topics, fields of study and applications, from the biomedical domain to advanced manufacturing, and are mostly heterogeneous from one another at different levels. As this new revolution has unfolded, a major challenge has come into sight: addressing the main objectives of the Semantic Web, the sharing and reuse of data, demands effective and efficient methodologies to mediate between models characterized by such heterogeneity. Since ontologies are the de facto standard for representing and sharing knowledge models over the web, this doctoral thesis presents a comprehensive methodology for ontology integration and reuse based on various matching techniques. The proposed approach is supported by an ad hoc software framework whose aim is to ease the creation of new ontologies by promoting the reuse of existing ones and by automating, as much as possible, the whole ontology construction procedure.
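    One of the simplest matching techniques such a methodology could build on is label similarity; the sketch below aligns class labels from two hypothetical ontologies with a string-similarity threshold (the labels and the 0.8 threshold are illustrative; real matchers combine lexical, structural and semantic evidence).

        # Toy label-based ontology matcher; invented inputs, not the
        # thesis' actual configuration.
        from difflib import SequenceMatcher

        onto_a = ["Person", "Organisation", "Publication"]
        onto_b = ["person", "organization", "journal article"]

        def similarity(a: str, b: str) -> float:
            return SequenceMatcher(None, a.lower(), b.lower()).ratio()

        matches = [
            (a, b, round(similarity(a, b), 2))
            for a in onto_a
            for b in onto_b
            if similarity(a, b) >= 0.8  # keep only confident correspondences
        ]
        print(matches)  # e.g., [('Person', 'person', 1.0), ('Organisation', 'organization', 0.92)]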

    Ontological Analysis on the Web (Онтологічний аналіз у Web)

    The monograph is devoted to the development, study and use of ontologies in distributed applications. It analyses models and methods for representing ontologies and their relation to Semantic Web technologies. The work examines issues concerning the acquisition of ontological knowledge from open Web sources, Wiki resources and natural-language documents. It considers the role of ontological analysis in making search engines, Web services and software agents more intelligent, and gives examples of the application of ontologies in education and science. The work is aimed at researchers and scientists working on distributed intelligent systems.

    Alarm Management Module for Industry 4.0 (Módulo de Gestão de alarmes na Indústria 4.0)

    With constant technological evolution, particularly in industry, monitoring methods have had to evolve in order to guarantee the security and proper functioning of all the domains involved. With the aim of contributing to the monitoring of companies with a high degree of technological involvement, GECAD started the development of a project, Alarm Management Model for 4.0 Industry, which consists of developing a system capable of managing alarms from several different sources and of providing clear and objective access to information about them. The main focus of the project is to define, based on the detected alarms, priorities among them, individually or in combination, and to give the user a simple and effective experience when using the developed system.
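    A minimal sketch of the combined-priority idea follows; the alarm fields, the escalation rule and the weighting are invented for illustration, since the abstract does not disclose the project's actual model.

        # Hypothetical sketch: alarms from several sources ranked by a
        # combined priority (severity escalated when related alarms co-occur).
        from dataclasses import dataclass

        @dataclass
        class Alarm:
            source: str      # e.g., "PLC-3", "HVAC", "power-meter"
            kind: str        # e.g., "overheat", "power-drop"
            severity: int    # 1 (low) .. 5 (critical)

        def combined_priority(alarm: Alarm, active: list[Alarm]) -> float:
            # Invented rule: escalate when other sources simultaneously
            # report the same kind of problem.
            related = sum(1 for a in active if a.kind == alarm.kind and a is not alarm)
            return alarm.severity * (1 + 0.5 * related)

        active = [
            Alarm("PLC-3", "overheat", 3),
            Alarm("HVAC", "overheat", 2),
            Alarm("power-meter", "power-drop", 4),
        ]
        for a in sorted(active, key=lambda a: -combined_priority(a, active)):
            print(f"{combined_priority(a, active):.1f}  {a.source}: {a.kind}")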

    Semantic rules representation in controlled natural language in FluentEditor

    No full text