8 research outputs found
Promocijas darbs
Elektroniskā versija nesatur pielikumusDarbā izstrādātas oriģinālas metodes, kas ļauj vizuālus uz paplašinātām UML veida grafu diagrammām balstītus rīkus izmantot praktisku ontoloģiju un semantisko datu vaicājumu veidošanai un attēlošanai. OWL ontoloģiju vizuālas modelēšanas jomā izveidoti līdzekļi konkrētam lietojumam specifiskas notācijas uzdošanai un izmantošanai, tādi ka: mehānisms lietotāja definētu notāciju uzdošanai, ontoloģiju vizualizācijas parametru ietvars, ontoloģiju eksporta modulis un uz gramatikām balstīta priekšāteikšanas metode. Darbā piedāvāts risinājums vizuālai bagātīgu datu vaicājumu veidošanai pār RDF datubāzēm, un to translēšanai uz tekstuālu SPARQL valodu, kurā pierakstītie vaicājumi var tikt tieši izpildīti pār RDF datu bāzēm. Atslēgvārdi: OWL, OWLGrEd, teksta priekšāteicējs, domēnspecifiska ontoloģiju attēlošana, SPARQL, vizuāli vaicājumi, ViziQuerThe doctoral thesis develops original methods that allow visual tools that are based on extended UML-style graph diagrams to be used for creating and visualising practical ontologies and semantic data queries. In the field of visual modeling of OWL ontologies, tools have been developed for creating modeling notations specific to particular applications, such as a mechanism for creating user-defined notations, a framework for ontology visualisation parameters, an ontology export module and a grammar-based auto-completion method. The doctoral thesis presents a solution for the visual formulation of rich data queries over RDF databases, and their translation into the standard textual SPARQL query language. Keywords: OWL, OWLGrEd, text auto-completion, Domain-Specific Ontology Representation, SPARQL, Visual Queries, ViziQuer
Recommended from our members
ADCN: An Anisotropic Density-Based Clustering Algorithm for Discovering Spatial Point Patterns with Noise
Density-based clustering algorithms such as DBSCAN have been widely used for spatial knowledge discovery as they offer several key advantages compared to other clustering algorithms. They can discover clusters with arbitrary shapes, are robust to noise and do not require prior knowledge (or estimation) of the number of clusters. The idea of using a scan circle centered at each point with a search radius Eps to find at least MinPts points as a criterion for deriving local density is easily understandable and sufficient for exploring isotropic spatial point patterns. However, there are many cases that cannot be adequately captured this way, particularly if they involve linear features or shapes with a continuously changing density such as a spiral. In such cases, DBSCAN tends to either create an increasing number of small clusters or add noise points into large clusters. Therefore, in this paper, we propose a novel anisotropic density-based clustering algorithm (ADCN). To motivate our work, we introduce synthetic and real-world cases that cannot be sufficiently handled by DBSCAN (and OPTICS). We then present our clustering algorithm and test it with a wide range of cases. We demonstrate that our algorithm can perform as equally well as DBSCAN in cases that do not explicitly benefit from an anisotropic perspective and that it outperforms DBSCAN in cases that do. We show that our approach has the same time complexity as DBSCAN and OPTICS, namely O(n log n) when using a spatial index and O(n 2 ) otherwise. We provide an implementation and test the runtime over multiple cases. Finally, we apply DBSCAN, OPTICS, and ADCN to the task of extracting urban areas of interest (AOI) from geotagged photos in six cities. Visual comparison shows that, comparing to DBSCAN and OPTICS, ADCN is inclined to extract AOIs with linear shapes which follow the underline road networks. ADCN also turns out to connect clusters when the spatial distribution of them shows similar directions
Linked Data Supported Information Retrieval
Um Inhalte im World Wide Web ausfindig zu machen, sind Suchmaschienen nicht mehr wegzudenken. Semantic Web und Linked Data Technologien ermöglichen ein detaillierteres und eindeutiges Strukturieren der Inhalte und erlauben vollkommen neue Herangehensweisen an die Lösung von Information Retrieval Problemen. Diese Arbeit befasst sich mit den Möglichkeiten, wie Information Retrieval Anwendungen von der Einbeziehung von Linked Data profitieren können. Neue Methoden der computer-gestützten semantischen Textanalyse, semantischen Suche, Informationspriorisierung und -visualisierung werden vorgestellt und umfassend evaluiert. Dabei werden Linked Data Ressourcen und ihre Beziehungen in die Verfahren integriert, um eine Steigerung der Effektivität der Verfahren bzw. ihrer Benutzerfreundlichkeit zu erzielen. Zunächst wird eine Einführung in die Grundlagen des Information Retrieval und Linked Data gegeben. Anschließend werden neue manuelle und automatisierte Verfahren zum semantischen Annotieren von Dokumenten durch deren Verknüpfung mit Linked Data Ressourcen vorgestellt (Entity Linking). Eine umfassende Evaluation der Verfahren wird durchgeführt und das zu Grunde liegende Evaluationssystem umfangreich verbessert. Aufbauend auf den Annotationsverfahren werden zwei neue Retrievalmodelle zur semantischen Suche vorgestellt und evaluiert. Die Verfahren basieren auf dem generalisierten Vektorraummodell und beziehen die semantische Ähnlichkeit anhand von taxonomie-basierten Beziehungen der Linked Data Ressourcen in Dokumenten und Suchanfragen in die Berechnung der Suchergebnisrangfolge ein. Mit dem Ziel die Berechnung von semantischer Ähnlichkeit weiter zu verfeinern, wird ein Verfahren zur Priorisierung von Linked Data Ressourcen vorgestellt und evaluiert. Darauf aufbauend werden Visualisierungstechniken aufgezeigt mit dem Ziel, die Explorierbarkeit und Navigierbarkeit innerhalb eines semantisch annotierten Dokumentenkorpus zu verbessern. Hierfür werden zwei Anwendungen präsentiert. Zum einen eine Linked Data basierte explorative Erweiterung als Ergänzung zu einer traditionellen schlüsselwort-basierten Suchmaschine, zum anderen ein Linked Data basiertes Empfehlungssystem
Actes des 29es Journées Francophones d'Ingénierie des Connaissances, IC 2018
International audienc
Data Science and Knowledge Discovery
Data Science (DS) is gaining significant importance in the decision process due to a mix of various areas, including Computer Science, Machine Learning, Math and Statistics, domain/business knowledge, software development, and traditional research. In the business field, DS's application allows using scientific methods, processes, algorithms, and systems to extract knowledge and insights from structured and unstructured data to support the decision process. After collecting the data, it is crucial to discover the knowledge. In this step, Knowledge Discovery (KD) tasks are used to create knowledge from structured and unstructured sources (e.g., text, data, and images). The output needs to be in a readable and interpretable format. It must represent knowledge in a manner that facilitates inferencing. KD is applied in several areas, such as education, health, accounting, energy, and public administration. This book includes fourteen excellent articles which discuss this trending topic and present innovative solutions to show the importance of Data Science and Knowledge Discovery to researchers, managers, industry, society, and other communities. The chapters address several topics like Data mining, Deep Learning, Data Visualization and Analytics, Semantic data, Geospatial and Spatio-Temporal Data, Data Augmentation and Text Mining
Machine Learning Algorithm for the Scansion of Old Saxon Poetry
Several scholars designed tools to perform the automatic scansion of poetry in many languages, but none of these tools
deal with Old Saxon or Old English. This project aims to be a first attempt to create a tool for these languages. We
implemented a Bidirectional Long Short-Term Memory (BiLSTM) model to perform the automatic scansion of Old Saxon
and Old English poems. Since this model uses supervised learning, we manually annotated the Heliand manuscript, and
we used the resulting corpus as labeled dataset to train the model. The evaluation of the performance of the algorithm
reached a 97% for the accuracy and a 99% of weighted average for precision, recall and F1 Score. In addition, we tested
the model with some verses from the Old Saxon Genesis and some from The Battle of Brunanburh, and we observed that
the model predicted almost all Old Saxon metrical patterns correctly misclassified the majority of the Old English input
verses
Qualität in der Inhaltserschließung
This edited volume deals with issues relating to the quality of subject cataloging in the digital age, where heterogenous articles from different processes meet, and attempts to define important quality standards. Topics range from metadata and the cataloging policies of the German National Library, the GND, and the head offices of the German library association, to the presentation of a range of different projects, such as QURATOR and SoNAR