273 research outputs found

    Representing Dataset Quality Metadata using Multi-Dimensional Views

    Data quality is commonly defined as fitness for use. The problem of identifying quality of data is faced by many data consumers. Data publishers often do not have the means to identify quality problems in their data. To make the task for both stakeholders easier, we have developed the Dataset Quality Ontology (daQ). daQ is a core vocabulary for representing the results of quality benchmarking of a linked dataset. It represents quality metadata as multi-dimensional and statistical observations using the Data Cube vocabulary. Quality metadata are organised as a self-contained graph, which can, e.g., be embedded into linked open datasets. We discuss the design considerations, give examples for extending daQ with custom quality metrics, and present use cases such as analysing data versions, browsing datasets by quality, and link identification. We finally discuss how data cube visualisation tools enable data publishers and consumers to better analyse the quality of their data. Comment: Preprint of a paper submitted to the forthcoming SEMANTiCS 2014, 4-5 September 2014, Leipzig, German

    Towards a Knowledge Graph based Speech Interface

    Applications which use human speech as an input require a speech interface with high recognition accuracy. The words or phrases in the recognised text are annotated with a machine-understandable meaning and linked to knowledge graphs for further processing by the target application. These semantic annotations of recognised words can be represented as subject-predicate-object triples, which collectively form a graph often referred to as a knowledge graph. This type of knowledge representation facilitates the use of speech interfaces with any spoken-input application: since the information is represented in a logical, semantic form, retrieval and storage can be performed using any standard web query language. In this work, we develop a methodology for linking speech input to knowledge graphs and study the impact of recognition errors on the overall process. We show that for a corpus with lower WER, the annotation and linking of entities to the DBpedia knowledge graph is considerable. DBpedia Spotlight, a tool for interlinking text documents with linked open data, is used to link the speech recognition output to the DBpedia knowledge graph. Such a knowledge-based speech recognition interface is useful for applications such as question answering or spoken dialog systems. Comment: Under Review in International Workshop on Grounding Language Understanding, Satellite of Interspeech 201
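The abstract above relies on word error rate (WER) without defining it. As a minimal sketch, assuming the standard definition (word-level edit distance between reference and recognised text, divided by the number of reference words), it can be computed as follows. The function name and the whitespace tokenisation are illustrative assumptions, not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace and compared exactly
    (no case folding or punctuation normalisation).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition a single substituted word in a four-word reference gives a WER of 0.25, and the value can exceed 1.0 when the hypothesis contains many inserted words.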

    Magnetic characteristics of the Ság-hegy volcanic complex, little Hungarian Plain

    The Ság-hegy volcanic complex is located in the Little Hungarian Plain Volcanic Field (LHPVF). 39Ar/40Ar geochronology gave an isochron age of 5.42 ± 0.06 My for Ság-hegy (Wijbrans et al. 2004). The evolution of the volcano included two clearly distinct events. At first, ascending magma encountered meteoric water in a fluvio-lacustrine environment. Fuel-coolant interaction (FCI) of water (water-saturated sediment) and magma led to the formation of a phreatomagmatic tuff ring. After the water supply was used up, the interior of the tephra ring was filled by a lava lake. Locally the tuff ring wall collapsed, and lava was subsequently able to flow out of the tuff ring. Due to intensive quarrying, most of the effusive rocks have been removed, giving excellent insight into the emplacement processes of feeder dykes, sills and lava lake remnants (Martin and Németh, 2004). Pyroclastic rocks include massive and bedded units of lapilli stone, lapilli tuff/tuff, as well as pyroclastic breccias. Varying proportions of accidental lithic clasts indicate excavation of basement rocks during the eruption. Juvenile clasts consist mainly of angular, blocky sideromelane glass shards with nearly equant shapes and a minor proportion of tachylite. A high amount of water within the system is evidenced by soft-sediment deformation and accretionary lapilli in the pyroclastic bedsets. Dune and antidune bedding, chute-and-pool structures, and grading and sorting features suggest that the tuff ring was gradually built up by base surge and intercalated fallout deposits. Subsequent to the phreatomagmatic stage, the inner crater was filled with a lava lake whose morphology was determined by the tephra deposits. At contacts with the pyroclastics, a chilled margin several cm thick is developed, which shows platy (onion-shaped) jointing. A large number of dykes and sills were injected into adjacent bedsets. 
These shallow intrusive bodies can be found throughout the whole complex, truncating and dissecting the pyroclastic units. In cases where pyroclastic units contained a high amount of water, this even included mingling with the wet tephra, leading to the formation of peperites. The uppermost units were represented by thick lava flows, which covered all underlying units. These rocks were already quarried out a century ago, except for a large Strombolian spatter cone, which is now exposed at the uppermost level of the quarry as a large sliced remnant including its large multiple feeder dyke. This setting offers a perfect opportunity to study the relationship between dyke and sill emplacement, with transitions from vertical to bedding-parallel geometries. Dimensions of the volcanic bodies range from cm-thick small apophyses extending from the lava lake into the pyroclastic rocks up to dykes and sills several m thick. We performed a detailed study on a section of pyroclastic rocks truncated by dykes and sills and have evaluated the magnetic characteristics. Preliminary results show that the magnetic susceptibility of all the pyroclastic units is in the range of ferrimagnetic susceptibility and varies between 2 and 20 × 10⁻³ SI (Fig. 1). Magnetic fabric anisotropy is generally low (< 5 %) and in the field of oblate fabric geometries; in bedded tuffs a significantly higher (5 to 10 %) but also oblate anisotropy is observed. Magnetic lineations indicate a consistent NE-directed (020) material transport for the whole succession. Remanence intensities are quite high, with values of 1 to 15 A/m. In the pyroclastic units, a stable magnetic remanence characterized by a single vector component has been measured; MDF values are in the range of 30 to 160 mT. The field vector has exclusively reversed polarity and steep inclination, which is in agreement with the paleofield direction and is therefore regarded as natural remanent magnetization acquired during deposition of the pyroclastic successions. 
In the dykes and sills, however, remanence directions scatter significantly, display geometries ranging from steep to flat orientations, and also show strong variations in declination. The coercivity of the magnetic carriers is significantly lower, as indicated by the lower MDF values, which are in the range of 8 to 30 mT in the dykes and 15 to 30 mT in the sills. Besides a minor contribution of a viscous component, the remanence vector in the dykes and sills is characterized by a stable single component. However, further investigations are needed to fully understand and interpret the result

    Luzzu - A Framework for Linked Data Quality Assessment

    With the increasing adoption and growth of the Linked Open Data cloud [9], with RDFa, Microformats and other ways of embedding data into ordinary Web pages, and with initiatives such as schema.org, the Web is currently being complemented with a Web of Data. Thus, the Web of Data shares many characteristics with the original Web of Documents, which also varies in quality. This heterogeneity makes it challenging to determine the quality of the data published on the Web and to subsequently make this information explicit to data consumers. The main contribution of this article is LUZZU, a quality assessment framework for Linked Open Data. Apart from providing quality metadata and quality problem reports that can be used for data cleaning, LUZZU is extensible: third-party metrics can be easily plugged into the framework. The framework does not rely on SPARQL endpoints, and is thus free of all the problems that come with them, such as query timeouts. Another advantage over SPARQL-based quality assessment frameworks is that metrics implemented in LUZZU can have more complex functionality than triple matching. Using the framework, we performed a quality assessment of a number of statistical linked datasets that are available on the LOD cloud. For this evaluation, 25 metrics from ten different dimensions were implemented
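To illustrate what a pluggable metric that goes beyond matching single triples might look like, here is a minimal sketch of a metric that keeps state while triples are streamed past it, with no SPARQL endpoint involved. The class name, the `compute`/`metric_value` method names, the `assess` driver and the choice of metric are all hypothetical illustrations; they are not LUZZU's actual interface or one of the metrics evaluated in the paper.

```python
from typing import Dict, Iterable, List, Set, Tuple

# A triple is modelled as plain (subject, predicate, object) strings.
Triple = Tuple[str, str, str]

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


class LabelledSubjectRatio:
    """Hypothetical metric: share of distinct subjects that carry an rdfs:label.

    The result depends on state accumulated over the whole stream,
    so it cannot be decided by looking at one triple in isolation.
    """

    def __init__(self) -> None:
        self.subjects: Set[str] = set()
        self.labelled: Set[str] = set()

    def compute(self, triple: Triple) -> None:
        subject, predicate, _ = triple
        self.subjects.add(subject)
        if predicate == RDFS_LABEL:
            self.labelled.add(subject)

    def metric_value(self) -> float:
        if not self.subjects:
            return 0.0
        return len(self.labelled) / len(self.subjects)


def assess(triples: Iterable[Triple], metrics: List[LabelledSubjectRatio]) -> Dict[str, float]:
    """Stream every triple once through all registered metrics."""
    for triple in triples:
        for metric in metrics:
            metric.compute(triple)
    return {type(m).__name__: m.metric_value() for m in metrics}
```

Because the driver only needs an iterable of triples, the same metric objects could be fed from a file parser or a dump reader; adding a further metric would mean adding another object with the same two methods to the list.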

    Towards the semantic formalization of science

    The past decades have witnessed a huge growth in scholarly information published on the Web, mostly in unstructured or semi-structured formats, which hampers scientific literature exploration and scientometric studies. Past studies on ontologies for structuring scholarly information focused on describing scholarly articles' components, such as document structure, metadata and bibliographies, rather than the scientific work itself. Over the past four years, we have been developing the Science Knowledge Graph Ontologies (SKGO), a set of ontologies for modeling the research findings in various fields of modern science, resulting in a knowledge graph. Here, we introduce this ontology suite and discuss the design considerations taken into account during its development. We expect that within the next years, a science knowledge graph is likely to become a crucial component for organizing and exploring scientific work