Representing Dataset Quality Metadata using Multi-Dimensional Views
Data quality is commonly defined as fitness for use. The problem of
identifying quality of data is faced by many data consumers. Data publishers
often do not have the means to identify quality problems in their data. To make
the task for both stakeholders easier, we have developed the Dataset Quality
Ontology (daQ). daQ is a core vocabulary for representing the results of
quality benchmarking of a linked dataset. It represents quality metadata as
multi-dimensional and statistical observations using the Data Cube vocabulary.
Quality metadata are organised as a self-contained graph, which can, e.g., be
embedded into linked open datasets. We discuss the design considerations, give
examples for extending daQ by custom quality metrics, and present use cases
such as analysing data versions, browsing datasets by quality, and link
identification. Finally, we discuss how data cube visualisation tools enable
data publishers and consumers to better analyse the quality of their data.
Comment: Preprint of a paper submitted to the forthcoming SEMANTiCS 2014, 4-5
September 2014, Leipzig, Germany
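To illustrate the idea of quality metadata as multi-dimensional observations, here is a minimal Python sketch. It does not use the daQ or Data Cube vocabularies themselves: the class `QualityObservation`, its field names, the helper `average_by`, and the sample metric names and values are all assumptions made for this example. Each observation is one measured value positioned along several dimensions (dataset, version, quality dimension, metric), which is what makes use cases such as comparing data versions possible.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityObservation:
    """One quality measurement; all field names are hypothetical."""
    dataset: str     # identifier of the assessed dataset
    version: str     # dataset version the metric was computed on
    dimension: str   # quality dimension the metric belongs to
    metric: str      # name of the metric
    value: float     # measured value, here normalised to [0, 1]


def average_by(observations, key):
    """Group observations by one attribute and average their values."""
    groups = defaultdict(list)
    for obs in observations:
        groups[getattr(obs, key)].append(obs.value)
    return {name: sum(values) / len(values) for name, values in groups.items()}


# Invented sample data, for illustration only.
OBSERVATIONS = [
    QualityObservation("ex:dataset", "v1", "accessibility", "dereferenceability", 0.5),
    QualityObservation("ex:dataset", "v1", "consistency", "misplaced-classes", 0.75),
    QualityObservation("ex:dataset", "v2", "accessibility", "dereferenceability", 1.0),
    QualityObservation("ex:dataset", "v2", "consistency", "misplaced-classes", 0.75),
]

if __name__ == "__main__":
    # Compare versions, then look at the same data per quality dimension.
    print(average_by(OBSERVATIONS, "version"))
    print(average_by(OBSERVATIONS, "dimension"))
```

Grouping the same observations by a different attribute corresponds to viewing the cube along a different axis, which is the kind of analysis the abstract attributes to data cube visualisation tools.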
Towards a Knowledge Graph based Speech Interface
Applications which use human speech as an input require a speech interface
with high recognition accuracy. The words or phrases in the recognised text are
annotated with a machine-understandable meaning and linked to knowledge graphs
for further processing by the target application. These semantic annotations of
recognised words can be represented as subject-predicate-object triples, which
collectively form a graph often referred to as a knowledge graph. This type of
knowledge representation makes it easier to use speech interfaces with any
spoken-input application: because the information is represented in a logical,
semantic form, it can be stored and retrieved using any standard web query
language. In this work, we develop a methodology for linking speech input to
knowledge graphs and study the impact of recognition errors on the overall
process. We show that for a corpus with a lower word error rate (WER), the
annotation and linking of entities to the DBpedia knowledge graph is
considerable. DBpedia Spotlight, a tool for interlinking text documents with
linked open data, is used to link the speech recognition output to the DBpedia
knowledge graph. Such a knowledge-based speech recognition interface is useful
for applications such as question answering or spoken dialog systems.
Comment: Under Review in International Workshop on Grounding Language
Understanding, Satellite of Interspeech 201
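The abstract relies on WER as its measure of recognition quality. As a reminder of how that figure is usually computed, the sketch below uses the standard definition: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the recogniser output, divided by the number of reference words. The function name `word_error_rate` and the sample sentences are assumptions for illustration; the abstract does not describe the authors' own evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    reference = "berlin is the capital of germany"
    hypothesis = "berlin is capital of germany"
    print(word_error_rate(reference, hypothesis))
```

A dropped or misrecognised entity name counts as a single word error here, yet it can remove the only mention that an entity linker could have matched, which is one reason to study linking quality alongside WER.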
Depositional record of a Pliocene nested multivent maar complex at Fekete-hegy, Pannonian Basin, western Hungary.
No abstract available
Magnetic characteristics of the Ság-hegy volcanic complex, Little Hungarian Plain
The Ság-hegy volcanic complex is located in the Little Hungarian Plain Volcanic Field (LHPVF). 39Ar/40Ar geochronology gave an isochron age of 5.42 ± 0.06 My for the Ság-hegy (Wijbrans et al. 2004). Evolution of the volcano included two clearly distinct events. At first, ascending magma encountered meteoric water in a fluvio-lacustrine environment. Fuel-coolant interaction (FCI) of water (water-saturated sediment) and magma led to the formation of a phreatomagmatic tuff ring. After the water supply was used up, the interior of the tephra ring was filled by a lava lake. Locally, the tuff ring wall collapsed and lava was subsequently able to flow out of the tuff ring. Due to intensive quarrying, most of the effusive rocks have been removed, giving excellent insight into the emplacement processes of feeder dykes, sills and the lava lake remnant (Martin and Németh, 2004). Pyroclastic rocks include massive and bedded units of lapilli stone, lapilli tuff/tuff as well as pyroclastic breccias. Varying proportions of accidental lithic clasts indicate excavation of basement rocks during the eruption. Juvenile clasts consist mainly of angular, blocky sideromelane glass shards with nearly equant shapes and a minor proportion of tachylite. A high amount of water within the system is evidenced by soft-sediment deformation and accretionary lapilli in the pyroclastic bedsets. Dune and antidune bedding, chute-and-pool structures, and grading and sorting features suggest that the tuff ring was gradually built up by base surge and intercalated fallout deposits. Subsequent to the phreatomagmatic stage, the inner crater was filled with a lava lake whose morphology was determined by the tephra deposits. At contacts with the pyroclastics, a chilled margin several cm thick is developed, which shows platy (onion-shaped) jointing. A large number of dykes and sills were injected into adjacent bedsets.
These shallow intrusive bodies can be found throughout the whole complex, truncating and dissecting the pyroclastic units. Where pyroclastic units contained a high amount of water, this even included mingling with the wet tephra, leading to the formation of peperites. The uppermost units were represented by thick lava flows, which covered all underlying units. These rocks were quarried out a century ago, except for a large strombolian spatter cone, which is now exposed at the uppermost level of the quarry as a large sliced remnant including its large multiple feeder dyke. This setting offers a perfect opportunity to study the relationship between dyke and sill emplacement, with transitions from vertical to bedding-parallel geometries. Dimensions of the volcanic bodies range from cm-thick small apophyses extending from the lava lake into the pyroclastic rocks up to dykes and sills several m thick. We performed a detailed study on a section of pyroclastic rocks truncated by dykes and sills and evaluated its magnetic characteristics. Preliminary results show that the magnetic susceptibility of all the pyroclastic units is in the range of ferrimagnetic susceptibility and varies between 2 and 20 × 10^-3 SI (Fig. 1). Magnetic fabric anisotropy is generally low (< 5 %) and in the field of oblate fabric geometries; in bedded tuffs a significantly higher (5 to 10 %) but also oblate anisotropy is observed. Magnetic lineations indicate a consistent NE (020) directed material transport for the whole succession. Remanence intensities are quite high, with values of 1 to 15 A/m. In the pyroclastic units a stable magnetic remanence characterized by a single vector component has been measured; MDF values are in the range of 30 to 160 mT. The field vector has exclusively reversed polarity and steep inclination, which is in agreement with the paleofield direction, and is therefore regarded as a natural remanent magnetization acquired during deposition of the pyroclastic successions.
In the dykes and sills, however, remanence directions scatter significantly, display geometries ranging from steep to flat orientations, and also show strong variations in declination. Coercivity of the magnetic carriers is significantly lower, as indicated by the lower MDF values, which are in the range of 8 to 30 mT in the dykes and 15 to 30 mT in the sills. Besides a minor contribution of a viscous component, the remanence vector in the dykes and sills is characterized by a stable single component. However, further investigations are needed to fully understand and interpret the results.
Luzzu - A Framework for Linked Data Quality Assessment
With the increasing adoption and growth of the Linked Open Data cloud [9],
with RDFa, Microformats and other ways of embedding data into ordinary Web
pages, and with initiatives such as schema.org, the Web is currently being
complemented with a Web of Data. Thus, the Web of Data shares many
characteristics with the original Web of Documents, which also varies in
quality. This heterogeneity makes it challenging to determine the quality of
the data published on the Web and to subsequently make this information
explicit to data consumers. The main contribution of this article is LUZZU, a
quality assessment framework for Linked Open Data. Apart from providing quality
metadata and quality problem reports that can be used for data cleaning, LUZZU
is extensible: third-party metrics can easily be plugged into the framework. The
framework does not rely on SPARQL endpoints, and is thus free of all the
problems that come with them, such as query timeouts. Another advantage over
SPARQL-based quality assessment frameworks is that metrics implemented in
LUZZU can have more complex functionality than triple matching. Using the
framework, we performed a quality assessment of a number of statistical linked
datasets that are available on the LOD cloud. For this evaluation, 25 metrics
from ten different dimensions were implemented.
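The extensibility claim can be pictured with a small Python sketch of a pluggable metric that processes triples one at a time, without a SPARQL endpoint. This is not LUZZU's actual API: the `Metric` base class, the `assess` driver, and the `LabelCoverage` example metric are hypothetical names invented here. The example metric keeps state across triples (the share of subjects that carry an `rdfs:label`), which goes beyond matching a single triple pattern.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object) as plain strings
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


class Metric(ABC):
    """Hypothetical plug-in interface: see each triple once, then report."""

    @abstractmethod
    def compute(self, triple: Triple) -> None:
        """Update internal state with one triple."""

    @abstractmethod
    def value(self) -> float:
        """Return the metric value for everything seen so far."""


class LabelCoverage(Metric):
    """Share of distinct subjects that have at least one rdfs:label."""

    def __init__(self) -> None:
        self._subjects = set()
        self._labelled = set()

    def compute(self, triple: Triple) -> None:
        subject, predicate, _ = triple
        self._subjects.add(subject)
        if predicate == RDFS_LABEL:
            self._labelled.add(subject)

    def value(self) -> float:
        if not self._subjects:
            return 0.0
        return len(self._labelled) / len(self._subjects)


def assess(triples: Iterable[Triple], metrics: List[Metric]) -> Dict[str, float]:
    """Stream the triples once and feed every registered metric."""
    for triple in triples:
        for metric in metrics:
            metric.compute(triple)
    return {type(metric).__name__: metric.value() for metric in metrics}


if __name__ == "__main__":
    data = [
        ("ex:a", RDFS_LABEL, "Item A"),
        ("ex:a", "ex:relatedTo", "ex:b"),
        ("ex:b", "ex:relatedTo", "ex:a"),
    ]
    print(assess(data, [LabelCoverage()]))
```

Adding another metric in this sketch only requires a new `Metric` subclass passed to `assess`, and the dataset is still read in a single pass.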
Towards the semantic formalization of science
The past decades have witnessed a huge growth in scholarly information published on the Web, mostly in unstructured or semi-structured formats, which hampers scientific literature exploration and scientometric studies. Past studies on ontologies for structuring scholarly information focused on describing scholarly articles' components, such as document structure, metadata and bibliographies, rather than the scientific work itself. Over the past four years, we have been developing the Science Knowledge Graph Ontologies (SKGO), a set of ontologies for modeling the research findings in various fields of modern science, resulting in a knowledge graph. Here, we introduce this ontology suite and discuss the design considerations taken into account during its development. We expect that within the next few years, a science knowledge graph is likely to become a crucial component for organizing and exploring scientific work.