Towards flexible indices for distributed graph data: The formal schema-level index model FLuID
Graph indices are key to managing huge amounts of distributed graph data. Instance-level indices are available that focus on the fast retrieval of nodes. Furthermore, there are so-called schema-level indices that summarize nodes sharing common characteristics, i.e., the combination of attached types and used property labels. We argue that there is no one-size-fits-all schema-level index. Rather, a parameterized, formal model is needed that allows one to quickly design, tailor, and compare different schema-level indices. We abstract from related work and provide the formal model FLuID, which uses basic building blocks to flexibly define different schema-level indices. The FLuID model provides parameterized simple and complex schema elements together with four parameters. We show that all indices modeled in FLuID can be computed in O(n). Thus, FLuID enables us to efficiently implement, compare, and validate variants of schema-level indices tailored for specific application scenarios.
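To make the idea of a schema-level index concrete, the following is a minimal sketch and not the FLuID model itself. It assumes triples are plain `(subject, predicate, object)` tuples and that `rdf:type` marks type statements; the function name `schema_level_index` and the sample data are hypothetical. Nodes are grouped by the combination of their attached types and used property labels, with one pass over the triples and one pass over the subjects.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"  # assumed marker for type statements


def schema_level_index(triples):
    """Group subjects by (set of types, set of property labels)."""
    types = defaultdict(set)
    props = defaultdict(set)
    # One pass over the triples collects types and property labels per subject.
    for s, p, o in triples:
        if p == RDF_TYPE:
            types[s].add(o)
        else:
            props[s].add(p)
    # One pass over the subjects assigns each to its schema element.
    index = defaultdict(set)
    for s in types.keys() | props.keys():
        key = (frozenset(types[s]), frozenset(props[s]))
        index[key].add(s)
    return dict(index)


triples = [
    ("alice", RDF_TYPE, "Person"),
    ("alice", "name", "Alice"),
    ("bob", RDF_TYPE, "Person"),
    ("bob", "name", "Bob"),
    ("acme", RDF_TYPE, "Company"),
]
index = schema_level_index(triples)
```

Here `alice` and `bob` fall into the same schema element because they share the type `Person` and the property label `name`, while `acme` forms its own element.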
Estimating flow and transport parameters in the unsaturated zone with pore water stable isotopes
The first author was funded by the DFG Research Group: From Catchments as Organised Systems to Models based on Functional Units (FOR 1598). The second author was funded by the DFG project Coupled soil-plant water dynamics – Environmental drivers and species effects (contract numbers: GE 1090/10-1 and WE 4598/2-1). The isotope data in the precipitation for Roodt were provided by FNR/CORE/SOWAT, project of the Luxembourg Institute of Science and Technology – LIST. Sampling of the isotope profiles was made possible by the support of the CAOS Team and Begona Lorente Sistiaga, Benjamin Gralher, Andre Böker, Marvin Reich and Andrea Popp. Special thanks to Britta Kattenstroth and Jean Francois Iffly for their technical support in the field and Barbara Herbstritt for her support in the laboratory. For Roodt, soil texture and hydraulic parameter information were provided by Conrad Jackisch and Christoph Messer (KIT, Karlsruhe, Germany) and hydraulic conductivity data were provided by Christophe Hissler and Jérôme Juilleret (LIST). Pore water isotope and soil moisture data for Hartheim were provided by Steffen Holzkämper and Paul Königer. Temperature and precipitation data for Hartheim were provided by the Chair of Meteorology and Climatology, University of Freiburg. Peer reviewed.
Towards an Open Platform for Legal Information
Recent advances in the area of legal information systems have led to a
variety of applications that promise support in processing and accessing legal
documents. Unfortunately, these applications have various limitations, e.g.,
regarding scope or extensibility. Furthermore, we do not observe a trend
towards open access in digital libraries in the legal domain as we observe in
other domains, e.g., economics or computer science. To improve open access in
the legal domain, we present our approach for an open source platform to
transparently process and access Legal Open Data. This enables the sustainable
development of legal applications by offering a single technology stack.
Moreover, the approach facilitates the development and deployment of new
technologies. As proof of concept, we implemented six technologies and
generated metadata for more than 250,000 German laws and court decisions. Thus,
we can provide users of our platform not only access to legal documents, but
also the contained information.
Comment: Accepted at ACM/IEEE Joint Conference on Digital Libraries (JCDL) 202
Time and Memory Efficient Parallel Algorithm for Structural Graph Summaries and two Extensions to Incremental Summarization and k-Bisimulation for Long k-Chaining
We developed a flexible parallel algorithm for graph summarization based on
vertex-centric programming and parameterized message passing. The base
algorithm supports infinitely many structural graph summary models defined in a
formal language. An extension of the parallel base algorithm allows incremental
graph summarization. In this paper, we prove that the incremental algorithm is
correct and show that the update time is bounded in terms of the number of
additions, deletions, and modifications to the input graph, the maximum degree,
and the maximum distance k in the subgraphs considered. Although the iterative
algorithm supports values of […], it requires nested data structures for the
message passing that are memory-inefficient. Thus, we extended the base
summarization algorithm by a hash-based messaging mechanism to support a
scalable iterative computation of graph summarizations based on
k-bisimulation for arbitrary k. We
empirically evaluate the performance of our algorithms using benchmark and
real-world datasets. The incremental algorithm almost always outperforms the
batch computation. We observe in our experiments that the incremental algorithm
is faster even in cases when […] of the graph database changes from one
version to the next. The incremental computation requires a three-layered hash
index, which has a low memory overhead of only […]. Finally, the
incremental summarization algorithm outperforms the batch algorithm even with
fewer cores. The iterative parallel k-bisimulation algorithm computes
summaries on graphs with over […]M edges within seconds. We show that the
algorithm processes graphs of […]M edges within a few minutes while having
a moderate memory consumption of […] GB. For the largest BSBM1B dataset with
1 billion edges, it computes bisimulation in under an hour.
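As a rough illustration of iterative k-bisimulation, the sketch below refines vertex blocks for k rounds on a small labeled graph. It is a sequential toy version and not the parallel, vertex-centric algorithm described above. The function name `k_bisimulation`, the edge format `(source, label, target)`, and the restriction to outgoing edges are assumptions made for this example. Each round replaces a vertex's nested neighborhood description with a flat signature built from block identifiers, which is loosely in the spirit of hash-based messaging.

```python
from collections import defaultdict


def k_bisimulation(edges, k):
    """Return {vertex: block id} after k refinement rounds (forward edges only)."""
    out = defaultdict(list)
    vertices = set()
    for s, label, t in edges:
        out[s].append((label, t))
        vertices.update((s, t))
    # Round 0: all vertices are considered equivalent.
    block = {v: 0 for v in vertices}
    for _ in range(k):
        # A signature combines the vertex's previous block with the set of
        # (edge label, previous block of the target) pairs.
        signature = {
            v: (block[v], frozenset((label, block[t]) for label, t in out[v]))
            for v in vertices
        }
        # Equal signatures receive the same fresh block id.
        ids = {}
        block = {v: ids.setdefault(signature[v], len(ids)) for v in vertices}
    return block


edges = [
    ("a", "knows", "b"),
    ("c", "knows", "d"),
    ("b", "worksAt", "x"),
]
one_hop = k_bisimulation(edges, 1)
two_hop = k_bisimulation(edges, 2)
```

With k = 1, `a` and `c` are equivalent because both have a single outgoing `knows` edge. With k = 2 they are separated, since `b` has an outgoing `worksAt` edge and `d` has none. Block ids are arbitrary, so only equality between ids is meaningful.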
Towards an incremental schema-level index for distributed linked open data graphs
Semi-structured, schema-free data formats are used in many applications because their flexibility enables simple data exchange. Especially graph data formats like RDF have become well established in the Web of Data. For the Web of Data, it is known that data instances are not only added, changed, and removed regularly, but that their schemas are also subject to enormous changes over time. Unfortunately, the collection, indexing, and analysis of the evolution of data schemas on the web is still in its infancy. To enable a detailed analysis of the evolution of Linked Open Data, we lay the foundation for the implementation of incremental schema-level indices for the Web of Data. Unlike existing schema-level indices, incremental schema-level indices have an efficient update mechanism to avoid costly recomputations of the entire index. This enables us to monitor changes to data instances at schema level, trace changes, and ultimately provide an always up-to-date schema-level index for the Web of Data. In this paper, we analyze in detail the challenges of updating arbitrary schema-level indices for the Web of Data. To this end, we extend our previously developed meta model FLuID. In addition, we outline an algorithm for performing the updates.
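The update idea can be sketched for the simplest case, in which a node's schema element depends only on its own types and property labels. The class below is a hypothetical illustration, not the algorithm outlined in the paper. The names `IncrementalSchemaIndex`, `add`, and `remove` are assumptions, as is the use of `rdf:type` as the type marker. Adding or removing a triple recomputes the schema element of the affected subject only, instead of rebuilding the whole index.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"  # assumed marker for type statements


class IncrementalSchemaIndex:
    def __init__(self):
        self.outgoing = defaultdict(set)  # subject -> {(predicate, object)}
        self.key_of = {}                  # subject -> current schema element
        self.members = defaultdict(set)   # schema element -> subjects

    def _key(self, s):
        pairs = self.outgoing[s]
        types = frozenset(o for p, o in pairs if p == RDF_TYPE)
        props = frozenset(p for p, o in pairs if p != RDF_TYPE)
        return (types, props)

    def _reindex(self, s):
        # Detach the subject from its old schema element, if any.
        old = self.key_of.pop(s, None)
        if old is not None:
            self.members[old].discard(s)
            if not self.members[old]:
                del self.members[old]
        # Attach it to the new one, or forget it if no triples remain.
        if self.outgoing[s]:
            new = self._key(s)
            self.key_of[s] = new
            self.members[new].add(s)
        else:
            del self.outgoing[s]

    def add(self, s, p, o):
        self.outgoing[s].add((p, o))
        self._reindex(s)

    def remove(self, s, p, o):
        self.outgoing[s].discard((p, o))
        self._reindex(s)


idx = IncrementalSchemaIndex()
idx.add("alice", RDF_TYPE, "Person")
idx.add("alice", "name", "Alice")
idx.add("bob", RDF_TYPE, "Person")
idx.add("bob", "name", "Bob")
same_before = idx.key_of["alice"] == idx.key_of["bob"]
idx.remove("bob", "name", "Bob")
same_after = idx.key_of["alice"] == idx.key_of["bob"]
```

Schema elements that also depend on neighboring nodes would require propagating changes beyond the affected subject, which this sketch does not attempt.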
Observation of the Efimov state of the helium trimer
Quantum theory dictates that upon weakening the two-body interaction in a
three-body system, an infinite number of three-body bound states of a huge
spatial extent emerge just before these three-body states become unbound. Three
helium atoms have been predicted to form a molecular system that manifests this
peculiarity under natural conditions without artificial tuning of the
attraction between particles by an external field. Here we report experimental
observation of this long predicted but experimentally elusive Efimov state of
⁴He₃ by means of Coulomb explosion imaging. We show spatial images of
an Efimov state, confirming the predicted size and a typical structure where
two atoms are close to each other while the third is far away.
Structural Summarization of Semantic Graphs Using Quotients
Graph summarization is the process of computing a compact version of an input graph while preserving chosen features of its structure. We consider semantic graphs where the features include edge labels and label sets associated with a vertex. Graph summaries are typically much smaller than the original graph. Applications that depend on the preserved features can perform their tasks on the summary, but much faster or with less memory overhead, while producing the same outcome as if they were applied to the original graph. In this survey, we focus on structural summaries based on quotients that organize vertices in equivalence classes of shared features. Structural summaries are particularly popular for semantic graphs and have the advantage of defining a precise graph-based output. We consider approaches and algorithms for both static and temporal graphs. A common example of quotient-based structural summaries is bisimulation, and we discuss this in detail. While there exist other surveys on graph summarization, to the best of our knowledge, we are the first to bring in a focused discussion on quotients, bisimulation, and their relation. Furthermore, structural summarization naturally connects well with formal logic due to the discrete structures considered. We complete the survey with a brief description of approaches beyond structural summaries.
MOVING: A User-Centric Platform for Online Literacy Training and Learning
Part of the Progress in IS book series (PROIS). In this paper, we present an overview of the MOVING platform, a user-driven approach that enables young researchers, decision makers, and public administrators to use machine learning and data mining tools to search, organize, and manage large-scale information sources on the web such as scientific publications, videos of research talks, and social media. In order to provide a concise overview of the platform, we focus on its front end, which is the MOVING web application. By presenting the main components of the web application, we illustrate what functionalities and capabilities the platform offers its end-users, rather than delving into the data analysis and machine learning technologies that make these functionalities possible.
A dense network of cosmic-ray neutron sensors for soil moisture observation in a highly instrumented pre-Alpine headwater catchment in Germany
Monitoring soil moisture is still a challenge: it varies strongly in space and time and at various scales while conventional sensors typically suffer from small spatial support. With a sensor footprint up to several hectares, cosmic-ray neutron sensing (CRNS) is a modern technology to address that challenge.
So far, the CRNS method has typically been applied with single sensors or in sparse national-scale networks. This study presents, for the first time, a dense network of 24 CRNS stations that covered, from May to July 2019, an area of just 1 km²: the pre-Alpine Rott headwater catchment in Southern Germany, which is characterized by strong soil moisture gradients in a heterogeneous landscape with forests and grasslands. With substantially overlapping sensor footprints, this network was designed to study root-zone soil moisture dynamics at the catchment scale. The observations of the dense CRNS network were complemented by extensive measurements that allow users to study soil moisture variability at various spatial scales: roving (mobile) CRNS units, remotely sensed thermal images from unmanned aerial systems (UASs), permanent and temporary wireless sensor networks, profile probes, and comprehensive manual soil sampling. Since neutron counts are also affected by hydrogen pools other than soil moisture, vegetation biomass was monitored in forest and grassland patches, as well as meteorological variables; discharge and groundwater tables were recorded to support hydrological modeling experiments.
As a result, we provide a unique and comprehensive data set to several research communities: to those who investigate the retrieval of soil moisture from cosmic-ray neutron sensing, to those who study the variability of soil moisture at different spatiotemporal scales, and to those who intend to better understand the role of root-zone soil moisture dynamics in the context of catchment and groundwater hydrology, as well as land–atmosphere exchange processes. The data set is available through the EUDAT Collaborative Data Infrastructure and is split into two subsets: https://doi.org/10.23728/b2share.282675586fb94f44ab2fd09da0856883 (Fersch et al., 2020a) and https://doi.org/10.23728/b2share.bd89f066c26a4507ad654e994153358b (Fersch et al., 2020b).
Glioneuronal tumor with ATRX alteration, kinase fusion and anaplastic features (GTAKA): a molecularly distinct brain tumor type with recurrent NTRK gene fusions
Glioneuronal tumors are a heterogeneous group of CNS neoplasms that can be challenging to accurately diagnose. Molecular methods are highly useful in classifying these tumors, distinguishing precise classes from their histological mimics and identifying previously unrecognized types of tumors. Using an unsupervised visualization approach of DNA methylation data, we identified a novel group of tumors (n = 20) that formed a cluster separate from all established CNS tumor types. Molecular analyses revealed ATRX alterations (in 16/16 cases by DNA sequencing and/or immunohistochemistry) as well as potentially targetable gene fusions involving receptor tyrosine kinases (RTK; mostly NTRK1-3) in all of these tumors (16/16; 100%). In addition, copy number profiling showed homozygous deletions of CDKN2A/B in 55% of cases. Histological and immunohistochemical investigations revealed glioneuronal tumors with isomorphic, round and often condensed nuclei, perinuclear clearing, high mitotic activity and microvascular proliferation. Tumors were mainly located supratentorially (84%) and occurred in patients with a median age of 19 years. Survival data were limited (n = 18) but point towards a more aggressive biology as compared to other glioneuronal tumors (median progression-free survival 12.5 months). Given their molecular characteristics in addition to anaplastic features, we suggest the term glioneuronal tumor with ATRX alteration, kinase fusion and anaplastic features (GTAKA) to describe these tumors. In summary, our findings highlight a novel type of glioneuronal tumor driven by different RTK fusions accompanied by recurrent alterations in ATRX and homozygous deletions of CDKN2A/B. Targeted approaches such as NTRK inhibition might represent a therapeutic option for patients suffering from these tumors.