Metadata Editing by Schema
Metadata creation and editing is a reasonably well-understood task that involves creating forms, checking the input data, and generating appropriate storage formats. XML has largely become the standard storage representation for metadata records, and various automatic mechanisms for validating these records are becoming popular, including XML Schema and Schematron. However, there is no standard methodology for creating data manipulation mechanisms. This work presents a set of guidelines and extensions for using the XML Schema standard for this purpose. The experiences and issues involved in building such a generalised structured data editor are discussed, to support the notion that metadata editing, and not just validation, should be description-driven.
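The schema-driven idea above can be illustrated with a small sketch: walking an XML Schema to derive editor form fields, so that the editing interface follows from the description rather than being hand-coded. The schema and the type-to-widget mapping below are invented for illustration and use only the Python standard library.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# A toy metadata schema for a Dublin-Core-like record (hypothetical example).
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="record">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title"   type="xs:string"/>
        <xs:element name="creator" type="xs:string"/>
        <xs:element name="year"    type="xs:integer"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

def form_fields(schema_xml):
    """Walk the schema and derive (field name, input widget) pairs for a form."""
    root = ET.fromstring(schema_xml)
    fields = []
    for el in root.iter(f"{XS}element"):
        t = el.get("type")
        if t:  # leaf elements with a simple type become form inputs
            widget = "number" if t == "xs:integer" else "text"
            fields.append((el.get("name"), widget))
    return fields

print(form_fields(SCHEMA))
# [('title', 'text'), ('creator', 'text'), ('year', 'number')]
```

A real description-driven editor would of course also derive validation rules and storage serialisation from the same schema, which is the point the abstract argues for.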
Discovering Implicational Knowledge in Wikidata
Knowledge graphs have recently become the state-of-the-art tool for
representing the diverse and complex knowledge of the world. Examples include
the proprietary knowledge graphs of companies such as Google, Facebook, IBM, or
Microsoft, but also freely available ones such as YAGO, DBpedia, and Wikidata.
A distinguishing feature of Wikidata is that the knowledge is collaboratively
edited and curated. While this greatly enhances the scope of Wikidata, it also
makes it impossible for a single individual to grasp complex connections
between properties or understand the global impact of edits in the graph. We
apply Formal Concept Analysis to efficiently identify comprehensible
implications that are implicitly present in the data. Although the complex
structure of data modelling in Wikidata is not amenable to a direct approach,
we overcome this limitation by extracting contextual representations of parts
of Wikidata in a systematic fashion. We demonstrate the practical feasibility
of our approach through several experiments and show that the results may lead
to the discovery of interesting implicational knowledge. Besides providing a
method for obtaining large real-world data sets for FCA, we sketch potential
applications in offering semantic assistance for editing and curating Wikidata.
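The core FCA step this abstract relies on, checking which attribute implications hold in a formal context, can be sketched in a few lines. The context below is a made-up miniature of Wikidata items and the properties they use, not real extracted data.

```python
# Toy formal context: objects (Wikidata-like items, hypothetical) mapped to
# the attributes (properties) they use.
CONTEXT = {
    "Douglas Adams": {"instance of", "date of birth", "occupation"},
    "Marie Curie":   {"instance of", "date of birth", "occupation", "award received"},
    "Mount Everest": {"instance of", "elevation"},
}

def closure(attrs, context):
    """A'' in FCA terms: all attributes shared by every object having `attrs`."""
    extent = [obj for obj, a in context.items() if attrs <= a]
    if not extent:  # empty extent: the closure is the full attribute set
        return set().union(*context.values())
    out = set(context[extent[0]])
    for obj in extent[1:]:
        out &= context[obj]
    return out

def implication_holds(premise, conclusion, context):
    """premise -> conclusion holds iff conclusion lies in the closure of premise."""
    return conclusion <= closure(premise, context)

print(implication_holds({"date of birth"}, {"occupation"}, CONTEXT))  # True
print(implication_holds({"instance of"}, {"occupation"}, CONTEXT))    # False
```

The paper's contribution is extracting such contexts from Wikidata at scale; the implication check itself is this simple.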
A Software Tool for Parameter Estimation from Flight Test Data
A software package called FIDA is developed and implemented in PC MATLAB for estimating aircraft stability and control derivatives from flight test data using different system identification techniques. FIDA also contains data pre-processing tools to remove wild points and high-frequency noise components from measured flight data. FIDA is a menu-driven, user-interactive package that is useful to scientists, flight test engineers, and pilots engaged in experimental flights and in the analysis of flight test data. It also has educational value for students and practising engineers who are new to the field of aircraft parameter estimation.
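As a rough illustration of the kind of estimation such a tool performs (not FIDA's actual algorithms), here is an ordinary least-squares fit of a pitching-moment model C_m = C_m0 + C_m_alpha * alpha; the flight measurements are made up for the sketch.

```python
# Least-squares fit of C_m = C_m0 + C_m_alpha * alpha from noisy "flight data".
# All numbers are invented for illustration.

def ols_fit(xs, ys):
    """Ordinary least squares for y = a + b*x via the closed-form normal equations."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return a, b

alpha = [0.00, 0.02, 0.04, 0.06, 0.08]       # angle of attack, rad
cm    = [0.050, 0.041, 0.030, 0.021, 0.010]  # measured pitching moment coefficient
cm0, cm_alpha = ols_fit(alpha, cm)
print(round(cm0, 3), round(cm_alpha, 2))  # roughly 0.05 and -0.5 per rad
```

Real parameter-estimation codes work with multivariable dynamic models and filtered data, but the underlying regression step has this shape.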
Cadabra: reference guide and tutorial
Cadabra is a computer algebra system for the manipulation of tensorial mathematical expressions as they occur in “field theory problems”. It is aimed at, but not necessarily restricted to, high-energy physicists. It is constructed as a simple tree-manipulating core, a large collection of standalone algorithmic modules which act on the expression tree, and a set of modules responsible for output of nodes in the tree. All of these parts are written in C++. The input and output formats closely follow TeX, which in many cases means that Cadabra is much simpler to use than other similar programs. It intentionally does not contain its own programming language; instead, new functionality is added by writing new modules in C++.
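The split between a tree-manipulating core and standalone algorithm modules can be mimicked in a toy sketch. The class and function names below are invented for illustration and bear no relation to Cadabra's actual C++ internals.

```python
# Minimal expression tree plus one "algorithm module" acting on it, loosely
# mirroring the core/module split described above (names are invented).

class Node:
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def __repr__(self):
        if not self.children:
            return self.name
        return f"{self.name}({', '.join(map(repr, self.children))})"

def substitute(tree, old, new):
    """A standalone module: replace every node named `old` with a copy of `new`."""
    if tree.name == old:
        tree = Node(new.name, new.children)
    tree.children = [substitute(c, old, new) for c in tree.children]
    return tree

expr = Node("mul", [Node("A"), Node("add", [Node("B"), Node("C")])])
print(substitute(expr, "C", Node("D")))  # mul(A, add(B, D))
```

The design point the abstract makes is that all functionality lives in such modules acting on the tree, rather than in a built-in scripting language.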
Editing and multiply imputing German establishment panel data to estimate stochastic production frontier models
This paper illustrates the effects of item non-response in surveys on the results of multivariate statistical analysis when the task is to estimate productivity. To multiply impute the missing data, a data-augmentation algorithm based on a normal/Wishart model is applied. Data from waves 2000 and 2001 of the German IAB Establishment Panel are used to estimate establishment productivity. The processes of constructing, editing, and transforming the variables needed for the analyst's as well as the imputer's models are described. It is shown that standard multiple-imputation techniques can be used to estimate sophisticated econometric models from large-scale panel data exposed to item non-response. The basis of the empirical analysis is a stochastic production frontier model with labour and capital as input factors. The results show that a model of technical inefficiency is favoured over one assuming different production functions in East and West Germany. We also see that the effect of regional setting on technical inefficiency increases when inference is based on multiply imputed data sets. These results may stimulate future research and could influence economic and regional policy in Germany. (Author's abstract, IAB-Doku) Keywords: IAB-Betriebspanel, survey, response behaviour, productivity, data analysis, estimation, establishment indicators, imputation methods
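A heavily simplified version of the multiple-imputation step might look as follows: a univariate normal draw stands in for the paper's normal/Wishart data augmentation, and Rubin's rule combines the completed-data estimates. All data values are invented.

```python
import random
import statistics

random.seed(0)

def impute_once(values):
    """Fill missing entries (None) with draws from a normal fitted to the
    observed data. A toy stand-in for a normal/Wishart data-augmentation step."""
    obs = [v for v in values if v is not None]
    mu, sd = statistics.mean(obs), statistics.stdev(obs)
    return [v if v is not None else random.gauss(mu, sd) for v in values]

def multiply_impute(values, m=5):
    """Rubin's rule for the point estimate: average the m completed-data means."""
    return statistics.mean(statistics.mean(impute_once(values)) for _ in range(m))

wages = [30.0, 32.5, None, 29.0, None, 31.0]  # made-up establishment data
print(round(multiply_impute(wages), 1))
```

Proper multiple imputation also pools the within- and between-imputation variances for valid standard errors, which a point-estimate sketch like this omits.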
Development of multiple media documents
Development of documents in multiple media involves activities in three different
fields: the technical, the discursive, and the procedural. The major development problems of
artifact complexity, cognitive processes, design basis and working context are located where these
fields overlap. Pending the emergence of a unified approach to design, any method must allow for
development at the three levels of discourse structure, media disposition and composition, and
presentation. Related work concerned with generalised discourse structures, structured
documents, production methods for existing multiple media artifacts, and hypertext design offers
some partial forms of assistance at different levels. Desirable characteristics of a multimedia
design method will include three phases of production, a variety of possible actions with media
elements, an underlying discursive structure, and explicit comparates for review.
Shingle 2.0: generalising self-consistent and automated domain discretisation for multi-scale geophysical models
The approaches taken to describe and develop spatial discretisations of the
domains required for geophysical simulation models are commonly ad hoc, model
or application specific and under-documented. This is particularly acute for
simulation models that are flexible in their use of multi-scale, anisotropic,
fully unstructured meshes where a relatively large number of heterogeneous
parameters are required to constrain their full description. As a consequence,
it can be difficult to reproduce simulations, ensure a provenance in model data
handling and initialisation, and a challenge to conduct model intercomparisons
rigorously. This paper takes a novel approach to spatial discretisation,
considering it much like a numerical simulation model problem of its own. It
introduces a generalised, extensible, self-documenting approach to carefully
and fully describe the constraints over the heterogeneous
parameter space that determine how a domain is spatially discretised. This
additionally provides a method to accurately record these constraints, using
high-level natural language based abstractions, that enables full accounts of
provenance, sharing and distribution. Together with this description, a
generalised consistent approach to unstructured mesh generation for geophysical
models is developed, that is automated, robust and repeatable, quick-to-draft,
rigorously verified and consistent with the source data throughout. This
interprets the description above to execute a self-consistent spatial
discretisation process, which is automatically validated to expected discrete
characteristics and metrics.
Comment: 18 pages, 10 figures, 1 table. Submitted for publication and under review.
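The description-driven idea, a high-level specification interpreted into a discretisation and then validated against expected metrics, can be caricatured in one dimension. The spec keys below are invented for the sketch and are not Shingle's actual syntax.

```python
# A toy take on description-driven discretisation: a declarative spec is
# interpreted into concrete mesh nodes, then the result is checked against
# the spec's expected metrics (spec keys are invented, not Shingle's).

SPEC = {
    "domain": (0.0, 10.0),  # interval to discretise
    "resolution": 1.0,      # target element size
    "expect_nodes": 11,     # validation: expected node count
}

def discretise(spec):
    lo, hi = spec["domain"]
    n = round((hi - lo) / spec["resolution"])
    nodes = [lo + i * (hi - lo) / n for i in range(n + 1)]
    # self-validation step, in the spirit of the paper's automated checks
    assert len(nodes) == spec["expect_nodes"], "mesh does not match description"
    return nodes

print(len(discretise(SPEC)))  # 11
```

Because the spec is data rather than code, it can be recorded and shared verbatim, which is what gives the approach its provenance and reproducibility properties.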
Systematic evaluation of design choices for software development tools
Most design and evaluation of software tools
is based on the intuition and experience of the designers.
Software tool designers consider themselves typical users
of the tools that they build, and they tend to evaluate their products subjectively rather than objectively with established usability methods. This subjective approach is inadequate if the quality of software tools is to improve, so the use of more systematic methods is advocated. This paper summarises a sequence of studies that
show how user interface design choices for software development tools can be evaluated using established usability engineering techniques. The techniques used included guideline review, predictive modelling, and experimental studies with users.
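One common form of predictive modelling in usability engineering, the Keystroke-Level Model, is simple enough to sketch directly; whether it is the one used in these studies is an assumption. The operator times are the standard KLM averages, and the task sequence is a made-up editor interaction.

```python
# Keystroke-Level Model estimate (Card, Moran & Newell operator times, seconds).
KLM = {
    "K": 0.28,  # keystroke or button press
    "P": 1.10,  # point with mouse
    "H": 0.40,  # home hands between keyboard and mouse
    "M": 1.35,  # mental preparation
}

def klm_time(ops):
    """Predicted expert task time: the sum of the operator times in sequence."""
    return sum(KLM[op] for op in ops)

# "MHPK": think, move hand to mouse, point at a menu item, click it
print(round(klm_time("MHPK"), 2))  # 3.13
```

Such a model lets alternative tool designs be compared on predicted task times before any user study is run.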
PyZX: Large Scale Automated Diagrammatic Reasoning
The ZX-calculus is a graphical language for reasoning about ZX-diagrams, a
type of tensor networks that can represent arbitrary linear maps between
qubits. Using the ZX-calculus, we can intuitively reason about quantum theory,
and optimise and validate quantum circuits. In this paper we introduce PyZX, an
open source library for automated reasoning with large ZX-diagrams. We give a
brief introduction to the ZX-calculus, then show how PyZX implements methods
for circuit optimisation, equality validation, and visualisation and how it can
be used in tandem with other software. We end with a set of challenges that
when solved would enhance the utility of automated diagrammatic reasoning.
Comment: In Proceedings QPL 2019, arXiv:2004.1475
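A single ZX-calculus simplification, fusing two adjacent same-colour spiders by adding their phases, conveys the flavour of the rewrites such a tool automates. The graph representation below is invented for the sketch and is not PyZX's API.

```python
from fractions import Fraction

# Toy spider fusion in the style of the ZX-calculus: two adjacent spiders of
# the same colour merge, and their phases add modulo 2 (in multiples of pi).

spiders = {  # id -> (colour, phase in multiples of pi)
    1: ("Z", Fraction(1, 2)),
    2: ("Z", Fraction(1, 4)),
    3: ("X", Fraction(0)),
}
edges = {(1, 2), (2, 3)}

def fuse(a, b, spiders, edges):
    """Merge spider b into a if they are adjacent and share a colour."""
    assert (a, b) in edges or (b, a) in edges, "spiders must be adjacent"
    ca, pa = spiders[a]
    cb, pb = spiders[b]
    assert ca == cb, "only same-colour spiders fuse"
    spiders[a] = (ca, (pa + pb) % 2)
    edges.discard((a, b))
    edges.discard((b, a))
    for e in list(edges):  # reattach b's remaining edges to a
        if b in e:
            other = e[0] if e[1] == b else e[1]
            edges.discard(e)
            edges.add((a, other))
    del spiders[b]

fuse(1, 2, spiders, edges)
print(spiders[1], sorted(edges))  # ('Z', Fraction(3, 4)) [(1, 3)]
```

Applying such local rules exhaustively is what reduces large diagrams, and hence the quantum circuits they represent.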