25,284 research outputs found
Towards semantic mathematical editing *
Currently, there exists a big gap between formal computer-understandable mathematics and informal mathematics, as written by humans. When looking more closely, there are two important subproblems: making documents written by humans at least syntactically understandable for computers, and the formal verification of the actual mathematics in the documents. In this paper, we will focus on the first problem. For the time being, most authors use T E X, L A T E X, or one of its graphical frontends in order to write documents with many mathematical formulas. In the past decade, we have developed an alternative wysiwyg system GNU T E X MACS , which is not based on T E X. All these systems are only adequate for visual typesetting and do not carry much semantics. Stated in the MathML jargon, they concentrate on presentation markup, not content markup. In recent versions of T E X MACS , we have started to integrate facilities for the semantic editing of formulas. In this paper, we will describe these facilities and expand on the underlying motivation and design choices. To go short, we continue to allow the user to enter formulas in a visually oriented way. In the background, we continuously run a packrat parser, which attempts to convert (potentially incomplete) formulas into content markup. As long as all formulas remain sufficiently correct, the editor can then both operate on a visual or semantic level, independently of the low-level representation being used. An important related topic, which will also be discussed at length, is the automatic correction of syntax errors in existing mathematical documents. In particular, the syntax corrector that we have implemented enables us to upgrade existing documents and test our parsing grammar on various books and papers from different sources. We will provide a detailed analysis of these experiments
Towards a generation-based semantic web authoring tool
Widespread use of Semantic Web technologies requires interfaces through which knowledge can be viewed and edited without deep understanding of Description Logic and formalisms like OWL and RDF. Several groups are pursuing approaches based on Controlled Natural Languages (CNLs), so that editing can be performed by typing in sentences which are automatically interpreted as statements in OWL. We suggest here a variant of this approach which relies entirely on Natural Language Generation (NLG), and propose requirements for a system that can reliably generate transparent realisations of statements in Description Logic
Isabelle/PIDE as Platform for Educational Tools
The Isabelle/PIDE platform addresses the question whether proof assistants of
the LCF family are suitable as technological basis for educational tools. The
traditionally strong logical foundations of systems like HOL, Coq, or Isabelle
have so far been counter-balanced by somewhat inaccessible interaction via the
TTY (or minor variations like the well-known Proof General / Emacs interface).
Thus the fundamental question of math education tools with fully-formal
background theories has often been answered negatively due to accidental
weaknesses of existing proof engines.
The idea of "PIDE" (which means "Prover IDE") is to integrate existing
provers like Isabelle into a larger environment, that facilitates access by
end-users and other tools. We use Scala to expose the proof engine in ML to the
JVM world, where many user-interfaces, editor frameworks, and educational tools
already exist. This shall ultimately lead to combined mathematical assistants,
where the logical engine is in the background, without obstructing the view on
applications of formal methods, formalized mathematics, and math education in
particular.Comment: In Proceedings THedu'11, arXiv:1202.453
Improving the Representation and Conversion of Mathematical Formulae by Considering their Textual Context
Mathematical formulae represent complex semantic information in a concise
form. Especially in Science, Technology, Engineering, and Mathematics,
mathematical formulae are crucial to communicate information, e.g., in
scientific papers, and to perform computations using computer algebra systems.
Enabling computers to access the information encoded in mathematical formulae
requires machine-readable formats that can represent both the presentation and
content, i.e., the semantics, of formulae. Exchanging such information between
systems additionally requires conversion methods for mathematical
representation formats. We analyze how the semantic enrichment of formulae
improves the format conversion process and show that considering the textual
context of formulae reduces the error rate of such conversions. Our main
contributions are: (1) providing an openly available benchmark dataset for the
mathematical format conversion task consisting of a newly created test
collection, an extensive, manually curated gold standard and task-specific
evaluation metrics; (2) performing a quantitative evaluation of
state-of-the-art tools for mathematical format conversions; (3) presenting a
new approach that considers the textual context of formulae to reduce the error
rate for mathematical format conversions. Our benchmark dataset facilitates
future research on mathematical format conversions as well as research on many
problems in mathematical information retrieval. Because we annotated and linked
all components of formulae, e.g., identifiers, operators and other entities, to
Wikidata entries, the gold standard can, for instance, be used to train methods
for formula concept discovery and recognition. Such methods can then be applied
to improve mathematical information retrieval systems, e.g., for semantic
formula search, recommendation of mathematical content, or detection of
mathematical plagiarism.Comment: 10 pages, 4 figure
- …