POS Tagging and its Applications for Mathematics
Content analysis of scientific publications is a nontrivial task, but a
useful and important one for scientific information services. In the Gutenberg
era it was a domain of human experts; in the digital age many machine-based
methods, e.g., graph analysis tools and machine-learning techniques, have been
developed for it. Natural Language Processing (NLP) is a powerful
machine-learning approach to semiautomatic speech and language processing,
which is also applicable to mathematics. The well-established methods of NLP
have to be adjusted for the special needs of mathematics, in particular for
handling mathematical formulae. We demonstrate a mathematics-aware
part-of-speech tagger and give a short overview of our adaptation of NLP methods
for mathematical publications. We show the use of the tools developed for key
phrase extraction and classification in the database zbMATH.
The Semantic Multilingual Glossary of Mathematics (SMGloM) project or why do we need a semantic glossary of mathematics
In this overview, we describe a new terminological and notational base for mathematics: the Semantic Multilingual Glossary of Mathematics (SMGloM for short), an ontology for mathematical concepts, objects, or models. The terminological and notational data can be applied, e.g., to more semantic text and formula search, to the disambiguation of symbols and formulae in mathematical publications, or to the translation of mathematical terms. The paper focuses on presenting the intention, the needs, the framework, and user scenarios of the SMGloM concept, not the technical details of the data model and its implementation.
Automated document classification for the DeLiVerMath project
Building on appropriate taxonomies and semantic information, the DeLiVerMath project addresses the problem of automatic indexing for documents from the field of mathematics. In this context we evaluated several state-of-the-art text categorization and analysis techniques.