23,198 research outputs found

A fact-aligned corpus of numerical expressions

Author: Power Richard
Williams Sandra
Publication date: 01/01/2010

We describe a corpus of numerical expressions, developed as part of the NUMGEN project. The corpus contains newspaper articles and scientific papers in which exactly the same numerical facts are presented many times (both within and across texts). Some annotations of numerical facts are original: for example, numbers are automatically classified as round or non-round by an algorithm derived from Jansen and Pollmann (2001); also, numerical hedges such as 'about' or 'a little under' are marked up and classified semantically using arithmetical relations. Through explicit alignment of phrases describing the same fact, the corpus can support research on the influence of various contextual factors (e.g., document position, intended readership) on the way in which numerical facts are expressed. As an example we present results from an investigation showing that when a fact is mentioned more than once in a text, there is a clear tendency for precision to increase from first to subsequent mentions, and for mathematical level either to remain constant or to increase.

CiteSeerX

Open Research Online (The Open University)

$OntoMath^{PRO}$ Ontology: A Linked Data Hub for Mathematics

Author: C. Bizer
C. David
C. Lange
E. Sirin
E.V. Biryaltsev
F. Kamareddine
H. Barendregt
H.S. Barrows
M. Doerr
M. Kohlhase
N. Sloane
O. Nevzorova
O.A. Nevzorova
Publication date: 01/01/2014

In this paper, we present an ontology of mathematical knowledge concepts that covers a wide range of the fields of mathematics and introduces a balanced representation between comprehensive and sensible models. We demonstrate the applications of this representation in information extraction, semantic search, and education. We argue that the ontology can be a core of future integration of math-aware data sets in the Web of Data and, therefore, provide mappings onto relevant datasets, such as DBpedia and ScienceWISE. Comment: 15 pages, 6 images, 1 table, Knowledge Engineering and the Semantic Web - 5th International Conferenc

arXiv.org e-Print Archive

Crossref

Implicit reference to citations: a study of astronomy

Author: Kim Yunhyong
Webber Bonnie
Publication date: 05/10/2006

This paper presents results in the automatic classification of pronouns within articles into those which refer to cited research and those which do not. It also discusses the automatic linking of pronouns which do refer to citations to their corresponding citations. The current study focused on the pronoun 'they' as used in papers in Astronomy journals. The paper describes a classifier trained on maximum entropy principles using features defined by the distance to preceding citations and the category of verbs associated with the pronoun under consideration.

Enlighten

Supporting Usability and Reusability Based on eLearning Standards

Author: Blat Josep
Casado Francis
García Robles Rocío
Griffiths Dai
Martínez Juanjo
Sayago Sergio
Publication venue: IEEE Computer Society
Publication date: 01/01/2004

The IMS-QTI and other related specifications have been developed to support the creation of reusable and pedagogically neutral assessment scenarios and content, as stated by the IMS Global Learning Consortium. In this paper we discuss how current specifications both constrain the design of assessment scenarios and limit content reusability. We also suggest some solutions to overcome these limitations. The paper is based on our experience developing and testing an IMS QTI Lite compliant assessment authoring tool, QAed. It supports teacher centering, which is quite neglected when designing such tools. In the paper we also discuss how to make standards support and user centering compatible in eLearning applications, and provide some recommendations for the design of the user interfaces.

idUS. Depósito de Investigación Universidad de Sevilla

Three Steps to Heaven: Semantic Publishing in a Real World Workflow

Author: Bizer
Brazma
Knuth
Lamport
Lord
Phillip Lord
Robert Stevens
Shadbolt
Shotton
Simon Cockell
Publication venue: MDPI AG
Publication date: 01/01/2012

Semantic publishing offers the promise of computable papers, enriched visualisation and a realisation of the linked data ideal. In reality, however, the publication process contrives to prevent richer semantics while culminating in a 'lumpen' PDF. In this paper, we discuss a web-first approach to publication, and describe a three-tiered approach which integrates with the existing authoring tooling. Critically, although it adds limited semantics, it does provide value to all the participants in the process: the author, the reader and the machine. Comment: Published as part of SePublica 201

arXiv.org e-Print Archive

Multidisciplinary Digital Publishing Institute

Crossref

Directory of Open Access Journals

The University of Manchester - Institutional Repository

Applying a text mining framework to the extraction of numerical parameters from scientific literature in the biotechnology domain

Author: Lourenço Anália
Nogueira R.
Santos André Fernandes
Publication venue: Ediciones Universidad de Salamanca
Publication date: 01/01/2012

Scientific publications are the main vehicle to disseminate information in the field of biotechnology for wastewater treatment. Indeed, the new research paradigms and the application of high-throughput technologies have increased the rate of publication considerably. The problem is that manual curation becomes harder, error-prone and time-consuming, leading to a probable loss of information and inefficient knowledge acquisition. As a result, research outputs are hardly reaching engineers, hampering the calibration of mathematical models used to optimize the stability and performance of biotechnological systems. In this context, we have developed a data curation workflow, based on text mining techniques, to extract numerical parameters from scientific literature, and applied it to the biotechnology domain. A workflow was built to process wastewater-related articles with the main goal of identifying physico-chemical parameters mentioned in the text. This work describes the implementation of the workflow, identifies achievements and current limitations in the overall process, and presents the results obtained for a corpus of 50 full-text documents.

Universidade do Minho: RepositoriUM

Directory of Open Access Journals