
475 research outputs found

Safeguarding Old and New Journal Tables for the VO: Status for Extragalactic and Radio Data

Author: Andernach Heinz
Publication venue: 'Codata'
Publication date: 01/01/2009

Independent of established data centers, and partly for my own research, since 1989 I have been collecting the tabular data from over 2600 articles concerned with radio sources and extragalactic objects in general. Optical character recognition (OCR) was used to recover tables from 740 papers. Tables from only 41 percent of the 2600 articles are available in the CDS or CATS catalog collections, and only slightly better coverage is estimated for the NED database. This fraction is not better for articles published electronically since 2001. Both object databases (NED, SIMBAD, LEDA) and catalog browsers (VizieR, CATS) need to be consulted to obtain the most complete information on astronomical objects. More human resources at the data centers and better collaboration between authors, referees, editors, publishers, and data centers are required to improve data coverage and accessibility. The current efforts within the Virtual Observatory (VO) project, to provide retrieval and analysis tools for different types of published and archival data stored at various sites, should be balanced by an equal effort to recover and include large amounts of published data not currently available in this way. Comment: 11 pages, 4 figures; accepted for publication in Data Science Journal, vol. 8 (2009), http://dsj.codataweb.org; presented at Special Session "Astronomical Data and the Virtual Observatory" at the conference "CODATA 21", Kiev, Ukraine, October 5-8, 200

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

A Machine Learning Approach to the Classification of Dialogue Utterances

Author: Andernach Toine
Publication date: 01/01/1996

The purpose of this paper is to present a method for automatic classification of dialogue utterances and the results of applying that method to a corpus. Superficial features of a set of training utterances (which we will call cues) are taken as the basis for finding relevant utterance classes and for extracting rules for assigning these classes to new utterances. Each cue is assumed to partially contribute to the communicative function of an utterance. Instead of relying on subjective judgments for the tasks of finding classes and rules, we opt for using machine learning techniques to guarantee objectivity. Comment: 12 pages, using nemlap.sty, harvard.sty and agsm.bst, to appear in Proceedings of NeMLaP-2, Bilkent University, Ankara, Turke

arXiv.org e-Print Archive

CiteSeerX

University of Twente Research Information