    PARTIAL COORDINATION: A PRELIMINARY EVALUATION AND FAILURE ANALYSIS

    Partial coordination is a new method for cataloging documents for subject access. It is especially designed to enhance the precision of document searches in online environments. This paper reports a preliminary evaluation of partial coordination, which shows promising results compared with full-text retrieval. We also report the difficulties in empirically evaluating the effectiveness of automatic full-text retrieval in contrast to mixed methods, such as partial coordination, which combine human cataloging with computerized retrieval. Based on our study, we propose that research in this area would substantially benefit from a common framework for failure analysis and a common data set. This would allow information retrieval researchers adapting "library style" cataloging to large electronic document collections, as well as those developing automated or mixed methods, to directly compare their proposals for indexing and retrieval. The paper concludes by suggesting guidelines for constructing such a testbed. (Information Systems Working Papers Series)
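
    The evaluation the abstract refers to rests on standard retrieval measures. As a minimal illustration, the sketch below computes precision and recall for a single query; the document identifiers and relevance judgments are hypothetical, not data from the paper.

```python
# Minimal sketch: precision and recall for one query, the core measures
# behind the kind of evaluation the paper reports. All identifiers below
# are hypothetical.

def precision_recall(retrieved: set, relevant: set) -> tuple[float, float]:
    """Return (precision, recall) for a single retrieval run."""
    hits = retrieved & relevant  # relevant documents actually retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical run against hypothetical relevance judgments.
retrieved = {"d01", "d04", "d07", "d09"}
relevant = {"d04", "d07", "d12"}
p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.50 recall=0.67
```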

    Full-Text Retrieval: Systems and Files

    Much of the development in the first 30 years of library automation has been in solving the problem of identifying relevant sources. Automation of the library's card catalog provides a finding tool for the library's collections. The books, journals, films, and other materials located through the catalog still mostly reside in their original form, with no direct connection to the automated finding tool. Most of the early development in electronic publishing was also aimed solely at identifying information sources. Secondary publishers, notably publishers of indexing/abstracting serials, were the first to provide their resources in electronic form. Throughout the 1970s and much of the 1980s, indexing/abstracting (bibliographic) databases were predominant in the online database world. The first CD-ROM databases for libraries were many of these same bibliographic files. Bibliographic databases were traditionally, and still are, the most widely used type of electronic resource in libraries. Starting in the mid-1980s, owing to great increases in disk storage capacities and better document conversion techniques, full texts of certain types of documents became more widely available. In the 1990s, full-text databases (files) are the most rapidly growing type of commercially available database. Better text-retrieval software is leading to more locally created full-text databases as well. Perhaps in this decade we will at last electronically solve the document delivery problem as well as the document location problem.

    Access to recorded interviews: A research agenda

    Recorded interviews form a rich basis for scholarly inquiry. Examples include oral histories, community memory projects, and interviews conducted for broadcast media. Emerging technologies offer the potential to radically transform the way in which recorded interviews are made accessible, but this vision will demand substantial investments from a broad range of research communities. This article reviews the present state of practice for making recorded interviews available and the state of the art for key component technologies. A large number of important research issues are identified, and from that set of issues a coherent research agenda is proposed.

    Machine Learning in Automated Text Categorization

    The automated categorization (or classification) of texts into predefined categories has witnessed booming interest in the last ten years, due to the increased availability of documents in digital form and the ensuing need to organize them. In the research community the dominant approach to this problem is based on machine learning techniques: a general inductive process automatically builds a classifier by learning, from a set of preclassified documents, the characteristics of the categories. The advantages of this approach over the knowledge engineering approach (consisting of the manual definition of a classifier by domain experts) are very good effectiveness, considerable savings in terms of expert manpower, and straightforward portability to different domains. This survey discusses the main approaches to text categorization that fall within the machine learning paradigm. We will discuss in detail issues pertaining to three different problems, namely document representation, classifier construction, and classifier evaluation. (Comment: Accepted for publication in ACM Computing Surveys.)
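
    As an illustration of the inductive process the survey describes, the sketch below builds a classifier from a handful of preclassified documents. It uses scikit-learn, which the paper does not reference, and toy data; it is a minimal instance of the document representation / classifier construction / evaluation pipeline, not the survey's own method.

```python
# Minimal sketch of inductive text categorization: a classifier is built
# automatically from preclassified documents. scikit-learn and the toy
# corpus are illustrative assumptions, not from the paper.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Preclassified training documents; the document representation (here a
# simple TF-IDF bag of words) is learned from these.
train_docs = [
    "interest rates and stock markets fell today",
    "the central bank raised its key rate",
    "the team won the championship final",
    "a last-minute goal decided the match",
]
train_labels = ["finance", "finance", "sports", "sports"]

# Classifier construction: vectorizer + multinomial naive Bayes.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(train_docs, train_labels)

# Classifier evaluation would use a held-out test set; here one new document.
print(model.predict(["bond yields rose after the rate decision"]))  # ['finance']
```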

    The Computer as a Tool for Legal Research


    Automated speech and audio analysis for semantic access to multimedia

    The deployment and integration of audio processing tools can enhance the semantic annotation of multimedia content and, as a consequence, improve the effectiveness of conceptual access tools. This paper overviews the various ways in which automatic speech and audio analysis can contribute to increased granularity of automatically extracted metadata. A number of techniques will be presented, including the alignment of speech and text resources, large-vocabulary speech recognition, keyword spotting, and speaker classification. The applicability of the techniques will be discussed from a media-crossing perspective. The added value of the techniques and their potential contribution to the content value chain will be illustrated by the description of two (complementary) demonstrators for browsing broadcast news archives.
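
    One of the listed techniques, keyword spotting, can be illustrated with a minimal sketch: scanning a time-stamped transcript (as produced by a speech recognizer) for query terms. The transcript format and keywords below are hypothetical; production systems typically search recognition lattices rather than a single one-best transcript.

```python
# Minimal sketch of keyword spotting over a time-stamped ASR transcript.
# The (start_time, word) format and the keyword set are hypothetical.
transcript = [
    (0.0, "good"), (0.4, "evening"), (1.1, "the"),
    (1.3, "election"), (1.9, "results"), (2.5, "are"), (2.7, "in"),
]
keywords = {"election", "results"}

# Collect (start_time, word) for every keyword hit, enabling the kind of
# jump-to-segment browsing the broadcast-news demonstrators support.
hits = [(t, w) for t, w in transcript if w.lower() in keywords]
for t, w in hits:
    print(f"{t:5.1f}s  {w}")
```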

    Weak signal identification with semantic web mining

    We investigate the automated identification of weak signals, in the sense of Ansoff, to improve strategic planning and technological forecasting. The literature shows that weak signals can be found in an organization's environment and that they appear in different contexts. We use internet information to represent the organization's environment, selecting those websites that are related to a given hypothesis. In contrast to related research, a methodology is provided that uses latent semantic indexing (LSI) for the identification of weak signals. This improves existing knowledge-based approaches because LSI considers aspects of meaning and is thus able to identify similar textual patterns in different contexts. A new weak-signal maximization approach is introduced that replaces the prediction modeling approach commonly used in LSI. It enables the calculation of the largest number of relevant weak signals represented by singular value decomposition (SVD) dimensions. A case study identifies and analyses weak signals to predict trends in the field of on-site medical oxygen production, supporting the planning of research and development (R&D) for a medical oxygen supplier. As a result, it is shown that the proposed methodology enables organizations to identify weak signals from the internet for a given hypothesis, helping strategic planners to react ahead of time.
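
    As a rough illustration of the LSI step, the sketch below applies a truncated SVD to a TF-IDF representation of a small web-text corpus. scikit-learn and the example documents are illustrative assumptions; the paper's weak-signal maximization over SVD dimensions is not reproduced here.

```python
# Minimal sketch of latent semantic indexing (LSI) via truncated SVD,
# the core technique the methodology builds on. Corpus is hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

# Website texts standing in for the organization's environment.
corpus = [
    "portable oxygen concentrator approved for home care",
    "new membrane improves on-site gas separation efficiency",
    "hospital logistics for cylinder-based oxygen supply",
    "startup demonstrates modular on-site oxygen production",
]

tfidf = TfidfVectorizer(stop_words="english")
X = tfidf.fit_transform(corpus)

# Each SVD dimension groups co-occurring terms, so similar textual
# patterns surface across different contexts; weak-signal analysis would
# then inspect the resulting dimensions rather than raw term matches.
svd = TruncatedSVD(n_components=2, random_state=0)
doc_coords = svd.fit_transform(X)
print(doc_coords.round(2))  # each row: one document in LSI space
```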