The extension and application of Swets's theory of information retrieval
PhD Thesis
The thesis comprises (1) a critical interpretation of Swets's
contribution to information retrieval, (2) development (i.e.
"extension") of the formalism, as so interpreted, and (3) a
description of an experiment that identifies hypotheses consistent
with the extended formalism. The early sections of the thesis
place the original contribution by Swets in the contexts of both
signal-detection theory and information retrieval theory. It is
then argued that as the original theoretical contribution is
ambiguous in key respects, an interpretation of it is necessary.
The interpretation given constitutes an initial development of
Swets's work but other developments, not simply a consequence of
the interpretation of the original description by Swets, are also
put forward. The major one of these is the explicit incorporation
in the formalism of logical search expressions. Elementary logical
conjuncts of search terms are seen as (1) being weakly ordered by
"document ordering expressions", and (2) having probability-pairs
attached to disjunctions of them defined by the ordering. A major
part of the thesis is the identification of novel hypotheses,
expressed within the extension of the original formalism, which
relate to triples of: (1) instances of information need in medicine,
represented by prespecified partitionings of a medical-literature
data base (MEDLARS), (2) an analytical document ordering expression,
and (3) an algorithmically-derived set of terms characterising the
information need. An enhancement is suggested to data base management
programs that at present employ only user-specified logical
search expressions by way of search input, this enhancement
stemming directly from the extension of the original formalism. The
broad conclusion of the thesis is that when the original contribution
of Swets is suitably interpreted and extended, a robust, hospitable
conceptual framework for describing information retrieval at the
macroscopic level is provided.
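
As an illustration of the idea of weakly ordered elementary conjuncts with attached probability-pairs, the following is a minimal sketch and not the formalism of the thesis itself. It assumes that an elementary conjunct is the presence/absence pattern of the search terms in a document, that the "document ordering expression" can be stood in for by a hypothetical `score` function (here simply the number of matching terms), and that the probability-pair for each cumulative disjunction is (proportion of relevant documents retrieved, proportion of non-relevant documents retrieved). The function name, the toy documents and the relevance partition are all invented for the example.

```python
def probability_pairs(docs, relevant, query_terms, score):
    """Return one (P(retrieved | relevant), P(retrieved | non-relevant))
    pair per rank level, for cumulative disjunctions of conjuncts.

    docs        -- dict: document id -> set of index terms
    relevant    -- set of document ids judged relevant (the partition)
    query_terms -- list of search terms
    score       -- hypothetical ordering expression: conjunct -> number;
                   equal scores give ties, hence a weak ordering
    Assumes at least one relevant and one non-relevant document.
    """
    n_rel = len(relevant)
    n_non = len(docs) - n_rel

    # Group documents by the score of their elementary conjunct.
    by_score = {}
    for doc_id, terms in docs.items():
        conjunct = tuple(t in terms for t in query_terms)
        by_score.setdefault(score(conjunct), []).append(doc_id)

    # Walk the rank levels from best to worst, accumulating the disjunction.
    pairs = []
    hit_rel = hit_non = 0
    for s in sorted(by_score, reverse=True):
        for doc_id in by_score[s]:
            if doc_id in relevant:
                hit_rel += 1
            else:
                hit_non += 1
        pairs.append((hit_rel / n_rel, hit_non / n_non))
    return pairs


# Toy data (invented): six documents, two search terms, two relevant documents.
docs = {1: {"a", "b"}, 2: {"a"}, 3: {"b"}, 4: set(), 5: {"a", "b"}, 6: {"a"}}
pairs = probability_pairs(docs, relevant={1, 2}, query_terms=["a", "b"], score=sum)
print(pairs)  # [(0.5, 0.25), (1.0, 0.75), (1.0, 1.0)]
```

Plotting the first element of each pair against the second gives an operating-characteristic curve of the kind used in signal-detection theory.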
Statistical studies of patents literature
This study has been undertaken to determine what pseudo-proprietary information and patenting activity statistics could be derived from an online patents database. To achieve this, a thorough investigation was made of patenting in the field of an important group of beta-lactam antibiotics, the cephalosporins. Patents data was retrieved from the World Patents Index online files of Derwent Publications Limited, and the bibliographic details of each retrieved patent application were analysed according to numbers of patents per patentee, priority and publication dates, types of patents, etc. A review of technological advances in this subject was conducted, demonstrating the value of patents literature for such purposes. The relationship between sales volumes and patenting activity for cephalosporins patentees has been investigated and found to show a significant correlation between these parameters. As an extension, the USA patenting and sales activity for the leading USA industrial corporations (the 1981 Fortune 500) was studied; overall a high correlation was exhibited, but there were notable differences between different industries. A number of bibliometric studies have been undertaken with a variety of patents data for a number of technologies. These studies include the application of Bradford-Zipf plots, other productivity studies and Vector Analysis to patents. Whilst previous studies on journal literature have investigated the applicability of frequency distributions as measures of author productivity, this study has for the first time applied Lotka's Law, Price's Pareto-type Distribution, Simon-Yule Distribution, Shockley's Lognormal Distribution, Borel-Tanner Distribution, Williams Geometric Series, Fisher's Logarithmic Series and the Negative Binomial Distribution to patents data. Theoretical distributions were ascertained using a series of microcomputer programs written in the BASIC programming language.
The results indicate that, of the distributions investigated, the Negative Binomial most closely fits the observed data when goodness-of-fit is measured by the Kolmogorov-Smirnov Test.
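
To illustrate how a theoretical productivity distribution can be compared with observed counts using a Kolmogorov-Smirnov statistic, the following is a minimal sketch in Python rather than the BASIC programs used in the study. It assumes the statistic is taken as the largest absolute difference between the observed and theoretical cumulative proportions, and it uses a truncated Lotka-type inverse-power law purely because it is the simplest of the listed distributions to write down. The function names, the exponent, the truncation point and the observed counts are all invented for the example and are not data from the study.

```python
def ks_statistic(observed, pmf):
    """Largest absolute gap between observed and theoretical cumulative
    proportions.

    observed -- dict: x (patents per patentee) -> number of patentees with x
    pmf      -- function: x -> theoretical probability of x
    """
    total = sum(observed.values())
    cum_obs = cum_theo = 0.0
    d_max = 0.0
    for x in range(1, max(observed) + 1):
        cum_obs += observed.get(x, 0) / total
        cum_theo += pmf(x)
        d_max = max(d_max, abs(cum_obs - cum_theo))
    return d_max


def make_lotka_pmf(exponent=2.0, x_max=1000):
    """Truncated inverse-power law: P(x) proportional to x**-exponent
    for x = 1..x_max. Both parameters are assumptions for illustration."""
    norm = sum(k ** -exponent for k in range(1, x_max + 1))
    return lambda x: x ** -exponent / norm


# Invented counts: number of patentees holding 1, 2, 3, ... patents.
observed = {1: 120, 2: 30, 3: 14, 4: 8, 5: 5, 6: 3}
d = ks_statistic(observed, make_lotka_pmf())
print(f"KS statistic against the Lotka-type law: {d:.4f}")
```

Repeating the calculation with each candidate distribution and comparing the resulting statistics is one way to rank the fits; a smaller value indicates a closer fit.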