The Most Influential Paper Gerard Salton Never Wrote
Gerard Salton is often credited with developing the vector space model
(VSM) for information retrieval (IR). Citations to Salton give the impression
that the VSM must have been articulated as an IR model sometime between
1970 and 1975. However, the VSM as it is understood today evolved over a
longer time period than is usually acknowledged, and an articulation of the
model and its assumptions did not appear in print until several years after
those assumptions had been criticized and alternative models proposed. An
often cited overview paper titled "A Vector Space Model for Information
Retrieval" (alleged to have been published in 1975) does not exist, and
citations to it represent a confusion of two 1975 articles, neither of which
was an overview of the VSM as a model of information retrieval. Until the
late 1970s, Salton did not present vector spaces as models of IR generally
but rather as models of specific computations. Citations to the phantom
paper reflect an apparently widely held misconception that the operational
features and explanatory devices now associated with the VSM must have
been introduced at the same time it was first proposed as an IR model.
Approximate Dimension Reduction at NTCIR
We compared cross-language retrieval methods based on dimension reduction (latent semantic indexing) on the NTCIR-1 data. These methods all use a collection of parallel documents (translations or approximate translations) and very little, if any, linguistic knowledge. On NTCIR-1, we compared latent semantic indexing (LSI), local LSI, and approximate dimension equalization (ADE). We found that local LSI and ADE performed best on this collection and were comparable to the best-performing systems reported elsewhere. We also ran ADE on the NTCIR-2 data and found that it fared considerably less well. Keywords: Cross-language retrieval, approximate dimension equalization, latent semantic indexing, local LSI.