2,593 research outputs found

Disambiguation strategies for cross-language information retrieval

Author: D. Harman
G. Salton
S.E. Robertson
Publication venue: Springer Verlag
Publication date: 01/01/1999

This paper gives an overview of tools and methods for Cross-Language Information Retrieval (CLIR) developed within the Twenty-One project. The tools and methods are evaluated with the TREC CLIR task document collection using Dutch queries on the English document base. The main issue addressed here is an evaluation of two approaches to disambiguation. The underlying question is whether substantial effort should go into finding the correct translation for each query term before searching, or whether searching with more than one possible translation leads to better results. The experimental study suggests that the quality of search methods is more important than the quality of disambiguation methods. Good retrieval methods are able to disambiguate translated queries implicitly during searching.

CiteSeerX

Crossref

Radboud Repository (Radboud Univ.)

University of Twente Research Information

A method for maintaining document consistency based on similarity contents

Author: G. Salton
P. Dourish
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2002

The advent of the WWW and distributed information systems has made it possible to share documents between different users and organisations. However, this has created many problems related to the security, accessibility, rights and, most importantly, consistency of documents. It is important that the people involved have access to the most up-to-date versions of the documents, retrieve the correct documents, and are able to update the document repository in such a way that their documents are known to others. In this paper we propose a method for organising, storing and retrieving documents based on the similarity of their contents. The method uses techniques based on information retrieval, document summarisation, and term extraction and indexing. This methodology is developed for the E-cognos project, which aims at developing tools for the management and sharing of documents in the construction domain.

University of Salford Institutional Repository

Crossref

Online Research @ Cardiff

Summarization of Dynamic Content in Web Collections

Author: G. Salton
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2004

Crossref

Object-based Image Ranking using Neural Networks

Author: A. Sajjanhar
C. Chen
G. C. Karmakar
G. Salton
Publication venue: Springer Science and Business Media LLC
Publication date: 01/05/2001

In this paper, object-based image ranking is performed using both supervised and unsupervised neural networks. The features are extracted based on the moment invariants, the run length, and a composite method. This paper also introduces a likeness parameter, namely a similarity measure using the weights of the neural networks. The experimental results show that the performance of image retrieval depends on the method of feature extraction, the type of learning, the values of the parameters of the neural networks, and the databases, including the query set. The best performance is achieved using supervised neural networks for the internal query set.

Crossref

Open Research Online

Numerical simulation of flow over a rough bed

Author: G. Salton
G.H. Silber
J.P. Callan
M.A.K. Halliday
M.F. Porter
Publication date: 01/01/2004

This paper presents results of a direct numerical simulation (DNS) of turbulent flow over the rough bed of an open channel. We consider a hexagonal arrangement of spheres on the channel bed. The depth of flow has been taken as four times the diameter of the spheres and the Reynolds number has been chosen so that the roughness Reynolds number is greater than 70, thus ensuring a fully rough flow. A parallel code based on finite difference, domain decomposition, and multigrid methods has been used for the DNS. Computed results are compared with available experimental data. We report the first- and second-order statistics and the variation of lift/drag and exchange coefficients. Good agreement with experimental results is seen for the mean velocity, turbulence intensities, and Reynolds stress. Further, the DNS results provide accurate quantitative statistics for rough bed flow. Detailed analysis of the DNS data confirms the streaky nature of the flow near the effective bed and the existence of a hierarchy of vortices aligned with the streamwise direction, and supports the wall similarity hypothesis. The computed exchange coefficients indicate a large degree of mixing between the fluid trapped below the midplane of the roughness elements and that above it.

CiteSeerX

Southampton (e-Prints Soton)

Crossref

Irish Universities

Sunderland University Institutional Repository

DCU Online Research Access Service

White Rose Research Online

Vertex similarity in networks

Author: A. W. Wolfe
E. A. Leicht
E. Ravasz
F. Lorrain
G. Jeh
G. Salton
L. Donetti
M. E. J. Newman
M. Molloy
Petter Holme
T. Łuczak
Publication venue: American Physical Society (APS)
Publication date: 14/10/2005

We consider methods for quantifying the similarity of vertices in networks. We propose a measure of similarity based on the concept that two vertices are similar if their immediate neighbors in the network are themselves similar. This leads to a self-consistent matrix formulation of similarity that can be evaluated iteratively using only a knowledge of the adjacency matrix of the network. We test our similarity measure on computer-generated networks for which the expected results are known, and on a number of real-world networks.
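
A minimal sketch of how a self-consistent similarity of this kind might be evaluated iteratively from the adjacency matrix alone. The abstract does not state the update rule, so the form S <- alpha * A * S + I, the damping factor `alpha`, and the function name `vertex_similarity` are illustrative assumptions rather than the authors' exact measure.

```python
def vertex_similarity(adj, alpha=0.5, tol=1e-10, max_iter=1000):
    """Iterate S <- alpha * A * S + I until no entry changes by more than tol.

    adj   : square adjacency matrix as a list of lists
    alpha : assumed damping factor; the iteration converges only if
            alpha < 1 / (largest eigenvalue of adj)
    """
    n = len(adj)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    sim = [row[:] for row in identity]  # start from S = I
    for _ in range(max_iter):
        # A vertex is similar to j if its neighbours are similar to j,
        # plus a self-similarity term from the identity matrix.
        new = [
            [alpha * sum(adj[i][k] * sim[k][j] for k in range(n)) + identity[i][j]
             for j in range(n)]
            for i in range(n)
        ]
        delta = max(abs(new[i][j] - sim[i][j]) for i in range(n) for j in range(n))
        sim = new
        if delta < tol:
            break
    return sim


# Path graph 0 - 1 - 2: vertices 0 and 2 are not adjacent but share neighbour 1.
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
scores = vertex_similarity(path)
print(round(scores[0][2], 6))
```

In this toy graph the two end vertices receive a non-zero similarity even though no edge joins them, which is the behaviour the neighbour-based definition is meant to capture.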

arXiv.org e-Print Archive

Crossref

Oxford University Research Archive

CERN Document Server

Sampled Weighted Min-Hashing for Large-Scale Topic Mining

Author: AZ Broder
DM Blei
G Fuentes Pineda
G Salton
GE Hinton
O Chum
YW Teh
Publication venue: Springer Science and Business Media LLC
Publication date: 07/09/2015

We present Sampled Weighted Min-Hashing (SWMH), a randomized approach to automatically mine topics from large-scale corpora. SWMH generates multiple random partitions of the corpus vocabulary based on term co-occurrence and agglomerates highly overlapping inter-partition cells to produce the mined topics. While other approaches define a topic as a probabilistic distribution over a vocabulary, SWMH topics are ordered subsets of such vocabulary. Interestingly, the topics mined by SWMH underlie themes from the corpus at different levels of granularity. We extensively evaluate the meaningfulness of the mined topics both qualitatively and quantitatively on the NIPS (1.7 K documents), 20 Newsgroups (20 K), Reuters (800 K) and Wikipedia (4 M) corpora. Additionally, we compare the quality of SWMH with Online LDA topics for document representation in classification. Comment: 10 pages, Proceedings of the Mexican Conference on Pattern Recognition 201
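
To illustrate the idea of a random partition of the vocabulary based on term co-occurrence, here is a rough sketch using plain, unweighted min-hashing. It is not SWMH itself: the weighting, sampling, and agglomeration steps are omitted, and the function name `minhash_partition`, its parameters, and the toy data are hypothetical.

```python
import random
from collections import defaultdict


def minhash_partition(term_docs, num_hashes=2, seed=0):
    """Group terms whose document sets share the same min-hash tuple.

    term_docs  : dict mapping each term to the non-empty set of document
                 IDs in which it occurs
    num_hashes : number of random permutations per tuple (assumed parameter)
    """
    rng = random.Random(seed)
    doc_ids = sorted({d for docs in term_docs.values() for d in docs})

    # Each random permutation of the document IDs acts as one hash function.
    perms = []
    for _ in range(num_hashes):
        order = doc_ids[:]
        rng.shuffle(order)
        perms.append({d: rank for rank, d in enumerate(order)})

    # A term's min-hash under a permutation is its earliest-ranked document.
    # Terms that co-occur in many documents are likely to share that value.
    cells = defaultdict(list)
    for term, docs in term_docs.items():
        key = tuple(min(docs, key=perm.__getitem__) for perm in perms)
        cells[key].append(term)
    return list(cells.values())


toy = {"neural": {1, 2}, "network": {1, 2}, "stock": {3}, "market": {3}}
print(minhash_partition(toy))
```

Terms with identical document sets always fall into the same cell, and terms with disjoint document sets never do; repeating the procedure with different seeds yields the multiple random partitions that the abstract describes combining.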

arXiv.org e-Print Archive

Crossref

Electronic Quantum Monte Carlo Calculations of Atomic Forces, Vibrations, and Anharmonicities

Author: Andrew M. Rappe
Massimo Mella
Myung Won Lee
Press W. H.
Salton G.
Publication venue: AIP Publishing
Publication date: 01/01/2005

Atomic forces are calculated for first-row monohydrides and carbon monoxide within electronic quantum Monte Carlo (QMC). Accurate and efficient forces are achieved by using an improved method for moving variational parameters in variational QMC. Newton's method with singular value decomposition (SVD) is combined with steepest descent (SD) updates along directions rejected by the SVD, after initial SD steps. Dissociation energies in variational and diffusion QMC agree well with experiment. The atomic forces agree quantitatively with potential energy surfaces, demonstrating the accuracy of this force procedure. The harmonic vibrational frequencies and anharmonicity constants, derived from the QMC energies and atomic forces, also agree well with experimental values. Comment: 6 pages, 2 figures; updated content

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Università dell'Insubria

Identifying Research Fields within Business and Management: A Journal Cross-Citation Analysis

Author: Hair J
John Mingers
Loet Leydesdorff
Mingers J
Salton G
Van Raan A
Publication venue: Springer Science and Business Media LLC
Publication date: 30/12/2012

A discipline such as business and management (B&M) is very broad and has many fields within it, ranging from fairly scientific ones such as management science or economics to softer ones such as information systems. There are at least three reasons why it is important to identify these sub-fields accurately: first, to give insight into the structure of the subject area and perhaps identify unrecognised commonalities; second, to normalize citation data, as it is well known that citation rates vary significantly between disciplines; and third, because journal rankings and lists tend to split their classifications into different subjects. For example, the Association of Business Schools (ABS) list, which is a standard in the UK, has 22 different fields. Unfortunately, at the moment these are created in an ad hoc manner with no underlying rigour. The purpose of this paper is to identify possible sub-fields in B&M rigorously, based on actual citation patterns. We have examined 450 journals in B&M that are included in the ISI Web of Science (WoS) and analysed the cross-citation rates between them, enabling us to generate sets of coherent and consistent sub-fields that minimise the extent to which journals appear in several categories. Implications and limitations of the analysis are discussed.

arXiv.org e-Print Archive

Crossref

Kent Academic Repository

International Migration, Integration and Social Cohesion online publications

Can a workspace help to overcome the query formulation problem in image retrieval?

Author: D.G. Hendry
G. Salton
J. Urban
J.M. Jose
M. Beaulieu
X.S. Zhou
Y. Rui
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2006

We have proposed a novel image retrieval system that incorporates a workspace where users can organise their search results. A task-oriented and user-centred experiment has been devised involving design professionals and several types of realistic search tasks. We study the workspace’s effect on two aspects: task conceptualisation and query formulation. A traditional relevance feedback system serves as the baseline. The results of this study show that the workspace is more useful than the baseline with respect to both of these aspects. The proposed approach leads to a more effective and enjoyable search experience.

CiteSeerX

Crossref

Enlighten