2,593 research outputs found

    Disambiguation strategies for cross-language information retrieval

    This paper gives an overview of tools and methods for Cross-Language Information Retrieval (CLIR) developed within the Twenty-One project. The tools and methods are evaluated on the TREC CLIR task document collection, using Dutch queries against the English document base. The main issue addressed is an evaluation of two approaches to disambiguation. The underlying question is whether considerable effort should be put into finding the correct translation for each query term before searching, or whether searching with more than one possible translation leads to better results. The experimental study suggests that the quality of search methods is more important than the quality of disambiguation methods: good retrieval methods are able to disambiguate translated queries implicitly during searching.

    A method for maintaining document consistency based on similarity contents

    The advent of the WWW and distributed information systems has made it possible to share documents between different users and organisations. However, this has created many problems related to the security, accessibility, rights and, most importantly, consistency of documents. It is important that the people involved have access to the most up-to-date version of the documents, can retrieve the correct documents, and can update the document repository in such a way that their changes are known to others. In this paper we propose a method for organising, storing and retrieving documents based on content similarity. The method uses techniques from information retrieval, document summarisation, term extraction and indexing. This methodology is developed for the E-cognos project, which aims at developing tools for the management and sharing of documents in the construction domain.

    Summarization of Dynamic Content in Web Collections

    Object-based Image Ranking using Neural Networks

    In this paper, object-based image ranking is performed using both supervised and unsupervised neural networks. The features are extracted using moment invariants, the run length, and a composite method. This paper also introduces a likeness parameter, namely a similarity measure using the weights of the neural networks. The experimental results show that the performance of image retrieval depends on the method of feature extraction, the type of learning, the values of the parameters of the neural networks, and the databases, including the query set. The best performance is achieved using supervised neural networks for an internal query set.

    Numerical simulation of flow over a rough bed

    This paper presents results of a direct numerical simulation (DNS) of turbulent flow over the rough bed of an open channel. We consider a hexagonal arrangement of spheres on the channel bed. The depth of flow has been taken as four times the diameter of the spheres and the Reynolds number has been chosen so that the roughness Reynolds number is greater than 70, thus ensuring a fully rough flow. A parallel code based on finite difference, domain decomposition, and multigrid methods has been used for the DNS. Computed results are compared with available experimental data. We report the first- and second-order statistics, variation of lift/drag and exchange coefficients. Good agreement with experimental results is seen for the mean velocity, turbulence intensities, and Reynolds stress. Further, the DNS results provide accurate quantitative statistics for rough bed flow. Detailed analysis of the DNS data confirms the streaky nature of the flow near the effective bed and the existence of a hierarchy of vortices aligned with the streamwise direction, and supports the wall similarity hypothesis. The computed exchange coefficients indicate a large degree of mixing between the fluid trapped below the midplane of the roughness elements and that above it

    Vertex similarity in networks

    We consider methods for quantifying the similarity of vertices in networks. We propose a measure of similarity based on the concept that two vertices are similar if their immediate neighbors in the network are themselves similar. This leads to a self-consistent matrix formulation of similarity that can be evaluated iteratively using only a knowledge of the adjacency matrix of the network. We test our similarity measure on computer-generated networks for which the expected results are known, and on a number of real-world networks
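The iterative evaluation described in this abstract can be illustrated with a minimal sketch. The abstract does not give the update rule, so the form below is an assumption: `S <- (alpha / lambda_max) * A @ S + I`, where `A` is the adjacency matrix of an undirected graph, `lambda_max` is its largest eigenvalue in magnitude, and `alpha < 1` is a hypothetical damping parameter that makes the iteration converge. The function name and defaults are illustrative only.

```python
import numpy as np


def iterative_vertex_similarity(adjacency, alpha=0.9, tol=1e-10, max_iter=1000):
    """Iterate S <- (alpha / lambda_max) * A @ S + I until S stops changing.

    Assumed form, not taken from the abstract: vertices are similar when their
    neighbours are similar, plus a self-similarity term (the identity matrix).
    The adjacency matrix is assumed symmetric (undirected graph).
    """
    A = np.asarray(adjacency, dtype=float)
    n = A.shape[0]
    identity = np.eye(n)

    # Largest eigenvalue magnitude; eigvalsh requires a symmetric matrix.
    lam = np.max(np.abs(np.linalg.eigvalsh(A)))
    if lam == 0.0:
        # No edges: each vertex is similar only to itself.
        return identity

    M = (alpha / lam) * A  # spectral radius of M is alpha < 1
    S = identity.copy()
    for _ in range(max_iter):
        S_next = M @ S + identity
        if np.max(np.abs(S_next - S)) < tol:
            return S_next
        S = S_next
    return S


if __name__ == "__main__":
    # Path graph 0 - 1 - 2 - 3 as a small test network.
    path = np.array([[0, 1, 0, 0],
                     [1, 0, 1, 0],
                     [0, 1, 0, 1],
                     [0, 0, 1, 0]])
    print(np.round(iterative_vertex_similarity(path), 3))
```

Under these assumptions the fixed point equals `inv(I - (alpha / lambda_max) * A)`, so the iteration needs only the adjacency matrix, as the abstract states. The measure proposed in the paper may include further normalisation not shown here.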

    Sampled Weighted Min-Hashing for Large-Scale Topic Mining

    We present Sampled Weighted Min-Hashing (SWMH), a randomized approach to automatically mine topics from large-scale corpora. SWMH generates multiple random partitions of the corpus vocabulary based on term co-occurrence and agglomerates highly overlapping inter-partition cells to produce the mined topics. While other approaches define a topic as a probabilistic distribution over a vocabulary, SWMH topics are ordered subsets of such vocabulary. Interestingly, the topics mined by SWMH underlie themes from the corpus at different levels of granularity. We extensively evaluate the meaningfulness of the mined topics both qualitatively and quantitatively on the NIPS (1.7 K documents), 20 Newsgroups (20 K), Reuters (800 K) and Wikipedia (4 M) corpora. Additionally, we compare the quality of SWMH with Online LDA topics for document representation in classification. Comment: 10 pages, Proceedings of the Mexican Conference on Pattern Recognition 201
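As background for the abstract above, here is a minimal sketch of plain min-hashing, the building block that SWMH extends. It is not the sampled, weighted variant from the paper. It assumes each term is represented by the set of document ids in which it occurs, so that terms with similar signatures tend to co-occur. The function names and the use of salted BLAKE2b as the hash family are illustrative choices.

```python
import hashlib


def minhash_signature(items, num_hashes=128):
    """Return a min-hash signature for a non-empty collection of hashable items.

    Each of the num_hashes salted hash functions maps every item to a 64-bit
    integer; the signature keeps the minimum value per hash function.
    """
    encoded = [str(item).encode("utf-8") for item in items]
    if not encoded:
        raise ValueError("cannot min-hash an empty set")

    signature = []
    for i in range(num_hashes):
        salt = i.to_bytes(8, "little")  # a different salt gives a different hash function
        smallest = min(
            int.from_bytes(
                hashlib.blake2b(data, digest_size=8, salt=salt).digest(), "little"
            )
            for data in encoded
        )
        signature.append(smallest)
    return signature


def estimate_jaccard(sig_a, sig_b):
    """Estimate Jaccard similarity as the fraction of matching signature slots."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


if __name__ == "__main__":
    # Hypothetical document-id sets for two terms; true Jaccard similarity is 1/3.
    docs_with_term_a = range(0, 100)
    docs_with_term_b = range(50, 150)
    sig_a = minhash_signature(docs_with_term_a)
    sig_b = minhash_signature(docs_with_term_b)
    print(estimate_jaccard(sig_a, sig_b))
```

The printed estimate is random around the true overlap, and its variance shrinks as `num_hashes` grows. Grouping terms whose signatures collide is one way to form the vocabulary partitions the abstract mentions, though the paper's exact procedure is not given here.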

    Electronic Quantum Monte Carlo Calculations of Atomic Forces, Vibrations, and Anharmonicities

    Atomic forces are calculated for first-row monohydrides and carbon monoxide within electronic quantum Monte Carlo (QMC). Accurate and efficient forces are achieved by using an improved method for moving variational parameters in variational QMC. Newton's method with singular value decomposition (SVD) is combined with steepest descent (SD) updates along directions rejected by the SVD, after initial SD steps. Dissociation energies in variational and diffusion QMC agree well with experiment. The atomic forces agree quantitatively with potential energy surfaces, demonstrating the accuracy of this force procedure. The harmonic vibrational frequencies and anharmonicity constants, derived from the QMC energies and atomic forces, also agree well with experimental values. Comment: 6 pages, 2 figures; updated conten
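The parameter update described in this abstract can be sketched generically. The sketch below is one possible reading, not the authors' implementation: it assumes that "directions rejected by the SVD" are singular directions of the Hessian whose singular values fall below a relative cutoff, takes a Newton step in the retained subspace, and takes a small steepest-descent step along the rejected directions. The names `sv_cutoff` and `sd_step` and their default values are hypothetical.

```python
import numpy as np


def newton_svd_sd_step(gradient, hessian, sv_cutoff=1e-3, sd_step=0.1):
    """Return a parameter update combining a truncated Newton step with SD.

    Retained singular directions (s > sv_cutoff * s_max) receive a Newton
    step via the truncated pseudo-inverse of the Hessian; rejected directions
    receive a steepest-descent step of size sd_step.
    """
    g = np.asarray(gradient, dtype=float)
    H = np.asarray(hessian, dtype=float)

    U, s, Vt = np.linalg.svd(H)
    keep = s > sv_cutoff * s.max()

    # Newton step in the retained subspace: -V_k diag(1 / s_k) U_k^T g.
    newton = -(Vt[keep].T * (1.0 / s[keep])) @ (U[:, keep].T @ g)

    # Steepest descent along the rejected right-singular directions.
    rejected = Vt[~keep]
    descent = -sd_step * (rejected.T @ (rejected @ g))

    return newton + descent


if __name__ == "__main__":
    # Toy quadratic with one flat direction (singular Hessian).
    hessian = np.array([[2.0, 0.0],
                        [0.0, 0.0]])
    gradient = np.array([2.0, 3.0])
    print(newton_svd_sd_step(gradient, hessian))
```

In a QMC setting the gradient and Hessian are noisy estimates, which is one motivation for discarding poorly determined directions instead of inverting the full matrix. The initial SD-only steps mentioned in the abstract are omitted from this sketch.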

    Identifying Research Fields within Business and Management: A Journal Cross-Citation Analysis

    A discipline such as business and management (B&M) is very broad and has many fields within it, ranging from fairly scientific ones such as management science or economics to softer ones such as information systems. There are at least three reasons why it is important to identify these sub-fields accurately: first, to give insight into the structure of the subject area and perhaps identify unrecognised commonalities; second, to normalise citation data, since it is well known that citation rates vary significantly between disciplines; and third, because journal rankings and lists tend to split their classifications into different subjects. For example, the Association of Business Schools (ABS) list, which is a standard in the UK, has 22 different fields. Unfortunately, at the moment these are created in an ad hoc manner with no underlying rigour. The purpose of this paper is to identify possible sub-fields in B&M rigorously, based on actual citation patterns. We have examined 450 journals in B&M that are included in the ISI Web of Science (WoS) and analysed the cross-citation rates between them, enabling us to generate sets of coherent and consistent sub-fields that minimise the extent to which journals appear in several categories. Implications and limitations of the analysis are discussed.

    Can a workspace help to overcome the query formulation problem in image retrieval?

    We have proposed a novel image retrieval system that incorporates a workspace where users can organise their search results. A task-oriented and user-centred experiment has been devised involving design professionals and several types of realistic search tasks. We study the workspace's effect on two aspects: task conceptualisation and query formulation. A traditional relevance feedback system serves as the baseline. The results of this study show that the workspace is more useful than the baseline with respect to both aspects. The proposed approach leads to a more effective and enjoyable search experience.