    The Effect of the Multi-Layer Text Summarization Model on the Efficiency and Relevancy of the Vector Space-based Information Retrieval

    The massive volume of text uploaded to the internet creates huge inverted indexes in information retrieval (IR) systems, which hurts their efficiency. The purpose of this research is to measure the effect of the Multi-Layer Similarity model of automatic text summarization on building an informative and condensed inverted index in IR systems. To this end, we summarized a considerable number of documents using the Multi-Layer Similarity model and built the inverted index from the automatic summaries it generated. A series of experiments was conducted to test performance in terms of efficiency and relevancy. The experiments include comparisons with three existing text summarization models: the Jaccard Coefficient Model, the Vector Space Model, and the Latent Semantic Analysis model. The experiments examined three groups of queries with manual and automatic relevancy assessment. The Multi-Layer Similarity model clearly improved the efficiency of the IR system without noticeable loss in relevancy. However, the evaluation showed that the traditional statistical models without semantic investigation failed to improve information retrieval efficiency. Compared with previous publications that addressed the use of summaries as a source for the index, the relevancy assessment of our work was higher, and the Multi-Layer Similarity model produced an inverted index that was 58% smaller than the inverted index of the main corpus.

    HII: Histogram Inverted Index For Fast Images Retrieval

    This work aims to improve search speed by creating an indexing structure for a content-based image retrieval (CBIR) system. We use a modified version of the inverted index structure that is usually used in text retrieval. The modified inverted index is built from histogram data generated using the Multi Texton Histogram (MTH) and the Multi Texton Co-Occurrence Descriptor (MTCD) from 10,000 images of the Corel dataset. When building the inverted index, we normalised the value of each feature into a real number and considered the feature-value pairs that are owned by a particular number of images. In our investigation on the MTCD histogram with 5,000 test data, we found that by considering histogram variable values owned by at most 12% of the images, the number of comparisons for each query can be reduced by 67.47%, with a precision of 82.2% and a disk access rate of 32.83%. We name our approach Histogram Inverted Index (HII).
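
    The idea of keeping only feature-value pairs owned by a limited share of images can be illustrated with a minimal Python sketch. It is not the authors' implementation: the quantisation of normalised values into `bins` buckets, the function names, and the toy histograms are all assumptions made for illustration; only the "owned by at most a given share of images" filter comes from the abstract.

```python
from collections import defaultdict


def quantise(value, bins):
    """Map a normalised value in [0, 1] to a bucket number (assumed scheme)."""
    return min(int(value * bins), bins - 1)


def build_hii(histograms, bins=10, max_share=0.12):
    """Build postings: (feature index, bucket) -> set of image ids.

    Keys owned by more than `max_share` of all images are dropped,
    because they discriminate poorly between images.
    """
    postings = defaultdict(set)
    for image_id, hist in histograms.items():
        for feature, value in enumerate(hist):
            postings[(feature, quantise(value, bins))].add(image_id)
    limit = max_share * len(histograms)
    return {key: ids for key, ids in postings.items() if len(ids) <= limit}


def candidates(index, query_hist, bins=10):
    """Collect images sharing at least one retained key with the query."""
    found = set()
    for feature, value in enumerate(query_hist):
        found |= index.get((feature, quantise(value, bins)), set())
    return found


# Hypothetical toy data: four images with two normalised features each.
toy = {
    "a": [0.10, 0.90],
    "b": [0.12, 0.50],
    "c": [0.80, 0.55],
    "d": [0.85, 0.95],
}
index = build_hii(toy, bins=10, max_share=0.5)
shortlist = candidates(index, [0.11, 0.52], bins=10)
```

    Only the images in `shortlist` would then need a full histogram comparison, which is where a reduction in the number of comparisons per query would come from.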

    The Spectrum and Variability of Circular Polarization in Sagittarius A* from 1.4 to 15 GHz

    We report here multi-epoch, multi-frequency observations of the circular polarization in Sagittarius A*, the compact radio source in the Galactic Center. Data taken from the VLA archive indicate that the fractional circular polarization at 4.8 GHz was -0.31% with an rms scatter of 0.13% from 1981 to 1998, in spite of a factor of 2 change in the total intensity. The sign remained negative over the entire time range, indicating a stable magnetic field polarity. In the summer of 1999 we obtained 13 epochs of VLA A-array observations at 1.4, 4.8, 8.4 and 15 GHz. In May, September and October of 1999 we obtained 11 epochs of Australia Telescope Compact Array observations at 4.8 and 8.5 GHz. In all three of the data sets, we find no evidence for linear polarization greater than 0.1% in spite of strong circular polarization detections. Both VLA and ATCA data sets support three conclusions regarding the fractional circular polarization: the average spectrum is inverted with a spectral index ~0.5 +/- 0.2; the degree of variability is roughly constant on timescales of days to years; and the degree of variability increases with frequency. We also observed that the largest increase in fractional circular polarization was coincident with the brightest flare in total intensity. Significant variability in the total intensity and fractional circular polarization on a timescale of 1 hour was observed during this flare, indicating an upper limit to the size of 70 AU at 15 GHz. The fractional circular polarization at 15 GHz reached -1.1% and the spectral index is strongly inverted during this flare. We conclude that the spectrum has two components that match the high and low frequency total intensity components. (abridged) Comment: Accepted for publication in ApJ, 40 pages, 18 figure

    Packing and Padding: Coupled Multi-index for Accurate Image Retrieval

    In Bag-of-Words (BoW) based image retrieval, the SIFT visual word has low discriminative power, so false positive matches are prevalent. Apart from the information loss during quantization, another cause is that the SIFT feature only describes the local gradient distribution. To address this problem, this paper proposes a coupled Multi-Index (c-MI) framework to perform feature fusion at the indexing level. Complementary features are coupled into a multi-dimensional inverted index. Each dimension of c-MI corresponds to one kind of feature, and the retrieval process votes for images that are similar in both the SIFT and the other feature spaces. Specifically, we exploit the fusion of a local color feature into c-MI. While the precision of visual matching is greatly enhanced, we adopt Multiple Assignment to improve recall. The joint use of SIFT and color features significantly reduces the impact of false positive matches. Extensive experiments on several benchmark datasets demonstrate that c-MI improves retrieval accuracy significantly while consuming only half of the query time of the baseline. Importantly, we show that c-MI is complementary to many prior techniques. Combining these methods, we obtain an mAP of 85.8% and an N-S score of 3.85 on the Holidays and Ukbench datasets, respectively, which compare favorably with the state of the art. Comment: 8 pages, 7 figures, 6 tables. Accepted to CVPR 201
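
    The coupling described in this abstract can be pictured as an inverted index whose key has one component per feature type. The Python sketch below is a simplified illustration under stated assumptions, not the paper's method: visual words are plain integers, each vote has weight 1, and Multiple Assignment is modelled by letting a query feature carry several candidate color words. All names and data are hypothetical.

```python
from collections import Counter, defaultdict


def build_cmi(features):
    """Build a two-dimensional index: (sift_word, color_word) -> image ids.

    `features` is an iterable of (image_id, sift_word, color_word) triples.
    """
    index = defaultdict(list)
    for image_id, sift_word, color_word in features:
        index[(sift_word, color_word)].append(image_id)
    return index


def query_cmi(index, query_features):
    """Vote for images that match in BOTH feature spaces.

    `query_features` is an iterable of (sift_word, color_words) pairs, where
    `color_words` lists the color words assigned to the query feature; more
    than one entry mimics Multiple Assignment on the color dimension.
    """
    votes = Counter()
    for sift_word, color_words in query_features:
        for color_word in color_words:
            for image_id in index.get((sift_word, color_word), ()):
                votes[image_id] += 1
    return votes


# Hypothetical database: img2 shares SIFT word 7 with img1 but differs in color.
database = [("img1", 7, 2), ("img2", 7, 5), ("img1", 3, 2)]
index = build_cmi(database)
votes = query_cmi(index, [(7, [2]), (3, [2, 4])])
```

    In this toy case `img2` receives no vote even though it shares a SIFT word with the query, which is the kind of false positive match the coupled key is meant to suppress.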

    Document Ranking in Proximity Full-Text Search Using Indexes with Multi-Component Keys

    The problem of proximity full-text search is considered. If a search query contains frequently occurring words, then multi-component key indexes improve the search speed in comparison with ordinary inverted indexes. It was shown that the search speed can be increased by up to 130 times when queries consist of frequently occurring words. In this paper, we investigate how the multi-component key index architecture affects the quality of the search. We consider several well-known relevance ranking methods proposed by different authors. Using these methods, we perform the search first in the ordinary inverted index and then in the index that is enhanced with multi-component key indexes. The results show that with multi-component key indexes we obtain search results that are very close, in terms of relevance ranking, to the search results obtained by means of ordinary inverted indexes. © 2021 Udmurt State University. All rights reserved.
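
    To make the notion of a multi-component key concrete, here is a minimal Python sketch of an index keyed by word pairs that occur close together. It is a simplification for illustration only: it assumes two-component keys, whitespace tokenisation, a fixed `max_distance`, and a ranking by smallest gap; none of these details are taken from the paper, and the function names are hypothetical.

```python
from collections import defaultdict


def build_pair_index(docs, max_distance=3):
    """Build postings: (first_word, later_word) -> [(doc_id, pos_a, pos_b)].

    A pair is recorded only when the second word follows the first within
    `max_distance` positions, so a proximity query needs a single lookup
    instead of intersecting two long posting lists.
    """
    index = defaultdict(list)
    for doc_id, text in docs.items():
        words = text.lower().split()
        for i, first in enumerate(words):
            for j in range(i + 1, min(i + 1 + max_distance, len(words))):
                index[(first, words[j])].append((doc_id, i, j))
    return index


def proximity_search(index, first, second):
    """Return doc ids ordered by the smallest gap between the two words."""
    best = {}
    for key in ((first, second), (second, first)):
        for doc_id, i, j in index.get(key, ()):
            gap = j - i
            best[doc_id] = min(gap, best.get(doc_id, gap))
    return sorted(best, key=lambda doc_id: (best[doc_id], doc_id))


# Hypothetical documents built from very frequent words.
docs = {1: "to be or not to be", 2: "be quick to act", 3: "nothing here"}
pair_index = build_pair_index(docs, max_distance=3)
ranked = proximity_search(pair_index, "to", "be")
```

    The trade-off in this sketch is a larger index in exchange for avoiding scans over the long posting lists of frequent words, which is consistent with the speed-up the abstract reports for such queries.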