Search CORE

10,535 research outputs found

The study of probability model for compound similarity searching

Author: Abd. Wahid Mohd. Taib
Alwee Razana
Dollah @ Md. Zain Rozilawati
Salim Naomie
Publication venue: Faculty of Computer Science and Information System
Publication date: 30/09/2006

The main task of an Information Retrieval (IR) system is to retrieve documents relevant to the user's query. One of the most popular IR retrieval models is the Vector Space Model, which assumes relevance based on similarity, defined as the distance between query and document in the concept space. All existing chemical compound database systems have adopted the vector space model to calculate the similarity of a database entry to a query compound. However, this model assumes that the fragments represented by the bits are independent of one another, which is not necessarily true. Hence, this study explores applying another IR model, the Probabilistic Model (PM), to chemical compound searching. This model estimates the probability that a chemical structure has the same bioactivity as a target compound. It is envisioned that ranking chemical structures in decreasing order of their probability of relevance to the query structure can increase the effectiveness of a molecular similarity searching system. Both fragment dependence and independence assumptions are considered in seeking to improve compound similarity searching. After a series of simulated similarity searches, it is concluded that the PM approaches performed better than existing similarity searching, giving better results on all evaluation criteria. Of the two probability models, the BD model showed an improvement over the BIR model
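The probabilistic ranking idea in this abstract can be illustrated with a minimal sketch. It assumes that "BIR" denotes a binary-independence-style model, that fingerprints are represented as sets of "on" bit positions, and that a small set of known active and inactive compounds is available for estimating per-bit probabilities. The function names and toy data are hypothetical and are not taken from the study.

```python
import math

def bir_weights(actives, inactives, n_bits):
    """Per-bit log-odds weights under a binary independence assumption.

    actives / inactives: lists of sets of 'on' bit positions (toy fingerprints).
    Add-0.5 smoothing keeps every estimated probability strictly inside (0, 1).
    """
    weights = []
    for bit in range(n_bits):
        p = (sum(bit in fp for fp in actives) + 0.5) / (len(actives) + 1.0)
        q = (sum(bit in fp for fp in inactives) + 0.5) / (len(inactives) + 1.0)
        weights.append(math.log(p * (1 - q) / (q * (1 - p))))
    return weights

def bir_score(fingerprint, weights):
    """Sum the weights of the bits that are set in the fingerprint."""
    return sum(weights[bit] for bit in fingerprint)

def rank(database, weights):
    """Return compound names in decreasing order of estimated relevance."""
    return sorted(database, key=lambda name: bir_score(database[name], weights),
                  reverse=True)

# Hypothetical toy data: 4-bit fingerprints.
actives = [{0, 1}, {0, 2}]
inactives = [{3}, {2, 3}]
weights = bir_weights(actives, inactives, n_bits=4)
database = {"a": {0, 1}, "b": {3}, "c": {2}}
ranking = rank(database, weights)
```

A dependence-aware variant would replace the per-bit sum with a score that accounts for co-occurring fragments; this sketch covers only the independence case.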

Universiti Teknologi Malaysia Institutional Repository

SVS-JOIN : efficient spatial visual similarity join for geo-multimedia

Author: Huang Fang
Yu Hao
Yu Weiren
Zhang Chengyuan
Zhang Zuping
Zhu Lei
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 21/10/2019

In the big data era, massive amounts of geo-tagged multimedia data have been generated and collected by smart devices equipped with mobile communication and position sensor modules. This trend has placed higher demands on large-scale geo-multimedia retrieval. Spatial similarity join is one of the significant problems in the area of spatial databases. Previous work focused on the spatial textual document search problem rather than geo-multimedia retrieval. In this paper, we investigate a novel geo-multimedia retrieval paradigm named spatial visual similarity join (SVS-JOIN for short), which aims to find geo-image pairs that are similar in both geo-location and visual content. First, we define SVS-JOIN and present the geographical similarity and visual similarity measurements. Inspired by the approach for textual similarity join, we develop an algorithm named SVS-JOIN B by combining the PPJOIN algorithm with visual similarity. We also develop an extension named SVS-JOIN G, which uses a spatial grid strategy to improve search efficiency. To further speed up the search, we design a novel approach called SVS-JOIN Q, which employs a quadtree and a global inverted index. Comprehensive experiments on two geo-image datasets demonstrate that our solution addresses the SVS-JOIN problem effectively and efficiently
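The grid strategy mentioned in this abstract can be sketched roughly as follows. This is not the paper's SVS-JOIN G algorithm: it assumes planar coordinates, Euclidean distance as the geographical measure, Jaccard similarity over sets of visual words as the visual measure, and it omits PPJOIN-style prefix filtering. All names and thresholds are hypothetical.

```python
import math
from collections import defaultdict
from itertools import product

def jaccard(a, b):
    """Jaccard similarity of two sets of visual words."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def grid_join(images, max_dist, min_sim):
    """Find image pairs that are close in space and similar in visual content.

    images: dict id -> (x, y, set_of_visual_words); max_dist must be > 0.
    Cells have side max_dist, so any pair within max_dist lies in the same
    or an adjacent cell and only those cells need to be compared.
    """
    cells = defaultdict(list)
    for img_id, (x, y, _) in images.items():
        cells[(math.floor(x / max_dist), math.floor(y / max_dist))].append(img_id)

    pairs = set()
    for (cx, cy), members in cells.items():
        for dx, dy in product((-1, 0, 1), repeat=2):
            for a in members:
                for b in cells.get((cx + dx, cy + dy), ()):
                    if a >= b:  # report each unordered pair once
                        continue
                    ax, ay, a_words = images[a]
                    bx, by, b_words = images[b]
                    if (math.hypot(ax - bx, ay - by) <= max_dist
                            and jaccard(a_words, b_words) >= min_sim):
                        pairs.add((a, b))
    return sorted(pairs)

# Hypothetical toy data.
images = {
    "a": (0.0, 0.0, {1, 2, 3}),
    "b": (0.5, 0.5, {2, 3, 4}),
    "c": (10.0, 10.0, {1, 2, 3}),
    "d": (0.9, 0.0, {7, 8}),
}
result = grid_join(images, max_dist=1.0, min_sim=0.5)
```

Here "c" shares visual words with "a" but is too far away, and "d" is close to "a" but visually dissimilar, so neither forms a pair.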

Warwick Research Archives Portal Repository

Bounded Coordinate-Descent for Biological Sequence Classification in High Dimensional Predictor Space

Author: Ifrim Georgiana
Wiuf Carsten
Publication date: 03/08/2010

We present a framework for discriminative sequence classification where the learner works directly in the high dimensional predictor space of all subsequences in the training set. This is possible by employing a new coordinate-descent algorithm coupled with bounding the magnitude of the gradient for selecting discriminative subsequences fast. We characterize the loss functions for which our generic learning algorithm can be applied and present concrete implementations for logistic regression (binomial log-likelihood loss) and support vector machines (squared hinge loss). Application of our algorithm to protein remote homology detection and remote fold recognition results in performance comparable to that of state-of-the-art methods (e.g., kernel support vector machines). Unlike state-of-the-art classifiers, the resulting classification models are simply lists of weighted discriminative subsequences and can thus be interpreted and related to the biological problem

arXiv.org e-Print Archive

CiteSeerX

Toward Entity-Aware Search

Author: Cheng Tao
Publication date: 01/12/2010

As the Web has evolved into a data-rich repository, with the standard "page view," current search engines are becoming increasingly inadequate for a wide range of query tasks. While we often search for various data "entities" (e.g., phone number, paper PDF, date), today's engines only take us indirectly to pages. In my Ph.D. study, we focus on a novel type of Web search that is aware of data entities inside pages, a significant departure from traditional document retrieval. We study the various essential aspects of supporting entity-aware Web search. To begin with, we tackle the core challenge of ranking entities by distilling its underlying conceptual model, the Impression Model, and developing a probabilistic ranking framework, EntityRank, that seamlessly integrates both local and global information in ranking. We also report a prototype system built to show the initial promise of the proposal. Then, we aim at distilling and abstracting the essential computation requirements of entity search. From the dual views of reasoning (entity as input and entity as output), we propose a dual-inversion framework, with two indexing and partition schemes, for efficient and scalable query processing. Further, to recognize more entity instances, we study the problem of entity synonym discovery through mining query log data. The results obtained so far show the clear promise of entity-aware search in its usefulness, effectiveness, efficiency and scalability

Illinois Digital Environment for Access to Learning and Scholarship Repository

Index ordering by query-independent measures

Author: Alan F. Smeaton
Paul Ferguson
Publication venue: 'Elsevier BV'
Publication date: 01/05/2012

Conventional approaches to information retrieval search through all applicable entries in an inverted file for a particular collection in order to find the documents with the highest scores. For particularly large collections this may be extremely time consuming. A solution to this problem is to search only a limited portion of the collection at query time, in order to speed up the retrieval process, while also limiting the loss in retrieval efficacy (in terms of accuracy of results). We achieve this by first identifying the most “important” documents within the collection and sorting documents within inverted file lists in order of this “importance”. In this way we limit the amount of information to be searched at query time by eliminating documents of lesser importance, which not only makes the search more efficient but also limits the loss in retrieval accuracy. Our experiments, carried out on the TREC Terabyte collection, report significant savings in the number of postings examined, without significant loss of effectiveness, for several measures of importance used in isolation and in combination. Our results point to several ways in which the computational cost of searching large collections of documents can be significantly reduced
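The index-ordering idea in this abstract can be sketched in a few lines. The sketch assumes a precomputed query-independent importance score per document and scores matches by raw term frequency, which is a deliberate simplification. The function names, the `max_postings` cutoff, and the toy data are hypothetical and are not the authors' implementation.

```python
from collections import defaultdict

def build_index(docs, importance):
    """Build an inverted file whose posting lists are sorted by importance.

    docs: id -> list of terms; importance: id -> query-independent score.
    """
    index = defaultdict(list)
    for doc_id, terms in docs.items():
        for term in set(terms):
            index[term].append((doc_id, terms.count(term)))
    for postings in index.values():
        postings.sort(key=lambda p: importance[p[0]], reverse=True)
    return index

def search(index, query_terms, max_postings, k=10):
    """Score only the first max_postings entries of each posting list."""
    scores = defaultdict(float)
    examined = 0
    for term in query_terms:
        for doc_id, tf in index.get(term, [])[:max_postings]:
            scores[doc_id] += tf  # toy scoring: raw term frequency
            examined += 1
    ranked = sorted(scores, key=lambda d: (-scores[d], d))
    return ranked[:k], examined

# Hypothetical toy collection.
docs = {"d1": ["neutrino", "ice"], "d2": ["ice", "ice", "index"], "d3": ["ice"]}
importance = {"d1": 0.9, "d2": 0.5, "d3": 0.1}
index = build_index(docs, importance)
truncated, cost_truncated = search(index, ["ice"], max_postings=2)
full, cost_full = search(index, ["ice"], max_postings=10)
```

With the cutoff at two postings, the least important document is never examined, which shows the trade-off between postings examined and result completeness.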

Crossref

Irish Universities

DCU Online Research Access Service

The IceCube Neutrino Observatory Part VI: Ice Properties, Reconstruction and Future Developments

Author: Aartsen M. G.
Abbasi R.
Abdou Y.
Ackermann M.
Adams J.
Aguilar J. A.
Ahlers M.
Altmann D.
Auffenberg J.
Bai X.
Baker M.
Barwick S. W.
Baum V.
Bay R.
Beatty J. J.
Bechet S.
Becker K. -H.
Bell M.
Benabderrahmane M. L.
BenZvi S.
Berghaus P.
Berley D.
Bernardini E.
Bernhard A.
Bertrand D.
Besson D. Z.
Binder G.
Bindig D.
Bissok M.
Blaufuss E.
Blumenthal J.
Boersma D. J.
Bohaichuk S.
Bohm C.
Bose D.
Botner O.
Brayeur L.
Bretz H. -P.
Brown A. M.
Bruijn R.
Brunner J.
Böser S.
Carson M.
Casey J.
Casier M.
Chirkin D.
Christov A.
Christy B.
Clark K.
Clevermann F.
Coenders S.
Cohen S.
Cowen D. F.
Danninger M.
Daughhetee J.
Davis J. C.
De Clercq C.
De Ridder S.
de Vries K. D.
de With M.
Desiati P.
DeYoung T.
Dunkman M.
Díaz-Vélez J. C.
Eagan R.
Eberhardt B.
Eisch J.
Ellsworth R. W.
Euler S.
Evenson P. A.
Fadiran O.
Fazely A. R.
Fedynitch A.
Feintzeig J.
Feusels T.
Filimonov K.
Finley C.
Fischer-Wasels T.
Flis S.
Franckowiak A.
Frantzen K.
Fuchs T.
Gaisser T. K.
Gallagher J.
Gerhardt L.
Gladstone L.
Glüsenkamp T.
Goldschmidt A.
Golup G.
Gonzalez J. G.
Goodman J. A.
Grandmont D. T.
Grant D.
Groß A.
Góra D.
Ha C.
Hallen P.
Hallgren A.
Halzen F.
Hanson K.
Heereman D.
Heinen D.
Helbing K.
Hellauer R.
Heros C. Pérez de los
Hickford S.
Hill G. C.
Hoffman K. D.
Hoffmann R.
Homeier A.
Hoshina K.
Huelsnitz W.
Hulth P. O.
Hultqvist K.
Hussain S.
IceCube Collaboration
Ishihara A.
Ismail A. Haj
Jacobi E.
Jacobsen J.
Jagielski K.
Japaridze G. S.
Jero K.
Jlelati O.
Kaminsky B.
Kappes A.
Karg T.
Karle A.
Kelley J. L.
Kiryluk J.
Klein S. R.
Kläs J.
Kohnen G.
Kolanoski H.
Kopper C.
Kopper S.
Koskinen D. J.
Kowalski M.
Krasberg M.
Krings K.
Kroll G.
Kunnen J.
Kurahashi N.
Kuwabara T.
Köhne J. -H.
Köpke L.
Labare M.
Landsman H.
Larson M. J.
Lesiak-Bzdak M.
Leuermann M.
Leute J.
Lünemann J.
Madsen J.
Maggi G.
Maruyama R.
Mase K.
Matis H. S.
McNally F.
Meagher K.
Merck M.
Meures T.
Miarecki S.
Middell E.
Milke N.
Miller J.
Mohrmann L.
Montaruli T.
Morse R.
Mészáros P.
Nahnhauer R.
Naumann U.
Niederhausen H.
Nowicki S. C.
Nygren D. R.
O'Murchadha A.
Obertacke A.
Odrowski S.
Olivas A.
Olivo M.
Paul L.
Pepper J. A.
Pfendner C.
Pieloth D.
Pinat E.
Posselt J.
Price P. B.
Przybylski G. T.
Rameez M.
Rawlins K.
Redl P.
Reimann R.
Resconi E.
Rhode W.
Ribordy M.
Richman M.
Riedel B.
Rodrigues J. P.
Rott C.
Ruhe T.
Ruzybayev B.
Ryckbosch D.
Rädel L.
Saba S. M.
Salameh T.
Sander H. -G.
Santander M.
Sarkar S.
Schatto K.
Scheel M.
Scheriau F.
Schmidt T.
Schmitz M.
Schoenen S.
Schukraft A.
Schulte L.
Schulz O.
Schöneberg S.
Schönwald A.
Seckel D.
Sestayo Y.
Seunarine S.
Shanidze R.
Sheremata C.
Silva A. H. Cruz
Smith M. W. E.
Soldin D.
Spiczak G. M.
Spiering C.
Stamatikos M.
Stanev T.
Stasik A.
Stezelberger T.
Stokstad R. G.
Strahler E. A.
Ström R.
Stößl A.
Sullivan G. W.
Taavola H.
Taboada I.
Tamburro A.
Tepe A.
Ter-Antonyan S.
Tešić G.
Tilav S.
Tjus J. Becker
Toale P. A.
Toscano S.
Usner M.
van der Drift D.
van Eijndhoven N.
Van Overloop A.
van Santen J.
Vehring M.
Voge M.
Vraeghe M.
Walck C.
Waldenmaier T.
Wallraff M.
Wasserman R.
Weaver Ch.
Wellons M.
Wendt C.
Westerhoff S.
Whitehorn N.
Wiebe K.
Wiebusch C. H.
Williams D. R.
Wissing H.
Wolf M.
Wood T. R.
Woschnagg K.
Xu D. L.
Xu X. W.
Yanez J. P.
Yodh G.
Yoshida S.
Zarzhitsky P.
Ziemann J.
Zierke S.
Zoll M.
Publication date: 01/01/2013

Papers on ice properties, reconstruction and future developments submitted to the 33rd International Cosmic Ray Conference (Rio de Janeiro 2013) by the IceCube Collaboration. Comment: 28 pages, 38 figures; papers submitted to the 33rd International Cosmic Ray Conference, Rio de Janeiro 2013; version 2 corrects errors in the author list

arXiv.org e-Print Archive

DESY

Oxford University Research Archive

SemIndex: Semantic-Aware Inverted Index

Author: Al Assad Marc
Chbeir Richard
Luo Yi
Raymundo Ibañez Carlos Arturo
Tekli Joe
Traina Jr Caetano
Traina Agma J. M.
Universidad Peruana de Ciencias Aplicadas (UPC)
Yetongnon Kokou
Publication venue: Springer International Publishing
Publication date: 01/01/2014

This paper focuses on the important problem of semantic-aware search in textual (structured, semi-structured, NoSQL) databases. This problem has emerged as a required extension of the standard containment keyword-based query to meet user needs in textual databases and IR applications. We provide here a new approach, called SemIndex, that extends the standard inverted index by constructing a tightly coupled inverted index graph that combines two main resources: a general-purpose semantic network and a standard inverted index on a collection of textual data. We also provide an extended query model and related processing algorithms built on SemIndex. To investigate its effectiveness, we set up experiments to test the performance of SemIndex. Preliminary results have demonstrated the effectiveness, scalability and optimality of our approach. This study is partly funded by: Bourgogne Region program, CNRS, STIC AmSud project Geo-Climate XMine, and LAU grant SOERC-1314T012. Peer reviewed
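The combination of a semantic network with an inverted index can be illustrated with a minimal sketch. Unlike SemIndex, which couples both resources into a single graph, this sketch keeps them as two separate dictionaries and simply expands the query term through the network before the index lookup. The network, the `hops` parameter, and the function names are hypothetical.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each lowercase token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index

def semantic_search(index, network, term, hops=1):
    """Return documents containing the term or any term within `hops` links.

    network: term -> iterable of semantically related terms (toy semantic net).
    hops=0 reduces to a plain keyword lookup.
    """
    frontier, seen = {term}, {term}
    for _ in range(hops):
        frontier = {n for t in frontier for n in network.get(t, ())} - seen
        seen |= frontier
    hits = set()
    for t in seen:
        hits |= index.get(t, set())
    return hits

# Hypothetical toy data.
docs = {1: "fast car", 2: "red automobile", 3: "green bicycle"}
network = {
    "car": ["automobile"],
    "automobile": ["car", "vehicle"],
    "vehicle": ["bicycle"],
}
index = build_inverted_index(docs)
keyword_only = semantic_search(index, network, "car", hops=0)
one_hop = semantic_search(index, network, "car", hops=1)
three_hops = semantic_search(index, network, "car", hops=3)
```

Increasing `hops` widens recall at the cost of precision, which is the kind of trade-off an extended query model would need to control.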

HAL-uB

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Repositorio Académico UPC

Sense-Based Arabic Information Retrieval Using Harmony Search Algorithm

Author: Abdul Hassan Alia
Hadi Mustafa
Publication venue: University of Information and Technology Communications
Publication date: 01/08/2017

Information Retrieval (IR) is a field of computer science that deals with storing, searching, and retrieving documents that satisfy the user's need. Modern Standard Arabic is rich in words with multiple meanings (senses), largely because of the lack of diacritical marks. Finding the appropriate meaning is therefore a key requirement in most Arabic IR (AIR) applications. A successful system should not attend only to retrieval quality while ignoring system efficiency. Thus, this paper contributes to improving system effectiveness by finding an appropriate stemming methodology, word sense disambiguation, and query expansion to address the retrieval quality of AIR. It also contributes to improving system efficiency by using a powerful metaheuristic search, the Harmony Search (HS) algorithm, which is inspired by musical improvisation processes. The proposed system outperforms the traditional system by 19.5% while reducing latency by approximately 0.077 seconds per query

Iraqi Journal for Computers and Informatics

Directory of Open Access Journals