19,096 research outputs found

    The study of probability model for compound similarity searching

    The main task of an Information Retrieval (IR) system is to retrieve documents relevant to the user's query. One of the most popular IR retrieval models is the Vector Space Model, which assumes relevance based on similarity, defined as the distance between query and document in the concept space. All currently existing chemical compound database systems have adapted the vector space model to calculate the similarity of a database entry to a query compound. However, this model assumes that the fragments represented by the bits are independent of one another, which is not necessarily true. Hence, the possibility of applying another IR model, the Probabilistic Model (PM), to chemical compound searching is explored. This model estimates the probability that a chemical structure has the same bioactivity as a target compound. It is envisioned that ranking chemical structures in decreasing order of their probability of relevance to the query structure can increase the effectiveness of a molecular similarity searching system. Both fragment dependence and independence assumptions are taken into consideration in seeking to improve compound similarity searching. After a series of simulated similarity searches, it is concluded that the PM approaches performed better than the existing similarity searching, giving better results on all evaluation criteria. Of the probability models, the BD model showed improvement over the BIR model.
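To make the contrast in this abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes fingerprints are stored as sets of on-bit indices, uses the Tanimoto coefficient as a stand-in for the similarity-based baseline (the abstract does not name a coefficient), and assumes "BIR" refers to a binary independence retrieval model scored with smoothed Robertson-Sparck Jones weights. The function names and the toy data are hypothetical.

```python
import math

def tanimoto(a, b):
    """Tanimoto coefficient between two fingerprints (sets of on-bit indices)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def rsj_weights(actives, database, n_bits):
    """Per-bit relevance weights with 0.5 smoothing.

    Assumption: `actives` (known relevant compounds) is a subset of `database`.
    """
    N, R = len(database), len(actives)
    weights = []
    for i in range(n_bits):
        n = sum(1 for fp in database if i in fp)  # compounds with bit i set
        r = sum(1 for fp in actives if i in fp)   # actives with bit i set
        odds_relevant = (r + 0.5) / (R - r + 0.5)
        odds_other = (n - r + 0.5) / (N - n - R + r + 0.5)
        weights.append(math.log(odds_relevant / odds_other))
    return weights

def bir_score(query, candidate, weights):
    """Sum the weights of bits shared by query and candidate.

    Treats bits as independent, which is the assumption the abstract questions.
    """
    return sum(weights[i] for i in query & candidate)

# Hypothetical toy data: five fingerprints over five bits, first two are actives.
database = [{0, 1, 2}, {0, 1, 3}, {2, 3}, {3, 4}, {4}]
actives = database[:2]
weights = rsj_weights(actives, database, n_bits=5)
query = {0, 1, 2}
ranked = sorted(database, key=lambda fp: bir_score(query, fp, weights), reverse=True)
```

Ranking by `bir_score` orders compounds by an estimate of their odds of relevance rather than by raw bit overlap. A dependence-aware model such as the BD model mentioned above would need additional terms that this sketch does not attempt.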

    Phonetic Searching

    An improved method and apparatus are disclosed which use probabilistic techniques to map an input search string to a prestored audio file and to recognize certain portions of a search string phonetically. An improved interface is disclosed which permits users to input search strings as linguistics, phonetics, or a combination of both, and which also allows logic functions to be specified by indicating how far apart specific phonemes are in time. Georgia Tech Research Corporation

    Applying contextual memory cues for retrieval from personal information archives

    Advances in digital technologies for information capture combined with massive increases in the capacity of digital storage media mean that it is now possible to capture and store one’s entire life experiences in a Human Digital Memory (HDM). Information can be captured from a myriad of personal information devices including desktop computers, PDAs, digital cameras, video and audio recorders, and various sensors, including GPS, Bluetooth, and biometric devices. These diverse collections of personal information are potentially very valuable, but will only be so if significant information can be reliably retrieved from them. HDMs differ from the traditional document collections for which existing search technologies have been developed, since users may have poor recollection of the contents or even the existence of stored items. Additionally, HDM data is highly heterogeneous and unstructured, making it difficult to form search queries. We believe that a Personal Information Management (PIM) system which exploits the context of information capture, and potentially of earlier refinding, can be valuable in effective retrieval from an HDM. We report an investigation into how individuals perform searches of their personal information, and use the outcome of this study to develop an information retrieval (IR) framework for HDM search incorporating the context of document capture. We then describe the creation of a pilot HDM test collection, and initial experiments in retrieval from this collection. Results from these experiments indicate that the use of context data can be significantly beneficial in increasing the efficiency of retrieval of partially recalled items from an HDM.

    Searching by approximate personal-name matching

    We discuss the design, building and evaluation of a method to access the information of a person using the person's name as a search key, even if the name is deformed. We present a similarity function, the DEA function, based on the probabilities of the edit operations according to the letters involved and their position, and using a variable threshold. The efficacy of DEA is evaluated quantitatively, without human relevance judgments, and is found to be far superior to that of known methods. A very efficient approximate search technique for the DEA function, based on a compacted trie-tree structure, is also presented. Postprint (published version)
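The general idea of letter-dependent edit costs combined with a variable threshold can be sketched as follows. This is not the DEA function, whose definition is not given in the abstract. It is a generic weighted Levenshtein distance in which the confusable letter pairs, the cost values, and the length-proportional threshold are all illustrative assumptions, and the position-dependent costs mentioned in the abstract are not modelled.

```python
def weighted_edit_distance(a, b, sub_cost, indel_cost=1.0):
    """Levenshtein distance with a caller-supplied substitution cost."""
    prev = [j * indel_cost for j in range(len(b) + 1)]
    for i, ca in enumerate(a, 1):
        cur = [i * indel_cost]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + indel_cost,            # delete ca
                cur[j - 1] + indel_cost,         # insert cb
                prev[j - 1] + sub_cost(ca, cb),  # substitute or match
            ))
        prev = cur
    return prev[-1]

# Hypothetical set of letter pairs that are often confused in names.
CONFUSABLE = {frozenset("sz"), frozenset("vb"), frozenset("iy")}

def sub_cost(x, y):
    """Cheap substitution for confusable pairs, full cost otherwise."""
    if x == y:
        return 0.0
    return 0.3 if frozenset((x, y)) in CONFUSABLE else 1.0

def names_match(a, b, rate=0.2):
    """Accept a pair when the distance is within a length-dependent threshold."""
    threshold = rate * max(len(a), len(b))
    return weighted_edit_distance(a.lower(), b.lower(), sub_cost) <= threshold
```

With these assumed costs, `names_match("Gonzalez", "Gonsalez")` accepts the pair because the only difference is a cheap s/z substitution, while an unrelated name of the same length is rejected. Scaling the threshold with name length is one simple way to make it variable. The abstract does not say how DEA sets its threshold.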

    The Computer as a Tool for Legal Research
