Search CORE

14,319 research outputs found

The study of probability model for compound similarity searching

Author: Abd. Wahid Mohd. Taib
Alwee Razana
Dollah @ Md. Zain Rozilawati
Salim Naomie
Publication venue: Faculty of Computer Science and Information System
Publication date: 30/09/2006
Field of study

Information Retrieval or IR system main task is to retrieve relevant documents according to the users query. One of IR most popular retrieval model is the Vector Space Model. This model assumes relevance based on similarity, which is defined as the distance between query and document in the concept space. All currently existing chemical compound database systems have adapt the vector space model to calculate the similarity of a database entry to a query compound. However, it assumes that fragments represented by the bits are independent of one another, which is not necessarily true. Hence, the possibility of applying another IR model is explored, which is the Probabilistic Model, for chemical compound searching. This model estimates the probabilities of a chemical structure to have the same bioactivity as a target compound. It is envisioned that by ranking chemical structures in decreasing order of their probability of relevance to the query structure, the effectiveness of a molecular similarity searching system can be increased. Both fragment dependencies and independencies assumption are taken into consideration in achieving improvement towards compound similarity searching system. After conducting a series of simulated similarity searching, it is concluded that PM approaches really did perform better than the existing similarity searching. It gave better result in all evaluation criteria to confirm this statement. In terms of which probability model performs better, the BD model shown improvement over the BIR model

Universiti Teknologi Malaysia Institutional Repository

Biomedical Chinese-English CLIR Using an Extended CMeSH Resource to Expand Queries

Author: Ananiadou S
Thompson P
Wang X
Publication venue
Publication date: 01/01/2012
Field of study

The University of Manchester - Institutional Repository

Exploring Protein-Protein Interactions as Drug Targets for Anti-cancer Therapy with In Silico Workflows

Author: A Goncearenco
A Goncearenco
A Marchler-Bauer
A Truszkowski
AA Bogan
B Graves
B Ma
BA Shoemaker
BA Shoemaker
BA Shoemaker
BJ Smith
CA Goble
CM Yates
D Petrey
E Cukuroglu
FP Davis
H Perez-Sanchez
HS Haase
J Bhagat
J Cinatl
JA Wells
K Wolstencroft
M Guharoy
M Li
M Li
M Li
M Petukh
M Tyagi
MK Gilson
MP Mazanetz
N Estrada-Ortiz
P Aloy
P Aloy
P Filippakopoulos
R Mosca
RR Thangudu
S Beisken
S Kim
S Shangary
S Teng
T Rolland
W Yang
WS Valdar
Y Wang
Y Zhao
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017
Field of study

We describe a computational protocol to aid the design of small molecule and peptide drugs that target protein-protein interactions, particularly for anti-cancer therapy. To achieve this goal, we explore multiple strategies, including finding binding hot spots, incorporating chemical similarity and bioactivity data, and sampling similar binding sites from homologous protein complexes. We demonstrate how to combine existing interdisciplinary resources with examples of semi-automated workflows. Finally, we discuss several major problems, including the occurrence of drug-resistant mutations, drug promiscuity, and the design of dual-effect inhibitors.Fil: Goncearenco, Alexander. National Institutes of Health; Estados UnidosFil: Li, Minghui. Soochow University; China. National Institutes of Health; Estados UnidosFil: Simonetti, Franco Lucio. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigaciones Bioquímicas de Buenos Aires. Fundación Instituto Leloir. Instituto de Investigaciones Bioquímicas de Buenos Aires; ArgentinaFil: Shoemaker, Benjamin A. National Institutes of Health; Estados UnidosFil: Panchenko, Anna R. National Institutes of Health; Estados Unido

Crossref

CONICET Digital

Multi-Perspective Relevance Matching with Hierarchical ConvNets for Social Media Search

Author: Lin Jimmy
Rao Jinfeng
Ture Ferhan
Yang Wei
Zhang Yuhao
Publication venue
Publication date: 21/06/2019
Field of study

Despite substantial interest in applications of neural networks to information retrieval, neural ranking models have only been applied to standard ad hoc retrieval tasks over web pages and newswire documents. This paper proposes MP-HCNN (Multi-Perspective Hierarchical Convolutional Neural Network) a novel neural ranking model specifically designed for ranking short social media posts. We identify document length, informal language, and heterogeneous relevance signals as features that distinguish documents in our domain, and present a model specifically designed with these characteristics in mind. Our model uses hierarchical convolutional layers to learn latent semantic soft-match relevance signals at the character, word, and phrase levels. A pooling-based similarity measurement layer integrates evidence from multiple types of matches between the query, the social media post, as well as URLs contained in the post. Extensive experiments using Twitter data from the TREC Microblog Tracks 2011--2014 show that our model significantly outperforms prior feature-based as well and existing neural ranking models. To our best knowledge, this paper presents the first substantial work tackling search over social media posts using neural ranking models.Comment: AAAI 2019, 10 page

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

BIKE: Bilingual Keyphrase Experiments

Author: Barrière Caroline
George Foster
Nadeau David
Publication venue
Publication date: 01/01/2005
Field of study

This paper presents a novel strategy for translating lists of keyphrases. Typical keyphrase lists appear in scientific articles, information retrieval systems and web page meta-data. Our system combines a statistical translation model trained on a bilingual corpus of scientific papers with sense-focused look-up in a large bilingual terminological resource. For the latter, we developed a novel technique that benefits from viewing the keyphrase list as contextual help for sense disambiguation. The optimal combination of modules was discovered by a genetic algorithm. Our work applies to the French / English language pair

NRC Publications Archive

CogPrints Cognitive Sciences Eprint Archive

Spoken content retrieval: A survey of techniques and technologies

Author: Ani Nenkova
C A. Nenkova
K. Mckeown
Kathleen Mckeown
Publication venue: 'Now Publishers'
Publication date: 01/01/2012
Field of study

Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight on how these fields are integrated to support research and development, thus addressing the core challenges of SCR

CiteSeerX

Crossref

Irish Universities

DCU Online Research Access Service

Automated simulation as part of a design workstation

Author: Cantwell Elizabeth
Robinson P.
Shenk T.
Upadhye R.
Publication venue
Publication date
Field of study

A development project for a design workstation for advanced life-support systems (called the DAWN Project, for Design Assistant Workstation), incorporating qualitative simulation, required the implementation of a useful qualitative simulation capability and the integration of qualitative and quantitative simulation such that simulation capabilities are maximized without duplication. The reason is that to produce design solutions to a system goal, the behavior of the system in both a steady and perturbed state must be represented. The Qualitative Simulation Tool (QST), on an expert-system-like model building and simulation interface toll called ScratchPad (SP), and on the integration of QST and SP with more conventional, commercially available simulation packages now being applied in the evaluation of life-support system processes and components are discussed

NASA Technical Reports Server

Cross-lingual Question Answering with QED

Author: Ahn Kisuh
Alex Beatrice
Bos Johan
Dalmas Tiphaine
Leidner Jochen L.
Smillie Matthew B.
Publication venue
Publication date: 01/01/2004
Field of study

We present improvements and modifications of the QED open-domain question answering system developed for TREC-2003 to make it cross-lingual for participation in the CrossLinguistic Evaluation Forum (CLEF) Question Answering Track 2004 for the source languages French and German and the target language English. We use rule-based question translation extended with surface pattern-oriented pre- and post-processing rules for question reformulation to create and English query from its French or German original. Our system uses deep processing for the question and answers, which requires efficient and radical prior search space pruning. For answering factoid questions, we report an accuracy of 16% (German to English) and 20% (French to English), respectively

CiteSeerX

Edinburgh Research Explorer

OmniSearch: A semantic search system based on the Ontology for MIcroRNA Target (OMIT) for microRNA-target gene interaction data

Author: et al
Wang Xiaowei
Publication venue: Digital Commons@Becker
Publication date: 01/01/2016
Field of study

Digital Commons@Becker

Dynamic load balancing for the distributed mining of molecular structures

Author: Berthold M.R.
Di Fatta Giuseppe
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2006
Field of study

In molecular biology, it is often desirable to find common properties in large numbers of drug candidates. One family of methods stems from the data mining community, where algorithms to find frequent graphs have received increasing attention over the past years. However, the computational complexity of the underlying problem and the large amount of data to be explored essentially render sequential algorithms useless. In this paper, we present a distributed approach to the frequent subgraph mining problem to discover interesting patterns in molecular compounds. This problem is characterized by a highly irregular search tree, whereby no reliable workload prediction is available. We describe the three main aspects of the proposed distributed algorithm, namely, a dynamic partitioning of the search space, a distribution process based on a peer-to-peer communication framework, and a novel receiverinitiated load balancing algorithm. The effectiveness of the distributed method has been evaluated on the well-known National Cancer Institute’s HIV-screening data set, where we were able to show close-to linear speedup in a network of workstations. The proposed approach also allows for dynamic resource aggregation in a non dedicated computational environment. These features make it suitable for large-scale, multi-domain, heterogeneous environments, such as computational grids

KOPS - The Institutional Repository of the University of Konstanz

Central Archive at the University of Reading

Crossref