Combination of molecular similarity measures using data fusion
Many different measures of structural similarity have been suggested for matching chemical structures, each such measure focusing upon some particular type of molecular characteristic. The multi-faceted nature of biological activity suggests that an appropriate similarity measure should encompass many different types of characteristic, and this article discusses the use of data fusion methods to combine the results of searches based on multiple similarity measures. Experiments with several different types of dataset and activity suggest that data fusion provides a simple, but effective, approach to the combination of individual similarity measures. The best results were generally obtained with a fusion rule that sums the rank positions achieved by each molecule in searches using individual measures.
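The rank-sum fusion rule mentioned in this abstract can be illustrated with a minimal sketch. The function names, the dictionary layout, and the tie-breaking by molecule identifier are assumptions made for illustration, not details taken from the article; the sketch also assumes every search scores the same set of molecules.

```python
def rank_positions(scores):
    """Map each molecule id to its 1-based rank, highest similarity first.

    Ties are broken alphabetically by molecule id (an illustrative choice).
    """
    ordered = sorted(scores, key=lambda mol: (-scores[mol], mol))
    return {mol: pos for pos, mol in enumerate(ordered, start=1)}


def sum_rank_fusion(searches):
    """Fuse several similarity searches by summing each molecule's ranks.

    `searches` is a list of dicts mapping molecule id -> similarity score,
    one dict per similarity measure. A smaller rank sum means the molecule
    was placed nearer the top across the individual searches.
    """
    totals = {}
    for scores in searches:
        for mol, pos in rank_positions(scores).items():
            totals[mol] = totals.get(mol, 0) + pos
    return sorted(totals, key=lambda mol: (totals[mol], mol))


# Hypothetical scores from two different similarity measures.
search_a = {"m1": 0.9, "m2": 0.7, "m3": 0.2}
search_b = {"m1": 0.3, "m2": 0.8, "m3": 0.6}
fused_ranking = sum_rank_fusion([search_a, search_b])
```

Because only rank positions are summed, the individual measures do not need to produce scores on a common scale.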
Enhancing the effectiveness of ligand-based virtual screening using data fusion
Data fusion is being increasingly used to combine the outputs of different types of sensor. This paper reviews the application of the approach to ligand-based virtual screening, where the sensors to be combined are functions that score molecules in a database on their likelihood of exhibiting some required biological activity. Much of the literature to date involves the combination of multiple similarity searches, although there is also increasing interest in the combination of multiple machine learning techniques. Both approaches are reviewed here, focusing on the extent to which fusion can improve the effectiveness of searching when compared with a single screening mechanism, and on the reasons that have been suggested for the observed performance enhancement.
Grouping of coefficients for the calculation of inter-molecular similarity and dissimilarity using 2D fragment bit-strings
This paper compares 22 different similarity coefficients when they are used for searching databases of 2D fragment bit-strings. Experiments with the National Cancer Institute's AIDS and IDAlert databases show that the coefficients fall into several well-marked clusters, in which the members of a cluster will produce comparable rankings of a set of molecules. These clusters provide a basis for selecting combinations of coefficients for use in data fusion experiments. The results of these experiments provide a simple way of increasing the effectiveness of fragment-based similarity searching systems.
Similarity-based virtual screening using 2D fingerprints
This paper summarises recent work at the University of Sheffield on virtual screening methods that use 2D fingerprint measures of structural similarity. A detailed comparison of a large number of similarity coefficients demonstrates that the well-known Tanimoto coefficient remains the method of choice for the computation of fingerprint-based similarity, despite possessing some inherent biases related to the sizes of the molecules that are being sought. Group fusion involves combining the results of similarity searches based on multiple reference structures and a single similarity measure. We demonstrate the effectiveness of this approach to screening, and also describe an approximate form of group fusion, turbo similarity searching, that can be used when just a single reference structure is available.
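As a rough illustration of the two ideas named in this abstract, the sketch below computes the Tanimoto coefficient on fingerprints stored as sets of "on" bit positions and then applies group fusion over several reference structures. The abstract does not state which fusion rule is used, so taking the maximum similarity over the references is an assumption here, as are the function names and the toy fingerprints.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient c / (a + b - c) for two fingerprints,
    each given as a collection of 'on' bit positions."""
    fp_a, fp_b = set(fp_a), set(fp_b)
    common = len(fp_a & fp_b)
    union = len(fp_a) + len(fp_b) - common
    return common / union if union else 0.0


def group_fusion(references, database, fuse=max):
    """Rank database molecules against several reference structures.

    Each molecule gets one Tanimoto score per reference; `fuse` combines
    them into a single score (max is an assumed rule, not the paper's
    stated choice). Returns molecule ids, best first.
    """
    fused = {
        name: fuse(tanimoto(ref, fp) for ref in references)
        for name, fp in database.items()
    }
    return sorted(fused, key=lambda name: (-fused[name], name))


# Hypothetical reference fingerprints and a tiny database.
references = [{1, 2, 3}, {7, 8}]
database = {"x": {1, 2, 3, 4}, "y": {7, 8}, "z": {5, 6}}
ranking = group_fusion(references, database)
```

Passing a different callable as `fuse`, such as `sum`, switches the fusion rule without changing the rest of the sketch.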
An Overview of Classifier Fusion Methods
A number of classifier fusion methods have recently
been developed, opening an alternative approach
leading to a potential improvement in the
classification performance. As there is little theory of
information fusion itself, currently we are faced with
different methods designed for different problems and
producing different results. This paper gives an
overview of classifier fusion methods and attempts to
identify new trends that may dominate this area of
research in future. A taxonomy of fusion methods
trying to bring some order into the existing “pudding
of diversities” is also provided.
Ligand-based virtual screening using binary kernel discrimination
This paper discusses the use of a machine-learning technique called binary kernel discrimination (BKD) for virtual screening in drug- and pesticide-discovery programmes. BKD is compared with several other ligand-based tools for virtual screening in databases of 2D structures represented by fragment bit-strings, and is shown to provide an effective, and reasonably efficient, way of prioritising compounds for biological screening.
Learning to Rank Academic Experts in the DBLP Dataset
Expert finding is an information retrieval task that is concerned with the
search for the most knowledgeable people with respect to a specific topic, and
the search is based on documents that describe people's activities. The task
involves taking a user query as input and returning a list of people who are
sorted by their level of expertise with respect to the user query. Despite
recent interest in the area, the current state-of-the-art techniques lack
principled approaches for optimally combining different sources of evidence.
This article proposes two frameworks for combining multiple estimators of
expertise. These estimators are derived from textual contents, from
graph-structure of the citation patterns for the community of experts, and from
profile information about the experts. More specifically, this article explores
the use of supervised learning to rank methods, as well as rank aggregation
approaches, for combining all of the estimators of expertise. Several supervised
learning algorithms, which are representative of the pointwise, pairwise and
listwise approaches, were tested, and various state-of-the-art data fusion
techniques were also explored for the rank aggregation framework. Experiments
that were performed on a dataset of academic publications from the Computer
Science domain attest to the adequacy of the proposed approaches.
Comment: Expert Systems, 2013. arXiv admin note: text overlap with
arXiv:1302.041