133 research outputs found

    Effectiveness of retrieval in similarity searches of chemical databases: A review of performance measures

    This article reviews measures for evaluating the effectiveness of similarity searches in chemical databases, drawing principally upon the many measures that have been described previously for evaluating the performance of text search engines. The use of the various measures is exemplified by fragment-based 2D similarity searches on several databases for which both structural and bioactivity data are available. It is concluded that the cumulative recall and G-H score measures are the most useful of those tested
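
    As a rough illustration of one of the measures named above, cumulative recall can be read as the fraction of all known actives that appear in the top n positions of a ranked similarity-search output. The sketch below assumes that reading; the function name, the boolean activity flags, and the cut-off values are hypothetical, and the article's exact definition may differ.

```python
from typing import Dict, List, Sequence


def cumulative_recall(ranked_is_active: Sequence[bool],
                      cutoffs: Sequence[int]) -> Dict[int, float]:
    """Fraction of all actives found in the top-n of a ranking, per cut-off n.

    ranked_is_active[i] is True when the molecule at rank i+1 is active.
    This is an assumed, textbook-style definition, not the article's own code.
    """
    total_actives = sum(ranked_is_active)
    if total_actives == 0:
        raise ValueError("ranking contains no active molecules")
    return {n: sum(ranked_is_active[:n]) / total_actives for n in cutoffs}


# Hypothetical ranking of eight molecules, three of which are active.
ranking: List[bool] = [True, False, True, False, False, True, False, False]
recall_by_cutoff = cumulative_recall(ranking, cutoffs=[2, 4, 8])
```

    Plotting recall against the cut-off gives a curve that rises faster for a more effective search.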

    Grouping of coefficients for the calculation of inter-molecular similarity and dissimilarity using 2D fragment bit-strings

    This paper compares 22 different similarity coefficients when they are used for searching databases of 2D fragment bit-strings. Experiments with the National Cancer Institute's AIDS and IDAlert databases show that the coefficients fall into several well-marked clusters, in which the members of a cluster will produce comparable rankings of a set of molecules. These clusters provide a basis for selecting combinations of coefficients for use in data fusion experiments. The results of these experiments provide a simple way of increasing the effectiveness of fragment-based similarity searching systems
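
    To make the idea of coefficients that "produce comparable rankings" concrete, here is a minimal sketch using three well-known coefficients (Tanimoto, Dice, cosine) on bit-strings stored as Python integers. The abstract does not list which 22 coefficients were compared, so the choice of these three, the function names, and the toy bit-strings are all assumptions. Dice is a monotonic function of Tanimoto, so the two always rank molecules identically.

```python
from math import sqrt
from typing import Callable, List, Tuple


def _counts(x: int, y: int) -> Tuple[int, int, int]:
    """Bits set in x, bits set in y, and bits set in both."""
    return bin(x).count("1"), bin(y).count("1"), bin(x & y).count("1")


def tanimoto(x: int, y: int) -> float:
    a, b, c = _counts(x, y)
    union = a + b - c
    return c / union if union else 0.0


def dice(x: int, y: int) -> float:
    a, b, c = _counts(x, y)
    return 2 * c / (a + b) if (a + b) else 0.0


def cosine(x: int, y: int) -> float:
    a, b, c = _counts(x, y)
    return c / sqrt(a * b) if a and b else 0.0


def rank(query: int, database: List[int],
         coefficient: Callable[[int, int], float]) -> List[int]:
    """Database indices sorted by decreasing similarity to the query."""
    return sorted(range(len(database)),
                  key=lambda i: coefficient(query, database[i]),
                  reverse=True)


# Hypothetical 4-bit fragment bit-strings.
query = 0b1101
database = [0b0111, 0b1101, 0b0001, 0b1111]
tanimoto_ranking = rank(query, database, tanimoto)
dice_ranking = rank(query, database, dice)
```

    Coefficients that give different rankings are the more interesting candidates for data fusion, since combining identical rankings adds no information.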

    Chemoinformatics Research at the University of Sheffield: A History and Citation Analysis

    This paper reviews the work of the Chemoinformatics Research Group in the Department of Information Studies at the University of Sheffield, focusing particularly on the work carried out in the period 1985-2002. Four major research areas are discussed, these involving the development of methods for: substructure searching in databases of three-dimensional structures, including both rigid and flexible molecules; the representation and searching of the Markush structures that occur in chemical patents; similarity searching in databases of both two-dimensional and three-dimensional structures; and compound selection and the design of combinatorial libraries. An analysis of citations to 321 publications from the Group shows that it attracted a total of 3725 residual citations during the period 1980-2002. These citations appeared in 411 different journals, and involved 910 different citing organizations from 54 different countries, thus demonstrating the widespread impact of the Group's work

    Use of the R-group descriptor for alignment-free QSAR

    An R-group descriptor characterises the distribution of some atom-based property, such as elemental type or partial atomic charge, at increasing numbers of bonds distant from the point of substitution on a parent ring system. Application of Partial Least Squares (PLS) to datasets for which bioactivity data and R-group descriptor information are available is shown to provide an effective way of generating QSAR models with a high level of predictive ability. The resulting models are competitive with the models produced by established QSAR approaches, are readily interpretable in structural terms, and are shown to be of value in the optimisation of a lead series

    Maximum Common Subgraph Isomorphism Algorithms

    Maximum common subgraph (MCS) isomorphism algorithms play an important role in chemoinformatics by providing an effective mechanism for the alignment of pairs of chemical structures. This article discusses the various types of MCS that can be identified when two graphs are compared and reviews some of the algorithms that are available for this purpose, focusing on those that are, or may be, applicable to the matching of chemical graphs

    Identification of diverse database subsets using property-based and fragment-based molecular descriptions

    This paper reports a comparison of calculated molecular properties and of 2D fragment bit-strings when used for the selection of structurally diverse subsets of a file of 44295 compounds. MaxMin dissimilarity-based selection and k-means cluster-based selection are used to select subsets containing between 1% and 20% of the file. Investigation of the numbers of bioactive molecules in the selected subsets suggests that the MaxMin subsets are noticeably superior to the k-means subsets; that the property-based descriptors are marginally superior to the fragment-based descriptors; and that both approaches are noticeably superior to random selection
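
    For readers unfamiliar with MaxMin, the usual formulation starts from a seed compound and repeatedly adds the compound whose distance to its nearest already-selected neighbour is largest. The sketch below follows that common formulation; the function name, the seed choice, and the one-dimensional toy data are assumptions, and the paper's own implementation and descriptors are not shown in the abstract.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def maxmin_select(items: Sequence[T], k: int,
                  dist: Callable[[T, T], float],
                  seed_index: int = 0) -> List[int]:
    """Return indices of k items chosen by MaxMin dissimilarity selection."""
    if not items or k <= 0:
        return []
    k = min(k, len(items))
    selected = [seed_index]
    chosen = {seed_index}
    # min_dist[i]: distance from item i to its nearest selected item.
    min_dist = [dist(item, items[seed_index]) for item in items]
    while len(selected) < k:
        candidates = [i for i in range(len(items)) if i not in chosen]
        # Pick the candidate that is farthest from everything selected so far.
        nxt = max(candidates, key=lambda i: min_dist[i])
        selected.append(nxt)
        chosen.add(nxt)
        for i in candidates:
            min_dist[i] = min(min_dist[i], dist(items[i], items[nxt]))
    return selected


# Hypothetical one-dimensional "property" values for five compounds.
values = [0.0, 1.0, 2.0, 10.0, 11.0]
subset = maxmin_select(values, k=3, dist=lambda p, q: abs(p - q))
```

    With real data the distance would typically be computed from property vectors or from fragment bit-strings, for example as one minus a similarity coefficient.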

    Leukocyte traits and exposure to ambient particulate matter air pollution in the women’s health initiative and atherosclerosis risk in communities study

    BACKGROUND: Inflammatory effects of ambient particulate matter (PM) air pollution exposures may underlie PM-related increases in cardiovascular disease risk and mortality, although evidence of PM-associated leukocytosis is inconsistent and largely based on small, cross-sectional, and/or unrepresentative study populations. OBJECTIVES: Our objective was to estimate PM–leukocyte associations among U.S. women and men in the Women’s Health Initiative and Atherosclerosis Risk in Communities study (n = 165,675). METHODS: We based the PM–leukocyte estimations on up to four study visits per participant, at which peripheral blood leukocytes and geocoded address-specific concentrations of PM ≤10, ≤2.5, and 2.5–10 μm in diameter (PM10, PM2.5, and PM2.5–10, respectively) were available. We multiply imputed missing data using chained equations and estimated PM–leukocyte count associations over daily to yearly PM exposure averaging periods using center-specific, linear, mixed, longitudinal models weighted for attrition and adjusted for sociodemographic, behavioral, meteorological, and geographic covariates. In a subset of participants with available data (n = 8,457), we also estimated PM–leukocyte proportion associations in compositional data analyses. RESULTS: We found a 12 cells/μL (95% confidence interval: −9, 33) higher leukocyte count, a 1.2% (0.6%, 1.8%) higher granulocyte proportion, and a −1.1% (−1.9%, −0.3%) lower CD8+ T-cell proportion per 10-μg/m³ increase in 1-month mean PM2.5. However, shorter-duration PM10 exposures were inversely and only modestly associated with leukocyte count. DISCUSSION: The PM2.5–leukocyte estimates, albeit imprecise, suggest that among racially, ethnically, and environmentally diverse U.S. populations, sustained, ambient exposure to fine PM may induce subclinical, but epidemiologically important, inflammatory effects. https://doi.org/10.1289/EHP5360

    Epigenetically mediated electrocardiographic manifestations of sub-chronic exposures to ambient particulate matter air pollution in the Women's Health Initiative and Atherosclerosis Risk in Communities Study

    Background: Short-duration exposure to ambient particulate matter (PM) air pollution is associated with cardiac autonomic dysfunction and prolonged ventricular repolarization. However, associations with sub-chronic exposures to coarser particulates are relatively poorly characterized as are molecular mechanisms underlying their potential relationships with cardiovascular disease. Materials and methods: We estimated associations between monthly mean concentrations of PM < 10 μm and 2.5–10 μm in diameter (PM10; PM2.5-10) with time-domain measures of heart rate variability (HRV) and QT interval duration (QT) among U.S. women and men in the Women's Health Initiative and Atherosclerosis Risk in Communities Study (nHRV = 82,107; nQT = 76,711). Then we examined mediation of the PM-HRV and PM-QT associations by DNA methylation (DNAm) at three Cytosine-phosphate-Guanine (CpG) sites (cg19004594, cg24102420, cg12124767) with known sensitivity to monthly mean PM concentrations in a subset of the participants (nHRV = 7,169; nQT = 6,895). After multiply imputing missing PM, electrocardiographic and covariable data, we estimated associations using attrition-weighted, linear, mixed, longitudinal models adjusting for sociodemographic, behavioral, meteorological, and clinical characteristics. We assessed mediation by estimating the proportions of PM-HRV and PM-QT associations mediated by DNAm. Results: We found little evidence of PM-HRV association, PM-QT association, or mediation by DNAm. Conclusions: The findings suggest that among racially/ethnically and environmentally diverse U.S. populations, sub-chronic exposures to coarser particulates may not exert appreciable, epigenetically mediated effects on cardiac autonomic function or ventricular repolarization. Further investigation in better-powered studies is warranted, with additional focus on shorter duration exposures to finer particulates and non-electrocardiographic outcomes among relatively susceptible populations

    Homo naledi, a new species of the genus Homo from the Dinaledi Chamber, South Africa.

    Homo naledi is a previously-unknown species of extinct hominin discovered within the Dinaledi Chamber of the Rising Star cave system, Cradle of Humankind, South Africa. This species is characterized by body mass and stature similar to small-bodied human populations but a small endocranial volume similar to australopiths. Cranial morphology of H. naledi is unique, but most similar to early Homo species including Homo erectus, Homo habilis or Homo rudolfensis. While primitive, the dentition is generally small and simple in occlusal morphology. H. naledi has humanlike manipulatory adaptations of the hand and wrist. It also exhibits a humanlike foot and lower limb. These humanlike aspects are contrasted in the postcrania with a more primitive or australopith-like trunk, shoulder, pelvis and proximal femur. Representing at least 15 individuals with most skeletal elements repeated multiple times, this is the largest assemblage of a single species of hominins yet discovered in Africa