
    Most Ligand-Based Classification Benchmarks Reward Memorization Rather than Generalization

    Undetected overfitting can occur when there are significant redundancies between training and validation data. We describe AVE, a new measure of training-validation redundancy for ligand-based classification problems that accounts for similarity amongst inactive molecules as well as active ones. We investigated seven widely-used benchmarks for virtual screening and classification, and show that the amount of AVE bias strongly correlates with the performance of ligand-based predictive methods, irrespective of the predicted property, chemical fingerprint, similarity measure, or previously-applied unbiasing techniques. Therefore, it may be that the previously-reported performance of most ligand-based methods can be explained by overfitting to benchmarks rather than good prospective accuracy.
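
    The abstract names AVE but does not spell out its computation. Purely as a hedged sketch of the kind of nearest-neighbor redundancy score it describes, the snippet below measures how much closer validation actives and inactives sit to their own training class than to the opposite one. The binary-fingerprint representation, Jaccard distance, and single 0.4 cutoff are assumptions for illustration; the published measure aggregates over a sweep of distance thresholds.

```python
import numpy as np

def nn_hit_rate(queries, refs, threshold=0.4):
    """Fraction of query fingerprints whose nearest reference fingerprint
    lies within `threshold` Jaccard distance (illustrative metric/cutoff)."""
    hits = 0
    for q in queries:
        dists = [1.0 - np.logical_and(q, r).sum() / max(np.logical_or(q, r).sum(), 1)
                 for r in refs]
        hits += min(dists) < threshold
    return hits / len(queries)

def ave_bias(v_act, t_act, v_inact, t_inact, threshold=0.4):
    """AVE-style training/validation redundancy: positive (biased) when
    validation actives cluster near training actives and validation
    inactives near training inactives."""
    return (nn_hit_rate(v_act, t_act, threshold) - nn_hit_rate(v_act, t_inact, threshold)
            + nn_hit_rate(v_inact, t_inact, threshold) - nn_hit_rate(v_inact, t_act, threshold))
```

    A score near zero suggests the validation set genuinely tests generalization; a large positive score means a simple nearest-neighbor lookup could ace the benchmark.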

    Astrocytes: orchestrating synaptic plasticity?

    Synaptic plasticity is the capacity of a preexisting connection between two neurons to change in strength as a function of neural activity. Because synaptic plasticity is the major candidate mechanism for learning and memory, elucidating its constituent mechanisms is of crucial importance to many aspects of normal and pathological brain function. In particular, a prominent aspect that remains debated is how plasticity mechanisms, which span a broad range of temporal and spatial scales, come to play together in a concerted fashion. Here we review and discuss evidence that points to a possible non-neuronal, glial candidate for such orchestration: the regulation of synaptic plasticity by astrocytes. (63 pages, 4 figures)

    AI is a viable alternative to high throughput screening: a 318-target study

    High throughput screening (HTS) is routinely used to identify bioactive small molecules. This requires physical compounds, which limits coverage of accessible chemical space. Computational approaches combined with vast on-demand chemical libraries can access far greater chemical space, provided that the predictive accuracy is sufficient to identify useful molecules. Through the largest and most diverse virtual HTS campaign reported to date, comprising 318 individual projects, we demonstrate that our AtomNet® convolutional neural network successfully finds novel hits across every major therapeutic area and protein class. We address historical limitations of computational screening by demonstrating success for target proteins without known binders, high-quality X-ray crystal structures, or manual cherry-picking of compounds. We show that the molecules selected by the AtomNet® model are novel drug-like scaffolds rather than minor modifications to known bioactive compounds. Our empirical results suggest that computational methods can substantially replace HTS as the first step of small-molecule drug discovery.
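
    AtomNet® itself is proprietary, so nothing below reflects its implementation; this is only a generic sketch of the screening workflow the abstract describes: score a very large on-demand library with a trained model and advance the top-ranked compounds to physical testing. The `score_fn` callable stands in for any learned scorer and is an assumption.

```python
import heapq

def virtual_screen(library, score_fn, top_k=1000):
    """Rank a compound library with a learned scoring function and keep
    the top_k candidates. `library` yields (compound_id, features) pairs;
    `score_fn` stands in for any trained model (assumption)."""
    best = []  # min-heap of (score, compound_id); weakest kept score on top
    for compound_id, features in library:
        s = score_fn(features)
        if len(best) < top_k:
            heapq.heappush(best, (s, compound_id))
        elif s > best[0][0]:
            heapq.heapreplace(best, (s, compound_id))
    return sorted(best, reverse=True)
```

    Streaming with a bounded heap matters here because on-demand libraries can hold billions of compounds: the ranking never materializes more than top_k entries at once.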

    Automated Synthetic Feasibility Assessment: A Data-driven Derivation of Computational tools for Medicinal Chemistry

    The planning of organic syntheses, a critical problem in chemistry, can be directly modeled as resource-constrained branching plans in a discrete, fully-observable state space. Despite this clear relationship, the full artillery of artificial intelligence has not been brought to bear on this problem due to its inherent complexity and multidisciplinary challenges. In this thesis, I describe a mapping between organic synthesis and heuristic search and build a planner that can solve such problems automatically at the undergraduate level. Along the way, I show the need for powerful heuristic search algorithms and build large databases of synthetic information, which I use to derive a qualitatively new kind of heuristic guidance.
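
    As a hedged illustration of the mapping the thesis describes, here is a minimal best-first (A*-style) retrosynthetic search. The names `expand`, `heuristic`, and `is_purchasable` are invented for the sketch, and each reaction is simplified to a single precursor, whereas real synthesis planning is an AND/OR search over multi-reactant steps.

```python
import heapq
import itertools

def plan_synthesis(target, expand, heuristic, is_purchasable):
    """Best-first search from a target molecule back to purchasable ones.
    expand(state) -> iterable of (precursor, step_cost) via known reactions;
    heuristic(state) -> optimistic estimate of remaining synthetic cost;
    is_purchasable(state) -> True if the molecule can simply be bought.
    All three callables are illustrative assumptions."""
    tie = itertools.count()  # tiebreaker so states never need comparing
    frontier = [(heuristic(target), next(tie), 0, target, [])]
    seen = set()
    while frontier:
        _, _, g, state, route = heapq.heappop(frontier)
        if is_purchasable(state):
            return route  # list of (product, precursor) steps, target-first
        if state in seen:
            continue
        seen.add(state)
        for precursor, step_cost in expand(state):
            if precursor not in seen:
                g2 = g + step_cost
                heapq.heappush(frontier, (g2 + heuristic(precursor),
                                          next(tie), g2, precursor,
                                          route + [(state, precursor)]))
    return None  # no purchasable starting material reached
```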

    XML screamer: An integrated approach to high performance XML parsing, validation and deserialization

    This paper describes an experimental system in which customized high-performance XML parsers are prepared using parser generation and compilation techniques. Parsing is integrated with schema-based validation and deserialization, and the resulting validating processors are shown to be as fast as, or in many cases significantly faster than, traditional nonvalidating parsers. High performance is achieved by integration across layers of software that are traditionally separate, by avoiding unnecessary data copying and transformation, and by careful attention to detail in the generated code. The effect of API design on XML performance is also briefly discussed.
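
    As a toy illustration of the generation idea (specialize the parser to one schema ahead of time, fusing validation with deserialization, rather than interpreting generic structure per event), the sketch below "compiles" a trivial fixed-sequence, integer-typed schema into a dedicated parse function. The schema shape and names are invented for the example; the real system generates far lower-level code.

```python
import re

def compile_parser(fields):
    """Build a parser specialized to one toy schema: a <record> containing
    the given child elements in order, each holding an integer."""
    # Precompile one pattern per field: no generic tree walk and no
    # per-event validation dispatch remain at parse time.
    patterns = [(f, re.compile(rf"<{f}>(-?\d+)</{f}>")) for f in fields]

    def parse(text):
        out, pos = {}, 0
        for name, pat in patterns:
            m = pat.search(text, pos)
            if m is None:
                raise ValueError(f"expected <{name}> element")
            out[name] = int(m.group(1))  # validation fused with deserialization
            pos = m.end()
        return out

    return parse

parse_point = compile_parser(["x", "y"])
print(parse_point("<record><x>3</x><y>-7</y></record>"))  # {'x': 3, 'y': -7}
```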

    A High-Performance Interpretive Approach to Schema-Directed Parsing (WWW 2007, XML and Web Data track)

    XML delivers key advantages in interoperability due to its flexibility, expressiveness, and platform-neutrality. As XML has become a performance-critical aspect of the next generation of business computing infrastructure, however, it has become increasingly clear that XML parsing often carries a heavy performance penalty, and that current, widely-used parsing technologies are unable to meet the performance demands of an XML-based computing infrastructure. Several efforts have been made to address this performance gap through grammar-based parser generation. While the performance of generated parsers has been significantly improved, adoption of the technology has been hindered by the complexity of compiling and deploying the generated parsers. Through careful analysis of the operations required for parsing and validation, we have devised a set of specialized…
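
    For contrast with the generated-parser approach above, here is a hedged sketch of the interpretive alternative this paper argues for: one generic engine driven by a compact table derived from the schema, so nothing needs to be compiled or deployed per schema. The table layout is invented for the example.

```python
import re

TOKEN = re.compile(r"<(\w+)>([^<]*)</\1>")

def interpret(text, table):
    """Generic engine: checks element names/order against a schema table
    and deserializes values; no schema-specific code is generated."""
    tokens = TOKEN.findall(text)  # [(name, value), ...]
    if len(tokens) != len(table):
        raise ValueError("element count does not match schema")
    out = {}
    for (name, value), (expected, convert) in zip(tokens, table):
        if name != expected:
            raise ValueError(f"expected <{expected}>, got <{name}>")
        out[name] = convert(value)
    return out

# Table derived from a toy schema: required order plus a converter per field.
SCHEMA_TABLE = [("x", int), ("y", int)]
print(interpret("<x>3</x><y>-7</y>", SCHEMA_TABLE))  # {'x': 3, 'y': -7}
```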
