    IMPROVING MOLECULAR FINGERPRINT SIMILARITY VIA ENHANCED FOLDING

    Drug discovery depends on scientists finding similarity in molecular fingerprints to the drug target. A new way to improve the accuracy of molecular fingerprint folding is presented. The goal is to alleviate a growing challenge caused by excessively long fingerprints. The improved method generates a new, shorter fingerprint that is more accurate than the basic folded fingerprint. Information gathered during preprocessing is used to determine an optimal attribute order. The most commonly used blocks of bits can then be organized and used to generate an improved fingerprint that folds better. We then apply the widely used Tanimoto similarity search algorithm to benchmark our results. We show an improvement in the final results when using this method to generate an improved fingerprint, compared against other traditional folding methods.
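To make the terms in this abstract concrete, here is a minimal sketch of basic fingerprint folding and Tanimoto similarity. It illustrates only the baseline "basic folded fingerprint", not the improved attribute-reordering method the abstract proposes, whose details are not given here. The function names `fold` and `tanimoto` and the two 8-bit fingerprints are hypothetical, chosen for illustration.

```python
def fold(bits, folded_len):
    """Basic folding: OR together all bits whose positions are congruent
    modulo folded_len. With folded_len = len(bits) // 2 this is the usual
    'OR the two halves' fold."""
    folded = [0] * folded_len
    for i, b in enumerate(bits):
        if b:
            folded[i % folded_len] = 1
    return folded


def tanimoto(a, b):
    """Tanimoto similarity: shared on-bits / on-bits set in either fingerprint."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Returning 0.0 for two all-zero fingerprints is a convention chosen here.
    return both / either if either else 0.0


# Hypothetical 8-bit fingerprints, for illustration only.
a = [1, 0, 1, 0, 0, 1, 0, 0]
b = [1, 0, 0, 0, 0, 1, 1, 0]

print(tanimoto(a, b))                    # similarity on the full fingerprints
print(tanimoto(fold(a, 4), fold(b, 4)))  # similarity after folding to 4 bits
```

In this toy case the two fingerprints differ before folding but collide onto the same 4-bit pattern afterwards, so the folded similarity is inflated. That loss of accuracy is the kind of problem an improved folding scheme would aim to reduce.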

    Throughput-based Design for Polar Coded-Modulation

    Typically, forward error correction (FEC) codes are designed based on the minimization of the error rate for a given code rate. However, for applications that incorporate a hybrid automatic repeat request (HARQ) protocol and adaptive modulation and coding, the throughput is a more important performance metric than the error rate. Polar codes, a new class of FEC codes with simple rate matching, can be optimized efficiently for maximization of the throughput. In this paper, we aim to design HARQ schemes using multilevel polar coded-modulation (MLPCM). Thus, we first develop a method to determine a set-partitioning based bit-to-symbol mapping for high-order QAM constellations. We simplify the LLR estimation of set-partitioned QAM constellations for a multistage decoder, and we introduce a set of algorithms to design throughput-maximizing MLPCM for successive cancellation decoding (SCD). These codes are particularly useful for non-combining (NC) and Chase-combining (CC) HARQ protocols. Furthermore, since codes optimized for SCD are not optimal for SC list decoders (SCLD), we propose a rate matching algorithm to find the best rate for SCLD while using the polar codes optimized for SCD. The resulting codes provide throughput close to the capacity with low decoding complexity when used with NC or CC HARQ.
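The distinction between minimizing error rate and maximizing throughput can be sketched with a simple model. This is an assumption for illustration, not the paper's design algorithm: for non-combining HARQ with unlimited retransmissions and independent attempts, throughput is the rate times the probability that a block is decoded correctly. The function name and the candidate rates and block error rates below are hypothetical.

```python
def nc_harq_throughput(rate, bler):
    """Throughput in information bits per symbol for non-combining HARQ,
    assuming unlimited retransmissions and independent attempts, so that
    each attempt succeeds with probability 1 - bler."""
    return rate * (1.0 - bler)


# Hypothetical (rate in bits/symbol -> block error rate) pairs at one SNR.
candidates = {1.0: 0.001, 2.0: 0.02, 3.0: 0.4}

lowest_error_rate = min(candidates, key=candidates.get)
best_throughput_rate = max(
    candidates, key=lambda r: nc_harq_throughput(r, candidates[r])
)

print(lowest_error_rate)     # the rate an error-rate criterion would favor
print(best_throughput_rate)  # the rate a throughput criterion would favor
```

With these made-up numbers the lowest-rate option has the smallest error rate, but a higher rate delivers more bits per symbol despite failing more often, which is why a throughput-based design can choose differently from an error-rate-based one.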

    Prospects and limitations of full-text index structures in genome analysis

    The combination of incessant advances in sequencing technology producing large amounts of data and innovative bioinformatics approaches, designed to cope with this data flood, has led to new interesting results in the life sciences. Given the magnitude of sequence data to be processed, many bioinformatics tools rely on efficient solutions to a variety of complex string problems. These solutions include fast heuristic algorithms and advanced data structures, generally referred to as index structures. Although the importance of index structures is generally known to the bioinformatics community, the design and potency of these data structures, as well as their properties and limitations, are less understood. Moreover, the last decade has seen a boom in the number of variant index structures featuring complex and diverse memory-time trade-offs. This article provides a comprehensive state-of-the-art overview of the most popular index structures and their recently developed variants. Their features, interrelationships, the trade-offs they impose, but also their practical limitations, are explained and compared.
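As a concrete example of a full-text index structure, the sketch below builds a suffix array and uses binary search to find all occurrences of a pattern. The abstract does not name the specific structures it surveys, so the suffix array is an assumed representative. The construction is deliberately naive (it sorts whole suffixes), which would not scale to genome-sized inputs. The function names and the sample sequence are hypothetical.

```python
def build_suffix_array(text):
    """Naive suffix array: start positions of all suffixes in
    lexicographic order. Simple, but far too slow and memory-hungry
    for real genomes."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    """Return sorted start positions of pattern in text, using two binary
    searches over the suffix array."""
    m = len(pattern)

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[start:lo])


text = "GATTACATTA"  # hypothetical toy sequence
sa = build_suffix_array(text)
print(find_occurrences(text, sa, "TTA"))
```

The index is built once and then answers each query with a number of string comparisons logarithmic in the text length. The price is the memory needed to store one integer per text position, a simple instance of the memory-time trade-offs the abstract mentions.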