50 research outputs found

    Some new results on majority-logic codes for correction of random errors

    The main advantages of random error-correcting majority-logic codes and majority-logic decoding in general are well known and twofold. Firstly, they offer a partial solution to a classical coding theory problem, that of decoder complexity. Secondly, a majority-logic decoder inherently corrects many more random error patterns than the minimum distance of the code implies is possible. The solution to decoder complexity is only partial because there are circumstances under which a majority-logic decoder is too complex and expensive to implement. [Continues.]
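    As a concrete illustration of the majority-logic principle (not taken from the thesis), the sketch below decodes the first-order Reed-Muller code RM(1,3), an (8, 4) code that is a standard majority-logic decodable example: each message bit is estimated by several independent parity-check sums on the received word, and a majority vote over those estimates corrects any single random error.

        # Reed (majority-logic) decoding of RM(1,3): codewords are evaluations of
        # affine Boolean functions a0 + a1*x1 + a2*x2 + a3*x3 over {0,1}^3.
        # Toy stand-in for the majority-logic principle, not the thesis's codes.
        M = 3                                     # code length n = 2**M = 8
        POINTS = [(b >> 2 & 1, b >> 1 & 1, b & 1) for b in range(2 ** M)]

        def encode(a):
            """Encode message bits (a0, a1, a2, a3) into an 8-bit codeword."""
            a0, a1, a2, a3 = a
            return [(a0 + a1 * x1 + a2 * x2 + a3 * x3) % 2 for (x1, x2, x3) in POINTS]

        def majority(bits):
            """Majority vote over a list of bits (ties resolved towards 0)."""
            return 1 if 2 * sum(bits) > len(bits) else 0

        def decode(r):
            """One round of majority-logic (Reed) decoding; corrects any single error."""
            a_hat = [0, 0, 0, 0]
            # Each coefficient a_i is estimated by 4 orthogonal check sums:
            # pairs of received bits whose points differ only in coordinate x_i.
            for i in range(1, M + 1):
                checks = []
                for idx, p in enumerate(POINTS):
                    if p[i - 1] == 0:
                        partner = POINTS.index(tuple(
                            v ^ (1 if j == i - 1 else 0) for j, v in enumerate(p)))
                        checks.append(r[idx] ^ r[partner])
                a_hat[i] = majority(checks)
            # Strip the estimated linear part, then majority-vote the constant term a0.
            linear = encode([0, a_hat[1], a_hat[2], a_hat[3]])
            a_hat[0] = majority([r[j] ^ linear[j] for j in range(2 ** M)])
            return a_hat

        msg = [1, 0, 1, 1]
        word = encode(msg)
        word[5] ^= 1                              # a single random bit error
        assert decode(word) == msg

    Because every bit estimate is just a parity sum followed by a vote, such a decoder can be built from exclusive-or and threshold logic, which is the source of the complexity advantage mentioned above.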

    A New Approach for Homomorphic Encryption with Secure Function Evaluation on Genomic Data

    Additively homomorphic encryption is a public-key primitive that allows a sum to be computed over encrypted values. Although limited in functionality, additive schemes have been an essential tool in the private function evaluation toolbox for decades. They are typically faster and more straightforward to implement than their fully homomorphic counterparts, and more efficient than garbled circuits in certain applications. This thesis presents a novel method for extending the functionality of additively homomorphic encryption to allow the private evaluation of functions over a restricted domain. Provided the encrypted sum falls within the restricted domain, the function can be homomorphically evaluated “for free” in a single public-key operation. We describe an algorithm for encoding private functions into the public keys of two well-known additive cryptosystems. We extend this scheme to an application in the field of pharmacogenomics called Similar Patient Query (SPQ). Since the Human Genome Project, genomic data has become available in tremendous volume, opening the door to many advances in medicine. Precision medicine is one such application, in which a patient is administered drugs based on their genetic makeup. If genomic data is not kept private it can be misused for fraud, so it must be encrypted; and to tap the full potential of encrypted genomic data, we need to perform computations on it without compromising its security. In SPQ, we take a query genome and compare it against a hospital database to find patients similar to the query, and that information is used to apply precision medicine. All of this is carried out in a privacy-preserving setting, in the presence of a semi-honest adversary, in a single transaction.
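    The additive property at the core of the thesis can be demonstrated with Paillier, one well-known additively homomorphic cryptosystem (the abstract does not name its two schemes, so this is only an illustrative sketch with toy parameters): multiplying two ciphertexts yields an encryption of the sum of their plaintexts.

        # Toy Paillier cryptosystem to illustrate additive homomorphism.
        # Tiny hard-coded primes and no hardening; not the thesis's construction.
        import math, random

        def keygen(p=10007, q=10009):              # real keys use >= 2048-bit moduli
            n = p * q
            lam = math.lcm(p - 1, q - 1)
            g = n + 1                              # standard simple choice of generator
            mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)
            return (n, g), (lam, mu, n)

        def encrypt(pk, m):
            n, g = pk
            r = random.randrange(1, n)             # should be coprime to n; fine for a demo
            return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

        def decrypt(sk, c):
            lam, mu, n = sk
            return ((pow(c, lam, n * n) - 1) // n) * mu % n

        def add(pk, c1, c2):
            """Homomorphic addition: multiplying ciphertexts adds the plaintexts."""
            n, _ = pk
            return (c1 * c2) % (n * n)

        pk, sk = keygen()
        c = add(pk, encrypt(pk, 1234), encrypt(pk, 4321))
        assert decrypt(sk, c) == 5555              # sum computed without decrypting the parts

    A restricted-domain function, as described above, would then be applied to such an encrypted sum; the encoding of that function into the public key is specific to the thesis and is not sketched here.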

    Towards efficient proofs of storage and verifiable outsourced database in cloud computing

    Ph.D. (Doctor of Philosophy)

    Efficient homology search for genomic sequence databases

    Genomic search tools can provide valuable insights into the chemical structure, evolutionary origin and biochemical function of genetic material. A homology search algorithm compares a protein or nucleotide query sequence to each entry in a large sequence database and reports alignments with highly similar sequences. The exponential growth of public data banks such as GenBank has necessitated the development of fast, heuristic approaches to homology search. The versatile and popular blast algorithm, developed by researchers at the US National Center for Biotechnology Information (NCBI), uses a four-stage heuristic approach to efficiently search large collections for analogous sequences while retaining a high degree of accuracy. Despite an abundance of alternative approaches to homology search, blast remains the only method to offer fast, sensitive search of large genomic collections on modern desktop hardware. As a result, the tool has found widespread use, with millions of queries posed each day. A significant investment of computing resources is required to process this large volume of genomic searches, and a cluster of over 200 workstations is employed by the NCBI to handle queries posed through the organisation's website. As the growth of sequence databases continues to outpace improvements in modern hardware, blast searches are becoming slower each year, and novel, faster methods for sequence comparison are required.

    In this thesis we propose new techniques for fast yet accurate homology search that result in significantly faster blast searches. First, we describe improvements to the final, gapped alignment stages, where the query and sequences from the collection are aligned to provide a fine-grain measure of similarity. We describe three new methods for aligning sequences that roughly halve the time required to perform this computationally expensive stage. Next, we investigate improvements to the first stage of search, where short regions of similarity between a pair of sequences are identified. We propose a novel deterministic finite automaton data structure that is significantly smaller than the codeword lookup table employed by ncbi-blast, resulting in improved cache performance and faster search times.

    We also discuss fast methods for nucleotide sequence comparison. We describe novel approaches for processing sequences that are compressed using the byte-packed format already utilised by blast, where four nucleotide bases from a strand of DNA are stored in a single byte. Rather than decompress sequences to perform pairwise comparisons, our innovations permit sequences to be processed in their compressed form, four bases at a time. Our techniques roughly halve average query evaluation times for nucleotide searches with no effect on the sensitivity of blast.

    Finally, we present a new scheme for managing the high degree of redundancy that is prevalent in genomic collections. Near-duplicate entries in sequence data banks are highly detrimental to retrieval performance; however, existing methods for managing redundancy are both slow, requiring almost ten hours to process the GenBank database, and crude, because they simply purge highly similar sequences to reduce the level of internal redundancy. We describe a new approach for identifying near-duplicate entries that is roughly six times faster than the most successful existing approaches, and a novel approach to managing redundancy that reduces collection size and search times but still provides accurate and comprehensive search results.

    Our improvements to blast have been integrated into our own version of the tool. We find that our innovations more than halve average search times for nucleotide and protein searches, and have no significant effect on search accuracy. Given the enormous popularity of blast, this represents a very significant advance in computational methods to aid life science research.
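    The byte-packed comparison idea can be pictured with a small sketch (hypothetical helper names, not code from the thesis): four bases are stored as 2-bit codes in one byte, and two packed sequences are compared a byte, i.e. four bases, at a time without decompressing them.

        # Byte-packed nucleotide comparison: four 2-bit bases per byte,
        # compared one byte (four bases) at a time without unpacking.
        CODE = {"A": 0, "C": 1, "G": 2, "T": 3}

        def pack(seq):
            """Pack a DNA string (length a multiple of 4) into bytes."""
            out = bytearray()
            for i in range(0, len(seq), 4):
                b = 0
                for base in seq[i:i + 4]:
                    b = (b << 2) | CODE[base]
                out.append(b)
            return bytes(out)

        def matching_bases(b1, b2):
            """Count how many of the four bases in two packed bytes agree."""
            x = b1 ^ b2                            # a zero 2-bit slot means the bases match
            return sum((x >> shift) & 0b11 == 0 for shift in (6, 4, 2, 0))

        def packed_identity(p1, p2):
            """Fraction of identical bases between two equal-length packed sequences."""
            matches = sum(matching_bases(a, b) for a, b in zip(p1, p2))
            return matches / (4 * len(p1))

        s1 = pack("ACGTACGTACGT")
        s2 = pack("ACGTTCGAACGT")
        print(packed_identity(s1, s2))             # 10 of 12 bases agree -> ~0.83

    In practice the per-byte comparison would be a single lookup into a precomputed table indexed by the XOR of the two bytes; the loop above only makes the 2-bit layout explicit.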

    [Research Pertaining to Physics, Space Sciences, Computer Systems, Information Processing, and Control Systems]

    Research project reports pertaining to physics, space sciences, computer systems, information processing, and control systems.

    Seventh Biennial Report: June 2003 - March 2005
