
    Modified Huffman Code for Lossless Compression and Bandwidth Optimization and Applying Genetic Algorithms to Generating Paintings Based on Images

    This thesis contains two projects. First, a modified Huffman code is presented as a lossless method for compressing common traffic types. We propose using compression benefit, rather than frequency of occurrence alone as is standard in Huffman coding, as the priority of each node when constructing the Huffman tree. We show the effectiveness of this method on common data transmission types and describe what would be needed for the algorithm to be adopted. Second, we explore genetic algorithms as a method for creating paintings based on images. We balance the computational work required against the visual quality of the results, prioritizing aspects of the parameter space by their impact on the painting and on the computational workload.
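    The node-priority idea can be illustrated with a standard Huffman construction in which the heap weight is left abstract: classic Huffman coding uses symbol frequency, while the thesis substitutes a compression-benefit score. This is a minimal sketch under that assumption, not the thesis's implementation; the function names are illustrative.

```python
import heapq

def build_tree(weights):
    """Build a Huffman tree from {symbol: priority} pairs.

    In classic Huffman coding the priority is the symbol's frequency;
    the thesis instead derives it from compression benefit. Either way,
    the two lowest-priority nodes are merged repeatedly.
    """
    # Heap entries: (priority, tiebreaker, node); a node is either a
    # symbol or a (left, right) pair of subtrees.
    heap = [(w, i, sym) for i, (sym, w) in enumerate(weights.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, count, (left, right)))
        count += 1
    return heap[0][2]

def codes(tree, prefix=""):
    """Walk the tree, assigning a prefix-free bit string to each leaf."""
    if not isinstance(tree, tuple):
        return {tree: prefix or "0"}
    left, right = tree
    return {**codes(left, prefix + "0"), **codes(right, prefix + "1")}
```

    Whatever the weighting scheme, high-priority symbols end up near the root and receive short codes, which is the property the modified priorities exploit.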

    Antioxidants: nanotechnology and biotechnology fusion for medicine in overall

    Antioxidants are chemical substances found naturally in our food. They can prevent or reduce oxidative stress in the physiological system. Because the body continuously uses oxygen, it constantly produces free radicals. An excess of free radicals can cause cellular damage in the human body, which can lead to various diseases such as cancer, macular degeneration, and diabetes. Antioxidants help counteract the effects of these free radicals. They are abundant in plants, but their delivery often poses problems. Nanotechnology, which holds great potential for advanced medical science, offers a solution. Nanodevices and nanoparticles have a significant impact because they can interact with the body at the subcellular level with a high degree of specificity. Treatment can thus achieve maximum efficacy with few side effects.

    Clustering by compression

    We present a new method for clustering based on compression. The method does not use subject-specific features or background knowledge, and works as follows: First, we determine a universal similarity distance, the normalized compression distance or NCD, computed from the lengths of compressed data files (singly and in pairwise concatenation). Second, we apply a hierarchical clustering method. The NCD is universal in that it is not restricted to a specific application area, and works across application area boundaries. A theoretical precursor, the normalized information distance, co-developed by one of the authors, is provably optimal but uses the non-computable notion of Kolmogorov complexity. We propose precise notions of similarity metric and normal compressor, and show that the NCD based on a normal compressor is a similarity metric that approximates universality. To extract a hierarchy of clusters from the distance matrix, we determine a dendrogram (binary tree) by a new quartet method and a fast heuristic to implement it. The method is implemented and available as public software, and is robust under choice of different compressors. To substantiate our claims of universality and robustness, we report evidence of successful application in areas as diverse as genomics, virology, languages, literature, music, handwritten digits, astronomy, and combinations of objects from completely different domains, using statistical, dictionary, and block sorting compressors. In genomics we present new evidence for major questions in mammalian evolution, based on whole-mitochondrial genomic analysis: the Eutherian orders and the Marsupionta hypothesis against the Theria hypothesis.
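    The NCD described above is straightforward to compute with any off-the-shelf compressor. A minimal sketch using zlib as the compressor (the paper evaluates statistical, dictionary, and block-sorting compressors; zlib here is just a convenient stand-in):

```python
import zlib

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance:
    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    where C(.) is the compressed length of its argument.
    Values near 0 mean very similar; values near 1 mean dissimilar.
    """
    cx = len(zlib.compress(x, 9))
    cy = len(zlib.compress(y, 9))
    cxy = len(zlib.compress(x + y, 9))
    return (cxy - min(cx, cy)) / max(cx, cy)
```

    The resulting pairwise distance matrix is then fed to a hierarchical clustering step, the quartet-tree construction described in the abstract.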

    GVC: efficient random access compression for gene sequence variations

    Background: In recent years, advances in high-throughput sequencing technologies have enabled the use of genomic information in many fields, such as precision medicine, oncology, and food quality control. The amount of genomic data being generated is growing rapidly and is expected to soon surpass the amount of video data. The majority of sequencing experiments, such as genome-wide association studies, have the goal of identifying variations in the gene sequence to better understand phenotypic variations. We present a novel approach for compressing gene sequence variations with random access capability: the Genomic Variant Codec (GVC). We use techniques such as binarization, joint row- and column-wise sorting of blocks of variations, and the image compression standard JBIG for efficient entropy coding. Results: Our results show that GVC provides the best trade-off between compression and random access compared to the state of the art: it reduces the genotype information size from 758 GiB down to 890 MiB on the publicly available 1000 Genomes Project (phase 3) data, which is 21% less than the state of the art in random-access-capable methods. Conclusions: By providing the best results in terms of combined random access and compression, GVC facilitates the efficient storage of large collections of gene sequence variations. In particular, the random access capability of GVC enables seamless remote data access and application integration. The software is open source and available at https://github.com/sXperfect/gvc/.
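    The binarization step mentioned in the abstract can be sketched as splitting the integer genotype matrix into one binary plane per allele value, so that each plane can be handed to a bi-level image codec such as JBIG. The function name and matrix layout below are assumptions for illustration, not the actual GVC format:

```python
def binarize(genotypes):
    """Split an integer matrix (rows = samples, columns = variant sites)
    into one binary matrix ("plane") per distinct allele value.

    Each plane marks where that value occurs; a bi-level image codec
    can then entropy-code each plane. Sketch only: GVC's real pipeline
    also applies joint row- and column-wise sorting before coding.
    """
    values = sorted({g for row in genotypes for g in row})
    return {
        v: [[1 if g == v else 0 for g in row] for row in genotypes]
        for v in values
    }
```

    Because exactly one plane is set at each position, the planes together losslessly reconstruct the original matrix, which is what makes the scheme suitable for a lossless codec.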