845 research outputs found

    Cycle-Consistent Deep Generative Hashing for Cross-Modal Retrieval

    In this paper, we propose a novel deep generative approach to cross-modal retrieval that learns hash functions in the absence of paired training samples through a cycle consistency loss. Our approach employs an adversarial training scheme to learn a pair of hash functions that enable translation between modalities while assuming an underlying semantic relationship. To endow the hash codes with the semantics of the input-output pair, a cycle consistency loss is further imposed on top of the adversarial training to strengthen the correlations between inputs and their corresponding outputs. Our approach is generative: the hash functions are learned so that the hash codes maximally correlate each input-output correspondence and can also regenerate the inputs, minimizing the information loss. The learning-to-hash embedding thus jointly optimizes the parameters of the hash functions across modalities and of the associated generative models. Extensive experiments on a variety of large-scale cross-modal data sets demonstrate that our proposed method achieves better retrieval results than the state of the art. Comment: To appear in IEEE Trans. Image Processing. arXiv admin note: text overlap with arXiv:1703.10593 by other authors

    On Distributed Storage Codes

    Distributed storage systems are studied. Interest in such systems has become relatively wide due to the increasing amount of information that needs to be stored in data centers or different kinds of cloud systems. There are many kinds of solutions for storing information on distributed devices, depending on the needs of the system designer. This thesis studies the design of such storage systems as well as their fundamental limits. Namely, the subjects of interest of this thesis include heterogeneous distributed storage systems, distributed storage systems with the exact repair property, and locally repairable codes. For distributed storage systems with either functional or exact repair, capacity results are proved. In the case of locally repairable codes, the minimum distance is studied. Constructions for exact-repairing codes between the minimum bandwidth regeneration (MBR) and minimum storage regeneration (MSR) points are given. These codes exceed the time-sharing line of the extremal points in many cases. Other properties of exact-regenerating codes are also studied. For the heterogeneous setup, the main result is that the capacity of such systems is always smaller than or equal to the capacity of a homogeneous system with symmetric repair with average node size and average repair bandwidth. A randomized construction for a locally repairable code with good minimum distance is given. It is shown that a random linear code of a certain natural type has a good minimum distance with high probability. Other properties of locally repairable codes are also studied.

    ACCELERATING STORAGE APPLICATIONS WITH EMERGING KEY VALUE STORAGE DEVICES

    With the continuous data explosion in the big data era, traditional software and hardware stacks are facing unprecedented challenges in operating at such data scale. Thus, designing new architectures and efficient systems for data-oriented applications has become increasingly critical. This motivates us to rethink the conventional storage system design and re-architect both software and hardware to meet the challenges of scale. Besides the fast growth of data volume, the increasing demand from storage applications such as video streaming and data analytics is pushing high-performance flash-based storage devices to replace traditional spinning disks. This all-flash era increases data reliability concerns due to the endurance problem of flash devices. Key-value stores (KVS) are important storage infrastructure for handling fast-growing unstructured data and have been widely deployed in a variety of scale-out enterprise applications such as online retail, big data analytics, social networks, etc. How to efficiently manage data redundancy for key-value stores to provide data reliability, and how to efficiently support range queries for key-value stores to accelerate analytics-oriented applications under emerging key-value store system architectures, have become important research problems. In this research, we focus on how to design new software-hardware architectures for key-value store applications to provide reliability and improve query performance. In order to address the different issues identified in this dissertation, we propose to employ a logical key management layer, a thin layer above the KV devices that maps logical keys into physical keys on the devices. We show how such a layer can enable multiple solutions to improve the performance and reliability of KVSSD-based storage systems. First, we present KVRAID, a high-performance, write-efficient erasure coding management scheme on emerging key-value SSDs. 
The core innovation of KVRAID is a logical key management layer that maps logical keys to physical keys to efficiently pack similar-size KV objects and dynamically manage the membership of erasure coding groups. Unlike existing schemes, which manage erasure codes at the block level, KVRAID manages the erasure codes at the KV object level. In order to achieve better storage efficiency for variable-sized objects, KVRAID predefines multiple fixed sizes (slabs) according to the object size distribution for the erasure code. KVRAID uses a logical-to-physical key conversion to pack KV objects of similar size into a parity group. KVRAID uses a lazy deletion mechanism with a garbage collector for object updates. Our experiments show that in the 100% put case, KVRAID outperforms software block RAID by 18x in throughput and reduces write amplification (WAF) by 15x with only ~5% CPU utilization. In mixed update/get workloads, KVRAID achieves ~4x better throughput with ~23% CPU utilization and reduces the storage overhead and WAF by 3.6x and 11.3x on average, respectively. Second, we present KVRangeDB, an ordered log structure tree based key index that supports range queries on a hash-based KVSSD. In addition, we propose to pack smaller application records into a larger physical record on the device through the logical key management layer. We compared the performance of KVRangeDB against a RocksDB implementation on KVSSD and the state-of-the-art software KV store Wisckey on a block device, on three types of real-world applications: cloud-serving workloads, the TABLEFS filesystem, and time-series databases. For cloud-serving applications, KVRangeDB achieves 8.3x and 1.7x better 99.9% write tail latency compared to the RocksDB implementation on KVSSD and Wisckey on block SSD, respectively. On the query side, KVRangeDB only performs worse for very long scans, but provides fast point queries and closed range queries. 
The experiments on TABLEFS demonstrate that using KVRangeDB for metadata indexing can boost performance by a factor of ~6.3x on average and reduce CPU cost by ~3.9x for four metadata-intensive workloads compared to the RocksDB implementation on KVSSD. Compared to Wisckey, KVRangeDB improves performance by ~2.6x on average and reduces CPU usage by ~1.7x. Third, we propose a generic FPGA accelerator for encoding/decoding emerging Minimum Storage Regenerating (MSR) codes, which maximizes computation parallelism and minimizes data movement between off-chip DRAM and the on-chip SRAM buffers. To demonstrate the efficiency of our proposed accelerator, we implemented the encoding/decoding algorithms for a specific MSR code called Zigzag code on a Xilinx VCU1525 acceleration card. Our evaluation shows our proposed accelerator can achieve ~2.4-3.1x better throughput and ~4.2-5.7x better power efficiency compared to the state-of-the-art multi-core CPU implementation and ~2.8-3.3x better throughput and ~4.2-5.3x better power efficiency compared to a modern GPU accelerator
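
The packing idea described in the KVRAID abstract above (slab classes, a logical-to-physical key mapping, and parity groups of similar-size objects) can be illustrated with a rough sketch. Everything here is an assumption made for illustration: the slab sizes, the group width `K`, the class and function names, and the use of a single XOR parity chunk as a stand-in for the actual erasure code, which the abstract does not specify.

```python
from bisect import bisect_left
from collections import defaultdict

SLAB_SIZES = [256, 1024, 4096]  # hypothetical slab sizes in bytes
K = 3                           # hypothetical number of data objects per parity group


def slab_for(size):
    """Return the smallest slab that can hold an object of `size` bytes."""
    i = bisect_left(SLAB_SIZES, size)
    if i == len(SLAB_SIZES):
        raise ValueError("object larger than the largest slab")
    return SLAB_SIZES[i]


def xor_parity(chunks):
    """XOR equally sized byte strings into one parity chunk."""
    parity = bytearray(len(chunks[0]))
    for chunk in chunks:
        for j, b in enumerate(chunk):
            parity[j] ^= b
    return bytes(parity)


class LogicalKeyLayer:
    """Toy logical key layer: packs same-slab objects into parity groups."""

    def __init__(self):
        self.mapping = {}                 # logical key -> (slab, group id, slot)
        self.pending = defaultdict(list)  # slab -> [(logical key, padded value)]
        self.groups = {}                  # (slab, group id) -> (data chunks, parity)
        self.next_group = defaultdict(int)

    def put(self, key, value):
        slab = slab_for(len(value))
        # Pad to the slab size so all chunks in a group have equal length.
        self.pending[slab].append((key, value.ljust(slab, b"\0")))
        if len(self.pending[slab]) == K:
            self._seal(slab)

    def _seal(self, slab):
        """Close a full group: compute parity and record physical locations."""
        gid = self.next_group[slab]
        self.next_group[slab] += 1
        entries = self.pending.pop(slab)
        chunks = [value for _, value in entries]
        self.groups[(slab, gid)] = (chunks, xor_parity(chunks))
        for slot, (key, _) in enumerate(entries):
            self.mapping[key] = (slab, gid, slot)

    def recover(self, key):
        """Rebuild one (padded) object from the other chunks and the parity."""
        slab, gid, slot = self.mapping[key]
        chunks, parity = self.groups[(slab, gid)]
        survivors = [c for i, c in enumerate(chunks) if i != slot]
        return xor_parity(survivors + [parity])
```

In this sketch, objects of 100, 200, and 256 bytes all fall into the 256-byte slab and form one group, so any one of them can be rebuilt from the other two and the parity chunk. A real design would also need to store each object's original length, handle partially filled groups, and implement the lazy deletion and garbage collection the abstract mentions.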

    An investigation of error characteristics and coding performance

    The first year's effort on NASA Grant NAG5-2006 was an investigation to characterize typical errors resulting from the EOS downlink. The analysis methods developed for this effort were used on test data from a March 1992 White Sands Terminal Test. The effectiveness of a concatenated coding scheme with a Reed-Solomon outer code and a convolutional inner code versus a Reed-Solomon-only scheme has been investigated, as well as the effectiveness of a Periodic Convolutional Interleaver in dispersing errors of certain types. The work effort consisted of developing software that allows simulation studies with the appropriate coding schemes plus either simulated data with errors or actual data with errors. The software program is entitled Communication Link Error Analysis (CLEAN) and models downlink errors, forward error correcting schemes, and interleavers
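
How a convolutional interleaver disperses burst errors, as mentioned in the abstract above, can be shown with a small sketch. It assumes the classic shift-register form, in which branch `i` delays its symbols by `i * depth` visits and the deinterleaver uses the reversed delays. The class name and parameters are hypothetical, and nothing here reflects the actual CLEAN software or the interleaver parameters used in the study.

```python
from collections import deque


class ConvolutionalInterleaver:
    """Shift-register convolutional interleaver (or deinterleaver if reverse=True)."""

    def __init__(self, branches, depth, fill=0, reverse=False):
        # Branch i holds a delay line of i * depth symbols; the deinterleaver
        # uses the same delays in reverse order so every symbol sees the
        # same total delay end to end.
        delays = [i * depth for i in range(branches)]
        if reverse:
            delays.reverse()
        self.lines = [deque([fill] * d) for d in delays]
        self.branch = 0

    def push(self, symbol):
        """Feed one symbol to the current branch and return that branch's output."""
        line = self.lines[self.branch]
        self.branch = (self.branch + 1) % len(self.lines)
        if not line:  # zero-delay branch passes the symbol straight through
            return symbol
        line.append(symbol)
        return line.popleft()

    def run(self, symbols):
        return [self.push(s) for s in symbols]


if __name__ == "__main__":
    data = list(range(1, 41))
    tx = ConvolutionalInterleaver(4, 1).run(data)
    # Corrupt four consecutive channel symbols to model a burst error.
    tx[16:20] = [-1] * 4
    rx = ConvolutionalInterleaver(4, 1, reverse=True).run(tx)
    print([t for t, s in enumerate(rx) if s == -1])
```

With 4 branches and depth 1, the end-to-end delay is 12 symbols, and a burst of four adjacent channel errors leaves the deinterleaver as four isolated errors spaced three symbols apart, which is the kind of pattern an outer code such as Reed-Solomon handles more easily.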

    Coded caching in a multi-server system with random topology

    Cache-aided content delivery is studied in a multi-server system with P servers and K users, each equipped with a local cache memory. In the delivery phase, each user connects randomly to any ρ out of P servers. Thanks to the availability of multiple servers, which model small-cell base stations (SBSs), demands can be satisfied with reduced storage capacity at each server and reduced delivery rate per server; however, this also leads to reduced multicasting opportunities compared to the single-server scenario. A joint storage and proactive caching scheme is proposed, which exploits coded storage across the servers, uncoded cache placement at the users, and coded delivery. The delivery latency is studied for both successive and parallel transmissions from the servers. It is shown that, with successive transmissions the achievable average delivery latency is comparable to the one achieved in the single-server scenario, while the gap between the two depends on ρ, the available redundancy across the servers, and can be reduced by increasing the storage capacity at the SBSs. The optimality of the proposed scheme with uncoded cache placement and MDS-coded server storage is also proved for successive transmissions

    Homeobox-containing genes in the nemertean "Lineus" : key players in the antero-posterior body patterning and in the specification of the visual structures

    One of the most important breakthroughs in the field of developmental biology has been the discovery of the homeobox and of its widespread phylogenetic conservation. Many homeobox-containing genes encode transcription factors that regulate gene expression during important developmental processes, such as patterning and cell differentiation. Not only their sequences, but often also their expression patterns and their functions are conserved throughout bilaterian animals. Despite specific knowledge from selected model organisms, which belong to the Deuterostomia and the Ecdysozoa, a unified view of the evolutionarily conserved developmental mechanisms requires more investigation of the Lophotrochozoa, the third clade of Bilateria, which has so far been neglected. We have worked with nemerteans, also called ribbonworms, which are members of the Lophotrochozoa. Because of its evolutionary position, its relative simplicity and its impressive developmental plasticity, Lineus sanguineus, a marine ribbonworm from the class Anopla, is an attractive system to investigate the specification of the body plan and the mechanism by which differentiated cells maintain or reprogram their identity in a context-dependent manner. In this thesis, Lineus was used as a model system in an attempt to reveal to what extent the rostral/caudal specification of the antero-posterior axis and the eye specification network are conserved throughout the Bilateria. Although the Hox genes play important roles in the antero-posterior specification of the bilaterian body, the most rostral and the most caudal regions of the embryo are specified by orthodenticle-like (Otx) and caudal-like (Cdx), respectively. To test whether this is also the case in Lophotrochozoa, we first characterized the full-length Ls-Otx and Ls-Cdx genes. 
Then, we have shown that expression patterns in developing and adult Lineus suggest an involvement of Otx in the development, the specification and the maintenance of the anterior sensory structures and anterior brain regions. This is in good agreement with the proposed conserved functions of Otx among Bilateria. Similarly, the restriction of Ls-Cdx expression to the posterior extremity of the developing Lineus larva suggests that the presumed role of Cdx in the specification of the posterior end of bilaterian embryos could be conserved in Lineus. Additionally, we have studied both the expression patterns and the variation of expression levels of Ls-Otx and Ls-Cdx during regeneration. This has revealed that Ls-Cdx is specifically up-regulated only during posterior regeneration, whereas Ls-Otx is up-regulated during both anterior and posterior regeneration. The Ls-Otx expression becomes restricted to the anterior regenerating blastema only one week after the onset of regeneration. As it has been suggested that the CNS plays a crucial role in nemertean regeneration and as Ls-Otx is specifically expressed at the tip of the sectioned nerve cord in the early regenerating stages, we propose that Ls-Otx could be part of a signaling network responsible for the onset of regeneration. Additional information has been obtained from Lineus lacteus, a close relative of Lineus sanguineus, which does not exhibit the same regeneration capacities. In the light of the expression pattern of Otx in amputated Lineus lacteus, we propose that the differences in regeneration capacities between nemertean species could rely on differences in the capacity of their differentiated cells to de-differentiate in response to signals emitted from Otx-expressing cells of the nerve cord, rather than in the capacity to emit the signals leading to the onset of regeneration. In a second project, we have investigated the specification of the visual structures in L. sanguineus. 
Studies in Drosophila and vertebrates have revealed that the combinatorial expression of members of the evolutionarily conserved “eye specification network” specifies the eye field. The key members of this eye specification network are the Pax-6, Six, Eyes absent and Dachshund genes. We wanted to know whether this network is involved in the development, maintenance and regeneration of the Lineus eyes. At the beginning of this PhD work, it was already known that LsPax-6 is expressed in the developing eye field in Lineus. In addition, we had reported that its inactivation by RNA-mediated gene interference (RNAi) in an adult L. sanguineus leads to the disappearance of the adult eyes. To further investigate the specification of the Lineus eyes, we have characterized three Six genes, LsSix1/2, LsSix3/6 and LsSix4/5. Their expression patterns, especially that of LsSix1/2, suggest an involvement in the development and the regeneration of the Lineus eyes. In addition, we have observed a cross-reaction of a Drosophila anti-Dachshund antibody with the developing Lineus eyes. Taken together, these data support the idea that the “eye specification network” could be conserved in nemerteans. This molecular unity underlying eye specification in all bilaterian clades strongly supports the hypothesis of a monophyletic origin of the eyes

    On the Design of Future Communication Systems with Coded Transport, Storage, and Computing

    Communication systems are experiencing a fundamental change. Novel applications require increased performance from these systems, not only in throughput but also in latency, reliability, security, and heterogeneity support. To fulfil these requirements, future systems understand communication not only as the transport of bits but also as their storage, processing, and relation. In these systems, every network node has transport, storage, and computing resources that the network operator and its users can exploit through virtualisation and softwarisation of the resources. It is within this context that this work presents its results. We propose distributed coded approaches to improve communication systems. Our results improve the reliability and latency performance of the transport of information. They also increase the reliability, flexibility, and throughput of storage applications. Furthermore, building on the lesson that coded approaches improve the transport and storage performance of communication systems, we propose a distributed coded approach for the computing of novel in-network applications such as the steering and control of cyber-physical systems. Our proposed approach can increase the reliability and latency performance of distributed in-network computing in the presence of errors, erasures, and attackers

    C-MOS array design techniques: SUMC multiprocessor system study

    The current capabilities of LSI techniques for speed and reliability, plus the possibilities of assembling large configurations of LSI logic and storage elements, have demanded the study of multiprocessors and multiprocessing techniques, problems, and potentialities. Evaluated are three previous systems studies for a space ultrareliable modular computer multiprocessing system, and a new multiprocessing system is proposed that is flexibly configured with up to four central processors, four I/O processors, and 16 main memory units, plus auxiliary memory and peripheral devices. This multiprocessor system features a multilevel interrupt, qualified S/360 compatibility for ground-based generation of programs, virtual memory management of a storage hierarchy through I/O processors, and multiport access to multiple and shared memory units