Cycle-Consistent Deep Generative Hashing for Cross-Modal Retrieval
In this paper, we propose a novel deep generative approach to cross-modal
retrieval to learn hash functions in the absence of paired training samples
through the cycle consistency loss. Our proposed approach employs an adversarial
training scheme to learn a pair of hash functions enabling translation between
modalities while assuming the underlying semantic relationship. To induce the
hash codes with semantics to the input-output pair, cycle consistency loss is
further proposed upon the adversarial training to strengthen the correlations
between inputs and corresponding outputs. Our approach is generative to learn
hash functions such that the learned hash codes can maximally correlate each
input-output correspondence, meanwhile can also regenerate the inputs so as to
minimize the information loss. The learning to hash embedding is thus performed
to jointly optimize the parameters of the hash functions across modalities as
well as the associated generative models. Extensive experiments on a variety of
large-scale cross-modal data sets demonstrate that our proposed method achieves
better retrieval results than the state of the art.
Comment: To appear in IEEE Trans. Image Processing. arXiv admin note: text
overlap with arXiv:1703.10593 by other authors
On Distributed Storage Codes
Distributed storage systems are studied. The interest in such systems has become relatively wide due to the increasing amount of information that needs to be stored in data centers or different kinds of cloud systems. There are many kinds of solutions for storing the information in distributed devices, depending on the needs of the system designer. This thesis studies the design of such storage systems as well as their fundamental limits. Namely, the subjects of interest of this thesis include heterogeneous distributed storage systems, distributed storage systems with the exact repair property, and locally repairable codes. For distributed storage systems with either functional or exact repair, capacity results are proved. In the case of locally repairable codes, the minimum distance is studied.
Constructions for exact-repairing codes between minimum bandwidth regeneration (MBR) and minimum storage regeneration (MSR) points are given. These codes exceed the time-sharing line of the extremal points in many cases. Other properties of exact-regenerating codes are also studied. For the heterogeneous setup, the main result is that the capacity of such systems is always smaller than or equal to the capacity of a homogeneous system with symmetric repair with average node size and average repair bandwidth. A randomized construction for a locally repairable code with good minimum distance is given. It is shown that a random linear code of certain natural type has a good minimum distance with high probability. Other properties of locally repairable codes are also studied.
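For orientation, the two extremal points named above can be computed from the standard cut-set bound for regenerating codes. The sketch below is not taken from the thesis; it assumes the usual parameterization, which the abstract does not spell out: a file of size M, any k nodes sufficient to recover it, and repair of a failed node by contacting d helper nodes. The function names are illustrative.

```python
from fractions import Fraction

def msr_point(M, k, d):
    """Minimum-storage regenerating point: (per-node storage alpha, repair bandwidth gamma)."""
    alpha = Fraction(M, k)
    gamma = Fraction(M * d, k * (d - k + 1))
    return alpha, gamma

def mbr_point(M, k, d):
    """Minimum-bandwidth regenerating point: storage equals repair bandwidth."""
    alpha = Fraction(2 * M * d, k * (2 * d - k + 1))
    return alpha, alpha

def cutset_capacity(alpha, gamma, k, d):
    """Cut-set bound on the storable file size for given alpha and gamma."""
    beta = Fraction(gamma) / d  # amount downloaded from each helper node
    return sum(min(Fraction(alpha), (d - i) * beta) for i in range(k))

if __name__ == "__main__":
    M, k, d = 4, 2, 3  # assumed toy parameters
    for name, point in (("MSR", msr_point(M, k, d)), ("MBR", mbr_point(M, k, d))):
        alpha, gamma = point
        print(name, alpha, gamma, cutset_capacity(alpha, gamma, k, d))
```

Both points meet the cut-set bound with equality, so the interior operating points the abstract refers to lie between them on the storage-bandwidth tradeoff.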
ACCELERATING STORAGE APPLICATIONS WITH EMERGING KEY VALUE STORAGE DEVICES
With the continuous data explosion in the big data era, traditional software and hardware stacks
are facing unprecedented challenges in operating at such data scale. Thus, designing new
architectures and efficient systems for data-oriented applications has become increasingly critical.
This motivates us to rethink the conventional storage system design and re-architect both
software and hardware to meet the challenges of scale.
Besides the fast growth of data volume, the increasing demand from storage applications such
as video streaming and data analytics is pushing high-performance flash-based storage devices to
replace traditional spinning disks. This all-flash era raises data reliability concerns
due to the endurance problem of flash devices. Key-value stores (KVS) are important storage
infrastructure for handling fast-growing unstructured data and have been widely deployed in a
variety of scale-out enterprise applications such as online retail, big data analytics, and social networks.
How to efficiently manage data redundancy for key-value stores to provide data reliability, and how
to efficiently support range queries for key-value stores to accelerate analytics-oriented applications
under emerging key-value store system architectures, have become important research problems.
In this research, we focus on how to design new software and hardware architectures for key-value
store applications to provide reliability and improve query performance. To address
the different issues identified in this dissertation, we propose to employ a logical key management
layer, a thin layer above the KV devices that maps logical keys to physical keys on the devices.
We show how such a layer can enable multiple solutions to improve the performance and reliability
of KVSSD based storage systems. First, we present KVRAID, a high performance, write
efficient erasure coding management scheme on emerging key-value SSDs. The core innovation
of KVRAID is to propose a logical key management layer that maps logical keys to physical keys
to efficiently pack similar size KV objects and dynamically manage the membership of erasure
coding groups. Unlike existing schemes which manage erasure codes on the block level, KVRAID
manages the erasure codes on the KV object level. In order to achieve better storage efficiency for variable sized objects, KVRAID predefines multiple fixed sizes (slabs) according to the object size
distribution for the erasure code. KVRAID uses a logical to physical key conversion to pack the
KV objects of similar size into a parity group. KVRAID uses a lazy deletion mechanism with a
garbage collector for object updates. Our experiments show that in the 100% put case, KVRAID outperforms
software block RAID by 18x in throughput and reduces write amplification
(WAF) by 15x with only ~5% CPU utilization. In mixed update/get workloads, KVRAID achieves ~4x
better throughput with ~23% CPU utilization and reduces the storage overhead and WAF by 3.6x
and 11.3x on average, respectively.
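The slab-based packing described above can be illustrated with a small sketch. This is not the KVRAID implementation: the slab sizes, the group width, the class name, and the use of a single XOR parity in place of a general erasure code are all assumptions made for illustration, and lazy deletion with garbage collection is not modeled (an updated key simply leaves its old object stale).

```python
from functools import reduce

SLAB_SIZES = [64, 256, 1024]  # hypothetical slab sizes in bytes
GROUP_WIDTH = 3               # hypothetical number of data objects per parity group

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

class LogicalKeyLayer:
    """Toy logical-to-physical key layer; a dict stands in for the KV device."""

    def __init__(self):
        self.key_map = {}  # logical key -> ((slab, group, slot), object length)
        self.device = {}   # physical key -> padded object or parity block
        self.open_groups = {s: [] for s in SLAB_SIZES}
        self.next_group = {s: 0 for s in SLAB_SIZES}

    @staticmethod
    def pick_slab(size):
        """Smallest slab that fits an object of the given size."""
        for slab in SLAB_SIZES:
            if size <= slab:
                return slab
        raise ValueError("object larger than the largest slab")

    def put(self, key, value):
        slab = self.pick_slab(len(value))
        pending = self.open_groups[slab]
        phys = (slab, self.next_group[slab], len(pending))
        self.device[phys] = value.ljust(slab, b"\0")  # pad to the slab size
        self.key_map[key] = (phys, len(value))
        pending.append(phys)
        if len(pending) == GROUP_WIDTH:
            # Group is full: write its parity object and open a new group.
            parity = xor_blocks([self.device[p] for p in pending])
            self.device[(slab, self.next_group[slab], "P")] = parity
            self.open_groups[slab] = []
            self.next_group[slab] += 1

    def get(self, key):
        phys, length = self.key_map[key]
        return self.device[phys][:length]

    def recover(self, key):
        """Rebuild an object from the rest of its sealed parity group."""
        (slab, group, slot), length = self.key_map[key]
        members = list(range(GROUP_WIDTH)) + ["P"]
        others = [self.device[(slab, group, s)] for s in members if s != slot]
        return xor_blocks(others)[:length]
```

Because every object in a group is padded to the same slab size, parity can be computed per object rather than per block, which is the property the abstract attributes to managing erasure codes at the KV object level.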
Second, we present KVRangeDB, an ordered log structure tree based key index that supports
range queries on a hash-based KVSSD. In addition, we propose to pack smaller application records
into a larger physical record on the device through the logical key management layer. We compared
the performance of KVRangeDB against a RocksDB implementation on KVSSD and the state-of-the-art
software KV store WiscKey on a block device, on three types of real-world applications:
cloud-serving workloads, the TABLEFS filesystem, and time-series databases. For cloud-serving applications,
KVRangeDB achieves 8.3x and 1.7x better 99.9% write tail latency compared
to the RocksDB implementation on KV-SSD and WiscKey on block SSD, respectively. On the query side,
KVRangeDB performs worse only for very long scans, but provides fast point queries and
closed range queries. The experiments on TABLEFS demonstrate that using KVRangeDB for
metadata indexing can boost performance by a factor of ~6.3x on average and reduce CPU cost by ~3.9x
for four metadata-intensive workloads compared to the RocksDB implementation on KVSSD.
Compared to WiscKey, KVRangeDB improves performance by ~2.6x on average and reduces
CPU usage by ~1.7x.
Third, we propose a generic FPGA accelerator for emerging Minimum Storage Regenerating
(MSR) codes encoding/decoding which maximizes the computation parallelism and minimizes
the data movement between off-chip DRAM and the on-chip SRAM buffers. To demonstrate the
efficiency of our proposed accelerator, we implemented the encoding/decoding algorithms for a
specific MSR code called Zigzag code on a Xilinx VCU1525 acceleration card. Our evaluation shows that our proposed accelerator can achieve ~2.4-3.1x better throughput and ~4.2-5.7x better
power efficiency compared to the state-of-the-art multi-core CPU implementation and ~2.8-3.3x better
throughput and ~4.2-5.3x better power efficiency compared to a modern GPU accelerator
An investigation of error characteristics and coding performance
The first year's effort on NASA Grant NAG5-2006 was an investigation to characterize typical errors resulting from the EOS downlink. The analysis methods developed for this effort were used on test data from a March 1992 White Sands Terminal Test. The effectiveness of a concatenated coding scheme of a Reed Solomon outer code and a convolutional inner code versus a Reed Solomon only code scheme has been investigated as well as the effectiveness of a Periodic Convolutional Interleaver in dispersing errors of certain types. The work effort consisted of development of software that allows simulation studies with the appropriate coding schemes plus either simulated data with errors or actual data with errors. The software program is entitled Communication Link Error Analysis (CLEAN) and models downlink errors, forward error correcting schemes, and interleavers
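How a periodic convolutional interleaver disperses errors can be sketched with a generic textbook structure: B delay lines visited in rotation, where line i delays its symbols by i*M visits, and a deinterleaver with the complementary delays. This is not the CLEAN software; the class name and the parameters B, M, and the fill symbol are assumptions chosen for illustration.

```python
from collections import deque

class DelayLineBank:
    """Rotating bank of FIFO delay lines, one symbol pushed per call."""

    def __init__(self, delays, fill=0):
        self.lines = [deque([fill] * d) for d in delays]
        self.index = 0  # delay line that receives the next symbol

    def push(self, symbol):
        line = self.lines[self.index]
        self.index = (self.index + 1) % len(self.lines)
        line.append(symbol)
        return line.popleft()  # a zero-delay line returns the symbol itself

def make_interleaver_pair(B, M, fill=0):
    """Interleaver with delays i*M and deinterleaver with delays (B-1-i)*M."""
    interleaver = DelayLineBank([i * M for i in range(B)], fill)
    deinterleaver = DelayLineBank([(B - 1 - i) * M for i in range(B)], fill)
    return interleaver, deinterleaver

if __name__ == "__main__":
    B, M = 4, 1
    interleaver, deinterleaver = make_interleaver_pair(B, M)
    data = list(range(1, 41))
    received = [deinterleaver.push(interleaver.push(x)) for x in data]
    delay = B * (B - 1) * M  # end-to-end delay in symbols
    print(received[delay:] == data[:-delay])
```

Consecutive channel symbols travel through different delay lines, so a burst of adjacent channel errors ends up spread across the deinterleaved stream, which is what lets an outer code such as Reed Solomon correct it.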
Coded caching in a multi-server system with random topology
Cache-aided content delivery is studied in a multi-server system with P servers and K users, each equipped with a local cache memory. In the delivery phase, each user connects randomly to any ρ out of P servers. Thanks to the availability of multiple servers, which model small-cell base stations (SBSs), demands can be satisfied with reduced storage capacity at each server and reduced delivery rate per server; however, this also leads to reduced multicasting opportunities compared to the single-server scenario. A joint storage and proactive caching scheme is proposed, which exploits coded storage across the servers, uncoded cache placement at the users, and coded delivery. The delivery latency is studied for both successive and parallel transmissions from the servers. It is shown that, with successive transmissions the achievable average delivery latency is comparable to the one achieved in the single-server scenario, while the gap between the two depends on ρ, the available redundancy across the servers, and can be reduced by increasing the storage capacity at the SBSs. The optimality of the proposed scheme with uncoded cache placement and MDS-coded server storage is also proved for successive transmissions
Homeobox-containing genes in the nemertean "Lineus" : key players in the antero-posterior body patterning and in the specification of the visual structures
One of the most important breakthroughs in the field of developmental biology
has been the discovery of the homeobox and of its widespread phylogenetic
conservation. Many homeobox-containing genes encode transcription factors that
regulate gene expression during important developmental processes, such as patterning
and cell differentiation. Not only their sequences, but often also their expression
patterns and their functions are conserved throughout bilaterian animals. Despite
specific knowledge from selected model organisms, which belong to the
Deuterostomia and the Ecdysozoa, a unified view of the evolutionarily conserved
developmental mechanisms requires more investigations from the Lophotrochozoa,
the third clade of Bilateria, which has been neglected so far.
We have worked with nemerteans, also called ribbonworms, which are
members of the Lophotrochozoa. Because of its evolutionary position, its relative
simplicity and impressive developmental plasticity, Lineus sanguineus, a marine
ribbonworm from the class Anopla, is an attractive system to investigate the
specification of the body plan and the mechanism by which differentiated cells
maintain or reprogram their identity in a context-dependent manner. In this thesis,
Lineus was used as a model system in an attempt to reveal to what extent the rostral/caudal
specification of the antero-posterior axis and the eye specification network are
conserved throughout the Bilateria.
Although the Hox genes play important roles in the antero-posterior
specification of the bilaterian body, the most rostral and the most caudal regions of
the embryo are specified by orthodenticle-like (Otx) and caudal-like (Cdx),
respectively. To test whether this is also the case in Lophotrochozoa, we first have
characterized the full-length Ls-Otx and Ls-Cdx genes. Then, we have shown that
expression patterns in developing and adult Lineus suggest an involvement of Otx in
the development, the specification and the maintenance of the anterior sensory
structures and anterior brain regions. This is in good agreement with the proposed
conserved functions of Otx among Bilateria. Similarly, the restriction of Ls-Cdx
expression to the posterior extremity of the developing Lineus larva suggests that the
presumed conserved role of Cdx in the specification of the posterior end of bilaterian
embryos could be conserved in Lineus. Additionally, we have studied both the
expression patterns and the variation of expression levels of Ls-Otx and Ls-Cdx
during regeneration. This has revealed that Ls-Cdx is specifically up-regulated during
posterior regeneration only, whereas Ls-Otx is up-regulated during both anterior and
posterior regeneration. The Ls-Otx expression becomes restricted to the anterior
regenerating blastema only one week after the onset of regeneration. As it has been
suggested that the CNS plays a crucial role in nemertean regeneration and as Ls-Otx is
specifically expressed at the tip of the sectioned nerve cord of the early regenerating
stages, we propose that Ls-Otx could be part of a signaling network responsible for
the onset of regeneration. Additional information has been obtained from Lineus
lacteus, a close relative of Lineus sanguineus, which does not exhibit the same
regeneration capacities. In the light of the expression pattern of Otx in amputated
Lineus lacteus, we propose that the differences in regeneration capacities between
nemertean species could rely on the differences in the capacity of their differentiated
cells to de-differentiate in response to signals emitted from Otx expressing cells of the
nerve cord, rather than in the capacity to emit the signals leading to the onset of
regeneration.
In a second project, we have investigated the specification of the visual
structures in L. sanguineus. Studies in Drosophila and vertebrates have revealed that a
combinatorial expression of members of the evolutionarily conserved “eye
specification network” specifies the eye field. The key members of this eye
specification network are the Pax-6, Six, Eyes absent and Dachshund genes. We
wanted to know whether this network is involved in the development, maintenance
and regeneration of the Lineus eyes. At the beginning of this PhD work, it was already
known that LsPax-6 is expressed in the developing eye field in Lineus. In addition, we
had reported that its inactivation by RNA-mediated gene interference (RNAi) in an
adult L. sanguineus leads to the disappearance of the adult eyes. To further investigate
the specification of the Lineus eyes, we have characterized three Six genes, LsSix1/2,
LsSix3/6 and LsSix4/5. Their expression patterns, especially that of LsSix1/2, suggest an
involvement in the development and the regeneration of the Lineus eyes. In addition,
we have observed a cross-reaction of a Drosophila anti-Dachshund antibody with the
developing Lineus eyes. Taken together, these data support the idea that the “eye
specification network” could be conserved in nemerteans. This molecular unity
underlying eye specification in all bilaterian clades strongly supports the hypothesis
of a monophyletic origin of the eyes
On the Design of Future Communication Systems with Coded Transport, Storage, and Computing
Communication systems are experiencing a fundamental change. Novel applications require increased performance from these systems, not only in throughput but also in latency, reliability, security, and heterogeneity support. To fulfil these requirements, future systems understand communication not only as the transport of bits but also as their storage, processing, and relation. In these systems, every network node has transport, storage, and computing resources that the network operator and its users can exploit through virtualisation and softwarisation of the resources. It is within this context that this work presents its results. We propose distributed coded approaches to improve communication systems. Our results improve the reliability and latency performance of the transport of information. They also increase the reliability, flexibility, and throughput of storage applications. Furthermore, based on the lesson that coded approaches improve the transport and storage performance of communication systems, we propose a distributed coded approach for the computing of novel in-network applications such as the steering and control of cyber-physical systems. Our proposed approach can increase the reliability and latency performance of distributed in-network computing in the presence of errors, erasures, and attackers
C-MOS array design techniques: SUMC multiprocessor system study
The current capabilities of LSI techniques for speed and reliability, plus the possibilities of assembling large configurations of LSI logic and storage elements, have demanded the study of multiprocessors and multiprocessing techniques, problems, and potentialities. Evaluated are three previous systems studies for a space ultrareliable modular computer multiprocessing system, and a new multiprocessing system is proposed that is flexibly configured with up to four central processors, four I/O processors, and 16 main memory units, plus auxiliary memory and peripheral devices. This multiprocessor system features a multilevel interrupt, qualified S/360 compatibility for ground-based generation of programs, virtual memory management of a storage hierarchy through I/O processors, and multiport access to multiple and shared memory units