Creating a native Swift JPEG codec
Swift is one of the world's most popular systems programming languages; however, for many applications, such as image decoding and encoding, Apple's proprietary frameworks are the only options available to users. This project, an open-source, pure-Swift implementation of the ITU-T T.81 (JPEG) standard, is motivated by that gap in the language ecosystem. Written in the style of an open-source project contributor's guide, the text begins by detailing the problems and considerations inherent to codec design, and how the Swift language allows for highly expressive and safe APIs beyond what older C and C++ frameworks can provide. We continue with an overview of the components of our fully featured JPEG library, including the ways in which various performance and safety issues have been addressed. We also cover the packaging and encapsulation required to vend a usable framework, as well as the unit, integration, and regression tests essential for its long-term maintenance.
Incompressible Encodings
An incompressible encoding can probabilistically encode some data into a codeword that is not much larger. Anyone can decode the codeword to recover the original data. However, the codeword cannot be efficiently compressed, even if the original data is given to the decompression procedure on the side. In other words, the codeword is an efficiently decodable representation of the data, yet is computationally incompressible even given the data. An incompressible encoding is composable if many encodings cannot be simultaneously compressed.
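As a rough formalization of the above (the notation is ours: m is the data, c the codeword, ε the expansion slack, δ the required savings; the paper's exact parameterization may differ):

```latex
\[
\mathsf{Dec}(\mathsf{Enc}(m)) = m,
\qquad
|\mathsf{Enc}(m)| \le (1+\epsilon)\,|m|,
\]
\[
\Pr\bigl[\,\mathsf{Decomp}(\mathsf{Comp}(c,m),\,m) = c
  \;\wedge\;
  |\mathsf{Comp}(c,m)| \le (1-\delta)\,|c|\,\bigr]
\le \mathrm{negl}(\lambda),
\]
% where c <- Enc(m) and (Comp, Decomp) range over all efficient
% compressor/decompressor pairs that may use m as side information.
```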
The recent work of Damgård, Ganesh and Orlandi (CRYPTO '19) defined a variant of incompressible encodings as a building block for "proofs of replicated storage". They constructed incompressible encodings in an ideal permutation model, but it was left open whether they can be constructed under standard assumptions, or even in the more basic random-oracle model. In this work, we undertake a comprehensive study of incompressible encodings as a primitive of independent interest and give new constructions, negative results and applications:
* We construct incompressible encodings in the common random string (CRS) model under either Decisional Composite Residuosity (DCR) or Learning with Errors (LWE). However, the construction has several drawbacks: (1) it is not composable, (2) it only achieves selective security, and (3) the CRS is as long as the data.
* We leverage the above construction to also get a scheme in the random-oracle model, under the same assumptions, that avoids all of the above drawbacks. Furthermore, it is significantly more efficient than the prior ideal-model construction.
* We give black-box separations, showing that incompressible encodings in the plain model cannot be proven secure under any standard hardness assumption, and incompressible encodings in the CRS model must inherently suffer from all of the drawbacks above.
* We give a new application to "big-key cryptography in the bounded-retrieval model", where secret keys are made intentionally huge to make them hard to exfiltrate. Using incompressible encodings, we can get all the security benefits of a big key without wasting storage space, by having the key encode useful data.
Representation and Exploitation of Event Sequences
The Ten Commandments, the thirty best smartphones on the market, and the five people most wanted by the FBI. Our lives are ruled by sequences: thought sequences, number sequences, event sequences... A history book is nothing more than a compilation of events, and our favorite film is just a sequence of scenes. All of them have something in common: relevant information can be extracted from them. Frequently, by accumulating some data over the elements of each sequence we can access hidden information (e.g., the number of passengers transported by a bus on a journey is the sum of the passengers who boarded across the sequence of stops made); at other times, reordering the elements by one of their characteristics eases access to the elements of interest (e.g., the books published in 2019 can be ordered chronologically, by author, by literary genre, or even by a combination of characteristics); but we will always seek to store them in the smallest space possible.
Thus, this thesis proposes technological solutions for the storage and subsequent processing of events, focusing on three fundamental aspects found in any application that needs to manage them: compressed and dynamic storage, aggregation or accumulation of data over the elements of the sequence, and reordering of the sequence's elements by their different characteristics or dimensions.
The first contribution of this work is a compact structure for the dynamic compression of event sequences. This structure allows any sequence to be compressed in a single pass; that is, it can compress in real time as elements arrive. This contribution is a milestone in the world of compression since, to date, it is the first proposal for a general-purpose variable-to-variable dynamic compressor.
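For contrast, a classic one-pass dictionary compressor such as LZW (a variable-to-fixed code, unlike the variable-to-variable scheme proposed in the thesis) illustrates what "compressing in a single pass, as elements arrive" looks like in practice; this is only an illustration, not the proposed structure:

```python
def lzw_stream(symbols):
    """Yield integer codes for an incoming byte stream, in one pass."""
    dictionary = {}          # phrase -> code
    next_code = 256          # codes 0..255 reserved for single bytes
    phrase = b""
    for s in symbols:        # symbols arrive one at a time (real time)
        candidate = phrase + bytes([s])
        if candidate in dictionary or len(candidate) == 1:
            phrase = candidate          # keep extending the current phrase
        else:
            # emit the code for the longest known phrase, learn the new one
            yield dictionary[phrase] if len(phrase) > 1 else phrase[0]
            dictionary[candidate] = next_code
            next_code += 1
            phrase = bytes([s])
    if phrase:               # flush the final phrase
        yield dictionary[phrase] if len(phrase) > 1 else phrase[0]

codes = list(lzw_stream(b"abababab"))   # [97, 98, 256, 258, 98]
```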
Regarding aggregation, a data-warehouse-like proposal is presented, capable of storing information on any characteristic of the events in a sequence in an aggregated, compact, and accessible way. Following the philosophy of current data warehouses, we avoid repeating cumulative operations and speed up aggregate queries by preprocessing the information and keeping it in this separate structure.
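The simplest instance of this precompute-once philosophy is a prefix-sum array over the sequence; the thesis's structure is compact and compressed, but the query pattern is the same. A minimal sketch:

```python
class PrefixSums:
    """Precompute cumulative sums once; answer range aggregates in O(1)."""

    def __init__(self, values):
        self.acc = [0]
        for v in values:                 # one preprocessing pass
            self.acc.append(self.acc[-1] + v)

    def range_sum(self, i, j):
        """Sum of values[i..j], with no re-accumulation per query."""
        return self.acc[j + 1] - self.acc[i]

# passengers boarding at each stop of a bus journey (example from the abstract)
boardings = [5, 3, 0, 7, 2]
ps = PrefixSums(boardings)
total = ps.range_sum(0, 4)   # 17 passengers across the whole journey
```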
Finally, this thesis addresses the problem of indexing event sequences considering their different characteristics and possible reorderings. A new approach for simultaneously keeping the elements of a sequence ordered by different characteristics is presented, based on compact structures. Thus, it is possible to query the information and perform operations on the elements of the sequence using any possible rearrangement in a
simple and efficient way.
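One way to picture this last contribution: store the events once and keep one permutation (index array) per characteristic. Compact-structure versions would compress these permutations; the plain lists below are only a hedged illustration of the idea:

```python
events = [
    {"title": "B", "author": "Ann", "month": 3},
    {"title": "A", "author": "Zoe", "month": 1},
    {"title": "C", "author": "Mel", "month": 2},
]

# one permutation per characteristic, each built once
by_month  = sorted(range(len(events)), key=lambda i: events[i]["month"])
by_author = sorted(range(len(events)), key=lambda i: events[i]["author"])

# traverse the same stored elements in any ordering, without duplication
chronological = [events[i]["title"] for i in by_month]    # ['A', 'C', 'B']
alphabetical  = [events[i]["title"] for i in by_author]   # ['B', 'C', 'A']
```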
Web Content Delivery Optimization
Milliseconds matter when they are counted. If we compress the life of the universe into a single year, then at 11:59:59.5 PM on 31 December "speed" was transportation's concern, and 500 milliseconds later it is the web's; no one knows whose concern it will be in the coming milliseconds, but at this very moment this thesis proposes an optimization method, mainly for content delivery over slow connections. The method utilizes a proxy as a middlebox to fetch the content requested by a client from a single web server or multiple web servers, and bundles all of the fetched images that fit the bundling policy inside a JavaScript file in Base64 format. This optimization method reduces the number of HTTP requests between the client and multiple web servers as a result of the proposed bundling solution, and at the same time improves HTTP compression efficiency as a result of the proposed aggregative compression of textual content. Page loading time results for the test web pages, which were specially designed and developed to capture the optimum benefits of the proposed method, showed up to 81% faster page loading for all connection types. However, other tests in non-optimal situations, such as web pages that use "lazy loading" techniques, showed only 35% to 50% improvement, achievable only on 2G and 3G connections (0.2 Mbps – 15 Mbps downlink) and not on faster connections.
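A hedged sketch of the bundling step described above: the proxy collects the image responses for a page and emits one JavaScript file carrying them as Base64 data URIs. The names (bundle_images, MAX_BUNDLED_SIZE) and the size-based policy are illustrative assumptions, not taken from the thesis:

```python
import base64
import json

MAX_BUNDLED_SIZE = 512 * 1024   # assumed bundling policy: skip huge images

def bundle_images(images):
    """images: {url: (mime_type, raw_bytes)} -> one JS bundle string."""
    entries = {}
    for url, (mime, data) in images.items():
        if len(data) <= MAX_BUNDLED_SIZE:
            entries[url] = f"data:{mime};base64," + base64.b64encode(data).decode()
    # one JS file means one HTTP request instead of one request per image;
    # the client rewrites <img src> attributes from this URL -> data-URI map
    return "window.__imageBundle = " + json.dumps(entries) + ";"
```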
Binary RDF for Scalable Publishing, Exchanging and Consumption in the Web of Data
The current data deluge is flooding the Web with large volumes of data represented in RDF, giving rise to the so-called "Web of Data". In this thesis we first propose an in-depth study of these datasets, aimed at a global understanding of the real structure of RDF data. We then propose HDT, which addresses the efficient representation of large volumes of RDF data through structures optimized for storage and network transmission. HDT effectively represents an RDF dataset by splitting it into three components: the Header, the Dictionary, and the RDF triples structure (Triples). We then focus on providing efficient structures for these components, occupying compressed space while allowing direct access to any data.
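A minimal sketch of HDT's Dictionary + Triples split: terms are mapped to integer IDs once (the Dictionary), and every triple becomes a compact ID tuple (the Triples). Real HDT uses compressed succinct structures for both components; plain Python containers only illustrate the decomposition:

```python
def build_hdt_like(triples):
    """Split (subject, predicate, object) strings into dictionary + ID triples."""
    dictionary, ids = {}, []

    def term_id(term):
        if term not in dictionary:
            dictionary[term] = len(dictionary) + 1   # IDs start at 1
        return dictionary[term]

    for s, p, o in triples:
        ids.append((term_id(s), term_id(p), term_id(o)))
    header = {"triples": len(ids), "terms": len(dictionary)}  # dataset metadata
    return header, dictionary, ids

header, dic, ids = build_hdt_like([
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob",   "foaf:knows", "ex:alice"),
])
# ids == [(1, 2, 3), (3, 2, 1)]: each triple stored as three small integers
```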
FPGA-based Query Acceleration for Non-relational Databases
Database management systems are an integral part of today's everyday life. Trends like smart applications, the Internet of Things, and business and social networks require applications to deal efficiently with data in various data models close to the underlying domain. Therefore, non-relational database systems provide a wide variety of database models, like graphs and documents. However, current non-relational database systems face performance challenges due to the end of Dennard scaling and, with it, of CPU performance scaling. Meanwhile, FPGAs have gained traction as accelerators for data management.
Our goal is to tackle the performance challenges of non-relational database systems with FPGA acceleration and, at the same time, address the design challenges of FPGA acceleration itself. Therefore, we split this thesis into two main lines of work: graph processing and flexible data processing.
Because of the lack of established benchmarking practices for graph processing accelerators, we propose GraphSim. GraphSim reproduces the runtimes of these accelerators based on a memory access model of each approach. Through this simulation environment, we extract three performance-critical accelerator properties: asynchronous graph processing, a compressed graph data structure, and multi-channel memory. Since these accelerator properties have not previously been combined in one system, we propose GraphScale. GraphScale is the first scalable, asynchronous graph processing accelerator working on a compressed graph, and it outperforms all state-of-the-art graph processing accelerators.
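The abstract does not spell out the compressed graph layout; compressed sparse row (CSR) is the standard baseline such accelerators build on, shown here as a hedged illustration of a "compressed graph data structure":

```python
def to_csr(num_vertices, edges):
    """Adjacency as two flat arrays: per-vertex offsets + packed neighbors."""
    offsets = [0] * (num_vertices + 1)
    for src, _ in edges:
        offsets[src + 1] += 1                  # out-degree counts
    for v in range(num_vertices):
        offsets[v + 1] += offsets[v]           # prefix sums -> row starts
    neighbors = [0] * len(edges)
    cursor = offsets[:]                        # next write position per vertex
    for src, dst in edges:
        neighbors[cursor[src]] = dst
        cursor[src] += 1
    return offsets, neighbors

offsets, neighbors = to_csr(3, [(0, 1), (0, 2), (2, 0)])
# vertex v's neighbors are neighbors[offsets[v]:offsets[v + 1]]
```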
Focusing on accelerator flexibility, we propose PipeJSON as the first FPGA-based JSON parser for arbitrary JSON documents. PipeJSON achieves parsing at line speed, outperforming the fastest vectorized CPU parsers. Lastly, we propose the subgraph query processing accelerator GraphMatch, which outperforms state-of-the-art CPU systems for subgraph query processing and can flexibly switch queries at runtime in a matter of clock cycles.
Qualitative Evaluation of Data Compression in Real-time Ultrasound Imaging
The purpose of this project was to qualitatively evaluate real-time ultrasound imaging using objective and subjective techniques, to determine the minimum bandwidth required for clinical diagnosis of various anatomical and pathological states. In the experimental setup, live ultrasound video samples representing the most common clinical examinations were compressed at 128, 256, 384, 768, 1152, and 1536 kbps using a compressor-decompressor (codec) adhering to International Telecommunication Union recommendation ITU-T H.261. A protocol for qualitative evaluation was developed, and subjective and objective testing were performed based on this protocol. Subjective methods comprised inter-rater reliability tests using kappa statistics and three-way analysis of variance (ANOVA) using general linear models (GLM). Objective testing was performed using histogram analysis and estimation of peak signal-to-noise ratios.
The kappa scores for all bandwidths greater than 256 kbps indicated good inter-rater reliability and minimal variation in confidence levels. Using the results from the GLM and ANOVA, we could not establish a trend of degradation in observer confidence with increasing compression ratios. The histogram analysis showed a linear increase in standard deviation values, indicating a linear scatter in pixel intensity with increasing compression ratios. Although higher compression levels were evaluated, only video clips with bandwidths greater than 256 kbps displayed satisfactory temporal and spatial resolution, good enough to make clinical diagnoses of various anatomical and pathological states. The evaluations also indicate that real-time ultrasound imagery compressed with H.261 can be transmitted over T1 or ADSL networks.
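A minimal sketch of the objective metric mentioned above, peak signal-to-noise ratio between an original and a compressed frame; the 8-bit peak value of 255 is an assumption about the frame format:

```python
import numpy as np

def psnr(original, compressed, peak=255.0):
    """PSNR in dB between two same-shaped frames; higher = less distortion."""
    mse = np.mean((original.astype(np.float64) -
                   compressed.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")                   # identical frames
    return 10.0 * np.log10(peak ** 2 / mse)
```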