115 research outputs found

    Iterative joint source channel decoding for H.264 compressed video transmission

    Get PDF
    In this thesis, the error resilient transmission of H.264 compressed video using Context-based Adaptive Binary Arithmetic Code (CABAC) as the entropy code is examined. The H.264 compressed video is convolutionally encoded and transmitted over an Additive White Gaussian Noise (AWGN) channel. Two iterative joint source-channel decoding schemes are proposed, in which slice candidates that failed semantic verification are exploited. The first proposed scheme uses soft values of bits produced by a soft-input soft-output channel decoder to generate a list of slice candidates for each slice in the compressed video sequence. These slice candidates are semantically verified to choose the best one. A new semantic checking method is proposed, which uses information from slice candidates that failed semantic verification to virtually check the current slice candidate. The second proposed scheme is built on the first one. This scheme also uses slice candidates that failed semantic verification but it uses them to modify soft values of bits at the source decoder before they are fed back into the channel decoder for the next iteration. Simulation results show that both schemes offer improvements in terms of subjective quality and in terms of objective quality using PSNR and BER as measures. Keywords: Video transmission, H.264, semantics, slice candidate, joint source-channel decoding, error resilienc

    High-performance hardware accelerators for image processing in space applications

    Get PDF
    Mars is a hard place to reach. While there have been many notable success stories in getting probes to the Red Planet, the historical record is full of bad news. The success rate for actually landing on the Martian surface is even worse, roughly 30%. This low success rate must be mainly credited to the Mars environment characteristics. In the Mars atmosphere strong winds frequently breath. This phenomena usually modifies the lander descending trajectory diverging it from the target one. Moreover, the Mars surface is not the best place where performing a safe land. It is pitched by many and close craters and huge stones, and characterized by huge mountains and hills (e.g., Olympus Mons is 648 km in diameter and 27 km tall). For these reasons a mission failure due to a landing in huge craters, on big stones or on part of the surface characterized by a high slope is highly probable. In the last years, all space agencies have increased their research efforts in order to enhance the success rate of Mars missions. In particular, the two hottest research topics are: the active debris removal and the guided landing on Mars. The former aims at finding new methods to remove space debris exploiting unmanned spacecrafts. These must be able to autonomously: detect a debris, analyses it, in order to extract its characteristics in terms of weight, speed and dimension, and, eventually, rendezvous with it. In order to perform these tasks, the spacecraft must have high vision capabilities. In other words, it must be able to take pictures and process them with very complex image processing algorithms in order to detect, track and analyse the debris. The latter aims at increasing the landing point precision (i.e., landing ellipse) on Mars. Future space-missions will increasingly adopt Video Based Navigation systems to assist the entry, descent and landing (EDL) phase of space modules (e.g., spacecrafts), enhancing the precision of automatic EDL navigation systems. For instance, recent space exploration missions, e.g., Spirity, Oppurtunity, and Curiosity, made use of an EDL procedure aiming at following a fixed and precomputed descending trajectory to reach a precise landing point. This approach guarantees a maximum landing point precision of 20 km. By comparing this data with the Mars environment characteristics, it is possible to understand how the mission failure probability still remains really high. A very challenging problem is to design an autonomous-guided EDL system able to even more reduce the landing ellipse, guaranteeing to avoid the landing in dangerous area of Mars surface (e.g., huge craters or big stones) that could lead to the mission failure. The autonomous behaviour of the system is mandatory since a manual driven approach is not feasible due to the distance between Earth and Mars. Since this distance varies from 56 to 100 million of km approximately due to the orbit eccentricity, even if a signal transmission at the light speed could be possible, in the best case the transmission time would be around 31 minutes, exceeding so the overall duration of the EDL phase. In both applications, algorithms must guarantee self-adaptability to the environmental conditions. Since the Mars (and in general the space) harsh conditions are difficult to be predicted at design time, these algorithms must be able to automatically tune the internal parameters depending on the current conditions. Moreover, real-time performances are another key factor. Since a software implementation of these computational intensive tasks cannot reach the required performances, these algorithms must be accelerated via hardware. For this reasons, this thesis presents my research work done on advanced image processing algorithms for space applications and the associated hardware accelerators. My research activity has been focused on both the algorithm and their hardware implementations. Concerning the first aspect, I mainly focused my research effort to integrate self-adaptability features in the existing algorithms. While concerning the second, I studied and validated a methodology to efficiently develop, verify and validate hardware components aimed at accelerating video-based applications. This approach allowed me to develop and test high performance hardware accelerators that strongly overcome the performances of the actual state-of-the-art implementations. The thesis is organized in four main chapters. Chapter 2 starts with a brief introduction about the story of digital image processing. The main content of this chapter is the description of space missions in which digital image processing has a key role. A major effort has been spent on the missions in which my research activity has a substantial impact. In particular, for these missions, this chapter deeply analizes and evaluates the state-of-the-art approaches and algorithms. Chapter 3 analyzes and compares the two technologies used to implement high performances hardware accelerators, i.e., Application Specific Integrated Circuits (ASICs) and Field Programmable Gate Arrays (FPGAs). Thanks to this information the reader may understand the main reasons behind the decision of space agencies to exploit FPGAs instead of ASICs for high-performance hardware accelerators in space missions, even if FPGAs are more sensible to Single Event Upsets (i.e., transient error induced on hardware component by alpha particles and solar radiation in space). Moreover, this chapter deeply describes the three available space-grade FPGA technologies (i.e., One-time Programmable, Flash-based, and SRAM-based), and the main fault-mitigation techniques against SEUs that are mandatory for employing space-grade FPGAs in actual missions. Chapter 4 describes one of the main contribution of my research work: a library of high-performance hardware accelerators for image processing in space applications. The basic idea behind this library is to offer to designers a set of validated hardware components able to strongly speed up the basic image processing operations commonly used in an image processing chain. In other words, these components can be directly used as elementary building blocks to easily create a complex image processing system, without wasting time in the debug and validation phase. This library groups the proposed hardware accelerators in IP-core families. The components contained in a same family share the same provided functionality and input/output interface. This harmonization in the I/O interface enables to substitute, inside a complex image processing system, components of the same family without requiring modifications to the system communication infrastructure. In addition to the analysis of the internal architecture of the proposed components, another important aspect of this chapter is the methodology used to develop, verify and validate the proposed high performance image processing hardware accelerators. This methodology involves the usage of different programming and hardware description languages in order to support the designer from the algorithm modelling up to the hardware implementation and validation. Chapter 5 presents the proposed complex image processing systems. In particular, it exploits a set of actual case studies, associated with the most recent space agency needs, to show how the hardware accelerator components can be assembled to build a complex image processing system. In addition to the hardware accelerators contained in the library, the described complex system embeds innovative ad-hoc hardware components and software routines able to provide high performance and self-adaptable image processing functionalities. To prove the benefits of the proposed methodology, each case study is concluded with a comparison with the current state-of-the-art implementations, highlighting the benefits in terms of performances and self-adaptability to the environmental conditions

    Dynamic information and constraints in source and channel coding

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2004.This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.Includes bibliographical references (p. 237-251).This thesis explore dynamics in source coding and channel coding. We begin by introducing the idea of distortion side information, which does not directly depend on the source but instead affects the distortion measure. Such distortion side information is not only useful at the encoder but under certain conditions knowing it at the encoder is optimal and knowing it at the decoder is useless. Thus distortion side information is a natural complement to Wyner-Ziv side information and may be useful in exploiting properties of the human perceptual system as well as in sensor or control applications. In addition to developing the theoretical limits of source coding with distortion side information, we also construct practical quantizers based on lattices and codes on graphs. Our use of codes on graphs is also of independent interest since it highlights some issues in translating the success of turbo and LDPC codes into the realm of source coding. Finally, to explore the dynamics of side information correlated with the source, we consider fixed lag side information at the decoder. We focus on the special case of perfect side information with unit lag corresponding to source coding with feedforward (the dual of channel coding with feedback).(cont.) Using duality, we develop a linear complexity algorithm which exploits the feedforward information to achieve the rate-distortion bound. The second part of the thesis focuses on channel dynamics in communication by introducing a new system model to study delay in streaming applications. We first consider an adversarial channel model where at any time the channel may suffer a burst of degraded performance (e.g., due to signal fading, interference, or congestion) and prove a coding theorem for the minimum decoding delay required to recover from such a burst. Our coding theorem illustrates the relationship between the structure of a code, the dynamics of the channel, and the resulting decoding delay. We also consider more general channel dynamics. Specifically, we prove a coding theorem establishing that, for certain collections of channel ensembles, delay-universal codes exist that simultaneously achieve the best delay for any channel in the collection. Practical constructions with low encoding and decoding complexity are described for both cases.(cont.) Finally, we also consider architectures consisting of both source and channel coding which deal with channel dynamics by spreading information over space, frequency, multiple antennas, or alternate transmission paths in a network to avoid coding delays. Specifically, we explore whether the inherent diversity in such parallel channels should be exploited at the application layer via multiple description source coding, at the physical layer via parallel channel coding, or through some combination of joint source-channel coding. For on-off channel models application layer diversity architectures achieve better performance while for channels with a continuous range of reception quality (e.g., additive Gaussian noise channels with Rayleigh fading), the reverse is true. Joint source-channel coding achieves the best of both by performing as well as application layer diversity for on-off channels and as well as physical layer diversity for continuous channels.by Emin Martinian.Ph.D

    Minimum Description Length Model Selection - Problems and Extensions

    Get PDF
    The thesis treats a number of open problems in Minimum Description Length model selection, especially prediction problems. It is shown how techniques from the "Prediction with Expert Advice" literature can be used to improve model selection performance, which is particularly useful in nonparametric settings

    Digital encoding of black and white facsimile signals

    Get PDF
    As the costs of digital signal processing and memory hardware are decreasing each year compared to those of transmission, it is increasingly economical to apply sophisticated source encoding techniques to reduce the transmission time for facsimile documents. With this intent, information lossy encoding schemes have been investigated in which the encoder is divided into two stages. Firstly, preprocessing, which removes redundant information from the original documents, and secondly, actual encoding of the preprocessed documents. [Continues.

    Design of a secure architecture for the exchange of biomedical information in m-Health scenarios

    Get PDF
    El paradigma de m-Salud (salud móvil) aboga por la integración masiva de las más avanzadas tecnologías de comunicación, red móvil y sensores en aplicaciones y sistemas de salud, para fomentar el despliegue de un nuevo modelo de atención clínica centrada en el usuario/paciente. Este modelo tiene por objetivos el empoderamiento de los usuarios en la gestión de su propia salud (p.ej. aumentando sus conocimientos, promocionando estilos de vida saludable y previniendo enfermedades), la prestación de una mejor tele-asistencia sanitaria en el hogar para ancianos y pacientes crónicos y una notable disminución del gasto de los Sistemas de Salud gracias a la reducción del número y la duración de las hospitalizaciones. No obstante, estas ventajas, atribuidas a las aplicaciones de m-Salud, suelen venir acompañadas del requisito de un alto grado de disponibilidad de la información biomédica de sus usuarios para garantizar una alta calidad de servicio, p.ej. fusionar varias señales de un usuario para obtener un diagnóstico más preciso. La consecuencia negativa de cumplir esta demanda es el aumento directo de las superficies potencialmente vulnerables a ataques, lo que sitúa a la seguridad (y a la privacidad) del modelo de m-Salud como factor crítico para su éxito. Como requisito no funcional de las aplicaciones de m-Salud, la seguridad ha recibido menos atención que otros requisitos técnicos que eran más urgentes en etapas de desarrollo previas, tales como la robustez, la eficiencia, la interoperabilidad o la usabilidad. Otro factor importante que ha contribuido a retrasar la implementación de políticas de seguridad sólidas es que garantizar un determinado nivel de seguridad implica unos costes que pueden ser muy relevantes en varias dimensiones, en especial en la económica (p.ej. sobrecostes por la inclusión de hardware extra para la autenticación de usuarios), en el rendimiento (p.ej. reducción de la eficiencia y de la interoperabilidad debido a la integración de elementos de seguridad) y en la usabilidad (p.ej. configuración más complicada de dispositivos y aplicaciones de salud debido a las nuevas opciones de seguridad). Por tanto, las soluciones de seguridad que persigan satisfacer a todos los actores del contexto de m-Salud (usuarios, pacientes, personal médico, personal técnico, legisladores, fabricantes de dispositivos y equipos, etc.) deben ser robustas y al mismo tiempo minimizar sus costes asociados. Esta Tesis detalla una propuesta de seguridad, compuesta por cuatro grandes bloques interconectados, para dotar de seguridad a las arquitecturas de m-Salud con unos costes reducidos. El primer bloque define un esquema global que proporciona unos niveles de seguridad e interoperabilidad acordes con las características de las distintas aplicaciones de m-Salud. Este esquema está compuesto por tres capas diferenciadas, diseñadas a la medidas de los dominios de m-Salud y de sus restricciones, incluyendo medidas de seguridad adecuadas para la defensa contra las amenazas asociadas a sus aplicaciones de m-Salud. El segundo bloque establece la extensión de seguridad de aquellos protocolos estándar que permiten la adquisición, el intercambio y/o la administración de información biomédica -- por tanto, usados por muchas aplicaciones de m-Salud -- pero no reúnen los niveles de seguridad detallados en el esquema previo. Estas extensiones se concretan para los estándares biomédicos ISO/IEEE 11073 PHD y SCP-ECG. El tercer bloque propone nuevas formas de fortalecer la seguridad de los tests biomédicos, que constituyen el elemento esencial de muchas aplicaciones de m-Salud de carácter clínico, mediante codificaciones novedosas. Finalmente el cuarto bloque, que se sitúa en paralelo a los anteriores, selecciona herramientas genéricas de seguridad (elementos de autenticación y criptográficos) cuya integración en los otros bloques resulta idónea, y desarrolla nuevas herramientas de seguridad, basadas en señal -- embedding y keytagging --, para reforzar la protección de los test biomédicos.The paradigm of m-Health (mobile health) advocates for the massive integration of advanced mobile communications, network and sensor technologies in healthcare applications and systems to foster the deployment of a new, user/patient-centered healthcare model enabling the empowerment of users in the management of their health (e.g. by increasing their health literacy, promoting healthy lifestyles and the prevention of diseases), a better home-based healthcare delivery for elderly and chronic patients and important savings for healthcare systems due to the reduction of hospitalizations in number and duration. It is a fact that many m-Health applications demand high availability of biomedical information from their users (for further accurate analysis, e.g. by fusion of various signals) to guarantee high quality of service, which on the other hand entails increasing the potential surfaces for attacks. Therefore, it is not surprising that security (and privacy) is commonly included among the most important barriers for the success of m-Health. As a non-functional requirement for m-Health applications, security has received less attention than other technical issues that were more pressing at earlier development stages, such as reliability, eficiency, interoperability or usability. Another fact that has contributed to delaying the enforcement of robust security policies is that guaranteeing a certain security level implies costs that can be very relevant and that span along diferent dimensions. These include budgeting (e.g. the demand of extra hardware for user authentication), performance (e.g. lower eficiency and interoperability due to the addition of security elements) and usability (e.g. cumbersome configuration of devices and applications due to security options). Therefore, security solutions that aim to satisfy all the stakeholders in the m-Health context (users/patients, medical staff, technical staff, systems and devices manufacturers, regulators, etc.) shall be robust and, at the same time, minimize their associated costs. This Thesis details a proposal, composed of four interrelated blocks, to integrate appropriate levels of security in m-Health architectures in a cost-efcient manner. The first block designes a global scheme that provides different security and interoperability levels accordingto how critical are the m-Health applications to be implemented. This consists ofthree layers tailored to the m-Health domains and their constraints, whose security countermeasures defend against the threats of their associated m-Health applications. Next, the second block addresses the security extension of those standard protocols that enable the acquisition, exchange and/or management of biomedical information | thus, used by many m-Health applications | but do not meet the security levels described in the former scheme. These extensions are materialized for the biomedical standards ISO/IEEE 11073 PHD and SCP-ECG. Then, the third block proposes new ways of enhancing the security of biomedical standards, which are the centerpiece of many clinical m-Health applications, by means of novel codings. Finally the fourth block, with is parallel to the others, selects generic security methods (for user authentication and cryptographic protection) whose integration in the other blocks results optimal, and also develops novel signal-based methods (embedding and keytagging) for strengthening the security of biomedical tests. The layer-based extensions of the standards ISO/IEEE 11073 PHD and SCP-ECG can be considered as robust, cost-eficient and respectful with their original features and contents. The former adds no attributes to its data information model, four new frames to the service model |and extends four with new sub-frames|, and only one new sub-state to the communication model. Furthermore, a lightweight architecture consisting of a personal health device mounting a 9 MHz processor and an aggregator mounting a 1 GHz processor is enough to transmit a 3-lead electrocardiogram in real-time implementing the top security layer. The extra requirements associated to this extension are an initial configuration of the health device and the aggregator, tokens for identification/authentication of users if these devices are to be shared and the implementation of certain IHE profiles in the aggregator to enable the integration of measurements in healthcare systems. As regards to the extension of SCP-ECG, it only adds a new section with selected security elements and syntax in order to protect the rest of file contents and provide proper role-based access control. The overhead introduced in the protected SCP-ECG is typically 2{13 % of the regular file size, and the extra delays to protect a newly generated SCP-ECG file and to access it for interpretation are respectively a 2{10 % and a 5 % of the regular delays. As regards to the signal-based security techniques developed, the embedding method is the basis for the proposal of a generic coding for tests composed of biomedical signals, periodic measurements and contextual information. This has been adjusted and evaluated with electrocardiogram and electroencephalogram-based tests, proving the objective clinical quality of the coded tests, the capacity of the coding-access system to operate in real-time (overall delays of 2 s for electrocardiograms and 3.3 s for electroencephalograms) and its high usability. Despite of the embedding of security and metadata to enable m-Health services, the compression ratios obtained by this coding range from ' 3 in real-time transmission to ' 5 in offline operation. Complementarily, keytagging permits associating information to images (and other signals) by means of keys in a secure and non-distorting fashion, which has been availed to implement security measures such as image authentication, integrity control and location of tampered areas, private captioning with role-based access control, traceability and copyright protection. The tests conducted indicate a remarkable robustness-capacity tradeoff that permits implementing all this measures simultaneously, and the compatibility of keytagging with JPEG2000 compression, maintaining this tradeoff while setting the overall keytagging delay in only ' 120 ms for any image size | evidencing the scalability of this technique. As a general conclusion, it has been demonstrated and illustrated with examples that there are various, complementary and structured manners to contribute in the implementation of suitable security levels for m-Health architectures with a moderate cost in budget, performance, interoperability and usability. The m-Health landscape is evolving permanently along all their dimensions, and this Thesis aims to do so with its security. Furthermore, the lessons learned herein may offer further guidance for the elaboration of more comprehensive and updated security schemes, for the extension of other biomedical standards featuring low emphasis on security or privacy, and for the improvement of the state of the art regarding signal-based protection methods and applications

    An acoustic-phonetic approach in automatic Arabic speech recognition

    Get PDF
    In a large vocabulary speech recognition system the broad phonetic classification technique is used instead of detailed phonetic analysis to overcome the variability in the acoustic realisation of utterances. The broad phonetic description of a word is used as a means of lexical access, where the lexicon is structured into sets of words sharing the same broad phonetic labelling. This approach has been applied to a large vocabulary isolated word Arabic speech recognition system. Statistical studies have been carried out on 10,000 Arabic words (converted to phonemic form) involving different combinations of broad phonetic classes. Some particular features of the Arabic language have been exploited. The results show that vowels represent about 43% of the total number of phonemes. They also show that about 38% of the words can uniquely be represented at this level by using eight broad phonetic classes. When introducing detailed vowel identification the percentage of uniquely specified words rises to 83%. These results suggest that a fully detailed phonetic analysis of the speech signal is perhaps unnecessary. In the adopted word recognition model, the consonants are classified into four broad phonetic classes, while the vowels are described by their phonemic form. A set of 100 words uttered by several speakers has been used to test the performance of the implemented approach. In the implemented recognition model, three procedures have been developed, namely voiced-unvoiced-silence segmentation, vowel detection and identification, and automatic spectral transition detection between phonemes within a word. The accuracy of both the V-UV-S and vowel recognition procedures is almost perfect. A broad phonetic segmentation procedure has been implemented, which exploits information from the above mentioned three procedures. Simple phonological constraints have been used to improve the accuracy of the segmentation process. The resultant sequence of labels are used for lexical access to retrieve the word or a small set of words sharing the same broad phonetic labelling. For the case of having more than one word-candidates, a verification procedure is used to choose the most likely one
    corecore