12 research outputs found

    Enhanced quality reconstruction of erroneous video streams using packet filtering based on non-desynchronizing bits and UDP checksum-filtered list decoding

    The latest video coding standards, such as H.264 and H.265, are extremely vulnerable in error-prone networks. Due to their sophisticated spatial and temporal prediction tools, the effect of an error is not limited to the erroneous area but can easily propagate spatially to the neighboring blocks and temporally to the following frames. Thus, reconstructed video packets at the decoder side may exhibit significant visual quality degradation. Error concealment and error correction are two mechanisms that have been developed to improve the quality of reconstructed frames in the presence of errors. In most existing error concealment approaches, the corrupted packets are ignored and only the correctly received information of the surrounding areas (spatially and/or temporally) is used to recover the erroneous area. This is because there is no perfect error detection mechanism to identify correctly received blocks within a corrupted packet, and because of the desynchronization problem caused by transmission errors on the variable-length code (VLC). However, as many studies have shown, the corrupted packets may contain valuable information that can be used to adequately reconstruct the lost area (e.g. when the error is located at the end of a slice). On the other hand, error correction approaches, such as list decoding, exploit the corrupted packets to generate several candidate transmitted packets from the corrupted received packet. They then select, among these candidates, the one with the highest likelihood of being the transmitted packet based on the available soft information (e.g. the log-likelihood ratio (LLR) of each bit). However, list decoding approaches suffer from a large solution space of candidate transmitted packets. This is worsened when the soft information is not available at the application layer, which is the more realistic scenario in practice.
Indeed, since it is unknown which bits have higher probabilities of having been modified during transmission, the candidate received packets cannot be ranked by likelihood. In this thesis, we propose various strategies to improve the quality of reconstructed packets which have been lightly damaged during transmission (e.g. at most a single error per packet). We first propose a simple but efficient mechanism to filter damaged packets in order to retain those likely to lead to a very good reconstruction and discard the others. This method can be used as a complement to most existing concealment approaches to enhance their performance. The method is based on the novel concept of non-desynchronizing bits (NDBs). In the context of an H.264 context-adaptive variable-length coding (CAVLC) coded sequence, an NDB is defined as a bit whose inversion neither causes desynchronization at the bitstream level nor changes the number of decoded macroblocks. We establish that, on typical coded bitstreams, the NDBs constitute roughly one-third (about 30%) of a bitstream, and that the effect on visual quality of flipping one of them in a packet is mostly insignificant. In most cases (90%), the quality of the reconstructed packet when modifying an individual NDB is almost the same as that of the intact one. We thus demonstrate that keeping, under certain conditions, a corrupted packet as a candidate for the lost area can provide better visual quality compared to the concealment approaches. We finally propose a non-desync-based decoding framework, which retains a corrupted packet under the condition that it does not cause desynchronization and does not alter the number of expected macroblocks. The framework can be combined with most current concealment approaches.
The proposed approach is compared to the frame copy (FC) concealment of the Joint Model (JM) software (JM-FC) and to a state-of-the-art concealment approach using the spatiotemporal boundary matching algorithm (STBMA). In the case of one bit in error, it provides average gains of 3.5 dB and 1.42 dB over them, respectively. We then propose a novel list decoding approach called checksum-filtered list decoding (CFLD), which can correct a packet at the bitstream level by exploiting the receiver-side user datagram protocol (UDP) checksum value. The proposed approach is able to identify the possible locations of errors by analyzing the pattern of the calculated UDP checksum on the corrupted packet. This makes it possible to considerably reduce the number of candidate transmitted packets in comparison to conventional list decoding approaches, especially when no soft information is available. When a packet composed of N bits contains a single bit in error, instead of considering N candidate packets, as is the case in conventional list decoding approaches, the proposed approach considers approximately N/32 candidate packets, leading to a 97% reduction in the number of candidates. This reduction can increase to 99.6% in the case of a two-bit error. The method’s performance is evaluated using H.264 and high efficiency video coding (HEVC) test model software. We show that, in the case of H.264 coded sequences, the CFLD approach is able to correct the packet 66% of the time on average. It also offers a 2.74 dB gain over JM-FC and 1.14 dB and 1.42 dB gains over STBMA and hard output maximum likelihood decoding (HO-MLD), respectively. Additionally, in the case of HEVC, the CFLD approach corrects the corrupted packet 91% of the time, and offers 2.35 dB and 4.97 dB gains over our implementation of FC concealment in HEVC test model software (HM-FC) in class B (1920×1080) and C (832×480) sequences, respectively.
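
The N/32 figure in the abstract above can be made plausible with a small sketch. This is not the thesis implementation; it is a minimal illustration under simplifying assumptions: the checksum covers only the payload (the real UDP checksum also covers a pseudo-header and the UDP header), the checksum field itself arrives intact, and exactly one payload bit is flipped. The function names `ones_complement_sum`, `single_bit_candidates`, and `flip_bit` are hypothetical. A single flipped bit changes the 16-bit one's-complement sum by plus or minus a power of two, which fixes the bit column within the 16-bit words (1 in 16 positions) and the value the received bit must have (about 1 in 2), leaving roughly N/32 candidates, i.e. a reduction of 1 - 1/32 ≈ 97%.

```python
import random


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum of data, reduced modulo 0xFFFF."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # End-around-carry addition is arithmetic modulo 0xFFFF
    # (0 and 0xFFFF represent the same value).
    return total % 0xFFFF


def flip_bit(data: bytes, bit: int) -> bytes:
    """Return a copy of data with the given bit (MSB-first index) inverted."""
    out = bytearray(data)
    out[bit // 8] ^= 0x80 >> (bit % 8)
    return bytes(out)


def single_bit_candidates(received: bytes, sent_sum: int) -> list[int]:
    """Bit indices whose inversion makes `received` match `sent_sum`.

    `sent_sum` is the sender's one's-complement sum; in a real receiver it
    would be derived from the checksum field (assumed intact here).
    """
    received_sum = ones_complement_sum(received)
    candidates = []
    for bit in range(len(received) * 8):
        byte_index, offset = divmod(bit, 8)
        mask = 0x80 >> offset
        # Even-indexed bytes form the high byte of each 16-bit word.
        weight = mask << 8 if byte_index % 2 == 0 else mask
        # Inverting a 1 lowers the sum by `weight`; inverting a 0 raises it.
        delta = -weight if received[byte_index] & mask else weight
        if (received_sum + delta - sent_sum) % 0xFFFF == 0:
            candidates.append(bit)
    return candidates


if __name__ == "__main__":
    rng = random.Random(0)
    payload = bytes(rng.randrange(256) for _ in range(1024))
    corrupted = flip_bit(payload, 4321)
    found = single_bit_candidates(corrupted, ones_complement_sum(payload))
    print(len(found), "candidates out of", len(corrupted) * 8, "bits")
```

A list decoder built on this filter would still have to decode each surviving candidate and pick the most plausible one; the sketch only shows why the checksum shrinks the search space.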

    Bitstream-Corrupted Video Recovery: A Novel Benchmark Dataset and Method

    The past decade has witnessed great strides in video recovery by specialist technologies, like video inpainting, completion, and error concealment. However, they typically simulate the missing content by manually designed error masks, thus failing to fill in the realistic video loss in video communication (e.g., telepresence, live streaming, and internet video) and multimedia forensics. To address this, we introduce the bitstream-corrupted video (BSCV) benchmark, the first benchmark dataset with more than 28,000 video clips, which can be used for bitstream-corrupted video recovery in the real world. The BSCV is a collection of 1) a proposed three-parameter corruption model for video bitstream, 2) a large-scale dataset containing rich error patterns, multiple corruption levels, and flexible dataset branches, and 3) a plug-and-play module in video recovery framework that serves as a benchmark. We evaluate state-of-the-art video inpainting methods on the BSCV dataset, demonstrating existing approaches' limitations and our framework's advantages in solving the bitstream-corrupted video recovery problem. The benchmark and dataset are released at https://github.com/LIUTIGHE/BSCV-Dataset. Comment: Accepted by NeurIPS Dataset and Benchmark Track 202

    Performance of an error detection mechanism for damaged H.264/AVC sequences

    In mobile video applications, the error-prone wireless connection can cause the stream to be incorrectly received. An error will propagate both spatially (in the current frame) and temporally (to the following frames). This work presents the implementation of an error detection and concealment mechanism for H.264/AVC encoded video and the design of a quality estimator. The detection is performed by means of two interacting strategies. At the bit level, the syntax of the received bitstream is analyzed in order to detect inconsistent or illegal codewords. At the pixel level, the remaining visual impairments in the decoded frame are detected. Given the information output by the decoder, the quality estimator can estimate the subjective quality of the decoded H.264 video. The detection and concealment are implemented in the H.264/AVC decoder, without causing transmission overhead. Simulations show improvements in both objective (luminance peak signal-to-noise ratio) and subjective (mean opinion score) tests with respect to the common slice rejection mechanism. The quality estimator is only a Matlab design and is not implemented in the decoder.
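
For readers unfamiliar with the objective metric named above, the following is a minimal sketch of the standard luminance PSNR definition, PSNR = 10 log10(peak^2 / MSE), computed over luma samples. It assumes 8-bit samples (peak 255) passed as flat sequences; the function name `luma_psnr` is hypothetical and the code is not taken from the paper.

```python
import math


def luma_psnr(reference, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length luma sample sequences."""
    if len(reference) != len(decoded) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all luma samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / mse)
```

A gain such as those quoted in these abstracts is then the difference between the PSNR values obtained by two methods on the same reference sequence.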

    Acta Cybernetica : Volume 24. Number 4.


    Augmentation of Brain Function: Facts, Fiction and Controversy. Volume III: From Clinical Applications to Ethical Issues and Futuristic Ideas

    The final volume in this tripartite series on Brain Augmentation is entitled “From Clinical Applications to Ethical Issues and Futuristic Ideas”. Many of the articles within this volume deal with translational efforts taking the results of experiments on laboratory animals and applying them to humans. In many cases, these interventions are intended to help people with disabilities in such a way so as to either restore or extend brain function. Traditionally, therapies in brain augmentation have included electrical and pharmacological techniques. In contrast, some of the techniques discussed in this volume add specificity by targeting select neural populations. This approach opens the door to where and how to promote the best interventions. Along the way, results have empowered the medical profession by expanding their understanding of brain function. Articles in this volume relate novel clinical solutions for a host of neurological and psychiatric conditions such as stroke, Parkinson’s disease, Huntington’s disease, epilepsy, dementia, Alzheimer’s disease, autism spectrum disorders (ASD), traumatic brain injury, and disorders of consciousness. In disease, symptoms and signs denote a departure from normal function. Brain augmentation has now been used to target both the core symptoms that provide specificity in the diagnosis of a disease, as well as other constitutional symptoms that may greatly handicap the individual. The volume provides a report on the use of repetitive transcranial magnetic stimulation (rTMS) in ASD with reported improvements of core deficits (i.e., executive functions). TMS in this regard departs from the present-day trend towards symptomatic treatment that leaves unaltered the root cause of the condition. In diseases, such as schizophrenia, brain augmentation approaches hold promise to avoid lengthy pharmacological interventions that are usually riddled with side effects or those with limiting returns as in the case of Parkinson’s disease. 
Brain stimulation can also be used to treat auditory verbal hallucination, visuospatial (hemispatial) neglect, and pain in patients suffering from multiple sclerosis. The brain acts as a telecommunication transceiver wherein different frequency bands (brainwave oscillations) transmit information. Their baseline levels correlate with certain behavioral states. The proper integration of brain oscillations provides for the phenomenon of binding and central coherence. Brain augmentation may foster the normalization of brain oscillations in nervous system disorders. These techniques hold the promise of being applied remotely (under the supervision of medical personnel), thus overcoming the obstacle of travel in order to obtain healthcare. At present, traditional thinking would argue the possibility of synergism among different modalities of brain augmentation as a way of increasing their overall effectiveness and improving therapeutic selectivity. Thinking outside of the box would also provide for the implementation of brain-to-brain interfaces where techniques, proper to artificial intelligence, could allow us to surpass the limits of natural selection or enable communications between several individual brains sharing memories, or even a global brain capable of self-organization. Not all brains are created equal. Brain stimulation studies suggest large individual variability in response that may affect overall recovery/treatment, or modify desired effects of a given intervention. The subject’s age, gender, and hormonal levels may affect an individual’s cortical excitability. In addition, this volume discusses the role of social interactions in the operations of augmenting technologies. Augmenting methods could also be applied to modulate consciousness, even though its neural mechanisms are poorly understood. Finally, this volume should be taken as a debate on social, moral and ethical issues on neurotechnologies.
Brain enhancement may transform the individual into someone or something else. These techniques bypass the usual routes of accommodation to environmental exigencies that exalted our personal fortitude: learning, exercising, and diet. This will allow humans to preselect desired characteristics and realize consequent rewards without having to overcome adversity through more laborious means. The concern is that humans may be playing God, and that the gap in social equity may expand if brain enhancements are selectively available to wealthier individuals. These issues are discussed by a number of articles in this volume. Also discussed are the relationship between the diminishment and enhancement following the application of brain-augmenting technologies, the problem of “mind control” with BMI technologies, free will, the duty to use cognitive enhancers in high-responsibility professions, determining the population of people in need of brain enhancement, informed public policy, cognitive biases, and the hype caused by the development of brain-augmenting approaches.

    The functional Role of Gamma-Band Synchronization in selective Routing and Network Configuration within the visual Cortex

    The first psychophysical experiments, performed more than 100 years ago by the German psychologist and physicist Hermann von Helmholtz, showed that visual attention is a central component of perception and, therefore, of substantial relevance for successful behavior. In the decades that followed, much research has been performed to investigate how attention modulates neuronal activity in order to explain the effects of attention on behavior and perception. A well-described finding is that visual neurons responding to the same attended object synchronize their activity in the gamma-frequency range (30-100 Hz). In chapter 2, I present the results of an experiment that was designed to find evidence for a causal role of gamma-band synchronization in selective information routing and processing. The underlying idea is that neurons that synchronize their activity deliver their respective outputs (spikes) more precisely at times when the receiving neuron is sensitive to them, i.e., the incoming spikes are more likely to evoke spikes in the receiving neuron. The selective synchronization between input and receiver neurons representing an attended and therefore relevant object could constitute a powerful selection mechanism. To test this, I recorded neuronal activity in area V4 of two macaque monkeys while applying single electrical pulses to neurons located in area V2. Those V2 neurons delivered afferent input to the recorded V4 population, including the electrically evoked spikes. By relating the effects of these electrically evoked spikes to the gamma oscillation in V4, I could show that the impact of stimulation on behavior and neuronal activity is causally dependent on the gamma phase. In chapter 3, I investigated whether the effective processing of a given object requires a specific level of gamma-band synchronization within a local neuronal population.
I hypothesized that different objects require different combinations of neurons of the same population to be functionally coupled with one another for effective processing. Furthermore, I hypothesized that this dynamic establishment of functional connections is implemented by gamma-band synchronization, resulting in a specific level of gamma-band synchronization for a specific stimulus. I tested these predictions by first recording neuronal activity in area V4 and quantifying the level of gamma-synchronization in response to two different single stimuli, which had to be attended. Second, I compared these levels to the level of gamma-synchronization when neurons received input from both stimuli simultaneously and one of them was attended. The level of gamma-synchronization was almost 'as if' the attended stimulus was presented alone, strongly indicating that the processing of this stimulus requires this specific gamma-synchronization level. Chapter 4 describes and characterizes a method that I used for analyzing multi-unit activity in area V4. It does not rely on setting an amplitude threshold for separating spikes from background noise, as standard procedures do. Thus, this measure takes the entire spike activity into account, and I therefore refer to it as ESA. I used semi-chronically recorded data from five macaque monkeys to quantify the sensitivity of the ESA for detecting neuronal responses. The ESA signal was significantly more sensitive than the standard procedures, especially for data with a low signal-to-noise ratio, while preserving information about receptive field sizes and orientation selectivity of the underlying neuronal population. The fifth chapter describes a method for offline stimulation-artifact removal and restoration of the original broadband neuronal signal.
I could show that, in contrast to existing methods, the procedure described here does not disturb the original signal and therefore allows for analysis of neuronal activity even shortly after electrical stimulation. In summary, the results presented here give further insight into the functional roles of gamma-band synchronization. I could show that (1) gamma-phase synchronization plays a causal role in selective information processing and routing, and (2) a specific pattern of intra-areal gamma-synchronization is required for effective processing of a given stimulus.

    Phase entrainment and perceptual cycles in audition and vision

    Recent research indicates fundamental differences between the auditory and visual systems: Whereas the visual system seems to sample its environment, cycling between "snapshots" at discrete moments in time (creating perceptual cycles), most attempts at discovering discrete perception in the auditory system failed. Here, we show in two psychophysical experiments that subsampling the very input to the visual and auditory systems is indeed more disruptive for audition; however, the existence of perceptual cycles in the auditory system is possible if they operate on a relatively high level of auditory processing. 
Moreover, we suggest that the auditory system, due to the rapidly fluctuating nature of its input, might rely to a particularly strong degree on phase entrainment, the alignment between neural activity and the rhythmic structure of its input: By using the low and high excitability phases of neural oscillations, the auditory system might actively control the timing of its "snapshots" and thereby amplify relevant information whereas irrelevant events are suppressed. Not only do our results suggest that the oscillatory phase has important consequences on how simultaneous auditory inputs are perceived; additionally, we can show that phase entrainment to speech sound does entail an active high-level mechanism. We do so by using specifically constructed speech/noise sounds in which fluctuations in low-level features (amplitude and spectral content) of speech have been removed, but intelligibility and high-level features (including, but not restricted to phonetic information) have been conserved. We demonstrate, in several experiments, that the auditory system can entrain to these stimuli, as both perception (the detection of a click embedded in the speech/noise stimuli) and neural oscillations (measured with electroencephalography, EEG, and in intracranial recordings in primary auditory cortex of the monkey) follow the conserved "high-level" rhythm of speech. Taken together, the results presented here suggest that, not only in vision, but also in audition, neural oscillations are an important tool for the discretization and processing of the brain's input. However, there seem to be fundamental differences between the two systems: In contrast to the visual system, it is critical for the auditory system to adapt (via phase entrainment) to its environment, and input subsampling is done most likely on a hierarchically high level of stimulus processing

    Choreographing the extended agent : performance graphics for dance theater

    Thesis (Ph.D.), Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2005. Author: Marc Downie. Includes bibliographical references (v. 2, leaves 448-458). The marriage of dance and interactive image has been a persistent dream over the past decades, but reality has fallen far short of potential for both technical and conceptual reasons. This thesis proposes a new approach to the problem and lays out the theoretical, technical and aesthetic framework for the innovative art form of digitally augmented human movement. I will use as example works a series of installations, digital projections and compositions, each of which contains a choreographic component, either through collaboration with a choreographer directly or by the creation of artworks that automatically organize and understand purely virtual movement. These works lead up to two unprecedented collaborations with two of the greatest choreographers working today: new pieces that combine dance and interactive projected light using real-time motion capture live on stage. The existing field of "dance technology" is one with many problems. This is a domain with many practitioners, few techniques and almost no theory; a field that is generating "experimental" productions with every passing week, has literally hundreds of citable pieces and no canonical works; a field that is oddly disconnected from modern dance's history, pulled between the practical realities of the body and those of computer art, and has no influence on the prevailing digital art paradigms that it consumes.
This thesis will seek to address each of these problems: by providing techniques and a basis for "practical theory"; by building artworks with resources and people that have never previously been brought together, in theaters and in front of audiences previously inaccessible to the field; and by proving through demonstration that a profitable and important dialogue between digital art and the pioneers of modern dance can in fact occur. The methodological perspective of this thesis is that of biologically inspired, agent-based artificial intelligence, taken to a high degree of technical depth. The representations, algorithms and techniques behind such agent architectures are extended and pushed into new territory for both interactive art and artificial intelligence. In particular, this thesis will focus on the control structures and the rendering of the extended agents' bodies, the tools for creating complex agent-based artworks in intense collaborative situations, and the creation of agent structures that can span live image and interactive sound production. Each of these parts becomes an element of what it means to "choreograph" an extended agent for live performance.

    Enhancing memory-related sleep spindles through learning and electrical brain stimulation

    Sleep has been strongly implicated in mediating memory consolidation through hippocampal-neocortical communication. Evidence suggests offline processing of encoded information in the brain during slow wave sleep (SWS), specifically during slow oscillations and spindles. In this work, we used active exploration and learning tasks to study post-experience sleep spindle density changes in rats. Experiences led to subsequent changes in sleep spindles, but the strength and timing of the effect were task-dependent. Brain stimulation in humans and rats has been shown to enhance memory consolidation. However, the exact stimulation parameters that lead to the strongest memory enhancement have not been fully explored. We tested the efficacy of both cortical sinusoidal direct current stimulation and intracortical pulse stimulation in enhancing slow oscillations and spindle density. Pulse stimulation reliably evoked state-dependent slow oscillations and spindles during SWS with increased hippocampal ripple-spindle coupling, demonstrating potential for memory enhancement.