10,918 research outputs found
Recognizing Voice Over IP: A Robust Front-End for Speech Recognition on the World Wide Web
The Internet Protocol (IP) environment introduces two relevant sources of distortion into the speech recognition problem: lossy speech coding and packet loss. In this paper, we propose a new front-end for speech recognition over IP networks. Specifically, we suggest extracting the recognition feature vectors directly from the encoded speech (i.e., the bit stream) instead of decoding it and subsequently extracting the feature vectors. This approach offers two significant benefits. First, the recognition system is only affected by the quantization distortion of the spectral envelope, so we avoid the other sources of distortion introduced by the encoding-decoding process. Second, when packet loss occurs, our front-end becomes more effective, since it is not constrained by the error-handling mechanism of the codec. We have considered the ITU G.723.1 standard codec, which is one of the most prevalent coding algorithms in voice over IP (VoIP), and compared the proposed front-end with the conventional approach in two automatic speech recognition (ASR) tasks, namely, speaker-independent isolated digit recognition and speaker-independent continuous speech recognition. In general, our approach outperforms the conventional procedure for a variety of simulated packet loss rates. Furthermore, the improvement increases as network conditions worsen.
Impact of packet losses in scalable 3D holoscopic video coding
Holoscopic imaging has emerged as a prospective glasses-free 3D technology that provides more natural 3D viewing experiences to the end user. Additionally, holoscopic systems allow new post-production degrees of freedom, such as controlling the plane of focus or the viewing angle presented to the user. However, to successfully introduce this technology into the consumer market, a display scalable coding approach is essential to achieve backward compatibility with legacy 2D and 3D displays. Moreover, to effectively transmit 3D holoscopic content over error-prone networks, e.g., wireless networks or the Internet, error resilience techniques are required to mitigate the impact of data impairments on the quality perceived by the user. Therefore, it is essential to understand in depth the impact of packet losses on decoded video quality for the specific case of 3D holoscopic content, notably when a scalable approach is used. In this context, this paper studies the impact of packet losses when using a previously proposed three-layer display scalable 3D holoscopic video coding architecture, where each layer represents a different level of display scalability (i.e., L0 - 2D, L1 - stereo or multiview, and L2 - full 3D holoscopic). For this, a simple error concealment algorithm is used, which exploits the inter-layer redundancy between multiview and 3D holoscopic content and the inherent correlation of the 3D holoscopic content to estimate lost data. Furthermore, a study of how the 2D view generation parameters used in the lower layers influence the performance of this error concealment algorithm is also presented.
Multi-View Frame Reconstruction with Conditional GAN
Multi-view frame reconstruction is an important problem particularly when
multiple frames are missing and past and future frames within the camera are
far apart from the missing ones. Realistic coherent frames can still be
reconstructed using corresponding frames from other overlapping cameras. We
propose an adversarial approach to learn the spatio-temporal representation of
the missing frame using conditional Generative Adversarial Network (cGAN). The
conditional input to each cGAN is the preceding or following frames within the
camera or the corresponding frames in other overlapping cameras, all of which
are merged together using a weighted average. Representations learned from
frames within the camera are given more weight compared to the ones learned
from other cameras when they are close to the missing frames and vice versa.
Experiments on two challenging datasets demonstrate that our framework produces
comparable results with the state-of-the-art reconstruction method in a single
camera and achieves promising performance in multi-camera scenario.
Comment: 5 pages, 4 figures, 3 tables, Accepted at IEEE Global Conference on
Signal and Information Processing, 201
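The weighted merge described in the abstract above can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so the linear schedule below, the function names `within_camera_weight` and `merge_representations`, and the `max_gap` parameter are all assumptions made for illustration only; representations are modelled as plain lists of floats rather than learned feature maps.

```python
from typing import List, Sequence


def within_camera_weight(gap: int, max_gap: int) -> float:
    """Assumed linear schedule (not from the paper): the weight given to the
    within-camera representation falls from 1.0 at gap 0 to 0.0 at max_gap."""
    if max_gap <= 0:
        raise ValueError("max_gap must be positive")
    return max(0.0, 1.0 - gap / max_gap)


def merge_representations(
    within: Sequence[float],
    across: Sequence[float],
    gap: int,
    max_gap: int,
) -> List[float]:
    """Weighted average of two representations of the missing frame.

    within: representation learned from frames in the same camera.
    across: representation learned from overlapping cameras.
    gap:    temporal distance between the missing frame and the nearest
            available frame in the same camera.
    """
    if len(within) != len(across):
        raise ValueError("representations must have the same length")
    w = within_camera_weight(gap, max_gap)
    # Nearby within-camera frames dominate; distant ones defer to other cameras.
    return [w * a + (1.0 - w) * b for a, b in zip(within, across)]


if __name__ == "__main__":
    near = merge_representations([1.0, 1.0], [0.0, 0.0], gap=1, max_gap=4)
    far = merge_representations([1.0, 1.0], [0.0, 0.0], gap=4, max_gap=4)
    print(near, far)
```

With a small gap the merged result stays close to the within-camera representation, and once the gap reaches `max_gap` it relies entirely on the other cameras, which mirrors the "and vice versa" behaviour stated in the abstract.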
Impact of packet losses in scalable light field video coding
Light field imaging technology has recently been attracting the attention of the research community and the industry. However, to effectively transmit light field content to the end user over error-prone networks, e.g., wireless networks or the Internet, error resilience techniques are required to mitigate the impact of data impairments on the quality perceived by the user. In this context, this chapter analyzes the impact of packet losses when using a three-layer display scalable light field video coding architecture, which has been presented in Chap. 6. For this, a simple error concealment algorithm is used, which exploits the inter-layer redundancy between multiview and light field content and the inherent correlation of the light field content to estimate lost data. Furthermore, a study of how the 2D view generation parameters used in the lower layers influence the performance of this error concealment algorithm is also presented.
Review on Common Steganography Techniques
Various parties are keen to preserve the confidentiality of their information and to protect it from competing or hostile parties, who are equally keen to access that information by all available means. Because encryption reveals itself by producing incomprehensible text that arouses suspicion, some prefer to avoid suspicion altogether by hiding the information in a medium such as text or an image, so that what is sent and circulated appears natural and free of unintelligible signs or symbols, as if it carried no additional information. This paper presents a review of the techniques used to hide data in images, one of the most common concealment techniques.
- …