
    The listening room, Camden Arts Centre

    This version of The Listening Room is minimal: one microphone and two loudspeakers in the Reading Room of Camden Arts Centre, a relatively small space for this work. The Reading Room is the former entrance to the building; the entrance has been bricked over, creating three highly reflective wall surfaces in the room. The room resonance is so pronounced that my usual placement of microphone and speakers would tend to fix on one pitch and stay there. To introduce more of the available frequencies from the space, I left the Reading Room table in place as an additional reflective element and used an asymmetric placement of loudspeakers, one at the side and one under the table.

    CHORUS Deliverable 3.4: Vision Document

    The goal of the CHORUS Vision Document is to create a high-level vision of audio-visual search engines in order to guide future R&D work in this area and to highlight trends and challenges in this domain. The vision of CHORUS is strongly connected to the CHORUS Roadmap Document (D2.3). A concise document integrating the outcomes of the two deliverables will be prepared for the end of the project (NEM Summit).

    MLLP-VRAIN Spanish ASR Systems for the Albayzín-RTVE 2020 Speech-to-Text Challenge: Extension

    [EN] This paper describes the automatic speech recognition (ASR) systems built by the MLLP-VRAIN research group of Universitat Politècnica de València for the Albayzín-RTVE 2020 Speech-to-Text Challenge, and includes an extension of the work consisting of building and evaluating equivalent systems under the closed data conditions from the 2018 challenge. The primary system (p-streaming_1500ms_nlt) was a hybrid ASR system using streaming one-pass decoding with a context window of 1.5 seconds. This system achieved 16.0% WER on the test-2020 set. We also submitted three contrastive systems. From these, we highlight the system c2-streaming_600ms_t which, following a configuration similar to that of the primary system but with a smaller context window of 0.6 s, scored 16.9% WER on the same test set, with a measured empirical latency of 0.81 ± 0.09 s (mean ± stdev). That is, we obtained state-of-the-art latencies for high-quality automatic live captioning with a small WER degradation of 6% relative. As an extension, the equivalent closed-condition systems obtained 23.3% WER and 23.5% WER, respectively. When evaluated with an unconstrained language model, we obtained 19.9% WER and 20.4% WER; i.e., not far behind the top-performing systems with only 5% of the full acoustic data and with the extra ability of being streaming-capable. Indeed, all of these streaming systems could be put into production environments for automatic captioning of live media streams.

    The research leading to these results has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreements no. 761758 (X5Gon) and 952215 (TAILOR), and Erasmus+ Education programme under grant agreement no. 20-226-093604-SCH (EXPERT); the Government of Spain's grant RTI2018-094879-B-I00 (Multisub) funded by MCIN/AEI/10.13039/501100011033 & "ERDF A way of making Europe", and FPU scholarships FPU14/03981 and FPU18/04135; the Generalitat Valenciana's research project Classroom Activity Recognition (ref. PROMETEO/2019/111), and predoctoral research scholarship ACIF/2017/055; and the Universitat Politècnica de València's PAID-01-17 R&D support programme.

    Baquero-Arnal, P.; Jorge-Cano, J.; Giménez Pastor, A.; Iranzo-Sánchez, J.; Pérez-González De Martos, AM.; Garcés Díaz-Munío, G.; Silvestre Cerdà, JA.... (2022). MLLP-VRAIN Spanish ASR Systems for the Albayzín-RTVE 2020 Speech-to-Text Challenge: Extension. Applied Sciences. 12(2):1-14. https://doi.org/10.3390/app12020804
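
The "6% relative" degradation quoted in the abstract above can be checked from its two WER figures. The following is a minimal sketch of that arithmetic only; the function name `relative_change` is an illustrative assumption and does not come from the paper.

```python
def relative_change(baseline: float, contrast: float) -> float:
    """Relative change of `contrast` with respect to `baseline`."""
    return (contrast - baseline) / baseline


# WER values (in %) as stated in the abstract.
primary_wer = 16.0   # primary system, 1.5 s context window
contrast_wer = 16.9  # contrastive system, 0.6 s context window

# Relative WER degradation of the low-latency system versus the primary one.
degradation = relative_change(primary_wer, contrast_wer)
print(f"relative WER degradation: {degradation:.1%}")
```

The result is about 5.6%, which matches the abstract's figure of 6% once rounded to the nearest percent.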

    Transmissão de video melhorada com recurso a SDN em ambientes baseados em cloud

    The rapid technological development of computing has opened the way for provisioning a variety of services and new online entertainment services, which have expanded significantly with the growth in social media applications and in the number of users. This expansion has posed an additional challenge to Internet Service Providers (ISPs) in terms of network management, equipment and the efficiency of service delivery. New concepts and techniques have been developed to offer innovative solutions, such as SDN for network management, virtualization for optimal resource utilization, and others like cloud computing and network function virtualization. This dissertation aims to manage live video streaming in the network automatically by adding an architecture to the virtual network environment that helps to filter video packets from the remaining traffic into a dedicated tunnel, which is handled with higher priority in order to provide better service to customers. Alongside the dedicated architecture, a monitoring application integrated into the system was used to detect the video packets and notify the SDN server of the presence of video in the network. The Portuguese-language abstract adds that Python scripts were used to detect the video packets and inject new rules into the SDN controller that monitors traffic across the network.

    Master's degree in Computer and Telematics Engineering

    Design and Implementation of a Communication Protocol to Improve Multimedia QoS and QoE in Wireless Ad Hoc Networks

    [EN] This dissertation addresses the problem of multimedia delivery over multi-hop ad hoc wireless networks, and especially over wireless sensor networks. Due to their low power consumption, low processing capacity and low memory capacity, these networks have major difficulties in achieving the optimal quality levels demanded by end users in such communications. In the first part of this work, a study was carried out to determine the behavior of a variety of multimedia streams and how they are affected by network conditions when they are transmitted over topologies formed by devices of different technologies in multi-hop wireless ad hoc mode. To achieve this goal, we performed experimental tests using a test bench that combines the main codecs used in audio and video streaming over IP networks with different sound and video captures representing the characteristic patterns of multimedia services such as phone calls, video communications, IPTV and video on demand (VOD). With the information gathered in the laboratory, we were able to establish the correlation between the induced changes in the physical and logical topology and the network parameters that measure the quality of service (QoS) of a multimedia transmission, such as latency, jitter or packet loss. At this stage of the investigation, a study was performed to determine the state of the art of the proposed protocols, algorithms, and practical implementations that have been explicitly developed to optimize multimedia transmission over wireless ad hoc networks, especially in ad hoc networks using clusters of nodes distributed over a geographic area and in wireless sensor networks. The next step of this research was the development of an algorithm focused on the logical organization of clusters formed by nodes capable of adapting to the circumstances of real-time traffic. The stated goal was to achieve maximum utilization of the resources offered by the set of nodes that forms the network, allowing all types of content to be sent through them simultaneously, reliably and efficiently, and mixing conventional IP data traffic with multimedia traffic that has stringent QoS and QoE requirements. Using the information gathered in the previous phase, we developed a network architecture that improves overall network performance and multimedia streaming. In parallel, a communication protocol was designed and programmed that allows the proposal to be implemented and its operation tested on real network infrastructures. In the last phase of this thesis, we focused our work on sending multimedia in wireless sensor networks (WSN). Based on the above results, we adapted both the architecture and the communication protocol for this particular type of network, whose use has grown considerably in recent years.

    Díaz Santos, JR. (2016). Design and Implementation of a Communication Protocol to Improve Multimedia QoS and QoE in Wireless Ad Hoc Networks [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/62162

    Space station Simulation Computer System (SCS) study for NASA/MSFC. Volume 3: Refined conceptual design report

    The results of the refined conceptual design phase (task 5) of the Simulation Computer System (SCS) study are reported. The SCS is the computational portion of the Payload Training Complex (PTC), providing simulation-based training on payload operations of the Space Station Freedom (SSF). In task 4 of the SCS study, the range of architectures suitable for the SCS was explored. The identified system architectures, along with their relative advantages and disadvantages for the SCS, were presented in the Conceptual Design Report. Six integrated designs, combining the most promising features from the architectural formulations, were additionally identified in the report. The six integrated designs were evaluated further to distinguish the more viable designs to be refined as conceptual designs. The three designs that were selected represent distinct approaches to achieving a capable and cost-effective SCS configuration for the PTC. Here, the results of task 4 (input to this task) are briefly reviewed. Then, prior to describing individual conceptual designs, the PTC facility configuration and the SSF systems architecture that must be supported by the SCS are reviewed. Next, basic features of SCS implementation that have been incorporated into all selected SCS designs are considered. The details of the individual SCS designs are then presented before making a final comparison of the three designs.

    Multipoint connection management in ATM networks


    Evaluating Novel Speech Transcription Architectures on the Spanish RTVE2020 Database

    This work presents three novel speech recognition architectures evaluated on the Spanish RTVE2020 dataset, employed as the main evaluation set in the Albayzín S2T Transcription Challenge 2020. The main objective was to improve the performance of the systems previously submitted by the authors to the challenge, in which the primary system ranked second. The novel systems are based on both DNN-HMM and E2E acoustic models, for which fully and self-supervised learning methods were included. As a result, the new speech recognition engines clearly outperformed the initial systems, improving on the previous best WER of 19.27 with a new best of 17.60, achieved by the DNN-HMM based system. This work therefore describes an interesting benchmark of the latest acoustic models over a highly challenging dataset, and identifies the most suitable ones depending on the expected quality, the available resources and the required latency.