    Cross-Modal Variational Inference For Bijective Signal-Symbol Translation

    Extraction of symbolic information from signals is an active field of research enabling numerous applications, especially in the Music Information Retrieval domain. This complex task, which is also related to other topics such as pitch extraction or instrument recognition, is a demanding subject that has given rise to numerous approaches, mostly based on advanced signal processing algorithms. However, these techniques are often non-generic, allowing the extraction of definite physical properties of the signal (pitch, octave) but not arbitrary vocabularies or more general annotations. In addition, these techniques are one-sided: they can extract symbolic data from an audio signal but cannot perform the reverse process of symbol-to-signal generation. In this paper, we propose a bijective approach for signal/symbol translation by turning this problem into a density estimation task over signal and symbolic domains, both considered as related random variables. We estimate this joint distribution with two different variational auto-encoders, one for each domain, whose inner representations are forced to match with an additive constraint, allowing both models to learn and generate separately while allowing signal-to-symbol and symbol-to-signal inference. In this article, we test our models on pitch, octave and dynamics symbols, which constitute a fundamental step towards music transcription and label-constrained audio generation. In addition to its versatility, this system is rather light during training and generation while allowing several interesting creative uses that we outline at the end of the article.

    Active Predicting Coding: Brain-Inspired Reinforcement Learning for Sparse Reward Robotic Control Problems

    In this article, we propose a backpropagation-free approach to robotic control through the neuro-cognitive computational framework of neural generative coding (NGC), designing an agent built completely from powerful predictive coding/processing circuits that facilitate dynamic, online learning from sparse rewards, embodying the principles of planning-as-inference. Concretely, we craft an adaptive agent system, which we call active predictive coding (ActPC), that balances an internally generated epistemic signal (meant to encourage intelligent exploration) with an internally generated instrumental signal (meant to encourage goal-seeking behavior). The agent ultimately learns to control various simulated robotic systems as well as a complex robotic arm in a realistic robotics simulator, the Surreal Robotics Suite, on the block lifting and can pick-and-place problems. Notably, our experimental results demonstrate that our proposed ActPC agent performs well in the face of sparse (extrinsic) reward signals and is competitive with or outperforms several powerful backprop-based RL approaches. Comment: Contains appendix with pseudocode and additional detail.

    A Virtual Environment for Remote Testing of Complex Systems

    Complex systems, realized by integrating several components or subsystems, pose specific problems for simulation environments. It is desirable to simulate the complex system as a whole, and not component by component, since the operation of each part depends on the surrounding system and early verification can prevent damage and save time on modifications. The availability of detailed and validated models of the individual parts is therefore critical, but this may be difficult to achieve. In industrial applications, where a system can be a mix of devices produced by different manufacturers, the physical device may not be accessible to the modeler because of proprietary or safety concerns. Starting from this point, we consider the idea of creating a virtual environment able to test the real individual component remotely, employing simulators with remote signal processing capability. In this paper, a methodology for remote model validation is presented. The effectiveness of the approach is experimentally verified locally and remotely. For the remote testing, in particular, the physical device under test is located at the Politecnico di Milano, Italy, and the Virtual Test Bed model is located at the University of South Carolina.

    Tracking mobile targets through Wireless Sensor Networks

    In recent years, advances in signal processing have led to small, low-power, inexpensive Wireless Sensor Networks (WSNs). Signal processing in WSNs differs from that in traditional wireless networks in two critical aspects: firstly, it is performed in a fully distributed manner; secondly, due to the limited computation capabilities of sensor nodes, it is essential to develop energy- and bandwidth-efficient signal processing algorithms. Target localisation and tracking problems in WSNs have received considerable attention recently, driven by the need to achieve higher localisation accuracy, lower cost, and the smallest form factor. Received Signal Strength (RSS) based localisation techniques are at the forefront of tracking research applications. Since tracking algorithms have been attracting research and development attention recently, a prolific literature and a wide range of proposed approaches have emerged. This thesis is devoted to discussing the existing WSN-based localisation and tracking approaches. This thesis includes five studies. The first study leads to the design and implementation of a triangulation-based localisation approach using the RSS technique for indoor tracking applications. The presented work achieves low localisation error in complex environments by predicting the environmental characteristics among beacon nodes. The second study concentrates on investigating a fingerprinting localisation method for indoor tracking applications. The proposed approach offers reasonable localisation accuracy while requiring a short period of offline computation time. The third study focuses on designing and implementing a decentralised tracking approach for tracking multiple mobile targets with low resource requirements.
    Despite the interest in target tracking and localisation issues, few deployed systems use the ZigBee network standard, and no tracking system has used its full features. Tracking through ZigBee is a challenging task when the density of router and end-device nodes is low, due to the limited communication capabilities of end-device nodes. The fourth study focuses on developing and designing a practical ZigBee-based tracking approach. To save energy, different strategies were adopted. The fifth study outlines the design and implementation of an energy-efficient approach for tracking applications. This study consists of two main approaches: a data aggregation approach, proposed and implemented in order to reduce the total number of messages transmitted over the network; and a prediction approach, deployed to increase the lifetime of the WSN. For evaluation purposes, two environmental models were used in this thesis: firstly, real experiments, in which the proposed approaches were implemented on real sensor nodes to test the validity of the proposed approaches; secondly, simulation experiments, in which NS-2 was used to evaluate the power-consumption issues of the two approaches proposed in this thesis.
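    As a rough illustration of RSS-based localisation in general, the sketch below converts RSS readings to distances with the standard log-distance path-loss model and then solves for a 2-D position from three beacons. It is not the thesis's method: the function names, the reference RSS at 1 m, and the path-loss exponent are all assumptions made for this example, and the thesis's prediction of environmental characteristics is not modelled.

```python
import math


def rss_to_distance(rss_dbm, ref_rss_dbm, path_loss_exponent, ref_distance=1.0):
    """Invert the log-distance path-loss model (an assumed channel model):
    rss = ref_rss - 10 * n * log10(d / d0)."""
    exponent = (ref_rss_dbm - rss_dbm) / (10.0 * path_loss_exponent)
    return ref_distance * 10.0 ** exponent


def trilaterate(beacons, distances):
    """Estimate (x, y) from three beacon positions and three range estimates.

    Subtracting the first circle equation from the other two yields a
    2x2 linear system, solved here with Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = beacons
    d1, d2, d3 = distances
    a11, a12 = 2.0 * (x2 - x1), 2.0 * (y2 - y1)
    a21, a22 = 2.0 * (x3 - x1), 2.0 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("beacons are collinear; position is not unique")
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det)


# Hypothetical beacon layout (metres) and noise-free ranges to a target at (3, 4).
beacons = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
ranges = [math.hypot(3.0 - bx, 4.0 - by) for bx, by in beacons]
estimate = trilaterate(beacons, ranges)
```

    With noisy RSS readings the three circles rarely intersect at a single point, so a practical system would typically use more beacons and a least-squares fit rather than this exact three-beacon solve.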

    A unified multichannel far-field speech recognition system: combining neural beamforming with attention based end-to-end model

    Far-field speech recognition is a challenging task that conventionally uses signal processing beamforming to address noise and interference problems. However, its performance is usually limited by a heavy reliance on environmental assumptions. In this paper, we propose a unified multichannel far-field speech recognition system that combines neural beamforming with a transformer-based Listen, Attend and Spell (LAS) speech recognition system, extending the end-to-end speech recognition system to include speech enhancement. This framework is then jointly trained to optimize the final objective of interest. Specifically, factored complex linear projection (fCLP) is adopted to form the neural beamforming. Several pooling strategies for combining look directions are then compared in order to find the optimal approach. Moreover, information about the source direction is also integrated into the beamforming to explore the usefulness of source direction as a prior, which is usually available, especially in multi-modality scenarios. Experiments on different microphone array geometries are conducted to evaluate robustness to variation in microphone spacing. Large in-house databases are used to evaluate the effectiveness of the proposed framework, and the proposed method achieves a 19.26% improvement when compared with a strong baseline.

    Asymptotic Task-Based Quantization with Application to Massive MIMO

    Quantizers take part in nearly every digital signal processing system which operates on physical signals. They are commonly designed to accurately represent the underlying signal, regardless of the specific task to be performed on the quantized data. In systems working with high-dimensional signals, such as massive multiple-input multiple-output (MIMO) systems, it is beneficial to utilize low-resolution quantizers, due to cost, power, and memory constraints. In this work we study quantization of high-dimensional inputs, aiming at improving performance under resolution constraints by accounting for the system task in the quantizer design. We focus on the task of recovering a desired signal statistically related to the high-dimensional input, and analyze two quantization approaches. We first consider vector quantization, which is typically computationally infeasible, and characterize the optimal performance achievable with this approach. Next, we focus on practical systems which utilize hardware-limited scalar uniform analog-to-digital converters (ADCs), and design a task-based quantizer under this model. The resulting system accounts for the task by linearly combining the observed signal into a lower dimension prior to quantization. We then apply our proposed technique to channel estimation in massive MIMO networks. Our results demonstrate that a system utilizing low-resolution scalar ADCs can approach the optimal channel estimation performance by properly accounting for the task in the system design.
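    The structure described above (linear combining into a lower dimension, followed by scalar uniform quantization) can be sketched as follows. This is only an illustration of the processing chain: the combining matrix, bit budget, and quantizer support below are arbitrary placeholder values, whereas the paper designs them for the task, and all function names are hypothetical.

```python
def uniform_quantize(value, bits, support):
    """Mid-rise uniform scalar quantizer with 2**bits levels on [-support, support].

    Inputs outside the support are clipped to the outermost level.
    """
    levels = 2 ** bits
    step = 2.0 * support / levels
    clipped = max(-support, min(support - 1e-12, value))
    index = int((clipped + support) // step)
    return -support + (index + 0.5) * step


def combine_then_quantize(x, combiner, bits, support):
    """Reduce x with a linear combiner (one row per output), then quantize each output."""
    reduced = [sum(a * xi for a, xi in zip(row, x)) for row in combiner]
    return [uniform_quantize(v, bits, support) for v in reduced]


# Placeholder example: a 4-dimensional observation averaged into one scalar,
# then quantized with 2 bits over [-4, 4].
observation = [1.0, 2.0, 3.0, 4.0]
averaging_combiner = [[0.25, 0.25, 0.25, 0.25]]
quantized = combine_then_quantize(observation, averaging_combiner, bits=2, support=4.0)
```

    Reducing the dimension first means fewer scalar quantizers share the same total bit budget, so each one can use more levels; the averaging combiner here is used purely to keep the arithmetic easy to follow.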

    Collaborative signal and information processing for target detection with heterogeneous sensor networks

    In this paper, an approach for target detection and acquisition with heterogeneous sensor networks through strategic resource allocation and coordination is presented. Based on sensor management and collaborative signal and information processing, low-capacity, low-cost sensors are strategically deployed to guide and cue scarce high-performance sensors in the network to improve data quality, so that the mission is completed more efficiently and at lower cost. We focus on the problem of designing such a network system, in which issues of resource selection and allocation, system behaviour and capacity, target behaviour and patterns, the environment, and multiple constraints such as cost must be addressed simultaneously. Simulation results offer significant insight into sensor selection and network operation, and demonstrate the great benefits introduced by guided search in an application of hunting down and capturing hostile vehicles on the battlefield.