3,182 research outputs found

    A review of differentiable digital signal processing for music and speech synthesis

    The term “differentiable digital signal processing” describes a family of techniques in which loss function gradients are backpropagated through digital signal processors, facilitating their integration into neural networks. This article surveys the literature on differentiable audio signal processing, focusing on its use in music and speech synthesis. We catalogue applications to tasks including music performance rendering, sound matching, and voice transformation, discussing the motivations for and implications of the use of this methodology. This is accompanied by an overview of digital signal processing operations that have been implemented differentiably, which is further supported by a web book containing practical advice on differentiable synthesiser programming (https://intro2ddsp.github.io/). Finally, we highlight open challenges, including optimisation pathologies, robustness to real-world conditions, and design trade-offs, and discuss directions for future research
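As a rough illustration of the idea in the abstract above, the sketch below pushes a loss gradient through a very small signal processor: a sinusoidal oscillator with one learnable amplitude. It is not taken from the article or its web book. The names `oscillator`, `loss_and_grad`, and `fit_amplitude`, the constants, and the hand-written chain rule are all assumptions made for this example. A real system would normally rely on an automatic differentiation framework instead of a manual derivative.

```python
import math

SAMPLE_RATE = 8000   # Hz, illustrative value (assumption)
NUM_SAMPLES = 256    # length of the toy signal (assumption)
FREQ = 440.0         # Hz, treated as fixed and known in this toy example


def oscillator(amplitude):
    """Sinusoidal oscillator: the 'signal processor' being differentiated."""
    return [amplitude * math.sin(2 * math.pi * FREQ * n / SAMPLE_RATE)
            for n in range(NUM_SAMPLES)]


def loss_and_grad(amplitude, target):
    """Mean squared error and its gradient with respect to the amplitude."""
    output = oscillator(amplitude)
    loss = sum((y - t) ** 2 for y, t in zip(output, target)) / NUM_SAMPLES
    # Chain rule: d(loss)/dA = mean(2 * (y - t) * dy/dA), where dy/dA = sin(...)
    grad = sum(2 * (y - t) * math.sin(2 * math.pi * FREQ * n / SAMPLE_RATE)
               for n, (y, t) in enumerate(zip(output, target))) / NUM_SAMPLES
    return loss, grad


def fit_amplitude(target, amplitude=0.0, learning_rate=0.5, steps=200):
    """Plain gradient descent on the oscillator amplitude."""
    for _ in range(steps):
        _, grad = loss_and_grad(amplitude, target)
        amplitude -= learning_rate * grad
    return amplitude


# Recover the amplitude of a reference tone from its waveform alone.
target = oscillator(0.8)
estimate = fit_amplitude(target)
```

Amplitude was chosen because the loss is quadratic in it, so plain gradient descent behaves well. Other synthesiser parameters can be much harder to optimise, which is one of the open challenges the abstract mentions.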

    Under construction: infrastructure and modern fiction

    In this dissertation, I argue that infrastructural development, with its technological promises but widening geographic disparities and social and environmental consequences, informs both the narrative content and aesthetic forms of modernist and contemporary Anglophone fiction. Despite its prevalent material forms—roads, rails, pipes, and wires—infrastructure poses particular formal and narrative problems, often receding into the background as mere setting. To address how literary fiction theorizes the experience of infrastructure requires reading “infrastructurally”: that is, paying attention to the seemingly mundane interactions between characters and their built environments. The writers central to this project—James Joyce, William Faulkner, Karen Tei Yamashita, and Mohsin Hamid—take up the representational challenges posed by infrastructure by bringing transit networks, sanitation systems, and electrical grids and the histories of their development and use into the foreground. These writers call attention to the political dimensions of built environments, revealing the ways infrastructures produce, reinforce, and perpetuate racial and socioeconomic fault lines. They also attempt to formalize the material relations of power inscribed by and within infrastructure; the novel itself becomes an imaginary counterpart to the technologies of infrastructure, a form that shapes and constrains what types of social action and affiliation are possible

    DDSP-Piano: A Neural Sound Synthesizer Informed by Instrument Knowledge

    Instrument sound synthesis using deep neural networks has received numerous improvements over the last couple of years. Among them, the Differentiable Digital Signal Processing (DDSP) framework has modernized the spectral modeling paradigm by including signal-based synthesizers and effects into fully differentiable architectures. The present work extends the applications of DDSP to the task of polyphonic sound synthesis, with the proposal of a differentiable piano synthesizer conditioned on MIDI inputs. The model architecture is motivated by high-level acoustic modeling knowledge of the instrument, which, along with the sound structure priors inherent to the DDSP components, makes for a lightweight, interpretable, and realistic-sounding piano model. A subjective listening test has revealed that the proposed approach achieves better sound quality than a state-of-the-art neural-based piano synthesizer, but physical-modeling-based models still hold the best quality. Leveraging its interpretability and modularity, a qualitative analysis of the model behavior was also conducted: it highlights where additional modeling knowledge and optimization procedures could be inserted in order to improve the synthesis quality and the manipulation of sound properties. Eventually, the proposed differentiable synthesizer can be further used with other deep learning models for alternative musical tasks handling polyphonic audio and symbolic data

    Contributions to improve the technologies supporting unmanned aircraft operations

    International Mention in the doctoral degree.
    Unmanned Aerial Vehicles (UAVs), in their smaller versions known as drones, are becoming increasingly important in today's societies. The systems that make them up present a multitude of challenges, of which error can be considered the common denominator. The perception of the environment is measured by sensors that have errors, and the models that interpret the information and/or define behaviors are approximations of the world and therefore also have errors. Explaining error allows extending the limits of deterministic models to address real-world problems. The performance of the technologies embedded in drones depends on our ability to understand, model, and control the error of the systems that integrate them, as well as of new technologies that may emerge. Flight controllers integrate various subsystems that are generally dependent on other systems. One example is the guidance system. These systems provide the engine's propulsion controller with the necessary information to accomplish a desired mission. For this purpose, the flight controller is made up of a control law for the guidance system that reacts to the information perceived by the perception and navigation systems. The error of any of the subsystems propagates through the ecosystem of the controller, so the study of each of them is essential. On the other hand, among the strategies for error control are state-space estimators, where the Kalman filter has been a great ally of engineers since its appearance in the 1960s. Kalman filters are at the heart of information fusion systems, minimizing the error covariance of the system and allowing the measured states to be filtered, and to be estimated in the absence of observations. State Space Models (SSM) are developed based on a set of hypotheses for modeling the world. Among the assumptions are that the models of the world must be linear and Markovian, and that the error of their models must be Gaussian. In general, systems are not linear, so linearizations are performed on models that are already approximations of the world. In other cases, the noise to be controlled is not Gaussian, but it is approximated by that distribution in order to be able to deal with it. On the other hand, many systems are not Markovian, i.e., their states do not depend only on the previous state, but there are other dependencies that state space models cannot handle. This thesis deals with a collection of studies in which error is formulated and reduced. First, the error in a computer vision-based precision landing system is studied, then estimation and filtering problems are addressed from the deep learning approach. Finally, classification concepts with deep learning over trajectories are studied. The first case of the collection studies the consequences of error propagation in a machine vision-based precision landing system. This paper proposes a set of strategies to reduce the impact on the guidance system, and ultimately reduce the error. The next two studies approach the estimation and filtering problem from the deep learning approach, where error is a function to be minimized by learning. The last case of the collection deals with a trajectory classification problem with real data. This work completes the two main fields in deep learning, regression and classification, where the error is considered as a probability function of class membership.
    I would like to thank the Ministry of Science and Innovation for granting me the funding with reference PRE2018-086793, associated with the project TEC2017-88048-C2-2-R, which provided me with the opportunity to carry out all my PhD activities, including completing an international research internship.
    Doctoral Programme in Computer Science and Technology, Universidad Carlos III de Madrid.
    President: Antonio Berlanga de Jesús. Secretary: Daniel Arias Medina. Member: Alejandro Martínez Cav
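The abstract above describes Kalman filters as minimizing error covariance while filtering measured states and estimating them when observations are missing. A minimal one-dimensional sketch of that behaviour follows. It is not taken from the thesis. It assumes a random-walk state model with Gaussian noise (the linear, Markovian, Gaussian setting the abstract lists), and the function name `kalman_1d`, the noise variances, and the sample readings are all hypothetical values chosen for illustration.

```python
def kalman_1d(measurements, process_var=1e-3, meas_var=0.25,
              initial_state=0.0, initial_var=1.0):
    """Scalar Kalman filter; a measurement of None means 'no observation'."""
    state, var = initial_state, initial_var
    estimates = []
    for z in measurements:
        # Predict: random-walk model, so the state is kept and uncertainty grows.
        var += process_var
        if z is not None:
            # Update: blend prediction and measurement using the Kalman gain.
            gain = var / (var + meas_var)
            state += gain * (z - state)
            var *= (1.0 - gain)
        estimates.append((state, var))
    return estimates


# Hypothetical sensor readings with three dropped observations.
readings = [1.1, 0.9, None, 1.05, None, None, 0.98]
track = kalman_1d(readings)
```

Each entry of `track` is a `(state, variance)` pair. When a reading is missing, the state estimate is carried forward and only the variance increases, which is the "estimation in the absence of observations" the abstract refers to.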

    Complexity Science in Human Change

    This reprint encompasses fourteen contributions that offer avenues towards a better understanding of complex systems in human behavior. The phenomena studied here are generally pattern formation processes that originate in social interaction and psychotherapy. Several accounts are also given of the coordination in body movements and in physiological, neuronal and linguistic processes. A common denominator of such pattern formation is that complexity and entropy of the respective systems become reduced spontaneously, which is the hallmark of self-organization. The various methodological approaches to modeling such processes are presented in some detail. Results from the various methods are systematically compared and discussed. Among these approaches are algorithms for the quantification of synchrony by cross-correlational statistics, surrogate control procedures, recurrence mapping and network models. This volume offers an informative and sophisticated resource for scholars of human change, as well as for students at advanced levels, from graduate to post-doctoral. The reprint is multidisciplinary in nature, binding together the fields of medicine, psychology, physics, and neuroscience

    Noise and vestibular perception of passive self-motion

    Noise defined as random disturbances is ubiquitous in both the external environment and the nervous system. Depending on the context, noise can degrade or improve information processing and performance. In all cases, it contributes to neural systems dynamics. We review some effects of various sources of noise on the neural processing of self-motion signals at different stages of the vestibular pathways and the resulting perceptual responses. Hair cells in the inner ear reduce the impact of noise by means of mechanical and neural filtering. Hair cells synapse on regular and irregular afferents. Variability of discharge (noise) is low in regular afferents and high in irregular units. The high variability of irregular units provides information about the envelope of naturalistic head motion stimuli. A subset of neurons in the vestibular nuclei and thalamus are optimally tuned to noisy motion stimuli that reproduce the statistics of naturalistic head movements. In the thalamus, variability of neural discharge increases with increasing motion amplitude but saturates at high amplitudes, accounting for behavioral violation of Weber’s law. In general, the precision of individual vestibular neurons in encoding head motion is worse than the perceptual precision measured behaviorally. However, the global precision predicted by neural population codes matches the high behavioral precision. The latter is estimated by means of psychometric functions for detection or discrimination of whole-body displacements. Vestibular motion thresholds (inverse of precision) reflect the contribution of intrinsic and extrinsic noise to perception. Vestibular motion thresholds tend to deteriorate progressively after the age of 40 years, possibly due to oxidative stress resulting from high discharge rates and metabolic loads of vestibular afferents. 
In the elderly, vestibular thresholds correlate with postural stability: the higher the threshold, the greater is the postural imbalance and risk of falling. Experimental application of optimal levels of either galvanic noise or whole-body oscillations can ameliorate vestibular function with a mechanism reminiscent of stochastic resonance. Assessment of vestibular thresholds is diagnostic in several types of vestibulopathies, and vestibular stimulation might be useful in vestibular rehabilitation

    Moving Through Experience: Disruption, Emergence, and the Aesthetic of Repose

    Contemporary aesthetic philosophy approaches the notion of aesthetic experience through two conflicting lenses: on one hand are those who support a connection between the aesthetic and the political, while on the other are those who favor a more pragmatic position. An area of aesthetic engagement not yet explored inhabits an intermediary between these opposing poles, a modality of aesthetic experience I term the aesthetic of repose. This dissertation traces the evolution of ideas regarding aesthetic experience through a survey of several philosophers whose varied perspectives form the foundation for my inquiry. Beginning with an exploration of Immanuel Kant’s Critique of Judgement, proceeding through Friedrich Nietzsche’s Birth of Tragedy, and progressing to John Dewey’s Art as Experience, my aim is, first, to situate their individual aesthetic philosophies within the context of 21st century aesthetic experience. Despite their differing viewpoints, these thinkers share: 1) the importance of sense and sensation to valuable aesthetic experience and 2) a desire to find value and meaning in aesthetic experience for overcoming the ills of humanity and advancing culture. Secondly, this dissertation examines a polarity of ideas that challenge the notion of authentic aesthetic experience in our times. Like their predecessors, contemporary aesthetic philosophers desire to make aesthetic experience a portal for humanity’s recuperation. There are thinkers, such as Jacques Rancière and Santiago Zabala, who advance an aesthetics of action; others, like Richard Shusterman and Hans Ulrich Gumbrecht, advocate for an aesthetics of presence. The aesthetic of repose rests uniquely between action and presence, as an area of slumber, where neither action nor presence is necessary. Rather, the idea is to remain in repose, linger there, where repositioning occurs naturally, as though without perception. One emerges from this seemingly imperceptible experience, having done nothing save moving through it, yet being forever changed by it.