
    A review of differentiable digital signal processing for music and speech synthesis

    The term “differentiable digital signal processing” describes a family of techniques in which loss function gradients are backpropagated through digital signal processors, facilitating their integration into neural networks. This article surveys the literature on differentiable audio signal processing, focusing on its use in music and speech synthesis. We catalogue applications to tasks including music performance rendering, sound matching, and voice transformation, discussing the motivations for and implications of the use of this methodology. This is accompanied by an overview of digital signal processing operations that have been implemented differentiably, which is further supported by a web book containing practical advice on differentiable synthesiser programming (https://intro2ddsp.github.io/). Finally, we highlight open challenges, including optimisation pathologies, robustness to real-world conditions, and design trade-offs, and discuss directions for future research.

    A Comprehensive Survey on Applications of Transformers for Deep Learning Tasks

    The Transformer is a deep neural network that employs a self-attention mechanism to comprehend the contextual relationships within sequential data. Unlike conventional neural networks or updated versions of Recurrent Neural Networks (RNNs) such as Long Short-Term Memory (LSTM), transformer models excel in handling long dependencies between input sequence elements and enable parallel processing. As a result, transformer-based models have attracted substantial interest among researchers in the field of artificial intelligence. This can be attributed to their immense potential and remarkable achievements, not only in Natural Language Processing (NLP) tasks but also in a wide range of domains, including computer vision, audio and speech processing, healthcare, and the Internet of Things (IoT). Although several survey papers have been published highlighting the transformer's contributions in specific fields, architectural differences, or performance evaluations, a comprehensive survey encompassing its major applications across various domains is still lacking. Therefore, we undertook the task of filling this gap by conducting an extensive survey of proposed transformer models from 2017 to 2022. Our survey identifies the top five application domains for transformer-based models, namely: NLP, Computer Vision, Multi-Modality, Audio and Speech Processing, and Signal Processing. We analyze the impact of highly influential transformer-based models in these domains and subsequently classify them based on their respective tasks using a proposed taxonomy. Our aim is to shed light on the existing potential and future possibilities of transformers for enthusiastic researchers, thus contributing to the broader understanding of this groundbreaking technology.

    AI: Limits and Prospects of Artificial Intelligence

    The emergence of artificial intelligence has triggered enthusiasm and promise of boundless opportunities as much as uncertainty about its limits. The contributions to this volume explore the limits of AI, describe the necessary conditions for its functionality, reveal its attendant technical and social problems, and present some existing and potential solutions. At the same time, the contributors highlight the societal and attending economic hopes and fears, utopias and dystopias that are associated with the current and future development of artificial intelligence.

    On the Utility of Representation Learning Algorithms for Myoelectric Interfacing

    Electrical activity produced by muscles during voluntary movement is a reflection of the firing patterns of relevant motor neurons and, by extension, the latent motor intent driving the movement. Once transduced via electromyography (EMG) and converted into digital form, this activity can be processed to provide an estimate of the original motor intent and is as such a feasible basis for non-invasive efferent neural interfacing. EMG-based motor intent decoding has so far received the most attention in the field of upper-limb prosthetics, where alternative means of interfacing are scarce and the utility of better control apparent. Whereas myoelectric prostheses have been available since the 1960s, available EMG control interfaces still lag behind the mechanical capabilities of the artificial limbs they are intended to steer, a gap at least partially due to limitations in current methods for translating EMG into appropriate motion commands. As the relationship between EMG signals and concurrent effector kinematics is highly non-linear and apparently stochastic, finding ways to accurately extract and combine relevant information from across electrode sites is still an active area of inquiry. This dissertation comprises an introduction and eight papers that explore issues afflicting the status quo of myoelectric decoding and possible solutions, all related through their use of learning algorithms and deep Artificial Neural Network (ANN) models. Paper I presents a Convolutional Neural Network (CNN) for multi-label movement decoding of high-density surface EMG (HD-sEMG) signals. Inspired by the successful use of CNNs in Paper I and the work of others, Paper II presents a method for automatic design of CNN architectures for use in myocontrol. Paper III introduces an ANN architecture with an appertaining training framework from which simultaneous and proportional control emerges. Paper IV introduces a dataset of HD-sEMG signals for use with learning algorithms. Paper V applies a Recurrent Neural Network (RNN) model to decode finger forces from intramuscular EMG. Paper VI introduces a Transformer model for myoelectric interfacing that does not need additional training data to function with previously unseen users. Paper VII compares the performance of a Long Short-Term Memory (LSTM) network to that of classical pattern recognition algorithms. Lastly, Paper VIII describes a framework for synthesizing EMG from multi-articulate gestures intended to reduce training burden.

    A survey on run-time power monitors at the edge

    Effectively managing energy and power consumption is crucial to the success of the design of any computing system, helping mitigate the efficiency obstacles given by the downsizing of the systems while also being a valuable step towards achieving green and sustainable computing. The quality of energy and power management is strongly affected by the prompt availability of reliable and accurate information regarding the power consumption of the different parts composing the monitored target system. At the same time, effective energy and power management is even more critical for devices at the edge, which have proliferated exponentially within the past decade with the digital revolution brought by the Internet of Things. This manuscript aims to provide a comprehensive conceptual framework to classify the different approaches to implementing run-time power monitors for edge devices that have appeared in the literature, leading the reader toward the solutions that best fit their application needs and the requirements and constraints of their target computing platforms. Run-time power monitors at the edge are analyzed according to both the power modeling and monitoring implementation aspects, identifying specific quality metrics for both in order to create a consistent and detailed taxonomy that encompasses the vast existing literature and provides a sound reference to the interested reader.

    A New Front-End System For UAV-Based Antenna Measurements For Polarimetric Weather Radars

    Radar system calibration is vital for ensuring optimal performance, especially in weather radars that have stringent requirements for co-polarization mismatch. In-field calibration is essential, particularly for mobile weather radars, as environmental conditions can vary between deployments. Traditionally, conventional far-field ranges or airborne systems such as helicopters and aircraft have been used to measure and calibrate radar systems. However, in recent years, Unmanned Aerial Systems (UAS) have emerged as a cost-effective and flexible alternative for antenna measurement and radar calibration. Previous studies have demonstrated the feasibility of using UAS for far-field antenna measurements across various operating frequencies. These works have achieved high accuracy in characterizing and calibrating polarimetric weather radar systems, meeting critical requirements such as co-polarization mismatch below 0.1 dB and cross-polarization isolation below -45 dB. However, existing UAS-based systems are complex to operate, requiring multiple pieces of equipment both on the UAS and at the ground station. They are primarily limited to one-way transmission from the UAS to the antenna under test (AUT) and lack the capability to switch between RX and TX measurements or H- and V-polarization without physical modifications. The objective of this thesis is to develop a lightweight and self-contained front-end system for UAS-based in-situ antenna characterization. This system will eliminate the need for additional RF instruments on the ground, providing remote real-time control to switch between RX and TX modes in both V- and H-polarization. It will also facilitate the transmission and reception of measurement data over long distances, enabling far-field measurements beyond 120 m. The proposed system aims to address the limitations of existing UAS-based calibration systems, offering a sophisticated and accurate solution for measuring radar systems with the strictest requirements. By developing a versatile and lightweight front-end system, this research seeks to advance the field of UAS-based antenna characterization and contribute to the improvement of radar calibration techniques.

    Blending the Material and Digital World for Hybrid Interfaces

    The development of digital technologies in the 21st century is progressing continuously and new device classes such as tablets, smartphones or smartwatches are finding their way into our everyday lives. However, this development also poses problems, as these prevailing touch and gestural interfaces often lack tangibility, take little account of haptic qualities and therefore require full attention from their users. Compared to traditional tools and analog interfaces, the human skills to experience and manipulate material in its natural environment and context remain unexploited. To combine the best of both, a key question is how it is possible to blend the material world and digital world to design and realize novel hybrid interfaces in a meaningful way. Research on Tangible User Interfaces (TUIs) investigates the coupling between physical objects and virtual data. In contrast, hybrid interfaces, which specifically aim to digitally enrich analog artifacts of everyday work, have not yet been sufficiently researched and systematically discussed. Therefore, this doctoral thesis rethinks how user interfaces can provide useful digital functionality while maintaining their physical properties and familiar patterns of use in the real world. However, the development of such hybrid interfaces raises overarching research questions about the design: Which kinds of physical interfaces are worth exploring? What type of digital enhancement will improve existing interfaces? How can hybrid interfaces retain their physical properties while enabling new digital functions? What are suitable methods to explore different designs? And how can technology-enthusiast users be supported in prototyping? For a systematic investigation, the thesis builds on a design-oriented, exploratory and iterative development process using digital fabrication methods and novel materials. As a main contribution, four specific research projects are presented that apply and discuss different visual and interactive augmentation principles along real-world applications. The applications range from digitally enhanced paper and interactive cords, through visual watch strap extensions, to novel prototyping tools for smart garments. While almost all of them integrate visual feedback and haptic input, none of them are built on rigid, rectangular pixel screens or use standard input modalities, as they all aim to reveal new design approaches. The dissertation shows how valuable it can be to rethink familiar, analog applications while thoughtfully extending them digitally. Finally, this thesis's extensive work of engineering versatile research platforms is accompanied by overarching conceptual work, user evaluations and technical experiments, as well as literature reviews.

    Taylor University Catalog 2023-2024

    The 2023-2024 academic catalog of Taylor University in Upland, Indiana.

    Microcircuit structures of inhibitory connectivity in the rat parahippocampal gyrus

    Local microcircuits in the brain mediate complex computations through the interplay of excitatory and inhibitory neurons. It is generally assumed that inhibitory neurons in cortical networks, in particular fast-spiking parvalbumin-positive basket cells, mediate a non-selective "blanket of inhibition". This view has recently been challenged by reports of structured inhibitory connectivity, but its precise organization and relevance remain unresolved. In this thesis, I present the results of our studies examining the properties of fast-spiking parvalbumin-positive basket cells in the superficial layers (II/III) of the medial entorhinal cortex and presubiculum of the rat. Characterizing these interneurons in the dorsal and ventral medial entorhinal cortex, we found that basket cells of the two subregions are morphologically and physiologically similar, but are more likely to be connected to principal cells in the dorsal than in the ventral region. This difference is correlated with changes in grid cell physiology. Our findings further indicated that inhibitory connectivity is essential for local computation in the presubiculum. Interestingly, though, we found that local inhibition in this region is markedly sparser than in the medial entorhinal cortex, suggesting a different microcircuit organizational principle. To study this difference, we analyzed the morphology and network properties of fast-spiking basket cells in the presubiculum and found a characteristic spatially organized connectivity principle: principal cells are inhibited through a predominantly reciprocal motif, facilitated by the polarized axons of the presubicular fast-spiking basket cells. Our network simulations showed that such polarized inhibition can improve head direction tuning of principal cells. Overall, our results show that inhibitory connectivity is differently organized in the medial entorhinal cortex and the presubiculum, likely due to functional requirements of the local microcircuit. As a conclusion to the studies presented in this thesis, I hypothesize that a deviation from the blanket of inhibition towards a region-specific, tailored inhibition can provide solutions to distinct computational problems.

    Leveraging audio-visual speech effectively via deep learning

    The rising popularity of neural networks, combined with the recent proliferation of online audio-visual media, has led to a revolution in the way machines encode, recognize, and generate acoustic and visual speech. Despite the ubiquity of naturally paired audio-visual data, only a limited number of works have applied recent advances in deep learning to leverage the duality between audio and video within this domain. This thesis considers the use of neural networks to learn from large unlabelled datasets of audio-visual speech to enable new practical applications. We begin by training a visual speech encoder that predicts latent features extracted from the corresponding audio on a large unlabelled audio-visual corpus. We apply the trained visual encoder to improve performance on lip reading in real-world scenarios. Following this, we extend the idea of video learning from audio by training a model to synthesize raw speech directly from raw video, without the need for text transcriptions. Remarkably, we find that this framework is capable of reconstructing intelligible audio from videos of new, previously unseen speakers. We also experiment with a separate speech reconstruction framework, which leverages recent advances in sequence modeling and spectrogram inversion to improve the realism of the generated speech. We then apply our research in video-to-speech synthesis to advance the state-of-the-art in audio-visual speech enhancement, by proposing a new vocoder-based model that performs particularly well under extremely noisy scenarios. Lastly, we aim to fully realize the potential of paired audio-visual data by proposing two novel frameworks that leverage acoustic and visual speech to train two encoders that learn from each other simultaneously. We leverage these pre-trained encoders for deepfake detection, speech recognition, and lip reading, and find that they consistently yield improvements over training from scratch.