17 research outputs found

    A non-linear polynomial approximation filter for robust speaker verification

    Bibliography: leaves 101-109

    Investigating Latent State and Uncertainty Representations in Reinforcement Learning

    Learning latent space representations of high-dimensional world states has been at the core of recent rapid growth in reinforcement learning (RL). At the same time, RL algorithms have suffered from ignored uncertainties in the predicted estimates of model-free or model-based methods. In our work, we investigate both of these aspects independently. Firstly, we study the explainability of policies learned over latent representations. In particular, we focus on control policies represented as recurrent neural networks (RNNs), which are difficult to explain, understand, and analyze due to their use of continuous-valued memory vectors and observation features. We introduce a new technique, Quantized Bottleneck Insertion, to learn finite representations of these vectors and features. This helps us create a finite-state machine representation of the policies, which we show improves their interpretability. Secondly, we study a model-based reinforcement learning approach for continuous action spaces based on tree-based planning over learned latent dynamics. By including look-ahead search during decision-time planning, we demonstrate improvement in sample efficiency and performance on a majority of challenging continuous-control benchmarks compared to state-of-the-art methods. Thirdly, we study policy evaluation over offline historical data and highlight the need to couple confidence values with the estimated policy evaluations to capture uncertainties. To this end, we create a benchmark to study confidence estimation by offline reinforcement learning (ORL) methods. This benchmark is derived by adding sets of policy comparison queries to datasets from ORL and comes with a set of evaluation metrics. In addition, we present an empirical evaluation of a class of model-based baselines on our benchmark. These baselines learn ensembles of dynamics models, which are used in various ways to produce simulations for answering queries with confidence values. While our results suggest advantages for certain baseline variations, there appears to be significant room for improvement in future work.

    Maritime Augmented Reality mit a priori Wissen aus Seekarten

    The main objective of this thesis is to provide a concept for augmenting the user's camera view with maritime sea chart information. The benefit is simpler navigation, thanks to the 3D information offered and its overlay onto the real 3D environment. Special conditions hold in the maritime context. The sensor technologies have to be reliable in the environment of a ship's ferrous construction. The augmentation of the objects has to be very precise because observable objects on the sea surface are far away. Furthermore, the approach has to be reliable across a wide range of light conditions. For a practical solution, the system has to be mobile and lightweight and must run in real time. To achieve this goal, the requirements are set, and the possible measurement units and the database structure are presented. First, the requirements are analyzed and a suitable system is designed. By combining suitable sensor techniques, the local position and orientation of the user can be estimated. To verify the concept, several prototypes with exchangeable units have been evaluated. This first concept is based on a marker-based approach, which leads to some drawbacks. To overcome these drawbacks, the second aspect is the improvement of the system and the analysis of markerless approaches. One possible strategy is presented. The approach uses the statistical technique of Bayesian networks to vote for single objects in the environment. This procedure shows that, with the help of a priori information, the sea charts underlying the system can be used very well for this purpose. The analysis of the markerless approach shows that the sea chart structure has to be adapted to the new requirements of interactive 3D augmentation scenes. After the analysis of the chart data concept, an approach is presented for optimizing the charts by building up an object-to-object topology within the chart data and linking it to the Bayesian object detection approach. Finally, several evaluations show the performance of the implemented evaluation application.

    The generative, analytic and instructional capacities of sound in architecture : fundamentals, tools and evaluation of a design methodology

    UPC Extraordinary Doctoral Award, academic year 2017-2018, field of Architecture, Urbanism and Building. The disciplines of space and time form two domains that it is daring to compare, since they are obviously of a different nature. Music happens in time, while architecture happens in space. However, from the first treatises on both architecture and music, one can read repeated calls for comparison, complementarity and mutual influence between the two disciplines, or at least for the recognition of certain common orders between the two domains. In this doctoral thesis we do not question this whole theoretical corpus that has been enriching the relationship between both disciplines. We accept it and join that stream of knowledge. What we do attend to, however, is the almost impertinent question that follows: can sound help the architect in his daily tasks? And, therefore, what are the contributions of sound to the architect? To answer this we must seek the connection in the principles of both arts, where we can detach ourselves from time and space and approach the most universal of art forms. The architect, in his daily work, is faced with three particular tasks: the architectural project, the architectural analysis and the teaching of architecture. Each of the three tasks is connected with the other two: the project is redirected by the analysis and transmitted to the new architect; the analysis supports the project decisions and gives tools to the disciple; and the teaching has the project as its purpose and the analysis as its method. The thesis presented here shows what sound offers to the task of the project, to that of analysis and to that of teaching. These three tasks are approached from three premises: theoretical foundations, tools and evaluation. The interaction of the three tasks with the three premises gives rise to nine lines of work that articulate the chapters of the thesis.
The first, fourth and seventh chapters approach the three tasks from the premise of theoretical foundations, foundations that, perhaps because they are obvious, have been ignored or overlooked but which constitute the nature of both disciplines. The first shows, through two 20th century authors, the architect Dom Hans van der Laan and the composer Olivier Messiaen, that creation in both disciplines is of a systematic nature. The fourth reappraises the analytical systems of representation of form in both architecture and music which, starting from the basic characteristics of their elements, lead to a symbolic notation and a tool for the analysis of the work: the plan and the score. The seventh introduces the student of architecture to the growing separation between music and architecture that has been accentuated to this day. The second, fifth and eighth chapters approach the three particular tasks from the premise of tools, working tools that help to understand more directly the influence of architecture on sound. The second places virtual reality and auralization techniques at the service of the architectural and urban planning project, enhancing the sound experience in these projects. The fifth deals with the acoustic analysis of exterior spaces and their relationship with the urban configuration of these spaces. The eighth presents the study of acoustic heritage as an educational tool. The third, sixth and ninth chapters deal with the three tasks from the premise of evaluation, a check that confirms, through teaching experiments, the influence of sound on them. The third argues and exemplifies that a soundscape can be the engine and generator of an architectural design. The sixth reviews the methods for evaluating the subjective and objective parameters of architectural acoustics.
The ninth shows that in teaching sound to architects, "learning by listening" should be given priority over "passive learning".

    Toward the real time estimation of the attentional state through ocular activity analysis

    The analysis of aviation incidents and laboratory experiments has shown that attentional tunneling leads pilots to neglect critical alarms. One promising avenue for addressing this problem relies on adaptive systems that could assist the operator in real time (for example, by changing the behaviour of the autopilot). Such adaptive systems require the operator's state as input. Therefore, methods for inferring the operator's state, together with metrics of attentional tunneling, have to be proposed. The goal of this PhD thesis is to demonstrate that attentional tunneling can be detected in real time. To this end, an adaptive neuro-fuzzy method using attentional tunneling metrics is proposed, along with new attentional tunneling metrics that do not depend on the operator's context and can be computed in real time. The Eye State Identification Algorithm (ESIA), which analyses ocular activity, is proposed for this purpose. Attentional metrics are derived from it and tested in a robotic experiment designed to favour attentional tunneling. We also propose a new definition of the explore/exploit ratio, whose relevance as a marker of attentional tunneling is demonstrated statistically. This work is then discussed and applied to different case studies in aviation and robotics.

    Wall of Noise, Web of Silence

    The Artist Formerly Known as David Pledger will release his first concept album WALL OF NOISE WEB OF SILENCE with LINER NOTES. Using the moniker, dp, the artist takes his lead from all those brave artists who embraced the challenge of the concept album, from Brian Wilson of The Beach Boys through to Radiohead and Bjork. “The concept album best serves my ambition to create a kind of knowledge in which aesthetics and scholarship operate in a complementary and expansive mode, so that argument may be understood simultaneously through the processes of thinking and feeling.” dp is renowned for his exploratory and experimental artistic adventures that include The Austral/Asian Post-Cartoon: sports edition (1997), Blowback (2004), The Strangeland Triptych (2006-9), Ampersand et al (2010) and David Pledger Is Running For Office (2015). Side 1 is dp’s thesis on democracy in the age of neo-liberalism through the lens of the arts. Side 2 is its antithesis, an atmosphere of democracy in which artistic practice is the prevailing determinant. Noise and silence are the aesthetic frames. The arts, society and politics provide the windows. “On Side 1, you’ll play ‘concept’ tracks, a mini-EP and three 12-inch 45s backgrounded by the beautiful sounds and disturbing speeches of my artistic oeuvre. On Side 2, I take you on a journey that aspires to ‘listen’ a way towards a ‘solution’ to our deepest problem: what to do about democracy?” Two sides riffing off each other and a comprehensive clutch of Liner Notes that reflect on what it means for dp’s artistic practice. “For thirty years, my life as an artist has been a procession of master shots of Western culture from different vantage points: minor pop culture personality, writer, performer, director, producer, arts leader, cultural activist. These are the band members on my new album taking the lead on some tracks, playing back-up on others. 
The most ambitious and comprehensive explication of my practice to date, Wall of Noise, Web of Silence poses the perfect question: what next, dp, what next….

    Evaluating the Perceived Quality of Binaural Technology

    This thesis studies binaural sound reproduction from both a technical and a perceptual perspective, with the aim of improving the headphone listening experience for entertainment media audiences. A detailed review is presented of the relevant binaural technology and of the concepts and methods for evaluating perceived quality. A pilot study assesses the application of state-of-the-art binaural rendering systems to existing broadcast programmes, finding no substantial improvements in quality over conventional stereo signals. A second study gives evidence that realistic binaural simulation can be achieved without personalised acoustic calibration, showing promise for the application of binaural technology. Flexible technical apparatus is presented to allow further investigation of rendering techniques and content production processes. Two web-based studies show that appropriate combination of techniques can lead to improved experience for typical audience members, compared to stereo signals, even without personalised rendering or listener head-tracking. Recent developments in spatial audio applications are then discussed. These have made dynamic client-side binaural rendering with listener head-tracking feasible for mass audiences, but also present technical constraints. To limit distribution bandwidth and computational complexity during rendering, loudspeaker virtualisation is widely used. The effects on perceived quality of these techniques are studied in depth for the first time. A descriptive analysis experiment demonstrates that loudspeaker virtualisation during binaural rendering causes degradations to a range of perceptual characteristics and that these vary across other system conditions. A final experiment makes novel use of the check-all-that-apply method to efficiently characterise the quality of seven spatial audio representations and associated dynamic binaural rendering techniques, using single sound sources and complex dramatic scenes. 
The perceived quality of these different representations varies significantly across a wide range of characteristics and with programme material. These methods and findings can be used to improve the quality of current binaural technology applications.