28 research outputs found

    Microphone Array Processing Techniques for Automatic Lecture Monitoring

    The gain in popularity of massive open online courses and other online educational lectures prompts the investigation of methods for automatically recording such lectures. While most previous systems in this area have utilized computer vision techniques for tracking, we take an approach utilizing microphone arrays for both recording audio and tracking lecturers. Different source localization and source tracking methods are tested, including cross correlation and beamforming methods combined with various state space model approaches. We investigate how certain constraints granted by a lecture setting may be used to influence our tracking models, and evaluate the relative strengths and weaknesses of several possible techniques. In addition, we explore characterizations of the lecture space that allow for the microphone array to work along with a separate camera to properly record the lecturer's movement. By using the audio to track lecturers we add flexibility to the system, but also introduce difficulties in consolidating information between the microphone array and the camera. Possible methods for communication between the two are addressed, and we again find that constraints imposed by the lecture setting may be used to resolve such problems.

    Optimizing Techniques and Cramer-Rao Bound for Passive Source Location Estimation

    This work is motivated by the problem of locating potentially unstable areas in underground potash mines more accurately and more consistently while adding minimal computational load. It is important for both efficient mine design and safe mining activities, since these unstable areas may experience local, low-intensity earthquakes in the vicinity of an underground mine. The objective of this thesis is to present localization algorithms that can deliver the most consistent and accurate estimation results for the application of interest. As the first step towards the goal, the three most representative source localization algorithms given in the literature are studied and compared. A one-step energy based grid search (EGS) algorithm is selected to address the needs of the application of interest. The next step is the development of closed-form Cramér-Rao bound (CRB) expressions. The mathematical derivation presented in this work deals with continuous signals using the Karhunen-Loève (K-L) expansion, which makes the derivation applicable to non-stationary Gaussian noise problems. Explicit closed-form CRB expressions, however, are presented only for stationary Gaussian noise cases, using the spectrum representation of the signal and noise. Using the CRB comparisons, two approaches are proposed to further improve the EGS algorithm. The first approach utilizes the corresponding analytic expression of the error estimation variance (EEV) given in [1] to derive an amplitude weight expression, optimal in terms of minimizing this EEV, for the case of additive Gaussian noise with a common spectrum interpretation across all the sensors. An alternate noniterative amplitude weighting scheme is proposed based on the optimal amplitude weight expression. It achieves the same performance with less calculation compared with the traditional iterative approach. The second approach tries to optimize the EGS algorithm in the frequency domain.
An analytic frequency weighted EEV expression is derived using spectrum representation and the stochastic process theory. Based on this EEV expression, an integral equation is established and solved using the calculus of variations technique. The solution corresponds to a filter transfer function that is optimal in the sense that it minimizes this analytic frequency domain EEV. When various parts of the frequency domain EEV expression are ignored during the minimization procedure using the Cauchy-Schwarz inequality, several different filter transfer functions result. All of them turn out to be well known classical filters that have been developed in the literature and used to deal with source localization problems. This demonstrates that in terms of minimizing the analytic EEV, they are all suboptimal, not optimal. Monte Carlo simulation is performed and shows that both amplitude and frequency weighting bring a clear improvement over the unweighted EGS estimator.

    Sound Event Localization, Detection, and Tracking by Deep Neural Networks

    In this thesis, we present novel sound representations and classification methods for the task of sound event localization, detection, and tracking (SELDT). The human auditory system has evolved to localize multiple sound events, recognize and further track their motion individually in an acoustic environment. This ability of humans makes them context-aware and enables them to interact with their surroundings naturally. Developing similar methods for machines will provide an automatic description of social and human activities around them and enable machines to be context-aware similar to humans. Such methods can be employed to assist the hearing impaired to visualize sounds, for robot navigation, and to monitor biodiversity, the home, and cities. A real-life acoustic scene is complex in nature, with multiple sound events that are temporally and spatially overlapping, including stationary and moving events with varying angular velocities. Additionally, each individual sound event class, for example, a car horn can have a lot of variabilities, i.e., different cars have different horns, and within the same model of the car, the duration and the temporal structure of the horn sound is driver dependent. Performing SELDT in such overlapping and dynamic sound scenes while being robust is challenging for machines. Hence we propose to investigate the SELDT task in this thesis and use a data-driven approach using deep neural networks (DNNs). The sound event detection (SED) task requires the detection of onset and offset time for individual sound events and their corresponding labels. In this regard, we propose to use spatial and perceptual features extracted from multichannel audio for SED using two different DNNs, recurrent neural networks (RNNs) and convolutional recurrent neural networks (CRNNs). We show that using multichannel audio features improves the SED performance for overlapping sound events in comparison to traditional single-channel audio features. 
The proposed novel features and methods produced state-of-the-art performance for the real-life SED task and won the IEEE AASP DCASE challenge consecutively in 2016 and 2017. Sound event localization is the task of spatially locating the position of individual sound events. Traditionally, this has been approached using parametric methods. In this thesis, we propose a CRNN for detecting the azimuth and elevation angles of multiple temporally overlapping sound events. This is the first DNN-based method performing localization in complete azimuth and elevation space. In comparison to parametric methods, which require the number of active sources to be known, the proposed method learns this information directly from the input data and estimates their respective spatial locations. Further, the proposed CRNN is shown to be more robust than parametric methods in reverberant scenarios. Finally, the detection and localization tasks are performed jointly using a CRNN. This method additionally tracks the spatial location with time, thus producing the SELDT results. This is the first DNN-based SELDT method and is shown to perform on par with stand-alone baselines for SED, localization, and tracking. The proposed SELDT method is evaluated on nine datasets that represent anechoic and reverberant sound scenes, stationary and moving sources with varying velocities, different numbers of overlapping sound events, and different microphone array formats. The results show that the SELDT method can track multiple overlapping sound events that are both spatially stationary and moving.

    Online Audio-Visual Multi-Source Tracking and Separation: A Labeled Random Finite Set Approach

    The dissertation proposes an online solution for separating an unknown and time-varying number of moving sources using audio and visual data. The random finite set framework is used for the modeling and fusion of audio and visual data. This enables an online tracking algorithm to estimate the source positions and identities at each time point. With this information, a set of beamformers can be designed to separate each desired source and suppress the interfering sources.

    Radio Communications

    In recent decades, the relentless evolution of information and communication technologies (ICT) has deeply transformed our habits. The growth of the Internet and the advances in hardware and software implementations have changed the way we communicate and share information. In this book, an overview of the major issues faced today by researchers in the field of radio communications is given through 35 high quality chapters written by specialists working in universities and research centers all over the world. Various aspects are discussed in depth: channel modeling, beamforming, multiple antennas, cooperative networks, opportunistic scheduling, advanced admission control, handover management, systems performance assessment, routing issues in mobility conditions, localization, and web security. Advanced techniques for radio resource management are discussed for both single and multiple radio technologies, in infrastructure, mesh, or ad hoc networks.

    Secure Geo-location Techniques using Trusted Hyper-visor

    For many, geo-location is a simple process in which GPS is used to locate a person wherever and whenever requested. However, even though GPS is the most common and an accurate means of geo-location, it consumes a great deal of energy and lacks security mechanisms and techniques. The purpose of this paper is to present another view of how an unknown node position can be located in a system and how a secure environment can be created for that node. Our main idea was to create a framework that establishes a three-dimensional field in which an unknown node can be located, after which a secure environment is created for the new node. After surveying papers on three-dimensional geo-localization mechanisms and techniques, together with the concept of hypervisors for creating a secure environment using cryptography, we arrived at a framework that satisfies those requirements. We created a three-dimensional field of four base station nodes, in which we used two GPS-free localization algorithms to locate a fifth, unknown node, alongside a hypervisor for creating the trusted environment. We used a TPM to create the cryptographic mechanisms and security keys. In this paper we created a simulation in which we compare the performance of those two geolocation algorithms in terms of accuracy and computation speed, alongside the performance of the hypervisor's security mechanisms and its ability to assure data integrity. Besides the components of our proposed framework, we also present further information found in relevant papers, such as a variety of hypervisors and a variety of localization techniques, as background for future work, together with implementation steps and guidance.

    Directional Antenna System-Based DoA/RSS Estimation, Localization and Tracking in Future Wireless Networks: Algorithms and Performance Analysis

    Location information plays an important role in many emerging technologies such as robotics, autonomous vehicles, and augmented reality. Already, the majority of smartphone owners use their devices' localization capabilities for a broad range of location-based services. Currently, location information in smartphones is mostly obtained in a device-centric approach, where the device to be localized, here referred to as the target node (TN), estimates its own location using, for example, the global positioning system (GPS). However, TNs with wireless communication capabilities can be localized based on their transmitted signals by a third party. In particular, localization can be implemented as a functionality of a wireless network. Depending on the application area and implementation, this network-centric approach has several advantages compared to device-centric localization, such as reducing the energy consumption within the TNs, enabling localization of non-cooperative TNs, and making location information available in the network itself. Current generation wireless networks are already capable of coarse localization. However, these existing localization capabilities do not suffice for the challenging demands of future applications. Moreover, the majority of approaches do not exploit the fact that an increasing number of base stations (BSs) and user devices are equipped with directional antennas. Directional antennas enable direction of arrival (DoA) estimation that can, in turn, serve as the basis for advanced localization and location tracking. In this thesis, we thus study the application of directional antennas for localization and location tracking in future generation wireless networks. The contributions of this thesis can be grouped into two topics. First, this thesis provides a detailed study of DoA/received signal strength (RSS) estimation and localization with a group of directional antennas herein denoted as sectorized antennas.
This group of antennas is of particular interest as it encompasses a broad range of directional antennas that can be implemented with a single RF front-end. Thus, the hardware complexity of sectorized antennas is low in comparison to the conventionally used antenna arrays that require multiple transceiver branches. However, at the same time this means that DoA estimation with sectorized antennas has to be implemented in a fundamentally different way. In order to address these differences, the study of sectorized antennas in this thesis includes the derivation of Cramer-Rao bounds (CRBs) for DoA/RSS estimation and localization, the proposal of three different DoA/RSS estimators, as well as numerical and analytical performance evaluations of DoA/RSS estimation and localization using sectorized antennas. Second, this thesis deals with localization based on the fusion of DoA and RSS estimates as well as DoA and time of arrival (ToA) estimates. It is shown that the combination of these estimates can result in much better localization performance compared to localization based on one of these estimates alone. For the localization based on DoA/RSS estimates, a mechanism explaining this improvement is revealed by means of a CRB analysis. Thereafter, DoA/RSS-based fusion is further studied using an extended Kalman filter (EKF) as an example location tracking algorithm. Finally, an EKF is proposed that tracks the location of a TN by fusing DoA and ToA estimates.
Apart from significantly improved tracking performance, this joint DoA/ToA-EKF also provides estimates for the TN device clock offset and is able to localize the TN in situations where a classical DoA-only EKF fails to provide a location estimate altogether. Overall, this thesis thus provides insights into the benefits of localization and location tracking using directional antennas, accompanied by specific DoA/RSS estimation, localization and location tracking solutions, as well as design guidelines for implementing localization systems in future generation wireless networks.

    Neuromorphic Models of the Amygdala with Applications to Spike Based Computing and Robotics

    Computational neural simulations do not match the functionality and operation of the brain processes they attempt to model. This gap exists due to both our incomplete understanding of brain function and the technological limitations of computers. Moreover, given that the shrinking of transistors has reached its physical limit, fundamentally different computer paradigms are needed to help bridge this gap. Neuromorphic hardware technologies attempt to abstract the form of brain function to provide a computational solution post-Moore’s Law, and neuromorphic algorithms provide software frameworks to increase biological plausibility within neural models. This dissertation focuses on utilizing neuromorphic frameworks to better understand how the brain processes social and emotional stimuli. It describes the creation of a spiking-neuron computational model of the amygdala, the brain region behind our social interactions, and the simulation of the model using brain-inspired computer hardware, as well as the implementations of other spike-based computations on this hardware. Although scientists agree that the amygdala is the main component of the social brain, few models exist to explain amygdala function beyond “fight or flight”. This model incorporates neuroscientists’ more nuanced understanding of the amygdala, and is validated by comparing the neural responses measured from the model to responses measured in primate amygdalae under the same experimental conditions. This model will inform future physiological experiments, which will generate deeper neuroscientific insights, which will in turn allow for better neural models. Repeated iteratively, this positive feedback loop in which better models beget better understanding of biology and vice versa will help close the gap between the computer and the brain.
The computer networks and hardware that emerge from this process have the potential to achieve higher computing efficiency, approaching or perhaps surpassing the efficiency of the human brain; provide the foundation for new approaches to artificial intelligence and machine learning within a spike-based computing paradigm; and widen our understanding of brain function.

    Swarm Robotics

    Collectively working robot teams can solve a problem more efficiently than a single robot, while also providing robustness and flexibility to the group. The swarm robotics model is a key component of a cooperative algorithm that controls the behaviors and interactions of all individuals. The robots in the swarm should have some basic functions, such as sensing, communicating, and monitoring, and satisfy the following properties