
    Gravitational Clustering: A Simple, Robust and Adaptive Approach for Distributed Networks

    Distributed signal processing for wireless sensor networks enables different devices to cooperate on various signal processing tasks. A crucial first step is to answer the question: who observes what? Recently, several distributed algorithms have been proposed that frame the signal/object labelling problem in terms of cluster analysis after extracting source-specific features; however, the number of clusters is assumed to be known. We propose a new method, called Gravitational Clustering (GC), to adaptively estimate the time-varying number of clusters based on a set of feature vectors. The key idea is to exploit the physical principle of gravitational force between mass units: streaming-in feature vectors are treated as mass units with fixed positions in the feature space, around which mobile mass units are injected at each time instant. Cluster enumeration exploits the fact that the highest attraction on the mobile mass units is exerted by regions with a high density of feature vectors, i.e., gravitational clusters. By sharing estimates among neighboring nodes via a diffusion-adaptation scheme, cooperative and distributed cluster enumeration is achieved. Numerical experiments concerning robustness against outliers, convergence and computational complexity are conducted. An application in a distributed cooperative multi-view camera network illustrates the applicability to real-world problems. Comment: 12 pages, 9 figures
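As an illustrative sketch of the gravitational principle described above (not the authors' implementation; `gravitational_step`, its step size and its regulariser are hypothetical choices), each mobile mass unit can be updated by summing inverse-square attractions toward the fixed feature vectors:

```python
import numpy as np

def gravitational_step(fixed, mobile, step=0.05, eps=1e-6):
    """One update of the mobile mass units.

    fixed  : (n, d) streamed-in feature vectors, modelled as static unit masses
    mobile : (m, d) mobile mass units that drift toward dense regions
    """
    for i in range(len(mobile)):
        diff = fixed - mobile[i]                      # vectors toward each fixed mass
        dist2 = np.sum(diff ** 2, axis=1) + eps       # squared distances, regularised
        force = (diff / dist2[:, None]).sum(axis=0)   # inverse-square attraction
        mobile[i] = mobile[i] + step * force / len(fixed)
    return mobile
```

Because dense regions exert the strongest pull, repeated steps make the mobile units settle inside the gravitational clusters; counting their distinct resting positions then yields the cluster-number estimate.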

    Visual / acoustic detection and localisation in embedded systems

    © Cranfield University. The continuous miniaturisation of sensing and processing technologies is offering an increasing variety of embedded platforms, enabling a broad range of tasks to be accomplished with such systems. Motivated by these advances, this thesis investigates embedded detection and localisation solutions using vision and acoustic sensors. Focus is particularly placed on surveillance applications using sensor networks. Existing vision-based detection solutions for embedded systems suffer from sensitivity to environmental conditions; in the literature, there seems to be no algorithm able to simultaneously tackle all the challenges inherent to real-world videos. Regarding the acoustic modality, many research works have investigated acoustic source localisation solutions in distributed sensor networks. Nevertheless, it remains challenging to develop an efficient algorithm that deals with the experimental issues, approaches the performance required by these systems, and performs the data processing in a distributed and robust manner. The movement of scene objects is generally accompanied by sound emissions with features that vary from one environment to another. Therefore, combining the visual and acoustic modalities offers a significant opportunity for improving detection and/or localisation on the described platforms. In light of this framework, the first part of the thesis investigates a cost-effective vision-based method that deals robustly with motion detection in static, dynamic and moving background conditions. For motion detection in static and dynamic backgrounds, we present the development and performance analysis of a spatio-temporal form of the Gaussian mixture model. The problem of motion detection in moving backgrounds is addressed by accounting for registration errors in the captured images.
By adopting a robust optimisation technique that takes into account the uncertainty in the visual measurements, we show that high detection accuracy can be achieved. In the second part of this thesis, we investigate solutions to the problem of acoustic source localisation using a trust-region-based optimisation technique. The proposed method shows higher overall accuracy and improved convergence compared to a linear-search-based method. More importantly, we show that by characterising the errors in measurements, a common problem for such platforms, higher localisation accuracy can be attained. The last part of this work studies the different possibilities for combining visual and acoustic information in a distributed sensor network. In this context, we first propose to include the acoustic information in the visual model; the resulting augmented model provides promising improvements in the detection and localisation processes. The second investigated solution consists of fusing the measurements coming from the different sensors. An evaluation of localisation and tracking accuracy using a centralised/decentralised architecture is conducted in various scenarios and experimental conditions. Results show the capability of this fusion approach to yield higher accuracy in the localisation and tracking of an active acoustic source than using a single type of data.
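As a minimal sketch of the kind of optimisation involved (a hypothetical Levenberg-Marquardt loop, whose adaptive damping acts as a simple trust-region control; this is not the thesis's algorithm, and `localise` and its parameters are invented for illustration), range-based source localisation can be written as:

```python
import numpy as np

def localise(sensors, ranges, x0, iters=50, lam=1.0):
    """Estimate a 2-D source position from range measurements at known sensors."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        d = np.linalg.norm(sensors - x, axis=1)
        r = d - ranges                        # residuals of the range model
        J = (x - sensors) / d[:, None]        # Jacobian of ||x - s_i|| w.r.t. x
        A = J.T @ J + lam * np.eye(len(x))    # damped normal equations
        x_new = x + np.linalg.solve(A, -J.T @ r)
        d_new = np.linalg.norm(sensors - x_new, axis=1)
        if np.sum((d_new - ranges) ** 2) < np.sum(r ** 2):
            x, lam = x_new, lam * 0.5         # good step: enlarge the trust region
        else:
            lam *= 2.0                        # bad step: shrink the trust region
    return x
```

A robust variant, in the spirit of the thesis's error characterisation, would additionally downweight measurements with large residuals before solving the damped system.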

    Real-time acoustic event classification in urban environments using low-cost devices

    In the modern and ever-evolving society, the presence of noise has become a daily threat to a worrying amount of the population. Being overexposed to high levels of noise may interfere with day-to-day activities and, thus, could potentially bring severe side-effects in terms of health such as annoyance, cognitive impairment in children or cardiovascular diseases. Some studies point out that it is not only the level of noise that matters but also the type of sound that the citizens are exposed to. That is, not all the acoustic events have the same impact on the population. With current technologies used to track noise levels, for both private and public administrations, it is hard to automatically identify which sounds are more present in most polluted areas. Actually, to assess citizen complaints, technicians are typically sent to the area to be surveyed to evaluate if the complaint is relevant.
Due to the high number of complaints generated every day (especially in highly populated areas), the development of Wireless Acoustic Sensor Networks (WASN) that automatically monitor the noise pollution of a certain area has become a research trend. Currently, most of the networks deployed in cities measure only the equivalent noise level, by means of expensive but highly accurate hardware, and cannot identify the noise sources present in each spot. Given the elevated price of these sensors, nodes are typically placed in specific locations but do not monitor wide areas. The purpose of this thesis is to address an important challenge still latent in this field: to acoustically monitor large-scale areas in real time, in a scalable and cost-efficient way. In this regard, the city centre of Barcelona has been selected as a reference use-case scenario to conduct this research. First, this dissertation starts with an accurate analysis of an annotated dataset of 6 hours corresponding to the soundscape of a specific area of the city (l’Eixample). Next, a scalable distributed architecture using low-cost computing devices to recognize acoustic events is presented. To validate the feasibility of this approach, a deep learning algorithm running on top of this architecture has been implemented to classify 10 different acoustic categories. As the sensing nodes of the proposed system are arranged so as to take advantage of physical redundancy (that is, more than one node may hear the same acoustic event), data has been gathered in four spots of the city centre of Barcelona respecting the sensor topology. Finally, as real-world events tend to occur simultaneously, the deep learning algorithm has been enhanced to support multilabel (i.e., polyphonic) classification. Results show that, with the proposed system architecture, it is possible to classify acoustic events in real time.
Overall, the contributions of this research are the following: (1) the design of a low-cost, scalable WASN able to monitor large-scale areas, and (2) the development of a real-time classification algorithm running on the designed sensing nodes.
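To illustrate what the multilabel (polyphonic) decision stage means in practice, here is a minimal hypothetical sketch: each acoustic class receives an independent sigmoid score that is thresholded, so several events can be declared active at once, unlike a softmax classifier that picks a single winner. The function name and threshold are illustrative, not the thesis's exact network head:

```python
import numpy as np

def multilabel_decision(logits, threshold=0.5):
    """Turn per-class logits into a multi-hot activity vector."""
    probs = 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=float)))  # sigmoids
    return (probs >= threshold).astype(int)  # independent per-class decisions
```

For example, logits of [2.0, -1.0, 0.5, -3.0] for four classes (say traffic, siren, birds, music) would mark the first and third class as simultaneously active.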

    Robust Distributed Multi-Source Detection and Labeling in Wireless Acoustic Sensor Networks

    The growing demand for complex signal processing methods in low-energy, large-scale wireless acoustic sensor networks (WASNs) urges a shift to a new information and communication technologies (ICT) paradigm. The emerging research vision aspires to wireless network communication in which multiple heterogeneous devices with different interests can cooperate in various signal processing tasks (MDMT). The contributions of this doctoral thesis focus on distributed multi-source detection and labeling applied to audio enhancement scenarios, pursuing MDMT-fashioned node-specific source-of-interest signal enhancement in WASNs. In fact, accurate detection and labeling is a prerequisite for the MDMT paradigm, in which nodes in the WASN effectively communicate their sources of interest so that multiple signal processing tasks can be enhanced via cooperation. First, a novel framework based on a dominant-source model in distributed WASNs is introduced for resolving the activity detection of multiple speech sources in a reverberant and noisy environment. A preliminary rank-one multiplicative non-negative independent component analysis (M-NICA) for unique dominant energy-source extraction given associated node clusters is presented. Partitional algorithms that minimize the within-cluster mean absolute deviation (MAD) and weighted-MAD objectives are proposed to determine the cluster membership of the unmixed energies, and thus establish source-specific voice activity recognition. In a second study, improving the energy signal separation to ease the multiple-source activity discrimination task is targeted. Sparsity-inducing penalties are enforced on iterative rank-one singular value decomposition layers to extract sparse right rotations. Then, sparse non-negative blind energy separation is realized using multiplicative updates.
Hence, the multiple-source detection problem is converted into sparse non-negative source energy decorrelation. Sparsity tunes the supposedly non-active energy signatures to exactly zero-valued energies, so that active energies are easier to identify and an activity detector can be constructed in a straightforward manner. In a centralized scenario, the activity decision is controlled by a fusion center that delivers a binary source activity detection for every participating energy source. This strategy gives precise detection results for small source numbers. With a growing number of interfering sources, the distributed detection approach is more promising. Conjointly, a robust distributed energy separation algorithm for multiple competing sources is proposed. A robust and regularized $t_{\nu}M$-estimation of the covariance matrix of the mixed energies is employed. This approach yields a simple activity decision using only the robustly unmixed energy signatures of the sources in the WASN. The performance of the robust activity detector is validated with a distributed adaptive node-specific signal estimation method for speech enhancement. The latter enhances the quality and intelligibility of the signal while exploiting the accurately estimated multi-source voice decision patterns. In contrast to the original M-NICA for source separation, the binary activity patterns extracted with the robust energy separation significantly improve the node-specific signal estimation. Due to the increased computational complexity caused by the additional step of energy signal separation, a new approach to the detection question in multi-device multi-source networks is presented. Stability selection for iterative extraction of robust right singular vectors is considered. The sub-sampling selection technique provides transparency in properly choosing the regularization variable of the Lasso optimization problem.
In this way, the strongest sparse right singular vectors, obtained using a robust $\ell_1$-norm and stability selection, form the set of basis vectors that describe the input data efficiently. Active/non-active source classification is achieved with a robust Mahalanobis classifier, for which a robust M-estimator of the covariance matrix in the Mahalanobis distance is utilized. Extensive evaluation in centralized and distributed settings is performed to assess the effectiveness of the proposed approach. Thus, the computationally demanding source separation scheme can be bypassed by exploiting robust stability selection for sparse multi-energy feature extraction. With respect to the labeling problem of various sources in a WASN, a robust approach is introduced that exploits the direction-of-arrival of the impinging source signals. A short-time Fourier transform-based subspace method estimates the angles of locally stationary wideband signals using a uniform linear array. The median of the angles estimated at every frequency bin is utilized to obtain the overall angle for each participating source. The features, in this case, exploit the similarity across devices in the particular frequency bins that produce reliable direction-of-arrival estimates for each source, where reliability is defined with respect to the median across frequencies. All source-specific frequency bands that contribute to correctly estimated angles are selected. A feature vector is formed for every source at each device by storing the frequency-bin indices that lie within the upper and lower interval of the median-absolute-deviation scale of the estimated angle. Labeling is accomplished by distributed clustering of the extracted angle-based feature vectors using consensus averaging.
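The median-and-MAD selection of reliable frequency bins described above can be sketched as follows (a hypothetical minimal version; the names and the cutoff `k` are illustrative choices, and `c ≈ 1.4826` is the usual consistency factor that makes the MAD comparable to a standard deviation under Gaussian noise):

```python
import numpy as np

def doa_feature(bin_angles, c=1.4826, k=2.0):
    """Overall source angle and indices of the reliable frequency bins.

    bin_angles : per-frequency-bin direction-of-arrival estimates (degrees)
    """
    angles = np.asarray(bin_angles, dtype=float)
    theta = np.median(angles)                      # overall angle for the source
    scale = c * np.median(np.abs(angles - theta))  # MAD scale of the estimates
    reliable = np.flatnonzero(np.abs(angles - theta) <= k * scale)
    return theta, reliable
```

The indices of the reliable bins then serve as the angle-based feature vector that is clustered across devices via consensus averaging.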

    Unifying terrain awareness for the visually impaired through real-time semantic segmentation

    Navigational assistance aims to help visually-impaired people move about the environment safely and independently. This topic is challenging, as it requires detecting a wide variety of scenes to provide higher-level assistive awareness. Vision-based technologies with monocular detectors or depth sensors have emerged over several years of research. These separate approaches have achieved remarkable results with relatively low processing time and have improved the mobility of impaired people to a large extent. However, running all detectors jointly increases the latency and burdens the computational resources. In this paper, we propose leveraging pixel-wise semantic segmentation to cover navigation-related perception needs in a unified way. This is critical not only for terrain awareness regarding traversable areas, sidewalks, stairs and water hazards, but also for the avoidance of short-range obstacles, fast-approaching pedestrians and vehicles. The core of our unification proposal is a deep architecture aimed at attaining efficient semantic understanding. We have integrated the approach in a wearable navigation system by incorporating robust depth segmentation. A comprehensive set of experiments demonstrates accuracy superior to state-of-the-art methods while maintaining real-time speed. We also present a closed-loop field test involving real visually-impaired users, demonstrating the effectiveness and versatility of the assistive framework.

    State of the art of audio- and video based solutions for AAL

    Working Group 3: Audio- and Video-based AAL Applications. It is a matter of fact that Europe is facing more and more crucial challenges regarding health and social care due to demographic change and the current economic context. The recent COVID-19 pandemic has stressed this situation even further, thus highlighting the need for taking action. Active and Assisted Living (AAL) technologies come as a viable approach to help face these challenges, thanks to their high potential for enabling remote care and support. Broadly speaking, AAL can be referred to as the use of innovative and advanced Information and Communication Technologies to create supportive, inclusive and empowering applications and environments that enable older, impaired or frail people to live independently and stay active longer in society. AAL capitalizes on the growing pervasiveness and effectiveness of sensing and computing facilities to supply persons in need with smart assistance, by responding to their necessities of autonomy, independence, comfort, security and safety. The application scenarios addressed by AAL are complex, due to the inherent heterogeneity of the end-user population, their living arrangements, and their physical conditions or impairments. Despite aiming at diverse goals, AAL systems should share some common characteristics. They are designed to provide support in daily life in an invisible, unobtrusive and user-friendly manner. Moreover, they are conceived to be intelligent, able to learn and adapt to the requirements and requests of the assisted people, and to synchronise with their specific needs. Nevertheless, to ensure the uptake of AAL in society, potential users must be willing to use AAL applications and to integrate them in their daily environments and lives. In this respect, video- and audio-based AAL applications have several advantages in terms of unobtrusiveness and information richness.
Indeed, cameras and microphones are far less obtrusive than wearable sensors, which may hinder one’s activities. In addition, a single camera placed in a room can record most of the activities performed in the room, thus replacing many other non-visual sensors. Currently, video-based applications are effective in recognising and monitoring the activities, the movements, and the overall conditions of the assisted individuals, as well as in assessing their vital parameters (e.g., heart rate, respiratory rate). Similarly, audio sensors have the potential to become one of the most important modalities for interaction with AAL systems, as they can have a large sensing range, do not require physical presence at a particular location, and are physically intangible. Moreover, relevant information about individuals’ activities and health status can be derived from processing audio signals (e.g., speech recordings). Nevertheless, as the other side of the coin, cameras and microphones are often perceived as the most intrusive technologies from the viewpoint of the privacy of the monitored individuals. This is due to the richness of the information these technologies convey and the intimate settings where they may be deployed. Solutions able to ensure privacy preservation by context and by design, as well as to ensure high legal and ethical standards, are in high demand. After the review of the current state of play and the discussion in GoodBrother, we may claim that the first solutions in this direction are starting to appear in the literature. A multidisciplinary debate among experts and stakeholders is paving the way towards AAL that ensures ergonomics, usability, acceptance and privacy preservation. The DIANA, PAAL, and VisuAAL projects are examples of this fresh approach. This report provides the reader with a review of the most recent advances in audio- and video-based monitoring technologies for AAL.
It has been drafted as a collective effort of WG3 to supply an introduction to AAL, its evolution over time and its main functional and technological underpinnings. In this respect, the report contributes to the field with the outline of a new generation of ethics-aware AAL technologies and a proposal for a novel comprehensive taxonomy of AAL systems and applications. Moreover, the report allows non-technical readers to gather an overview of the main components of an AAL system and how these function and interact with the end-users. The report illustrates the state of the art of the most successful AAL applications and functions based on audio and video data, namely (i) lifelogging and self-monitoring, (ii) remote monitoring of vital signs, (iii) emotional state recognition, (iv) food intake monitoring, activity and behaviour recognition, (v) activity and personal assistance, (vi) gesture recognition, (vii) fall detection and prevention, (viii) mobility assessment and frailty recognition, and (ix) cognitive and motor rehabilitation. For these application scenarios, the report illustrates the state of play in terms of scientific advances, available products and research projects. The open challenges are also highlighted. The report ends with an overview of the challenges, the hindrances and the opportunities posed by the uptake of AAL technologies in real-world settings. In this respect, the report illustrates the current procedural and technological approaches to coping with acceptability, usability and trust in AAL technology, by surveying strategies and approaches to co-design, to privacy preservation in video and audio data, to transparency and explainability in data processing, and to data transmission and communication. User acceptance and ethical considerations are also debated. Finally, the potential of the silver economy is overviewed.