33 research outputs found

    Wildlife Communication

    This report contains a progress report for the Ph.D. project titled “Wildlife Communication”. The project focuses on investigating how signal processing and pattern recognition can be used to improve wildlife management in agriculture. Wildlife management systems in use today suffer from habituation by wild animals, which makes them ineffective. An intelligent wildlife management system could monitor its own effectiveness and alter its scaring strategy accordingly.

    Smart Monitoring of Manufacturing Systems for Automated Decision-Making: A Multi-Method Framework

    Smart monitoring plays a principal role in the intelligent automation of manufacturing systems. Advanced data collection technologies, such as sensors, have been widely used to facilitate real-time data collection. Computationally efficient analysis of the systems in operation, however, remains relatively underdeveloped and requires more attention. Inspired by the capabilities of signal analysis and information visualization, this study proposes a multi-method framework for the smart monitoring of manufacturing systems and intelligent decision-making. The proposed framework processes machine signals collected by noninvasive sensors: the signals are filtered and classified to determine the operational status and performance measures, which in turn inform the appropriate managerial actions for the detected anomalies. Numerical experiments based on real data show the practicability of the developed monitoring framework, and the results support the accuracy of the method. Applying the developed approach in other manufacturing environments is a worthwhile topic for future research.
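
    As a concrete, hypothetical illustration of such a filter-then-classify monitoring step, the sketch below band-pass filters a sensor window, extracts a few summary features, and feeds them to a pre-trained classifier. The feature set, cut-off frequencies, and the choice of a random-forest classifier are assumptions for illustration only, not the multi-method framework proposed in the study.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, welch
from sklearn.ensemble import RandomForestClassifier

def bandpass(signal, fs, lo=10.0, hi=1000.0):
    """Suppress drift and high-frequency noise before feature extraction."""
    sos = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
    return sosfiltfilt(sos, signal)

def features(signal, fs):
    """Simple time- and frequency-domain descriptors of one machine-signal window."""
    f, pxx = welch(signal, fs=fs, nperseg=1024)
    rms = np.sqrt(np.mean(signal ** 2))
    centroid = np.sum(f * pxx) / np.sum(pxx)
    return np.array([rms, centroid, pxx.max()])

def classify_status(window, fs, clf: RandomForestClassifier):
    """Illustrative usage: clf is assumed to be fitted beforehand on labelled
    windows (e.g., "idle", "normal load", "anomalous")."""
    x = features(bandpass(window, fs), fs).reshape(1, -1)
    return clf.predict(x)[0]
```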

    Unsupervised Methods for Condition-Based Maintenance in Non-Stationary Operating Conditions

    Maintenance and operation of modern dynamic engineering systems require the use of robust maintenance strategies that are reliable under uncertainty. One such strategy is condition-based maintenance (CBM), in which maintenance actions are determined based on the current health of the system. The CBM framework integrates fault detection and forecasting in the form of degradation modeling to provide real-time reliability, as well as valuable insight into the future health of the system. Coupled with a modern information platform such as the Internet of Things (IoT), CBM can deliver these critical functionalities at scale. The increasingly complex design and operation of engineering systems have introduced novel problems to CBM. Characteristics of these systems - such as the unavailability of historical data or highly dynamic operating behaviour - have rendered many existing solutions infeasible. These problems have motivated the development of new and self-sufficient - in other words, unsupervised - CBM solutions. The issue, however, is that many of the methods required by such frameworks have yet to be proposed in the literature. Key gaps pertaining to the lack of suitable unsupervised approaches for the pre-processing of non-stationary vibration signals, parameter estimation for fault detection, and degradation threshold estimation need to be addressed in order to achieve an effective implementation. The main objective of this thesis is to propose a set of three novel approaches to address each of the aforementioned knowledge gaps. A non-parametric pre-processing and spectral analysis approach, termed spectral mean shift clustering (S-MSC) - which applies mean shift clustering (MSC) to the short-time Fourier transform (STFT) power spectrum for simultaneous de-noising and extraction of time-varying harmonic components - is proposed for the autonomous analysis of non-stationary vibration signals. A second pre-processing approach, termed Gaussian mixture model operating state decomposition (GMM-OSD) - which uses GMMs to cluster multi-modal vibration signals by their respective, unknown operating states - is proposed to address multi-modal non-stationarity. Applied in conjunction with S-MSC, these two approaches form a robust and unsupervised pre-processing framework tailored to the types of signals found in modern engineering systems. The final approach proposed in this thesis is a degradation detection and fault prediction framework, termed the Bayesian one-class support vector machine (B-OCSVM), which tackles the key knowledge gaps pertaining to unsupervised parameter and degradation threshold estimation by re-framing the traditional fault detection and degradation modeling problem as a degradation detection and fault prediction problem. Validation of the three approaches is performed across a wide range of machinery vibration data sets and applications, including data obtained from two full-scale field pilots at Toronto Pearson International Airport: the first on the gearbox of the LINK Automated People Mover (APM) train, and the second on a subset of passenger boarding tunnel pre-conditioned air (PCA) units in Terminal 1. Validation found that the proposed pre-processing approaches, and the combined pre-processing framework, provide a robust and computationally efficient methodology for the analysis of non-stationary vibration signals in unsupervised CBM. Validation of the B-OCSVM framework showed that the proposed parameter estimation approaches enable earlier detection of the degradation process than existing approaches, and that the proposed degradation threshold provides a reasonable estimate of the fault manifestation point. Taken together, the approaches proposed in this thesis provide a crucial step towards the effective implementation of unsupervised CBM in complex, modern engineering systems.
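
    To make the S-MSC idea more concrete, the sketch below applies off-the-shelf mean shift clustering to the high-energy points of an STFT power spectrum, which is the core intuition described above. It is a minimal illustration under assumed parameters (window length, energy quantile, standardised time-frequency coordinates), not the thesis's actual S-MSC, GMM-OSD, or B-OCSVM implementations.

```python
import numpy as np
from scipy.signal import stft
from sklearn.cluster import MeanShift
from sklearn.preprocessing import StandardScaler

def sketch_spectral_mean_shift(x, fs, energy_quantile=0.95):
    """Group high-energy STFT points into time-varying spectral tracks.

    Illustrates the idea of clustering the STFT power spectrum; the
    thesis's S-MSC method is more elaborate than this sketch.
    """
    f, t, Z = stft(x, fs=fs, nperseg=1024)
    power = np.abs(Z) ** 2
    # Crude de-noising: keep only the strongest time-frequency points.
    mask = power > np.quantile(power, energy_quantile)
    tt, ff = np.meshgrid(t, f)                      # shapes match `power`
    points = np.column_stack([tt[mask], ff[mask]])  # (time, frequency) pairs
    # Standardise the axes so the clustering bandwidth is scale-independent,
    # then let mean shift group nearby points into harmonic-like tracks.
    labels = MeanShift().fit_predict(StandardScaler().fit_transform(points))
    return points, labels
```

    Each resulting cluster approximates one time-varying spectral component; in the unsupervised CBM setting described above, such tracks would then feed a downstream detector (for example, a one-class SVM) rather than manual inspection.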

    Model-Based Speech Enhancement

    A method of speech enhancement is developed that reconstructs clean speech from a set of acoustic features using a harmonic plus noise model of speech. This is a significant departure from traditional filtering-based methods of speech enhancement. A major challenge with this approach is to accurately estimate the acoustic features (voicing, fundamental frequency, spectral envelope and phase) from noisy speech. This is achieved using maximum a-posteriori (MAP) estimation methods that operate on the noisy speech. In each case a prior model of the relationship between the noisy speech features and the estimated acoustic feature is required. These models are approximated using speaker-independent GMMs of the clean speech features, which are adapted to speaker-dependent models using MAP adaptation and to the noise conditions using the unscented transform. Objective results are presented to optimise the proposed system, and a set of subjective tests compares the approach with traditional enhancement methods. Three-way listening tests examining signal quality, background noise intrusiveness and overall quality show the proposed system to be highly robust to noise, performing significantly better than conventional methods of enhancement in terms of background noise intrusiveness. However, the proposed method is shown to reduce signal quality, with overall quality measured to be roughly equivalent to that of the Wiener filter.
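
    As a concrete illustration of the harmonic plus noise model underlying this approach, the sketch below synthesises a single frame as a sum of harmonics of an estimated fundamental frequency plus a stochastic component. The envelope function, the noise gain, and the omission of the estimated phase are simplifying assumptions for illustration; the actual system estimates these features from noisy speech via the MAP methods described above.

```python
import numpy as np

def harmonic_plus_noise_frame(f0, envelope, fs, n_samples,
                              voiced=True, noise_gain=0.05, rng=None):
    """Synthesise one frame from a simplified harmonic-plus-noise model.

    `envelope` is a callable mapping frequency in Hz to linear amplitude,
    standing in for the estimated spectral envelope; the estimated phase
    is ignored here for brevity.
    """
    rng = np.random.default_rng() if rng is None else rng
    t = np.arange(n_samples) / fs
    frame = np.zeros(n_samples)
    if voiced and f0 > 0:
        # Harmonic part: sinusoids at multiples of f0 up to the Nyquist rate.
        for k in range(1, int((fs / 2) // f0) + 1):
            frame += envelope(k * f0) * np.cos(2 * np.pi * k * f0 * t)
    # Noise part: white noise as a crude stand-in for the stochastic component.
    return frame + noise_gain * rng.standard_normal(n_samples)

# Example: a 20 ms voiced frame at 120 Hz with a gently decaying envelope.
frame = harmonic_plus_noise_frame(120.0, lambda f: 1.0 / (1.0 + f / 1000.0),
                                  fs=16000, n_samples=320)
```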

    Adaptation of Speaker and Speech Recognition Methods for the Automatic Screening of Speech Disorders using Machine Learning

    This PhD thesis presents methods for exploiting the non-verbal communication of individuals suffering from specific diseases or health conditions, with the aim of screening for them automatically. More specifically, we employed one of the pillars of non-verbal communication, paralanguage, to explore techniques that can be used to model the speech of subjects. Paralanguage is a non-lexical component of communication that relies on intonation, pitch, speaking rate, and other cues, and it can be processed and analyzed in an automatic manner. This is the field of Computational Paralinguistics, which can be defined as the study of modeling the non-verbal latent patterns within a speaker's speech by means of computational algorithms; these patterns go beyond the linguistic content. By means of machine learning, we present models from distinct paralinguistic and pathological-speech scenarios that are capable of automatically estimating the health status of subjects with respect to conditions such as Alzheimer's, Parkinson's, and clinical depression, among others.
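
    The sketch below illustrates the general shape of such a computational-paralinguistics pipeline: summarise each utterance with simple prosodic and spectral statistics, then fit a standard classifier. The feature set, the use of librosa and an SVM, and all parameter values are illustrative assumptions rather than the specific models developed in the thesis.

```python
import numpy as np
import librosa
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def paralinguistic_features(path):
    """Summarise one recording with simple prosodic and spectral statistics."""
    y, sr = librosa.load(path, sr=16000)
    f0 = librosa.yin(y, fmin=60, fmax=400, sr=sr)        # pitch contour
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)   # spectral shape
    rms = librosa.feature.rms(y=y)                       # energy contour
    prosody = np.array([f0.mean(), f0.std(), rms.mean(), rms.std()])
    return np.concatenate([prosody, mfcc.mean(axis=1), mfcc.std(axis=1)])

def train_screening_model(paths, labels):
    """Fit an illustrative screening classifier on labelled recordings."""
    X = np.stack([paralinguistic_features(p) for p in paths])
    return make_pipeline(StandardScaler(), SVC(kernel="rbf")).fit(X, labels)
```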

    State of the art of audio- and video-based solutions for AAL

    Working Group 3. Audio- and Video-based AAL Applications. It is a matter of fact that Europe is facing more and more crucial challenges regarding health and social care due to the demographic change and the current economic context. The recent COVID-19 pandemic has stressed this situation even further, thus highlighting the need for taking action. Active and Assisted Living (AAL) technologies come as a viable approach to help facing these challenges, thanks to the high potential they have in enabling remote care and support. Broadly speaking, AAL can be referred to as the use of innovative and advanced Information and Communication Technologies to create supportive, inclusive and empowering applications and environments that enable older, impaired or frail people to live independently and stay active longer in society. AAL capitalizes on the growing pervasiveness and effectiveness of sensing and computing facilities to supply the persons in need with smart assistance, by responding to their necessities of autonomy, independence, comfort, security and safety. The application scenarios addressed by AAL are complex, due to the inherent heterogeneity of the end-user population, their living arrangements, and their physical conditions or impairment. Despite aiming at diverse goals, AAL systems should share some common characteristics. They are designed to provide support in daily life in an invisible, unobtrusive and user-friendly manner. Moreover, they are conceived to be intelligent, to be able to learn and adapt to the requirements and requests of the assisted people, and to synchronise with their specific needs. Nevertheless, to ensure the uptake of AAL in society, potential users must be willing to use AAL applications and to integrate them in their daily environments and lives. In this respect, video- and audio-based AAL applications have several advantages, in terms of unobtrusiveness and information richness. Indeed, cameras and microphones are far less obtrusive with respect to the hindrance other wearable sensors may cause to one’s activities. In addition, a single camera placed in a room can record most of the activities performed in the room, thus replacing many other non-visual sensors. Currently, video-based applications are effective in recognising and monitoring the activities, the movements, and the overall conditions of the assisted individuals, as well as in assessing their vital parameters (e.g., heart rate, respiratory rate). Similarly, audio sensors have the potential to become one of the most important modalities for interaction with AAL systems, as they have a large sensing range, do not require physical presence at a particular location and are physically intangible. Moreover, relevant information about individuals’ activities and health status can derive from processing audio signals (e.g., speech recordings). Nevertheless, as the other side of the coin, cameras and microphones are often perceived as the most intrusive technologies from the viewpoint of the privacy of the monitored individuals. This is due to the richness of the information these technologies convey and the intimate setting where they may be deployed. Solutions able to ensure privacy preservation by context and by design, as well as to ensure high legal and ethical standards, are in high demand. After the review of the current state of play and the discussion in GoodBrother, we may claim that the first solutions in this direction are starting to appear in the literature. A multidisciplinary debate among experts and stakeholders is paving the way towards AAL that ensures ergonomics, usability, acceptance and privacy preservation. The DIANA, PAAL, and VisuAAL projects are examples of this fresh approach. This report provides the reader with a review of the most recent advances in audio- and video-based monitoring technologies for AAL. It has been drafted as a collective effort of WG3 to supply an introduction to AAL, its evolution over time and its main functional and technological underpinnings. In this respect, the report contributes to the field with the outline of a new generation of ethical-aware AAL technologies and a proposal for a novel comprehensive taxonomy of AAL systems and applications. Moreover, the report allows non-technical readers to gather an overview of the main components of an AAL system and how these function and interact with the end-users. The report illustrates the state of the art of the most successful AAL applications and functions based on audio and video data, namely (i) lifelogging and self-monitoring, (ii) remote monitoring of vital signs, (iii) emotional state recognition, (iv) food intake monitoring, activity and behaviour recognition, (v) activity and personal assistance, (vi) gesture recognition, (vii) fall detection and prevention, (viii) mobility assessment and frailty recognition, and (ix) cognitive and motor rehabilitation. For these application scenarios, the report illustrates the state of play in terms of scientific advances, available products and research projects. The open challenges are also highlighted. The report ends with an overview of the challenges, the hindrances and the opportunities posed by the uptake of AAL technologies in real-world settings. In this respect, the report illustrates the current procedural and technological approaches to cope with acceptability, usability and trust in AAL technology, by surveying strategies and approaches to co-design, to privacy preservation in video and audio data, to transparency and explainability in data processing, and to data transmission and communication. User acceptance and ethical considerations are also debated. Finally, the potential offered by the silver economy is overviewed.

    Models and analysis of vocal emissions for biomedical applications

    This book of Proceedings collects the papers presented at the 3rd International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, MAVEBA 2003, held 10-12 December 2003 in Firenze, Italy. The workshop is organised every two years and aims to stimulate contacts between specialists active in research and industrial development in the area of voice analysis for biomedical applications. The scope of the Workshop includes all aspects of voice modelling and analysis, ranging from fundamental research to all kinds of biomedical applications and related established and advanced technologies.

    Computer audition for emotional wellbeing

    This thesis is focused on the application of computer audition (i.e., machine listening) methodologies for monitoring states of emotional wellbeing. Computer audition is a growing field and has been successfully applied to an array of use cases in recent years. There are several advantages to audio-based computational analysis; for example, audio can be recorded non-invasively, stored economically, and can capture rich information on happenings in a given environment, e.g., human behaviour. With this in mind, maintaining emotional wellbeing is a challenge for humans, and emotion-altering conditions, including stress and anxiety, have become increasingly common in recent years. Such conditions manifest in the body, inherently changing how we express ourselves. Research shows these alterations are perceivable within vocalisation, suggesting that speech-based audio monitoring may be valuable for developing artificially intelligent systems that target improved wellbeing. Furthermore, computer audition applies machine learning and other computational techniques to audio understanding, and so by combining computer audition with applications in the domain of computational paralinguistics and emotional wellbeing, this research concerns the broader field of empathy for Artificial Intelligence (AI). To this end, speech-based audio modelling that incorporates and understands paralinguistic wellbeing-related states may be a vital cornerstone for improving the degree of empathy that an artificial intelligence has. To summarise, this thesis investigates the extent to which speech-based computer audition methodologies can be utilised to understand human emotional wellbeing. A fundamental background on the fields in question as they pertain to emotional wellbeing is first presented, followed by an outline of the applied audio-based methodologies. Next, detail is provided for several machine learning experiments focused on emotional wellbeing applications, including analysis and recognition of under-researched phenomena in speech, e.g., anxiety and markers of stress. Core contributions from this thesis include the collection of several related datasets, hybrid fusion strategies for an emotional gold standard, novel machine learning strategies for data interpretation, and an in-depth acoustic-based computational evaluation of several human states. All of these contributions focus on ascertaining the advantage of audio in the context of modelling emotional wellbeing. Given the sensitive nature of human wellbeing, the ethical implications involved with developing and applying such systems are discussed throughout.
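
    One of the listed contributions, deriving an emotional gold standard from multiple annotators, can be illustrated with a simple fusion scheme. The sketch below weights each annotator's continuous ratings by their agreement with the group mean, in the spirit of an evaluator-weighted estimator; it is a generic, assumed illustration, not the hybrid fusion strategy developed in the thesis.

```python
import numpy as np

def fuse_annotations(ratings):
    """Fuse per-annotator emotion ratings into a single gold standard.

    ratings: array of shape (n_annotators, n_frames) with continuous
    emotion ratings (e.g., arousal over time). Annotators are weighted
    by how well they correlate with the plain mean; this is a generic
    illustration, not the thesis's hybrid fusion strategy.
    """
    ratings = np.asarray(ratings, dtype=float)
    mean_rating = ratings.mean(axis=0)
    # Correlation of each annotator with the mean of all annotators.
    weights = np.array([np.corrcoef(r, mean_rating)[0, 1] for r in ratings])
    weights = np.clip(np.nan_to_num(weights), 0.0, None)  # drop anti-correlated raters
    if weights.sum() == 0:
        return mean_rating                                 # fall back to the plain mean
    weights /= weights.sum()
    return weights @ ratings
```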