
    Blind Source Separation for the Processing of Contact-Less Biosignals

    (Spatio-temporal) Blind Source Separation (BSS) offers large potential for processing distorted multichannel biosignal measurements in the context of novel contact-less recording techniques, separating distortions from the cardiac signal of interest. This potential can only be utilized in practice (1) if a BSS model is applied that matches the complexity of the measurement, i.e. the signal mixture, and (2) if permutation indeterminacy is solved among the BSS output components, i.e. the component of interest can be selected automatically. The present work, first, designs a framework to assess the efficacy of BSS algorithms in the context of the camera-based photoplethysmogram (cbPPG) and characterizes multiple BSS algorithms accordingly. Recommendations for algorithm selection under specific mixture characteristics are derived. Second, the work develops and evaluates concepts to solve permutation indeterminacy for BSS outputs of contact-less electrocardiogram (ECG) recordings. A novel approach based on sparse coding is shown to outperform the existing concepts based on higher-order moments and frequency-domain features.
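
    As a rough illustration of the two requirements above, the following sketch applies a standard BSS algorithm (ICA) to a multichannel recording and then resolves permutation indeterminacy with a simple frequency-domain heuristic: the component with the largest relative power in the heart-rate band is taken as the cardiac signal. The library choices (scikit-learn's FastICA, SciPy's Welch estimator) and the band-power criterion are illustrative assumptions, not the algorithms evaluated in the thesis.

        # Hedged sketch: ICA-based BSS followed by automated component selection
        # via relative power in the heart-rate band (assumption, not the thesis method).
        import numpy as np
        from scipy.signal import welch
        from sklearn.decomposition import FastICA

        def separate_and_select(X, fs, hr_band=(0.7, 3.0)):
            """X: (n_samples, n_channels) multichannel measurement; fs: sampling rate in Hz."""
            ica = FastICA(n_components=X.shape[1], random_state=0)
            S = ica.fit_transform(X)                      # estimated source components
            scores = []
            for k in range(S.shape[1]):
                f, pxx = welch(S[:, k], fs=fs, nperseg=min(1024, S.shape[0]))
                in_band = (f >= hr_band[0]) & (f <= hr_band[1])
                scores.append(pxx[in_band].sum() / pxx.sum())  # relative cardiac-band power
            best = int(np.argmax(scores))
            return S[:, best], best, scores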

    Potential use of electronic noses, electronic tongues and biosensors, as multisensor systems for spoilage examination in foods

    The development and use of reliable and precise detection systems in the food supply chain must be considered to ensure the maximum level of food safety and quality for consumers. Spoilage is a challenging concern in food safety as it is a threat to public health and is accordingly taken seriously in food hygiene. Although some procedures and detection methods are already available for the determination of spoilage in food products, these traditional methods have limitations and drawbacks as they are time-consuming, labour-intensive and relatively expensive. Therefore, there is an urgent need for the development of rapid, reliable, precise and inexpensive systems to be used in the food supply and production chain as monitoring devices to detect metabolic alterations in foodstuffs. Attention to instrumental detection systems such as electronic noses, electronic tongues and biosensors coupled with chemometric approaches has greatly increased because they have been demonstrated to be a promising alternative for detecting and monitoring food spoilage. This paper mainly focuses on the recent developments and the application of such multisensor systems in the food industry. Furthermore, the most common traditional methods for food spoilage detection are also introduced in this context. The challenges and future trends of the potential use of the systems are also discussed. Based on the published literature, encouraging reports demonstrate that such systems are indeed the most promising candidates for the detection and monitoring of spoilage microorganisms in different foodstuffs.
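
    The chemometric coupling mentioned above can be pictured as a simple pipeline: reduce the multisensor response to a few latent dimensions, then classify fresh versus spoiled samples. The sketch below uses synthetic data and a PCA plus logistic-regression pipeline purely as an illustration; real e-nose/e-tongue workflows and the fresh/spoiled labelling are application-specific.

        # Hedged sketch of a chemometric pipeline for multisensor spoilage screening.
        # Data, labels and model choice are hypothetical placeholders.
        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import train_test_split
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler

        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 16))            # 200 samples x 16 gas-sensor responses (synthetic)
        y = (X[:, :4].sum(axis=1) + rng.normal(scale=0.5, size=200) > 0).astype(int)  # 1 = spoiled

        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
        model = make_pipeline(StandardScaler(), PCA(n_components=5), LogisticRegression())
        model.fit(X_tr, y_tr)
        print("held-out accuracy:", model.score(X_te, y_te))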

    An Image fusion algorithm for spatially enhancing spectral mixture maps

    An image fusion algorithm, based upon spectral mixture analysis, is presented. The algorithm combines low spatial resolution multi/hyperspectral data with high spatial resolution sharpening image(s) to create high resolution material maps. Spectral (un)mixing estimates the percentage of each material (called endmembers) within each low resolution pixel. The outputs of unmixing are endmember fraction images (material maps) at the spatial resolution of the multispectral system. This research includes developing an improved unmixing algorithm based upon stepwise regression. In the second stage of the process, the unmixing solution is sharpened with data from another sensor to generate high resolution material maps. Sharpening is implemented as a nonlinear optimization using the same type of model as unmixing. Quantifiable results are obtained through the use of synthetically generated imagery. Without synthetic images, a large amount of ground truth would be required in order to measure the accuracy of the material maps. Multiple band sharpening is easily accommodated by the algorithm, and the results are demonstrated at multiple scales. The analysis includes an examination of the effects of constraints and texture variation on the material maps. The results show stepwise unmixing is an improvement over traditional unmixing algorithms. The results also indicate sharpening improves the material maps. The motivation for this research is to take advantage of the next generation of multi/hyperspectral sensors. Although the hyperspectral images will be of modest to low resolution, fusing them with high resolution sharpening images will produce a higher spatial resolution land cover or material map.
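
    The unmixing stage described above can be summarised with a generic baseline: estimate per-pixel endmember fractions by constrained least squares against a known endmember library. The sketch below uses non-negative least squares as a stand-in; it is not the stepwise-regression unmixing or the sharpening optimization developed in this work.

        # Hedged sketch of linear spectral unmixing with a non-negativity constraint.
        import numpy as np
        from scipy.optimize import nnls

        def unmix(pixels, endmembers):
            """pixels: (n_pixels, n_bands); endmembers: (n_endmembers, n_bands) spectra."""
            E = endmembers.T                              # (n_bands, n_endmembers)
            fractions = np.zeros((pixels.shape[0], endmembers.shape[0]))
            for i, spectrum in enumerate(pixels):
                f, _ = nnls(E, spectrum)                  # non-negative abundance estimate
                fractions[i] = f / f.sum() if f.sum() > 0 else f   # sum-to-one normalisation
            return fractions                              # one row of material fractions per pixel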

    Color in scientific visualization: Perception and image-based data display

    Visualization is the transformation of information into a visual display that enhances users' understanding and interpretation of the data. This thesis project has investigated the use of color and human vision modeling for visualization of image-based scientific data. Two preliminary psychophysical experiments were first conducted on uniform color patches to analyze the perception and understanding of different color attributes, which provided psychophysical evidence and guidance for the choice of color space/attributes for color encoding. Perceptual color scales were then designed for univariate and bivariate image data display and their effectiveness was evaluated through three psychophysical experiments. Some general guidelines were derived for effective color scale design. Extending to high-dimensional data, two visualization techniques were developed for hyperspectral imagery. The first approach takes advantage of the underlying relationships between PCA/ICA of hyperspectral images and the human opponent color model, and maps the first three PCs or ICs to several opponent color spaces including CIELAB, HSV, YCbCr, and YUV. The gray world assumption was adopted to automatically set the mapping origins. The rendered images are well color balanced and can offer a first-look capability or initial classification for a wide variety of spectral scenes. The second approach combines a true color image and a PCA image based on a biologically inspired visual attention model that simulates the center-surround structure of visual receptive fields as the difference between fine and coarse scales. The model was extended to take into account human contrast sensitivity and include high-level information such as the second order statistical structure in the form of a local variance map, in addition to low-level features such as color, luminance, and orientation. It generates a topographic saliency map for both the true color image and the PCA image; a difference map is then derived and used as a mask to select interesting locations where the PCA image has more salient features than available in the visible bands. The resulting representations preserve the consistent natural appearance of the scene, while the selected attentional locations may be analyzed by more advanced algorithms.
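
    The first technique (PCs or ICs mapped into an opponent color space with a gray-world origin) can be approximated in a few lines of code. The sketch below projects a hyperspectral cube onto its first three principal components and maps them through a YCbCr-style transform; the exact color spaces, scaling and gray-world centring used in the thesis are only approximated here.

        # Hedged sketch: first three PCs of a hyperspectral cube rendered as color.
        import numpy as np
        from sklearn.decomposition import PCA

        def pcs_to_color(cube):
            """cube: (rows, cols, bands) hyperspectral image -> (rows, cols, 3) uint8 RGB."""
            r, c, b = cube.shape
            pcs = PCA(n_components=3).fit_transform(cube.reshape(-1, b))
            pcs -= pcs.mean(axis=0)                       # gray-world-style centring of the origin
            pcs /= np.abs(pcs).max(axis=0) + 1e-12        # scale each PC to roughly [-1, 1]
            y = 0.5 * (pcs[:, 0] + 1.0)                   # PC1 as luma in [0, 1]
            cb, cr = 0.5 * pcs[:, 1], 0.5 * pcs[:, 2]     # PC2/PC3 as chroma offsets
            rgb = np.stack([y + 1.402 * cr,
                            y - 0.344 * cb - 0.714 * cr,
                            y + 1.772 * cb], axis=1)      # inverse YCbCr-style transform
            return (np.clip(rgb, 0.0, 1.0).reshape(r, c, 3) * 255).astype(np.uint8)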

    Sensor fusion with Gaussian processes

    This thesis presents a new approach to multi-rate sensor fusion for (1) user matching and (2) position stabilisation and lag reduction. Measurements from the Microsoft Kinect sensor and the inertial sensors in a mobile device are fused with a Gaussian Process (GP) prior method. We present a Gaussian Process prior model-based framework for multisensor data fusion and explore the use of this model for fusing mobile inertial sensors and an external position sensing device. The Gaussian Process prior model provides a principled mechanism for incorporating the low-sampling-rate position measurements and the high-sampling-rate derivatives in multi-rate sensor fusion, which takes account of the uncertainty of each sensor type. We explore the complementary properties of the Kinect sensor and the built-in inertial sensors in a mobile device and apply the GP framework for sensor fusion in the mobile human-computer interaction area. The Gaussian Process prior model-based sensor fusion is presented as a principled probabilistic approach to dealing with position uncertainty and the lag of the system, which are critical for indoor augmented reality (AR) and other location-aware sensing applications. The sensor fusion helps increase the stability of the position and reduce the lag. This is of great benefit for improving the usability of a human-computer interaction system. We develop two applications using the novel and improved GP prior model. (1) User matching and identification: we apply the GP model to identify individual users by matching the observed Kinect skeletons with the sensed inertial data from their mobile devices. (2) Position stabilisation and lag reduction in a spatially aware display application for user performance improvement. We conduct a user study. Experimental results show improved accuracy of target selection and reduced delay from the sensor fusion system, allowing users to acquire the target more rapidly and with fewer errors in comparison with the Kinect-filtered system. Users also reported improved performance in the subjective questions. The two applications can be combined seamlessly in a proxemic interaction system, as identification of people and their positions in a room-sized environment plays a key role in proxemic interactions.
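
    The multi-rate idea can be illustrated with a GP regression in which each sensor stream contributes observations with its own noise level: sparse but accurate external position fixes and dense but noisier inertially derived positions are regressed jointly, and the posterior provides a fused, uncertainty-aware estimate at any query time. The kernel, rates and noise values below are assumptions for illustration, not the model of the thesis.

        # Hedged sketch of multi-rate sensor fusion with a Gaussian Process prior.
        import numpy as np
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.gaussian_process.kernels import RBF

        rng = np.random.default_rng(1)
        t_ext = np.linspace(0.0, 5.0, 11)                 # ~2 Hz external position fixes (Kinect-like)
        t_imu = np.linspace(0.0, 5.0, 101)                # ~20 Hz positions derived from inertial data
        truth = lambda t: np.sin(1.5 * t)                 # unknown true trajectory (for the demo)
        y_ext = truth(t_ext) + rng.normal(scale=0.02, size=t_ext.size)   # accurate but slow
        y_imu = truth(t_imu) + rng.normal(scale=0.15, size=t_imu.size)   # fast but noisy

        t_all = np.concatenate([t_ext, t_imu])[:, None]
        y_all = np.concatenate([y_ext, y_imu])
        noise = np.concatenate([np.full(t_ext.size, 0.02 ** 2), np.full(t_imu.size, 0.15 ** 2)])

        gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=noise, normalize_y=True)
        gp.fit(t_all, y_all)
        t_query = np.linspace(0.0, 5.0, 200)[:, None]
        pos_mean, pos_std = gp.predict(t_query, return_std=True)   # fused estimate with uncertainty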

    A multimodal pattern recognition framework for speaker detection

    Speaker detection is an important component of a speech-based user interface. Audiovisual speaker detection, speech and speaker recognition, or speech synthesis, for example, find multiple applications in human-computer interaction, multimedia content indexing, biometrics, etc. Generally speaking, any interface which relies on speech for communication requires an estimate of the user's speaking state (i.e. whether or not he/she is speaking to the system) for its reliable functioning. One therefore needs to identify the speaker and discriminate them from other users or background noise. A human observer would perform such a task very easily, although this decision results from a complex cognitive process referred to as decision-making. Generally speaking, this process starts with the acquisition of information about the environment through each of the five senses. The brain then integrates these multiple sources of information. An amazing property of this multi-sensory integration by the brain, as pointed out by cognitive sciences, is the perception of stimuli of different modalities as originating from a single source, provided they are synchronized in space and time. A speaker is a bimodal source emitting jointly an auditory signal and a visual signal (the motion of the articulators during speech production). The two signals are obviously co-occurring spatio-temporally. This interesting property allows us, as human observers, to discriminate between a speaking mouth and a mouth whose motion is not related to the auditory signal. This dissertation deals with the modelling of such complex decision-making, using a pattern recognition procedure. A pattern recognition process comprises all the stages of an investigation, from data acquisition to classification and assessment of the results. In the audiovisual speaker detection problem, tackled more specifically in this thesis, the data are acquired using only one microphone and one camera. The pattern recognizer integrates and combines these two modalities to perform the detection and is therefore denoted as "multimodal". This multimodal approach is expected to increase the performance of the system. But it also raises many questions, such as what should be fused, when in the decision process this fusion should take place, and how it is to be achieved. This thesis provides answers to each of these issues through the proposition of detailed solutions for each step of the classification process. The basic principle is to evaluate the synchrony between the audio and video features extracted from potentially speaking mouths, in order to classify each mouth as speaking or not. This synchrony is evaluated through a mutual-information-based function. A key to success is the extraction of suitable features. The audiovisual data are then processed through an information-theoretic feature extraction framework after having been acquired and represented in a tractable way. This feature extraction framework uses the two modalities jointly in a feature-level fusion scheme. This way, the information originating from the common source is recovered while the independent noise is discarded. This approach is shown to minimize the probability of committing an error on the source estimate. These optimal features are used as inputs to the classifier, defined through a hypothesis testing approach. Using the two modalities jointly, it outputs a single decision about the class label of each candidate mouth region ("speaker" or "non-speaker").
Therefore, the acoustic and visual information are combined at both the feature and the decision levels, so that we can speak of a hybrid fusion method. The hypothesis testing approach provides the means to evaluate the performance of the classifier itself, but also of the whole pattern recognition system. In particular, the added value offered by the feature extraction step can be assessed. The framework is first applied with a particular emphasis on the audio modality: the information-theoretic feature extraction addresses the optimization of the audio features using the video information jointly. As a result, audio features specific to speech production are produced. The system evaluation framework establishes that using these features as input to the classifier increases its discrimination power with respect to equivalent non-optimized features. Then the enhancement of the video content is addressed more specifically. The mouth motion is obviously the suitable video representation for handling a task such as speaker detection. However, only an estimate of this motion, the optical flow, can be obtained. This estimation relies on the intensity gradient of the image sequence. Graph theory is used to establish a probabilistic model of the relationships between the audio, the motion and the image intensity gradient, in the particular case of a speaking mouth. The interpretation of this model leads back to the optimization function defined for the information-theoretic feature extraction. As a result, a scale-space approach is proposed for estimating the optical flow, where the strength of the smoothness constraint is controlled via a mutual-information-based criterion involving both the audio and the video information. First results are promising, even if more extensive tests should be carried out, particularly in noisy conditions. As a conclusion, this thesis proposes a complete pattern recognition framework dedicated to audiovisual speaker detection that minimizes the probability of misclassifying a mouth as "speaker" or "non-speaker". The importance of fusing the audio and video content as early as the feature level is demonstrated through the system evaluation stage included in the pattern recognition process.
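
    The central synchrony measure can be sketched as a mutual-information score between an audio energy envelope and a mouth-motion feature, with a threshold deciding "speaker" versus "non-speaker". The histogram-based estimator and the threshold below are simplifying assumptions, not the information-theoretic feature extraction or hypothesis test developed in the thesis.

        # Hedged sketch: audio-visual synchrony via histogram-based mutual information.
        import numpy as np
        from sklearn.metrics import mutual_info_score

        def synchrony_score(audio_energy, mouth_motion, bins=16):
            """Both inputs: 1-D arrays sampled at the same (video) rate."""
            a = np.digitize(audio_energy, np.histogram_bin_edges(audio_energy, bins))
            v = np.digitize(mouth_motion, np.histogram_bin_edges(mouth_motion, bins))
            return mutual_info_score(a, v)                # higher = more audio-visual synchrony

        def is_speaking(audio_energy, mouth_motion, threshold=0.2):
            return synchrony_score(audio_energy, mouth_motion) > threshold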

    Probabilistic Human-Robot Information Fusion

    This thesis is concerned with combining the perceptual abilities of mobile robots and human operators to execute tasks cooperatively. It is generally agreed that a synergy of human and robotic skills offers an opportunity to enhance the capabilities of today’s robotic systems, while also increasing their robustness and reliability. Systems which incorporate both human and robotic information sources have the potential to build complex world models, essential for both automated and human decision making. In this work, humans and robots are regarded as equal team members who interact and communicate on a peer-to-peer basis. Human-robot communication is addressed using probabilistic representations common in robotics. While communication can in general be bidirectional, this work focuses primarily on human-to-robot information flow. More specifically, the approach advocated in this thesis is to let robots fuse their sensor observations with observations obtained from human operators. While robotic perception is well-suited for lower level world descriptions such as geometric properties, humans are able to contribute perceptual information on higher abstraction levels. Human input is translated into the machine representation via Human Sensor Models. A common mathematical framework for humans and robots reinforces the notion of true peer-to-peer interaction. Human-robot information fusion is demonstrated in two application domains: (1) scalable information gathering, and (2) cooperative decision making. Scalable information gathering is experimentally demonstrated on a system comprising a ground vehicle, an unmanned air vehicle, and two human operators in a natural environment. Information from humans and robots was fused in a fully decentralised manner to build a shared environment representation on multiple abstraction levels. Results are presented in the form of information exchange patterns, qualitatively demonstrating the benefits of human-robot information fusion. The second application domain adds decision making to the human-robot task. Rational decisions are made based on the robots’ current beliefs, which are generated by fusing human and robotic observations. Since humans are considered a valuable resource in this context, operators are only queried for input when the expected benefit of an observation exceeds the cost of obtaining it. The system can be seen as adjusting its autonomy at run-time based on the uncertainty in the robots’ beliefs. A navigation task is used to demonstrate the adjustable autonomy system experimentally. Results from two experiments are reported: a quantitative evaluation of human-robot team effectiveness, and a user study comparing the system to classical teleoperation. Results show the superiority of the system with respect to performance, operator workload, and usability.
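
    The fusion and adjustable-autonomy ideas can be pictured with a toy discrete belief: robot and human observations enter a shared grid belief through Bayes' rule, and the human is queried only when the current uncertainty (here, belief entropy) outweighs the cost of asking. The grid, likelihoods and cost value are hypothetical; the thesis uses decentralised probabilistic representations and an expected-benefit criterion.

        # Hedged sketch: Bayesian fusion of robot and human observations over a grid belief.
        import numpy as np

        def fuse(belief, likelihood):
            posterior = belief * likelihood               # element-wise Bayes update over grid cells
            return posterior / posterior.sum()

        def should_query_human(belief, query_cost=1.0):
            entropy = -np.sum(belief * np.log(belief + 1e-12))   # uncertainty of the current belief
            return entropy > query_cost                   # ask only when uncertainty outweighs cost

        belief = np.full(25, 1.0 / 25)                    # uniform belief over a 5x5 grid of locations
        robot_like = np.exp(-0.5 * (np.arange(25) - 12.0) ** 2 / 9.0)   # robot sensor likelihood
        belief = fuse(belief, robot_like)
        if should_query_human(belief):
            human_like = np.where(np.arange(25) // 5 == 2, 1.0, 0.1)    # "it is in the middle row"
            belief = fuse(belief, human_like)
        print("most likely cell:", int(np.argmax(belief)))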