6 research outputs found

    A novel micro-doppler coherence loss for deep learning radar applications

    Deep learning techniques are seeing increasing adoption for a wide range of micro-Doppler applications, where predictions need to be made from time-frequency signal representations. Most, if not all, of the reported applications focus on translating an existing deep learning framework to this new domain with no adjustment made to the objective function. This practice results in a missed opportunity to encourage the model to prioritize features that are particularly relevant for micro-Doppler applications. The paper therefore introduces a micro-Doppler coherence loss, minimized when the normalized power of micro-Doppler oscillatory components between input and output is matched. The experiments conducted on real data show that the application of the introduced loss results in models that are more resilient to noise.
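    The stated principle can be sketched as follows. This is a minimal NumPy illustration under an assumed reading of the abstract, not the paper's exact formulation: the oscillatory content of each Doppler bin is taken as the FFT along the time axis of the time-frequency map, its power is normalized to sum to one, and the normalized powers of input and output are compared.

```python
import numpy as np

def micro_doppler_coherence_loss(tf_in, tf_out, eps=1e-8):
    """Loss minimized when the normalized power of the oscillatory
    (cadence) components of two time-frequency maps is matched.
    Both inputs have shape (doppler_bins, time_steps)."""
    def normalized_cadence_power(tf):
        # FFT along time captures how each Doppler bin oscillates.
        cadence = np.fft.rfft(tf, axis=1)
        power = np.abs(cadence) ** 2
        # Normalizing total power makes the comparison scale-invariant.
        return power / (power.sum() + eps)
    p_in = normalized_cadence_power(tf_in)
    p_out = normalized_cadence_power(tf_out)
    # L1 distance between the two normalized power distributions.
    return np.abs(p_in - p_out).sum()

# Usage: identical maps give zero loss; rescaling barely changes it.
rng = np.random.default_rng(0)
spec = rng.standard_normal((64, 128))
loss_same = micro_doppler_coherence_loss(spec, spec)
loss_scaled = micro_doppler_coherence_loss(spec, 3.0 * spec)
```

The scale invariance is the point of the normalization: a denoising model is penalized for altering the relative strength of oscillatory components, not for changing overall signal power.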

    Modeling and predicting emotion in music

    With the explosion of vast and easily accessible digital music libraries over the past decade, there has been a rapid expansion of research towards automated systems for searching and organizing music and related data. Online retailers now offer vast collections of music, spanning tens of millions of songs, available for immediate download. While these online stores present a drastically different dynamic than the record stores of the past, consumers still arrive with the same request: recommendation of music that is similar to their tastes. For both recommendation and curation, the vast digital music libraries of today necessarily require powerful automated tools.

    The medium of music has evolved specifically for the expression of emotions, and it is natural for us to organize music in terms of its emotional associations. But while such organization is a natural process for humans, quantifying it empirically proves to be a very difficult task. Myriad features, such as harmony, timbre, interpretation, and lyrics, affect emotion, and the mood of a piece may also change over its duration. Furthermore, in developing automated systems to organize music in terms of emotional content, we are faced with a problem that oftentimes lacks a well-defined answer; there may be considerable disagreement regarding the perception and interpretation of the emotions of a song, or even ambiguity within the piece itself.

    Automatic identification of musical mood is a topic still in its early stages, though it has received increasing attention in recent years. Such work offers potential not just to revolutionize how we buy and listen to our music, but to provide deeper insight into the understanding of human emotions in general. This work seeks to relate core concepts from psychology to those of signal processing to understand how to extract information relevant to musical emotion from an acoustic signal. The methods discussed here survey existing features using psychology studies and develop new features using basis functions learned directly from magnitude spectra. Furthermore, this work presents a wide breadth of approaches to developing functional mappings between acoustic data and emotion-space parameters. Using these models, a framework is constructed for content-based modeling and prediction of musical emotion.

    Ph.D., Electrical Engineering -- Drexel University, 201
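    One member of the family of functional mappings described above can be sketched in a few lines. This is a toy illustration with made-up data, not the thesis's models: hypothetical per-song acoustic feature vectors are mapped to two emotion-space parameters (valence, arousal) by closed-form ridge regression.

```python
import numpy as np

def fit_ridge(X, Y, lam=1.0):
    """Closed-form ridge regression: W = (X'X + lam*I)^-1 X'Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

# Hypothetical data: 200 songs, 20 acoustic features (standing in for
# timbre/harmony descriptors), 2 emotion-space targets per song.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 20))
W_true = rng.standard_normal((20, 2))        # unknown true mapping
Y = X @ W_true + 0.01 * rng.standard_normal((200, 2))

W = fit_ridge(X, Y, lam=0.1)
pred = X @ W   # predicted (valence, arousal) for each song
```

A linear map is of course the simplest case; the point is only the shape of the problem, acoustic features in, emotion-space coordinates out.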

    Multimodal radar sensing for ambient assisted living

    Data acquired from health and behavioural monitoring of daily-life activities can be exploited to provide real-time medical and nursing services at affordable cost and higher efficiency. A variety of sensing technologies for this purpose have been developed and presented in the literature, for instance, wearable IMUs (Inertial Measurement Units) to measure the acceleration and angular speed of the person, cameras to record images or video sequences, PIR (pyroelectric infrared) sensors to detect the presence of the person based on the pyroelectric effect, and radar to estimate the distance and radial velocity of the person. Each sensing technology has pros and cons and may not be optimal for every task. It is possible to leverage the strengths of all these sensors through information fusion in a multimodal fashion. The fusion can take place at three different levels, namely, i) signal level, where commensurate data are combined, ii) feature level, where feature vectors of different sensors are concatenated, and iii) decision level, where the confidence levels or prediction labels of classifiers are used to generate a new output. For each level there are different fusion algorithms; the key challenge is choosing the best existing fusion algorithm and developing novel fusion algorithms that are more suitable for the current application. The fundamental contribution of this thesis is therefore exploring possible information fusion between radar, primarily FMCW (Frequency Modulated Continuous Wave) radar, and wearable IMUs, between distributed radar sensors, and between UWB impulse radar and a pressure sensor array. The objective is to sense and classify daily activity patterns, gait styles and micro-gestures, as well as to produce early warnings of high-risk events such as falls. Initially, only “snapshot” activities (a single activity within a short X-s measurement) were collected and analysed to verify the accuracy improvement due to information fusion.
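    The decision level of the three fusion levels listed above is the easiest to illustrate. The sketch below uses made-up classifier outputs and weights, not the thesis's algorithms: per-sensor class-probability vectors are combined by confidence-weighted averaging, a common form of soft fusion.

```python
import numpy as np

def soft_fusion(probs, weights):
    """Decision-level (soft) fusion: confidence-weighted average of
    per-sensor class-probability vectors. `probs` is a list of
    probability vectors, one per sensor; `weights` their confidences."""
    probs = np.asarray(probs, dtype=float)
    w = np.asarray(weights, dtype=float)
    # Weighted mean keeps the result a valid probability vector.
    return (w[:, None] * probs).sum(axis=0) / w.sum()

# Hypothetical outputs over classes (walk, sit, fall) from two sensors.
radar_p = [0.5, 0.1, 0.4]
imu_p = [0.2, 0.1, 0.7]
fused = soft_fusion([radar_p, imu_p], weights=[0.6, 0.4])
label = int(np.argmax(fused))
```

Hard fusion would instead combine only the predicted labels (e.g. by majority vote), discarding each classifier's confidence; the hybrid framework mentioned in the abstract draws on both.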
    Then continuous activities (activities performed one after another with random durations and transitions) were collected to simulate real-world scenarios. To overcome the drawbacks of the conventional sliding-window approach on continuous data, a Bi-LSTM (Bidirectional Long Short-Term Memory) network is proposed to identify the transitions between daily activities. Meanwhile, a hybrid fusion framework is presented to exploit the power of both soft and hard fusion. Moreover, a trilateration-based signal-level fusion method has been successfully applied to the range information of three UWB (Ultra-wideband) impulse radars, and the results show performance comparable to using micro-Doppler signatures, at the price of much lower computational load. For classifying “snapshot” activities, fusion between radar and wearables shows approximately 12% accuracy improvement compared to using radar only, whereas for classifying continuous activities and gaits, the proposed hybrid fusion and trilateration-based signal-level fusion improve accuracy by roughly 6.8% (from 89% to 95.8%) and 7.3% (from 85.4% to 92.7%), respectively.
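    The core of trilateration from three radar ranges can be sketched as a linear least-squares problem. This is an illustrative 2-D reconstruction with made-up sensor positions, not the thesis's exact method: subtracting pairs of range (circle) equations cancels the quadratic terms, leaving a linear system in the target coordinates.

```python
import numpy as np

def trilaterate(anchors, ranges):
    """Estimate a 2-D position from >= 3 anchor positions and measured
    ranges, by linearizing the circle equations against the first
    anchor and solving in the least-squares sense."""
    anchors = np.asarray(anchors, dtype=float)
    r = np.asarray(ranges, dtype=float)
    x1, y1 = anchors[0]
    A, b = [], []
    for (xi, yi), ri in zip(anchors[1:], r[1:]):
        # (x-x1)^2+(y-y1)^2=r1^2 minus (x-xi)^2+(y-yi)^2=ri^2 gives:
        A.append([2 * (xi - x1), 2 * (yi - y1)])
        b.append(r[0]**2 - ri**2 + xi**2 - x1**2 + yi**2 - y1**2)
    sol, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return sol

# Hypothetical radar positions and a target at (2.0, 1.5).
radars = [(0.0, 0.0), (5.0, 0.0), (0.0, 4.0)]
target = np.array([2.0, 1.5])
ranges = [np.linalg.norm(target - np.array(p)) for p in radars]
est = trilaterate(radars, ranges)  # ~ (2.0, 1.5)
```

Because only ranges (not Doppler spectrograms) are processed, a solver like this is far cheaper than micro-Doppler classification, which is consistent with the computational-load trade-off the abstract reports.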

    Image and Video Forensics

    Nowadays, images and videos have become the main modalities of information exchanged in everyday life, and their pervasiveness has led the image forensics community to question their reliability, integrity, confidentiality, and security. Multimedia content is generated in many different ways through the use of consumer electronics and high-quality digital imaging devices, such as smartphones, digital cameras, tablets, and wearable and IoT devices. The ever-increasing convenience of image acquisition has facilitated instant distribution and sharing of digital images on social platforms, generating a great amount of exchanged data. Moreover, the pervasiveness of powerful image editing tools has allowed the manipulation of digital images for malicious or criminal ends, up to the creation of synthesized images and videos with the use of deep learning techniques. In response to these threats, the multimedia forensics community has produced major research efforts regarding the identification of the source and the detection of manipulation. In all cases where images and videos serve as critical evidence (e.g., forensic investigations, fake-news debunking, information warfare, and cyberattacks), forensic technologies that help to determine the origin, authenticity, and integrity of multimedia content can become essential tools. This book aims to collect a diverse and complementary set of articles that demonstrate new developments and applications in image and video forensics to tackle new and serious challenges and to ensure media authenticity.

    Gaze-Based Human-Robot Interaction by the Brunswick Model

    We present a new paradigm for human-robot interaction based on social signal processing, and in particular on the Brunswick model. Originally, the Brunswick model deals with face-to-face dyadic interaction, assuming that the interactants communicate through a continuous exchange of non-verbal social signals in addition to the spoken messages. Social signals have to be interpreted through a proper recognition phase that considers visual and audio information. The Brunswick model makes it possible to quantitatively evaluate the quality of the interaction using statistical tools that measure how effective the recognition phase is. In this paper we cast this theory in a setting where one of the interactants is a robot; in this case, the recognition phases performed by the robot and by the human have to be revised with respect to the original model. The model is applied to Berrick, a recent open-source, low-cost robotic head platform, where gaze is the social signal to be considered.