
    Constructing Robust Emotional State-based Feature with a Novel Voting Scheme for Multi-modal Deception Detection in Videos

    Deception detection is an important task that has become a hot research topic due to its potential applications. It can be applied in many areas, from national security (e.g., airport security, jurisprudence, and law enforcement) to real-life applications (e.g., business and computer vision). However, some critical problems remain and warrant further investigation. One of the significant challenges in deception detection is data scarcity: to date, only one multi-modal open benchmark dataset for human deception detection has been released, containing 121 video clips (61 deceptive and 60 truthful). Such a small amount of data can hardly drive deep neural network-based methods, so existing models often suffer from overfitting and low generalization ability. Moreover, the ground-truth data contains frames that are unusable for various reasons, a problem most of the literature has not addressed. Therefore, in this paper, we first design a series of data preprocessing methods to deal with these problems. We then propose a multi-modal deception detection framework that constructs our novel emotional state-based feature and uses the open toolkit openSMILE to extract features from the audio modality. We also design a voting scheme to combine the emotional state information obtained from the visual and audio modalities, and finally derive the novel emotion state transformation feature with our self-designed algorithms. In the experiments, we conduct a critical analysis and comparison of the proposed methods against state-of-the-art multi-modal deception detection methods. The results show a significant improvement in overall performance, with accuracy rising from 87.77% to 92.78% and ROC-AUC from 0.9221 to 0.9265. Comment: 8 pages, for AAAI23 publication
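    The paper's exact voting rule is not given in the abstract; the following is a minimal sketch, in Python, of how a weighted vote could fuse per-window emotion labels from the visual and audio modalities, and how an emotion-state transition sequence could then be derived. The label set, the 0.6 visual weight, and all function names are illustrative assumptions, not the paper's design.

```python
# Hypothetical fusion of per-frame visual emotion labels with per-segment
# audio emotion labels via a weighted majority vote, followed by a simple
# emotion-state transition encoding. Not the paper's actual algorithm.
from collections import Counter

def fuse_emotions(visual_labels, audio_labels, visual_weight=0.6):
    """Weighted vote over emotion labels from the two modalities."""
    votes = Counter()
    for label in visual_labels:                      # e.g. per-frame labels
        votes[label] += visual_weight / max(len(visual_labels), 1)
    for label in audio_labels:                       # e.g. per-segment labels
        votes[label] += (1.0 - visual_weight) / max(len(audio_labels), 1)
    return votes.most_common(1)[0][0]

def emotion_transitions(window_labels):
    """Encode a clip as the sequence of emotion-state changes, a rough
    stand-in for an 'emotion state transformation' feature."""
    return [(a, b) for a, b in zip(window_labels, window_labels[1:]) if a != b]

print(fuse_emotions(["fear", "fear", "neutral"], ["fear", "anger"]))  # fear
print(emotion_transitions(["neutral", "fear", "fear", "anger"]))
```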

    Quantitative Metrics for Evaluating Explanations of Video DeepFake Detectors

    The proliferation of DeepFake technology is a rising challenge in today's society, owing to more powerful and accessible generation methods. To counter this, the research community has developed detectors of ever-increasing accuracy. However, the ability to explain the decisions of such models to users is lagging behind and is treated as an accessory in large-scale benchmarks, despite being a crucial requirement for the correct deployment of automated tools for content moderation. We attribute the issue to the reliance on qualitative comparisons and the lack of established metrics. We describe a simple set of metrics to evaluate the visual quality and informativeness of explanations of video DeepFake classifiers from a human-centric perspective. With these metrics, we compare common approaches to improve explanation quality and discuss their effect on both classification and explanation performance on the recent DFDC and DFD datasets. Comment: Accepted at BMVC 2022, code repository at https://github.com/baldassarreFe/deepfake-detectio
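    The metric set itself is only summarised in the abstract, so below is a minimal sketch of one standard faithfulness-style measure for saliency explanations, a deletion curve: occlude the most-salient pixels first and track how fast the classifier's score drops. The `model` callable, the occlusion order, and the step count are assumptions, not the paper's definitions.

```python
# Deletion-curve score for an explanation heatmap: a lower mean score after
# progressively occluding the most-salient pixels = more faithful saliency.
import numpy as np

def deletion_score(model, frame, saliency, steps=20):
    h, w = saliency.shape
    order = np.argsort(saliency.ravel())[::-1]       # most salient first
    x, scores = frame.copy(), []
    chunk = max(1, (h * w) // steps)
    for i in range(steps):
        ys, xs = np.unravel_index(order[i * chunk:(i + 1) * chunk], (h, w))
        x[ys, xs] = 0.0                              # occlude this chunk
        scores.append(model(x))                      # fake-probability in [0, 1]
    return float(np.mean(scores))                    # mean of the deletion curve

# Toy usage: a dummy 'classifier' that just reports mean intensity.
frame = np.random.default_rng(0).random((32, 32))
print(deletion_score(lambda im: float(im.mean()), frame, saliency=frame))
```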

    Detecting Deceptive Dark-Pattern Web Advertisements for Blind Screen-Reader Users

    Advertisements have become commonplace on modern websites. While ads are typically designed for visual consumption, it is unclear how they affect blind users who interact with the ads using a screen reader. Existing research studies on non-visual web interaction predominantly focus on general web browsing; the specific impact of extraneous ad content on blind users' experience remains largely unexplored. To fill this gap, we conducted an interview study with 18 blind participants; we found that blind users are often deceived by ads that contextually blend in with the surrounding web page content. While ad blockers can address this problem via a blanket filtering operation, many websites are increasingly denying access if an ad blocker is active. Moreover, ad blockers often do not filter out internal ads injected by the websites themselves. Therefore, we devised an algorithm to automatically identify contextually deceptive ads on a web page. Specifically, we built a detection model that leverages a multi-modal combination of handcrafted and automatically extracted features to determine if a particular ad is contextually deceptive. Evaluations of the model on a representative test dataset and 'in-the-wild' random websites yielded F1 scores of 0.86 and 0.88, respectively.
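    The paper's concrete features and classifier are not listed in the abstract; the sketch below only illustrates the multi-modal idea of concatenating handcrafted context features with automatically extracted (e.g. text-embedding) features and training a standard classifier. The feature names, dimensions, and model choice are assumptions.

```python
# Hypothetical deceptive-ad detector: handcrafted page-context features
# concatenated with a learned embedding, fed to a gradient-boosted classifier.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
n_handcrafted, n_embedding = 3, 32     # e.g. font similarity, colour distance,
X = rng.normal(size=(200, n_handcrafted + n_embedding))  # link overlap + text vec
y = (X[:, 0] + X[:, 3] > 0).astype(int)  # synthetic labels: 1 = deceptive

clf = GradientBoostingClassifier().fit(X, y)
print(clf.predict(X[:5]))              # per-ad deceptive / benign decisions
```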

    Fine-grained Haptics: Sensing and Actuating Haptic Primary Colours (force, vibration, and temperature)

    This thesis discusses the development of a multimodal, fine-grained visual-haptic system for teleoperation and robotic applications. The system is primarily composed of two complementary components: an input device known as the HaptiTemp sensor (combining “Haptics” and “Temperature”), a novel thermosensitive GelSight-like sensor, and an output device, an untethered multimodal fine-grained haptic glove. The HaptiTemp sensor is a visuotactile sensor that can sense the haptic primary colours: force, vibration, and temperature. It has novel switchable UV markers that can be made visible using UV LEDs. The switchable-marker feature is a key novelty of the HaptiTemp because it allows tactile information to be analysed from gel deformation without impairing the ability to classify or recognise images, resolving the trade-off between marker density and capturing high-resolution images with a single sensor. The HaptiTemp sensor can measure vibrations by counting the number of blobs or pulses detected per unit time using a blob detection algorithm. For the first time, temperature detection was incorporated into a GelSight-like sensor, making the HaptiTemp a haptic primary colours sensor. The HaptiTemp can also perform rapid temperature sensing, with a 643 ms response time over the 31°C to 50°C range; this fast response is comparable to the withdrawal reflex response in humans, and it is the first time in the robotics community that a sensor can trigger a sensory impulse mimicking a human reflex. The HaptiTemp can also perform simultaneous temperature sensing and image classification using a machine vision camera, the OpenMV Cam H7 Plus, a capability that has not previously been reported or demonstrated by any tactile sensor. Because it can transmit tactile analysis and image classification results over wireless communication, the HaptiTemp can be used in teleoperation. The HaptiTemp sensor is the closest thing to human skin in tactile sensing, tactile pattern recognition, and rapid temperature response. In order to feel what the HaptiTemp sensor is touching from a distance, a corresponding output device, an untethered multimodal haptic hand wearable, was developed to actuate the haptic primary colours sensed by the HaptiTemp. This wearable communicates wirelessly and has fine-grained cutaneous feedback for feeling the edges or surfaces of the tactile images captured by the HaptiTemp sensor. It also provides gradient kinesthetic force feedback that can restrict finger movements based on the force estimated by the HaptiTemp sensor: a retractable string from an ID badge holder, equipped with mini servos that control the stiffness of the wire, is attached to each fingertip. Vibrations detected by the HaptiTemp sensor can be actuated by the tapping motion of the tactile pins or by a buzzing mini-vibration motor. There is also a tiny annular Peltier device, or ThermoElectric Generator (TEG), paired with a mini-vibration motor, forming thermo-vibro feedback in the palm area that can be activated by a ‘hot’ or ‘cold’ signal from the HaptiTemp sensor. The haptic primary colours can also be embedded in a VR environment and actuated by the multimodal hand wearable.
A VR application was developed to demonstrate rapid tactile actuation of edges, allowing the user to feel the contours of virtual objects. Collision detection scripts were embedded to activate the corresponding actuator in the multimodal haptic hand wearable whenever the tactile matrix simulator or hand avatar in VR collides with a virtual object. The TEG also gets warm or cold depending on the virtual object the participant has touched. Tests were conducted to explore virtual objects in 2D and 3D environments using Leap Motion control and a VR headset (Oculus Quest 2). Moreover, a fine-grained cutaneous feedback system was developed to feel the edges or surfaces of a tactile image, such as those captured by the HaptiTemp sensor, or to actuate tactile patterns on 2D or 3D virtual objects. The prototype is an exoskeleton-like glove with 16 tactile actuators (tactors) on each fingertip, 80 tactile pins in total, made from commercially available P20 Braille cells. Each tactor can be controlled individually, enabling the user to feel the edges or surfaces of images such as the high-resolution tactile images captured by the HaptiTemp sensor; this hand wearable can also enhance the immersive experience in a virtual reality environment. The tactors can be actuated in a tapping manner, creating a form of vibration feedback distinct from the buzzing produced by a mini-vibration motor, and the tactile pin height can be varied, creating a gradient of pressure on the fingertip. Finally, the integration of the high-resolution HaptiTemp sensor and the untethered multimodal, fine-grained haptic hand wearable is presented, forming a visuotactile system for sensing and actuating haptic primary colours. Force, vibration, and temperature sensing tests with corresponding force, vibration, and temperature actuating tests demonstrated a unified visual-haptic system. Aside from sensing and actuating haptic primary colours, touching the edges or surfaces of the tactile images captured by the HaptiTemp sensor was carried out using the fine-grained cutaneous feedback of the haptic hand wearable.
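    As a concrete illustration of the blob-counting vibration measurement described above for the HaptiTemp sensor, here is a minimal Python/OpenCV sketch; the detector settings, the pulse definition, and the frame rate are assumptions rather than the thesis's implementation.

```python
# Estimate a pulse rate from a GelSight-like image stream by counting marker
# blobs per frame and treating frame-to-frame count jumps as pulses.
import cv2
import numpy as np

detector = cv2.SimpleBlobDetector_create()

def pulses_per_second(frames, fps=60):
    """frames: 8-bit grayscale images from the gel camera."""
    counts = np.array([len(detector.detect(f)) for f in frames])
    pulses = np.sum(np.abs(np.diff(counts)) > 0)   # count changes = pulses
    return pulses * fps / max(len(frames), 1)

frames = [np.zeros((64, 64), np.uint8) for _ in range(12)]
print(pulses_per_second(frames))                   # 0.0 for blank frames
```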

    Navigation for automatic guided vehicles using omnidirectional optical sensing

    Thesis (M. Tech. (Engineering: Electrical)) -- Central University of Technology, Free State, 2013. Automatic Guided Vehicles (AGVs) are being used more frequently in manufacturing environments. These AGVs are navigated in many different ways, using multiple types of sensors to detect the environment: distance, obstacles, and a set route. Different algorithms or methods then use this environmental information for navigation and control of the AGV. One aim of the research was to develop a vision-based platform that could be easily reconfigured for alternative route applications. In this research, such environment-detecting sensors were replaced and/or minimised by a single omnidirectional webcam picture stream, using a custom-developed mirror and Perspex tube setup. The area of interest in each frame was extracted, saving computational resources and time. Using image processing, the vehicle was navigated along a predetermined route. Different edge detection and segmentation methods were investigated on this vision signal for route and sign navigation. Prewitt edge detection was eventually implemented, with Hough transforms used for border detection and Kalman filtering to minimise border-detection noise, keeping the vehicle on the navigated route. Reconfigurability was added to the route layout through coloured signs incorporated into the navigation process. The result was the control of a number of AGVs, each on its own designated colour-signed route, which the operator could reconfigure with no programming alteration or intervention. The YCbCr colour space was used to detect specific control signs for alternative colour route navigation. The resulting commands controlled the AGV through serial commands sent over a laptop's Universal Serial Bus (USB) port, with a PIC microcontroller interface board driving the motors by means of pulse width modulation (PWM). A complete MATLAB® software development platform was used, implementing written M-files, Simulink® models, masked function blocks, and .mat files for sourcing the workspace variables and generating executable files. This continuous development system lends itself to speedy evaluation and implementation of image processing options on the AGV. All the work done in the thesis was validated by simulations using actual data and by physical experimentation.
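    The thesis implemented this pipeline in MATLAB; as a rough orientation only, here is a minimal Python/OpenCV equivalent of the Prewitt-plus-Hough border detection step. The gradient threshold and Hough parameters are guesses, and the Kalman smoothing stage is omitted.

```python
# Prewitt edge detection followed by a probabilistic Hough transform to
# extract candidate route-border line segments from a grayscale frame.
import cv2
import numpy as np

def route_borders(gray):
    kx = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]], np.float32)  # Prewitt x
    ky = kx.T                                                        # Prewitt y
    gx = cv2.filter2D(gray.astype(np.float32), -1, kx)
    gy = cv2.filter2D(gray.astype(np.float32), -1, ky)
    edges = (np.hypot(gx, gy) > 100).astype(np.uint8) * 255
    return cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=50,
                           minLineLength=40, maxLineGap=5)

gray = np.zeros((100, 100), np.uint8)
gray[:, 50:] = 255                      # synthetic vertical border
print(route_borders(gray))              # near-vertical line segment(s)
```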

    The role of the magnocellular system in implicit cognition

    Includes bibliographical references. Implicit cognition paradigms, such as the IAT (Implicit Association Test), are not well understood and are frequently thought to involve ‘unconscious’ attitudes. Although there has been a theoretical shift away from psychoanalytic ideas about consciousness towards more cognitively orientated views, a mystique lingers. The magnocellular (M) system is thought to be involved in rapid but coarse information processing, which underlies certain kinds of automaticity in information processing, such as reading or visual object recognition. The question this research addressed concerns visual perceptual processes that are not easily controllable and occur without conscious effort. The role of the M system was investigated in word recognition, object recognition and race feature recognition in a series of experiments. There is evidence that the M system facilitates word recognition in terms of detection accuracy. An object recognition experiment replicated the research of Kveraga et al. (2007b), with similar results: when objects were presented under a condition favouring the M system, recognition accuracy was significantly better and reaction time significantly shorter than under a condition favouring the parvocellular (P) system. A series of IAT experiments replicated findings from species and race experiments, with results similar to those reported in the literature. The race IAT experiment was then adapted to use images biased towards either the M or the P visual system. As the M system appears to facilitate object recognition, probably via a top-down processing mechanism that may be associated with perceptual automaticity, it was predicted that IAT scores would not be affected in this condition. It was further predicted that when images were presented in a condition inhibiting M system functioning, IAT scores would be more neutral (suggesting less response bias). A trend supported this prediction, but the summary score analysis did not show a statistically significant difference.
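    The abstract does not describe the IAT scoring procedure; for orientation only, this is a minimal sketch of the conventional IAT D-score (difference in mean response latency between incompatible and compatible blocks, scaled by the pooled standard deviation), which is presumably close to the summary score analysed above.

```python
# Conventional IAT D-score: (mean incompatible RT - mean compatible RT)
# divided by the pooled standard deviation of all trials.
import numpy as np

def iat_d_score(compatible_rts, incompatible_rts):
    rts = np.concatenate([compatible_rts, incompatible_rts])
    return (np.mean(incompatible_rts) - np.mean(compatible_rts)) / rts.std(ddof=1)

# Positive D = response bias; values near zero would correspond to the more
# 'neutral' IAT scores predicted for the M-system-inhibited condition.
print(iat_d_score(np.array([650.0, 700, 620]), np.array([820.0, 790, 860])))
```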

    Detection and identification of registration and fishing gear in vessels

    Illegal, unreported and unregulated (IUU) fishing is a global menace to both marine ecosystems and sustainable fisheries. IUU products often come from fisheries lacking conservation and management measures, which allows the violation of bycatch limits or unreported catches. To counteract this issue, some countries have adopted vessel monitoring systems (VMS) to track and monitor the activities of fishing vessels. The VMS approach is not flawless, and as such there are still known cases of IUU fishing. The present work is part of the PT2020 project SeeItAll of the company Xsealence and falls within INOV's tasks, in which a monitoring system using video cameras at ports (non-boarded system) was developed to detect vessel registrations, recording the time each vessel enters or exits the port. A second system (boarded system) uses a camera placed on each vessel together with a machine learning algorithm and a CCTV system to detect and record fishing activities, for later comparison with the vessel's fishing report.
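    As a rough illustration of the non-boarded system's core step (reading a vessel registration from a port camera and timestamping it), here is a minimal sketch; the OpenCV-plus-pytesseract stack, the Otsu binarisation, and the single-line OCR mode are assumptions, not the project's actual implementation.

```python
# Hypothetical registration reader: binarise the frame and OCR a single
# text line, returning the text with an entry/exit timestamp.
from datetime import datetime, timezone

import cv2
import pytesseract

def read_registration(frame_bgr):
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    text = pytesseract.image_to_string(binary, config="--psm 7").strip()
    return text, datetime.now(timezone.utc)   # registration + port entry/exit time
```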

    Domain anomaly detection in machine perception: a system architecture and taxonomy

    We address the problem of anomaly detection in machine perception. The concept of domain anomaly is introduced as distinct from the conventional notion of anomaly used in the literature. We propose a unified framework for anomaly detection which exposes the multifaceted nature of anomalies, and we suggest effective mechanisms for identifying and distinguishing each facet as instruments for domain anomaly detection. The framework draws on the Bayesian probabilistic reasoning apparatus, which clearly defines concepts such as outlier, noise, distribution drift, novelty detection (object, object primitive), rare events, and unexpected events. Based on these concepts, we provide a taxonomy of domain anomaly events. One of the mechanisms helping to pinpoint the nature of an anomaly is based on detecting incongruence between contextual and non-contextual sensor(y) data interpretations. The proposed methodology has wide applicability and underpins, in a unified way, the anomaly detection applications found in the literature.
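    As a small worked example of the incongruence mechanism, the sketch below flags a domain anomaly when the class posterior implied by context diverges strongly from the posterior computed from the sensor data alone; the symmetrised-KL form and the threshold are illustrative assumptions, not the paper's formulation.

```python
# Flag incongruence between a contextual and a non-contextual interpretation
# by the symmetrised KL divergence of their class posteriors.
import numpy as np

def incongruence(p_context, p_sensor, eps=1e-12):
    p = np.clip(np.asarray(p_context, float), eps, 1.0)
    q = np.clip(np.asarray(p_sensor, float), eps, 1.0)
    return 0.5 * (np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

# Context says 'class 0'; the sensor-level reading says 'class 2'.
print(incongruence([0.9, 0.05, 0.05], [0.1, 0.1, 0.8]) > 1.0)  # True: anomaly
```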

    Detecting Deception Through Non-Verbal Behaviour

    The security protocols used at airport security checkpoints primarily aim to detect prohibited items, as well as malicious intent and its associated deception, in order to thwart threats. However, some of the security protocols in use are not substantiated by scientifically validated cues of deception. Instead, some protocols, such as the Screening of Passengers by Observation Techniques (SPOT) program, have been developed based on anecdotal evidence and invalid cues of deception. As such, the use of these protocols has received considerable criticism in recent years from government agencies, civil rights organisations and academia. These security protocols rely on security personnel's ability to infer intent from non-verbal behaviour, yet the literature suggests that the relationship between non-verbal cues and deception is unreliable and that people are poor at detecting deception. To improve our understanding of the validity of these protocols, this thesis used virtual reality to replicate a security checkpoint and explore whether there are valid cues of deception, specifically in an airport context. People's ability to identify whether others were behaving deceptively was assessed, as well as the factors that may inform decision-making. Chapter Four of this thesis found that the non-verbal cues of interest (segment displacement, centre of mass displacement, cadence, step length and speed) were not significantly different between honest and deceptive people. A verbal measure, response latency, was found to distinguish only between honest people and those who were deceptive about a future intention, but not those who were deceptive about having a prohibited item. In light of the use of non-verbal measures in practice despite the lack of scientific support, Chapters Five to Seven aimed to gain greater insight into people's deception detection capabilities. Their findings show that the ability to detect deception from non-verbal behaviour was no better than guessing. Specifically, Chapter Five found that deception detection accuracy was no different from chance levels. Six themes emerged as the factors used to inform decision-making: physical appearance, disposition, walking behaviour, body positioning, looking behaviour and upper limb movement, though a qualitative analysis revealed subjective interpretations of how the themes mapped onto deception. Chapter Six introduced two techniques of information reduction to assess whether accuracy could be improved above chance levels by lessening the impact of biasing factors; neither technique raised accuracy above chance. In Chapter Seven, eye tracking was used to assess the gaze patterns associated with the detection of deception. People looked at the legs more than at other areas of the body prior to decision-making, though only looking at the left arm and left hand was linked with accuracy: detection accuracy was poor overall, with looking at the left arm linked with reduced accuracy and looking at the left hand linked with increased accuracy. Overall, this thesis showed that the non-verbal cues assessed could not distinguish between honest and deceptive people. In the absence of valid cues, observers were not able to identify deception at a rate above chance, even with the reduction of potentially biasing factors.
The results of this thesis reinforce the idea that incorporating non-verbal measures into threat and deception detection protocols may not be warranted, given the dubious reliability and validity of such cues and the poor deception identification capabilities of observers relying on non-verbal behaviour.
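    The chance-level comparisons reported in Chapters Five to Seven boil down to testing a hit count against 50% guessing; a minimal sketch of such a test is below, with made-up counts for illustration only.

```python
# Two-sided binomial test: are 53 correct judgments out of 100 trials
# distinguishable from coin-flip guessing (p = 0.5)?
from scipy.stats import binomtest

result = binomtest(k=53, n=100, p=0.5, alternative="two-sided")
print(result.pvalue)   # large p-value: accuracy indistinguishable from chance
```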

    Coding Strategies for Genetic Algorithms and Neural Nets

    The interaction between coding and learning rules in neural nets (NNs), and between coding and genetic operators in genetic algorithms (GAs), is discussed. The underlying principle advocated is that similar things in "the world" should have similar codes. Similarity metrics are suggested for the coding of images and numerical quantities in neural nets, and for the coding of neural network structures in genetic algorithms. A principal component analysis of natural images yields receptive fields resembling horizontal and vertical edge and bar detectors. The orientation sensitivity of the "bar detector" components is found to match a psychophysical model, suggesting that the brain may make some use of principal components in its visual processing. Experiments are reported on the effects of different input and output codings on the accuracy of neural nets handling numeric data; simple analogue and interpolation codes prove most successful. Experiments on the coding of image data demonstrate the sensitivity of final performance to the internal structure of the net. The interaction between the coding of the target problem and the reproduction operators of mutation and recombination in GAs is discussed and illustrated, and the possibilities for using GAs to adapt aspects of NNs are considered. The permutation problem, which affects attempts to use GAs both to train net weights and to adapt net structures, is illustrated, and methods to reduce it are suggested. Empirical tests using a simulated net design problem to reduce evaluation times indicate that the permutation problem may not be as severe as has been thought, but suggest the utility of a sorting recombination operator that matches hidden units according to the number of connections they have in common. A number of experiments using GAs to design network structures are reported, both to specify a net to be trained from random weights and to prune a pre-trained net. Three different coding methods are tried, and various sorting recombination operators evaluated. The results indicate that appropriate sorting can be beneficial, but the effects are problem-dependent. It is shown that the GA tends to overfit the net to the particular set of test criteria, to the possible detriment of wider generalisation ability. A method of testing the ability of a GA to make progress in the presence of noise, by adding a penalty flag, is described.
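    As an illustration of the sorting recombination idea discussed above, here is a minimal sketch that aligns one parent's hidden units with the other's by greedily matching the most similar connection patterns before a uniform crossover; the binary connection-mask representation and the greedy matching are assumptions, not the thesis's exact operator.

```python
# Sorting recombination: reorder parent B's hidden units so each lines up
# with the parent-A unit sharing the most connections, then cross over.
import numpy as np

def sort_then_crossover(parent_a, parent_b, rng):
    """parent_*: (n_hidden, n_inputs) 0/1 connection masks."""
    available = list(range(len(parent_b)))
    order = []
    for unit in parent_a:   # greedy match by number of agreeing connections
        shared = [int(np.sum(unit == parent_b[j])) for j in available]
        order.append(available.pop(int(np.argmax(shared))))
    b_sorted = parent_b[order]
    mask = rng.random(parent_a.shape) < 0.5          # uniform crossover
    return np.where(mask, parent_a, b_sorted)

rng = np.random.default_rng(1)
a = rng.integers(0, 2, size=(4, 6))
b = rng.integers(0, 2, size=(4, 6))
print(sort_then_crossover(a, b, rng))
```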