
    ProsocialLearn: D2.5 evaluation strategy and protocols

    This document describes the evaluation strategy for assessing game effectiveness, market value impact, and ethics procedures, intended to drive the detailed planning of technical validation, short-term and longitudinal studies, and market viability tests.

    VoIP security - attacks and solutions

    Voice over IP (VoIP) technology is being extensively and rapidly deployed. Flexibility and cost efficiency are the key factors luring enterprises to transition to VoIP. However, security problems may surface with its widespread deployment. This article presents an overview of VoIP systems and their security issues. First, we briefly describe the basic VoIP architecture and its fundamental differences from the PSTN. Next, the basic VoIP protocols used for signaling and media transport are described, along with the corresponding defense mechanisms. Finally, current and potential VoIP attacks are discussed, together with the approaches that have been adopted to counter them.
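    The protocols such a survey centers on are SIP (RFC 3261) for signaling and RTP/SRTP for media transport. As a rough illustration of why signaling is attackable, the sketch below builds a minimal SIP INVITE in Python; the host and user names are hypothetical, and a real INVITE would carry an SDP media offer in its body.

```python
# Minimal sketch of a SIP INVITE as plain text, assuming a hypothetical
# proxy "sip.example.com" and users "alice"/"bob". SIP is the main VoIP
# signaling protocol; the media itself flows over RTP/SRTP on ports
# negotiated in the SDP body.
import uuid

def build_invite(caller: str, callee: str, proxy: str) -> str:
    call_id = uuid.uuid4().hex
    return "\r\n".join([
        f"INVITE sip:{callee}@{proxy} SIP/2.0",
        f"Via: SIP/2.0/UDP client.example.com:5060;branch=z9hG4bK{call_id[:8]}",
        "Max-Forwards: 70",
        f"From: <sip:{caller}@{proxy}>;tag={call_id[:6]}",
        f"To: <sip:{callee}@{proxy}>",
        f"Call-ID: {call_id}@client.example.com",
        "CSeq: 1 INVITE",
        "Content-Type: application/sdp",
        "Content-Length: 0",  # a real INVITE carries an SDP media offer here
        "", "",
    ])

# Because these headers travel in clear text over UDP by default, they are
# easy to spoof or eavesdrop on; hence defenses such as SIP over TLS
# ("sips:") for signaling and SRTP for the media path.
print(build_invite("alice", "bob", "sip.example.com"))
```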

    Audio-visual speech activity detection in a two-speaker scenario incorporating depth information from a profile or frontal view

    Motivated by the increasing popularity of depth visual sensors, such as the Kinect device, we investigate the utility of depth information in audio-visual speech activity detection. A two-subject scenario is assumed, which also allows speech overlap to be considered. Two sensory setups are employed, where depth video captures either a frontal or a profile view of the subjects and is subsequently combined with the corresponding planar video and audio streams. Further, multi-view fusion is considered, using audio and planar video from a sensor at the complementary view. Support vector machines provide temporal speech activity classification for each visually detected subject, fusing the available modality streams. Classification results are further combined to yield speaker diarization. Experiments are reported on a suitable audio-visual corpus recorded by two Kinects. Results demonstrate the benefits of depth information, particularly in the frontal depth-view setup, reducing speech activity detection and speaker diarization errors over systems that ignore it. © 2016 IEEE
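    As a rough illustration of the fusion step, the sketch below trains an SVM on concatenated per-frame audio, planar-video, and depth features; all feature dimensions and names are hypothetical stand-ins, since the paper's actual features and kernel setup are not given here.

```python
# Early fusion of audio, planar-video, and depth features for per-frame
# speech-activity classification with an SVM (a minimal sketch, assuming
# precomputed features; random data stands in for a real corpus).
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_frames = 200
audio_feats = rng.normal(size=(n_frames, 13))   # e.g. MFCC-like features
video_feats = rng.normal(size=(n_frames, 20))   # planar-video mouth-region features
depth_feats = rng.normal(size=(n_frames, 20))   # depth mouth-region features
labels = rng.integers(0, 2, size=n_frames)      # 1 = subject speaking, 0 = silent

# Concatenate the modality streams per frame, then let the SVM provide
# the temporal speech-activity classification.
fused = np.concatenate([audio_feats, video_feats, depth_feats], axis=1)
clf = SVC(kernel="rbf").fit(fused[:150], labels[:150])
print("frame-level accuracy:", clf.score(fused[150:], labels[150:]))
```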

    Strategic and Business Management applied on a business simulation game

    Business simulation games are experiential learning tools that are increasingly used to teach strategic planning and decision making. This thesis deals with the upgrade of an existing business game (STRATEGY), aiming to develop it into a more dynamic and realistic simulation tool. The most important features of the previous version are presented and, based on in-classroom experience, new functionality was designed and developed. These additions and modifications are analyzed in depth, along with the mathematical model developed for their implementation. The functionality and the accuracy of the results of the new version were validated using historical data, and the necessary corrections were made. The game was also trialled by a focus group, providing feedback and leading to conclusions concerning its added value in teaching the desired concepts and in offering hands-on experience.

    Joint Object Affordance Reasoning and Segmentation in RGB-D Videos

    Understanding human-object interaction is a fundamental challenge in computer vision and robotics. Crucial to it is the ability to infer 'object affordances' from visual data, namely the types of interaction supported by an object of interest and the object parts involved. Such inference can be approached as an 'affordance reasoning' task, where object affordances are recognized and localized as image heatmaps, and as an 'affordance segmentation' task, where affordance labels are obtained at a more detailed, image pixel level. To tackle the two tasks, existing methods typically: (i) treat them independently; (ii) adopt static image-based models, ignoring the temporal aspect of human-object interaction; and/or (iii) require additional strong supervision concerning object class and location. In this paper, we focus on both tasks, while addressing all three aforementioned shortcomings. For this purpose, we propose a deep-learning based dual encoder-decoder model for joint affordance reasoning and segmentation, which learns from our recently introduced SOR3D-AFF corpus of RGB-D human-object interaction videos, without relying on object localization and classification. The basic components of the model comprise: (i) two parallel encoders that capture spatio-temporal interaction information; (ii) a reasoning decoder that predicts affordance heatmaps, assisted by an affordance classifier and an attention mechanism; and (iii) a segmentation decoder that exploits the predicted heatmap to yield pixel-level affordance segmentation. All modules are jointly trained, while the system can operate on both static images and videos. The approach is evaluated on four datasets, surpassing the current state-of-the-art in both affordance reasoning and segmentation. © 2013 IEEE
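    A minimal structural sketch of the dual encoder-decoder idea (not the authors' code) might look as follows in PyTorch; the layer shapes, and the use of one encoder per RGB and depth stream, are illustrative assumptions.

```python
# Two parallel encoders, a reasoning decoder that predicts affordance
# heatmaps, and a segmentation decoder conditioned on those heatmaps.
import torch
import torch.nn as nn

class DualAffordanceNet(nn.Module):
    def __init__(self, n_affordances: int = 9):
        super().__init__()
        # Two parallel encoders, assumed here to be one per input stream.
        self.enc_rgb = nn.Sequential(nn.Conv2d(3, 16, 3, 2, 1), nn.ReLU(),
                                     nn.Conv2d(16, 32, 3, 2, 1), nn.ReLU())
        self.enc_depth = nn.Sequential(nn.Conv2d(1, 16, 3, 2, 1), nn.ReLU(),
                                       nn.Conv2d(16, 32, 3, 2, 1), nn.ReLU())
        # Reasoning decoder: per-affordance heatmaps at input resolution.
        self.reason_dec = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(32, n_affordances, 4, 2, 1))
        # Segmentation decoder: pixel labels, conditioned on the heatmaps.
        self.seg_dec = nn.Sequential(
            nn.Conv2d(n_affordances, 32, 3, 1, 1), nn.ReLU(),
            nn.Conv2d(32, n_affordances + 1, 1))  # +1 for background

    def forward(self, rgb, depth):
        feats = torch.cat([self.enc_rgb(rgb), self.enc_depth(depth)], dim=1)
        heatmaps = self.reason_dec(feats)                # affordance reasoning
        segmentation = self.seg_dec(heatmaps.sigmoid())  # exploits the heatmaps
        return heatmaps, segmentation

net = DualAffordanceNet()
hm, seg = net(torch.randn(1, 3, 64, 64), torch.randn(1, 1, 64, 64))
print(hm.shape, seg.shape)  # [1, 9, 64, 64] and [1, 10, 64, 64]
```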

    A Deep Learning Approach to Object Affordance Segmentation

    Learning to understand and infer object functionalities is an important step towards robust visual intelligence. Significant research efforts have recently focused on segmenting the object parts that enable specific types of human-object interaction, the so-called object affordances. However, most works treat this as a static semantic segmentation problem, focusing solely on object appearance and relying on strong supervision and object detection. In this paper, we propose a novel approach that exploits the spatio-temporal nature of human-object interaction for affordance segmentation. In particular, we design an autoencoder that is trained using ground-truth labels of only the last frame of the sequence, and is able to infer pixel-wise affordance labels in both videos and static images. Our model eliminates the need for object labels and bounding boxes by using a soft-attention mechanism that enables the implicit localization of the interaction hotspot. For evaluation purposes, we introduce the SOR3D-AFF corpus, which consists of human-object interaction sequences and provides pixel-wise annotations for 9 types of affordances, covering typical manipulations of tool-like objects. We show that our model achieves competitive results compared to strongly supervised methods on SOR3D-AFF, while being able to predict affordances for similar unseen objects in two affordance image-only datasets. © 2020 IEEE
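    The weak-supervision idea, a loss computed only against the last frame of the sequence with soft-attention pooling over time, can be sketched as follows; the tiny single-layer modules and all shapes are illustrative assumptions, not the authors' architecture.

```python
# The network sees a whole interaction sequence, but the segmentation loss
# uses only the last frame's ground truth; soft attention over space
# implicitly localizes the interaction hotspot without object boxes.
import torch
import torch.nn as nn

frames = torch.randn(1, 8, 3, 64, 64)              # (batch, time, C, H, W)
last_frame_gt = torch.randint(0, 10, (1, 64, 64))  # 9 affordances + background

encoder = nn.Conv2d(3, 16, 3, 1, 1)
attention = nn.Conv2d(16, 1, 1)                    # soft spatial attention
decoder = nn.Conv2d(16, 10, 1)

# Encode each frame, weight it by its spatial attention map, and average
# over time before decoding pixel-wise affordance logits.
feats = torch.stack([encoder(frames[:, t]) for t in range(frames.shape[1])], dim=1)
attn = torch.softmax(attention(feats.flatten(0, 1)).flatten(1), dim=1)
attn = attn.view(1, 8, 1, 64, 64)
pooled = (feats * attn).sum(dim=1) / frames.shape[1]

logits = decoder(pooled)
loss = nn.functional.cross_entropy(logits, last_frame_gt)  # last frame only
print(float(loss))
```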

    A Low-cost & Realtime Motion Capture System

    Traditional marker-based motion capture requires extensive and specialized equipment, hindering accessibility and wider adoption. In this work, we demonstrate such a system that relies on a very sparse set of low-cost consumer-grade sensors. Our system exploits a data-driven backend to infer the captured subject's joint positions from noisy marker estimates in real time. In addition to reduced cost and portability, its inherent denoising nature allows for quicker captures by alleviating the need for precise marker placement and post-processing, making it suitable for interactive virtual reality applications. © 2022 IEEE
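    The data-driven backend can be pictured as a regression network mapping noisy sparse-marker positions to joint positions; the sketch below uses assumed marker/joint counts and random stand-in data, not the paper's actual design or training set.

```python
# A small regression network maps noisy 3D positions of a sparse marker
# set to body-joint positions; training pairs would come from clean mocap
# data with synthetic marker noise (random tensors stand in for them here).
import torch
import torch.nn as nn

N_MARKERS, N_JOINTS = 6, 22          # sparse consumer-grade setup (assumed)

denoiser = nn.Sequential(
    nn.Linear(N_MARKERS * 3, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, N_JOINTS * 3),    # predicted joint positions
)

noisy_markers = torch.randn(1024, N_MARKERS * 3)
clean_joints = torch.randn(1024, N_JOINTS * 3)
opt = torch.optim.Adam(denoiser.parameters(), lr=1e-3)
for _ in range(10):                  # a few illustrative training steps
    loss = nn.functional.mse_loss(denoiser(noisy_markers), clean_joints)
    opt.zero_grad()
    loss.backward()
    opt.step()
print("train MSE:", float(loss))
```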

    Deep sensorimotor learning for RGB-D object recognition

    Research findings in cognitive neuroscience establish that humans, early on, develop their understanding of real-world objects by observing others interact with them or by performing active exploration and physical interactions with them. This fact has motivated the so-called “sensorimotor” learning approach, where the object appearance information (sensory) is combined with the object affordances (motor), i.e. the types of actions a human can perform with the object. In this work, the aforementioned paradigm is adopted, and a neuro-biologically inspired two-stream model for RGB-D object recognition is investigated. Both streams are realized as state-of-the-art deep neural networks that process and fuse appearance and affordance information in multiple ways. In particular, three model variants are developed to efficiently encode the spatio-temporal nature of the hand–object interaction, while an attention mechanism that relies on the appearance stream confidence is also investigated. Additionally, a suitable auxiliary loss is proposed for model training, utilized to further optimize both information streams. Experiments on the challenging SOR3D dataset, which consists of 14 object types and 13 object affordances, demonstrate the efficacy of the proposed model in RGB-D object recognition. Overall, the best performing model achieves 90.70% classification accuracy, which is further increased to 91.98% when trained using the auxiliary loss. The latter corresponds to a 46% relative error reduction compared to the appearance-only classifier. Finally, a cross-view analysis on the SOR3D dataset provides valuable feedback on the impact of viewpoint on the affordance information. © 2019 Elsevier Inc.
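    The auxiliary-loss idea, where each stream keeps its own classification loss besides the fused one, can be sketched as follows; the stand-in stream modules, dimensions, and the 0.3 weighting are illustrative assumptions.

```python
# Besides the loss on the fused prediction, each stream gets its own
# classification loss so that both the appearance and the affordance
# branches stay discriminative during joint training.
import torch
import torch.nn as nn
import torch.nn.functional as F

n_classes = 14                                    # SOR3D object types
appearance = nn.Linear(128, n_classes)            # stands in for the sensory stream
affordance = nn.Linear(64, n_classes)             # stands in for the motor stream
fusion = nn.Linear(2 * n_classes, n_classes)

x_app, x_aff = torch.randn(8, 128), torch.randn(8, 64)
y = torch.randint(0, n_classes, (8,))

logits_app, logits_aff = appearance(x_app), affordance(x_aff)
logits_fused = fusion(torch.cat([logits_app, logits_aff], dim=1))

# Total loss: fused prediction plus weighted auxiliary terms per stream.
aux_weight = 0.3                                  # assumed hyperparameter
loss = (F.cross_entropy(logits_fused, y)
        + aux_weight * F.cross_entropy(logits_app, y)
        + aux_weight * F.cross_entropy(logits_aff, y))
print(float(loss))
```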

    Attention-Enhanced Sensorimotor Object Recognition

    Sensorimotor learning, namely the process of understanding the physical world by combining visual and motor information, has been recently investigated, achieving promising results for the task of 2D/3D object recognition. Following the recent trend in computer vision, powerful deep neural networks (NNs) have been used to model the 'sensory' and 'motor' information, namely the object appearance and affordance. However, the existing implementations cannot efficiently address the spatio-temporal nature of the human-object interaction. Inspired by recent work on attention-based learning, this paper introduces an attention-enhanced NN-based model that learns to selectively focus on parts of the physical interaction where the object appearance is corrupted by occlusions and deformations. The model's attention mechanism relies on the confidence of classifying an object based solely on its appearance. Three metrics are used to measure the latter, namely the prediction entropy, the average N-best likelihood difference, and the N-best likelihood dispersion. Evaluation of the attention-enhanced model on the SOR3D dataset reports 33% and 26% relative improvement over the appearance-only and the spatio-temporal fusion baseline models, respectively. © 2018 IEEE
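    The three confidence metrics can be computed directly from a softmax posterior, as sketched below; the entropy is standard, while the exact N-best difference and dispersion formulas here are plausible readings rather than the paper's verbatim definitions.

```python
# Appearance-confidence metrics from a class posterior: prediction
# entropy, average N-best likelihood difference, N-best dispersion.
import numpy as np

def confidence_metrics(probs: np.ndarray, n_best: int = 5):
    p = np.sort(probs)[::-1]                      # descending class posteriors
    entropy = -np.sum(probs * np.log(probs + 1e-12))
    nbest = p[:n_best]
    avg_diff = np.mean(nbest[0] - nbest[1:])      # top-1 vs. the rest of N-best
    dispersion = np.std(nbest)                    # spread of the N-best scores
    return entropy, avg_diff, dispersion

confident = np.array([0.90, 0.04, 0.03, 0.02, 0.01])
uncertain = np.full(5, 0.2)
print(confidence_metrics(confident))   # low entropy, large diff/dispersion
print(confidence_metrics(uncertain))   # high entropy, near-zero diff/dispersion
```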

    Deep affordance-grounded sensorimotor object recognition

    It is well-established by cognitive neuroscience that human perception of objects constitutes a complex process, where object appearance information is combined with evidence about the so-called object “affordances”, namely the types of actions that humans typically perform when interacting with them. This fact has recently motivated the “sensorimotor” approach to the challenging task of automatic object recognition, where both information sources are fused to improve robustness. In this work, the aforementioned paradigm is adopted, surpassing current limitations of sensorimotor object recognition research. Specifically, the deep learning paradigm is introduced to the problem for the first time, developing a number of novel neuro-biologically and neuro-physiologically inspired architectures that utilize state-of-the-art neural networks for fusing the available information sources in multiple ways. The proposed methods are evaluated using a large RGB-D corpus, which is specifically collected for the task of sensorimotor object recognition and is made publicly available. Experimental results demonstrate the utility of affordance information for object recognition, achieving up to a 29% relative error reduction through its inclusion. © 2017 IEEE
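    The "multiple ways" of fusing the two information sources can be illustrated by the two generic schemes below, feature-level (intermediate) and decision-level (late) fusion; these are standard options, not necessarily the paper's exact architectures.

```python
# Two generic fusion schemes for appearance and affordance streams:
# intermediate fusion concatenates features before a joint classifier,
# late fusion averages the per-stream posteriors.
import torch
import torch.nn as nn

feat_app, feat_aff = torch.randn(1, 128), torch.randn(1, 64)
n_classes = 14

# Intermediate fusion: concatenate stream features, classify jointly.
inter_head = nn.Linear(128 + 64, n_classes)
inter_logits = inter_head(torch.cat([feat_app, feat_aff], dim=1))

# Late fusion: classify each stream separately, average the posteriors.
head_app, head_aff = nn.Linear(128, n_classes), nn.Linear(64, n_classes)
late_probs = (head_app(feat_app).softmax(-1) + head_aff(feat_aff).softmax(-1)) / 2
print(inter_logits.shape, late_probs.shape)
```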