1,433 research outputs found

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Based on information provided by European projects and national initiatives related to multimedia search, as well as by domain experts who participated in the CHORUS Think-Tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and a socio-economic perspective. The technical perspective includes an up-to-date view of content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines. From a socio-economic perspective, we inventory the impact and legal consequences of these technical advances and point out future directions of research.

    Facial feature point fitting with combined color and depth information for interactive displays

    Interactive displays are driven by natural interaction with the user, necessitating a computer system that recognizes body gestures and facial expressions. User inputs are not easily or reliably recognized for a satisfying user experience, as the complexities of human communication are difficult to interpret in real time. Recognizing facial expressions in particular is a problem that requires high accuracy and efficiency for stable interaction environments. The recent availability of the Kinect, a low-cost, low-resolution sensor that supplies simultaneous color and depth images, provides a breakthrough opportunity to enhance the interactive capabilities of displays and the overall user experience. This new RGBD (RGB + depth) sensor generates an additional channel of depth information that can be used to improve the performance of existing state-of-the-art technology and to develop new techniques. The Active Shape Model (ASM) is a well-known deformable model that has been extensively studied for facial feature point placement. Previous shape model techniques have applied 3D reconstruction using multiple cameras, or other statistical methods, to produce 3D information from 2D color images. These methods showed improved results compared to using only color data, but required an additional deformable model or expensive imaging equipment. In this thesis, an ASM is trained using the RGBD image produced by the Kinect. The real-time information from the depth sensor is registered to the color image to create a pixel-for-pixel match. To improve the quality of the depth image, a temporal median filter is applied to reduce random noise produced by the sensor. The resulting combined model is designed to produce more robust fitting of facial feature points than a purely color-based active shape model.
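
    The temporal median filter mentioned in the abstract above can be illustrated with a minimal sketch. The abstract gives no implementation details, so the window length, the frame representation (a list of rows of depth values) and the class name `TemporalMedianFilter` are all assumptions made for illustration; this is not the thesis's code.

```python
from collections import deque
from statistics import median


class TemporalMedianFilter:
    """Per-pixel median over the most recent `window` depth frames.

    Hypothetical helper: the window length and frame layout are assumptions,
    not details taken from the thesis.
    """

    def __init__(self, window=5):
        if window < 1:
            raise ValueError("window must be at least 1")
        # A bounded deque discards the oldest frame automatically.
        self.frames = deque(maxlen=window)

    def update(self, frame):
        """Store a new frame and return the median-filtered frame."""
        self.frames.append(frame)
        rows, cols = len(frame), len(frame[0])
        return [
            [median(f[r][c] for f in self.frames) for c in range(cols)]
            for r in range(rows)
        ]
```

    Because the median ignores isolated outliers, a single spurious depth reading at a pixel does not propagate to the output as long as most frames in the window agree. The cost is latency: a real change in depth only appears once it is present in more than half of the window.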

    A system for recognizing human emotions based on speech analysis and facial feature extraction: applications to Human-Robot Interaction

    With advances in Artificial Intelligence, humanoid robots are starting to interact with ordinary people, based on a growing understanding of psychological processes. Accumulating evidence in Human-Robot Interaction (HRI) suggests that research is focusing on establishing emotional communication between humans and robots in order to create social perception, cognition, desired interaction and sensation. Furthermore, robots need to perceive human emotion and optimize their behavior to help and interact with human beings in various environments. The most natural way to recognize basic emotions is to extract sets of features from human speech, facial expressions and body gestures. A system for recognizing emotions based on speech analysis and facial feature extraction can have interesting applications in Human-Robot Interaction. Thus, the Human-Robot Interaction ontology explains how knowledge from these fundamental sciences is applied in the contexts of physics (sound analysis), mathematics (face detection and perception), philosophy (behavior) and robotics. In this project, we carry out a study to recognize basic emotions (sadness, surprise, happiness, anger, fear and disgust). We also propose a methodology and a software program for classifying emotions based on speech analysis and facial feature extraction. The speech analysis phase investigates the appropriateness of using acoustic (pitch value, pitch peak, pitch range, intensity and formant) and phonetic (speech rate) properties of emotive speech with the freeware program PRAAT, and consists of generating and analyzing a graph of speech signals. The proposed architecture investigates the appropriateness of analyzing emotive speech with minimal use of signal processing algorithms.
The 30 participants in the experiment repeated five sentences in English (with durations typically between 0.40 s and 2.5 s) so that data on pitch (value, range and peak) and rising-falling intonation could be extracted. Pitch alignments (peak, value and range) were evaluated and the results were compared with intensity and speech rate. The facial feature extraction phase uses a mathematical formulation (Bézier curves) and geometric analysis of the facial image, based on measurements of a set of Action Units (AUs), to classify the emotion. The proposed technique consists of three steps: (i) detecting the facial region within the image, (ii) extracting and classifying the facial features, (iii) recognizing the emotion. The new data were then merged with reference data in order to recognize the basic emotion. Finally, we combined the two proposed algorithms (speech analysis and facial expression) to design a hybrid technique for emotion recognition. This technique has been implemented in a software program that can be employed in Human-Robot Interaction. The efficiency of the methodology was evaluated by experimental tests on 30 individuals (15 female and 15 male, 20 to 48 years old) from different ethnic groups, namely: (i) ten European adults, (ii) ten Asian (Middle East) adults and (iii) ten American adults. Overall, the proposed technique made it possible to recognize the basic emotion in most cases.

    Toward 3D reconstruction of outdoor scenes using an MMW radar and a monocular vision sensor

    In this paper, we introduce a geometric method for 3D reconstruction of the exterior environment using a panoramic microwave radar and a camera. We rely on the complementarity of these two sensors, considering the robustness to environmental conditions and the depth detection ability of the radar, on the one hand, and the high spatial resolution of a vision sensor, on the other. Firstly, geometric modeling of each sensor and of the entire system is presented. Secondly, we address the global calibration problem, which consists of finding the exact transformation between the sensors' coordinate systems. Two implementation methods are proposed and compared, based on the optimization of a non-linear criterion obtained from a set of radar-to-image target correspondences. Unlike existing methods, no special configuration of the 3D points is required for calibration. This makes the methods flexible and easy to use by a non-expert operator. Finally, we present a very simple, yet robust 3D reconstruction method based on the sensors' geometry. This method enables one to reconstruct observed features in 3D using one acquisition (static sensor), a capability not always offered by the state of the art for outdoor scene reconstruction. The proposed methods have been validated with synthetic and real data.

    Motion Planning and Control of Dynamic Humanoid Locomotion

    Inspired by humans, humanoid robots have the potential to become a general-purpose platform that lives alongside humans. Technological advances in many fields, such as actuation, sensing, control and intelligence, finally enable humanoid robots to possess human-comparable capabilities. However, humanoid locomotion is still a challenging research field. The large number of degrees of freedom makes the system difficult to coordinate online. The presence of various contact constraints and the hybrid nature of locomotion tasks make planning a harder problem to solve. A template model anchoring approach has been adopted to bridge the gap between simple model behavior and the whole-body motion of a humanoid robot. Control policies are first developed for simple template models such as the Linear Inverted Pendulum Model (LIPM) or the Spring Loaded Inverted Pendulum (SLIP); the resulting controlled behaviors are then mapped to the whole-body motion of the humanoid robot through optimization-based task-space control strategies. The whole-body humanoid control framework has been verified in various contact situations, such as unknown uneven terrain, multi-contact scenarios and a moving platform, demonstrating its generality and versatility. For walking motion, the existing Model Predictive Control approach based on LIPM has been extended to enable the robot to walk without any reference foot placement anchoring. It is a kind of discrete version of “walking without thinking”. As a result, the robot can achieve versatile locomotion modes such as automatic foot placement with a single reference velocity command, reactive stepping under large external disturbances, guided walking with small constant external pushing forces, robust walking on unknown uneven terrain, and reactive stepping in place when blocked by an external barrier.
As an extension of the proposed framework, and to increase the push recovery capability of the humanoid robot, two new configurations have been proposed to enable the robot to perform cross-step motions. For more dynamic hopping and running motions, the SLIP model has been chosen as the template model. In contrast to the traditional model-based analytical approach, a data-driven approach has been proposed to encode the dynamics of this model. A deep neural network is trained offline with a large amount of simulation data based on the SLIP model to learn its dynamics. The trained network is applied online to generate reference foot placements for the humanoid robot. Simulations have been performed to evaluate the effectiveness of the proposed approach in generating bio-inspired and robust running motions. The method, proposed for the 2D SLIP model, can be generalized to the 3D SLIP model; this extension is briefly mentioned at the end.
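
    For readers unfamiliar with the LIPM template model named in the abstract above, the following is a minimal sketch of its standard one-dimensional dynamics, x'' = (g / z)(x - p), where x is the horizontal centre-of-mass position, z its constant height and p the support foot position. The function names, the default height of 0.9 m, the time step and the integration scheme are illustrative assumptions; the sketch does not reproduce the controller described in the thesis.

```python
import math

GRAVITY = 9.81  # m/s^2


def lipm_step(x, v, foot, height, dt):
    """Advance the 1D LIPM by one time step (semi-implicit Euler).

    x, v   : horizontal CoM position (m) and velocity (m/s)
    foot   : horizontal position of the support foot (m)
    height : constant CoM height (m), an assumed parameter
    """
    accel = (GRAVITY / height) * (x - foot)  # x'' = (g / z) * (x - p)
    v_next = v + accel * dt
    x_next = x + v_next * dt
    return x_next, v_next


def simulate(x, v, foot, height=0.9, dt=0.001, duration=0.5):
    """Integrate the LIPM for `duration` seconds with a fixed support foot."""
    for _ in range(int(round(duration / dt))):
        x, v = lipm_step(x, v, foot, height, dt)
    return x, v
```

    With the centre of mass exactly above the foot and at rest, the state does not change; any offset grows exponentially, which is why foot placement has to be adapted continually during walking.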

    Augmented Reality and Health Informatics: A Study based on Bibliometric and Content Analysis of Scholarly Communication and Social Media

    Healthcare outcomes have been shown to improve when technology is used as part of patient care. Health Informatics (HI) is a multidisciplinary study of the design, development, adoption, and application of IT-based innovations in healthcare services delivery, management, and planning. Augmented Reality (AR) is an emerging technology that enhances the user’s perception of and interaction with the real world. This study aims to illuminate the intersection of the fields of AR and HI. The domains of AR and HI are each areas of significant research. However, there is a scarcity of research on augmented reality as it applies to health informatics. Because both scholarly research and social media communication have contributed to the domains of AR and HI, bibliometric and content analysis of scholarly research and social media communication were employed to investigate the salient features and research fronts of the field. The study used Scopus data (7360 scholarly publications) to identify the bibliometric features and to perform content analysis of the identified research. The Altmetric database (an aggregator of data sources) was used to determine the social media communication for this field. The findings from this study included Publication Volumes, Top Authors, Affiliations, Subject Areas and Geographical Locations from scholarly publications as well as from a social media perspective. The 200 most-cited documents were used to determine the research fronts in scholarly publications. Content Analysis techniques were employed on the publication abstracts as a secondary technique to determine the research themes of the field. The study found that the research frontiers in scholarly communication included emerging AR technologies such as tracking and computer vision, along with Surgical and Learning applications. There was a commonality between social media and scholarly communication themes from an applications perspective.
In addition, social media themes included applications of AR in Healthcare Delivery, Clinical Studies and Mental Disorders. Europe as a geographic region dominates the research field with 50% of the articles, and North America and Asia tie for second with 20% each. Publication volumes show a steep upward slope, indicating continued research. Social Media communication is still in its infancy in terms of data extraction; however, aggregators like Altmetric are helping to enhance the outcomes. The findings from the study revealed that frontier research in AR has made an impact on the surgical and learning applications of HI and has potential for other applications as new technologies are adopted.

    Ultra high frequency (UHF) radio-frequency identification (RFID) for robot perception and mobile manipulation

    Personal robots with autonomy, mobility, and manipulation capabilities have the potential to dramatically improve quality of life for various user populations, such as older adults and individuals with motor impairments. Unfortunately, unstructured environments present many challenges that hinder robot deployment in ordinary homes. This thesis seeks to address some of these challenges through a new robotic sensing modality that leverages a small amount of environmental augmentation in the form of Ultra High Frequency (UHF) Radio-Frequency Identification (RFID) tags. Previous research has demonstrated the utility of infrastructure tags (affixed to walls) for robot localization; in this thesis, we specifically focus on tagging objects. Owing to their low cost and passive (battery-free) operation, users can apply UHF RFID tags to hundreds of objects throughout their homes. The tags provide two valuable properties for robots: a unique identifier and a received signal strength indicator (RSSI, the strength of a tag's response). This thesis explores robot behaviors and radio frequency perception techniques using robot-mounted UHF RFID readers that enable a robot to efficiently discover, locate, and interact with UHF RFID tags applied to objects and people of interest. The behaviors and algorithms explicitly rely on the robot's mobility and manipulation capabilities to provide multiple opportunistic views of the complex electromagnetic landscape inside a home environment. The electromagnetic properties of RFID tags change when applied to common household objects. Objects can have varied material properties, can be placed in diverse orientations, and can be relocated to completely new environments. We present a new class of optimization-based techniques for RFID sensing that are robust to the variation in tag performance caused by these complexities.
We discuss a hybrid global-local search algorithm where a robot employing long-range directional antennas searches for tagged objects by maximizing expected RSSI measurements; that is, the robot attempts to position itself (1) near a desired tagged object and (2) oriented towards it. The robot first performs a sparse, global RFID search to locate a pose in the neighborhood of the tagged object, followed by a series of local search behaviors (bearing estimation and RFID servoing) to refine the robot's state within the local basin of attraction. We report on RFID search experiments performed in Georgia Tech's Aware Home (a real home). Our optimization-based approach yields superior performance compared to state-of-the-art tag localization algorithms, does not require RF sensor models, is easy to implement, and generalizes to other short-range RFID sensor systems embedded in a robot's end effector. We demonstrate proof-of-concept applications, such as medication delivery and multi-sensor fusion, using these techniques. Through our experimental results, we show that UHF RFID is a complementary sensing modality that can assist robots in unstructured human environments. PhD. Committee Chair: Kemp, Charles C.; Committee Member: Abowd, Gregory; Committee Member: Howard, Ayanna; Committee Member: Ingram, Mary Ann; Committee Member: Reynolds, Matt; Committee Member: Tentzeris, Emmanoui