
    Visual Rendering of Shapes on 2D Display Devices Guided by Hand Gestures

    The design of touchless user interfaces is gaining popularity in various contexts. Such interfaces allow users to interact with electronic devices even when their hands are dirty or non-conductive, and they also enable interaction for users with partial physical disabilities. Research in this direction has received a major boost from the emergence of low-cost sensors such as the Leap Motion, Kinect, and RealSense devices. In this paper, we propose a Leap Motion controller-based methodology to facilitate rendering of 2D and 3D shapes on display devices. The proposed method tracks finger movements while users perform natural gestures within the field of view of the sensor. In the next phase, the trajectories are analyzed to extract extended Npen++ features in 3D. These features represent finger movements during the gestures and are fed to a unidirectional left-to-right Hidden Markov Model (HMM) for training. A one-to-one mapping between gestures and shapes is proposed. Finally, the shapes corresponding to these gestures are rendered on the display using the MuPad interface. We have created a dataset of 5400 samples recorded by 10 volunteers. Our dataset contains 18 geometric and 18 non-geometric shapes such as "circle", "rectangle", "flower", "cone", and "sphere". The proposed methodology achieves an accuracy of 92.87% when evaluated using 5-fold cross-validation. Our experiments reveal that the extended 3D features outperform existing 3D features for shape representation and classification. The method can be used to develop useful HCI applications for smart display devices. Comment: Submitted to Elsevier Displays Journal; 32 pages, 18 figures, 7 tables.
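
    As a concrete illustration of the training and classification stage described above, the sketch below builds one left-to-right Gaussian HMM per gesture class and labels a new trajectory by maximum log-likelihood. It is a minimal sketch only: it assumes the hmmlearn library, and the extended Npen++ feature extraction is not reproduced, so the per-frame feature sequences are taken as given.

    ```python
    # Minimal sketch: one left-to-right Gaussian HMM per gesture class, trained on
    # sequences of 3D trajectory features (feature extraction not reproduced here).
    import numpy as np
    from hmmlearn import hmm

    def left_to_right_hmm(n_states=8, n_iter=20):
        model = hmm.GaussianHMM(n_components=n_states, covariance_type="diag",
                                n_iter=n_iter, init_params="mc")
        # Start in the first state; allow only self-transitions and forward steps.
        startprob = np.zeros(n_states)
        startprob[0] = 1.0
        transmat = np.zeros((n_states, n_states))
        for i in range(n_states):
            transmat[i, i] = 0.5
            transmat[i, min(i + 1, n_states - 1)] += 0.5
        model.startprob_ = startprob
        model.transmat_ = transmat
        return model

    def train_models(sequences_by_class):
        """sequences_by_class: {gesture label: list of (T_i x D) feature arrays}."""
        models = {}
        for label, seqs in sequences_by_class.items():
            m = left_to_right_hmm()
            m.fit(np.vstack(seqs), lengths=[len(s) for s in seqs])
            models[label] = m
        return models

    def classify(models, seq):
        # The recognized gesture (and hence the shape to render) is the class
        # whose HMM assigns the highest log-likelihood to the sequence.
        return max(models, key=lambda label: models[label].score(seq))
    ```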

    Intelligent Approaches to interact with Machines using Hand Gesture Recognition in Natural way: A Survey

    Hand gesture recognition (HGR) is one of the main areas of research for engineers, scientists, and bioinformaticians. HGR is a natural way of human-machine interaction, and today many researchers in academia and industry are working on different applications to make interaction easier, more natural, and more convenient without wearing any extra device. HGR can be applied from game control to vision-enabled robot control, and from virtual reality to smart home systems. In this paper we discuss work done in the area of hand gesture recognition, with a focus on intelligent approaches, including soft computing-based methods such as artificial neural networks, fuzzy logic, and genetic algorithms. Methods for image preprocessing, segmentation, and hand image construction are also examined. Most researchers use fingertips for hand detection in appearance-based modeling. Finally, a comparison of the results reported by different researchers is presented.

    Reasoning about Body-Parts Relations for Sign Language Recognition

    Over the years, hand gesture recognition has mostly been addressed by considering hand trajectories in isolation. However, in most sign languages, hand gestures are defined in a particular context (body region). We propose a pipeline for sign language recognition which models hand movements in the context of other parts of the body, captured in 3D space using the MS Kinect sensor. In addition, we perform sign recognition based on the different hand postures that occur during a sign. Our experiments show that considering different body parts improves performance compared to methods which only consider global hand trajectories. Finally, we demonstrate that combining hand posture features with hand gesture features helps to improve the prediction of a given sign. Comment: Under review; 15 pages, 13 figures, 6 tables.
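
    A minimal sketch of the idea of modelling hand movements in the context of other body parts is given below: each frame's hand position is expressed as offsets from a set of reference skeleton joints rather than in isolation. The joint names and the exact set of reference parts are illustrative assumptions, not the authors' exact formulation.

    ```python
    # Sketch: per-frame hand features relative to other body parts, from 3D
    # skeleton joints such as those produced by the MS Kinect.
    import numpy as np

    REFERENCE_JOINTS = ["head", "torso", "shoulder_left", "shoulder_right", "hip_center"]

    def relational_features(frames, hand="hand_right"):
        """frames: list of dicts mapping joint name -> (x, y, z) in metres.
        Returns a (T, 3 * len(REFERENCE_JOINTS)) array holding, for each frame,
        the hand's offset from every reference joint."""
        feats = []
        for joints in frames:
            h = np.asarray(joints[hand], dtype=float)
            offsets = [h - np.asarray(joints[r], dtype=float) for r in REFERENCE_JOINTS]
            feats.append(np.concatenate(offsets))
        return np.asarray(feats)
    ```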

    A discussion on the validation tests employed to compare human action recognition methods using the MSR Action3D dataset

    This paper aims to determine which is the best human action recognition method based on features extracted from RGB-D devices, such as the Microsoft Kinect. A review of all the papers that refer to MSR Action3D, the most widely used dataset that includes depth information acquired from an RGB-D device, has been performed. We found that the validation method used in each work differs from the others, so a direct comparison among works cannot be made. However, almost all the works present their results and compare them without taking this issue into account. Therefore, we present different rankings according to the validation methodology used, in order to clarify the existing confusion. Comment: 16 pages and 7 tables.
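
    To make the issue concrete, the sketch below shows one commonly used cross-subject protocol for MSR Action3D (odd-numbered subjects for training, even-numbered for testing). Since the paper's point is precisely that published works use different splits, any reported accuracy should state which protocol produced it; the helper below is only an illustrative assumption about the data layout.

    ```python
    # Sketch of a cross-subject split for MSR Action3D: odd subjects train,
    # even subjects test. `samples` is assumed to be (subject_id, sequence, label).
    def cross_subject_split(samples):
        train = [s for s in samples if s[0] % 2 == 1]   # subjects 1, 3, 5, 7, 9
        test  = [s for s in samples if s[0] % 2 == 0]   # subjects 2, 4, 6, 8, 10
        return train, test
    ```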

    Robust 3D Action Recognition through Sampling Local Appearances and Global Distributions

    3D action recognition has broad applications in human-computer interaction and intelligent surveillance. However, recognizing similar actions remains challenging, since previous literature fails to capture motion and shape cues effectively from noisy depth data. In this paper, we propose a novel two-layer Bag-of-Visual-Words (BoVW) model, which suppresses noise disturbances and jointly encodes both motion and shape cues. First, background clutter is removed by a background modeling method designed for depth data. Then, motion and shape cues are jointly used to generate robust and distinctive spatial-temporal interest points (STIPs): motion-based STIPs and shape-based STIPs. In the first layer of our model, a multi-scale 3D local steering kernel (M3DLSK) descriptor is proposed to describe local appearances of cuboids around motion-based STIPs. In the second layer, a spatial-temporal vector (STV) descriptor is proposed to describe the spatial-temporal distributions of shape-based STIPs. Using the BoVW model, motion and shape cues are combined to form a fused action representation. Our model performs favorably compared with common STIP detection and description methods. Thorough experiments verify that our model is effective in distinguishing similar actions and robust to background clutter, partial occlusions, and pepper noise.
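
    The sketch below illustrates the Bag-of-Visual-Words step that turns local descriptors into a sequence-level histogram. It is a generic BoVW encoding under the assumption of a k-means codebook; the M3DLSK and STV descriptors themselves are not reproduced, and `descriptors_per_video` is a hypothetical list of per-video descriptor matrices.

    ```python
    # Generic BoVW encoding: cluster local descriptors into a visual vocabulary,
    # then represent each video as an L1-normalised histogram of visual words.
    import numpy as np
    from sklearn.cluster import KMeans

    def build_codebook(descriptors_per_video, k=1000, seed=0):
        all_desc = np.vstack(descriptors_per_video)          # stack (N_i x D) arrays
        return KMeans(n_clusters=k, random_state=seed, n_init=10).fit(all_desc)

    def encode(codebook, descriptors):
        words = codebook.predict(descriptors)                # hard assignment
        hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
        return hist / max(hist.sum(), 1.0)
    ```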

    Dynamic Gesture Recognition by Using CNNs and Star RGB: a Temporal Information Condensation

    Due to advances in technology, machines are increasingly present in people's daily lives. Thus, there has been growing effort to develop interfaces, such as dynamic gestures, that provide an intuitive way of interaction. Currently, the most common trend is to use multimodal data, such as depth and skeleton information, to enable dynamic gesture recognition. However, using only color information would be preferable, since RGB cameras are available in almost every public place and could be used for gesture recognition without the need to install other equipment. The main problem with such an approach is the difficulty of representing spatio-temporal information using just color. With this in mind, we propose a technique capable of condensing a dynamic gesture, shown in a video, into a single RGB image. We call this technique star RGB. This image is then passed to a classifier formed by two ResNet CNNs, a soft-attention ensemble, and a fully connected layer, which indicates the class of the gesture present in the input video. Experiments were carried out using both the Montalbano and GRIT datasets. For the Montalbano dataset, the proposed approach achieved an accuracy of 94.58%, reaching the state of the art for this dataset when only color information is considered. On the GRIT dataset, our proposal achieves more than 98% accuracy, recall, precision, and F1-score, outperforming the reference approach by more than 6%. Comment: 19 pages, 12 figures, submitted to the Neurocomputing Journal.
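
    A minimal sketch of the classifier head described above is given below: two ResNet branches over the star RGB image, a soft-attention weighting of their features, and a final fully connected layer. The ResNet depth, attention form, and the fact that both branches receive the same input are illustrative assumptions rather than the authors' exact design.

    ```python
    # Sketch of a two-branch ResNet classifier with soft-attention fusion.
    import torch
    import torch.nn as nn
    from torchvision import models

    class StarRGBClassifier(nn.Module):
        def __init__(self, n_classes):
            super().__init__()
            self.branch_a = models.resnet50(weights=None)
            self.branch_b = models.resnet50(weights=None)
            feat_dim = self.branch_a.fc.in_features
            self.branch_a.fc = nn.Identity()                  # keep 2048-d features
            self.branch_b.fc = nn.Identity()
            # Soft attention: one weight per branch, normalised with softmax.
            self.attention = nn.Sequential(nn.Linear(2 * feat_dim, 2), nn.Softmax(dim=1))
            self.classifier = nn.Linear(feat_dim, n_classes)

        def forward(self, star_rgb):                          # (B, 3, H, W)
            fa = self.branch_a(star_rgb)
            fb = self.branch_b(star_rgb)
            w = self.attention(torch.cat([fa, fb], dim=1))    # (B, 2)
            fused = w[:, :1] * fa + w[:, 1:] * fb             # attention-weighted sum
            return self.classifier(fused)
    ```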

    Hand Gesture Controlled Drones: An Open Source Library

    Drones are conventionally controlled using joysticks, remote controllers, mobile applications, and embedded computers. A few significant issues with these approaches are that drone control is limited by the range of electromagnetic radiation and is susceptible to interference noise. In this study we propose the use of hand gestures as a method to control drones. We investigate the use of computer vision methods to develop an intuitive, agent-less way of communication between a drone and its operator. Computer vision-based methods rely on the ability of a drone's camera to capture surrounding images and use pattern recognition to translate images into meaningful and/or actionable information. The proposed framework involves a few key steps toward an ultimate action to be taken: image segregation from the video stream of the front camera, robust and reliable image recognition based on the segregated images, and finally conversion of classified gestures into actionable drone movements, such as takeoff, landing, and hovering. A set of five gestures is studied in this work. A Haar feature-based AdaBoost classifier is employed for gesture recognition. We also address operator and drone safety by computing the operator-drone distance using computer vision. A series of experiments is conducted to measure gesture recognition accuracy under the major scene variabilities: illumination, background, and distance. Classification accuracies show that gestures performed in well-lit conditions, against a clear background, and within 3 ft are recognized correctly over 90% of the time. Limitations of the current framework and feasible solutions for better gesture recognition are also discussed. The software library we developed and the hand gesture datasets are open-sourced at the project website. Comment: ICDIS 201
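
    The sketch below shows the shape of the recognition-to-action loop described above, assuming a Haar cascade has already been trained per gesture. The cascade file paths and the `send_command` hook are hypothetical placeholders, not part of the released library.

    ```python
    # Sketch: per-gesture Haar cascades over the drone's front-camera stream,
    # with detections mapped to drone actions.
    import cv2

    GESTURE_CASCADES = {            # hypothetical, pre-trained cascade files
        "takeoff": "cascades/takeoff.xml",
        "land":    "cascades/land.xml",
        "hover":   "cascades/hover.xml",
    }

    def detect_gesture(frame, detectors):
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        for action, det in detectors.items():
            hits = det.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
            if len(hits) > 0:
                return action
        return None

    def run(video_source=0, send_command=print):
        detectors = {a: cv2.CascadeClassifier(p) for a, p in GESTURE_CASCADES.items()}
        cap = cv2.VideoCapture(video_source)
        while cap.isOpened():
            ok, frame = cap.read()
            if not ok:
                break
            action = detect_gesture(frame, detectors)
            if action is not None:
                send_command(action)    # e.g. forward to the drone's control API
        cap.release()
    ```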

    Hand Action Detection from Ego-centric Depth Sequences with Error-correcting Hough Transform

    Detecting hand actions from ego-centric depth sequences is a practically challenging problem, owing mostly to the complex and dexterous nature of hand articulations as well as non-stationary camera motion. We address this problem via a Hough transform-based approach coupled with a discriminatively learned error-correcting component to tackle the well-known issue of incorrect votes from the Hough transform. In this framework, local parts vote collectively for the start and end positions of each action over time. We also construct an in-house annotated dataset of 300 long videos, containing 3,177 single-action subsequences over 16 action classes collected from 26 individuals. Our system is empirically evaluated on this real-life dataset for both the action recognition and detection tasks, and is shown to produce satisfactory results. To facilitate reproduction, the new dataset and our implementation are also provided online.
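
    The sketch below illustrates the temporal Hough voting idea: local parts observed at frame t cast weighted votes for the action's start and end frames via predicted offsets, and peaks in the two accumulators give a candidate detection. The discriminatively learned error-correcting component is not reproduced; the vote format is an assumption made for illustration.

    ```python
    # Sketch of 1D temporal Hough voting for action start/end localisation.
    import numpy as np

    def hough_vote(part_votes, n_frames, sigma=5.0):
        """part_votes: list of (frame, start_offset, end_offset, weight)."""
        frames = np.arange(n_frames)
        start_acc = np.zeros(n_frames)
        end_acc = np.zeros(n_frames)
        for t, d_start, d_end, w in part_votes:
            # Spread each vote as a small Gaussian around its predicted position.
            start_acc += w * np.exp(-0.5 * ((frames - (t + d_start)) / sigma) ** 2)
            end_acc   += w * np.exp(-0.5 * ((frames - (t + d_end)) / sigma) ** 2)
        start, end = int(np.argmax(start_acc)), int(np.argmax(end_acc))
        return (start, end) if end > start else None
    ```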

    Tracking of Fingertips and Centres of Palm using KINECT

    Hand gestures are a popular way to interact with or control machines, and they have been implemented in many applications. The geometry of the hand makes it hard to construct and control its joints in a virtual environment, but its functionality and degrees of freedom (DOF) encourage researchers to build hand-like instruments. This paper presents a novel method for detecting the fingertips and palm centres of both hands in 3D from the input image using the MS Kinect. The Kinect provides depth information for foreground objects. The hands are segmented using the depth vector, and the palm centres are detected using a distance transformation on the inverse image. The result can be used to feed inputs to robotic hands to emulate human hand operation. Comment: 4 pages.
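
    A minimal sketch of the palm-centre step is given below: segment the hand from the depth map by thresholding, apply a distance transform to the binary mask, and take the point farthest from the hand boundary as the palm centre. The depth thresholds are illustrative, and the fingertip-detection step is omitted.

    ```python
    # Sketch: palm centre from a Kinect depth map via a distance transform.
    import cv2
    import numpy as np

    def palm_centre(depth_mm, near=400, far=900):
        # Keep pixels within the expected hand depth range (millimetres).
        mask = ((depth_mm > near) & (depth_mm < far)).astype(np.uint8) * 255
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
        dist = cv2.distanceTransform(mask, cv2.DIST_L2, 5)
        _, max_val, _, max_loc = cv2.minMaxLoc(dist)
        return max_loc if max_val > 0 else None   # (x, y) of the palm centre
    ```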

    Augmented reality meeting table: a novel multi-user interface for architectural design

    Immersive virtual environments have received widespread attention as possible replacements for the media and systems that designers traditionally use, as well as, more generally, for supporting collaborative work. Relatively little attention has been given to date, however, to the problem of how to merge immersive virtual environments into real-world work settings, and so to add to the media at the disposal of the designer and the design team rather than to replace them. In this paper we report on a research project in which optical see-through augmented reality displays have been developed together with prototype decision support software for architectural and urban design. We suggest that a critical characteristic of multi-user augmented reality is its ability to generate visualisations from a first-person perspective in which the scale of rendition of the design model follows many of the conventions that designers are used to. Different scales of model appear to allow designers to focus on different aspects of the design under consideration. Augmenting the scene with simulations of pedestrian movement appears to assist both in scale recognition and in moving from a first-person to a third-person understanding of the design. This research project is funded by the European Commission IST program (IST-2000-28559).