UWB Based Static Gesture Classification

Author: Sebastian Abhishek
Publication date: 23/10/2023

Our paper presents a robust framework for UWB-based static gesture recognition, leveraging proprietary UWB radar sensor technology. Extensive data collection efforts were undertaken to compile datasets containing five commonly used gestures. Our approach involves a comprehensive data pre-processing pipeline that encompasses outlier handling, aspect-ratio-preserving resizing, and false-color image transformation. Both CNN and MobileNet models were trained on the processed images. Our best-performing model achieved an accuracy of 96.78%. Additionally, we developed a user-friendly GUI framework to assess the model's system resource usage and processing times, which revealed low memory utilization and real-time task completion in under one second. This research marks a significant step towards enhancing static gesture recognition using UWB technology, promising practical applications in various domains.
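The aspect-ratio-preserving resizing step named in this abstract can be sketched as follows. This is a minimal illustration only: the function name `fit_within`, the 224x224 target size used in the usage note, and the idea of padding the remainder are assumptions, since the abstract does not say how the authors implemented the step.

```python
def fit_within(width, height, target_w, target_h):
    """Return (new_w, new_h, pad_x, pad_y) for an aspect-ratio-preserving resize.

    Hypothetical helper: scales the image by a single factor so it fits
    inside the target box, then reports how much padding is left over.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    # One shared scale factor keeps the width/height ratio unchanged.
    scale = min(target_w / width, target_h / height)
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    # Remaining space to fill (e.g. with a constant colour) on each axis.
    pad_x = target_w - new_w
    pad_y = target_h - new_h
    return new_w, new_h, pad_x, pad_y
```

For example, a 400x200 input fitted into an assumed 224x224 model input keeps its 2:1 ratio and leaves vertical space to pad.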

arXiv.org e-Print Archive

A colour-based building recognition using support vector machine

Author: Abdullah Lili Nurliyana
Mustaffa Mas Rina
Nasharuddin Nurul Amelina
Yee Loh Weng
Publication venue: Universitas Ahmad Dahlan
Publication date: 01/02/2019

Many applications use image recognition to help people recognise objects from digital images alone. A content-based building recognition system could remove the need to rely on text as the only search input. In this paper, a building recognition system using colour histograms is proposed for recognising buildings in Ipoh city, Perak, Malaysia. The colour features of each building image are extracted, and a feature vector combining the mean, standard deviation, variance, skewness and kurtosis of the gray levels is formed to represent each image. These feature values are then used to train the system with a supervised learning algorithm, the Support Vector Machine (SVM). Lastly, the accuracy of the recognition system is evaluated using 10-fold cross-validation. The evaluation results show that the building recognition system is well trained and recognises the building images with a low misclassification rate.
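The five-value feature vector described in this abstract could be computed along the following lines. This is a sketch under stated assumptions: the function name `moment_features` is hypothetical, population (divide-by-n) moments are used, and kurtosis is the non-excess form. The abstract does not specify these conventions, and the SVM training step is not shown.

```python
import math

def moment_features(pixels):
    """Return [mean, std, variance, skewness, kurtosis] of gray levels.

    Assumes population moments and non-excess kurtosis; the paper's
    exact definitions are not given in the abstract.
    """
    n = len(pixels)
    if n == 0:
        raise ValueError("need at least one pixel")
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    std = math.sqrt(variance)
    if std == 0:
        # A constant image has no spread, so higher moments are set to 0.
        return [mean, 0.0, 0.0, 0.0, 0.0]
    skewness = sum((p - mean) ** 3 for p in pixels) / (n * std ** 3)
    kurtosis = sum((p - mean) ** 4 for p in pixels) / (n * variance ** 2)
    return [mean, std, variance, skewness, kurtosis]
```

Each image would contribute one such vector, and the resulting rows could then be passed to any SVM implementation together with the building labels.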

TELKOMNIKA (Telecommunication Computing Electronics and Control)

UAD Journal Management System

Blur Classification Using Segmentation Based Fractal Texture Analysis

Author: Tiwari Shamik
Publication venue: IAES Indonesia Section
Publication date: 25/12/2018

The objective of vision-based gesture recognition is to design a system that can understand human actions and convey the acquired information with the help of captured images. Image restoration is required whenever an image is blurred during acquisition, since blurred images can severely degrade the performance of such systems. Image restoration recovers a true image from a degraded version; it is referred to as blind restoration if the blur information is unknown. Blur identification is essential before applying any blind restoration algorithm. This paper presents a blur identification approach that categorises a hand gesture image into one of four categories: sharp, motion blur, defocus blur, and combined blur. The Segmentation-based Fractal Texture Analysis extraction algorithm is used to compute the features for a neural-network-based classification system. The simulation results demonstrate the precision of the proposed method.

Indonesian Journal of Electrical Engineering and Informatics (IJEEI)

Unsupervised Embedded Gesture Recognition Based on Multi-objective NAS and Capacitive Sensing

Author: David CASTELLS-RUFAS
Ernesto BIEMPICA
Jordi CARRABINA
Juan BORREGO-CARAZO
Publication venue: IFSA Publishing, S.L.
Publication date: 01/02/2021

Gesture recognition has become pervasive in many interactive environments. Recognition based on neural networks often reaches higher recognition rates than competing methods, at the cost of higher computational complexity, which becomes very challenging on low-resource computing platforms such as microcontrollers. New optimization methodologies, such as quantization and Neural Architecture Search (NAS), are steps forward for the development of embeddable networks. In addition, as neural networks are commonly used in a supervised fashion, labeling tends to introduce bias into the model. Unsupervised methods allow tasks such as classification to be performed without depending on labeling. In this work, we present an embedded and unsupervised gesture recognition system, composed of a neural network autoencoder and a K-Means clustering algorithm and optimized through a state-of-the-art multi-objective NAS. The present method makes it possible to develop, deploy and perform unsupervised classification on low-resource embedded devices.
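The clustering half of the pipeline in this abstract can be illustrated with a plain K-Means (Lloyd's algorithm) over latent vectors. This is only a sketch: the autoencoder is not shown, the function name `kmeans` and the deterministic "first k points" initialisation are assumptions made for illustration, and nothing here reflects the authors' embedded implementation or their NAS procedure.

```python
def kmeans(points, k, iterations=20):
    """Lloyd's algorithm on a list of equal-length tuples.

    Returns (centroids, labels). Initialisation uses the first k points,
    an illustrative choice that keeps the example deterministic.
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep empty clusters in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels
```

In a setup like the one described, `points` would be the autoencoder's latent codes for each gesture sample, and the returned labels would serve as the unsupervised class assignments.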

Directory of Open Access Journals

Active Perception by Interaction with Other Agents in a Predictive Coding Framework: Application to Internet of Things Environment

Author: Heidari Kapourchali Masoumeh
Publication venue: University of Memphis Digital Commons
Publication date: 01/01/2019

Predicting the state of an agent's partially observable environment is a problem of interest in many domains. Typically in the real world, the environment consists of multiple agents, not necessarily working towards a common goal. Though the goal and sensory observation for each agent is unique, one agent might have acquired some knowledge that may benefit another. In essence, the knowledge base regarding the environment is distributed among the agents. An agent can sample this distributed knowledge base by communicating with other agents. Since an agent is not storing the entire knowledge base, its model can be small and its inference can be efficient and fault-tolerant. However, the agent needs to learn when, with whom and what to communicate (in general, interact) under different situations. This dissertation presents an agent model that actively and selectively communicates with other agents to predict the state of its environment efficiently. Communication is a challenge when the internal models of other agents are unknown and unobservable. The proposed agent learns communication policies as mappings from its belief state to when, with whom and what to communicate. The policies are learned using predictive coding in an online manner, without any reinforcement. The proposed agent model is evaluated on widely studied applications, such as human activity recognition from multimodal, multisource and heterogeneous sensor data, and transferring knowledge across sensor networks. In the applications, either each sensor or each sensor network is assumed to be monitored by an agent. The recognition accuracy on benchmark datasets is comparable to the state of the art, even though our model has significantly fewer parameters and infers the state in a localized manner. The learned policy reduces the number of communications. The agent is tolerant to communication failures and can recognize the reliability of each agent from its communication messages.
To the best of our knowledge, this is the first work on learning communication policies by an agent for predicting the state of its environment.

University of Memphis Digital Commons

Interface gestuelle pour la commande d'un capteur 3D tenu en main (Gesture-based interface for controlling a handheld 3D sensor)

Author: Ôtomo-Lauzon Kento

This thesis presents the development of a gesture-based user interface for the operation of handheld 3D scanning devices. This user interface allows the user to remotely engage with the software while walking around the target object, without having to return to the workstation. To this end, we develop a prototype using an Azure Kinect sensor pointed at the user. We propose a set of hand gestures and a machine learning-based approach to classification for triggering momentary actions in the software. Additionally, we define interaction metaphors for applying 3D rigid transformations to a virtual object on screen. We implement these components in a proof-of-concept application compatible with Creaform VXelements.

CorpusUL

Multimodaalsel emotsioonide tuvastamisel põhineva inimese-roboti suhtluse arendamine (Development of human-robot interaction based on multimodal emotion recognition)

Author: Noroozi Fatemeh
Publication date: 03/05/2018

The electronic version of the dissertation does not include the publications. Automatic multimodal emotion recognition is a fundamental subject of interest in affective computing. Its main applications are in human-computer interaction. The systems developed for this purpose consider combinations of different modalities, based on vocal and visual cues.
This thesis takes these modalities into account in order to develop an automatic multimodal emotion recognition system. More specifically, it takes advantage of the information extracted from speech and face signals. From speech signals, Mel-frequency cepstral coefficients, filter-bank energies and prosodic features are extracted. Two different strategies are considered for analyzing the facial data. First, the geometric relations between facial landmarks, i.e. distances and angles, are computed. Second, we summarize each emotional video into a reduced set of key-frames, and a convolutional neural network is applied to these key-frames to visually discriminate between the emotions. Afterward, the output confidence values of all the classifiers from both modalities are used to define a new feature space. Lastly, these values are learned for the final emotion label prediction, in a late fusion. The experiments are conducted on the SAVEE, Polish, Serbian, eNTERFACE'05 and RML datasets. The results show significant performance improvements by the proposed system in comparison to the existing alternatives, defining the current state of the art on all the datasets. Additionally, we provide a review of emotional body gesture recognition systems proposed in the literature. The aim of this part is to help identify possible future research directions for enhancing the performance of the proposed system. More specifically, we suggest that incorporating data representing gestures, which constitute another major component of the visual modality, can result in a more efficient framework.

DSpace at Tartu University Library

The State of the Art of Spatial Interfaces for 3D Visualization

Author: Besançon Lonni
Isenberg Tobias
Keefe Daniel
Ynnerman Anders
Yu Lingyun
Publication venue: Wiley
Publication date: 01/02/2021

We survey the state of the art of spatial interfaces for 3D visualization. Interaction techniques are crucial to data visualization processes and the visualization research community has been calling for more research on interaction for years. Yet, research papers focusing on interaction techniques, in particular for 3D visualization purposes, are not always published in visualization venues, sometimes making it challenging to synthesize the latest interaction and visualization results. We therefore introduce a taxonomy of interaction techniques for 3D visualization. The taxonomy is organized along two axes: the primary source of input on the one hand and the visualization task they support on the other hand. Surveying the state of the art allows us to highlight specific challenges and missed opportunities for research in 3D visualization. In particular, we call for additional research in: (1) controlling 3D visualization widgets to help scientists better understand their data, (2) 3D interaction techniques for dissemination, which are under-explored yet show great promise for helping museums and science centers in their mission to share recent knowledge, and (3) developing new measures that move beyond traditional time and error metrics for evaluating visualizations that include spatial interaction.

HAL-CentraleSupelec

Publikationer från Linköpings universitet

Crossref

INRIA a CCSD electronic archive server

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Monash University Research Portal

HAL-Rennes 1