17 research outputs found

    A human activity recognition framework using max-min features and key poses with differential evolution random forests classifier

    This paper presents a novel framework for human daily activity recognition designed to require few training examples and to train quickly, making it suitable for real-time applications. The framework starts with a feature extraction stage, in which each activity is divided into variable-size actions based on key poses. Each action window is delimited by two consecutive, automatically identified key poses, from which static (i.e. geometrical) and max-min dynamic (i.e. temporal) features are extracted. These features are first used to train a random forest (RF) classifier, which was tested on the CAD-60 dataset and obtained relevant overall average results. In a second stage, an extension of the RF is proposed in which the differential evolution meta-heuristic algorithm is used as the splitting-node methodology. Its main advantage is that the differential evolution random forest has no thresholds to tune, only a few adjustable parameters with well-defined behavior.
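    The key-pose windowing and max-min feature extraction described above can be sketched as follows. This is an illustrative toy version, not the paper's code: the motion-energy criterion for detecting key poses and all names are assumptions.

    ```python
    # Sketch (assumed, not the authors' implementation): given per-frame
    # skeleton feature vectors, identify key poses as local minima of
    # frame-to-frame motion energy, then extract one max-min descriptor
    # per action window delimited by consecutive key poses.
    import numpy as np

    def key_pose_indices(features):
        """Frames whose motion energy (frame-to-frame change) is a local minimum."""
        energy = np.linalg.norm(np.diff(features, axis=0), axis=1)
        idx = [0]
        for t in range(1, len(energy) - 1):
            if energy[t] < energy[t - 1] and energy[t] < energy[t + 1]:
                idx.append(t + 1)  # energy[t] measures the change into frame t+1
        idx.append(len(features) - 1)
        return idx

    def max_min_features(features, keys):
        """One descriptor per action window: per-dimension max and min values."""
        out = []
        for a, b in zip(keys[:-1], keys[1:]):
            win = features[a:b + 1]
            out.append(np.concatenate([win.max(axis=0), win.min(axis=0)]))
        return np.array(out)
    ```

    Each window descriptor doubles the feature dimension (max plus min per feature), which keeps the representation compact regardless of window length.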

    Vision-based kinematic structure learning of arbitrary articulated rigid objects

    The Kinematic Structure (KS) is a compact and structured representation that fully discloses the motion capabilities of an articulated rigid object. Estimating KSs is thus an active topic in the fields of computer vision and robotics, with applications in robot manipulation tasks, unsupervised motion retargeting, and robot-assisted dressing, to name a few possibilities. While previous approaches are typically offline or computationally demanding, in this thesis, novel KS estimation methods from vision-based data that are suitable for real-time applications will be developed. The thesis starts by providing empirical evidence that initially representing the object by semi-dense three-dimensional (3D) points is a valid compromise between accuracy and computational processing costs. The problem of motion segmentation of articulated rigid bodies from semi-dense 3D points is then cast as a subspace clustering problem. Online processing is explored to handle incomplete point trajectories and partial occlusions during KS estimation. A suitable incremental metric representation of the tracked semi-dense 3D points is proposed, based on the observation that the distance between points belonging to the same rigid part is constant. This representation allows mitigating noise and points' tracking errors while implicitly encoding motion information, which is combined with the object's topological distances to build more plausible KSs. The use of event cameras to estimate KSs is considered for the first time in this thesis. A novel framework for event-based motion estimation is proposed, which can estimate the parameters of several motion models. The framework does not rely on any intermediate image-based representation and can thus handle augmented events from additional sensors. 
An incremental version of this framework is then used to perform joint shape-motion segmentation for event-based KS estimation without having to track feature points, which represents a paradigm shift in vision-based KS estimation. New challenging sequences for KS estimation are also made available. Experimental results corroborate that event cameras outperform frame-based cameras on motion-related tasks, and specifically on KS estimation. This thesis advances the state of the art in vision-based KS estimation by proposing new frame-based KS estimation methods and taking the first steps towards using event cameras for KS estimation. Its contributions are likely to bolster real-time applications that rely on KSs.
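The key observation above, that distances between points on the same rigid part stay constant over time, can be sketched as a simple grouping rule. This toy version is an assumption for illustration, not the thesis's incremental method: the union-find grouping, the variance test, and the tolerance are all illustrative choices.

```python
# Sketch (assumed): group tracked 3D points into rigid parts by checking
# whether their pairwise distances stay (near-)constant across frames.
import numpy as np

def rigid_part_labels(tracks, tol=1e-3):
    """tracks: (T, N, 3) point trajectories -> list of N per-point part labels."""
    T, N, _ = tracks.shape
    parent = list(range(N))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(N):
        for j in range(i + 1, N):
            d = np.linalg.norm(tracks[:, i] - tracks[:, j], axis=1)
            if d.var() < tol:  # distance ~constant -> same rigid part
                parent[find(i)] = find(j)

    roots = [find(i) for i in range(N)]
    relabel = {r: k for k, r in enumerate(dict.fromkeys(roots))}
    return [relabel[r] for r in roots]
```

In an online setting the distance variances could be updated incrementally as new frames arrive, which is in the spirit of (but much simpler than) the incremental metric representation described above.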

    Key-Frame-Based Human Activity Recognition Using Random Forests (original title: Reconhecimento de Atividades Humanas Baseado em Quadros Chave Usando Florestas Aleatórias)

    No full text
    Master's dissertation in Electrical and Computer Engineering presented to the Faculdade de Ciências e Tecnologia. Human activities are complex, not only in their spatial and temporal patterns, but also because every human is physically different. The same activity is performed differently by different people, and even the same person may execute it with distinct movements. There is therefore a need for machine learning methods able to handle this variability. However, even with the growing attention this research area is receiving, most current learning approaches, although accurate, are slow to train and classify. These limitations make them infeasible for real-time applications, where a robot must rapidly and correctly identify a performed activity in order to respond accordingly. A human activity recognition framework, featuring fast training and requiring few training examples, is therefore proposed, consisting of two main components: a feature extraction component (i.e. extraction of relevant information describing a given activity) and a machine learning component based on the random forest classifier. An initial approach segments each activity into a sequence of fixed-size action windows (i.e. no frame-by-frame classification), extracting only the maximum and minimum values of each skeleton-based feature. After experimental validation, a second approach was developed and tested, dividing each activity into variable-size windows based on key poses. Each action window is delimited by two consecutive, automatically identified key poses, from which static (i.e. geometrical) and max-min dynamic (i.e. temporal) features are extracted. These approaches were first tested using the Cornell Activity Dataset and the random forest classifier provided by the Weka software, obtaining relevant overall average results. A custom random forest classifier was then developed from scratch, using a differential evolution algorithm as part of its core. It was tested and its results compared with those previously obtained with Weka's random forest, showing a slight improvement in the considered performance indicators. A custom dataset was also built using an RGB-D sensor, on which both random forest classifiers were tested. The whole framework, after validation in Matlab, was implemented on a real robotic platform (in C++). After some concluding remarks, open research directions are highlighted to improve the framework's characteristics and address its limitations.
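    One way to read "a differential evolution algorithm as part of its core" is as a DE search over candidate split thresholds at each tree node, replacing exhaustive threshold scanning. The sketch below is a minimal single-feature interpretation; the objective (weighted child Gini impurity), the DE variant, and all parameters are assumptions for illustration, not the dissertation's code.

    ```python
    # Sketch (assumed): differential evolution over a scalar split threshold,
    # minimising the Gini impurity of the two resulting child nodes.
    import numpy as np

    def gini(y):
        _, counts = np.unique(y, return_counts=True)
        p = counts / counts.sum()
        return 1.0 - np.sum(p ** 2)

    def de_best_split(x, y, pop=20, gens=40, F=0.5, CR=0.9, seed=0):
        """Evolve a threshold on feature x minimising the weighted child Gini."""
        rng = np.random.default_rng(seed)
        lo, hi = x.min(), x.max()
        thr = rng.uniform(lo, hi, pop)

        def cost(t):
            left, right = y[x <= t], y[x > t]
            if len(left) == 0 or len(right) == 0:
                return np.inf  # degenerate split
            n = len(y)
            return len(left) / n * gini(left) + len(right) / n * gini(right)

        fit = np.array([cost(t) for t in thr])
        for _ in range(gens):
            for i in range(pop):
                a, b, c = rng.choice(pop, 3, replace=False)
                trial = thr[a] + F * (thr[b] - thr[c]) if rng.random() < CR else thr[i]
                trial = np.clip(trial, lo, hi)
                c_new = cost(trial)
                if c_new <= fit[i]:  # greedy selection
                    thr[i], fit[i] = trial, c_new
        return thr[np.argmin(fit)]
    ```

    Because DE only needs the objective value, the same search works for split criteria that have no closed-form optimum, which is consistent with the abstract's claim of "no thresholds to tune, only a few adjustable parameters".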

    Time-to-Contact Map by Joint Estimation of Up-to-Scale Inverse Depth and Global Motion using a Single Event Camera

    No full text
    Event cameras asynchronously report brightness changes with a temporal resolution on the order of microseconds, which makes them inherently suitable for problems that involve rapid motion perception. In this paper, we address the problem of time-to-contact (TTC) estimation using a single event camera. This problem is typically addressed by estimating a single global TTC measure, which explicitly assumes that the surface/obstacle is planar and fronto-parallel. We relax this assumption by proposing an incremental event-based method that jointly estimates the (up-to-scale) inverse depth and global motion using a single event camera. The proposed method is reliable and fast while asynchronously maintaining a TTC map (TTCM), which provides per-pixel TTC estimates. As a side product, the proposed method can also estimate per-event optical flow. We achieve state-of-the-art performance on TTC estimation in terms of accuracy and runtime per event, while achieving competitive performance on optical flow estimation.
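    A short sketch of why an up-to-scale inverse depth suffices for TTC: if depth and camera translation are recovered up to the same unknown scale s (estimated depth Z' = Z/s, estimated approach speed v' = v_z/s), then with inverse depth rho = 1/Z' the quantity 1/(rho * v') equals Z/v_z, independent of s. The function and names below are illustrative assumptions, not the paper's method.

    ```python
    # Sketch (assumed): per-pixel TTC from up-to-scale inverse depth and an
    # up-to-scale approach speed; the unknown monocular scale cancels.
    import numpy as np

    def ttc_map(inv_depth, approach_speed):
        """inv_depth: per-pixel up-to-scale inverse depth; approach_speed:
        up-to-scale speed along the optical axis. Returns per-pixel TTC."""
        with np.errstate(divide="ignore"):
            return 1.0 / (inv_depth * approach_speed)
    ```

    This is the reason the abstract can promise metric TTC values (in seconds) from a purely monocular, scale-ambiguous reconstruction.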

    Comparing Several P300-Based Visuo-Auditory Brain-Computer Interfaces for a Completely Locked-in ALS Patient: A Longitudinal Case Study

    No full text
    In a completely locked-in state (CLIS), often resulting from traumatic brain injury or neurodegenerative diseases like amyotrophic lateral sclerosis (ALS), patients lose voluntary muscle control, including eye movement, making communication impossible. Brain-computer interfaces (BCIs) offer hope for restoring communication, but achieving reliable communication with these patients remains a challenge. This study details the design, testing, and comparison of nine visuo-auditory P300-based BCIs (combining different visual and auditory stimuli and different visual layouts) with a CLIS patient over ten months. The aim was to evaluate the impact of these stimuli in achieving effective communication. While some interfaces showed promising progress, achieving up to 90% online accuracy in one session, replicating this success in subsequent sessions proved challenging, with the average online accuracy across all sessions being 56.4 ± 15.2%. The intertrial variability in EEG signals and the low discrimination between target and non-target events were the main challenges. Moreover, the lack of communication with the patient made BCI design a challenging blind trial-and-error process. Despite the inconsistency of the results, it was possible to infer that the combination of visual and auditory stimuli had a positive impact, and that there was an improvement over time.
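    P300 spellers generally work by averaging the EEG epochs elicited by each stimulus and selecting the stimulus whose average response most resembles a target (P300) waveform. The sketch below is a generic illustration of that principle, not this study's pipeline; the template-matching score and all names are assumptions.

    ```python
    # Sketch (assumed, generic P300 principle): average the epochs recorded
    # for each stimulus and score the average against a target template;
    # the stimulus with the highest score is taken as the attended one.
    import numpy as np

    def pick_target(epochs_by_stimulus, template):
        """epochs_by_stimulus: {stim: (n_epochs, n_samples) array}.
        Returns the stimulus whose averaged epoch best matches the template."""
        scores = {s: float(np.mean(e, axis=0) @ template)
                  for s, e in epochs_by_stimulus.items()}
        return max(scores, key=scores.get)
    ```

    Averaging across repeated trials is what makes the paradigm usable at all: the low single-trial discrimination between target and non-target events noted above is partly compensated by noise cancelling out over repetitions.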

    TAXONOMY OF THE SOUTH AMERICAN DWARF BOAS OF THE GENUS TROPIDOPHIS BIBRON, 1840, WITH THE DESCRIPTION OF TWO NEW SPECIES FROM THE ATLANTIC FOREST (SERPENTES: TROPIDOPHIIDAE)

    No full text
    A taxonomic study of the South American dwarf boas of the genus Tropidophis revealed the existence of two new species in the Atlantic Forest biome. As a result, we recognize five mainland species, three in the Atlantic Forest and two in northwestern South America. Based on general distribution and morphological variation, the type locality of T. paucisquamis is restricted to Estação Biológica de Boracéia (EBB), municipality of Salesópolis, state of São Paulo, Brazil; furthermore, a lectotype for T. taczanowskyi is designated. We provide data on the hemipenial morphology of two South American Tropidophis, showing that the quadrifurcate condition described for West Indian taxa also occurs in mainland congeners. The distributions of the three Atlantic Forest species are congruent with patterns of diversification of other vertebrate taxa associated with the cold climates prevalent at high elevations. Refugial isolation and riverine barriers may account for such speciation events. Funding: Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP); Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq); Programa de Capacitação em Taxonomia (PROTAX), CNPq [52193/2008-1].

    Control of Brain Activity in hMT+/V5 at Three Response Levels Using fMRI-Based Neurofeedback/BCI

    No full text
    A major challenge in brain-computer interface (BCI) research is to increase the number of command classes and levels of control. BCI studies often use binary control-level approaches (levels 0 and 1 of brain activation for each control class): different classes may often be achieved, but not different levels of activation for the same class. Increasing the number of control levels may allow for greater efficiency in neurofeedback applications. In this work we test the hypothesis that more than two modulation levels can be achieved in a single brain region, the hMT+/V5 complex. Participants performed three distinct imagery tasks during neurofeedback training: imagery of a stationary dot, imagery of a dot with two opposing motions along the vertical axis, and imagery of a dot with four opposing motions along the vertical or horizontal axes (imagery of 2 or 4 motion directions). The larger the number of motion alternations, the higher the expected hMT+/V5 response. A substantial number of participants (17 of 20) achieved successful binary control, and 12 were able to reach three significant levels of control within the same session, confirming the whole-group effects at the individual level. With this simple approach we suggest that it is possible to design a parametric control system based on activity modulation of a specific brain region with at least three different levels. Furthermore, we show that particular imagery task instructions, based on different numbers of motion alternations, make different control levels feasible in BCI and/or neurofeedback applications.

    Example of hMT+/V5 identification using the defined localizer.

    No full text
    GLM conjunction analysis (using the stringent criterion that all individual motion vs. static BOLD signal contrasts had to be significant for a voxel to be considered positive) shows the resulting ROI: hMT+/V5. Regions are shown at the same statistical threshold level (P < 0.0001, Bonferroni corrected).