16,372 research outputs found

    A survey on trajectory clustering analysis

    Full text link
    This paper comprehensively surveys the development of trajectory clustering. Considering the critical role of trajectory data mining in modern intelligent systems for surveillance security, abnormal behavior detection, crowd behavior analysis, and traffic control, trajectory clustering has attracted growing attention. Existing trajectory clustering methods can be grouped into three categories: unsupervised, supervised and semi-supervised algorithms. In spite of achieving a certain level of development, trajectory clustering is limited in its success by complex conditions such as application scenarios and data dimensions. This paper provides a holistic understanding and deep insight into trajectory clustering, and presents a comprehensive analysis of representative methods and promising future directions

    Deep Trajectory for Recognition of Human Behaviours

    Full text link
    Identifying human actions in complex scenes is widely considered as a challenging research problem due to the unpredictable behaviors and variation of appearances and postures. For extracting variations in motion and postures, trajectories provide meaningful way. However, simple trajectories are normally represented by vector of spatial coordinates. In order to identify human actions, we must exploit structural relationship between different trajectories. In this paper, we propose a method that divides the video into N number of segments and then for each segment we extract trajectories. We then compute trajectory descriptor for each segment which capture the structural relationship among different trajectories in the video segment. For trajectory descriptor, we project all extracted trajectories on the canvas. This will result in texture image which can store the relative motion and structural relationship among the trajectories. We then train Convolution Neural Network (CNN) to capture and learn the representation from dense trajectories. . Experimental results shows that our proposed method out performs state of the art methods by 90.01% on benchmark data set

    Rapid online learning and robust recall in a neuromorphic olfactory circuit

    Full text link
    We present a neural algorithm for the rapid online learning and identification of odorant samples under noise, based on the architecture of the mammalian olfactory bulb and implemented on the Intel Loihi neuromorphic system. As with biological olfaction, the spike timing-based algorithm utilizes distributed, event-driven computations and rapid (one-shot) online learning. Spike timing-dependent plasticity rules operate iteratively over sequential gamma-frequency packets to construct odor representations from the activity of chemosensor arrays mounted in a wind tunnel. Learned odorants then are reliably identified despite strong destructive interference. Noise resistance is further enhanced by neuromodulation and contextual priming. Lifelong learning capabilities are enabled by adult neurogenesis. The algorithm is applicable to any signal identification problem in which high-dimensional signals are embedded in unknown backgrounds.Comment: 52 text pages; 8 figures. Version 3 includes a new figure and additional detail

    A Neural Network Approach to Joint Modeling Social Networks and Mobile Trajectories

    Full text link
    The accelerated growth of mobile trajectories in location-based services brings valuable data resources to understand users' moving behaviors. Apart from recording the trajectory data, another major characteristic of these location-based services is that they also allow the users to connect whomever they like. A combination of social networking and location-based services is called as location-based social networks (LBSN). As shown in previous works, locations that are frequently visited by socially-related persons tend to be correlated, which indicates the close association between social connections and trajectory behaviors of users in LBSNs. In order to better analyze and mine LBSN data, we present a novel neural network model which can joint model both social networks and mobile trajectories. In specific, our model consists of two components: the construction of social networks and the generation of mobile trajectories. We first adopt a network embedding method for the construction of social networks: a networking representation can be derived for a user. The key of our model lies in the component of generating mobile trajectories. We have considered four factors that influence the generation process of mobile trajectories, namely user visit preference, influence of friends, short-term sequential contexts and long-term sequential contexts. To characterize the last two contexts, we employ the RNN and GRU models to capture the sequential relatedness in mobile trajectories at different levels, i.e., short term or long term. Finally, the two components are tied by sharing the user network representations. Experimental results on two important applications demonstrate the effectiveness of our model. Especially, the improvement over baselines is more significant when either network structure or trajectory data is sparse.Comment: Accepted by ACM TOI

    Robobarista: Learning to Manipulate Novel Objects via Deep Multimodal Embedding

    Full text link
    There is a large variety of objects and appliances in human environments, such as stoves, coffee dispensers, juice extractors, and so on. It is challenging for a roboticist to program a robot for each of these object types and for each of their instantiations. In this work, we present a novel approach to manipulation planning based on the idea that many household objects share similarly-operated object parts. We formulate the manipulation planning as a structured prediction problem and learn to transfer manipulation strategy across different objects by embedding point-cloud, natural language, and manipulation trajectory data into a shared embedding space using a deep neural network. In order to learn semantically meaningful spaces throughout our network, we introduce a method for pre-training its lower layers for multimodal feature embedding and a method for fine-tuning this embedding space using a loss-based margin. In order to collect a large number of manipulation demonstrations for different objects, we develop a new crowd-sourcing platform called Robobarista. We test our model on our dataset consisting of 116 objects and appliances with 249 parts along with 250 language instructions, for which there are 1225 crowd-sourced manipulation demonstrations. We further show that our robot with our model can even prepare a cup of a latte with appliances it has never seen before.Comment: Journal Versio

    Robobarista: Object Part based Transfer of Manipulation Trajectories from Crowd-sourcing in 3D Pointclouds

    Full text link
    There is a large variety of objects and appliances in human environments, such as stoves, coffee dispensers, juice extractors, and so on. It is challenging for a roboticist to program a robot for each of these object types and for each of their instantiations. In this work, we present a novel approach to manipulation planning based on the idea that many household objects share similarly-operated object parts. We formulate the manipulation planning as a structured prediction problem and design a deep learning model that can handle large noise in the manipulation demonstrations and learns features from three different modalities: point-clouds, language and trajectory. In order to collect a large number of manipulation demonstrations for different objects, we developed a new crowd-sourcing platform called Robobarista. We test our model on our dataset consisting of 116 objects with 249 parts along with 250 language instructions, for which there are 1225 crowd-sourced manipulation demonstrations. We further show that our robot can even manipulate objects it has never seen before.Comment: In International Symposium on Robotics Research (ISRR) 201

    Human Action Recognition and Prediction: A Survey

    Full text link
    Derived from rapid advances in computer vision and machine learning, video analysis tasks have been moving from inferring the present state to predicting the future state. Vision-based action recognition and prediction from videos are such tasks, where action recognition is to infer human actions (present state) based upon complete action executions, and action prediction to predict human actions (future state) based upon incomplete action executions. These two tasks have become particularly prevalent topics recently because of their explosively emerging real-world applications, such as visual surveillance, autonomous driving vehicle, entertainment, and video retrieval, etc. Many attempts have been devoted in the last a few decades in order to build a robust and effective framework for action recognition and prediction. In this paper, we survey the complete state-of-the-art techniques in the action recognition and prediction. Existing models, popular algorithms, technical difficulties, popular action databases, evaluation protocols, and promising future directions are also provided with systematic discussions

    Learning a Deep Model for Human Action Recognition from Novel Viewpoints

    Full text link
    Recognizing human actions from unknown and unseen (novel) views is a challenging problem. We propose a Robust Non-Linear Knowledge Transfer Model (R-NKTM) for human action recognition from novel views. The proposed R-NKTM is a deep fully-connected neural network that transfers knowledge of human actions from any unknown view to a shared high-level virtual view by finding a non-linear virtual path that connects the views. The R-NKTM is learned from dense trajectories of synthetic 3D human models fitted to real motion capture data and generalizes to real videos of human actions. The strength of our technique is that we learn a single R-NKTM for all actions and all viewpoints for knowledge transfer of any real human action video without the need for re-training or fine-tuning the model. Thus, R-NKTM can efficiently scale to incorporate new action classes. R-NKTM is learned with dummy labels and does not require knowledge of the camera viewpoint at any stage. Experiments on three benchmark cross-view human action datasets show that our method outperforms existing state-of-the-art

    Temporal Cycle-Consistency Learning

    Full text link
    We introduce a self-supervised representation learning method based on the task of temporal alignment between videos. The method trains a network using temporal cycle consistency (TCC), a differentiable cycle-consistency loss that can be used to find correspondences across time in multiple videos. The resulting per-frame embeddings can be used to align videos by simply matching frames using the nearest-neighbors in the learned embedding space. To evaluate the power of the embeddings, we densely label the Pouring and Penn Action video datasets for action phases. We show that (i) the learned embeddings enable few-shot classification of these action phases, significantly reducing the supervised training requirements; and (ii) TCC is complementary to other methods of self-supervised learning in videos, such as Shuffle and Learn and Time-Contrastive Networks. The embeddings are also used for a number of applications based on alignment (dense temporal correspondence) between video pairs, including transfer of metadata of synchronized modalities between videos (sounds, temporal semantic labels), synchronized playback of multiple videos, and anomaly detection. Project webpage: https://sites.google.com/view/temporal-cycle-consistency .Comment: Accepted at CVPR 2019. Project webpage: https://sites.google.com/view/temporal-cycle-consistenc

    Artificial Neural Networks Applied to Taxi Destination Prediction

    Full text link
    We describe our first-place solution to the ECML/PKDD discovery challenge on taxi destination prediction. The task consisted in predicting the destination of a taxi based on the beginning of its trajectory, represented as a variable-length sequence of GPS points, and diverse associated meta-information, such as the departure time, the driver id and client information. Contrary to most published competitor approaches, we used an almost fully automated approach based on neural networks and we ranked first out of 381 teams. The architectures we tried use multi-layer perceptrons, bidirectional recurrent neural networks and models inspired from recently introduced memory networks. Our approach could easily be adapted to other applications in which the goal is to predict a fixed-length output from a variable-length sequence.Comment: ECML/PKDD discovery challeng
    • …
    corecore