466 research outputs found

    BGM: Building a Dynamic Guidance Map without Visual Images for Trajectory Prediction

    Visual images usually contain informative context about the environment and thereby help to predict agents' behaviors. However, because their semantics are fixed, they hardly reflect the dynamic effects on agents' actual behaviors. To solve this problem, we propose a deterministic model named BGM that constructs a guidance map to represent dynamic semantics. The map avoids using visual images for each agent and reflects how activities differ across periods. We first record all agents' activities in the scene within a period close to the current time to construct a guidance map, and then feed it to a Context CNN to obtain context features. We adopt a Historical Trajectory Encoder to extract trajectory features and combine them with the context features as the input of the social-energy-based trajectory decoder, thus obtaining predictions that comply with social rules. Experiments demonstrate that BGM achieves state-of-the-art prediction accuracy on the two widely used ETH and UCY datasets and handles more complex scenarios.
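    The guidance-map step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the map is a simple grid of visit counts over a recent time window, normalized by the busiest cell. The function name `build_guidance_map` and the parameters `window`, `cell_size`, `grid_w`, and `grid_h` are hypothetical, and the Context CNN that would consume the map is omitted.

```python
from collections import Counter


def build_guidance_map(observations, t_now, window, cell_size, grid_w, grid_h):
    """Count recent agent visits per grid cell and scale them to [0, 1].

    observations: iterable of (t, x, y) tuples for all agents in the scene.
    Only points with t_now - window <= t <= t_now contribute (assumed rule).
    """
    counts = Counter()
    for t, x, y in observations:
        if t_now - window <= t <= t_now:
            col = int(x // cell_size)
            row = int(y // cell_size)
            # Ignore points that fall outside the grid.
            if 0 <= row < grid_h and 0 <= col < grid_w:
                counts[(row, col)] += 1

    grid = [[0.0] * grid_w for _ in range(grid_h)]
    peak = max(counts.values(), default=0)
    for (row, col), n in counts.items():
        grid[row][col] = n / peak  # peak > 0 whenever counts is non-empty
    return grid


# Hypothetical toy scene: the first point is too old to be counted.
observations = [(0, 0.5, 0.5), (8, 1.5, 0.5), (9, 1.2, 0.7), (10, 2.5, 1.5)]
guidance = build_guidance_map(observations, t_now=10, window=5,
                              cell_size=1.0, grid_w=3, grid_h=2)
print(guidance)
```

    Because only recent activity is counted, the same location can score differently at different times, which is the dynamic behavior that a fixed image cannot provide.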

    A detection-based pattern recognition framework and its applications

    The objective of this dissertation is to present a detection-based pattern recognition framework and demonstrate its applications in automatic speech recognition and broadcast news video story segmentation. Inspired by the studies of modern cognitive psychology and real-world pattern recognition systems, a detection-based pattern recognition framework is proposed to provide an alternative solution for some complicated pattern recognition problems. The primitive features are first detected and the task-specific knowledge hierarchy is constructed level by level; then a variety of heterogeneous information sources are combined together and the high-level context is incorporated as additional information at certain stages. A detection-based framework is a "divide-and-conquer" design paradigm for pattern recognition problems, which will decompose a conceptually difficult problem into many elementary sub-problems that can be handled directly and reliably. Some information fusion strategies will be employed to integrate the evidence from a lower level to form the evidence at a higher level. Such a fusion procedure continues until reaching the top level. Generally, a detection-based framework has many advantages: (1) more flexibility in both detector design and fusion strategies, as these two parts can be optimized separately; (2) parallel and distributed computational components in primitive feature detection. In such a component-based framework, any primitive component can be replaced by a new one while other components remain unchanged; (3) incremental information integration; (4) high level context information as additional information sources, which can be combined with bottom-up processing at any stage. This dissertation presents the basic principles, criteria, and techniques for detector design and hypothesis verification based on the statistical detection and decision theory. In addition, evidence fusion strategies were investigated in this dissertation. 
Several novel detection algorithms and evidence fusion methods were proposed, and their effectiveness was validated in automatic speech recognition and broadcast news video segmentation systems. We believe such a detection-based framework can be employed in more applications in the future. Ph.D. Committee Chair: Lee, Chin-Hui; Committee Member: Clements, Mark; Committee Member: Ghovanloo, Maysam; Committee Member: Romberg, Justin; Committee Member: Yuan, Min

    Affinity matrix with large eigenvalue gap for graph-based subspace clustering and semi-supervised classification

    In graph-based learning methods, the data graph or similarity matrix reveals the relationships between data points, reflecting similar attributes within a class and differences between classes. Inspired by the Davis–Kahan theorem, which states that the stability of a matrix's eigenvector space depends on its spectral distance (i.e. its eigenvalue gap), in this paper we propose a global local affinity matrix model with low rank subspace sparse representation (GLAM-LRSR) based on global information about the eigenvalue gap and the local distance between samples. This method approximates a similarity matrix with an ideal block-diagonal structure by maximizing the eigenvalue gap, and the local distance between data points is used as a regularization term that prevents the eigenvalue gap from becoming too large, which preserves the efficacy of the similarity matrix. We show that combining a subspace (LRSR) partitioning method such as Sparse Subspace Clustering (SSC) with the similarity matrix constructed by GLAM can improve the accuracy of subspace clustering, and that the similarity matrix constructed by GLAM-LRSR can be successfully applied to graph-based semi-supervised classification tasks. Our experiments on synthetic data as well as real-world datasets for face clustering, face recovery and motion segmentation clearly demonstrate the significant advantages of GLAM-LRSR and its effectiveness.

    Generalized Kernel-based Visual Tracking

    In this work we generalize plain mean shift (MS) trackers and attempt to overcome two limitations of standard MS trackers. It is well known that modeling and maintaining a representation of a target object is an important component of a successful visual tracker. However, little work has been done on building a robust template model for kernel-based MS tracking. In contrast to building a template from a single frame, we train a robust object representation model from a large amount of data. Tracking is viewed as a binary classification problem, and a discriminative classification rule is learned to distinguish between the object and the background. We adopt a support vector machine (SVM) for training. The tracker is then implemented by maximizing the classification score. An iterative optimization scheme very similar to MS is derived for this purpose. Comment: 12 page

    Algorithms for the Analysis of Spatio-Temporal Data from Team Sports

    Modern object tracking systems are able to simultaneously record trajectories (sequences of time-stamped location points) for large numbers of objects with high frequency and accuracy. The availability of trajectory datasets has created a demand for algorithms and tools to extract information from these data. In this thesis, we present several contributions intended to do this, and in particular, to extract information from trajectories tracking football (soccer) players during matches. Football player trajectories have particular properties that both facilitate and present challenges for the algorithmic approaches to information extraction. The key property that we look to exploit is that the movement of the players reveals information about their objectives through cooperative and adversarial coordinated behaviour, and this, in turn, reveals the tactics and strategies employed to achieve the objectives. While the approaches presented here naturally deal with the application-specific properties of football player trajectories, they also apply to other domains where objects are tracked, for example behavioural ecology, traffic and urban planning.

    Speech and neural network dynamics


    Surveillance video summarization based on trajectory rarity measure

    The dynamic summarization of surveillance videos has several critical applications, mainly due to the wide availability of digital cameras in environments such as airports, train and bus stations, shopping centers, stadiums, buildings, schools, hospitals, and roads, among others. This study presents an approach for generating dynamic summaries in the surveillance video domain based on human trajectories. It emphasizes trajectory descriptors used in conjunction with an unsupervised clustering method. Our approach contributes to the existing literature through its combination of methods and objectives. We hypothesize that clustering trajectories makes it possible to identify rare trajectories based on their morphology. The clustering produces numerous subsets of trajectories, or clusters, and the number of elements in a specific cluster is used to determine its rarity. Subsets with few components are rare, while those with a high number of elements are considered ordinary. The implications of our study therefore show that it is possible to use unsupervised clustering to automatically detect rare trajectories based on their morphology and to segment videos with this information. We experimented with different sets of trajectories, segmenting the rare videos from our ground truth. Research work
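    The cluster-size rarity rule described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's code: cluster labels are assumed to come from some earlier unsupervised clustering step, and the cut-off `max_rare_size` and the function name `rare_trajectories` are hypothetical, since the abstract does not say how "few components" is quantified.

```python
from collections import Counter


def rare_trajectories(labels, max_rare_size):
    """Return indices of trajectories that belong to small clusters.

    labels: one cluster label per trajectory (output of any clustering step).
    max_rare_size: assumed cut-off; clusters with at most this many
    members are treated as rare.
    """
    sizes = Counter(labels)
    return [i for i, label in enumerate(labels) if sizes[label] <= max_rare_size]


# Hypothetical labels for eight trajectories: cluster 0 is the ordinary one.
labels = [0, 0, 0, 1, 0, 2, 2, 0]
rare = rare_trajectories(labels, max_rare_size=2)
print(rare)
```

    The returned indices identify which trajectories, and hence which video segments, would be kept in the summary.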

    Human-aware Collaborative Manipulation with Reaching Motion Prediction

    This dissertation presents a possible approach to improving human-robot interaction in an industrial collaborative situation, where the human operator and a collaborative industrial robot work within a shared workspace. The approach focuses on a situation where part of the assembly process needs to be carried out by a human operator, whose assembly station is located on a workbench, and a robot is used to pick and place products in specific locations on the operator's work station. Because those locations can be accessed by either the robot or the human operator at any time, collisions can occur. They should be avoided in order to make the process more natural for the human operator and to avoid emergency stops of the collaborative robot, which then has to be restarted manually, decreasing productivity. To prevent those collisions, the proposed system defines key-areas in each of the locations as well as in other positions relevant to the collaborative task. The system uses a Kinect sensor and a neural network to track the user's hand over time, and Gaussian Mixture Models to predict the likely destination key-area given the trajectory observed up to that moment. If a collision is predicted, the robot pauses the task being executed in order to prevent it and, once the conflict has been resolved, resumes operation from the point where it stopped.

    Intelligent Sensors for Human Motion Analysis

    The book, "Intelligent Sensors for Human Motion Analysis," contains 17 articles published in the Special Issue of the Sensors journal. These articles deal with many aspects related to the analysis of human movement. New techniques and methods for pose estimation, gait recognition, and fall detection have been proposed and verified. Some of them will trigger further research, and some may become the backbone of commercial systems.