
    Multiple human tracking in RGB-depth data: A survey

    Multiple human tracking (MHT) is a fundamental task in many computer vision applications. Appearance-based approaches, primarily formulated on RGB data, are constrained by problems arising from occlusions and/or illumination variations. In recent years, the arrival of cheap RGB-depth devices has led to many new approaches to MHT, many of which integrate colour and depth cues to improve every stage of the process. In this survey, the authors present the common processing pipeline of these methods and review their methodology based on (a) how they implement this pipeline and (b) what role depth plays within each stage of it. They identify and introduce existing, publicly available benchmark datasets and software resources that fuse colour and depth data for MHT. Finally, they present a brief comparative evaluation of the performance of those works that have applied their methods to these datasets.

    Camera localization using trajectories and maps

    We propose a new Bayesian framework for automatically determining the position (location and orientation) of an uncalibrated camera using the observations of moving objects and a schematic map of the passable areas of the environment. Our approach takes advantage of static and dynamic information on the scene structure through prior probability distributions for object dynamics. The proposed approach restricts the plausible positions where the sensor can be located while taking into account the inherent ambiguity of the given setting. The framework samples from the posterior probability distribution for the camera position via data-driven MCMC, guided by an initial geometric analysis that restricts the search space. A Kullback-Leibler divergence analysis then yields the final camera position estimate while explicitly isolating ambiguous settings. The proposed approach is evaluated in synthetic and real environments, showing satisfactory performance in both ambiguous and unambiguous settings.
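    The abstract gives no implementation details; as a rough, hypothetical sketch of the data-driven MCMC ingredient it mentions (sampling camera poses from a posterior), a random-walk Metropolis-Hastings loop over a pose (x, y, heading) could look like the following. The log_posterior callable, step sizes, and sample counts are placeholders, not values from the paper.

```python
import numpy as np

def metropolis_hastings(log_posterior, init_pose, n_samples=5000, step=(0.1, 0.1, 0.05)):
    """Generic Metropolis-Hastings over a camera pose (x, y, heading).

    log_posterior: callable returning the unnormalised log posterior of a pose;
    in the paper's setting it would combine object-trajectory observations with
    a map-based prior, but here it is just a placeholder argument.
    """
    pose = np.asarray(init_pose, dtype=float)
    samples = [pose.copy()]
    logp = log_posterior(pose)
    for _ in range(n_samples):
        proposal = pose + np.random.normal(0.0, step)   # random-walk proposal
        logp_new = log_posterior(proposal)
        if np.log(np.random.rand()) < logp_new - logp:  # accept/reject
            pose, logp = proposal, logp_new
        samples.append(pose.copy())
    return np.array(samples)

# Toy usage with a hypothetical log posterior: a Gaussian pulling the pose
# toward a nominal location, standing in for trajectory/map evidence.
toy_log_post = lambda p: -0.5 * np.sum((p - np.array([2.0, 3.0, 0.5])) ** 2)
chain = metropolis_hastings(toy_log_post, init_pose=(0.0, 0.0, 0.0), n_samples=2000)
estimate = chain[len(chain) // 2:].mean(axis=0)  # posterior mean after burn-in
```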

    Learning and inference with Wasserstein metrics

    Thesis: Ph.D., Massachusetts Institute of Technology, Department of Brain and Cognitive Sciences, 2018, by Charles Frogner. Cataloged from the PDF version of the thesis. Includes bibliographical references (pages 131-143).

    This thesis develops new approaches for three problems in machine learning, using tools from the study of optimal transport (or Wasserstein) distances between probability distributions. Optimal transport distances capture an intuitive notion of similarity between distributions by incorporating the underlying geometry of the domain of the distributions. Despite their intuitive appeal, optimal transport distances are often difficult to apply in practice, as computing them requires solving a costly optimization problem. In each setting studied here, we describe a numerical method that overcomes this computational bottleneck and enables scaling to real data.

    In the first part, we consider the problem of multi-output learning in the presence of a metric on the output domain. We develop a loss function that measures the Wasserstein distance between the prediction and the ground truth, and describe an efficient learning algorithm based on entropic regularization of the optimal transport problem. We additionally propose a novel extension of the Wasserstein distance from probability measures to unnormalized measures, which is applicable in settings where the ground truth is not naturally expressed as a probability distribution. We show statistical learning bounds for both the Wasserstein loss and its unnormalized counterpart. The Wasserstein loss can encourage smoothness of the predictions with respect to a chosen metric on the output space. We demonstrate this property on a real-data image tagging problem, outperforming a baseline that does not use the metric.

    In the second part, we consider the probabilistic inference problem for diffusion processes. Such processes model a variety of stochastic phenomena and appear often in continuous-time state space models. Exact inference for diffusion processes is generally intractable. In this work, we describe a novel approximate inference method based on a characterization of the diffusion as following a gradient flow in a space of probability densities endowed with a Wasserstein metric. Existing methods for computing this Wasserstein gradient flow rely on discretizing the underlying domain of the diffusion, prohibiting their application to problems in more than several dimensions. We propose a novel algorithm for computing a Wasserstein gradient flow that operates directly in a space of continuous functions, free of any underlying mesh. We apply our approximate gradient flow to the problem of filtering a diffusion, showing superior performance where standard filters struggle.

    Finally, we study the ecological inference problem: reasoning from aggregate measurements of a population to inferences about the individual behaviors of its members. This problem arises often when dealing with data from economics and political science, such as when attempting to infer the demographic breakdown of votes for each political party, given only the aggregate demographic and vote counts separately. Ecological inference is generally ill-posed and requires prior information to distinguish a unique solution. We propose a novel, general framework for ecological inference that allows for a variety of priors and enables efficient computation of the most probable solution. Unlike previous methods, which rely on Monte Carlo estimates of the posterior, our inference procedure uses an efficient fixed-point iteration that is linearly convergent. Given suitable prior information, our method can achieve more accurate inferences than existing methods. We additionally explore a sampling algorithm for estimating credible regions.
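    The first part of the thesis hinges on entropic regularization of the optimal transport problem to make the Wasserstein loss tractable. As a generic illustration of that building block (not the thesis code), a minimal Sinkhorn iteration for the entropy-regularized transport cost between two histograms might look like this; the grid, ground cost, and regularization strength are arbitrary placeholders.

```python
import numpy as np

def sinkhorn(a, b, C, eps=0.1, n_iters=200):
    """Entropy-regularised optimal transport between histograms a and b.

    a, b : nonnegative weight vectors summing to 1
    C    : cost matrix, C[i, j] = ground cost between bin i and bin j
    eps  : regularisation strength (smaller -> closer to exact OT, slower)
    Returns the transport plan and the regularised transport cost.
    """
    K = np.exp(-C / eps)             # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):         # alternating scaling updates
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = np.diag(u) @ K @ np.diag(v)  # transport plan
    return P, np.sum(P * C)

# Toy usage: two histograms on a 1-D grid with squared-distance ground cost.
x = np.linspace(0, 1, 50)
C = (x[:, None] - x[None, :]) ** 2
a = np.exp(-((x - 0.3) ** 2) / 0.01); a /= a.sum()
b = np.exp(-((x - 0.7) ** 2) / 0.01); b /= b.sum()
plan, cost = sinkhorn(a, b, C)
```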

    Online growing neural gas for anomaly detection in changing surveillance scenes

    Anomaly detection is still a challenging task for video surveillance due to complex environments and unpredictable human behaviors. Most existing approaches train offline detectors using manually labeled data and predefined parameters, and are therefore hard to adapt to changing scenes. This paper introduces a neural-network-based model called online Growing Neural Gas (online GNG) to perform unsupervised learning. Unlike a parameter-fixed GNG, our model updates its learning parameters continuously, for which we propose several online neighbor-related strategies. Specific operations, namely neuron insertion, deletion, learning-rate adaptation and stopping-criterion selection, are upgraded to online modes. In the anomaly detection stage, behavior patterns far away from our model are labeled as anomalous, where "far away" is measured by a time-varying threshold. Experiments are conducted on three surveillance datasets, namely UMN, UCSD Ped1/Ped2 and Avenue. All datasets contain changing scenes due to variable crowd density and behavior types. Anomaly detection results show that our model can adapt to the current scene rapidly and reduce false alarms while still detecting most anomalies. Quantitative comparisons with 12 recent approaches further confirm the superiority of our model. This work was supported by the National Natural Science Foundation of China (NSFC) [61673030, 61340046, 60875050, 60675025], the National High Technology Research and Development Program of China (863 Program) [2006AA04Z247], the Scientific Research Project of Guangdong Province [2015B010919004], and the National High-level Talent Special Support Program.
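    The abstract states the core detection rule at a high level: behavior patterns far from the learned model are flagged, with a time-varying threshold. The following is a heavily simplified, hypothetical stand-in for that idea (nearest-prototype distance with an adaptive threshold), not the authors' online GNG; the prototype count, learning rate, and threshold update rule are placeholders.

```python
import numpy as np

class PrototypeAnomalyDetector:
    """Simplified stand-in for the online-GNG idea described above (not the
    authors' algorithm): maintain a small set of prototypes that adapt online,
    and flag samples whose distance to the nearest prototype exceeds a
    time-varying threshold."""

    def __init__(self, n_prototypes=20, lr=0.05, thresh_decay=0.99):
        self.n_prototypes = n_prototypes
        self.lr = lr
        self.thresh_decay = thresh_decay
        self.prototypes = None
        self.threshold = None

    def update(self, x):
        x = np.asarray(x, dtype=float)
        if self.prototypes is None:                      # lazy initialisation
            self.prototypes = np.tile(x, (self.n_prototypes, 1))
            self.prototypes += 0.01 * np.random.randn(*self.prototypes.shape)
            self.threshold = 1.0
        d = np.linalg.norm(self.prototypes - x, axis=1)
        winner = np.argmin(d)
        is_anomaly = d[winner] > self.threshold
        # Move the winning prototype toward the sample (online adaptation).
        self.prototypes[winner] += self.lr * (x - self.prototypes[winner])
        # Time-varying threshold: exponential moving estimate of typical distance.
        self.threshold = (self.thresh_decay * self.threshold
                          + (1 - self.thresh_decay) * 3.0 * d[winner])
        return is_anomaly

# Toy usage on a stream of 2-D "behavior pattern" feature vectors.
det = PrototypeAnomalyDetector()
flags = [det.update(x) for x in np.random.randn(500, 2)]
```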

    An Unsupervised Cluster: Learning Water Customer Behavior Using Variation of Information on a Reconstructed Phase Space

    The unsupervised clustering algorithm described in this dissertation addresses the need to divide a population of water utility customers into groups based on their similarities and differences, using only the measured flow data collected by water meters. After clustering, the groups represent customers with similar consumption behavior patterns and provide insight into ‘normal’ and ‘unusual’ customer behavior patterns. This research focuses on individually metered water utility customers and includes both residential and commercial customer accounts serviced by utilities within North America. The contributions of this dissertation not only represent novel academic work, but also solve a practical problem for the utility industry. This dissertation introduces a method of agglomerative clustering using information-theoretic distance measures on Gaussian mixture models within a reconstructed phase space. The clustering method accommodates a utility’s limited human, financial, computational, and environmental resources. The proposed weighted variation of information distance measure for comparing Gaussian mixture models places greater emphasis on behaviors whose statistical distributions are compact than on behaviors with large variation, and contributes a novel addition to existing comparison options.
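    Two building blocks named in the abstract, a reconstructed (time-delay) phase space and per-customer Gaussian mixture models, can be illustrated generically as follows. This is an illustrative sketch, not the dissertation's code; the embedding dimension, delay, and component count are arbitrary placeholders.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def delay_embed(series, dim=3, tau=4):
    """Reconstructed phase space: stack time-delayed copies of a 1-D flow series."""
    n = len(series) - (dim - 1) * tau
    return np.column_stack([series[i * tau : i * tau + n] for i in range(dim)])

# Toy usage: embed a customer's flow series and summarise it with a GMM; in an
# agglomerative scheme, per-customer GMMs would then be compared by a distance
# such as the (weighted) variation of information mentioned in the abstract.
flow = np.sin(np.linspace(0, 20 * np.pi, 2000)) + 0.1 * np.random.randn(2000)
X = delay_embed(flow, dim=3, tau=4)
gmm = GaussianMixture(n_components=4, covariance_type="full").fit(X)
```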

    Uncovering Intratumoral And Intertumoral Heterogeneity Among Single-Cell Cancer Specimens

    While several tools have been developed to map axes of variation among individual cells, no analogous approaches exist for identifying axes of variation among multicellular biospecimens profiled at single-cell resolution. Developing such an approach is of great translational relevance and interest, as single-cell expression data are now often collected across numerous experimental conditions (e.g., representing different drug perturbation conditions, CRISPR knockdowns, or patients undergoing clinical trials) that need to be compared. In this work, “Phenotypic Earth Mover's Distance” (PhEMD) is presented as a solution to this problem. PhEMD is a general method for embedding a “manifold of manifolds,” in which each datapoint in the higher-level manifold (of biospecimens) represents a collection of points that span a lower-level manifold (of cells). PhEMD is applied to a newly generated, 300-biospecimen mass cytometry drug screen experiment to map small-molecule inhibitors based on their differing effects on breast cancer cells undergoing epithelial–mesenchymal transition (EMT). These experiments highlight EGFR and MEK1/2 inhibitors as strongly halting EMT at an early stage and PI3K/mTOR/Akt inhibitors as enriching for a drug-resistant mesenchymal cell subtype characterized by high expression of phospho-S6. More generally, these experiments reveal that the final mapping of perturbation conditions has low intrinsic dimension and that the network of drugs demonstrates manifold structure, providing insight into how these single-cell experiments should be computationally modeled and visualized. In the presented drug-screen experiment, the full spectrum of perturbation effects could be learned by profiling just a small fraction (11%) of drugs. Moreover, PhEMD could be integrated with complementary datasets to infer the phenotypes of biospecimens not directly profiled with single-cell technologies. Together, these findings have major implications for conducting future drug-screen experiments, as they suggest that large-scale drug screens can be conducted by measuring only a small fraction of the drugs using the most expensive high-throughput single-cell technologies; the effects of other drugs may be inferred by mapping and extending the perturbation space. PhEMD is also applied to patient tumor biopsies to assess intertumoral heterogeneity. Applied to a melanoma dataset and a clear-cell renal cell carcinoma (ccRCC) dataset, PhEMD maps tumors similarly to how it maps perturbation conditions, in order to learn key axes along which tumors vary with respect to their tumor-infiltrating immune cells. In both of these datasets, PhEMD highlights a subset of tumors demonstrating a marked enrichment of exhausted CD8+ T-cells. The wide variability in tumor-infiltrating immune cell abundance and the particularly prominent exhausted CD8+ T-cell subpopulation highlight the importance of careful patient stratification when assessing clinical response to T cell-directed immunotherapies. Altogether, this work highlights PhEMD’s potential to facilitate drug discovery and patient stratification efforts by uncovering the network geometry of a large collection of single-cell biospecimens. Our varied experiments demonstrate that PhEMD is highly scalable, compatible with leading batch effect correction techniques, and generalizable to multiple experimental designs, with clear applicability to modern precision oncology efforts.
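    To make the "manifold of manifolds" idea more concrete, the sketch below shows a generic, hypothetical pipeline in the same spirit: pool cells, learn shared clusters, summarize each biospecimen as a cluster histogram, compute pairwise earth mover's distances between histograms, and embed the resulting specimen-level distance matrix. This is not the PhEMD implementation; it assumes the POT (ot) and scikit-learn libraries, and the cluster count is a placeholder.

```python
import numpy as np
import ot                                   # Python Optimal Transport (POT)
from sklearn.cluster import KMeans
from sklearn.manifold import MDS

def specimen_embedding(cells_per_specimen, n_clusters=10):
    """Generic 'manifold of manifolds' flavour (not the PhEMD code):
    1. pool all cells and learn shared clusters,
    2. summarise each specimen as a histogram over those clusters,
    3. compute pairwise earth mover's distances between the histograms,
    4. embed the specimen-level distance matrix in 2-D with MDS."""
    pooled = np.vstack(cells_per_specimen)
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(pooled)
    M = ot.dist(km.cluster_centers_, km.cluster_centers_)   # ground cost
    hists = []
    for cells in cells_per_specimen:
        counts = np.bincount(km.predict(cells), minlength=n_clusters)
        hists.append(counts / counts.sum())
    n = len(hists)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = ot.emd2(hists[i], hists[j], M)
    return MDS(n_components=2, dissimilarity="precomputed").fit_transform(D)

# Toy usage: 8 hypothetical specimens, each a different-sized cloud of cells.
rng = np.random.default_rng(0)
specimens = [rng.normal(loc=i * 0.3, size=(rng.integers(200, 400), 5)) for i in range(8)]
coords = specimen_embedding(specimens, n_clusters=6)
```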

    Enhancing Face Recognition with Deep Learning Architectures: A Comprehensive Review

    Face recognition, and the frameworks built around it, have made remarkable strides in recent years. This progress has been particularly pronounced in verifying individual identity, a practice prominently used by law enforcement agencies to advance forensic science. A large body of scholarly work has applied deep learning techniques within machine learning models to extract distinctive features and perform classification, thereby improving the precision with which unique individuals are recognized. This paper focuses on deep learning methodologies tailored to facial recognition and matching, and on improving accuracy by training models on large datasets. It presents a comprehensive survey of the diverse strategies used in facial recognition and examines the challenges that underlie facial recognition in image analysis.

    Unsupervised Discovery and Representation of Subspace Trends in Massive Biomedical Datasets

    The goal of this dissertation is to develop unsupervised algorithms for discovering previously unknown subspace trends in massive multivariate biomedical data sets without the benefit of prior information. A subspace trend is a sustained pattern of gradual/progressive changes within an unknown subset of feature dimensions. A fundamental challenge to subspace trend discovery is the presence of irrelevant data dimensions, noise, outliers, and confusion from multiple subspace trends driven by independent factors that are mixed in with each other. These factors can obscure the trends in traditional dimension reduction and projection based data visualizations. To overcome these limitations, we propose a novel graph-theoretic neighborhood similarity measure for sensing concordant progressive changes across data dimensions. Using this measure, we present an unsupervised algorithm for trend-relevant feature selection and visualization. Additionally, we propose an efficient online density-based representation to make the algorithm scalable to massive datasets. The representation not only assists in trend discovery, but also in cluster detection, including rare populations. Our method has been successfully applied to diverse synthetic and real-world biomedical datasets, such as gene expression microarrays and the arbor morphology of neurons and microglia in brain tissue. The derived representations revealed biologically meaningful hidden subspace trends that were obscured by irrelevant features and noise. Although our applications are mostly from the biomedical domain, the proposed algorithm is broadly applicable to exploratory analysis of high-dimensional data, including visualization, hypothesis generation, knowledge discovery, and prediction in diverse other applications.
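    As a generic illustration of a graph-theoretic neighborhood similarity (not the dissertation's specific measure, which is designed to sense concordant progressive changes), the sketch below scores two samples by the fraction of k-nearest neighbors they share; k is a placeholder.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def shared_neighbor_similarity(X, k=10):
    """Generic graph-theoretic neighborhood similarity (illustrative only):
    similarity between samples i and j is the fraction of their k nearest
    neighbors that coincide."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)                 # idx[:, 0] is the point itself
    neighbor_sets = [set(row[1:]) for row in idx]
    n = len(X)
    S = np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):
            S[i, j] = S[j, i] = len(neighbor_sets[i] & neighbor_sets[j]) / k
    return S

# Toy usage: 200 samples with 50 features, only a few of which carry a trend.
X = np.random.randn(200, 50)
S = shared_neighbor_similarity(X, k=10)
```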

    Real-time Target Tracking and Following with UR5 Collaborative Robot Arm

    The rise in camera usage and availability creates opportunities for developing robotics and computer vision applications. In particular, recent developments in depth sensing (e.g., Microsoft Kinect) enable new methods in the field of Human Robot Interaction (HRI). Moreover, collaborative robots (co-bots) are being adopted in the manufacturing industry. This thesis focuses on HRI using the capabilities of the Microsoft Kinect, the Universal Robot-5 (UR5) and the Robot Operating System (ROS). In this particular study, the movement of a fingertip is perceived and the same movement is repeated on the robot side. Seamless cooperation, accurate trajectories and safety during collaboration are the most important parts of HRI. The study aims to recognize and track the fingertip accurately and to transform its movement into motion of the UR5. It also aims to improve the motion performance of the UR5 and the interaction efficiency during collaboration. In the experimental part, a nearest-point approach is applied to the Kinect sensor's depth image (RGB-D). The approach is based on Euclidean distance, which is robust across different environments. Moreover, the Point Cloud Library (PCL) and its built-in filters are used for processing the depth data. After the depth data provided by the Microsoft Kinect have been processed, the difference between consecutive nearest points is transmitted to the robot via ROS. On the robot side, the MoveIt! motion planner is used to obtain a smooth trajectory. Once the data had been processed successfully and the motion code implemented without bugs, a total accuracy of 84.18% was achieved. After improvements in motion planning and data processing, the total accuracy increased to 94.14%. Lastly, the latency was reduced from 3-4 seconds to 0.14 seconds.
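    The nearest-point idea lends itself to a very small sketch. The following NumPy version is illustrative only (the thesis works with PCL filters and ROS messages rather than raw arrays); the depth band limits are placeholders.

```python
import numpy as np

def nearest_point(cloud, z_min=0.4, z_max=1.5):
    """Minimal NumPy sketch of the nearest-point idea described above: keep
    points within a depth band, then return the point closest to the sensor
    origin by Euclidean distance."""
    cloud = np.asarray(cloud, dtype=float)                # shape (N, 3): x, y, z
    mask = (cloud[:, 2] > z_min) & (cloud[:, 2] < z_max)  # crude pass-through filter
    roi = cloud[mask]
    dists = np.linalg.norm(roi, axis=1)
    return roi[np.argmin(dists)]

# Toy usage: the displacement between consecutive nearest points is what would
# be sent to the robot as a motion command in the setup described above.
prev = nearest_point(np.random.rand(1000, 3) * 2)
curr = nearest_point(np.random.rand(1000, 3) * 2)
delta = curr - prev
```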

    Vision based localization of mobile robots

    Mobile robotics is an active and exciting sub-field of Computer Science. Its importance is easily witnessed in a variety of undertakings, from DARPA's Grand Challenge to NASA's Mars exploration program. The field is relatively young, and many challenges still face roboticists across the board. One important area of research is localization, which concerns itself with granting a robot the ability to discover and continually update an internal representation of its position. Vision based sensor systems have been investigated [8,22,27], but to a much lesser extent than other popular techniques [4,6,7,9,10]. A custom mobile platform has been constructed, on top of which a monocular vision based localization system has been implemented. The rigorous gathering of empirical data across a large group of parameters germane to the problem has led to various findings about monocular vision based localization and the fitness of the custom robot platform. The localization component is based on a probabilistic technique called Monte-Carlo Localization (MCL) that tolerates a variety of different sensors and effectors, and has proven adept at localization in diverse circumstances. Both a motion model and a sensor model that drive the particle filter at the algorithm's core have been carefully derived. The sensor model employs a simple correlation process that leverages color histograms and edge detection to filter robot pose estimates via the on-board vision. This algorithm relies on image matching to tune position estimates based on a priori knowledge of its environment in the form of a feature library. It is believed that leveraging different computationally inexpensive features can lead to efficient and robust localization with MCL. The central goal of this thesis is to implement such a system and test this conclusion through the gathering of empirical data. Section 1 presents a brief introduction to mobile robot localization and robot architectures, while Section 2 covers MCL itself in more depth. Section 3 elaborates on the localization strategy, modeling and implementation that form the basis of the trials presented toward the end of that section. Section 4 presents a revised implementation that attempts to address shortcomings identified during the localization trials. Finally, in Section 5, conclusions are drawn about the effectiveness of the localization implementation and a path to improved localization with monocular vision is posited.
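    The particle-filter core of MCL can be summarized generically. The skeleton below is an illustrative sketch, not the thesis implementation: its motion_model and sensor_model arguments are placeholders for the models the thesis derives (the sensor model there correlates color histograms and edge features against a feature library).

```python
import numpy as np

def mcl_step(particles, weights, control, measurement, motion_model, sensor_model):
    """One generic Monte-Carlo Localization step (illustrative skeleton only).

    particles : (N, 3) array of pose hypotheses (x, y, heading)
    weights   : (N,) importance weights
    """
    # 1. Motion update: propagate each particle through the (noisy) motion model.
    particles = motion_model(particles, control)
    # 2. Sensor update: reweight particles by how well they explain the measurement.
    weights = weights * sensor_model(particles, measurement)
    weights = weights / np.sum(weights)
    # 3. Resample in proportion to the weights (systematic resampling could be
    #    substituted; plain multinomial resampling keeps the sketch short).
    idx = np.random.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))

# Toy usage with trivial placeholder models.
N = 1000
particles = np.random.uniform(-5, 5, size=(N, 3))
weights = np.full(N, 1.0 / N)
motion = lambda p, u: p + u + 0.05 * np.random.randn(*p.shape)
sensor = lambda p, z: np.exp(-0.5 * np.sum((p[:, :2] - z) ** 2, axis=1))
particles, weights = mcl_step(particles, weights, np.array([0.1, 0.0, 0.0]),
                              np.array([1.0, 2.0]), motion, sensor)
```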