11 research outputs found

    Action recognition in depth videos using nonparametric probabilistic graphical models

    Action recognition involves automatically labelling videos that contain human motion with action classes. It has applications in diverse areas such as smart surveillance, human-computer interaction and content retrieval. The recent advent of depth sensing technology that produces depth image sequences has offered opportunities to solve the challenging action recognition problem. The depth images facilitate robust estimation of a human skeleton’s 3D joint positions, and a high-level action can be inferred from a sequence of these joint positions. A natural way to model a sequence of joint positions is to use a graphical model that describes probabilistic dependencies between the observed joint positions and some hidden state variables. A problem with these models is that the number of hidden states must be fixed a priori, even though for many applications this number is not known in advance. This thesis proposes nonparametric variants of graphical models with the number of hidden states automatically inferred from data. The inference is performed in a fully Bayesian setting by using the Dirichlet Process as a prior over the model’s infinite-dimensional parameter space. This thesis describes three original constructions of nonparametric graphical models that are applied in the classification of actions in depth videos. Firstly, the action classes are represented by a Hidden Markov Model (HMM) with an unbounded number of hidden states. The formulation enables information sharing and discriminative learning of parameters. Secondly, a hierarchical HMM with an unbounded number of actions and poses is used to represent activities. The construction produces a simplified model for activity classification by using logistic regression to capture the relationship between action states and activity labels. Finally, the action classes are modelled by a Hidden Conditional Random Field (HCRF) with the number of intermediate hidden states learned from data. Tractable inference procedures based on Markov Chain Monte Carlo (MCMC) techniques are derived for all these constructions. Experiments with multiple benchmark datasets confirm the efficacy of the proposed approaches for action recognition.
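
To illustrate how a Dirichlet Process prior can leave the number of hidden states open, the following minimal sketch draws state weights with the standard stick-breaking construction. It is not the thesis's inference procedure: the function name, the truncation level and the concentration value are assumptions chosen only for illustration.

```python
import random

def stick_breaking_weights(alpha, truncation, rng):
    """Draw the first `truncation` weights of a Dirichlet Process with
    concentration `alpha` using the stick-breaking construction."""
    weights = []
    remaining = 1.0  # length of the stick not yet assigned to any state
    for _ in range(truncation):
        fraction = rng.betavariate(1.0, alpha)  # Beta(1, alpha) break point
        weights.append(remaining * fraction)    # weight of the next state
        remaining *= 1.0 - fraction             # what is left for later states
    return weights, remaining

# Hypothetical settings: alpha and the truncation level are illustrative only.
rng = random.Random(0)
weights, leftover = stick_breaking_weights(alpha=2.0, truncation=20, rng=rng)
```

The weights decay stochastically, so only a handful of states carry appreciable mass; a smaller `alpha` concentrates the mass on fewer states, which is the sense in which the effective number of states is determined by the prior and the data rather than fixed in advance.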

    Synthetic Text Generation using Hypergraph Representations

    Generating synthetic variants of a document is often posed as text-to-text transformation. We propose an alternative LLM-based method that first decomposes a document into semantic frames and then generates text using this interim sparse format. The frames are modeled using a hypergraph, which allows perturbing the frame contents in a principled manner. Specifically, new hyperedges are mined through topological analysis, and complex polyadic relationships, including hierarchy and temporal dynamics, are accommodated. We show that our solution generates documents that are diverse, coherent and vary in style, sentiment, format, composition and facts.

    Non-parametric hidden conditional random fields for action classification

    Conditional Random Fields (CRFs), a structured prediction method, combine probabilistic graphical models and discriminative classification techniques in order to predict class labels in sequence recognition problems. Their extension, the Hidden Conditional Random Field (HCRF), uses hidden state variables in order to capture intermediate structures. The number of hidden states in an HCRF must be specified a priori, yet this number is often not known in advance. A non-parametric extension to the HCRF, with the number of hidden states automatically inferred from data, is proposed here. This is a significant advantage over the classical HCRF since it avoids ad hoc model selection procedures. Further, the training and inference procedure is fully Bayesian, eliminating the overfitting problem associated with frequentist methods. In particular, our construction is based on scale mixtures of Gaussians as priors over the HCRF parameters and makes use of the Hierarchical Dirichlet Process (HDP) and the Laplace distribution. The proposed inference procedure uses elliptical slice sampling, a Markov Chain Monte Carlo (MCMC) method, in order to sample optimal and sparse posterior HCRF parameters. The technique is applied to classifying human actions that occur in depth image sequences, a challenging computer vision problem. Experiments with real-world video datasets confirm the efficacy of our classification approach.
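
For readers unfamiliar with elliptical slice sampling, the sketch below shows one generic update for a model with a zero-mean Gaussian prior, followed by a toy run. It is a minimal illustration and not the paper's HCRF sampler: the function names, the one-dimensional standard normal prior and the Gaussian likelihood centred at 1 are all assumptions made for the example.

```python
import math
import random

def elliptical_slice_step(f, log_lik, draw_prior, rng):
    """One elliptical slice sampling update for a zero-mean Gaussian prior.
    f: current state (list of floats); draw_prior: returns a fresh prior draw."""
    nu = draw_prior()  # auxiliary draw that defines the ellipse through f
    # Slice height: log-likelihood of the current state plus log of a uniform.
    log_y = log_lik(f) + math.log(1.0 - rng.random())
    theta = rng.uniform(0.0, 2.0 * math.pi)
    lo, hi = theta - 2.0 * math.pi, theta
    while True:
        proposal = [a * math.cos(theta) + b * math.sin(theta)
                    for a, b in zip(f, nu)]
        if log_lik(proposal) > log_y:
            return proposal
        # Rejected: shrink the bracket towards theta = 0 (the current state).
        if theta < 0.0:
            lo = theta
        else:
            hi = theta
        theta = rng.uniform(lo, hi)

# Toy model (illustrative assumption): prior N(0, 1), one observation at 1.0
# with unit noise, so the exact posterior is N(0.5, 0.5).
rng = random.Random(1)
log_lik = lambda f: -0.5 * (f[0] - 1.0) ** 2
draw_prior = lambda: [rng.gauss(0.0, 1.0)]

state, samples = [0.0], []
for _ in range(5000):
    state = elliptical_slice_step(state, log_lik, draw_prior, rng)
    samples.append(state[0])
posterior_mean = sum(samples) / len(samples)
```

The update has no step size to tune and always returns a new state, because the bracket shrinks towards the current point. In this toy model the sample mean should land close to 0.5, the analytic posterior mean.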

    Action classification using a discriminative multilevel HDP-HMM

    We classify human actions occurring in depth image sequences using features based on skeletal joint positions. The action classes are represented by a multi-level Hierarchical Dirichlet Process – Hidden Markov Model (HDP-HMM). The non-parametric HDP-HMM allows the hidden states to be inferred automatically from training data. The model parameters of each class are formulated as transformations from a shared base distribution, thus promoting the use of unlabelled examples during training and borrowing information across action classes. Further, the parameters are learnt in a discriminative way. We use a normalized gamma process representation of the HDP and margin-based likelihood functions for this purpose. We sample parameters from the complex posterior distribution induced by our discriminative likelihood function using elliptical slice sampling. Experiments with two different datasets show that action class models learnt using our technique produce good classification results.

    Bayesian Hierarchical Models for Counterfactual Estimation

    Counterfactual explanations use feature perturbations to analyze the outcome of an original decision and recommend an actionable recourse. We argue that it is beneficial to provide several alternative explanations rather than a single point solution and propose a probabilistic paradigm to estimate a diverse set of counterfactuals. Specifically, we treat the perturbations as random variables endowed with prior distribution functions. This allows sampling multiple counterfactuals from the posterior density, with the added benefit of incorporating inductive biases, preserving domain-specific constraints and quantifying uncertainty in estimates. More importantly, we leverage Bayesian hierarchical modeling to share information across different subgroups of a population, which can both improve robustness and measure fairness. A gradient-based sampler with superior convergence characteristics efficiently computes the posterior samples. Experiments across several datasets demonstrate that the counterfactuals estimated using our approach are valid, sparse, diverse and feasible.

    Synthetic Document Generator for Annotation-free Layout Recognition

    Analyzing the layout of a document to identify headers, sections, tables, figures etc. is critical to understanding its content. Deep-learning-based approaches for detecting the layout structure of document images have been promising. However, these methods require a large number of annotated examples during training, which are both expensive and time-consuming to obtain. We describe here a synthetic document generator that automatically produces realistic documents with labels for spatial positions, extents and categories of the layout elements. The proposed generative process treats every physical component of a document as a random variable and models their intrinsic dependencies using a Bayesian Network graph. Our hierarchical formulation using stochastic templates allows parameter sharing between documents for retaining broad themes, and yet the distributional characteristics produce visually unique samples, thereby capturing complex and diverse layouts. We empirically illustrate that a deep layout detection model trained purely on the synthetic documents can match the performance of a model that uses real documents.

    Activity recognition using a supervised non-parametric hierarchical HMM

    The problem of classifying human activities occurring in depth image sequences is addressed. The 3D joint positions of a human skeleton and the local depth image pattern around these joint positions define the features. A two-level hierarchical Hidden Markov Model (H-HMM), with independent Markov chains for the joint positions and depth image pattern, is used to model the features. The states corresponding to the H-HMM bottom level characterize the granular poses, while the top level characterizes the coarser actions associated with the activities. Further, the H-HMM is based on a Hierarchical Dirichlet Process (HDP) and is fully non-parametric, with the number of pose and action states inferred automatically from data. This is a significant advantage over the classical HMM and its extensions. In order to perform classification, the relationships between the actions and the activity labels are captured using multinomial logistic regression. The proposed inference procedure ensures alignment of actions from activities with similar labels. Our construction enables information sharing, allows incorporation of unlabelled examples and provides a flexible factorized representation to include multiple data channels. Experiments with multiple real-world datasets show the efficacy of our classification approach.

    Municipal Bond Pricing: A Data Driven Method

    Price evaluations of municipal bonds have traditionally been performed by human experts based on their market knowledge and trading experience. Automated evaluation is an attractive alternative, providing the advantage of an objective estimation that is transparent, consistent, and scalable. In this paper, we present a statistical model to automatically estimate U.S. municipal bond yields based on trade transactions and study the agreement between human evaluations and machine-generated estimates. The model uses piecewise polynomials constructed using basis functions. This provides immense flexibility in capturing the wide dispersion of yields. A novel transfer-learning-based approach that exploits the latent hierarchical relationship of the bonds is applied to enable robust yield estimation even in the absence of adequate trade data. The Bayesian nature of our model offers a principled framework to account for uncertainty in the estimates. Our inference procedure scales well even for large data sets. We demonstrate the empirical effectiveness of our model by assessing over 100,000 active bonds and find that our estimates are in line with hand-priced evaluations for a large number of bonds.

    Mapping ESG Trends by Distant Supervision of Neural Language Models

    The integration of Environmental, Social and Governance (ESG) considerations into business decisions and investment strategies has accelerated over the past few years. It is important to quantify the extent to which ESG-related conversations are carried out by companies so that their impact on business operations can be objectively assessed. However, profiling ESG language is challenging due to its multi-faceted nature and the lack of supervised datasets. This research study aims to detect historical trends in ESG discussions by analyzing the transcripts of corporate earnings calls. The proposed solution exploits recent advances in neural language modeling to understand the linguistic structure in ESG discourse. Specifically, we first develop a classification model that categorizes the relevance of a text sentence to ESG. A pre-trained language model is fine-tuned on a small corporate sustainability reports dataset for this purpose. The semantic knowledge encoded in this classification model is then leveraged by applying it to the sentences in the conference transcripts using a novel distant-supervision approach. Extensive empirical evaluations against various pretraining techniques demonstrate the efficacy of the proposed transfer learning framework. Our analysis indicates that in the last 5 years, nearly 15% of the discussions during earnings calls pertained to ESG, implying that ESG factors are integral to business strategy.