
    Conditional independence testing based on a nearest-neighbor estimator of conditional mutual information

    Conditional independence testing is a fundamental problem underlying causal discovery and a particularly challenging task in the presence of nonlinear and high-dimensional dependencies. Here a fully non-parametric test for continuous data, based on conditional mutual information combined with a local permutation scheme, is presented. Through a nearest-neighbor approach, the test also adapts efficiently to non-smooth distributions arising from strongly nonlinear dependencies. Numerical experiments demonstrate that the test reliably simulates the null distribution even for small sample sizes and high-dimensional conditioning sets. The test is better calibrated than kernel-based tests that use an analytical approximation of the null distribution, especially for non-smooth densities, and reaches the same or higher power levels. Combining the local permutation scheme with the kernel tests improves calibration but costs power. For smaller sample sizes and lower dimensions, the test is faster than random Fourier feature-based kernel tests if the permutation scheme is (embarrassingly) parallelized, but its runtime increases more sharply with sample size and dimensionality. Thus, more theoretical research to analytically approximate the null distribution and speed up the estimation for larger sample sizes is desirable.
    Comment: 17 pages, 12 figures, 1 table
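
    As a compact illustration of the ingredients involved, here is a minimal sketch assuming the standard Frenzel-Pompe/KSG-style k-nearest-neighbor CMI estimator and a simplified local permutation scheme (permuted indices are drawn with replacement from each sample's nearest neighbors in Z, rather than the paper's exact scheme); `x`, `y`, `z` are float arrays of shape (n, d), and all names are illustrative, not from the paper's reference implementation.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma

def cmi_knn(x, y, z, k=5):
    """KSG/Frenzel-Pompe-style estimate of I(X;Y|Z) with max-norm kNN."""
    xyz = np.hstack([x, y, z])
    # Distance to each point's k-th nearest neighbor in the joint space
    # (k+1 because the query point itself is returned at distance 0).
    d_k = cKDTree(xyz).query(xyz, k=k + 1, p=np.inf)[0][:, -1]

    def count_within(a):
        # Points strictly inside radius d_k[i] around a[i]; the count
        # includes the point itself, so it equals n_sub + 1, matching
        # the psi(n + 1) terms of the estimator.
        tree = cKDTree(a)
        return np.array([len(tree.query_ball_point(p, r - 1e-12, p=np.inf))
                         for p, r in zip(a, d_k)])

    n_xz = count_within(np.hstack([x, z]))
    n_yz = count_within(np.hstack([y, z]))
    n_z = count_within(z)
    return digamma(k) + np.mean(digamma(n_z) - digamma(n_xz) - digamma(n_yz))

def local_perm_test(x, y, z, k=5, k_perm=5, n_perm=200, seed=0):
    """p-value for H0: X independent of Y given Z, via local permutations."""
    rng = np.random.default_rng(seed)
    stat = cmi_knn(x, y, z, k)
    # Permute x only among each sample's k_perm nearest neighbors in Z,
    # so the X-Z dependence is (approximately) preserved under H0.
    nbrs = cKDTree(z).query(z, k=k_perm, p=np.inf)[1]
    null = np.empty(n_perm)
    for b in range(n_perm):
        perm = np.array([rng.choice(row) for row in nbrs])
        null[b] = cmi_knn(x[perm], y, z, k)
    return (1 + np.sum(null >= stat)) / (1 + n_perm)
```

    The loop over permutations is the embarrassingly parallel part the abstract alludes to: each null replicate is an independent CMI evaluation.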

    Structure learning of graphical models for task-oriented robot grasping

    In the collective imagination a robot is a human-like machine, like the androids of science fiction. However, the robots you will encounter most frequently are machines that do work that is too dangerous, boring, or onerous for humans. Most of the robots in the world are of this type. They can be found in the automotive, medical, manufacturing, and space industries. A robot, then, is a system that contains sensors, control systems, manipulators, power supplies, and software, all working together to perform a task. The development and use of such systems is an active area of research, and one of the main problems is the development of interaction skills with the surrounding environment, which include the ability to grasp objects. To perform this task the robot needs to sense the environment and acquire information about the object: the physical attributes that may influence a grasp. Humans solve this grasping problem easily thanks to their past experience, which is why many researchers approach it from a machine learning perspective, finding a grasp for an object using information about already known objects. But humans can select the best grasp from a vast repertoire by considering not only the physical attributes of the object but also the effect they want to obtain. This is why our study of robot manipulation focuses on grasping and on integrating symbolic tasks with data gathered through sensors. The learning model is based on a Bayesian network that encodes the statistical dependencies between the data collected by the sensors and the symbolic task. This representation has several advantages: it takes into account the uncertainty of the real world and can deal with sensor noise, it encodes a notion of causality, and it provides a unified network for learning. Since the network currently in use is hand-crafted from human expert knowledge, it is very interesting to implement an automated method for learning its structure: as more tasks and object features are introduced in the future, a complex network design based only on human expert knowledge can become unreliable. Since structure learning algorithms present some weaknesses, the goal of this thesis is to analyze the real data used in the expert-modeled network, implement a feasible structure learning approach, and compare the results with the network designed by the expert in order to possibly enhance it.
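
    As a hedged illustration of what such automated structure learning could look like in practice, a generic score-based search over DAGs (not the thesis's exact procedure) can be run and compared against the expert-designed network; the sketch below assumes pgmpy's estimator API and a hypothetical discretized grasp dataset whose column names are invented for illustration.

```python
import pandas as pd
from pgmpy.estimators import HillClimbSearch, BicScore

# Hypothetical file: rows of discretized object features plus the task label.
data = pd.read_csv("grasp_observations.csv")

# Greedy hill climbing over DAGs, scored by BIC.
hc = HillClimbSearch(data)
learned = hc.estimate(scoring_method=BicScore(data))

# Compare the learned edges with the expert-designed network.
print(sorted(learned.edges()))
```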

    The IBMAP approach for Markov networks structure learning

    In this work we consider the problem of learning the structure of Markov networks from data. We present an approach for tackling this problem, called IBMAP, together with an efficient instantiation of the approach: the IBMAP-HC algorithm, designed to avoid important limitations of existing independence-based algorithms. These algorithms proceed by performing statistical independence tests on data, completely trusting the outcome of each test. In practice, tests may be incorrect, resulting in potential cascading errors and a consequent reduction in the quality of the structures learned. IBMAP accounts for this uncertainty in the outcome of the tests through a probabilistic maximum-a-posteriori approach. The approach is instantiated in the IBMAP-HC algorithm, a structure selection strategy that performs a polynomial-time heuristic local search in the space of possible structures. We present an extensive empirical evaluation on synthetic and real data, showing that our algorithm significantly outperforms current independence-based algorithms in terms of data efficiency and quality of learned structures, with equivalent computational complexity. We also show the performance of IBMAP-HC in a real-world knowledge discovery application: estimation of distribution algorithms (EDAs), evolutionary algorithms that use structure learning in each generation to model the distribution of the population. The experiments show that when IBMAP-HC is used to learn the structure, EDAs improve their convergence to the optimum.
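
    A minimal sketch of the idea follows, under loudly simplifying assumptions: a Gaussian Fisher-z partial-correlation test stands in for the paper's tests, and the score is a plain sum of log-probabilities of the (in)dependence assertions a candidate structure entails, not the paper's exact IB-score.

```python
import itertools
import numpy as np
from scipy import stats

def indep_prob(data, i, j, cond):
    """Proxy for P(X_i indep X_j | X_cond): the p-value of a Fisher-z
    partial-correlation test (a simplification of the paper's setup)."""
    idx = [i, j] + list(cond)
    prec = np.linalg.pinv(np.corrcoef(data[:, idx], rowvar=False))
    r = np.clip(-prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1]),
                -0.999999, 0.999999)
    dof = max(data.shape[0] - len(cond) - 3, 1)
    p = 2 * stats.norm.sf(abs(np.sqrt(dof) * np.arctanh(r)))
    return float(np.clip(p, 1e-10, 1 - 1e-10))

def ib_score(data, adj):
    """Log-score of a candidate structure: pair (i, j) should test as
    independent given i's neighbors exactly when edge i-j is absent."""
    score = 0.0
    for i, j in itertools.combinations(range(adj.shape[0]), 2):
        cond = [k for k in range(adj.shape[0]) if adj[i, k] and k != j]
        p = indep_prob(data, i, j, cond)
        score += np.log(p) if not adj[i, j] else np.log1p(-p)
    return score

def ibmap_hc(data):
    """Hill climbing over single-edge flips, keeping score-improving moves."""
    n_vars = data.shape[1]
    adj = np.zeros((n_vars, n_vars), dtype=bool)
    best = ib_score(data, adj)
    improved = True
    while improved:
        improved = False
        for i, j in itertools.combinations(range(n_vars), 2):
            adj[i, j] = adj[j, i] = not adj[i, j]      # flip the edge
            s = ib_score(data, adj)
            if s > best:
                best, improved = s, True               # keep the flip
            else:
                adj[i, j] = adj[j, i] = not adj[i, j]  # revert
    return adj
```

    The contrast with classical independence-based methods is visible in `ib_score`: no single test outcome is trusted outright; each contributes a soft log-probability term to the structure's posterior score.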

    Using Markov Boundary Approach for Interpretable and Generalizable Feature Selection

    The predictive power and generalizability of a model depend on the quality of the features selected for it. Machine learning (ML) models in banks consider a large number of features, which are often correlated or dependent. Incorporating such features may hinder model stability, and prior feature screening can improve the long-term performance of the models. A Markov boundary (MB) of features is the minimal set of features that guarantees that, given the boundary, other potential predictors do not affect the target, while ensuring maximal predictive accuracy. Identifying the Markov boundary is straightforward under assumptions of Gaussianity of the features and linear relationships between them. This paper outlines common problems associated with identifying the Markov boundary in structured data when relationships are non-linear and predictors are of mixed data types. We propose a multi-group forward-backward selection strategy that not only handles continuous features but also addresses some of the issues with MB identification in a mixed-data setup, and we demonstrate its capabilities on simulated and real datasets.
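
    To make the forward-backward idea concrete, here is a minimal sketch in the spirit of IAMB-style algorithms, not the paper's multi-group procedure: a pluggable conditional independence test drives both phases, with a Fisher-z partial-correlation test standing in for continuous data (a mixed-data setup would swap in per-type tests).

```python
import numpy as np
from scipy import stats

def ci_pvalue(data, target, x, cond):
    """p-value of H0: X_x indep X_target | X_cond (Fisher-z partial corr.)."""
    cols = [target, x] + list(cond)
    prec = np.linalg.pinv(np.corrcoef(data[:, cols], rowvar=False))
    r = np.clip(-prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1]),
                -0.999999, 0.999999)
    dof = max(data.shape[0] - len(cond) - 3, 1)
    return 2 * stats.norm.sf(abs(np.sqrt(dof) * np.arctanh(r)))

def markov_boundary(data, target, alpha=0.05):
    """Forward-backward selection of an (approximate) Markov boundary."""
    candidates = [j for j in range(data.shape[1]) if j != target]
    mb = []
    # Forward phase: greedily admit the most dependent remaining feature.
    while True:
        rest = {x: ci_pvalue(data, target, x, mb)
                for x in candidates if x not in mb}
        if not rest:
            break
        best = min(rest, key=rest.get)
        if rest[best] >= alpha:
            break
        mb.append(best)
    # Backward phase: drop features made redundant by later additions.
    for x in list(mb):
        others = [y for y in mb if y != x]
        if ci_pvalue(data, target, x, others) >= alpha:
            mb.remove(x)
    return mb
```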

    Learning an L1-regularized Gaussian Bayesian Network in the Equivalence Class Space

    Learning the structure of a graphical model from data is a common task in a wide range of practical applications. In this paper we focus on Gaussian Bayesian networks, i.e., on continuous data and directed acyclic graphs with a joint probability density over all variables given by a Gaussian. We propose to work in an equivalence-class search space, specifically using the k-greedy equivalence search algorithm. This, combined with regularization techniques to guide the structure search, can learn sparse networks close to the one that generated the data. We provide results on some synthetic networks and on modeling the gene network of the two biological pathways regulating the biosynthesis of isoprenoids in the Arabidopsis thaliana plant.
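
    A minimal sketch of the local L1-regularized score for a Gaussian Bayesian network: each node is regressed on its candidate parents with the lasso, and the network score is the sum of penalized local Gaussian log-likelihoods. The search loop (e.g., over equivalence classes, as in the paper's k-greedy equivalence search) is omitted, and `parent_map` is an illustrative candidate parent assignment, not the paper's exact formulation.

```python
import numpy as np
from sklearn.linear_model import Lasso

def local_score(data, node, parents, lam=0.1):
    """Penalized Gaussian log-likelihood of `node` given `parents`."""
    y = data[:, node]
    n = len(y)
    if parents:
        model = Lasso(alpha=lam).fit(data[:, parents], y)
        resid = y - model.predict(data[:, parents])
        penalty = lam * np.sum(np.abs(model.coef_))
    else:
        resid = y - y.mean()
        penalty = 0.0
    sigma2 = max(float(resid @ resid) / n, 1e-12)
    # Gaussian log-likelihood at the MLE variance, minus the L1 penalty.
    return -0.5 * n * (np.log(2 * np.pi * sigma2) + 1) - penalty

def network_score(data, parent_map, lam=0.1):
    """Decomposable network score: the sum of local scores over nodes."""
    return sum(local_score(data, v, ps, lam) for v, ps in parent_map.items())

# Example: score a 3-node candidate DAG 0 -> 2 <- 1 on data of shape (n, 3).
# network_score(data, {0: [], 1: [], 2: [0, 1]})
```

    Because the score decomposes over nodes, any search move that changes one node's parent set only requires re-evaluating that node's local score.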

    Nearly Minimax Optimal Wasserstein Conditional Independence Testing

    This paper is concerned with minimax conditional independence testing. In contrast to some previous works on the topic, which use the total variation distance to separate the null from the alternative, here we use the Wasserstein distance. In addition, we impose Wasserstein smoothness conditions which, on bounded domains, are weaker than the corresponding total variation smoothness imposed, for instance, by Neykov et al. [2021]. This added flexibility expands the distributions allowed under the null and the alternative to include, for instance, distributions that may contain point masses. We characterize the optimal rate of the critical radius of testing up to logarithmic factors. Our test statistic, which nearly achieves the optimal critical radius, is novel and can be thought of as a weighted multi-resolution version of the U-statistic studied by Neykov et al. [2021].
    Comment: 24 pages, 1 figure; the ordering of the last three authors is random
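
    For reference, the order-1 Wasserstein distance in its standard form is given below; which order and norm the paper actually uses, and its precise smoothness classes, are not reproduced here.

```latex
% 1-Wasserstein distance between distributions P and Q, where \Pi(P, Q)
% denotes the set of couplings with marginals P and Q (standard definition).
W_1(P, Q) \;=\; \inf_{\pi \in \Pi(P, Q)} \int \lVert x - y \rVert \,\mathrm{d}\pi(x, y)
```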

    A survey on independence-based Markov networks learning

    This work reports the most relevant technical aspects of the problem of learning the \emph{Markov network structure} from data. This problem has become increasingly important in machine learning and in many of its application fields. Markov networks, together with Bayesian networks, are probabilistic graphical models, a widely used formalism for handling probability distributions in intelligent systems. Learning graphical models from data has been applied extensively in the case of Bayesian networks, but learning Markov networks is not tractable in practice. However, this situation is changing with time, given the exponential growth of computing capacity, the plethora of available digital data, and research on new learning technologies. This work focuses on a technology called independence-based learning, which allows the independence structure of such networks to be learned from data in an efficient and sound manner, whenever the dataset is sufficiently large and the data is a representative sample of the target distribution. In analyzing this technology, the work surveys the current state-of-the-art algorithms for learning Markov network structure, discusses their current limitations, and proposes a series of open problems where future work may advance the area in terms of quality and efficiency. The paper concludes by opening a discussion about how to develop a general formalism for improving the quality of the learned structures when data is scarce.
    Comment: 35 pages, 1 figure