27 research outputs found

    Variational Bayes via Propositionalization

    We propose a unified approach to VB (variational Bayes) in symbolic-statistical modeling via propositionalization. By propositionalization we mean, broadly, expressing and computing probabilistic models such as BNs (Bayesian networks) and PCFGs (probabilistic context-free grammars) in terms of propositional logic that treats propositional variables as binary random variables. Our proposal is motivated by three observations. The first is that PPC (propositionalized probability computation), i.e. probability computation formalized in a propositional setting, has turned out to be general and efficient when variable values are sparsely interdependent. Examples include (discrete) BNs, PCFGs, and more generally PRISM, a Turing-complete logic programming language with EM learning ability that we have been developing, which computes probabilities using graphically represented AND/OR boolean formulas. The efficiency of PPC is classically attested by the Inside-Outside algorithm in the case of PCFGs, and by recent PPC approaches in the case of BNs, such as the one by Darwiche et al. that exploits zero probability and CSI (context-specific independence). Dechter et al. also showed that PPC is a general computation scheme for BNs through their formulation of AND/OR search spaces. The second observation is that while VB has been around for some time as a practically effective approach to Bayesian modeling, its use is still somewhat restricted to simple models such as BNs and HMMs (hidden Markov models), even though its usefulness is established through a variety of applications from model selection to prediction. On the other hand, it has already been proved that VB can be extended to PCFGs and is efficiently implementable using dynamic programming. Note that PCFGs are just one class of PPC, and much more general PPC is realized by PRISM. Accordingly, if VB is extended to PRISM's PPC, we will obtain VB for general probabilistic models, far wider than BNs and PCFGs. The last observation is that once VB becomes available in PRISM, it saves us a great deal of time and effort. We no longer have to derive a new VB algorithm from scratch for each model and implement it; all we have to do is write a probabilistic model at the predicate level. The rest of the work is carried out automatically, in a unified manner, by the PRISM system, as already happens in the case of EM learning. Deriving and implementing a VB algorithm is a tedious, error-prone process, and ensuring its correctness would be difficult beyond PCFGs without formal semantics. PRISM augmented with VB eliminates such needs entirely and makes it easy to explore and test new Bayesian models by helping the user cope with data sparseness and avoid over-fitting.
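
    To make PPC concrete, here is a minimal sketch, using an invented graph encoding rather than PRISM's actual data structures, of computing the probability of a graphically represented AND/OR boolean formula by dynamic programming: multiply over AND conjuncts, sum over OR branches (taken to be mutually exclusive, as in PRISM's explanation graphs), and memoize shared subformulas.

```python
from functools import lru_cache
from math import prod

# Probability of each propositional (binary) random variable being true.
# Names echo PRISM's msw (multi-valued switch) atoms but are made up here.
leaf_prob = {"msw_a": 0.4, "msw_b": 0.7, "msw_c": 0.5}

# A tiny AND/OR graph: each internal node is ("AND", children) or
# ("OR", children). OR branches are assumed mutually exclusive,
# so their probabilities simply add.
graph = {
    "e1": ("AND", ["msw_a", "msw_b"]),
    "e2": ("AND", ["msw_c"]),
    "goal": ("OR", ["e1", "e2"]),
}

@lru_cache(maxsize=None)
def prob(node):
    """Bottom-up probability of a node. Memoization means shared
    subformulas are evaluated once, which is where PPC's efficiency
    on sparsely interdependent variables comes from."""
    if node in leaf_prob:
        return leaf_prob[node]
    op, children = graph[node]
    ps = [prob(c) for c in children]
    return prod(ps) if op == "AND" else sum(ps)

print(prob("goal"))  # 0.4 * 0.7 + 0.5 = 0.78
```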

    Learning Heterogeneous Similarity Measures for Hybrid-Recommendations in Meta-Mining

    The notion of meta-mining has appeared recently and extends traditional meta-learning in two ways. First, it does not learn meta-models that support only the learning-algorithm selection task, but ones that support the whole data-mining process. In addition, it abandons the so-called black-box approach to algorithm description followed in meta-learning: in addition to datasets, algorithms and workflows now have descriptors as well, and for the latter two these descriptions are semantic, describing properties of the algorithms. With descriptors available both for datasets and for data-mining workflows, the traditional modelling techniques followed in meta-learning, typically based on classification and regression algorithms, are no longer appropriate. Instead we are faced with a problem whose nature is much more similar to the problems that appear in recommendation systems. The most important meta-mining requirements are that suggestions should use only dataset and workflow descriptors, and that the cold-start problem be addressed, i.e. providing workflow suggestions for new datasets. In this paper we take a different view of the meta-mining modelling problem and treat it as a recommendation problem. To account for the meta-mining specificities, we derive a novel metric-learning-based recommender approach. Our method learns two homogeneous metrics, one in the dataset space and one in the workflow space, and a heterogeneous one in the dataset-workflow space. All learned metrics reflect similarities established from the dataset-workflow preference matrix. We demonstrate our method on meta-mining over biological (microarray) datasets. The application of our method is not limited to the meta-mining problem; its formulation is general enough to be applied to problems with similar requirements.
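
    To give the flavor of the heterogeneous metric, the sketch below (an illustrative stand-in, not the paper's formulation) fits a bilinear similarity s(d, w) = d^T M w between dataset descriptors d and workflow descriptors w so that it matches a dataset-workflow preference matrix; the data, loss, and hyperparameters are all invented.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem: 5 datasets with 8-dim descriptors, 4 workflows with 6-dim
# descriptors, and a 5x4 dataset-workflow preference matrix Y.
D = rng.normal(size=(5, 8))   # dataset descriptors
W = rng.normal(size=(4, 6))   # workflow descriptors
Y = rng.normal(size=(5, 4))   # observed preferences

# Learn M so that the bilinear similarity D @ M @ W.T matches Y
# (ridge-regularized squared loss, plain gradient descent).
M = np.zeros((8, 6))
lam, lr = 0.1, 0.001
for _ in range(3000):
    R = D @ M @ W.T - Y                # residuals over all pairs
    M -= lr * (D.T @ R @ W + lam * M)  # grad of 0.5*||R||^2 + 0.5*lam*||M||^2

# Cold start: score workflows for an unseen dataset from its descriptor alone.
d_new = rng.normal(size=8)
print(np.argsort(-(d_new @ M @ W.T)))  # workflow indices, best first
```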

    Modeling Complex Networks For (Electronic) Commerce

    NYU, Stern School of Business, IOMS Department, Center for Digital Economy Research

    Learning classifiers from linked data

    The emergence of many interlinked, physically distributed, and autonomously maintained linked data sources amounts to the rapid growth of the Linked Open Data (LOD) cloud, which offers unprecedented opportunities for predictive modeling and knowledge discovery. However, existing machine learning approaches are limited in their applicability because it is neither desirable nor feasible to gather all of the data in a centralized location for analysis, due to access, memory, bandwidth, or computational restrictions. In some applications, additional schema information such as subclass hierarchies may be available and exploited by the learner; in others, the attributes that are relevant for specific prediction tasks are not known a priori and hence need to be discovered by the algorithm. Against this background, we present a series of approaches that address such scenarios. First, we show how to learn Relational Bayesian Classifiers (RBCs) from a single but remote data store using statistical queries, and we extend this to the setting where the attributes relevant for prediction are not known a priori, by selectively crawling the data store for attributes of interest. Next, we introduce an algorithm for learning classifiers from a remote data store enriched with subclass hierarchies. Our algorithm encodes the constraints specified in a subclass hierarchy using latent variables in a directed graphical model, and adopts the Variational Bayesian EM approach to efficiently learn parameters. In retrospect, we observe that in learning from linked data it is often useful to represent an instance as a tuple of bags of attribute values. With this inspiration, we introduce, formulate, and present solutions for a novel type of learning problem which we call distributional instance classification. Finally, building on these foundations, we consider the problem of learning predictive models from multiple interlinked data stores. We introduce a distributed learning framework, identify three special cases of linked data fragmentation, and describe effective strategies for learning predictive models in each case. Further, we consider a novel application of a matrix reconstruction technique from the field of Computerized Tomography to approximate the statistics needed by the learning algorithm from projections obtained with count queries, thus dramatically reducing the amount of information transmitted from the remote data sources to the learner.
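
    As a hedged sketch of the statistical-query idea (the mock interface and toy data below are invented; in practice the counts would come from aggregate queries against a remote store, not from local data), a naive-Bayes-style RBC over bags of attribute values needs only (class, attribute-value) counts, so the learner never ships raw instances.

```python
import math

# Stand-in for a remote linked-data store. Everything here is a mock:
# in reality only the count queries would cross the network.
DATA = [
    ("spam", {"word": ["free", "win", "free"]}),
    ("spam", {"word": ["win", "cash"]}),
    ("ham",  {"word": ["meeting", "free"]}),
    ("ham",  {"word": ["meeting", "notes", "agenda"]}),
]

def count_query(cls=None, attr=None, value=None):
    """Mock statistical query: count instances or attribute-value
    occurrences, optionally restricted to one class. This is the only
    primitive the learner needs from the remote store."""
    n = 0
    for c, bags in DATA:
        if cls is not None and c != cls:
            continue
        if attr is None:
            n += 1
        else:
            n += sum(1 for v in bags[attr] if value is None or v == value)
    return n

classes = ["spam", "ham"]
vocab = ["free", "win", "cash", "meeting", "notes", "agenda"]

# Estimate P(class) and P(value | class) from counts alone (Laplace smoothing).
prior = {c: count_query(cls=c) / count_query() for c in classes}
cond = {c: {v: (count_query(c, "word", v) + 1) /
               (count_query(c, "word") + len(vocab)) for v in vocab}
        for c in classes}

def classify(bag):
    """RBC-style decision: treat the instance as a bag of attribute
    values and combine per-value likelihoods in log space."""
    score = {c: math.log(prior[c]) + sum(math.log(cond[c][v]) for v in bag)
             for c in classes}
    return max(score, key=score.get)

print(classify(["free", "win"]))  # -> spam (on this toy data)
```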

    Representation Learning for Words and Entities

    This thesis presents new methods for unsupervised learning of distributed representations of words and entities from text and knowledge bases. The first algorithm presented in the thesis is a multi-view algorithm for learning representations of words called Multiview Latent Semantic Analysis (MVLSA). By incorporating up to 46 different types of co-occurrence statistics for the same vocabulary of English words, I show that MVLSA outperforms other state-of-the-art word embedding models. Next, I focus on learning entity representations for search and recommendation and present the second method of this thesis, Neural Variational Set Expansion (NVSE). NVSE is also an unsupervised learning method, but it is based on the Variational Autoencoder framework. Evaluations with human annotators show that NVSE can facilitate better search and recommendation of information gathered from noisy, automatic annotation of unstructured natural language corpora. Finally, I move from unstructured data and focus on structured knowledge graphs, presenting novel approaches for learning embeddings of vertices and edges in a knowledge graph that obey logical constraints.
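
    For orientation on the multi-view idea, the sketch below embeds each view with a truncated SVD and then combines the per-view embeddings with a second SVD. This combination step is a crude stand-in for MVLSA's generalized CCA (which additionally whitens each view), so only the general shape of the pipeline is preserved; the mock views and sizes are invented.

```python
import numpy as np

rng = np.random.default_rng(1)
n_words, k = 100, 10   # vocabulary size, embedding dimension

# Three mock "views" over the same vocabulary (in MVLSA these would be
# PPMI co-occurrence matrices built from different corpora or signals).
views = [np.abs(rng.normal(size=(n_words, m))) for m in (50, 80, 30)]

# Per-view spectral embedding via truncated SVD, as in plain LSA.
per_view = []
for X in views:
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    per_view.append(U[:, :k] * s[:k])

# Combine views with an SVD of the concatenation (a simplification of
# MVLSA's generalized CCA).
U, s, _ = np.linalg.svd(np.hstack(per_view), full_matrices=False)
embeddings = U[:, :k]     # one k-dimensional vector per word
print(embeddings.shape)   # (100, 10)
```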

    Classification in Networked Data: A Toolkit and a Univariate Case Study

    This paper is about classifying entities that are interlinked with entities for which the class is known. After surveying prior work, we present NetKit, a modular toolkit for classification in networked data, and a case study of its application to networked data used in prior machine learning research. NetKit is based on a node-centric framework in which classifiers comprise a local classifier, a relational classifier, and a collective inference procedure. Various existing node-centric relational learning algorithms can be instantiated with appropriate choices for these components, and new combinations of components realize new algorithms. The case study focuses on univariate network classification, for which the only information used is the structure of class linkage in the network (i.e., only links and some class labels). To our knowledge, no previous work has systematically evaluated the power of class linkage alone for classification on machine learning benchmark data sets. The results demonstrate that very simple network-classification models perform quite well, well enough that they should be used regularly as baseline classifiers for studies of learning with networked data. The simplest method (which performs remarkably well) highlights the close correspondence between several existing methods introduced for different purposes: Gaussian-field classifiers, Hopfield networks, and relational-neighbor classifiers. The case study also shows that there are two sets of techniques that are preferable in different situations, namely when few versus many labels are known initially. We also demonstrate that link selection plays an important role similar to that of traditional feature selection.
    NYU, Stern School of Business, IOMS Department, Center for Digital Economy Research
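
    To illustrate the univariate setting, here is a minimal sketch of a weighted-vote relational-neighbor classifier with relaxation-labeling-style collective inference, one of the method families the paper discusses. Only links and a few known labels are used; the toy graph is invented, and NetKit's actual components are more general.

```python
# Weighted-vote relational neighbor (wvRN) with relaxation-labeling-style
# collective inference: each unlabeled node's P(+) is repeatedly replaced
# by the mean of its neighbors' current estimates, with labeled nodes
# clamped to their known values.

edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (1, 4)]
known = {0: 1.0, 3: 0.0}          # node -> P(class = +) for labeled nodes
nodes = list(range(6))

neighbors = {n: set() for n in nodes}
for a, b in edges:
    neighbors[a].add(b)
    neighbors[b].add(a)

prior = sum(known.values()) / len(known)     # class prior from the labels
p = {n: known.get(n, prior) for n in nodes}  # initial estimates

for _ in range(50):                          # iterate toward a fixed point
    p = {n: known[n] if n in known
         else sum(p[m] for m in neighbors[n]) / len(neighbors[n])
         for n in nodes}

for n in nodes:
    print(n, round(p[n], 3))
```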

    Representation Learning for Words and Entities

    This thesis presents new methods for unsupervised learning of distributed representations of words and entities from text and knowledge bases. The first algorithm presented in the thesis is a multi-view algorithm for learning representations of words called Multiview LSA (MVLSA). Through experiments on close to 50 different views, I show that MVLSA outperforms other state-of-the-art word embedding models. After that, I focus on learning entity representations for search and recommendation and present the second algorithm of this thesis, Neural Variational Set Expansion (NVSE). NVSE is also an unsupervised learning method, but it is based on the Variational Autoencoder framework. Evaluations with human annotators show that NVSE can facilitate better search and recommendation of information gathered from noisy, automatic annotation of unstructured natural language corpora. Finally, I move from unstructured data and focus on structured knowledge graphs, presenting novel approaches for learning embeddings of vertices and edges in a knowledge graph that obey logical constraints.
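
    The thesis's constraint-aware objectives are not reproduced here; as a generic orientation to knowledge-graph embedding, the sketch below trains TransE-style translations (h + r should land near t) with a margin ranking loss over corrupted triples. Entities, triples, and hyperparameters are all invented, and squared distances are used so the gradients stay linear.

```python
import numpy as np

rng = np.random.default_rng(2)
dim, lr, margin = 16, 0.05, 1.0

entities = ["paris", "france", "berlin", "germany"]
triples = [("paris", "capital_of", "france"),
           ("berlin", "capital_of", "germany")]

# Entity and relation vectors; relations act as translations (h + r ~ t).
E = {e: rng.normal(scale=0.1, size=dim) for e in entities}
R = {"capital_of": rng.normal(scale=0.1, size=dim)}

def dist2(h, r, t):
    """Squared TransE distance: small means the triple is plausible."""
    d = E[h] + R[r] - E[t]
    return d @ d

for _ in range(500):
    for h, r, t in triples:
        t_neg = entities[rng.integers(len(entities))]  # corrupt the tail
        if t_neg == t:
            continue
        # Margin ranking loss: push the true triple below the corrupted one.
        if margin + dist2(h, r, t) - dist2(h, r, t_neg) > 0:
            pos = E[h] + R[r] - E[t]
            neg = E[h] + R[r] - E[t_neg]
            E[h] -= lr * 2 * (pos - neg)
            R[r] -= lr * 2 * (pos - neg)
            E[t] += lr * 2 * pos
            E[t_neg] -= lr * 2 * neg

# The true triple should now score lower than a corrupted one.
print(dist2("paris", "capital_of", "france"),
      dist2("paris", "capital_of", "germany"))
```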