
22,442 research outputs found

BlinkML: Efficient Maximum Likelihood Estimation with Probabilistic Guarantees

Author: Anderson M. R.
Bardenet R.
Bergstra J. S.
Bottou L.
Crankshaw D.
Derezinski M.
Drineas P.
Duchi J.
Feurer M.
Gittens A.
Le Q. V.
Lin C.-J.
Lucic M.
Maclaurin D.
Martens J.
Mozafari B.
Musco C.
Ogawa K.
Pedregosa F.
Recht B.
Salakhutdinov R.
Tieleman T.
Tipping M. E.
Weimer M.
Xing E.
Yang A. Y.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 26/12/2018

The rising volume of datasets has made training machine learning (ML) models a major computational cost in the enterprise. Given the iterative nature of model and parameter tuning, many analysts use a small sample of their entire data during their initial stage of analysis to make quick decisions (e.g., what features or hyperparameters to use) and use the entire dataset only in later stages (i.e., when they have converged to a specific model). This sampling, however, is performed in an ad-hoc fashion. Most practitioners cannot precisely capture the effect of sampling on the quality of their model, and eventually on their decision-making process during the tuning phase. Moreover, without systematic support for sampling operators, many optimizations and reuse opportunities are lost. In this paper, we introduce BlinkML, a system for fast, quality-guaranteed ML training. BlinkML allows users to make error-computation tradeoffs: instead of training a model on their full data (i.e., full model), BlinkML can quickly train an approximate model with quality guarantees using a sample. The quality guarantees ensure that, with high probability, the approximate model makes the same predictions as the full model. BlinkML currently supports any ML model that relies on maximum likelihood estimation (MLE), which includes Generalized Linear Models (e.g., linear regression, logistic regression, max entropy classifier, Poisson regression) as well as PPCA (Probabilistic Principal Component Analysis). Our experiments show that BlinkML can speed up the training of large-scale ML tasks by 6.26x-629x while guaranteeing the same predictions, with 95% probability, as the full model.

Comment: 22 pages, SIGMOD 201
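The quantity the abstract talks about, how often a model trained on a sample makes the same predictions as the full model, can be illustrated with a small sketch. This is not BlinkML's method: BlinkML bounds the agreement without training the full model, whereas the sketch below simply trains both models and counts matching predictions. The synthetic data, the function names (`train_logistic`, `agreement_demo`), the sample size, and the plain gradient-ascent MLE are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    # Clamp the input so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, epochs=100, lr=0.5):
    """Fit logistic regression by batch gradient ascent on the log-likelihood."""
    dim = len(rows[0])
    weights = [0.0] * dim
    for _ in range(epochs):
        grad = [0.0] * dim
        for x, y in zip(rows, labels):
            err = y - sigmoid(sum(w * xi for w, xi in zip(weights, x)))
            for j in range(dim):
                grad[j] += err * x[j]
        weights = [w + lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights


def predict(weights, x):
    # Predict class 1 when the linear score is non-negative.
    return 1 if sum(w * xi for w, xi in zip(weights, x)) >= 0.0 else 0


def agreement_demo(n_full=2000, n_sample=200, seed=0):
    """Return the fraction of rows on which the sample model and the
    full model make the same prediction (hypothetical synthetic data)."""
    rng = random.Random(seed)
    true_w = [0.5, 2.0, -1.0]  # assumed ground-truth weights, bias first
    rows, labels = [], []
    for _ in range(n_full):
        x = [1.0, rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
        p = sigmoid(sum(w * xi for w, xi in zip(true_w, x)))
        rows.append(x)
        labels.append(1 if rng.random() < p else 0)

    # Uniform random sample without replacement.
    idx = rng.sample(range(n_full), n_sample)
    full_w = train_logistic(rows, labels)
    sample_w = train_logistic([rows[i] for i in idx], [labels[i] for i in idx])

    same = sum(predict(full_w, x) == predict(sample_w, x) for x in rows)
    return same / n_full


if __name__ == "__main__":
    print(f"prediction agreement: {agreement_demo():.3f}")
```

Raising `n_sample` should push the agreement toward 1 at the cost of more training time, which is the error-computation tradeoff the abstract describes.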

arXiv.org e-Print Archive

Crossref

On pattern classification algorithms - Introduction and survey

Author: Agrawala A. K.
Ho YU.-C.
Publication date: 01/01/1968

Pattern recognition algorithms, and mathematical techniques of estimation, decision making, and optimization theor

Crossref

NASA Technical Reports Server


Learning from AI : new trends in database technology

Author: Bic Lubomir
Gilbert Jonathan P.
Publication venue: eScholarship, University of California
Publication date: 01/01/1985

Recently, some researchers in the areas of database data modelling and knowledge representation in artificial intelligence have recognized that they share many common goals. In this survey paper we show the relationship between database and artificial intelligence research. We show that data models have tended to incorporate more of the modelling techniques developed for knowledge representation in artificial intelligence as the desire for application-oriented semantics, user friendliness, and flexibility has increased. Increasing the semantics of the representation is the key to capturing the "reality" of the database environment, increasing user friendliness, and facilitating the support of multiple, possibly conflicting, user views of the information contained in a database.

eScholarship - University of California

Object-Oriented Dynamics Learning through Multi-Level Abstraction

Author: Lin Zichuan
Ren Zhizhou
Wang Jianhao
Zhang Chongjie
Zhu Guangxiang
Publication date: 05/12/2019

Object-based approaches for learning action-conditioned dynamics have demonstrated promise for generalization and interpretability. However, existing approaches suffer from structural limitations and optimization difficulties for common environments with multiple dynamic objects. In this paper, we present a novel self-supervised learning framework, called Multi-level Abstraction Object-oriented Predictor (MAOP), which employs a three-level learning architecture that enables efficient object-based dynamics learning from raw visual observations. We also design a spatial-temporal relational reasoning mechanism for MAOP to support instance-level dynamics learning and handle partial observability. Our results show that MAOP significantly outperforms previous methods in terms of sample efficiency and generalization over novel environments for learning environment models. We also demonstrate that learned dynamics models enable efficient planning in unseen environments, comparable to true environment models. In addition, MAOP learns semantically and visually interpretable disentangled representations.

Comment: Accepted to the Thirty-Fourth AAAI Conference On Artificial Intelligence (AAAI), 202

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Schema architectures and their relationships to transaction processing in distributed database systems

Author: Apers P.M.G.
Scheuermann P.
Publication date: 01/01/1991

We discuss the different types of schema architectures that could be supported by distributed database systems, making a clear distinction between logical, physical, and federated distribution. We elaborate on the additional mapping information required in an architecture based on logical distribution in order to support retrieval as well as update operations. We illustrate the problems in schema integration and data integration in multidatabase systems and discuss their impact on query processing. Finally, we discuss different issues relevant to the cooperation (or noncooperation) of local database systems in a heterogeneous multidatabase system and their relationship to the schema architecture and transaction processing.

University of Twente Research Information