Approximate Bayesian Deep Learning for Resource-Constrained Environments
Deep learning models have shown promising results in areas including computer vision, natural language processing, speech recognition, and more. However, existing point estimation-based training methods for these models may result in predictive uncertainties that are not well calibrated, including the occurrence of confident errors. Approximate Bayesian inference methods can help address these issues in a principled way by accounting for uncertainty in model parameters. However, these methods are computationally expensive both when computing approximations to the parameter posterior and when using an approximate parameter posterior to make predictions. They can also require significantly more storage than point-estimated models.
In this thesis, we address a range of questions related to trade-offs between the quality of inference and prediction and the computational scalability of Bayesian deep learning methods. We begin by developing a framework for comprehensive evaluation of Bayesian neural network models and applying this framework to a range of existing models and inference methods. Second, we address the problem of providing flexible trade-offs between prediction quality, run time, and storage by developing and evaluating a general framework for distilling expectations with respect to the Bayesian posterior distribution of a deep neural network classifier. Third, we investigate the trade-offs between model sparsity and inference performance for deep neural network models using several approaches to deriving sparse model structures. Fourth, we present a framework for correcting approximate posterior predictive distributions, encouraging them to prefer high-utility decisions. Finally, we investigate the use of approximate Bayesian deep learning in object detection and present an evaluation of approaches for quantifying different facets of uncertainty related to object classes and locations.
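The expectation that such distillation targets can be made concrete with a small sketch. Below is a minimal numpy illustration of a Monte Carlo posterior-predictive estimate: the "posterior samples" are simulated stand-ins for draws from an approximate parameter posterior of a linear classifier (not the thesis's actual models or inference methods), and the averaged softmax output is the per-input expectation a student model would be trained to reproduce.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical "posterior samples": S weight matrices for a linear classifier,
# standing in for draws from an approximate posterior p(W | data).
S, D, C = 50, 4, 3
posterior_W = rng.normal(size=(S, D, C))

def posterior_predictive(x):
    # Monte Carlo estimate of E_{W ~ p(W|data)}[softmax(x @ W)]:
    # one softmax prediction per posterior sample, averaged over samples.
    probs = softmax(x @ posterior_W)   # shape (S, C)
    return probs.mean(axis=0)          # shape (C,)

x = rng.normal(size=D)
p = posterior_predictive(x)
assert abs(p.sum() - 1.0) < 1e-9
```

Making a prediction this way costs S forward passes and S copies of the parameters, which is exactly the run-time and storage overhead that distilling the expectation into a single network avoids.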
Taylor Polynomial Estimator for Estimating Frequency Moments
We present a randomized algorithm for estimating the $p$th moment of
the frequency vector of a data stream in the general update (turnstile) model
to within a multiplicative factor of $1 \pm \epsilon$, for $p > 2$, with high
constant confidence. For $0 < \epsilon \le 1$, the algorithm uses space $O(n^{1-2/p}\epsilon^{-2} + n^{1-2/p}\epsilon^{-4/p}\log n)$ words. This
improves over the current bound of $O(n^{1-2/p}\epsilon^{-2-4/p}\log n)$
words by Andoni et al. in \cite{ako:arxiv10}. Our space upper bound matches
the lower bound of Li and Woodruff \cite{liwood:random13} for $\epsilon = \Omega((\log n)/n^{1/p})$ and the lower bound of Andoni et al. \cite{anpw:icalp13}
for $\epsilon = O(1/\log n)$.

Comment: Supersedes arXiv:1104.4552. Extended abstract of this paper to appear
in Proceedings of ICALP 2015
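To fix ideas about the estimand, here is a minimal sketch of the $p$th frequency moment computed exactly from a turnstile stream of signed updates. It illustrates the quantity the paper estimates, not the paper's sublinear-space algorithm, which never materializes the frequency vector.

```python
from collections import defaultdict

def frequency_moment(stream, p):
    """Exact p-th frequency moment F_p = sum_i |f_i|^p of a turnstile stream.

    `stream` is a sequence of (item, delta) updates; deltas may be negative,
    as allowed by the general update (turnstile) model.
    """
    freq = defaultdict(int)
    for item, delta in stream:
        freq[item] += delta
    return sum(abs(f) ** p for f in freq.values())

updates = [("a", 3), ("b", 1), ("a", -1), ("c", 2)]
# final frequencies: f = {a: 2, b: 1, c: 2}; F_3 = 2^3 + 1^3 + 2^3 = 17
assert frequency_moment(updates, 3) == 17
```

This exact computation needs one word per distinct item; the point of the streaming literature is to approximate the same value in space sublinear in the number of distinct items.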
How Technology Impacts and Compares to Humans in Socially Consequential Arenas
One of the main promises of technology development is for it to be adopted by
people, organizations, societies, and governments -- incorporated into their
life, work stream, or processes. Often, this is socially beneficial as it
automates mundane tasks, frees up more time for other more important things, or
otherwise improves the lives of those who use the technology. However, these
beneficial results do not apply in every scenario and may not impact everyone
in a system the same way. Sometimes a technology is developed that both
produces benefits and inflicts harms. These harms may come at a higher cost to
some people than others, raising the question: {\it how are benefits and harms
weighed when deciding if and how a socially consequential technology gets
developed?} The most natural way to answer this question, and in fact how
people first approach it, is to compare the new technology to what used to
exist. As such, in this work, I make comparative analyses between humans and
machines in three scenarios and seek to understand how sentiment about a
technology, performance of that technology, and the impacts of that technology
combine to influence how one decides to answer my main research question.

Comment: Doctoral thesis proposal. arXiv admin note: substantial text overlap
with arXiv:2110.08396, arXiv:2108.12508, arXiv:2006.1262
Integrated High-Resolution Modeling for Operational Hydrologic Forecasting
Current advances in Earth-sensing technologies, physically-based modeling, and computational processing offer the promise of a major revolution in hydrologic forecasting, with profound implications for the management of water resources and protection from related disasters. However, access to the necessary capabilities for managing information from heterogeneous sources, and for its deployment in robust-enough modeling engines, remains the province of large governmental agencies. Moreover, even within this type of centralized operations, success is still challenged by the sheer computational complexity associated with overcoming uncertainty in the estimation of parameters and initial conditions in large-scale or high-resolution models.
In this dissertation we seek to facilitate access to hydrometeorological data products from various U.S. agencies and to advanced watershed modeling tools through the implementation of a lightweight GIS-based software package. Accessible data products currently include gauge, radar, and satellite precipitation; stream discharge; distributed soil moisture and snow cover; and multi-resolution weather forecasts. Additionally, we introduce a suite of open-source methods aimed at the efficient parameterization and initialization of complex geophysical models in contexts of high uncertainty, scarce information, and limited computational resources. The developed products in this suite include: 1) model calibration based on state-of-the-art ensemble evolutionary Pareto optimization, 2) automatic parameter estimation boosted through the incorporation of expert criteria, 3) data assimilation that hybridizes particle smoothing and variational strategies, 4) model state compression by means of optimized clustering, 5) high-dimensional stochastic approximation of watershed conditions through a novel lightweight Gaussian graphical model, and 6) simultaneous estimation of model parameters and states for hydrologic forecasting applications.
Each of these methods was tested using established distributed, physically-based hydrologic modeling engines (VIC and the DHSVM) applied to U.S. watersheds of different sizes, from a small, highly instrumented catchment in Pennsylvania to the basin of the Blue River in Oklahoma. A series of experiments demonstrated statistically significant improvements in the predictive accuracy of the proposed methods in contrast with traditional approaches. Taken together, these accessible and efficient tools can therefore be integrated within various model-based workflows for complex operational applications in water resources and beyond.
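At the core of any multi-objective calibration such as the ensemble evolutionary Pareto optimization mentioned above is the selection of non-dominated parameter sets. The sketch below shows only that selection step, with hypothetical calibration objectives (the two error metrics and their values are illustrative, not drawn from the dissertation's experiments).

```python
import numpy as np

def pareto_front(costs):
    """Indices of non-dominated rows of a (n, k) cost matrix (minimization).

    Row j dominates row i when j is no worse on every objective and
    strictly better on at least one.
    """
    n = costs.shape[0]
    keep = []
    for i in range(n):
        dominated = any(
            j != i
            and np.all(costs[j] <= costs[i])
            and np.any(costs[j] < costs[i])
            for j in range(n)
        )
        if not dominated:
            keep.append(i)
    return keep

# Hypothetical errors for 4 candidate parameter sets on two objectives,
# e.g. peak-flow error vs. low-flow error (values are illustrative only).
costs = np.array([[0.2, 0.9],
                  [0.4, 0.4],
                  [0.9, 0.2],
                  [0.5, 0.5]])   # last row is dominated by [0.4, 0.4]
assert pareto_front(costs) == [0, 1, 2]
```

An evolutionary calibrator repeats this selection each generation, perturbing the surviving parameter sets to propose the next ensemble.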
LIPIcs, Volume 251, ITCS 2023, Complete Volume
Online learning on the programmable dataplane
This thesis makes the case for managing computer networks with data-driven methods: automated statistical inference and control based on measurement data and runtime observations. It argues for their tight integration with programmable dataplane hardware to make management decisions faster and from more precise data. Optimisation, defence, and measurement of networked infrastructure are each challenging tasks in their own right, which are currently dominated by the use of hand-crafted heuristic methods. These become harder to reason about and deploy as networks scale in rates and number of forwarding elements, and their design requires expert knowledge and care around unexpected protocol interactions. This makes tailored, per-deployment or per-workload solutions infeasible to develop. Recent advances in machine learning offer capable function approximation and closed-loop control which suit many of these tasks. New, programmable dataplane hardware enables more agility in the network: runtime reprogrammability, precise traffic measurement, and low-latency on-path processing. The synthesis of these two developments allows complex decisions to be made on previously unusable state, and made quicker by offloading inference to the network.
To justify this argument, I advance the state of the art in data-driven defence of networks, novel dataplane-friendly online reinforcement learning algorithms, and in-network data reduction to allow classification of switch-scale data. Each requires co-design aware of the network, and of the failure modes of systems and carried traffic. To make online learning possible in the dataplane, I use fixed-point arithmetic and modify classical (non-neural) approaches to take advantage of the SmartNIC compute model and make use of rich device-local state. I show that data-driven solutions still require great care to correctly design, but with the right domain expertise they can improve on pathological cases in DDoS defence, such as protecting legitimate UDP traffic. In-network aggregation to histograms is shown to enable accurate classification from fine temporal effects, and allows hosts to scale such classification to far larger flow counts and traffic volumes. Moving reinforcement learning to the dataplane is shown to offer substantial benefits to state-action latency and online learning throughput versus host machines, allowing policies to react faster to fine-grained network events. The dataplane environment is key in making reactive online learning feasible: to port further algorithms and learnt functions, I collate and analyse the strengths of current and future hardware designs, as well as individual algorithms.
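The combination of classical (non-neural) reinforcement learning and fixed-point arithmetic can be illustrated with a single Bellman update. The sketch below emulates integer-only dataplane math in Python using a 16.16 fixed-point format; the format choice and hyperparameter values are illustrative assumptions, not the thesis's actual SmartNIC implementation.

```python
# One Q-learning update in 16.16 fixed point, emulating integer-only
# dataplane arithmetic (no floating point on the forwarding path).
FRAC = 16                 # fractional bits
ONE = 1 << FRAC           # fixed-point representation of 1.0

def to_fix(x):
    return int(round(x * ONE))

def mul(a, b):
    # Fixed-point multiply: full integer product, then rescale.
    return (a * b) >> FRAC

# Hypothetical hyperparameters, encoded once at compile/configure time.
ALPHA = to_fix(0.5)       # learning rate
GAMMA = to_fix(0.9)       # discount factor

def q_update(q, reward, q_next_max):
    """Bellman update: q += alpha * (reward + gamma * max_a' Q(s',a') - q)."""
    target = reward + mul(GAMMA, q_next_max)
    return q + mul(ALPHA, target - q)

q = q_update(to_fix(0.0), to_fix(1.0), to_fix(2.0))
# float reference: target = 1 + 0.9*2 = 2.8; q = 0 + 0.5*2.8 = 1.4
assert abs(q / ONE - 1.4) < 1e-3
```

Shifts and integer adds like these map directly onto match-action pipeline primitives, which is what makes per-packet or per-event policy updates plausible at line rate.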
LIPIcs, Volume 274, ESA 2023, Complete Volume
From Weakly Supervised Learning to Online Cataloguing (De l'apprentissage faiblement supervisé au catalogage en ligne)
Applied mathematics and machine computations have raised a lot of hope since the recent success of supervised learning. Many practitioners in industry have been trying to switch from their old paradigms to machine learning. Interestingly, those data scientists spend more time scraping, annotating and cleaning data than fine-tuning models. This thesis is motivated by the following question: can we derive a more generic framework than the one of supervised learning in order to learn from cluttered data? This question is approached through the lens of weakly supervised learning, assuming that the bottleneck of data collection lies in annotation. We model weak supervision as giving, rather than a unique target, a set of target candidates. We argue that one should look for an "optimistic" function that matches most of the observations. This allows us to derive a principle to disambiguate partial labels. We also discuss the advantage of incorporating unsupervised learning techniques into our framework, in particular manifold regularization approached through diffusion techniques, for which we derive a new algorithm that scales better with input dimension than the baseline method. Finally, we switch from passive to active weakly supervised learning, introducing the "active labeling" framework, in which a practitioner can query weak information about chosen data under a budget constraint. Among others, we leverage the fact that one does not need full information to access stochastic gradients and perform stochastic gradient descent.
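The "optimistic" disambiguation principle for partial labels can be sketched in a few lines: when supervision gives a set of candidate targets instead of one, charge the model only for the candidate it already finds most plausible. The function names and the cross-entropy instantiation below are illustrative assumptions, one common way to realize an infimum-style loss, not the thesis's exact formulation.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def optimistic_loss(scores, candidates):
    """Infimum ("optimistic") loss for a partially labeled example.

    `candidates` is the set of target candidates supplied as weak
    supervision; the loss is the cross-entropy of the best-matching one.
    """
    p = softmax(scores)
    return min(-np.log(p[c]) for c in candidates)

scores = np.array([2.0, 0.1, -1.0])
# With candidate set {0, 2}, the optimistic loss commits to class 0,
# the candidate the model currently scores highest, so it is no larger
# than the fully supervised loss for class 2 alone.
assert optimistic_loss(scores, {0, 2}) <= optimistic_loss(scores, {2})
```

Minimizing this loss over a dataset drives the model toward a labeling consistent with most candidate sets, which is the disambiguation behavior the abstract describes.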