Stochastic Subset Selection for Efficient Training and Inference of Neural Networks
Current machine learning algorithms are designed to work with huge volumes of
high dimensional data such as images. However, these algorithms are being
increasingly deployed to resource constrained systems such as mobile devices
and embedded systems. Even in cases where large computing infrastructure is
available, the size of each data instance, as well as datasets, can be a
bottleneck in data transfer across communication channels. There is also a
strong incentive, in both energy and monetary terms, to reduce the
computational and memory requirements of these algorithms. For nonparametric
models that must leverage the stored training data at inference time, the
increased cost in memory and computation can be even more problematic. In
this work, we aim to reduce the volume of data these algorithms must process
through an end-to-end two-stage neural subset selection model. We first
efficiently obtain a subset of candidate elements by sampling a mask from a
conditionally independent Bernoulli distribution, and then autoregressively
construct a subset consisting of the most task-relevant elements by sampling
the elements from a conditional Categorical distribution. We validate our
method on set reconstruction and classification tasks with feature selection as
well as the selection of representative samples from a given dataset, on which
our method outperforms relevant baselines. We also show in our experiments that
our method enhances scalability of nonparametric models such as Neural
Processes.
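The two-stage sampling scheme described in this abstract can be sketched as follows; the score vector, the keep probability, and the softmax-over-scores categorical are illustrative assumptions standing in for the paper's learned distributions:

```python
import numpy as np

rng = np.random.default_rng(0)

def two_stage_subset(scores, k, keep_prob=0.5):
    """Two-stage stochastic subset selection (sketch).

    Stage 1: sample a candidate mask from conditionally independent
    Bernoullis. Stage 2: autoregressively draw up to k elements from a
    categorical distribution over the remaining candidates.
    """
    n = len(scores)
    # Stage 1: cheap Bernoulli mask over all elements.
    mask = rng.random(n) < keep_prob
    candidates = np.flatnonzero(mask)
    chosen = []
    # Stage 2: without-replacement categorical sampling, re-normalizing
    # after each draw (a stand-in for the learned conditional distribution).
    for _ in range(min(k, len(candidates))):
        logits = scores[candidates]
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        idx = rng.choice(len(candidates), p=probs)
        chosen.append(candidates[idx])
        candidates = np.delete(candidates, idx)
    return np.array(chosen)

subset = two_stage_subset(rng.normal(size=20), k=5)
```

The Bernoulli stage prunes most elements cheaply; only the surviving candidates pay the cost of the autoregressive stage.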
Stochastic Optimisation Methods Applied to PET Image Reconstruction
Positron Emission Tomography (PET) is a medical imaging technique that is used to provide functional information regarding physiological processes. Statistical PET reconstruction attempts to estimate the distribution of radiotracer in the body, but this methodology is generally computationally demanding because of the use of iterative algorithms. These algorithms are often accelerated by the utilisation of data subsets, which may result in convergence to a limit set rather than the unique solution. Methods exist to relax the update step sizes of subset algorithms, but they introduce additional heuristic parameters that may result in extended reconstruction times. This work investigates novel methods to modify subset algorithms to converge to the unique solution while maintaining the acceleration benefits of subset methods.
This work begins with a study of an automatic method for increasing subset sizes, called AutoSubsets. This algorithm measures the divergence between two distinct data subset update directions and, if it is significant, increases the subset size for future updates. The algorithm is evaluated using both projection and list-mode data. The algorithm's use of small initial subsets benefits early reconstruction; unfortunately, at later updates, the subset size increases too early, which impedes the convergence rate.
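The AutoSubsets check can be illustrated with a minimal sketch; the cosine-similarity test, the threshold, and the doubling rule are illustrative assumptions, not the thesis' exact criterion:

```python
import numpy as np

def autosubsets_step(grad_a, grad_b, subset_size, n_total, threshold=0.5):
    """One AutoSubsets-style decision (sketch).

    grad_a, grad_b: update directions computed from two distinct data
    subsets. If the directions diverge (low cosine similarity), the
    subset gradient is deemed unreliable and the subset size is grown
    for future updates, capped at the full data size.
    """
    cos = grad_a @ grad_b / (np.linalg.norm(grad_a) * np.linalg.norm(grad_b))
    if cos < threshold:
        subset_size = min(2 * subset_size, n_total)
    return subset_size, cos
```

When the two subset directions agree, small subsets are kept and early iterations stay cheap; disagreement triggers growth toward the full-data update.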
The main part of this work investigates the application of stochastic variance reduction optimisation algorithms to PET image reconstruction. These algorithms reduce the variance due to the use of subsets by incorporating previously computed subset gradients into the update direction. The algorithms are adapted for application to PET reconstruction. This study evaluates the reconstruction performance of these algorithms when applied to various 3D non-TOF PET simulated, phantom and patient data sets. The impact of a number of algorithm parameters is explored, including subset selection methodologies, the number of subsets, step size methodologies and preconditioners. The results indicate that these stochastic variance reduction algorithms demonstrate superior performance after only a few epochs when compared to a standard PET reconstruction algorithm.
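The variance-reduction idea can be shown with a generic SVRG-style loop; this is a textbook sketch on a toy quadratic objective, not the thesis' PET-specific algorithm (preconditioning, nonnegativity constraints, and the Poisson likelihood are omitted):

```python
import numpy as np

def svrg_updates(grad_subsets, x0, step=0.1, epochs=3):
    """SVRG-style variance-reduced iteration (generic sketch).

    grad_subsets: list of callables, each returning the gradient of one
    data subset's objective at x.
    """
    x = x0.copy()
    for _ in range(epochs):
        snapshot = x.copy()
        # Full gradient, computed once per epoch at the snapshot point.
        full_grad = sum(g(snapshot) for g in grad_subsets) / len(grad_subsets)
        for i in np.random.default_rng(0).permutation(len(grad_subsets)):
            gs = grad_subsets[i]
            # Subset gradient corrected by snapshot information: an
            # unbiased direction whose variance shrinks as x approaches
            # the snapshot.
            direction = gs(x) - gs(snapshot) + full_grad
            x -= step * direction
    return x

# Toy example: two quadratic subsets with minimizers 1 and 3; the
# combined minimizer is their mean, 2.
x_star = svrg_updates([lambda x: x - 1.0, lambda x: x - 3.0],
                      np.array([0.0]), step=0.5, epochs=5)
```

Each subset update reuses the stored snapshot gradient, which is exactly the "incorporating previously computed subset gradients" mechanism described above.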
Optimizing the Simplicial-Map Neural Network Architecture
Simplicial-map neural networks are a recent neural network architecture induced by
simplicial maps defined between simplicial complexes. It has been proved that simplicial-map neural
networks are universal approximators and that they can be refined to be robust to adversarial attacks.
In this paper, the refinement toward robustness is optimized by reducing the number of simplices
(i.e., nodes) needed. We have shown experimentally that such a refined neural network is equivalent
to the original network as a classification tool but requires much less storage.
Agencia Estatal de Investigación PID2019-107339GB-10
Active Frame, Location, and Detector Selection for Automated and Manual Video Annotation
We describe an information-driven active selection approach to determine which detectors to deploy at which location in which frame of a video to minimize semantic class label uncertainty at every pixel, with the smallest computational cost that ensures a given uncertainty bound. We show minimal performance reduction compared to a “paragon” algorithm running all detectors at all locations in all frames, at a small fraction of the computational cost. Our method can handle uncertainty in the labeling mechanism, so it can handle both “oracles” (manual annotation) and noisy detectors (automated annotation).
Model-Based Problem Solving through Symbolic Regression via Pareto Genetic Programming
Pareto genetic programming methodology is extended with additional generic model selection and generation strategies that (1) drive the modeling engine toward the creation of models of reduced non-linearity and increased generalization capability, and (2) improve the effectiveness of the search for robust models through goal softening and adaptive fitness evaluations. In addition to the new strategies for model development and model selection, this dissertation presents a new approach for the analysis, ranking, and compression of given multi-dimensional input-response data for the purpose of balancing the information content of undesigned data sets.
Genetic Programming of an Algorithmic Chemistry
Genetic programming (GP) is usually based on the assumption that individuals have an evolved, well-defined structure and that their execution is deterministic. This assumption does not originate from the methodological model, natural evolution, but is a conscious or unconscious legacy of the environment in which evolution is simulated: the von Neumann architecture.
With the architecture named after him, John von Neumann influenced far more in computer science than the field of computer architectures. His influence on the evolution of algorithms by means of genetic programming is therefore not surprising, even though the von Neumann architecture has little in common with the systems evolved in nature. In recent years, a whole range of concepts and theoretical models have emerged that borrow little from von Neumann's computer architecture and whose properties more closely resemble natural systems. The ability of these systems to perform computations arises only through the interaction of their parallel, nondeterministic, and decentrally organized components. The ability emerges.
Comparatively little is known about the evolution of algorithms for such systems beyond the von Neumann architecture. The present thesis addresses this question using an algorithmic chemistry, an artificial chemistry that, viewed simplistically, results from modified program-counter behavior in the von Neumann architecture. Reactions, a variant of simple instructions, are drawn and executed in random order. They interact with one another by consuming the products of other reactions and making the results of their own transformations, stored in so-called molecules, available to other reactions.
For the experimental evaluation of this nondeterministic system, sequential parameter optimization is extended by a procedure for distributing an experiment budget. The systematic design of the experiments and their subsequent analysis make it possible to gain generalized insights into the system's behavior beyond specific parameterizations. In the case of genetic programming of an algorithmic chemistry, the insights gained lead to a redesign of the recombination operator, modeled on homologous recombination operations, and thus to a further improvement in system performance.
It turns out that the reaction schemes required for goal-directed behavior of an algorithmic chemistry can be learned by means of genetic programming. For common problems from the field of genetic programming, solutions are found whose quality is comparable to that of other GP variants and machine learning methods. The evolved solutions are significantly more compact, in terms of data-flow depth and the number of required operations, than those of the linear GP system used for comparison.
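The execution model of an algorithmic chemistry, reactions drawn in random order that read and write a shared pool of molecules, can be sketched in a few lines; the tuple encoding of reactions is an illustrative simplification, not the thesis' exact representation:

```python
import random

def run_chemistry(reactions, molecules, steps, seed=0):
    """Minimal algorithmic-chemistry interpreter (sketch).

    reactions: list of (src_a, src_b, dst, op) tuples; each reads two
    molecules, applies op, and stores the product in a third molecule.
    Reactions fire in random order, so execution is nondeterministic in
    general; a fixed seed makes this particular run reproducible.
    """
    rng = random.Random(seed)
    for _ in range(steps):
        src_a, src_b, dst, op = rng.choice(reactions)
        molecules[dst] = op(molecules[src_a], molecules[src_b])
    return molecules

# One addition reaction repeatedly writes molecules[0] + molecules[1]
# into molecules[2].
pool = run_chemistry([(0, 1, 2, lambda a, b: a + b)], [1, 2, 0], steps=5)
```

Because there is no program counter, the result depends only on the data flow between molecules, which is what the compactness comparison at the end of the abstract measures.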
Sequential Decision-Making Problems: Online Learning for Optimization over Networks
The granularity of big data allows marketers to target customers more effectively with
personalized offers. We model the resulting learning and optimization problem of allocating
coupons on a social network where consumer response models are unknown.
Uncertainty in these model parameters leads to a natural "exploration-exploitation"
tradeoff in choosing which discounts to offer, while choosing which customers to include
in a marketing campaign is a stochastic subset selection problem. We adapt
the Knowledge Gradient with Discrete Priors from optimal learning and find a statistically
significant increase in revenue and general robustness to moderate amounts of
sampling noise. We then incorporate social network information about the connectivity
of users and optimally learn the underlying customer segment parameters, while
accounting for network effects on revenue. Our sampling method under uncertainty
is competitive with realizations of the policy which makes the optimal decision with
perfect information; however, realized revenue is lower than the expected revenue.
We also consider the task assignment problem in crowdsourcing settings, such
as on a platform like Amazon's Mechanical Turk, where we are uncertain both of
users' reliabilities and the true labels of the questions. We derive an approximation
scheme for a dynamic task assignment framework which uses intermediate estimates
of the question answers to estimate the increase in mutual information we receive
by querying each user. This dynamic assignment framework improves upon the final
label accuracy of random sampling by up to 35% for small sample regimes, achieves
comparable performance with one-shot dynamic assignment, and better performance
when the ratio of questions to users increases.
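The per-user score that the dynamic assignment framework ranks workers by can be illustrated with the textbook binary-channel computation; the binary-label setting and the single-reliability worker model are simplifying assumptions:

```python
import math

def entropy(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -(p * math.log2(p)
                                         + (1 - p) * math.log2(1 - p))

def expected_info_gain(p_true, reliability):
    """Expected mutual information between a binary question's label and
    one more answer from a worker of the given reliability.

    p_true: current estimate that the label is 1.
    reliability: probability the worker answers correctly.
    """
    # Probability the worker answers 1.
    p_ans1 = p_true * reliability + (1 - p_true) * (1 - reliability)
    p_ans0 = 1 - p_ans1
    # Posterior that the label is 1 given each answer (Bayes' rule).
    post1 = p_true * reliability / p_ans1 if p_ans1 > 0 else 0.0
    post0 = p_true * (1 - reliability) / p_ans0 if p_ans0 > 0 else 0.0
    # Mutual information = prior entropy - expected posterior entropy.
    return entropy(p_true) - (p_ans1 * entropy(post1)
                              + p_ans0 * entropy(post0))
```

A worker whose answers are coin flips contributes zero information, so the framework naturally routes questions to the workers whose estimated reliabilities make their answers most informative.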
These models and results suggest a general framework using ideas from optimal
learning to learn parameters of customer response models while optimizing for revenue.
Given these estimates, for submodular objective functions, greedy approximation
schemes suffice to construct cardinality-constrained subsets of users to target.
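The closing claim corresponds to the standard greedy algorithm for monotone submodular maximization under a cardinality constraint (with its classic (1 - 1/e) guarantee); the coverage-style objective in the example is a placeholder for an estimated revenue function:

```python
def greedy_submodular(candidates, f, k):
    """Greedily build a set S with |S| <= k maximizing a monotone
    submodular set function f by repeatedly adding the element with the
    largest marginal gain.
    """
    selected = []
    for _ in range(k):
        best, best_gain = None, float('-inf')
        for c in candidates:
            if c in selected:
                continue
            gain = f(selected + [c]) - f(selected)  # marginal gain of c
            if gain > best_gain:
                best, best_gain = c, gain
        selected.append(best)
    return selected

# Coverage example: each candidate "reaches" a set of customers, and f
# counts the distinct customers reached.
cover = {0: {1, 2}, 1: {2, 3}, 2: {4}}
f = lambda S: len(set().union(*[cover[i] for i in S]))
targets = greedy_submodular([0, 1, 2], f, 2)
```

Each round costs one pass over the candidates, so the whole scheme is O(nk) evaluations of f, which is what makes greedy targeting practical at network scale.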
Using data mining in educational research: A comparison of Bayesian network with multiple regression in prediction
Advances in technology have altered data collection and popularized large databases in areas including education. To turn the collected data into knowledge, effective analysis tools are required. Traditional statistical approaches have shown some limitations when analyzing large-scale data, especially sets with a large number of variables. This dissertation introduces to educational researchers a new data analysis approach called data mining, an analytic process at the intersection of statistics, databases, machine learning/artificial intelligence (AI), and computer science, that is designed to explore large amounts of data to search for consistent patterns and/or systematic relationships between variables. To examine the usefulness of data mining in educational research, one specific data mining technique--the Bayesian Belief Network (BBN), based on Bayesian probability--is used to construct an analysis model in contrast to the traditional statistical approaches to answer a pseudo research question about faculty salary prediction in postsecondary institutions. Four prediction models--a multiple regression model with theoretical variable selection, a regression model with statistical variable extraction, a data mining BBN model with wrapper feature selection, and a combination model that used variables selected by the BBN in a multiple regression procedure--are expounded to analyze a data set called the National Survey of Postsecondary Faculty 1999 (NSOPF:99) provided by the National Center for Education Statistics (NCES). The algorithms, input variables, final models, outputs, and interpretations of the four prediction models are presented and discussed.
The results indicate that, with a nonmetric approach, the BBN can effectively handle a large number of variables through a process of stochastic subset selection; uncover dependence relationships among variables; detect hidden patterns in the data set; minimize the sample size as a factor influencing the amount of computation in data modeling; reduce data dimensionality by automatically identifying the most pertinent variable from a group of different but highly correlated measures in the analysis; and select the critical variables related to a core construct in prediction problems. The BBN and other data mining techniques have drawbacks; nonetheless, they are useful tools with unique advantages for analyzing large-scale data in educational research.