Tradeoffs of Diagonal Fisher Information Matrix Estimators
The Fisher information matrix characterizes the local geometry in the
parameter space of neural networks. It underpins insightful theories and
useful tools for understanding and optimizing neural networks. Given its high
computational cost, practitioners often use random estimators and evaluate only
the diagonal entries. We examine two such estimators, whose accuracy and sample
complexity depend on their associated variances. We derive bounds on the
variances and instantiate them in regression and classification networks. We
navigate trade-offs of both estimators based on analytical and numerical
studies. We find that the variance quantities depend on the non-linearity with
respect to different parameter groups and should not be neglected when
estimating the Fisher information.
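For readers wanting a concrete starting point, the sketch below shows the standard Monte Carlo construction of a diagonal Fisher estimator in PyTorch. The model, the per-example loop, and the choice to sample labels from the network's own predictive distribution are illustrative assumptions; the two estimators analysed in the paper are defined there and may differ from this baseline.

import torch
import torch.nn.functional as F

def diag_fisher_mc(model, inputs, n_samples=1):
    # Monte Carlo estimate of the diagonal Fisher information for a classification
    # network: average squared per-example gradients of the log-likelihood, with
    # labels sampled from the model itself (assumes every parameter is used in the
    # forward pass).
    params = list(model.parameters())
    diag = [torch.zeros_like(p) for p in params]
    count = 0
    for x in inputs:                               # per-example gradients, kept simple
        for _ in range(n_samples):
            logits = model(x.unsqueeze(0))
            y = torch.distributions.Categorical(logits=logits).sample()
            nll = F.cross_entropy(logits, y)
            grads = torch.autograd.grad(nll, params)
            diag = [d + g.detach() ** 2 for d, g in zip(diag, grads)]
            count += 1
    return [d / count for d in diag]

Replacing the sampled label with the observed one gives the so-called empirical Fisher, a related but different quantity.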
Data Preprocessing to Mitigate Bias with Boosted Fair Mollifiers
In a recent paper, Celis et al. (2020) introduced a new approach to fairness
that corrects the data distribution itself. The approach is computationally
appealing, but its approximation guarantees with respect to the target
distribution can be quite loose, as they rely on a (typically limited)
number of constraints on data-based aggregated statistics; this also results in a
fairness guarantee that can be data dependent.
Our paper makes use of a mathematical object recently introduced in privacy
-- mollifiers of distributions -- and a popular approach to machine learning --
boosting -- to obtain an approach in the same lineage as Celis et al. but
without its impediments, offering in particular better accuracy guarantees and
finer fairness guarantees. The approach involves
learning the sufficient statistics of an exponential family. When the training
data is tabular, the sufficient statistics can be defined by decision trees
whose interpretability can provide clues on the source of (un)fairness.
Experiments demonstrate the quality of the results on simulated and real-world
data.
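To make the exponential family statement concrete (notation ours, chosen for illustration only): if q_0 denotes the original data distribution and each boosting round t contributes a decision tree h_t with weight \theta_t, the corrected distribution takes the form

  q_T(x) \;\propto\; q_0(x)\,\exp\!\Big(\sum_{t=1}^{T} \theta_t\, h_t(x)\Big),

so the trees play the role of learned sufficient statistics and the correction is an exponential tilt of the data distribution. The precise fitting criterion and the mollifier-based guarantees are as described in the paper and are not reproduced here.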
UNIPoint: Universally Approximating Point Processes Intensities
Point processes are a useful mathematical tool for describing events over
time, and so there are many recent approaches for representing and learning
them. One notable open question is how to precisely describe the flexibility of
point process models and whether there exists a general model that can
represent all point processes. Our work bridges this gap. Focusing on the
widely used event intensity function representation of point processes, we
provide a proof that a class of learnable functions can universally approximate
any valid intensity function. The proof connects the well-known
Stone-Weierstrass Theorem for function approximation, the uniform density of
non-negative continuous functions using a transfer function, the formulation
of the parameters of a piece-wise continuous function as a dynamic system, and
a recurrent neural network implementation for capturing the dynamics. Using
these insights, we design and implement UNIPoint, a novel neural point process
model, using recurrent neural networks to parameterise sums of basis functions
upon each event. Evaluations on synthetic and real-world datasets show that
this simpler representation performs better than Hawkes process variants and
more complex neural network-based approaches. We expect this result will
provide a practical basis for selecting and tuning models, as well as
furthering theoretical work on representational complexity and learnability.
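A minimal PyTorch sketch of this kind of model is given below; the GRU, the layer sizes, and the softplus transfer function are assumptions chosen for illustration rather than the exact UNIPoint configuration.

import torch
import torch.nn as nn

class UniPointSketch(nn.Module):
    def __init__(self, hidden=32, n_basis=8):
        super().__init__()
        self.rnn = nn.GRU(input_size=1, hidden_size=hidden, batch_first=True)
        self.to_params = nn.Linear(hidden, 2 * n_basis)

    def intensity(self, inter_event_times, elapsed):
        # inter_event_times: (batch, seq, 1) history of gaps; elapsed: time since the last event
        h, _ = self.rnn(inter_event_times)
        a, b = self.to_params(h[:, -1]).chunk(2, dim=-1)       # (batch, n_basis) each
        # a non-negative transfer function (softplus) keeps every basis term,
        # and hence their sum, a valid intensity value
        return nn.functional.softplus(a * elapsed + b).sum(dim=-1)

# usage: UniPointSketch().intensity(torch.rand(4, 10, 1), elapsed=0.3) -> 4 intensity values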
Interval-censored Hawkes processes
Interval-censored data solely records the aggregated counts of events during
specific time intervals - such as the number of patients admitted to the
hospital or the volume of vehicles passing traffic loop detectors - and not the
exact occurrence times of the events. It is currently not well understood how to
fit Hawkes point processes to this kind of data. Their typical loss function (the
point process log-likelihood) cannot be computed without exact event times.
Furthermore, Hawkes processes lack the independent-increments property needed to
use the Poisson likelihood. This work builds a novel point process, a set of tools, and
approximations for fitting Hawkes processes within interval-censored data
scenarios. First, we define the Mean Behavior Poisson process (MBPP), a novel
Poisson process with a direct parameter correspondence to the popular
self-exciting Hawkes process. We fit MBPP in the interval-censored setting
using an interval-censored Poisson log-likelihood (IC-LL). We use the parameter
equivalence to uncover the parameters of the associated Hawkes process. Second,
we introduce two novel exogenous functions to distinguish the exogenous from
the endogenous events. We propose the multi-impulse exogenous function - for
when the exogenous events are observed as event times - and the latent
homogeneous Poisson process exogenous function - for when the exogenous events
are presented as interval-censored volumes. Third, we provide several
approximation methods to estimate the intensity and compensator function of
MBPP when no analytical solution exists. Fourth and finally, we connect the
interval-censored loss of MBPP to a broader class of Bregman divergence-based
functions. Using the connection, we show that the popularity estimation
algorithm Hawkes Intensity Process (HIP) is a particular case of the MBPP. We
verify our models through empirical testing on synthetic data and real-world
data.
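For concreteness, the interval-censored Poisson log-likelihood (IC-LL) referred to above has the generic form sketched below; the intensity is passed in as an assumed callable and integrated numerically, since the MBPP intensity and compensator approximations are the paper's contribution and are not reproduced here.

import numpy as np
from scipy.special import gammaln

def ic_poisson_loglik(counts, bin_edges, intensity_fn, n_grid=200):
    # counts[k]: number of events recorded in [bin_edges[k], bin_edges[k+1])
    # intensity_fn: assumed callable t -> lambda(t); any valid intensity works here
    ll = 0.0
    for c, lo, hi in zip(counts, bin_edges[:-1], bin_edges[1:]):
        t = np.linspace(lo, hi, n_grid)
        big_lambda = np.trapz(intensity_fn(t), t)   # compensator increment over the interval
        ll += c * np.log(big_lambda) - big_lambda - gammaln(c + 1)
    return ll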
Fair Wrapping for Black-box Predictions
We introduce a new family of techniques to post-process ("wrap") a black-box
classifier in order to reduce its bias. Our technique builds on the recent
analysis of improper loss functions whose optimization can correct any twist in
prediction, unfairness being treated as a twist. In the post-processing, we
learn a wrapper function which we define as an α-tree, which modifies
the prediction. We provide two generic boosting algorithms to learn
α-trees. We show that our modification has appealing properties in terms
of composition of α-trees, generalization, interpretability, and KL
divergence between modified and original predictions. We exemplify the use of
our technique in three fairness notions: conditional value-at-risk, equality of
opportunity, and statistical parity; and provide experiments on several readily
available datasets.
Comment: Published in Advances in Neural Information Processing Systems 35
(NeurIPS 2022).
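Only to make the post-processing setting concrete, here is the crudest possible wrapper for one of the three notions above (statistical parity): a per-group additive shift of the black-box logits found by bisection. This is a generic illustration and is not the paper's α-tree construction or its boosting algorithms.

import numpy as np

def parity_shift_wrapper(scores, groups):
    # scores: black-box probabilities in (0, 1); groups: protected attribute per example
    scores, groups = np.asarray(scores, dtype=float), np.asarray(groups)
    logits = np.log(scores) - np.log1p(-scores)
    target = scores.mean()                         # match every group to the overall positive rate
    adjusted = np.empty_like(logits)
    for g in np.unique(groups):
        mask = groups == g
        lo, hi = -10.0, 10.0
        for _ in range(60):                        # bisection on the additive logit shift
            mid = 0.5 * (lo + hi)
            rate = (1.0 / (1.0 + np.exp(-(logits[mask] + mid)))).mean()
            lo, hi = (mid, hi) if rate < target else (lo, mid)
        adjusted[mask] = logits[mask] + 0.5 * (lo + hi)
    return 1.0 / (1.0 + np.exp(-adjusted))         # wrapped predictions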
3D NLTE Lithium abundances for late-type stars in GALAH DR3
Lithium's susceptibility to burning in stellar interiors makes it an
invaluable tracer for delineating the evolutionary pathways of stars, offering
insights into the processes governing their development. Observationally, the
complex Li production and depletion mechanisms in stars manifest themselves as
Li plateaus, and as Li-enhanced and Li-depleted regions of the HR diagram. The
Li-dip represents a narrow range in effective temperature close to the
main-sequence turn-off, where stars have slightly super-solar masses and
strongly depleted Li. To study the modification of Li through stellar
evolution, we measure 3D non-local thermodynamic equilibrium (NLTE) Li
abundances for 581 149 stars released in GALAH DR3. We describe a novel method
that fits the observed spectra using a combination of 3D NLTE Li line profiles
with blending metal line strengths that are optimized on a star-by-star basis.
Furthermore, realistic errors are determined by a Monte Carlo nested sampling
algorithm which samples the posterior distribution of the fitted spectral
parameters. The method is validated by recovering parameters from a synthetic
spectrum and comparing to 26 stars in the Hypatia catalogue. We find 228 613 Li
detections and 352 536 Li upper limits. Our abundance measurements are
generally lower than those of GALAH DR3, with a mean difference of 0.23 dex. For the
first time, we trace the evolution of Li-dip stars beyond the main sequence
turn-off and up the subgiant branch. This is the first 3D NLTE analysis of Li
applied to a large spectroscopic survey, and opens up a new era of precision
analysis of abundances for large surveys.
Comment: 20 pages, 17 figures, accepted for publication in MNRAS.
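As a toy illustration of the error estimation step only: the sketch below fits a two-parameter Gaussian line-plus-blend model with the dynesty nested sampler. The wavelength grid, line parameters, and Gaussian profiles stand in for the actual 3D NLTE profiles and are purely illustrative.

import numpy as np
import dynesty

wave = np.linspace(6707.0, 6709.0, 200)            # toy region around the Li 6708 A feature
truth = 1.0 - 0.4 * np.exp(-0.5 * ((wave - 6707.8) / 0.15) ** 2)
flux = truth + np.random.normal(0.0, 0.01, wave.size)

def model(depth, blend):
    li = depth * np.exp(-0.5 * ((wave - 6707.8) / 0.15) ** 2)
    fe = blend * np.exp(-0.5 * ((wave - 6707.4) / 0.10) ** 2)  # stand-in blending metal line
    return 1.0 - li - fe

def loglike(theta):
    return -0.5 * np.sum(((flux - model(*theta)) / 0.01) ** 2)

def prior_transform(u):
    return u                                        # uniform priors on [0, 1] for both depths

sampler = dynesty.NestedSampler(loglike, prior_transform, ndim=2)
sampler.run_nested(print_progress=False)
res = sampler.results                               # res.samples with res.logwt give the posterior and errors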
Doctoral Seminar on the history of early modern religion & theology (as co-convenors)
status: published