Loss minimization yields multicalibration for large neural networks
Multicalibration is a notion of fairness that aims to provide accurate
predictions across a large set of groups. Multicalibration is known to be a
different goal than loss minimization, even for simple predictors such as
linear functions. In this note, we show that for (almost all) large neural
network sizes, optimally minimizing squared error leads to multicalibration.
Our results are about representational aspects of neural networks, and not
about algorithmic or sample complexity considerations. Previous such results
were known only for predictors that were nearly Bayes-optimal and were
therefore representation independent. We emphasize that our results do not
apply to specific algorithms for optimizing neural networks, such as SGD, and
they should not be interpreted as "fairness comes for free from optimizing
neural networks."
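The notion at stake can be made concrete with a small numerical check. The following sketch (illustrative only, not from the paper; the group masks, binning scheme, and synthetic data are our own choices) estimates the worst average residual over (group, prediction-level) cells, the quantity multicalibration asks to be small:

```python
import numpy as np

def multicalibration_violation(preds, labels, groups, n_bins=10):
    """Largest |mean(labels - preds)| over any (group, prediction-bin) cell --
    a simple empirical proxy for the multicalibration error."""
    bins = np.clip((preds * n_bins).astype(int), 0, n_bins - 1)
    worst = 0.0
    for g in groups:                      # each group: boolean mask over samples
        for b in range(n_bins):
            cell = g & (bins == b)
            if cell.sum() > 0:
                worst = max(worst, abs((labels[cell] - preds[cell]).mean()))
    return worst

rng = np.random.default_rng(0)
x = rng.uniform(size=2000)
y = (rng.uniform(size=2000) < x).astype(float)   # true P(y=1 | x) = x
groups = [x < 0.5, x >= 0.5]
print(multicalibration_violation(x, y, groups))  # small for the Bayes predictor
```

A badly calibrated predictor (e.g. the constant 0.5) produces a much larger violation on the same data, which is the contrast the multicalibration literature cares about.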
Self-Supervised Learning to Prove Equivalence Between Straight-Line Programs via Rewrite Rules
We target the problem of automatically synthesizing proofs of semantic
equivalence between two programs made of sequences of statements. We represent
programs using abstract syntax trees (AST), where a given set of
semantics-preserving rewrite rules can be applied on a specific AST pattern to
generate a transformed and semantically equivalent program. In our system, two
programs are equivalent if there exists a sequence of application of these
rewrite rules that leads to rewriting one program into the other. We propose a
neural network architecture based on a transformer model to generate proofs of
equivalence between program pairs. The system outputs a sequence of rewrites,
and the validity of the sequence is simply checked by verifying it can be
applied. If no valid sequence is produced by the neural network, the system
reports the programs as non-equivalent, ensuring by design no programs may be
incorrectly reported as equivalent. Our system is fully implemented for a given
grammar which can represent straight-line programs with function calls and
multiple types. To efficiently train the system to generate such sequences, we
develop an original incremental training technique, named self-supervised
sample selection. We extensively study the effectiveness of this novel training
approach on proofs of increasing complexity and length. Our system, S4Eq,
achieves 97% proof success on a curated dataset of 10,000 pairs of equivalent
programs. Comment: 30 pages including appendix.
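The verify-only design described above (a proposed proof is merely checked, so no programs can be wrongly reported equivalent) is easy to illustrate. A hypothetical miniature with our own toy rule and program encoding, not the S4Eq grammar:

```python
# Toy version of proof checking: a "proof" is a list of (rule, line) steps;
# we only verify that each step applies and that the result equals the target.
import re

def comm_add(stmt):
    """Rewrite 'v = a + b' into 'v = b + a' (commutativity of +)."""
    m = re.fullmatch(r"(\w+) = (\w+) \+ (\w+)", stmt)
    return f"{m.group(1)} = {m.group(3)} + {m.group(2)}" if m else None

RULES = {"comm_add": comm_add}

def check_proof(src, dst, proof):
    """Apply the proposed rewrite steps to src; True iff we reach dst."""
    prog = list(src)
    for rule_name, i in proof:
        rewritten = RULES[rule_name](prog[i])
        if rewritten is None:          # step does not apply -> invalid proof
            return False
        prog[i] = rewritten
    return prog == list(dst)

p1 = ["x = a + b", "y = x + c"]
p2 = ["x = b + a", "y = c + x"]
print(check_proof(p1, p2, [("comm_add", 0), ("comm_add", 1)]))  # True
```

An invalid or incomplete proof simply fails the check, in which case such a system conservatively reports non-equivalence.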
Countermeasures for the majority attack in blockchain distributed systems
Blockchain technology is considered one of the most important computing paradigms since the Internet, owing to unique characteristics that make it ideal for recording, verifying, and managing information about different transactions. Despite this, Blockchain faces several security problems, one of the most important being the 51% or majority attack, in which one or more miners take control of at least 51% of the hash power or compute in a network, allowing a miner to arbitrarily manipulate and modify the information recorded in this technology. This work focused on designing and implementing strategies for detecting and mitigating majority (51%) attacks in a distributed Blockchain system, based on a characterization of miner behavior. To this end, the hash rate / share of Bitcoin and Ethereum miners was analyzed and evaluated, followed by the design and implementation of a consensus protocol to control the miners' computing power. Subsequently, Machine Learning models were explored and evaluated for detecting Cryptojacking-type malware. Doctorado. Doctor en Ingeniería de Sistemas y Computación.
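A minimal sketch of the detection side of this idea (illustrative only; the pool names, numbers, and threshold policy are our own assumptions, not the thesis's protocol):

```python
# Flag any miner -- or the smallest colluding group of top miners -- whose
# share of the total network hash rate reaches the majority threshold.

def majority_attack_risk(hashrates, threshold=0.51, coalition=False):
    """Return miners whose individual share (or, if coalition=True, the
    smallest top-share group whose combined share) reaches the threshold."""
    total = sum(hashrates.values())
    shares = {m: h / total for m, h in hashrates.items()}
    if not coalition:
        return [m for m, s in shares.items() if s >= threshold]
    flagged, acc = [], 0.0
    for m, s in sorted(shares.items(), key=lambda kv: -kv[1]):
        flagged.append(m)
        acc += s
        if acc >= threshold:
            return flagged
    return []

pools = {"poolA": 30, "poolB": 25, "poolC": 20, "poolD": 25}
print(majority_attack_risk(pools))                  # [] -- no single majority
print(majority_attack_risk(pools, coalition=True))  # smallest risky coalition
```

In practice such a monitor would feed a consensus-level response (as the thesis proposes) rather than merely report the offending pools.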
Projected Multi-Agent Consensus Equilibrium (PMACE) for Distributed Reconstruction with Application to Ptychography
Multi-Agent Consensus Equilibrium (MACE) formulates an inverse imaging
problem as a balance among multiple update agents such as data-fitting terms
and denoisers. However, each such agent operates on a separate copy of the full
image, leading to redundant memory use and slow convergence when each agent
affects only a small subset of the full image. In this paper, we extend MACE to
Projected Multi-Agent Consensus Equilibrium (PMACE), in which each agent
updates only a projected component of the full image, thus greatly reducing
memory use for some applications. We describe PMACE in terms of an equilibrium
problem and an equivalent fixed point problem and show that in most cases the
PMACE equilibrium is not the solution of an optimization problem. To
demonstrate the value of PMACE, we apply it to the problem of ptychography, in
which a sample is reconstructed from the diffraction patterns resulting from
coherent X-ray illumination at multiple overlapping spots. In our PMACE
formulation, each spot corresponds to a separate data-fitting agent, with the
final solution found as an equilibrium among all the agents. Our results
demonstrate that the PMACE reconstruction algorithm generates more accurate
reconstructions at a lower computational cost than existing ptychography
algorithms when the spots are sparsely sampled.
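The equilibrium formulation can be illustrated on the simpler, non-projected MACE setting that the paper extends. A toy sketch with scalar quadratic data-fitting agents (our own example, not the paper's ptychography setup):

```python
# Toy MACE iteration: each agent is the proximal map of a quadratic data-fit
# f_i(v) = (v - y_i)^2 / 2, the consensus operator averages the agents'
# copies, and Mann iteration finds the consensus equilibrium.
import numpy as np

y = np.array([1.0, 2.0, 6.0])        # one scalar "measurement" per agent
sigma2, rho = 0.5, 0.5

def F(w):                            # stacked proximal agents
    return (w + sigma2 * y) / (1.0 + sigma2)

def G(w):                            # consensus: replace every copy by the mean
    return np.full_like(w, w.mean())

w = np.zeros_like(y)
for _ in range(200):                 # Mann iteration on T = (2G - I)(2F - I)
    v = 2.0 * F(w) - w               # reflected agent step
    w = (1.0 - rho) * w + rho * (2.0 * G(v) - v)

print(F(w))                          # all copies agree at mean(y) = 3.0
```

PMACE's twist, per the abstract, is that each agent touches only a projected component of the image rather than a full copy, which this scalar toy does not capture.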
Disjointness Graphs of segments in R^2 are almost all Hamiltonian
Let P be a set of n >= 2 points in general position in R^2. The edge
disjointness graph D(P) of P is the graph whose vertices are all the closed
straight line segments with endpoints in P, two of which are adjacent in D(P)
if and only if they are disjoint. In this note, we give a full characterization
of all those edge disjointness graphs that are hamiltonian. More precisely, we
shall show that (up to order type isomorphism) there are exactly 8 instances of
P for which D(P) is not hamiltonian. Additionally, from one of these 8
instances, we derive a counterexample to a criterion for the existence of
hamiltonian cycles due to A. D. Plotnikov in 1998.
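The definition of D(P) is straightforward to implement for small point sets. An illustrative sketch (our own code, using a standard orientation-based segment-intersection test) that builds D(P) for a 4-point example:

```python
# Build the edge disjointness graph D(P): vertices are all closed segments
# with endpoints in P; two segments are adjacent iff they are disjoint.
from itertools import combinations

def orient(a, b, c):
    return (b[0]-a[0])*(c[1]-a[1]) - (b[1]-a[1])*(c[0]-a[0])

def on_segment(a, b, p):             # p collinear with a,b: does p lie on [a,b]?
    return min(a[0], b[0]) <= p[0] <= max(a[0], b[0]) and \
           min(a[1], b[1]) <= p[1] <= max(a[1], b[1])

def segments_intersect(s, t):
    (a, b), (c, d) = s, t
    o1, o2 = orient(a, b, c), orient(a, b, d)
    o3, o4 = orient(c, d, a), orient(c, d, b)
    if 0 not in (o1, o2, o3, o4) and (o1 > 0) != (o2 > 0) and (o3 > 0) != (o4 > 0):
        return True                  # proper crossing
    return any(o == 0 and on_segment(*pair)      # endpoint / collinear contact
               for o, pair in [(o1, (a, b, c)), (o2, (a, b, d)),
                               (o3, (c, d, a)), (o4, (c, d, b))])

def disjointness_graph(P):
    segs = list(combinations(P, 2))
    edges = [(s, t) for s, t in combinations(segs, 2)
             if not segments_intersect(s, t)]
    return segs, edges

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
segs, edges = disjointness_graph(square)
print(len(segs), len(edges))   # 6 segments; only the 2 opposite-side pairs are disjoint
```

Note that closed segments sharing an endpoint are not disjoint, which is why adjacent sides and the diagonals of the square contribute no edges.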
Fair Assortment Planning
Many online platforms, ranging from online retail stores to social media
platforms, employ algorithms to optimize their offered assortment of items
(e.g., products and contents). These algorithms tend to prioritize the
platforms' short-term goals by solely featuring items with the highest
popularity or revenue. However, this practice can lead to undesirable
outcomes for the remaining items, prompting them to leave the platform and, in
turn, hurting the platform's long-term goals. Motivated by this, we introduce and
study a fair assortment planning problem, which requires any two items with
similar quality/merits to be offered similar outcomes. We show that the problem
can be formulated as a linear program (LP), called (FAIR), that optimizes over
the distribution of all feasible assortments. To find a near-optimal solution
to (FAIR), we propose a framework based on the Ellipsoid method, which requires
a polynomial-time separation oracle to the dual of the LP. We show that finding
an optimal separation oracle to the dual problem is an NP-complete problem, and
hence we propose a series of approximate separation oracles, which then result
in an approximation algorithm and a PTAS for the original Problem (FAIR). The
approximate separation oracles are designed by (i) showing the separation
oracle to the dual of the LP is equivalent to solving an infinite series of
parameterized knapsack problems, and (ii) taking advantage of the structure of
the parameterized knapsack problems. Finally, we conduct a case study using the
MovieLens dataset, which demonstrates the efficacy of our algorithms and
further sheds light on the price of fairness. Comment: 86 pages, 7 figures.
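The approximate oracles above reduce to knapsack computations. As background, a sketch of the classic 0/1 knapsack dynamic program that such parameterized knapsack problems build on (illustrative only; this is not the paper's oracle itself):

```python
# Classic 0/1 knapsack DP over capacities (integer weights assumed).

def knapsack(values, weights, capacity):
    """Max total value of items with total weight <= capacity."""
    best = [0] * (capacity + 1)
    for v, w in zip(values, weights):
        for c in range(capacity, w - 1, -1):   # descending -> each item used once
            best[c] = max(best[c], best[c - w] + v)
    return best[capacity]

print(knapsack(values=[6, 10, 12], weights=[1, 2, 3], capacity=5))  # 22
```

In the paper's setting, the parameterization sweeps a family of such problems, and structural properties of that family are what make approximate separation tractable.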
The Longest Subsequence-Repeated Subsequence Problem
Motivated by computing duplication patterns in sequences, a new fundamental
problem called the longest subsequence-repeated subsequence (LSRS) is proposed.
Given a sequence S of length n, a letter-repeated subsequence is a
subsequence of S of the form x_1^{d_1} x_2^{d_2} ... x_k^{d_k}, where
x_1 x_2 ... x_k is a subsequence of S, consecutive letters x_j and x_{j+1}
are distinct, and every exponent d_i is at least 2. We first present a
polynomial-time algorithm to compute the longest cubic subsequences of all
the substrings of S, improving on the trivial bound. Then, a polynomial-time
algorithm for computing the longest subsequence-repeated subsequence (LSRS)
of S is obtained. Finally, we focus on two variants of this problem. We
first consider the constrained version in which the alphabet size is
unbounded, each letter appears a bounded number of times, and all the
letters of the alphabet must appear in the solution; we show that this
version is NP-hard, via a reduction from a special version of SAT (which is
obtained from 3-COLORING). We then show that when the number of occurrences
of each letter is further restricted, the problem becomes solvable in
polynomial time. Comment: 16 pages, 1 figure.
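The definition can be sanity-checked by brute force on tiny inputs. An illustrative sketch of our own, assuming the runs-of-length-at-least-2 reading of the definition (every maximal run in the chosen subsequence must have length at least 2):

```python
# Exponential-time brute force for the LSRS on tiny strings -- a reference
# implementation of the definition, not the paper's polynomial algorithm.
from itertools import combinations, groupby

def is_letter_repeated(t):
    """True iff t = x1^d1 ... xk^dk with every d_i >= 2 (runs of length >= 2)."""
    return all(len(list(g)) >= 2 for _, g in groupby(t))

def lsrs_bruteforce(s):
    for r in range(len(s), 0, -1):
        for idx in combinations(range(len(s)), r):
            if is_letter_repeated(''.join(s[i] for i in idx)):
                return r
    return 0

print(lsrs_bruteforce("aabbaab"))   # 6 -- e.g. the subsequence "aabbaa"
```

Grouping into maximal runs automatically enforces that consecutive block letters differ, so only the exponent condition needs an explicit check.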
Limit theorems for non-Markovian and fractional processes
This thesis examines various non-Markovian and fractional processes---rough volatility models, stochastic Volterra equations, Wiener chaos expansions---through the prism of asymptotic analysis.
Stochastic Volterra systems serve as a conducive framework encompassing most rough volatility models used in mathematical finance. In Chapter 2, we provide a unified treatment of pathwise large and moderate deviations principles for a general class of multidimensional stochastic Volterra equations with singular kernels, not necessarily of convolution form. Our methodology is based on the weak convergence approach by Budhiraja, Dupuis and Ellis.
This powerful approach also enables us to investigate the pathwise large deviations of families of white noise functionals characterised by their Wiener chaos expansion.
In Chapter 3, we provide sufficient conditions for the large deviations principle to hold in path space, thereby revisiting a problem left open by Pérez-Abreu (1993). Hinging on analysis on Wiener space, the proof involves describing, controlling and identifying the limit of perturbed multiple stochastic integrals.
In Chapter 4, we come back to mathematical finance via the route of Malliavin calculus. We present explicit small-time formulae for the at-the-money implied volatility, skew and curvature in a large class of models, including rough volatility models and their multi-factor versions. Our general setup encompasses both European options on a stock and VIX options. In particular, we develop a detailed analysis of the two-factor rough Bergomi model.
Finally, in Chapter 5, we consider the large-time behaviour of affine stochastic Volterra equations, an under-developed area in the absence of Markovianity.
We leverage a measure-valued Markovian lift introduced by Cuchiero and Teichmann and the associated notion of the generalised Feller property.
This setting allows us to prove the existence of an invariant measure for the lift, and hence of a stationary distribution for the affine Volterra process, featuring in the rough Heston model.
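As background for the rough-volatility setting discussed throughout, here is a minimal sketch of our own (not from the thesis) that samples fractional Brownian motion, the driving noise of rough volatility models, by Cholesky factorisation of its exact covariance R(s,t) = (s^{2H} + t^{2H} - |t-s|^{2H}) / 2:

```python
# Exact (Cholesky-based) simulation of fractional Brownian motion on a grid;
# rough volatility corresponds to a small Hurst index H < 1/2.
import numpy as np

def fbm_paths(H, T=1.0, n=50, n_paths=5, seed=0):
    t = np.linspace(T / n, T, n)                      # grid, excluding t = 0
    s, u = np.meshgrid(t, t)
    cov = 0.5 * (s**(2*H) + u**(2*H) - np.abs(s - u)**(2*H))
    L = np.linalg.cholesky(cov + 1e-10 * np.eye(n))   # tiny jitter for stability
    rng = np.random.default_rng(seed)
    return t, L @ rng.standard_normal((n, n_paths))   # shape (n, n_paths)

t, paths = fbm_paths(H=0.1)     # H = 0.1: the "rough" regime
print(paths.shape)              # (50, 5)
```

The Cholesky scheme is exact but O(n^3); practical rough-volatility simulation typically uses faster approximate schemes, which this sketch does not attempt.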