34 research outputs found
Pure Exploration in Infinitely-Armed Bandit Models with Fixed-Confidence
We consider the problem of near-optimal arm identification in the fixed confidence setting of the infinitely armed bandit problem when nothing is known about the arm reservoir distribution. We (1) introduce a PAC-like framework within which to derive and cast results; (2) derive a sample complexity lower bound for near-optimal arm identification; (3) propose an algorithm that identifies a nearly-optimal arm with high probability and derive an upper bound on its sample complexity which is within a log factor of our lower bound; and (4) discuss whether our log^2(1/delta) dependence is inescapable for "two-phase" (select arms first, identify the best later) algorithms in the infinite setting. This work permits the application of bandit models to a broader class of problems where fewer assumptions hold.
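As a rough illustration of the "two-phase" structure mentioned in the abstract (select arms first, identify the best later), the following Python sketch draws a fixed number of arms from a reservoir and then pulls each one equally often. It is not the algorithm proposed in the paper: the names `two_phase_best_arm`, `draw_arm`, and `uniform_bernoulli_reservoir`, the uniform reservoir, and the fixed pull budget are all assumptions made for this example, and no confidence guarantee is claimed for it.

```python
import random
from typing import Callable, List, Tuple

# An arm maps a source of randomness to a single reward.
Arm = Callable[[random.Random], float]


def two_phase_best_arm(
    draw_arm: Callable[[random.Random], Arm],
    n_arms: int,
    pulls_per_arm: int,
    rng: random.Random,
) -> Tuple[int, List[float]]:
    """Return the index of the empirically best arm and all empirical means."""
    # Phase 1: select arms by sampling them from the (unknown) reservoir.
    arms = [draw_arm(rng) for _ in range(n_arms)]
    # Phase 2: pull every selected arm equally often and compare empirical means.
    means = [
        sum(arm(rng) for _ in range(pulls_per_arm)) / pulls_per_arm
        for arm in arms
    ]
    best = max(range(n_arms), key=lambda i: means[i])
    return best, means


def uniform_bernoulli_reservoir(rng: random.Random) -> Arm:
    """Hypothetical reservoir: Bernoulli arms with means uniform on [0, 1]."""
    p = rng.random()
    return lambda r: 1.0 if r.random() < p else 0.0


if __name__ == "__main__":
    best, means = two_phase_best_arm(
        uniform_bernoulli_reservoir, n_arms=20, pulls_per_arm=200, rng=random.Random(0)
    )
    print(best, round(means[best], 3))
```

In a fixed-confidence version, `n_arms` and `pulls_per_arm` would be chosen as functions of the target accuracy and the confidence parameter delta; here they are left as free parameters.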
On Similarity Prediction and Pairwise Clustering
We consider the problem of clustering a finite set of items from pairwise similarity information. Unlike what is done in the literature on this subject, we do so in a passive learning setting, and with no specific constraints on the cluster shapes other than their size. We investigate the problem in two settings: (i) an online setting, where we provide a tight characterization of the prediction complexity in the mistake bound model, and (ii) a standard stochastic batch setting, where we give tight upper and lower bounds on the achievable generalization error. Prediction performance is measured both in terms of the ability to recover the similarity function encoding the hidden clustering and in terms of how well we classify each item within the set. The proposed algorithms are time efficient.
Siblings Overseas. Foundational landscape, law, land distribution, and urban form in 16th-century Spanish colonial cities. Three cases of new towns in Jaen (Spain), Nueva Granada (Colombia) and Cuyo (Argentina).
The PhD project Siblings Overseas aims to contribute to the global urban
history of Hispanic grid cities, building connections between practices, morphologies,
and ideas from both shores of the Atlantic Ocean. This line of research
has its precedent in the previous work Granada Des-Granada, published in
Colombia in 2018 (Ed. Uniandes), which offered a survey of Muslim medinas and
the evolution of Christian grid cities between the 11th and 15th centuries.
Siblings Overseas takes over where Granada Des-Granada ended and focuses
on grid cities founded in Spanish domains during the early modern period. After
the first fortified settlements in the American coastline were created, the 16th century
brought diverse transformations to Spanish colonial new towns in the Iberian
Peninsula, the Mediterranean, and the American frontier. Urban laws and foundational
acts gained relevance, shifting the main urban efforts in America from fortified
positions in the early 1500s to open grid cities in the 1530s. Although ample
literature has studied this phenomenon in America, its presence in Europe and the Mediterranean
has received less attention. Spanish archives conserve original 16th-century
settlement books and logs of several cities founded in the Iberian south and the
former Andalusian frontier, which have been studied and transcribed by local historians
who have pointed out their resemblance to their American sisters. Yet no comparative
analysis has been developed along these lines, keeping these "Andalusian colonies"
away from international historiography.
The objective of this dissertation is to present an in-depth comparative study
of European and American urban plantation protocols, focusing on unfortified new
towns whose foundational processes evolved during the 16th century.
The general hypothesis is that Spanish practices for the plantation of cities in
Europe and America present a set of shared aspects based on their common frame
of laws, institutions, agents, and beliefs. These elements were in constant evolution
on both shores of the Atlantic due to their dynamic socio-political situation. Their
similarities and differences have been studied and evidenced through the analysis
of primary written sources, historical cartographies, and detailed foundational records.
The urban grid is the most visible of these cities’ traits, even an archetypical
one; but it did not operate by itself. The evidence presented in Siblings Overseas
shows that there was no pre-established model for all these new towns around the
global Spanish Empire, but a shared set of urban protocols organically applied in
diverse contexts.
The leading case study in this project is the foundational process of four
new towns in Sierra Sur de Jaen (Andalusia), which took place between 1508 and
1539 and includes the settlements of Mancha Real, Valdepeñas de Jaén, Los Villares,
and Campillo de Arenas. Sierra Sur was the main friction point between the
kingdoms of Jaen and Granada during the last centuries of the Reconquista, making
it a strategic territory for colonization after the Granada War (1482-92). Available
primary sources are mainly written documents: instructions for founding agents,
judicial processes, lawsuits over land rights, independence privileges, etc. Only one
of the four foundational plans survives, but it is well conserved and shows with precision
the layout of streets and the distribution of urban parcels.
American cases include two cities, both influenced by urban principles stated
in the Laws of the Indies. This legal body gathers edicts from the early 16th century
until its publication in 1681, each with its respective date and ordering monarch.
Its analysis shows how laws enacted by the Catholic Monarchs, Juana
I, Charles V, and Philip II recommend the same principles and rules for America as
those applied in Sierra Sur. However, official records and foundational plans of
most early Spanish colonial settlements have not survived. The oldest surviving partition plan
of an American foundation is that of Mendoza, the first Spanish city in
the province of Cuyo (1561-2), originally under the jurisdiction of Capitanía General
de Chile and later included in the Viceroyalty of La Plata (Argentina). Mendoza
was founded in two acts, with plans and written records conserved for each of them
at the Archivo General de Indias (Seville). The second American case is Villa de
Leyva in the Kingdom of New Granada (Colombia), first founded in 1572 and
then moved in 1582. The foundational acts conserved for this city are some of the
oldest in Colombia and South America. Villa de Leyva depended on Tunja's jurisdiction,
forty kilometers away, in the same manner that Sierra Sur's new towns were
under the authority of Jaen.
Downscaling Climate Change Impacts, Socio-Economic Implications and Alternative Adaptation Pathways for Islands and Outermost Regions
This book provides a comprehensive overview of the future scenarios of climate change and management concerns associated with climate change impacts on the blue economy of European islands and outermost regions. The publication collects major findings of the SOCLIMPACT project's research outcomes, aiming to raise social awareness among policy-makers and industry about climate change consequences at local level, and provide knowledge-based information to support policy design, from local to national level. This comprehensive book will also assist students, scholars and practitioners to understand, conceptualize and effectively and responsibly manage climate change information and applied research. This book provides invaluable material for Blue Growth Management, theory and application, at all levels. This first edition includes up-to-date data, statistics, references, case material and figures of the 12 island case studies. "Downscaling climate change impacts, socio-economic implications and alternative adaptation pathways for Islands and Outermost Regions" is a must-read book, given the accessible style and breadth and depth with which the topic is dealt. The book is an up-to-date synthesis of key knowledge on this area, written by a multidisciplinary group of experts on climate and economic modelling, and policy design.
Operationalizing fairness for responsible machine learning
As machine learning (ML) is increasingly used for decision making in scenarios that impact humans, there is a growing awareness of its potential for unfairness. A large body of recent work has focused on proposing formal notions of fairness in ML, as well as approaches to mitigate unfairness. However, there is a growing disconnect between the ML fairness literature and the need to operationalize fairness in practice. This thesis addresses the need for responsible ML by developing new models and methods to address challenges in operationalizing fairness in practice. Specifically, it makes the following contributions.
First, we tackle a key assumption in the group fairness literature that sensitive demographic attributes such as race and gender are known upfront, and can be readily used in model training to mitigate unfairness. In practice, factors like privacy and regulation often prohibit ML models from collecting or using protected attributes in decision making. To address this challenge we introduce the novel notion of computationally-identifiable errors and propose Adversarially Reweighted Learning (ARL), an optimization method that seeks to improve the worst-case performance over unobserved groups, without requiring access to the protected attributes in the dataset. Second, we argue that while group fairness notions are a desirable fairness criterion, they are fundamentally limited as they reduce fairness to an average statistic over pre-identified protected groups. In practice, automated decisions are made at an individual level, and can adversely impact individual people irrespective of the group statistic. We advance the paradigm of individual fairness by proposing iFair (individually fair representations), an optimization approach for learning a low dimensional latent representation of the data with two goals: to encode the data as well as possible, while removing any information about protected attributes in the transformed representation. Third, we advance the individual fairness paradigm, which requires that similar individuals receive similar outcomes. However, similarity metrics computed over observed feature space can be brittle, and inherently limited in their ability to accurately capture similarity between individuals. To address this, we introduce a novel notion of fairness graphs, wherein pairs of individuals can be identified as similar with respect to the ML objective. We cast the problem of individual fairness into graph embedding, and propose PFR (pairwise fair representations), a method to learn a unified pairwise fair representation of the data.
Fourth, we tackle the challenge that production data after model deployment is constantly evolving. As a consequence, in spite of the best efforts in training a fair model, ML systems can be prone to failure risks due to a variety of unforeseen reasons. To ensure responsible model deployment, potential failure risks need to be predicted, and mitigation actions need to be devised, for example, deferring to a human expert when uncertain or collecting additional data to address the model's blind spots. We propose Risk Advisor, a model-agnostic meta-learner to predict potential failure risks and to give guidance on the sources of uncertainty inducing the risks, by leveraging information-theoretic notions of aleatoric and epistemic uncertainty. This dissertation brings ML fairness closer to real-world applications by developing methods that address key practical challenges. Extensive experiments on a variety of real-world and synthetic datasets show that our proposed methods are viable in practice.
The International Max Planck Research School for Computer Science (IMPRS-CS
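The fairness thesis abstract above does not spell out how Risk Advisor computes aleatoric and epistemic uncertainty. One common information-theoretic decomposition, shown here only as a sketch and not as the thesis's method, works on the class probabilities predicted by an ensemble: total uncertainty is the entropy of the averaged prediction, aleatoric uncertainty is the average entropy of the individual predictions, and epistemic uncertainty is the difference between the two. The function names and the ensemble input format are assumptions made for this example.

```python
import math
from typing import List, Sequence, Tuple


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy in nats; zero-probability classes contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def decompose_uncertainty(
    member_probs: List[Sequence[float]],
) -> Tuple[float, float, float]:
    """Return (total, aleatoric, epistemic) uncertainty for one input.

    member_probs[m][k] is the probability that ensemble member m assigns
    to class k (hypothetical input format).
    """
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    # Average the members' predictive distributions class by class.
    mean_probs = [
        sum(member[k] for member in member_probs) / n_members
        for k in range(n_classes)
    ]
    total = entropy(mean_probs)
    aleatoric = sum(entropy(member) for member in member_probs) / n_members
    epistemic = total - aleatoric  # disagreement between members
    return total, aleatoric, epistemic


if __name__ == "__main__":
    # Members agree that the outcome is a coin flip: purely aleatoric.
    print(decompose_uncertainty([[0.5, 0.5], [0.5, 0.5]]))
    # Members are each certain but contradict each other: purely epistemic.
    print(decompose_uncertainty([[1.0, 0.0], [0.0, 1.0]]))
```

Under this decomposition, high epistemic uncertainty suggests collecting more data, while high aleatoric uncertainty suggests deferring to a human expert, which matches the two mitigation actions named in the abstract.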
Stability is Stable: Connections between Replicability, Privacy, and Adaptive Generalization
The notion of replicable algorithms was introduced in Impagliazzo et al.
[STOC '22] to describe randomized algorithms that are stable under the
resampling of their inputs. More precisely, a replicable algorithm gives the
same output with high probability when its randomness is fixed and it is run on
a new i.i.d. sample drawn from the same distribution. Using replicable
algorithms for data analysis can facilitate the verification of published
results by ensuring that the results of an analysis will be the same with high
probability, even when that analysis is performed on a new data set.
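The definition above can be made concrete with a small Python sketch. It uses a standard illustration from the replicability literature, rounding an empirical mean to a randomly shifted grid, and is not one of the reductions developed in this paper. The function name `replicable_mean`, the `seed` argument standing in for the fixed internal randomness, and the grid width are assumptions made for this example.

```python
import random


def replicable_mean(sample, seed, grid_width=0.1):
    """Estimate a mean and round it to a randomly shifted grid.

    `seed` plays the role of the algorithm's internal randomness, which is
    held fixed across runs; only the sample changes between runs.
    """
    # The grid offset depends only on the shared randomness, not on the data.
    offset = random.Random(seed).uniform(0.0, grid_width)
    mean = sum(sample) / len(sample)
    # Index of the grid cell [offset + c*w, offset + (c+1)*w) containing the mean.
    cell = (mean - offset) // grid_width
    # Output the centre of that cell.
    return offset + (cell + 0.5) * grid_width


if __name__ == "__main__":
    data_rng = random.Random(123)
    first = [data_rng.gauss(0.5, 1.0) for _ in range(20000)]
    second = [data_rng.gauss(0.5, 1.0) for _ in range(20000)]
    # Same seed, two independent samples from the same distribution.
    print(replicable_mean(first, seed=7), replicable_mean(second, seed=7))
```

Two runs disagree only when a grid boundary falls between the two empirical means. Because the offset is uniform, that happens with probability roughly equal to the distance between the means divided by `grid_width`, so a larger sample or a coarser grid makes identical outputs more likely, at the cost of up to `grid_width / 2` rounding error.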
In this work, we establish new connections and separations between
replicability and standard notions of algorithmic stability. In particular, we
give sample-efficient algorithmic reductions between perfect generalization,
approximate differential privacy, and replicability for a broad class of
statistical problems. Conversely, we show any such equivalence must break down
computationally: there exist statistical problems that are easy under
differential privacy, but that cannot be solved replicably without breaking
public-key cryptography. Furthermore, these results are tight: our reductions
are statistically optimal, and we show that any computational separation
between DP and replicability must imply the existence of one-way functions.
Our statistical reductions give a new algorithmic framework for translating
between notions of stability, which we instantiate to answer several open
questions in replicability and privacy. This includes giving sample-efficient
replicable algorithms for various PAC learning, distribution estimation, and
distribution testing problems, algorithmic amplification of δ in
approximate DP, conversions from item-level to user-level privacy, and the
existence of private agnostic-to-realizable learning reductions under
structured distributions.
Comment: STOC 2023, minor typos fixed