Character-Aware Neural Language Models
We describe a simple neural language model that relies only on
character-level inputs. Predictions are still made at the word-level. Our model
employs a convolutional neural network (CNN) and a highway network over
characters, whose output is given to a long short-term memory (LSTM) recurrent
neural network language model (RNN-LM). On the English Penn Treebank the model
is on par with the existing state-of-the-art despite having 60% fewer
parameters. On languages with rich morphology (Arabic, Czech, French, German,
Spanish, Russian), the model outperforms word-level/morpheme-level LSTM
baselines, again with fewer parameters. The results suggest that on many
languages, character inputs are sufficient for language modeling. Analysis of
word representations obtained from the character composition part of the model
reveals that the model is able to encode, from characters only, both semantic
and orthographic information.
Comment: AAAI 201
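The highway component mentioned above can be sketched in a few lines. This is an illustrative pure-Python version with made-up weights and dimensions, not the paper's implementation; in the model it is applied to the output of the CNN over characters before the LSTM. The key property is the learned gate that mixes a nonlinear transform of the input with the input itself:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    return max(0.0, x)

def highway_layer(x, w_h, b_h, w_t, b_t):
    """One highway layer over a feature vector x (a plain list of floats).

    t = sigmoid(W_T x + b_T) is the transform gate; the output is
    y = t * relu(W_H x + b_H) + (1 - t) * x. Input and output dimensions
    match, so layers can be stacked.
    """
    def matvec(w, v, b):
        return [sum(wi * vi for wi, vi in zip(row, v)) + bi
                for row, bi in zip(w, b)]

    h = [relu(v) for v in matvec(w_h, x, b_h)]     # candidate transform
    t = [sigmoid(v) for v in matvec(w_t, x, b_t)]  # transform gate in (0, 1)
    return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]

# Toy example: with a strongly negative gate bias the layer mostly carries
# the input through unchanged (the "carry" behaviour highway layers allow).
x = [0.5, -1.0, 2.0]
w = [[0.0] * 3 for _ in range(3)]
y = highway_layer(x, w, [0.0] * 3, w, [-10.0] * 3)
```

The carry path is what lets the network pass character-level features through untouched when the transform is unhelpful, which is useful when stacking such layers between the CNN and the LSTM.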
The random subgraph model for the analysis of an ecclesiastical network in Merovingian Gaul
In the last two decades many random graph models have been proposed to
extract knowledge from networks. Most of them look for communities or, more
generally, clusters of vertices with homogeneous connection profiles. While the
first models focused on networks with binary edges only, extensions now make
it possible to deal with valued networks. Recently, new models were also introduced in
order to characterize connection patterns in networks through mixed
memberships. This work was motivated by the need to analyze a historical
network where a partition of the vertices is given and where edges are typed. A
known partition is seen as a decomposition of a network into subgraphs that we
propose to model using a stochastic model with unknown latent clusters. Each
subgraph has its own mixing vector, and its vertices are associated with the
clusters. The vertices then connect with a probability depending on the
subgraphs only, while the types of edges are assumed to be sampled from the
latent clusters. A variational Bayes expectation-maximization algorithm is
proposed for inference as well as a model selection criterion for the
estimation of the cluster number. Experiments are carried out on simulated data
to assess the approach. The proposed methodology is then applied to an
ecclesiastical network in Merovingian Gaul. An R package, Rambo, implementing
the inference algorithm is available from the authors upon request.
Comment: Published at http://dx.doi.org/10.1214/13-AOAS691 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
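The generative process described above (known subgraphs, latent clusters, subgraph-dependent edge presence, cluster-dependent edge types) can be sketched as a toy sampler. All names and the simplified parameterization here are illustrative assumptions, not the authors' exact specification:

```python
import random

def sample_rsm(partition, cluster_probs, alpha, edge_type_probs, seed=0):
    """Toy sampler for a random-subgraph-style model.

    partition       : dict vertex -> known subgraph label
    cluster_probs   : dict subgraph -> mixing vector over K latent clusters
    alpha           : dict (subgraph_u, subgraph_v) -> edge probability
    edge_type_probs : dict (cluster_u, cluster_v) -> distribution over types
    """
    rng = random.Random(seed)
    K = len(next(iter(cluster_probs.values())))
    # Each vertex draws a latent cluster from its subgraph's mixing vector.
    cluster = {v: rng.choices(range(K), weights=cluster_probs[partition[v]])[0]
               for v in partition}
    edges = []
    for u in partition:
        for v in partition:
            if u == v:
                continue
            # Edge presence depends only on the pair of subgraphs...
            if rng.random() < alpha[(partition[u], partition[v])]:
                # ...while the edge type is sampled from the latent clusters.
                dist = edge_type_probs[(cluster[u], cluster[v])]
                t = rng.choices(range(len(dist)), weights=dist)[0]
                edges.append((u, v, t))
    return cluster, edges

# Minimal usage: two subgraphs, two latent clusters, two edge types.
partition = {"a": "S1", "b": "S1", "c": "S2"}
mix = {"S1": [1.0, 0.0], "S2": [0.0, 1.0]}
alpha = {(s, r): 1.0 for s in ("S1", "S2") for r in ("S1", "S2")}
types = {(k, l): [0.5, 0.5] for k in range(2) for l in range(2)}
cluster, edges = sample_rsm(partition, mix, alpha, types)
```

Inference in the paper runs in the opposite direction: given the observed typed edges and the known partition, the variational Bayes EM algorithm recovers the latent cluster assignments and model parameters.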
Stronger Together: on the Articulation of Ethical Charters, Legal Tools, and Technical Documentation in ML
The growing need for accountability of the people behind AI systems can be
addressed by leveraging processes in three fields of study: ethics, law, and
computer science. While these fields are often considered in isolation, they
rely on complementary notions in their interpretation and implementation. In
this work, we detail this interdependence and motivate the necessary role of
collaborative governance tools in shaping a positive evolution of AI. We first
contrast notions of compliance in the ethical, legal, and technical fields; we
outline both their differences and where they complement each other, with a
particular focus on the roles of ethical charters, licenses, and technical
documentation in these interactions. We then focus on the role of values in
articulating the synergies between the fields and outline specific mechanisms
of interaction between them in practice. We identify how these mechanisms have
played out in several open governance fora: an open collaborative workshop, a
responsible licensing initiative, and a proposed regulatory framework. By
leveraging complementary notions of compliance in these three domains, we can
create a more comprehensive framework for governing AI systems that jointly
takes into account their technical capabilities, their impact on society, and
how technical specifications can inform relevant regulations. Our analysis
thus underlines the need to consider the ethical, legal, and technical
dimensions jointly in AI ethics frameworks intended to govern AI systems at
scale, and shows how the thinking in each of these areas can inform the others.
Stable Bias: Analyzing Societal Representations in Diffusion Models
As machine learning-enabled Text-to-Image (TTI) systems are becoming
increasingly prevalent and seeing growing adoption as commercial services,
characterizing the social biases they exhibit is a necessary first step to
lowering their risk of discriminatory outcomes. This evaluation, however, is
made more difficult by the synthetic nature of these systems' outputs: common
definitions of diversity are grounded in social categories of people living in
the world, whereas the artificial depictions of fictive humans created by these
systems have no inherent gender or ethnicity. To address this need, we propose
a new method for exploring the social biases in TTI systems. Our approach
relies on characterizing the variation in generated images triggered by
enumerating gender and ethnicity markers in the prompts, and comparing it to
the variation engendered by spanning different professions. This allows us to
(1) identify specific bias trends, (2) provide targeted scores to directly
compare models in terms of diversity and representation, and (3) jointly model
interdependent social variables to support a multidimensional analysis. We
leverage this method to analyze images generated by 3 popular TTI systems
(Dall-E 2, Stable Diffusion v1.4 and v2) and find that while all of their
outputs show correlations with US labor demographics, they also consistently
under-represent marginalized identities to different extents. We also release
the datasets and low-code interactive bias exploration platforms developed for
this work, as well as the necessary tools to similarly evaluate additional TTI
systems.
Comment: Accepted to NeurIPS Datasets and Benchmarks 2023 (spotlight)
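The core of the probing method above is enumerating a grid of prompts, varying identity markers along one axis and professions along the other, then comparing the variation in generated images along each axis. A minimal sketch of that enumeration (the marker and profession lists and the prompt template are illustrative assumptions, not the paper's actual sets):

```python
from itertools import product

# Illustrative lists; the evaluation in the paper uses its own marker
# and profession sets.
identity_markers = ["woman", "man", "non-binary person",
                    "Black woman", "Asian man"]
professions = ["CEO", "nurse", "software developer", "janitor"]

def build_prompts(markers, professions,
                  template="a photo of a {marker} {profession}"):
    """Enumerate the full marker x profession prompt grid.

    Each generated image is attributed to one cell of the grid, so image
    variation can be compared across markers (for a fixed profession) and
    across professions (for a fixed marker).
    """
    return [template.format(marker=m, profession=p)
            for m, p in product(markers, professions)]

prompts = build_prompts(identity_markers, professions)
```

Feeding every prompt in the grid to each TTI system under study yields directly comparable image sets, which is what enables the targeted diversity and representation scores mentioned above.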
Towards Openness Beyond Open Access: User Journeys through 3 Open AI Collaboratives
Open source Artificial Intelligence (AI) collaboratives offer alternative
pathways for how AI can be developed beyond well-resourced technology
companies, and for who can be a part of the process. To understand how
and why they work and what additionality they bring to the landscape, we focus
on three such communities, each focused on a different kind of activity around
AI: building models (BigScience workshop), tools and ways of working (The
Turing Way), and ecosystems (Mozilla Festival's Building Trustworthy AI Working
Group). First, we document the community structures that facilitate these
distributed, volunteer-led teams, comparing the collaboration styles that drive
each group towards their specific goals. Through interviews with community
leaders, we map user journeys for how members discover, join, contribute, and
participate. Ultimately, this paper aims to highlight the diversity of AI work
and workers that have come forth through these collaborations and how they
offer a broader practice of openness to the AI space.
Comment: Presented at the 2022 NeurIPS Workshop on Broadening Research
Collaborations in M
- …