Open Platforms for Artificial Intelligence for Social Good: Common Patterns as a Pathway to True Impact
The AI for social good movement has now reached a state in which a large
number of one-off demonstrations have illustrated that partnerships of AI
practitioners and social change organizations are possible and can address
problems faced in sustainable development. In this paper, we discuss how moving
from demonstrations to true impact on humanity will require a different course
of action, namely open platforms containing foundational AI capabilities to
support common needs of multiple organizations working in similar topical
areas. We lend credence to this proposal by describing three example patterns
of social good problems and their AI-based solutions: natural language
processing for making sense of international development reports, causal
inference for providing guidance to vulnerable individuals, and
discrimination-aware classification for supporting unbiased allocation
decisions. We argue that the development of such platforms will be possible
through convenings of social change organizations, AI companies, and
grantmaking foundations.
Comment: appearing at the 2019 ICML AI for Social Good Workshop.
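The third pattern named above, discrimination-aware classification, has several well-known preprocessing instances. As a minimal sketch (not necessarily the method the paper's platform would use), here is reweighing: each training instance gets a weight that makes the protected group statistically independent of the label, so a downstream classifier trained on the weighted data is pushed toward unbiased allocation decisions. The toy group/label data is hypothetical.

```python
from collections import Counter

def reweighing_weights(groups, labels):
    """Instance weights that decouple group from label (classic reweighing).

    Each instance with protected group g and label y gets weight
    P(g) * P(y) / P(g, y); under these weights the joint distribution
    factorizes, giving statistical parity before any model is trained.
    """
    n = len(labels)
    p_g = Counter(groups)                 # marginal group counts
    p_y = Counter(labels)                 # marginal label counts
    p_gy = Counter(zip(groups, labels))   # joint counts
    return [
        (p_g[g] / n) * (p_y[y] / n) / (p_gy[(g, y)] / n)
        for g, y in zip(groups, labels)
    ]

# Toy data: group "a" receives the positive label more often than "b".
groups = ["a", "a", "a", "b", "b", "b"]
labels = [1, 1, 0, 0, 0, 1]
weights = reweighing_weights(groups, labels)
```

Under-represented combinations (here, positives in group "b") are up-weighted, over-represented ones down-weighted, so the weighted positive rate is equal across groups.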
A Method For Color Naming And Description Of Color Composition In Images
Color is one of the main visual cues and has been frequently used in image processing, analysis and retrieval. The extraction of high-level color descriptors is an increasingly important problem, as these descriptions often provide a link to image content. When combined with image segmentation, color naming can be used to select objects by color, describe the appearance of the image and even generate semantic annotations. For example, regions labeled as light blue and strong green may represent sky and grass, vivid colors are typically found in man-made objects, and modifiers such as brownish, grayish and dark convey the impression of the atmosphere in the scene. This paper presents a computational model for color categorization, naming and extraction of color composition. We start from the National Bureau of Standards' recommendation for color names [4] and, through subjective experiments, develop our color vocabulary and syntax. Next, to attach a color name to an arbitrary input color, we design a perceptually based color naming metric. Finally, we extend the method and develop a scheme for extracting the color composition of a complex image. The algorithm follows relevant neurophysiological findings and studies on human color categorization. In testing the method, known color regions in different color spaces were identified accurately, the color names assigned to randomly selected colors agreed with human judgments, and the color composition extracted from natural images was consistent with human observations.
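The structure of such a method (assign each color to the nearest prototype under a distance, then tally names over an image) can be sketched as follows. This is a simplification: the prototype vocabulary below is hypothetical, and plain Euclidean RGB distance stands in for the paper's perceptually based metric, which would operate in a perceptual space such as CIELAB.

```python
import math

# Tiny illustrative vocabulary; the paper derives a much richer
# vocabulary and syntax from subjective experiments.
PROTOTYPES = {
    "red":   (255, 0, 0),
    "green": (0, 128, 0),
    "blue":  (0, 0, 255),
    "white": (255, 255, 255),
    "black": (0, 0, 0),
    "gray":  (128, 128, 128),
}

def name_color(rgb, prototypes=PROTOTYPES):
    """Assign the name of the nearest prototype color (toy metric)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(prototypes, key=lambda name: dist(rgb, prototypes[name]))

def color_composition(pixels):
    """Fraction of pixels assigned to each color name."""
    counts = {}
    for px in pixels:
        name = name_color(px)
        counts[name] = counts.get(name, 0) + 1
    total = len(pixels)
    return {name: c / total for name, c in counts.items()}
```

Applied to a segmented region, the composition dictionary is exactly the kind of high-level descriptor the abstract describes (e.g. a region that is mostly "blue" may be labeled sky).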
Teaching machines to understand data science code by semantic enrichment of dataflow graphs
Your computer is continuously executing programs, but does it really
understand them? Not in any meaningful sense. That burden falls upon human
knowledge workers, who are increasingly asked to write and understand code.
They deserve to have intelligent tools that reveal the connections between code
and its subject matter. Towards this prospect, we develop an AI system that
forms semantic representations of computer programs, using techniques from
knowledge representation and program analysis. To create the representations,
we introduce an algorithm for enriching dataflow graphs with semantic
information. The semantic enrichment algorithm is undergirded by a new ontology
language for modeling computer programs and a new ontology about data science,
written in this language. Throughout the paper, we focus on code written by
data scientists and we locate our work within a larger movement towards
collaborative, open, and reproducible science.
Comment: 33 pages. Significantly expanded from previous version.
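A heavily simplified caricature of semantic enrichment: extract program elements and annotate them with concepts from an ontology. The mini "ontology" below is a hypothetical lookup table, and only call sites are collected; the paper's method builds genuine dataflow graphs and uses a dedicated ontology language.

```python
import ast

# Hypothetical mini "ontology" mapping function names to data-science
# concepts; a stand-in for the paper's ontology about data science.
ONTOLOGY = {
    "read_csv": "data-ingestion",
    "fit": "model-training",
    "predict": "model-inference",
}

def enrich(source):
    """Annotate each function call in the source with a concept.

    Real semantic enrichment also tracks how values flow between
    calls; here we only collect the calls and look each one up.
    """
    annotations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = (node.func.attr if isinstance(node.func, ast.Attribute)
                    else getattr(node.func, "id", "?"))
            annotations.append((name, ONTOLOGY.get(name, "unknown")))
    return annotations

code = "df = pd.read_csv('data.csv')\nmodel.fit(df)\nys = model.predict(df)"
```

Even this shallow version turns opaque code into a machine-readable summary of what the data scientist is doing, which is the prospect the abstract points toward.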
Trust and Transparency in Contact Tracing Applications
The global outbreak of COVID-19 has led to focus on efforts to manage and
mitigate the continued spread of the disease. One of these efforts is the
use of contact tracing to identify people who are at risk of developing the
disease through exposure to an infected person. Historically, contact tracing
has been primarily manual but given the exponential spread of the virus that
causes COVID-19, there has been significant interest in the development and use
of digital contact tracing solutions to supplement the work of human contact
tracers. The collection and use of sensitive personal details by these
applications have led to a number of concerns among stakeholder groups with a
vested interest in these solutions. We explore digital contact tracing
solutions in detail and propose the use of a transparent reporting mechanism,
FactSheets, to provide transparency of and support trust in these applications.
We also provide an example FactSheet template with questions that are specific
to the contact tracing application domain.
Comment: 9 pages.
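As an illustration of what such a domain-specific FactSheet could look like in machine-readable form, here is a hypothetical template: the field names and question wording are invented for this sketch, not taken from the paper's actual template.

```python
# Hypothetical FactSheet template for a contact tracing application.
FACTSHEET_TEMPLATE = {
    "purpose": "What is the intended use of this application?",
    "data_collected": "Which personal data (location, Bluetooth proximity, "
                      "health status) are collected?",
    "data_retention": "How long are contact records kept, and who can access them?",
    "consent": "How do users opt in, and can they revoke consent?",
    "accuracy": "How are false exposure notifications measured and reported?",
    "security": "How are records protected in transit and at rest?",
}

def render_factsheet(answers, template=FACTSHEET_TEMPLATE):
    """Pair each template question with the supplier's answer,
    flagging anything left unanswered."""
    return {
        field: {"question": q, "answer": answers.get(field, "NOT PROVIDED")}
        for field, q in template.items()
    }
```

Rendering a FactSheet with unanswered fields left visibly flagged is one simple way a transparent reporting mechanism can support the trust the abstract argues for.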
PepCVAE: Semi-Supervised Targeted Design of Antimicrobial Peptide Sequences
Given the emerging global threat of antimicrobial resistance, new methods for
next-generation antimicrobial design are urgently needed. We report a peptide
generation framework PepCVAE, based on a semi-supervised variational
autoencoder (VAE) model, for designing novel antimicrobial peptide (AMP)
sequences. Our model learns a rich latent space of the biological peptide
context by taking advantage of abundant, unlabeled peptide sequences. The model
further learns a disentangled antimicrobial attribute space by using the
feedback from a jointly trained AMP classifier that uses limited labeled
instances. The disentangled representation allows for controllable generation
of AMPs. Extensive analysis of the PepCVAE-generated sequences reveals superior
performance of our model in comparison to a plain VAE, as PepCVAE generates
novel AMP sequences with higher long-range diversity, while being closer to the
training distribution of biological peptides. These features are highly desired
in next-generation antimicrobial design.
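The core idea of a disentangled latent space, that one part of the code controls content while a separate attribute code controls antimicrobial character, can be caricatured without any neural network. In this toy sketch the decoder, the "KR" motif, and the seeding scheme are all invented for illustration; they only mimic the interface of PepCVAE's controllable generation, not its learning.

```python
import random

AMINO = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

def toy_decoder(z, c, length=10):
    """Toy stand-in for a disentangled decoder: the content code z
    seeds the sequence, while the attribute code c deterministically
    appends a cationic 'KR' motif when c == 1 (a caricature of
    AMP-ness, not real biology)."""
    rng = random.Random(z)
    seq = "".join(rng.choice(AMINO) for _ in range(length))
    return seq[:-2] + "KR" if c == 1 else seq

def generate(c, n=5):
    """Controllable generation: vary the content code, fix the attribute."""
    return [toy_decoder(z, c) for z in range(n)]
```

Because z and c act independently, flipping the attribute code changes only the attribute-bearing part of the output, which is the controllability property the abstract attributes to the disentangled representation.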
Understanding Unequal Gender Classification Accuracy from Face Images
Recent work shows unequal performance of commercial face classification
services in the gender classification task across intersectional groups defined
by skin type and gender. Accuracy on dark-skinned females is significantly
worse than on any other group. In this paper, we conduct several analyses to
try to uncover the reason for this gap. The main finding, perhaps surprisingly,
is that skin type is not the driver. This conclusion is reached via stability
experiments that vary an image's skin type via color-theoretic methods, namely
luminance mode-shift and optimal transport. A second suspect, hair length, is
also shown not to be the driver via experiments on face images cropped to
exclude the hair. Finally, using contrastive post-hoc explanation techniques
for neural networks, we bring forth evidence suggesting that differences in
lip, eye and cheek structure across ethnicity lead to the differences. Further,
lip and eye makeup are seen as strong predictors for a female face, which is a
troubling propagation of a gender stereotype.
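The shape of such a stability experiment can be sketched minimally: apply a luminance shift to an image and compare classifier predictions before and after; if predictions agree across shifts, skin tone is unlikely to be the driver. The uniform channel shift below is a crude proxy for the paper's color-theoretic transforms (its optimal-transport variant is omitted entirely), and the three-pixel "face" is purely illustrative.

```python
def luminance(rgb):
    """Rec. 601 luma approximation of perceived brightness."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b

def shift_luminance(pixels, delta):
    """Shift every pixel's channels by delta, clamped to [0, 255].

    Brightens or darkens the whole image, approximating a change of
    apparent skin type while leaving spatial structure untouched.
    """
    clamp = lambda v: max(0, min(255, v))
    return [tuple(clamp(c + delta) for c in px) for px in pixels]

face = [(120, 80, 60), (130, 90, 70), (60, 40, 30)]
darker = shift_luminance(face, -40)
```

A classifier whose output is stable between `face` and `darker` (and the corresponding brightened version) provides evidence against skin type as the cause of the accuracy gap, which is the logic of the abstract's stability experiments.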
Experiences with Improving the Transparency of AI Models and Services
AI models and services are used in a growing number of high-stakes areas,
resulting in a need for increased transparency. Consistent with this, several
proposals for higher quality and more consistent documentation of AI data,
models, and systems have emerged. Little is known, however, about the needs of
those who would produce or consume these new forms of documentation. Through
semi-structured developer interviews, and two document creation exercises, we
have assembled a clearer picture of these needs and the various challenges
faced in creating accurate and useful AI documentation. Based on the
observations from this work, supplemented by feedback received during multiple
design explorations and stakeholder conversations, we make recommendations for
easing the collection and flexible presentation of AI facts to promote
transparency.
FactSheets: Increasing Trust in AI Services through Supplier's Declarations of Conformity
Accuracy is an important concern for suppliers of artificial intelligence
(AI) services, but considerations beyond accuracy, such as safety (which
includes fairness and explainability), security, and provenance, are also
critical elements to engender consumers' trust in a service. Many industries
use transparent, standardized, but often not legally required documents called
supplier's declarations of conformity (SDoCs) to describe the lineage of a
product along with the safety and performance testing it has undergone. SDoCs
may be considered multi-dimensional fact sheets that capture and quantify
various aspects of the product and its development to make it worthy of
consumers' trust. Inspired by this practice, we propose FactSheets to help
increase trust in AI services. We envision such documents to contain purpose,
performance, safety, security, and provenance information to be completed by AI
service providers for examination by consumers. We suggest a comprehensive set
of declaration items tailored to AI and provide examples for two fictitious AI
services in the appendix of the paper.
Comment: 31 pages.
Accelerating Antimicrobial Discovery with Controllable Deep Generative Models and Molecular Dynamics
De novo therapeutic design is challenged by a vast chemical repertoire and
multiple constraints, e.g., high broad-spectrum potency and low toxicity. We
propose CLaSS (Controlled Latent attribute Space Sampling) - an efficient
computational method for attribute-controlled generation of molecules, which
leverages guidance from classifiers trained on an informative latent space of
molecules modeled using a deep generative autoencoder. We screen the generated
molecules for additional key attributes by using deep learning classifiers in
conjunction with novel features derived from atomistic simulations. The
proposed approach is demonstrated for designing non-toxic antimicrobial
peptides (AMPs) with strong broad-spectrum potency, which are emerging drug
candidates for tackling antibiotic resistance. Synthesis and testing of only
twenty designed sequences identified two novel and minimalist AMPs with high
potency against diverse Gram-positive and Gram-negative pathogens, including
one multidrug-resistant and one antibiotic-resistant K. pneumoniae, via
membrane pore formation. Both antimicrobials exhibit low in vitro and in vivo
toxicity and mitigate the onset of drug resistance. The proposed approach thus
presents a viable path for faster and efficient discovery of potent and
selective broad-spectrum antimicrobials.
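Classifier-guided sampling from a latent space can be conveyed with a deliberately crude stand-in: draw latent points from the prior and keep only those an attribute classifier accepts. The real CLaSS samples from an explicit conditional density rather than by rejection, and the classifier below is a hypothetical placeholder, but the control loop (prior sample, classifier gate, accepted latent) is the same shape.

```python
import random

def toy_attribute_classifier(z):
    """Stand-in for a classifier trained on the latent space: calls a
    point 'antimicrobial' when its coordinates sum high enough."""
    return 1.0 if sum(z) > 0.5 else 0.0

def class_sample(classifier, n_samples, dim=2, threshold=0.9, rng=None):
    """Attribute-controlled latent sampling, caricatured as rejection
    sampling: draw from a standard-normal prior and keep points the
    attribute classifier scores above the threshold."""
    rng = rng or random.Random(0)
    kept = []
    while len(kept) < n_samples:
        z = [rng.gauss(0, 1) for _ in range(dim)]
        if classifier(z) >= threshold:
            kept.append(z)
    return kept

samples = class_sample(toy_attribute_classifier, 10)
```

In the full pipeline the accepted latents would be decoded into peptide sequences and further screened, as the abstract describes, with deep learning classifiers and simulation-derived features.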
Teaching Meaningful Explanations
The adoption of machine learning in high-stakes applications such as
healthcare and law has lagged in part because predictions are not accompanied
by explanations comprehensible to the domain user, who often holds the ultimate
responsibility for decisions and outcomes. In this paper, we propose an
approach to generate such explanations in which training data is augmented to
include, in addition to features and labels, explanations elicited from domain
users. A joint model is then learned to produce both labels and explanations
from the input features. This simple idea ensures that explanations are
tailored to the complexity expectations and domain knowledge of the consumer.
Evaluation spans multiple modeling techniques on a game dataset, a (visual)
aesthetics dataset, a chemical odor dataset and a melanoma dataset, showing that
our approach is generalizable across domains and algorithms. Results
demonstrate that meaningful explanations can be reliably taught to machine
learning algorithms and, in some cases, also improve modeling accuracy.
Comment: 9 pages.
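The data augmentation at the heart of this idea, training triples of features, label, and a domain-user explanation, with a model that outputs both label and explanation, can be sketched with a nearest-neighbor stand-in for the paper's joint model. The melanoma-flavored toy data and explanation strings are invented for illustration.

```python
import math

# Toy augmented training set: features, label, and an explanation
# elicited from a domain user attached to each instance.
TRAIN = [
    ([1.0, 0.2], "benign",    "small and symmetric"),
    ([0.9, 0.1], "benign",    "small and symmetric"),
    ([0.1, 0.9], "malignant", "large and irregular border"),
    ([0.2, 1.0], "malignant", "large and irregular border"),
]

def predict_with_explanation(x, train=TRAIN):
    """Nearest-neighbor stand-in for a joint model: output the label
    AND the elicited explanation of the closest training instance, so
    the explanation comes from the same step as the prediction."""
    _, label, expl = min(train, key=lambda t: math.dist(x, t[0]))
    return label, expl
```

Because the explanations were written by domain users, whatever the model emits is by construction phrased at the consumer's level of complexity and domain knowledge, which is the guarantee the abstract emphasizes.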