The question of "Mind-sets" and AI: Cultural origins and limits of the current AI. Ethical AIs and Cultural Pluralism
The current process of scientific and technological development is the outcome of an epochal cultural revolution in the West: the emergence of the Age of Enlightenment and its pursuit of "rationality". Today, "rationality" combined with "logic" has mutated into a strong belief in the power of rationality and computational processes as the 'safer', and indeed only, way to acquire knowledge. This belief is the main driving force behind the emergence of AI. At the core of this mind-set is the fundamental duality of the observer and the observed. After the imperial expansion of Western Europe, in alliance with Christianity, its previous foe, this world-view became the globally dominant mind-set. The paper explores the dominant narrative of rationality and reason in Western science, and seeks an alternative rooted in cultural diversity.
CloudBooks: An infrastructure for reading on multiple devices
The use of light, portable devices such as iPads, whose reading angle is readily changed, is radically different from reading on a desktop or laptop. It would, however, be naive to view this as mere evolution: such devices permit reading activity to more closely mirror reading on paper. A light, keyboardless device can be used in many different locations and orientations. This paper reports an infrastructure for supporting reading on multiple slate devices using a single cloud-based system to provide for numerous configurations.
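As a concrete illustration of the kind of state synchronization such an infrastructure requires, the sketch below shows a last-write-wins sync of a reading position through a shared cloud store. The store, field names, and conflict rule are illustrative assumptions, not the CloudBooks design.

```python
# Minimal sketch of last-write-wins syncing of reading state across devices.
# The in-memory "cloud store" and its fields are illustrative assumptions,
# not the CloudBooks implementation.
import time

cloud_store: dict[str, dict] = {}  # document id -> latest reading state

def push_state(doc_id: str, device: str, position: float) -> None:
    """A device uploads where the reader currently is in the document."""
    record = {"device": device, "position": position, "ts": time.time()}
    current = cloud_store.get(doc_id)
    if current is None or record["ts"] >= current["ts"]:  # last write wins
        cloud_store[doc_id] = record

def pull_state(doc_id: str) -> float:
    """Any device resumes from the most recently synced position."""
    return cloud_store[doc_id]["position"]

push_state("thesis.pdf", device="iPad", position=0.42)
push_state("thesis.pdf", device="laptop", position=0.57)
print(pull_state("thesis.pdf"))  # -> 0.57, the most recent position
```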
Supporting Human-AI Collaboration in Auditing LLMs with LLMs
Large language models are becoming increasingly pervasive in society via deployment in sociotechnical systems. Yet these language models, whether used for classification or generation, have been shown to be biased and to behave irresponsibly, causing harm to people at scale. It is crucial to audit these language models rigorously. Existing auditing tools leverage humans, AI, or both to find failures. In this work, we draw upon the literature on human-AI collaboration and sensemaking, and conduct interviews with research experts in safe and fair AI, to build upon the auditing tool AdaTest (Ribeiro and Lundberg, 2022), which is powered by a generative large language model (LLM). Through the design process we highlight the importance of sensemaking and human-AI communication in leveraging the complementary strengths of humans and generative models in collaborative auditing. To evaluate the effectiveness of the augmented tool, AdaTest++, we conduct user studies with participants auditing two commercial language models: OpenAI's GPT-3 and Azure's sentiment analysis model. Qualitative analysis shows that AdaTest++ effectively leverages human strengths such as schematization, hypothesis formation, and testing. Further, with our tool, participants identified a variety of failure modes, covering 26 different topics across 2 tasks, including failures shown before in formal audits as well as previously under-reported ones.
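To make the collaborative loop concrete, the following sketch shows the pattern the abstract describes: a generative LLM proposes test inputs, the target model is checked against expected behavior, and the human reviews the failures. The helper names (`llm`, `target_model`, `generate_variations`) are hypothetical placeholders, not the AdaTest++ API.

```python
# Hypothetical sketch of a human-AI collaborative auditing round; names are
# placeholders, not the actual AdaTest/AdaTest++ interface.
def generate_variations(llm, seed_tests: list[str], topic: str) -> list[str]:
    """Ask a generative LLM for new test inputs similar to human-curated seeds."""
    prompt = f"Write 5 new test sentences about '{topic}' similar to: {seed_tests}"
    return llm(prompt).splitlines()

def audit_round(llm, target_model, topic: str, seed_tests: list[str],
                expected_label: str) -> list[str]:
    failures = []
    for test in generate_variations(llm, seed_tests, topic):
        if target_model(test) != expected_label:
            failures.append(test)
    # The human auditor now inspects `failures`, forms a hypothesis about the
    # failure mode, and picks new seeds or topics -- the sensemaking step the
    # tool is designed to support.
    return failures
```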
ICE: Enabling Non-Experts to Build Models Interactively for Large-Scale Lopsided Problems
Quick interaction between a human teacher and a learning machine presents numerous benefits and challenges when working with web-scale data. The human teacher guides the machine towards accomplishing the task of interest. The learning machine leverages big data to find examples that maximize the training value of its interaction with the teacher. When the teacher is restricted to labeling examples selected by the machine, this problem is an instance of active learning. When the teacher can provide additional information to the machine (e.g., suggestions on what examples or predictive features should be used) as the learning task progresses, the problem becomes one of interactive learning.

To accommodate the two-way communication channel needed for efficient interactive learning, the teacher and the machine need an environment that supports an interaction language. The machine can access, process, and summarize more examples than the teacher can see in a lifetime. Based on the machine's output, the teacher can revise the definition of the task or make it more precise. Both the teacher and the machine continuously learn and benefit from the interaction.
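A minimal sketch of such a teacher-machine loop, using uncertainty sampling to choose which examples to surface to the teacher; it illustrates the interaction pattern, not the ICE platform itself.

```python
# Toy interactive/active learning loop: the machine picks the example it is
# least sure about, the teacher labels it, and the model is retrained.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
pool = rng.normal(size=(10_000, 5))  # unlabeled pool (toy stand-in for web-scale data)

def teacher(X):
    """Stand-in for the human teacher's labels."""
    return (X[:, 0] + X[:, 1] > 0).astype(int)

labeled_X = pool[:50]
labeled_y = teacher(labeled_X)
pool = pool[50:]
model = LogisticRegression()

for _ in range(20):
    model.fit(labeled_X, labeled_y)
    probs = model.predict_proba(pool)[:, 1]
    idx = int(np.argmin(np.abs(probs - 0.5)))  # most uncertain example
    # Interactive step: the teacher labels the selected example. In full
    # interactive learning the teacher could also revise the task definition
    # or suggest predictive features at this point.
    labeled_X = np.vstack([labeled_X, pool[idx]])
    labeled_y = np.append(labeled_y, teacher(pool[idx:idx + 1]))
    pool = np.delete(pool, idx, axis=0)

print(model.score(pool, teacher(pool)))  # accuracy on the remaining pool
```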
We have built a platform to (1) produce valuable and deployable models and (2) support research on both the machine learning and user interface challenges of the interactive learning problem. The platform relies on a dedicated, low-latency, distributed, in-memory architecture that allows us to construct web-scale learning machines with quick interaction speed. The purpose of this paper is to describe this architecture and demonstrate how it supports our research efforts. Preliminary results are presented as illustrations of the architecture but are not the primary focus of the paper.
Aligning Offline Metrics and Human Judgments of Value for Code Generation Models
Large language models have demonstrated great potential to assist programmers in generating code. For such human-AI pair programming scenarios, we empirically demonstrate that while generated code is most often evaluated in terms of functional correctness (i.e., whether generations pass available unit tests), correctness does not fully capture (e.g., may underestimate) the productivity gains these models may provide. Through a user study with N = 49 experienced programmers, we show that while correctness captures high-value generations, programmers still rate code that fails unit tests as valuable if it reduces the overall effort needed to complete a coding task. Finally, we propose a hybrid metric that combines functional correctness and syntactic similarity, and show that it achieves a 14% stronger correlation with value and can therefore better represent real-world gains when evaluating and comparing models.
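A minimal sketch of a hybrid metric in this spirit: blend a binary correctness signal with syntactic similarity. The equal weighting and the Jaccard token overlap are illustrative assumptions; the paper's exact formulation may differ.

```python
# Hybrid value metric sketch: functional correctness blended with a crude
# syntactic-similarity score. Weighting and similarity choice are assumptions.
def token_overlap(generated: str, reference: str) -> float:
    """Crude syntactic similarity: Jaccard overlap of whitespace tokens."""
    a, b = set(generated.split()), set(reference.split())
    return len(a & b) / len(a | b) if a | b else 1.0

def hybrid_value(passes_tests: bool, generated: str, reference: str,
                 alpha: float = 0.5) -> float:
    correctness = 1.0 if passes_tests else 0.0
    return alpha * correctness + (1 - alpha) * token_overlap(generated, reference)

# Code that fails its unit tests can still score well if it is close to the
# reference solution, matching how programmers rated "almost right" code.
print(hybrid_value(False, "def add(a, b): return a - b",
                   "def add(a, b): return a + b"))
```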
Rational expectations equilibrium in an economy with segmented capital asset markets
We develop a model of noisy rational expectations equilibrium in segmented markets. The noise emerges endogenously through intermarket effects rather than through exogenous supply noise from liquidity or naive trading, as in the standard noisy rational expectations equilibrium of the Hellwig type. The existence and persistence of segmentation in equilibrium is established. A metric to determine the welfare effects of the degree of segmentation is also derived; this metric is structurally different from the metric derived in the standard models and includes the latter as a special case. Empirical evidence from, and observed characteristics of, "real world" economies that support the economic intuition underlying the model are described in some detail.
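For contrast, a minimal sketch of the standard Hellwig-type benchmark the abstract refers to, in which the noise is an exogenous supply shock; the segmented-markets model replaces this shock with endogenous intermarket noise. The notation below is the textbook setup, not the paper's model.

```latex
% Standard Hellwig-type noisy REE benchmark (exogenous supply noise):
\begin{align*}
  x_i &= \theta + \epsilon_i, \quad \epsilon_i \sim N(0, \sigma_\epsilon^2)
    && \text{(trader $i$'s private signal of the payoff $\theta$)} \\
  z   &\sim N(\bar{z}, \sigma_z^2)
    && \text{(exogenous noisy supply from liquidity or naive trading)} \\
  P   &= \alpha_0 + \alpha_1 \theta - \alpha_2 z
    && \text{(linear equilibrium price, only partially revealing $\theta$)}
\end{align*}
```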
Trust in AutoML: Exploring Information Needs for Establishing Trust in Automated Machine Learning Systems
We explore trust in a relatively new area of data science: Automated Machine Learning (AutoML). In AutoML, AI methods are used to generate and optimize machine learning models by automatically engineering features, selecting models, and optimizing hyperparameters. In this paper, we seek to understand what kinds of information influence data scientists' trust in the models produced by AutoML. We operationalize trust as a willingness to deploy a model produced using automated methods. We report results from three studies -- qualitative interviews, a controlled experiment, and a card-sorting task -- to understand the information needs of data scientists for establishing trust in AutoML systems. We find that including transparency features in an AutoML tool increased users' trust in, and understanding of, the tool, and that out of all the proposed features, model performance metrics and visualizations are the most important information for data scientists when establishing their trust in an AutoML tool.
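A minimal sketch of what such an AutoML search does under the hood, surfacing the per-candidate performance metrics the study found most important for trust; the candidate models and grids are illustrative, not any specific AutoML product's internals.

```python
# Toy AutoML-style search: try candidate model families and hyperparameters,
# then report a transparent leaderboard instead of only an opaque "best" model.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_breast_cancer(return_X_y=True)

candidates = [
    (RandomForestClassifier(random_state=0), {"n_estimators": [100, 300]}),
    (GradientBoostingClassifier(random_state=0), {"learning_rate": [0.05, 0.1]}),
]

leaderboard = []
for model, grid in candidates:
    search = GridSearchCV(model, grid, cv=5)
    search.fit(X, y)
    leaderboard.append((type(model).__name__, search.best_score_, search.best_params_))

# Transparency feature: show each candidate's metric and chosen configuration.
for name, score, params in sorted(leaderboard, key=lambda r: r[1], reverse=True):
    print(f"{name}: cv_accuracy={score:.3f} with {params}")
```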
Designing and evaluating the usability of a machine learning API for rapid prototyping music technology
To better support the needs of creative software developers and music technologists, and to empower them as machine learning users and innovators, the usability of, and developer experience with, machine learning tools must be considered and better understood. We review background research on the design and evaluation of application programming interfaces (APIs), with a focus on the domain of machine learning for music technology software development. We present the design rationale for the RAPID-MIX API, an easy-to-use API for rapid prototyping with interactive machine learning, and a usability evaluation study with software developers of music technology. A cognitive dimensions (CDs) questionnaire was designed and delivered to a group of 12 participants who used the RAPID-MIX API in their software projects, including people who developed systems for personal use and professionals developing software products for music and creative technology companies. The results from the questionnaire indicate that participants found the RAPID-MIX API easy to learn and use, fun, and good for rapid prototyping with interactive machine learning. Based on these findings, we present an analysis and characterization of the RAPID-MIX API based on the CDs framework, and discuss its design trade-offs and usability issues. We use these insights and our design experience to provide design recommendations for machine learning APIs for rapid prototyping of music technology. We conclude with a summary of the main insights, a discussion of the merits and challenges of applying the CDs framework to the evaluation of machine learning APIs, and directions for future work that our research deems valuable.
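A minimal sketch of the interactive machine learning workflow such an API targets: record labelled demonstrations, train, then run the model on new input. The class and method names below are hypothetical illustrations, not the actual RAPID-MIX API.

```python
# Hypothetical rapid-prototyping IML workflow; names are illustrative only,
# not the real RAPID-MIX API surface.
from sklearn.neighbors import KNeighborsClassifier

class GestureClassifier:
    """Record labelled examples interactively, then train and run."""
    def __init__(self):
        self.examples, self.labels = [], []
        self.model = KNeighborsClassifier(n_neighbors=1)

    def record(self, features, label):
        self.examples.append(features)
        self.labels.append(label)

    def train(self):
        self.model.fit(self.examples, self.labels)

    def run(self, features):
        return self.model.predict([features])[0]

clf = GestureClassifier()
clf.record([0.1, 0.9], "open")    # sensor features from a demonstration
clf.record([0.8, 0.2], "closed")
clf.train()
print(clf.run([0.15, 0.85]))      # -> "open"
```

The appeal of this shape of API for creative developers is that a handful of demonstrations, rather than a dataset and a training pipeline, is enough to get a working prototype.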
Predicting Academic Success Based on Learning Material Usage
In this work, we explore students' usage of online learning material as a predictor of academic success. In the context of an introductory programming course, we recorded the amount of time that each element, such as a text paragraph or an image, was visible on the students' screen. We then applied machine learning methods to study to what extent material usage predicts course outcomes. Our results show that the time spent with each paragraph of the online learning material is a moderate predictor of student success, even when corrected for student time-on-task, and that the information can be used to identify at-risk students. The predictive performance of the models depends on the quantity of data, and the predictions become more accurate as the course progresses. In a broader context, our results indicate that course material usage can be used to predict academic success, and that such data can be collected in situ with minimal interference with the students' learning process.
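A minimal sketch of the prediction setup described here, with per-paragraph viewing times as features for a pass/fail classifier; the data below is synthetic and the outcome model is an assumption, whereas the study uses real in-situ viewing logs.

```python
# Toy version of predicting course outcomes from material-usage features.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# rows: students; columns: seconds each material paragraph was visible on screen
viewing_time = rng.gamma(shape=2.0, scale=30.0, size=(200, 40))
# synthetic outcome: more engagement with the material raises the pass chance
passed = (viewing_time.mean(axis=1) + rng.normal(0, 15, size=200) > 60).astype(int)

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
accuracy = cross_val_score(model, viewing_time, passed, cv=5).mean()
print(f"cross-validated accuracy: {accuracy:.2f}")  # flags likely at-risk students
```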