Responsible Scoring Mechanisms Through Function Sampling
Human decision-makers often receive assistance from data-driven algorithmic
systems that provide a score for evaluating objects, including individuals. The
scores are generated by a function (mechanism) that takes a set of features as
input and generates a score. The scoring functions are either machine-learned or
human-designed and can be used for different decision purposes such as ranking
or classification.
Given the potential impact of these scoring mechanisms on individuals' lives
and on society, it is important to make sure these scores are computed
responsibly. Hence we need tools for responsible scoring mechanism design. In
this paper, focusing on linear scoring functions, we highlight the importance
of unbiased function sampling and perturbation in the function space for
devising such tools. We provide unbiased samplers for the entire function
space, as well as a -vicinity around a given function.
We then illustrate the value of these samplers for designing effective
algorithms in three diverse problem scenarios in the context of ranking.
Finally, as a fundamental method for designing responsible scoring mechanisms,
we propose a novel approach for approximating the construction of the
arrangement of hyperplanes. Despite the exponential complexity of an
arrangement in the number of dimensions, using function sampling, our algorithm
is linear in the number of samples and hyperplanes, and independent of the
number of dimensions.
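The idea of sampling linear scoring functions can be illustrated with a minimal sketch. It is not the paper's sampler: it assumes that weights are non-negative and that "unbiased" means uniform over weight directions, and it uses the standard trick of normalizing Gaussian draws. The names `sample_weights`, `rank`, and `top_item_frequencies` are hypothetical.

```python
import math
import random


def sample_weights(d, rng):
    """Draw a weight vector uniformly from the non-negative part of the unit sphere.

    Assumption: a linear scoring function is identified by the direction of
    its non-negative weight vector. Normalizing the absolute values of
    independent Gaussian draws gives a uniform direction in that region.
    """
    while True:
        g = [abs(rng.gauss(0.0, 1.0)) for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in g))
        if norm > 0.0:
            return [x / norm for x in g]


def rank(items, weights):
    """Return item indices ordered by decreasing linear score."""
    def score(i):
        return sum(w * x for w, x in zip(weights, items[i]))
    return sorted(range(len(items)), key=lambda i: -score(i))


def top_item_frequencies(items, d, n_samples, rng):
    """Estimate how often each item ranks first over sampled scoring functions."""
    wins = [0] * len(items)
    for _ in range(n_samples):
        w = sample_weights(d, rng)
        wins[rank(items, w)[0]] += 1
    return [c / n_samples for c in wins]


if __name__ == "__main__":
    items = [(1.0, 0.0), (0.0, 1.0), (0.9, 0.9)]
    print(top_item_frequencies(items, d=2, n_samples=2000, rng=random.Random(0)))
```

The cost of the estimate grows with the number of samples and items rather than with the complexity of the full function space, which is the property the abstract relies on.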
ICE: Enabling Non-Experts to Build Models Interactively for Large-Scale Lopsided Problems
Quick interaction between a human teacher and a learning machine presents
numerous benefits and challenges when working with web-scale data. The human
teacher guides the machine towards accomplishing the task of interest. The
learning machine leverages big data to find examples that maximize the training
value of its interaction with the teacher. When the teacher is restricted to
labeling examples selected by the machine, this problem is an instance of
active learning. When the teacher can provide additional information to the
machine (e.g., suggestions on what examples or predictive features should be
used) as the learning task progresses, then the problem becomes one of
interactive learning.
To accommodate the two-way communication channel needed for efficient
interactive learning, the teacher and the machine need an environment that
supports an interaction language. The machine can access, process, and
summarize more examples than the teacher can see in a lifetime. Based on the
machine's output, the teacher can revise the definition of the task or make it
more precise. Both the teacher and the machine continuously learn and benefit
from the interaction.
We have built a platform to (1) produce valuable and deployable models and
(2) support research on both the machine learning and user interface challenges
of the interactive learning problem. The platform relies on a dedicated,
low-latency, distributed, in-memory architecture that allows us to construct
web-scale learning machines with quick interaction speed. The purpose of this
paper is to describe this architecture and demonstrate how it supports our
research efforts. Preliminary results are presented as illustrations of the
architecture but are not the primary focus of the paper.
Temporal Bayesian classifiers for modelling muscular dystrophy expression data
The analysis of microarray data from time-series experiments requires specialised algorithms, which take the temporal ordering of the data into account. In this paper we explore a new architecture of Bayesian classifier that can be used to understand how biological mechanisms differ with respect to time. We show that this classifier improves the classification of microarray data and at the same time ensures that the models can easily be analysed by biologists by incorporating time transparently. We focus on data that has been generated to explore different types of muscular dystrophy.
An Evaluation of the Sustainability of Global Tuna Stocks Relative to Marine Stewardship Council Criteria
The Marine Stewardship Council (MSC) has established a program whereby a fishery may be certified as being sustainable. The sustainability of a fishery is defined by MSC criteria which are embodied in three Principles, relating to the status of the stock, the ecosystem of which the stock is a member, and the fishery management system. Since many of these MSC criteria are comparable for global tuna stocks, the MSC scoring system was used to evaluate nineteen stocks of tropical and temperate tunas throughout the world and to evaluate the management systems of the Regional Fishery Management Organizations (RFMOs) associated with these stocks.
Computationally designed libraries of fluorescent proteins evaluated by preservation and diversity of function
To determine which of seven library design algorithms best introduces new protein function without destroying it altogether, seven combinatorial libraries of green fluorescent protein variants were designed and synthesized. Each was evaluated by distributions of emission intensity and color compiled from measurements made in vivo. Additional comparisons were made with a library constructed by error-prone PCR. Among the designed libraries, fluorescent function was preserved for the greatest fraction of samples in a library designed by using a structure-based computational method developed and described here. A trend was observed toward greater diversity of color in designed libraries that better preserved fluorescence. Contrary to trends observed among libraries constructed by error-prone PCR, preservation of function was observed to increase with a library's average mutation level among the four libraries designed with structure-based computational methods
Interactive Data Exploration with Smart Drill-Down
We present smart drill-down, an operator for interactively exploring a
relational table to discover and summarize "interesting" groups of tuples. Each
group of tuples is described by a rule. For instance, a rule can tell us that
there are a thousand tuples with particular values in the first and second
columns (and any value in the third column).
Smart drill-down presents an analyst with a list of rules that together
describe interesting aspects of the table. The analyst can tailor the
definition of interesting, and can interactively apply smart drill-down on an
existing rule to explore that part of the table. We demonstrate that the
underlying optimization problems are NP-hard, and describe an algorithm
for finding an approximately optimal list of rules to display when the user
uses a smart drill-down, and a dynamic sampling scheme for efficiently
interacting with large tables. Finally, we perform experiments on real datasets
with our experimental prototype to demonstrate the usefulness of smart drill-down
and study the performance of our algorithms.
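The notion of a rule with wildcard columns can be sketched as follows. This is only an illustration of the data structure, not the paper's optimization algorithm or sampling scheme: it enumerates every refinement of one column, and the names `WILDCARD`, `matches`, `count`, and `drill_down` are hypothetical.

```python
from collections import Counter

WILDCARD = "*"  # assumed marker meaning "any value in this column"


def matches(rule, row):
    """True if every fixed column of the rule equals the row's value."""
    return all(r == WILDCARD or r == v for r, v in zip(rule, row))


def count(rule, table):
    """Number of tuples in the table covered by the rule."""
    return sum(1 for row in table if matches(rule, row))


def drill_down(rule, table, column):
    """Refine a rule on one wildcard column.

    Returns one (sub_rule, count) pair per distinct value seen in that
    column among the covered tuples, largest groups first.
    """
    if rule[column] != WILDCARD:
        raise ValueError("column is already fixed by the rule")
    counts = Counter(row[column] for row in table if matches(rule, row))
    refined = []
    for value, n in counts.items():
        sub_rule = rule[:column] + (value,) + rule[column + 1:]
        refined.append((sub_rule, n))
    refined.sort(key=lambda pair: (-pair[1], str(pair[0][column])))
    return refined
```

For example, drilling down on the second column of the rule `("a", "*", "*")` yields one sub-rule per value that co-occurs with `"a"`, each paired with the number of tuples it covers.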
Pathways to Accountability II
This report summarises the results of the 2009-2010 review process on the One World Trust Global Accountability Framework and the piloting of the draft framework during 2011, and presents the full One World Trust Pathways to Accountability II indicator framework. Our work in this field is motivated by a concern about the persisting weakness and insufficient effectiveness of global organisations from all sectors in responding to the challenge of delivering global public goods to citizens and communities, the very people whom they claim to serve and benefit, and who are most often dependent on them.