
723 research outputs found

Fairness Beyond Disparate Treatment & Disparate Impact: Learning Classification without Disparate Mistreatment

Author: Barocas S.
Bishop C. M.
Goel S.
Goh G.
Greg Ridgeway J. M.
Hardt M.
Kamiran F.
Kamishima T.
Kleinberg J.
Muñoz C.
Podesta J.
Zafar M. B.
Zemel R.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2016

Automated data-driven decision making systems are increasingly being used to assist, or even replace, humans in many settings. These systems function by learning from historical decisions, often taken by humans. In order to maximize the utility of these systems (or, classifiers), their training involves minimizing the errors (or, misclassifications) over the given historical data. However, it is quite possible that the optimally trained classifier makes decisions for people belonging to different social groups with different misclassification rates (e.g., misclassification rates for females are higher than for males), thereby placing these groups at an unfair disadvantage. To account for and avoid such unfairness, in this paper, we introduce a new notion of unfairness, disparate mistreatment, which is defined in terms of misclassification rates. We then propose intuitive measures of disparate mistreatment for decision boundary-based classifiers, which can be easily incorporated into their formulation as convex-concave constraints. Experiments on synthetic as well as real-world datasets show that our methodology is effective at avoiding disparate mistreatment, often at a small cost in terms of accuracy. Comment: To appear in Proceedings of the 26th International World Wide Web Conference (WWW), 2017. Code available at: https://github.com/mbilalzafar/fair-classificatio

arXiv.org e-Print Archive

Crossref

MPG.PuRe

Can EFL interactive listening be validly assessed?

Author: Bahns J.
Birckbichler D.
Brantmeier C.
Brantmeier C.
Brown A.
Deville M.
Faerch K.
Field J.
Heilenman K.
Lam W.
Lazaraton A.
LeBlanc R.
Long M.
Long M.
Lund R.
Lynch T.
Murphy M. J.
Oprandy R.
Pica T.
Ridgeway T.
Ridgeway T.
Ross S.
Ross S.
Rost M.
Schrafnagl J.
Schraw G.
Schwartz L.
Sherman J.
Shohamy E.
Smith K. F.
Swain M.
Swain M.
Vandergrift L.
Wigglesworth G.
Young R.
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 01/10/2011

Crossref

University of Queensland eSpace

Comparing ultra-high spatial resolution remote-sensing methods in mapping peatland vegetation

Author: Arroyo‐Mora J. P.
Bray J. R.
Caliński T.
Hill M. O.
Liaw A.
Lovitt J.
Ridgeway G.
Rouse J. W. J.
Publication date: 01/09/2019

Peer reviewed

Crossref

Helsingin yliopiston digitaalinen arkisto

Kernel density classification and boosting: an L2 sub analysis

Author: B.W. Silverman
C. C. Taylor
D. Michie
D.E. Wright
D.J. Hand
G. Ridgeway
G.R. Terrell
I.S. Abramson
J.D.F. Habbema
J.H. Friedman
J.H. Friedman
J.H. Friedman
M. Di Marzio
M. Di Marzio
M.C. Jones
M.C. Jones
M.C. Jones
M.P. Wand
P. Bühlmann
P. Hall
P. Hall
P. Hall
R.E. Schapire
T. Hastie
Y. Freund
Y. Freund
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2005

Kernel density estimation is a commonly used approach to classification. However, most of the theoretical results for kernel methods apply to estimation per se and not necessarily to classification. In this paper we show that when estimating the difference between two densities, the optimal smoothing parameters are increasing functions of the sample size of the complementary group, and we provide a small simulation study which examines the relative performance of kernel density methods when the final goal is classification. A relative newcomer to the classification portfolio is “boosting”, and this paper proposes an algorithm for boosting kernel density classifiers. We note that boosting is closely linked to a previously proposed method of bias reduction in kernel density estimation and indicate how it will enjoy similar properties for classification. We show that boosting kernel classifiers reduces the bias whilst only slightly increasing the variance, with an overall reduction in error. Numerical examples and simulations are used to illustrate the findings, and we also suggest further areas of research.

CiteSeerX

Crossref

White Rose Research Online

Forecasting Player Behavioral Data and Simulating in-Game Events

Author: A Natekin
AJ Fox
C Bauckhage
Colin Chen
DH Ackley
G Ridgeway
G Schwarz
G Zhang
GE Box
GE Hinton
H Akaike
JG Cragg
JG Gooijer De
JH Friedman
KD Lawrence
L Deng
L Dwyer
M Gilliland
M Längkvist
MS El-Nasr
N Srivastava
NE Breslow
PH Eilers
PJ Brockwell
RJ Hyndman
S Asmussen
S Hochreiter
S Makridakis
SN Wood
SN Wood
SN Wood
T Hastie
T Zhang
TJ Hastie
Y Bengio
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 05/10/2017

Understanding player behavior is fundamental in game data science. Video games evolve as players interact with the game, so being able to foresee player experience would help ensure successful game development. In particular, game developers need to evaluate beforehand the impact of in-game events. Simulation optimization of these events is crucial to increase player engagement and maximize monetization. We present an experimental analysis of several methods to forecast game-related variables, with two main aims: to obtain accurate predictions of in-app purchases and playtime in an operational production environment, and to perform simulations of in-game events in order to maximize sales and playtime. Our ultimate purpose is to take a step towards the data-driven development of games. The results suggest that, even though the performance of traditional approaches such as ARIMA is still better, the outcomes of state-of-the-art techniques like deep learning are promising. Deep learning emerges as a well-suited general model that could be used to forecast a variety of time series with different dynamic behaviors.

arXiv.org e-Print Archive

Crossref

A 180 Kpc Tidal Tail in the Luminous Infrared Merger Arp 299

Author: Alton P. B.
Augarde R.
Beck S. C.
Bottema R.
Broeils A. H.
Casoli F.
Casoli F.
de Blok W. J. G.
Doherty R. M.
Dreyer J. L. E.
Dreyer J. L. E.
Duc P.-A.
Fairall A. P.
Hilker M.
Huchtmeier W. K.
Hutchings J. B.
J. E. Hibbard
Jones T. J.
Kalberla P. M. W.
Lancon A.
Lees J. F.
M. S. Yun
Mirabel I. F.
Noguchi M.
Ridgeway S. E.
Stanford S. A.
Swift L.
van der Hulst J. M.
van der Kruit P. C.
van Driel W.
Weliachew L.
Wevers B. H. M. R.
Wevers B. H. M. R.
White S. D. M.
Publication venue: 'University of Chicago Press'
Publication date: 01/01/1999

We present VLA HI observations and UH88 deep optical B- and R-band observations of the IR luminous merger Arp 299 (= NGC 3690 + IC 694). These data reveal a gas-rich, optically faint tidal tail with a length of over 180 kpc. The size of this tidal feature necessitates an old interaction age for the merger (~750 Myr since first periapse), which is currently experiencing a very young starburst (~20 Myr). The observations reveal a most remarkable structure within the tidal tail: it appears to be composed of two parallel filaments separated by ~20 kpc. One of the filaments is gas rich with little if any starlight, while the other is gas poor. We believe that this bifurcation results from a warped disk in one of the progenitors. The quantities and kinematics of the tidal HI suggest that Arp 299 results from the collision of a retrograde Sab-Sb galaxy (IC 694) and a prograde Sbc-Sc galaxy (NGC 3690) that occurred 750 Myr ago and which will merge into a single object in ~60 Myr. We suggest that the present IR luminous phase in this system is due in part to the retrograde spin of IC 694. Finally, we discuss the apparent lack of tidal dwarf galaxies within the tail. Comment: LaTeX, 14 pages, 11 figures, 4 tables, uses emulateapj.sty. Accepted to AJ for July 1999. For version with full-resolution images see http://www.cv.nrao.edu/~jhibbard/a299/HIpaper/a299HI.htm

arXiv.org e-Print Archive

CiteSeerX

Crossref

ScholarWorks@UMass Amherst

CERN Document Server

Prediction of Dengue Incidence Using Search Query Surveillance

Author: A Hulth
A Valdivia
A Wilder-Smith
Benjamin M. Althouse
BN Breyer
C Pelat
DAT Cummings
Derek A. T. Cummings
G Eysenbach
G Eysenbach
G Ridgeway
HA Johnson
J Ginsberg
JS Brownstein
JS Brownstein
JW Ayers
JW Ayers
K Wilson
PM Luz
PM Polgreen
Rebekah J. Kent Crockett
S Goel
T Hastie
Yih Yng Ng
Publication venue: Public Library of Science
Publication date: 01/08/2011

Improvements in surveillance, prediction of outbreaks and the monitoring of the epidemiology of dengue virus in countries with underdeveloped surveillance systems are of great importance to ministries of health and other public health decision makers who are often constrained by budget or manpower. Google Flu Trends has proven successful in providing an early warning system for outbreaks of influenza weeks before case data are reported. We believe that there is greater potential for this technique for dengue, as the incidence of this pathogen can vary by a factor of ten in some settings, making prediction all the more important in public health planning. In this paper, we demonstrate the utility of Google search terms in predicting dengue incidence in Singapore and Bangkok, Thailand using several regression techniques. Incidence data were provided by the Singapore Ministry of Health and the Thailand Bureau of Epidemiology. We find our models predict incident cases well (correlation greater than 0.8) and periods of high incidence equally well (AUC greater than 0.95). All data and analysis code used in our study are available free online and can be adapted to other settings.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Why Are Some Plant Genera More Invasive Than Others?

Author: AH Fitter
AK Sakai
AM Allen
Andrew Hector
AT Moles
BJ Goodwin
CC Daehler
CE Lee
CM D'Antonio
CS Kolar
G De'ath
G Ridgeway
G Ridgeway
G Stacey
GM Ruiz
H Mooney
II APG
IM Parker
J Elith
JA McNeely
JC Vamosi
JG Carman
JH Friedman
JH Friedman
JH Friedman
JH Zar
JM Diez
JM Diez
John M. Drake
John Paul Schmidt
K Liu
L Breiman
M Kleyer
M Křivánek
M Pessarakli
M Rejmánek
MA Hamilton
MH Williamson
MJ Crawley
MW Cadotte
NC Ellstrand
P Goldblatt
P Pyšek
P Pyšek
PS Soltis
RN Mack
RW Pemberton
RW Sage
S Holm
SCH Barrett
SH Reichard
T Hothorn
Publication venue: Public Library of Science
Publication date: 01/01/2011

Determining how biological traits are related to the ability of groups of organisms to become economically damaging when established outside of their native ranges is a major goal of population biology, and important in the management of invasive species. Among plants, little is known about why some taxonomic groups are more likely than others to become pests. We investigated traits that discriminate vascular plant genera, a level of taxonomic generality at which risk assessment and screening could be more effectively performed, according to the proportion of naturalized species which are pests. We focused on the United States and Canada, and, because our purpose is ultimately regulatory, considered species classified as weeds or noxious. Using contingency tables, we identified 11 genera of vascular plants that are disproportionately represented by invasive species. Results from boosted regression tree analyses show that these categories reflect biological differences. In summary, approximately 25% of variation in genus proportions of weeds or noxious species was explained by biological covariates. Key explanatory traits included genus means for wetland habitat affinity, chromosome number, and seed mass.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Occupational Sex Segregation and Management-Level Wages in Germany: What Role Does Firm Size Play?

Crossref