15 research outputs found

Political districting without geography

Author: Benade Gerdus
Ho-Nguyen Nam
Hooker J.N.
Publication venue: Elsevier BV
Publication date: 01/01/2022
Field of study

Geographical considerations such as contiguity and compactness are necessary elements of political districting in practice. Yet an analysis of the problem without such constraints yields mathematical insights that can inform real-world model construction. In particular, it clarifies the sharp contrast between proportionality and competitiveness and how it might be overcome in a properly formulated objective function. It also reveals serious weaknesses of the much-discussed efficiency gap as a criterion for gerrymandering.
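The efficiency gap named in this abstract can be made concrete with a short sketch. It assumes the common two-party definition (the difference between the two parties' wasted votes, divided by all votes cast), where a losing party wastes every vote and a winning party wastes the votes beyond half of the district total. The function name, the tuple layout, and the tie rule are illustrative assumptions, not taken from the paper.

```python
def efficiency_gap(districts):
    """Return (wasted_a - wasted_b) / total_votes for a two-party plan.

    districts: list of (votes_a, votes_b) tuples, one per district.
    A positive result means party A wasted more votes than party B.
    Assumption of this sketch: a tied district is counted as a win for B.
    """
    wasted_a = wasted_b = total = 0.0
    for votes_a, votes_b in districts:
        district_total = votes_a + votes_b
        half = district_total / 2          # votes beyond this are surplus for the winner
        if votes_a > votes_b:
            wasted_a += votes_a - half     # winner's surplus votes
            wasted_b += votes_b            # every losing vote is wasted
        else:
            wasted_b += votes_b - half
            wasted_a += votes_a
        total += district_total
    return (wasted_a - wasted_b) / total


# Party A is packed into one district and narrowly loses the other three.
packed_plan = [(90, 10), (45, 55), (45, 55), (45, 55)]
print(efficiency_gap(packed_plan))

# Party A wins every district 75 to 25, yet both parties waste 25 votes each.
sweep_plan = [(75, 25)] * 4
print(efficiency_gap(sweep_plan))
```

In the second plan the measure is zero even though one party takes every seat. This is a well-known property of the definition used here and is offered only as an illustration; it is not necessarily the weakness the authors analyze.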

arXiv.org e-Print Archive

Boston University Institutional Repository (OpenBU)

Non-Negative Matrix Factorization for Learning Alignment-Specific Models of Protein Evolution

Author: Ben Murrell
Daniel Kaliski
Gerdus Benade
Jan Buys
Konrad Scheffler
Lise du Buisson
Robert Ketteringham
Sasha Moola
Thomas Weighill
Tristan Hands
Publication venue: Public Library of Science
Publication date: 01/01/2011
Field of study

Models of protein evolution currently come in two flavors: generalist and specialist. Generalist models (e.g. PAM, JTT, WAG) adopt a one-size-fits-all approach, where a single model is estimated from a number of different protein alignments. Specialist models (e.g. mtREV, rtREV, HIVbetween) can be estimated when a large quantity of data are available for a single organism or gene, and are intended for use on that organism or gene only. Unsurprisingly, specialist models outperform generalist models, but in most instances there simply are not enough data available to estimate them. We propose a method for estimating alignment-specific models of protein evolution in which the complexity of the model is adapted to suit the richness of the data. Our method uses non-negative matrix factorization (NNMF) to learn a set of basis matrices from a general dataset containing a large number of alignments of different proteins, thus capturing the dimensions of important variation. It then learns a set of weights that are specific to the organism or gene of interest and for which only a smaller dataset is available. Thus the alignment-specific model is obtained as a weighted sum of the basis matrices. Having been constrained to vary along only as many dimensions as the data justify, the model has far fewer parameters than would be required to estimate a specialist model. We show that our NNMF procedure produces models that outperform existing methods on all but one of 50 test alignments. The basis matrices we obtain confirm the expectation that amino acid properties tend to be conserved, and allow us to quantify, on specific alignments, how the strength of conservation varies across different properties. We also apply our new models to phylogeny inference and show that the resulting phylogenies are different from, and have improved likelihood over, those inferred under standard models.
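The two-stage idea in this abstract (learn basis matrices from a general dataset, then learn only a few weights for a new alignment) can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes each alignment has already been summarized as a non-negative vector (for example a flattened rate matrix), and it swaps in the standard multiplicative update rules for squared-error NNMF. All function names and the toy data are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def update_weights(v, w, h, eps=1e-9):
    """One multiplicative update of the weights w with the basis h held fixed."""
    ht = transpose(h)
    num = matmul(v, ht)                # V H^T
    den = matmul(w, matmul(h, ht))     # W (H H^T)
    return [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(len(h))]
            for i in range(len(v))]


def update_basis(v, w, h, eps=1e-9):
    """One multiplicative update of the basis h with the weights w held fixed."""
    wt = transpose(w)
    num = matmul(wt, v)                # W^T V
    den = matmul(matmul(wt, w), h)     # (W^T W) H
    return [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(len(v[0]))]
            for k in range(len(h))]


def nnmf(v, rank, iters=300, seed=0):
    """Approximate non-negative v (n x m) by w (n x rank) times h (rank x m)."""
    rng = random.Random(seed)
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in v]
    h = [[rng.random() + 0.1 for _ in v[0]] for _ in range(rank)]
    for _ in range(iters):
        h = update_basis(v, w, h)
        w = update_weights(v, w, h)
    return w, h


def fit_weights(x, h, iters=300):
    """Learn non-negative weights for one new vector x, keeping the basis h fixed."""
    w = [[1.0] * len(h)]
    for _ in range(iters):
        w = update_weights([x], w, h)
    return w[0]


def squared_error(v, w, h):
    approx = matmul(w, h)
    return sum((a - b) ** 2 for ra, rb in zip(v, approx) for a, b in zip(ra, rb))


# Stage 1: learn a rank-2 basis from a toy "general dataset" (one row per alignment).
general = [[1.0, 0.0, 1.0], [0.0, 2.0, 2.0], [3.0, 1.0, 4.0], [2.0, 2.0, 4.0]]
w, h = nnmf(general, rank=2)
print(squared_error(general, w, h))

# Stage 2: with a fixed basis, a new vector needs only `rank` weights.
toy_basis = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
print(fit_weights([2.0, 3.0, 5.0], toy_basis))  # approaches [2, 3]
```

The second stage estimates only as many parameters as there are basis vectors, which is the sense in which a small dataset can still support an alignment-specific model.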

Public Library of Science (PLOS)

Cape Town University OpenUCT

Crossref

Directory of Open Access Journals

PubMed Central

Stellenbosch University SUNScholar Repository

Preference Elicitation For Participatory Budgeting

Author: Benade Gerdus
Nath Swaprava
Procaccia Ariel
Shah Nisarg
Publication venue: Association for the Advancement of Artificial Intelligence
Publication date: 10/02/2017
Field of study

Participatory budgeting enables the allocation of public funds by collecting and aggregating individual preferences; it has already had a sizable real-world impact. But making the most of this new paradigm requires a rethinking of some of the basics of computational social choice, including the very way in which individuals express their preferences. We analytically compare four preference elicitation methods -- knapsack votes, rankings by value or value for money, and threshold approval votes -- through the lens of implicit utilitarian voting, and find that threshold approval votes are qualitatively superior. This conclusion is supported by experiments using data from real participatory budgeting elections.
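The four elicitation methods compared in this abstract can be illustrated by deriving each ballot from one voter's private utilities. The sketch below is a reading of the usual definitions, not code from the paper: a knapsack vote is the voter's best affordable set of projects, the two rankings order projects by utility or by utility per unit cost, and a threshold approval vote approves every project whose utility meets a given threshold. Function names, the brute-force search, and the sample numbers are assumptions made for illustration.

```python
from itertools import combinations


def knapsack_vote(utils, costs, budget):
    """Best affordable set of projects for this voter (brute force, small inputs only)."""
    best, best_value = (), 0.0
    for size in range(len(utils) + 1):
        for subset in combinations(range(len(utils)), size):
            if sum(costs[i] for i in subset) <= budget:
                value = sum(utils[i] for i in subset)
                if value > best_value:
                    best, best_value = subset, value
    return set(best)


def rank_by_value(utils):
    """Project indices ordered from highest to lowest utility."""
    return sorted(range(len(utils)), key=lambda i: -utils[i])


def rank_by_value_for_money(utils, costs):
    """Project indices ordered from highest to lowest utility per unit cost."""
    return sorted(range(len(utils)), key=lambda i: -utils[i] / costs[i])


def threshold_approval(utils, threshold):
    """Approve every project whose utility is at least the threshold."""
    return {i for i, u in enumerate(utils) if u >= threshold}


# One hypothetical voter, three projects, and a budget of 3.
utils = [0.6, 0.25, 0.15]
costs = [3, 1, 1]
print(knapsack_vote(utils, costs, budget=3))
print(rank_by_value(utils))
print(rank_by_value_for_money(utils, costs))
print(threshold_approval(utils, threshold=0.2))
```

The same utilities produce four different ballots, which is why the choice of elicitation method affects what an aggregation rule can infer about the voter.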

Association for the Advancement of Artificial Intelligence: AAAI Publications

Non-negative matrix factorization.

Author: Ben Murrell (151760)
Daniel Kaliski (336793)
Gerdus Benade (336791)
Jan Buys (336786)
Konrad Scheffler (49776)
Lise du Buisson (336792)
Robert Ketteringham (336789)
Sasha Moola (151768)
Thomas Weighill (151773)
Tristan Hands (336794)
Publication venue
Publication date: 20/02/2013
Field of study

Non-negative matrix factorization.

The Francis Crick Institute

scores for all models.

Author: Ben Murrell (151760)
Daniel Kaliski (336793)
Gerdus Benade (336791)
Jan Buys (336786)
Konrad Scheffler (49776)
Lise du Buisson (336792)
Robert Ketteringham (336789)
Sasha Moola (151768)
Thomas Weighill (151773)
Tristan Hands (336794)
Publication venue
Publication date
Field of study

Each table entry is the number of datasets with in that range. For any dataset, the best model has . A model with has essentially no support.

The Francis Crick Institute

NNMF basis matrices.

Author: Ben Murrell (151760)
Daniel Kaliski (336793)
Gerdus Benade (336791)
Jan Buys (336786)
Konrad Scheffler (49776)
Lise du Buisson (336792)
Robert Ketteringham (336789)
Sasha Moola (151768)
Thomas Weighill (151773)
Tristan Hands (336794)
Publication venue
Publication date
Field of study

The set of NNMF basis matrices obtained for ranks ranging from 1 to 5. Amino acids are ordered according to their Stanfel classification [25]. Rates are indicated in grayscale, with pure white being a rate of zero and pure black being the maximum rate in the matrix.

The Francis Crick Institute

for all models with gamma rate variation (4 categories).

Author: Ben Murrell (151760)
Daniel Kaliski (336793)
Gerdus Benade (336791)
Jan Buys (336786)
Konrad Scheffler (49776)
Lise du Buisson (336792)
Robert Ketteringham (336789)
Sasha Moola (151768)
Thomas Weighill (151773)
Tristan Hands (336794)
Publication venue
Publication date
Field of study

Each table entry is the number of datasets with in that range. For any dataset, the best model has . A model with has essentially no support.

The Francis Crick Institute

NNMF basis matrices correlate with amino acid properties.

Author: Ben Murrell (151760)
Daniel Kaliski (336793)
Gerdus Benade (336791)
Jan Buys (336786)
Konrad Scheffler (49776)
Lise du Buisson (336792)
Robert Ketteringham (336789)
Sasha Moola (151768)
Thomas Weighill (151773)
Tristan Hands (336794)
Publication venue
Publication date
Field of study

The correlations between amino acid properties and the basis matrices. The horizontal black line (at −0.16867) indicates the threshold for significant negative correlation (, one-tailed, ).

The Francis Crick Institute

Selecting the larger Pandit alignments.

Author: Ben Murrell (151760)
Daniel Kaliski (336793)
Gerdus Benade (336791)
Jan Buys (336786)
Konrad Scheffler (49776)
Lise du Buisson (336792)
Robert Ketteringham (336789)
Sasha Moola (151768)
Thomas Weighill (151773)
Tristan Hands (336794)
Publication venue
Publication date
Field of study

Each blue dot represents an alignment in the Pandit database. The green region covers the alignments used in the training set, and the thin red region covers those in the test set.

The Francis Crick Institute