Auditing: Active Learning with Outcome-Dependent Query Costs
We propose a learning setting in which unlabeled data is free, and the cost
of a label depends on its value, which is not known in advance. We study binary
classification in an extreme case, where the algorithm only pays for negative
labels. Our motivation comes from applications such as fraud detection, in which
investigating an honest transaction should be avoided if possible. We term the
setting auditing, and consider the auditing complexity of an algorithm: the
number of negative labels the algorithm requires in order to learn a hypothesis
with low relative error. We design auditing algorithms for simple hypothesis
classes (thresholds and rectangles), and show that with these algorithms, the
auditing complexity can be significantly lower than the active label
complexity. We also discuss a general competitive approach for auditing and
possible modifications to the framework.
Comment: Corrections in section
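To make the cost model concrete, here is a minimal sketch for thresholds on the line. It assumes noise-free labels in which every point at or above an unknown threshold is positive; the function names and the comparison with binary search are illustrative assumptions and are not the algorithms of the paper, which handles harder cases.

```python
from typing import Callable, List, Tuple


def audit_threshold(points: List[float],
                    label: Callable[[float], bool]) -> Tuple[float, int]:
    """Scan from the largest point down and stop at the first negative label.

    Positive labels are free in the auditing setting, so only the final
    (negative) query is paid for. Assumes noise-free threshold labels.
    """
    negatives_paid = 0
    threshold = float("inf")  # no positive point seen yet
    for x in sorted(points, reverse=True):
        if label(x):
            threshold = x  # smallest positive point seen so far
        else:
            negatives_paid += 1
            break
    return threshold, negatives_paid


def active_binary_search(points: List[float],
                         label: Callable[[float], bool]) -> Tuple[float, int, int]:
    """Classic active learning by binary search, counting all label queries."""
    xs = sorted(points)
    lo, hi = 0, len(xs)  # invariant: xs[:lo] are negative, xs[hi:] are positive
    queries = negatives = 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1
        if label(xs[mid]):
            hi = mid
        else:
            negatives += 1
            lo = mid + 1
    threshold = xs[lo] if lo < len(xs) else float("inf")
    return threshold, queries, negatives
```

In this toy case the auditing scan pays for at most one negative label regardless of the pool size, while binary search uses few queries overall but may land on several negatives along the way.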
Factorizing LambdaMART for cold start recommendations
Recommendation systems often rely on point-wise loss metrics such as the mean
squared error. However, in real recommendation settings only a few items are
presented to a user. This observation has recently encouraged the use of
rank-based metrics. LambdaMART is the state-of-the-art algorithm in learning to
rank which relies on such a metric. Despite its success, it lacks a
principled regularization mechanism and relies on empirical approaches to
control model complexity, leaving it prone to overfitting.
Motivated by the fact that very often the users' and items' descriptions as
well as the preference behavior can be well summarized by a small number of
hidden factors, we propose a novel algorithm, LambdaMART Matrix Factorization
(LambdaMART-MF), that learns a low rank latent representation of users and
items using gradient boosted trees. The algorithm factorizes LambdaMART by
defining relevance scores as the inner product of the learned representations
of the users and items. The low rank is essentially a model complexity
controller; on top of it we propose additional regularizers to constrain the
learned latent representations that reflect the user and item manifolds as
these are defined by their original feature based descriptors and the
preference behavior. Finally we also propose to use a weighted variant of NDCG
to reduce the penalty for similar items with large rating discrepancy.
We experiment on two very different recommendation datasets, meta-mining and
movies-users, and evaluate the performance of LambdaMART-MF, with and without
regularization, in the cold start setting as well as in the simpler matrix
completion setting. In both cases it significantly outperforms current
state-of-the-art algorithms.
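The two ingredients named in the abstract, relevance scores as inner products of latent vectors and a rank-based metric, can be sketched as follows. This is only an illustration: the latent vectors are hand-picked rather than learned by gradient boosted trees, the metric is the standard NDCG rather than the weighted variant the authors propose, and all function names are hypothetical.

```python
import math
from typing import List, Sequence


def relevance_scores(user_vec: Sequence[float],
                     item_vecs: Sequence[Sequence[float]]) -> List[float]:
    """Score each item as the inner product of user and item latent vectors."""
    return [sum(u * v for u, v in zip(user_vec, item)) for item in item_vecs]


def dcg_at_k(rels: Sequence[float], k: int) -> float:
    """Discounted cumulative gain with the common (2^rel - 1) gain."""
    return sum((2 ** r - 1) / math.log2(i + 2) for i, r in enumerate(rels[:k]))


def ndcg_at_k(scores: Sequence[float], true_rels: Sequence[float], k: int) -> float:
    """Standard NDCG@k of the ranking induced by `scores`."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked = [true_rels[i] for i in order]
    ideal = sorted(true_rels, reverse=True)
    idcg = dcg_at_k(ideal, k)
    return dcg_at_k(ranked, k) / idcg if idcg > 0 else 0.0
```

Because only the top of the ranking contributes much to NDCG, a model can score well here while having a poor mean squared error on items that would never be shown.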
Fast Differentially Private Matrix Factorization
Differentially private collaborative filtering is a challenging task, both in
terms of accuracy and speed. We present a simple algorithm that is provably
differentially private, while offering good performance, using a novel
connection of differential privacy to Bayesian posterior sampling via
Stochastic Gradient Langevin Dynamics. Due to its simplicity the algorithm
lends itself to efficient implementation. By careful systems design and by
exploiting the power law behavior of the data to maximize CPU cache bandwidth
we are able to generate 1024 dimensional models at a rate of 8.5 million
recommendations per second on a single PC.
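For readers unfamiliar with Stochastic Gradient Langevin Dynamics, the generic update is a half step along a minibatch estimate of the log-posterior gradient plus Gaussian noise whose variance equals the step size. The sketch below applies it to a one-dimensional toy problem (the mean of Gaussian data with unit variance and a Gaussian prior). The model, the step size, and the function name are assumptions for illustration; it is not the matrix factorization algorithm of the paper and carries no privacy guarantee.

```python
import random
from typing import List


def sgld_gaussian_mean(data: List[float],
                       n_steps: int = 5000,
                       batch_size: int = 10,
                       step: float = 1e-3,
                       prior_var: float = 100.0,
                       seed: int = 0) -> List[float]:
    """Draw approximate posterior samples of a Gaussian mean with SGLD.

    Assumed model: x_i ~ N(theta, 1), theta ~ N(0, prior_var).
    """
    rng = random.Random(seed)
    n = len(data)
    theta = 0.0
    samples = []
    for _ in range(n_steps):
        batch = rng.sample(data, batch_size)
        grad_prior = -theta / prior_var
        # Rescale the minibatch gradient so it estimates the full-data gradient.
        grad_lik = (n / batch_size) * sum(x - theta for x in batch)
        noise = rng.gauss(0.0, step ** 0.5)  # injected noise has variance = step
        theta += 0.5 * step * (grad_prior + grad_lik) + noise
        samples.append(theta)
    return samples
```

After a burn-in period the iterates scatter around the posterior mean instead of converging to a point, which is the sampling behavior that the abstract connects to differential privacy.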
Regularized fitted Q-iteration: application to planning
We consider planning in a Markovian decision problem, i.e., the problem of finding a good policy given access to a generative model of the environment. We propose to use fitted Q-iteration with penalized (or regularized) least-squares regression as the regression subroutine to address the problem of controlling model complexity. The algorithm is presented in detail for the case when the function space is a reproducing kernel Hilbert space underlying a user-chosen kernel function. We derive bounds on the quality of the solution and argue that data-dependent penalties can lead to almost optimal performance. A simple example is used to illustrate the benefits of using a penalized procedure.
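One iteration of the procedure described above can be written out as follows. The notation is assumed here and does not come from the abstract: $(x_i, a_i, r_i, x'_i)$ for $i = 1, \dots, N$ are transitions sampled from the generative model, $\gamma$ is the discount factor, $\mathcal{H}$ is the reproducing kernel Hilbert space, and $\lambda > 0$ is the penalty weight.

```latex
Q_{k+1} \;=\; \operatorname*{arg\,min}_{Q \in \mathcal{H}}
\; \frac{1}{N} \sum_{i=1}^{N}
\Bigl( r_i + \gamma \max_{a'} Q_k(x'_i, a') - Q(x_i, a_i) \Bigr)^{2}
\; + \; \lambda \, \lVert Q \rVert_{\mathcal{H}}^{2}
```

With a squared RKHS-norm penalty, the representer theorem implies that the minimizer is a finite kernel expansion over the sampled state-action pairs, so each iteration reduces to a kernel ridge regression onto the bootstrapped targets.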
Unravelling the Structure of Magnus' Pink Salt
A combination of multinuclear ultra-wideline solid-state NMR, powder X-ray diffraction (pXRD), X-ray absorption fine structure experiments, and first principles calculations of platinum magnetic shielding tensors has been employed to reveal the previously unknown crystal structure of Magnus’ pink salt (MPS), [Pt(NH3)4][PtCl4], study the isomeric Magnus’ green salt (MGS), [Pt(NH3)4][PtCl4], and examine their synthetic precursors K2PtCl4 and Pt(NH3)4Cl2·H2O. A simple synthesis of MPS is detailed which produces relatively pure product in good yield. Broad 195Pt, 14N, and 35Cl SSNMR powder patterns have been acquired using the WURST-CPMG and BRAIN-CP/WURST-CPMG pulse sequences. Experimentally measured and theoretically calculated platinum magnetic shielding tensors are shown to be very sensitive to the types and arrangements of coordinating ligands as well as intermolecular Pt–Pt metallophilic interactions. High-resolution 195Pt NMR spectra of select regions of the broad 195Pt powder patterns, in conjunction with an array of 14N and 35Cl spectra, reveal clear structural differences between all compounds. Rietveld refinements of synchrotron pXRD patterns, guided by first principles geometry optimization calculations, yield the space group, unit cell parameters, and atomic positions of MPS. The crystal structure has P-1 symmetry and resides in a pseudotetragonal unit cell with a distance of >5.5 Å between Pt sites in the square-planar Pt units. The long Pt–Pt distances and nonparallel orientation of Pt square planes prohibit metallophilic interactions within MPS. The combination of ultra-wideline NMR, pXRD, and computational methods offers much promise for future investigation and characterization of Pt-containing systems.
The Role of Friction in Compaction and Segregation of Granular Materials
We investigate the role of friction in compaction and segregation of granular
materials by combining Edwards' thermodynamic hypothesis with a simple
mechanical model and mean-field based geometrical calculations. Systems of
single species with large friction coefficients are found to compact less.
Binary mixtures of grains differing in frictional properties are found to
segregate at high compactivities, in contrast to granular mixtures differing in
size, which segregate at low compactivities. A phase diagram for segregation
vs. friction coefficients of the two species is generated. Finally, the
characteristics of segregation are related directly to the volume fraction
without the explicit use of the yet unclear notion of compactivity.
Comment: 9 pages, 6 figures, submitted to Phys. Rev.