Oblivious Median Slope Selection

Author: Thießen Thore
Vahrenhold Jan
Publication date: 07/07/2021

We study the median slope selection problem in the oblivious RAM model. In this model memory accesses have to be independent of the data processed, i.e., an adversary cannot use observed access patterns to derive additional information about the input. We show how to modify the randomized algorithm of Matoušek (1991) to obtain an oblivious version with O(n log^2 n) expected time for n points in R^2. This complexity matches a theoretical upper bound that can be obtained through general oblivious transformation. In addition, results from a proof-of-concept implementation show that our algorithm is also practically efficient. Comment: 14 pages, to appear in Proceedings of CCCG 202
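To make the problem statement concrete, the following is a minimal brute-force sketch of median slope selection. It is not the oblivious algorithm of the abstract and not Matoušek's randomized algorithm: it enumerates all O(n^2) pairwise slopes, and its memory accesses depend on the data. The function name `median_slope` is chosen here for illustration only.

```python
from itertools import combinations
from statistics import median


def median_slope(points):
    """Return the median slope over all pairs of points with distinct x.

    Brute-force reference: O(n^2) slopes, not oblivious, for illustration only.
    """
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(points, 2)
        if x1 != x2  # pairs on a vertical line have no finite slope
    ]
    if not slopes:
        raise ValueError("need at least two points with distinct x-coordinates")
    # statistics.median averages the two middle values when the count is even.
    return median(slopes)


if __name__ == "__main__":
    # Three points give the slopes 1.0, 2.0 and 3.0.
    print(median_slope([(0, 0), (1, 1), (2, 4)]))  # prints 2.0
```

A brute-force version like this is mainly useful as a correctness check for a faster implementation on small inputs.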

arXiv.org e-Print Archive

How to measure metallicity from five-band photometry with supervised machine learning algorithms

Author: Acquaviva Viviana
Publication venue: 'Oxford University Press (OUP)'
Publication date: 24/12/2015

We demonstrate that it is possible to measure metallicity from the SDSS five-band photometry to better than 0.1 dex using supervised machine learning algorithms. Using spectroscopic estimates of metallicity as ground truth, we build, optimize and train several estimators to predict metallicity. We use the observed photometry, as well as derived quantities such as stellar mass and photometric redshift, as features, and we build two sample data sets at median redshifts of 0.103 and 0.218 and median r-band magnitudes of 17.5 and 18.3, respectively. We find that ensemble methods, such as Random Forests of Trees and Extremely Randomized Trees, and Support Vector Machines all perform comparably well and can measure metallicity with a Root Mean Square Error (RMSE) of 0.081 and 0.090 for the two data sets when all objects are included. The fraction of outliers (objects for which |Z_true - Z_pred| > 0.2 dex) is 2.2% and 3.9%, respectively, and the RMSE decreases to 0.068 and 0.069 if those objects are excluded. Because of the ability of these algorithms to capture complex relationships between data and target, our technique performs better than previously proposed methods that sought to fit metallicity using an analytic fitting formula, and has 3x more constraining power than SED fitting-based methods. Additionally, this method is extremely forgiving of contamination in the training set, and can be used with very satisfactory results for training sample sizes of just a few hundred objects. We distribute all the routines to reproduce our results and apply them to other data sets. Comment: Minor revisions, matching version published in MNRA
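The evaluation quantities named in the abstract (RMSE, outlier fraction at 0.2 dex, and RMSE with outliers excluded) can be sketched as follows. This is an illustrative sketch only, not the authors' distributed routines; the function names and the toy values in the usage example are made up here.

```python
import math


def rmse(z_true, z_pred):
    """Root mean square error between true and predicted metallicities."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(z_true, z_pred)) / len(z_true))


def outlier_fraction(z_true, z_pred, threshold=0.2):
    """Fraction of objects with |Z_true - Z_pred| > threshold (in dex)."""
    n_out = sum(abs(t - p) > threshold for t, p in zip(z_true, z_pred))
    return n_out / len(z_true)


def rmse_without_outliers(z_true, z_pred, threshold=0.2):
    """RMSE computed only over objects that are not outliers."""
    kept = [(t, p) for t, p in zip(z_true, z_pred) if abs(t - p) <= threshold]
    if not kept:
        raise ValueError("all objects are outliers")
    return math.sqrt(sum((t - p) ** 2 for t, p in kept) / len(kept))


if __name__ == "__main__":
    # Made-up toy values, not data from the paper.
    z_true = [0.0, 0.0, 0.0, 0.0]
    z_pred = [0.1, -0.1, 0.1, 0.5]
    print(rmse(z_true, z_pred))
    print(outlier_fraction(z_true, z_pred))
    print(rmse_without_outliers(z_true, z_pred))
```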

arXiv.org e-Print Archive

City University of New York

Crossref

Approximate selective inference via maximum likelihood

Author: Panigrahi Snigdha
Taylor Jonathan
Publication date: 01/09/2020

This article considers a conditional approach to selective inference via approximate maximum likelihood for data described by Gaussian models. There are two important considerations in adopting a post-selection inferential perspective. One concerns the effective use of information in the data; the other concerns the computational cost of adjusting for selection. Our approximate proposal serves both purposes: it (i) exploits randomness to make efficient use of the information left over from selection, and (ii) enables us to bypass potentially expensive MCMC sampling from conditional distributions. At the core of our method is the solution to a convex optimization problem which assumes a separable form across multiple selection queries. This allows us to address the problem of tractable and efficient inference in many practical scenarios, where more than one learning query is conducted to define and perhaps redefine models and their corresponding parameters. Through an in-depth analysis, we illustrate the potential of our proposal and provide extensive comparisons with other post-selective schemes in both randomized and non-randomized paradigms of inference.

arXiv.org e-Print Archive

The role of mentorship in protege performance

Author: B Chapman
BR Ragins
CM Bishop
D Stauffer
GH Hardy
GT Chao
HF Moed
J King
JE Hirsch
Julio M. Ottino
KB Athreya
KE Kram
LL Paglis
Luís A. Nunes Amaral
M Enserink
MC Higgins
PE Bourne
R Singh
R. Dean Malmgren
RB D’Agostino
S Aryee
S Itzkovitz
SC Payne
SG Green
SI Donaldson
TA Scandura
TD Allen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 11/06/2010

The role of mentorship in protege performance is a matter of importance to academic, business, and governmental organizations. While the benefits of mentorship for proteges, mentors and their organizations are apparent, the extent to which proteges mimic their mentors' career choices and acquire their mentorship skills is unclear. Here, we investigate one aspect of mentor emulation by studying mentorship fecundity (the number of proteges a mentor trains) with data from the Mathematics Genealogy Project, which tracks the mentorship record of thousands of mathematicians over several centuries. We demonstrate that fecundity among academic mathematicians is correlated with other measures of academic success. We also find that the average fecundity of mentors remains stable over 60 years of recorded mentorship. We further uncover three significant correlations in mentorship fecundity. First, mentors with small mentorship fecundity train proteges that go on to have a 37% larger than expected mentorship fecundity. Second, in the first third of their career, mentors with large fecundity train proteges that go on to have a 29% larger than expected fecundity. Finally, in the last third of their career, mentors with large fecundity train proteges that go on to have a 31% smaller than expected fecundity. Comment: 23 pages double-spaced, 4 figure
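As an illustration of the central quantity, mentorship fecundity can be computed from a list of (mentor, protege) pairs by counting distinct proteges per mentor. This is a minimal sketch under that assumption about the data layout; the pair format and the names in the usage example are hypothetical and do not reflect the Mathematics Genealogy Project's actual schema.

```python
from collections import Counter


def fecundity(advising_pairs):
    """Map each mentor to the number of distinct proteges they trained.

    advising_pairs: iterable of (mentor, protege) tuples; repeated pairs
    are counted once.
    """
    return Counter(mentor for mentor, _protege in set(advising_pairs))


if __name__ == "__main__":
    # Hypothetical records; "A" appears twice with the same protege.
    pairs = [("A", "B"), ("A", "C"), ("B", "D"), ("A", "B")]
    counts = fecundity(pairs)
    print(counts["A"], counts["B"], counts["D"])  # prints 2 1 0
```

A `Counter` returns 0 for people who never appear as a mentor, which matches the definition for proteges who trained no one.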

arXiv.org e-Print Archive

Crossref

Risk Stratification in Post-MI Patients Based on Left Ventricular Ejection Fraction and Heart-Rate Turbulence

Author: Barthel Petra
Bigger Jr. J. Thomas
Malik M.
Rolnitzky Linda
Schmidt Georg
Schneider Raphael
Ulm Kurt
Publication date: 01/01/2003

Objectives: Development of risk stratification criteria for predicting mortality in post-infarction patients, taking into account LVEF and heart-rate turbulence (HRT). Methods: Based on previous results, the two parameters LVEF (continuously) and turbulence slope (TS), as an indicator of HRT, were combined for risk stratification. The method has been applied to two independent data sets (the MPIP trial and the EMIAT study). Results: The criteria were defined to match, in sensitivity, the outcome of applying LVEF ≤ 30 %. In the MPIP trial the optimal criteria selected are TS normal and LVEF ≤ 21 %, or TS abnormal and LVEF ≤ 40 %. Within the placebo group of the EMIAT study the corresponding criteria are TS normal and LVEF ≤ 23 %, or TS abnormal and LVEF ≤ 40 %. Combining both studies, the following criteria could be obtained: TS normal and LVEF ≤ 20 %, or TS abnormal and LVEF ≤ 40 %. In the MPIP study 83 out of the 581 patients (= 14.3 %) fulfil these criteria. Within this group 30 patients died during follow-up. In the EMIAT trial 218 out of the 591 patients (= 37.9 %) are classified as high-risk patients, with 53 deaths. Combining both studies, the high-risk group contains 301 patients with 83 deaths (ppv = 27.7 %). Using the MADIT criterion as classification rule (LVEF ≤ 30 %), a sample of 375 patients with 85 deaths (ppv = 24 %) can be selected. Conclusions: The stratification rule based on LVEF and TS is able to select high-risk patients suitable for implanting an ICD. The rule performs better than the classical one with LVEF alone. The high-risk group obtained with the new criteria is smaller, with about the same number of deaths, and therefore has a higher positive predictive value. The classification criteria have been validated within a bootstrap study with 100 replications. In all samples the rule based on TS and LVEF (= NEW) was superior to LVEF alone: the high-risk group was smaller (mean ± s: 301 ± 14.5 (NEW) vs. 375 ± 14.5 (LVEF)) and the positive predictive value was larger (mean ± s: 27.2 ± 2.6 % (NEW) vs. 23.3 ± 2.2 % (LVEF)). The new criteria are less expensive due to the reduced number of high-risk patients selected.
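The combined stratification rule and its positive predictive value can be sketched as below. This assumes that the comparison symbol lost in the abstract means "at most" (LVEF at most 20 % when TS is normal, at most 40 % when TS is abnormal); that reading is an assumption, as are the function names. The cohort in the usage example is made up and is not data from MPIP or EMIAT.

```python
def is_high_risk(lvef_percent, ts_abnormal,
                 cutoff_ts_normal=20.0, cutoff_ts_abnormal=40.0):
    """Apply the combined rule: the LVEF cutoff depends on the TS status.

    Assumes the rule is 'LVEF <= cutoff'; the operator is garbled in the source.
    """
    cutoff = cutoff_ts_abnormal if ts_abnormal else cutoff_ts_normal
    return lvef_percent <= cutoff


def positive_predictive_value(patients):
    """Deaths among selected patients divided by the number selected.

    patients: iterable of (lvef_percent, ts_abnormal, died) tuples.
    """
    selected = [died for lvef, ts, died in patients if is_high_risk(lvef, ts)]
    if not selected:
        raise ValueError("no patient fulfils the high-risk criteria")
    return sum(selected) / len(selected)


if __name__ == "__main__":
    # Made-up toy cohort: (LVEF in %, TS abnormal?, died during follow-up?)
    cohort = [
        (18, False, True),   # selected: TS normal, LVEF <= 20
        (35, False, False),  # not selected
        (38, True, False),   # selected: TS abnormal, LVEF <= 40
        (45, True, True),    # not selected
        (25, True, True),    # selected
    ]
    print(positive_predictive_value(cohort))  # 2 deaths among 3 selected
```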

Open Access LMU

Percolation Analysis of a Wiener Reconstruction of the IRAS 1.2 Jy Redshift Catalog

Author: Baugh C.
Fisher K. B.
Mecke K.
Moore B.
Shandarin S. F.
Shandarin S. F.
Shane C. D.
Zeldovich Ya. B.
Publication venue: 'University of Chicago Press'
Publication date: 09/05/1996

We present percolation analyses of Wiener Reconstructions of the IRAS 1.2 Jy Redshift Survey. There are ten reconstructions of galaxy density fields in real space spanning the range $\beta = 0.1$ to $1.0$, where $\beta = \Omega^{0.6}/b$, $\Omega$ is the present dimensionless density and $b$ is the bias factor. Our method uses the growth of the largest cluster statistic to characterize the topology of a density field, where Gaussian randomized versions of the reconstructions are used as standards for analysis. For the reconstruction volume of radius $R \approx 100\,h^{-1}$ Mpc, percolation analysis reveals a slight `meatball' topology for the real-space galaxy distribution of the IRAS survey. Subject headings: cosmology - galaxies: clustering - methods: numerical. Comment: Revised version accepted for publication in The Astrophysical Journal, January 10, 1997 issue, Vol.47
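A largest cluster statistic of the kind mentioned in the abstract can be illustrated on a small grid: threshold the density field, find connected clusters of cells above the threshold, and report the share of above-threshold cells that belong to the largest cluster. The sketch below is a 2D toy with 4-connectivity, which is a simplification chosen here; the paper's exact definition, dimensionality and neighbour convention are not given in the abstract, and the function name is hypothetical.

```python
from collections import deque


def largest_cluster_fraction(field, threshold):
    """Fraction of above-threshold cells that lie in the largest cluster.

    field: rectangular list of lists of densities (2D toy; 4-connectivity).
    Returns 0.0 when no cell reaches the threshold.
    """
    rows, cols = len(field), len(field[0])
    above = [[field[r][c] >= threshold for c in range(cols)] for r in range(rows)]
    total = sum(sum(row) for row in above)
    if total == 0:
        return 0.0
    seen = [[False] * cols for _ in range(rows)]
    largest = 0
    for r in range(rows):
        for c in range(cols):
            if not above[r][c] or seen[r][c]:
                continue
            # Breadth-first search over one connected cluster.
            seen[r][c] = True
            queue = deque([(r, c)])
            size = 0
            while queue:
                cr, cc = queue.popleft()
                size += 1
                for nr, nc in ((cr + 1, cc), (cr - 1, cc), (cr, cc + 1), (cr, cc - 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and above[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            largest = max(largest, size)
    return largest / total


if __name__ == "__main__":
    toy_field = [
        [5, 5, 0],
        [0, 0, 0],
        [5, 0, 5],
    ]
    # Lowering the threshold lets the clusters merge (percolate).
    for level in (6, 3, 0):
        print(level, largest_cluster_fraction(toy_field, level))
```

Tracking this fraction as the threshold is lowered, and comparing the curve against a Gaussian randomized field, is the general idea behind using the growth of the largest cluster as a topology diagnostic.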

arXiv.org e-Print Archive

Crossref

CERN Document Server