Toward Understanding Privileged Features Distillation in Learning-to-Rank
In learning-to-rank problems, a privileged feature is one that is available
during model training, but not available at test time. Such features naturally
arise in merchandised recommendation systems; for instance, "user clicked this
item" as a feature is predictive of "user purchased this item" in the offline
data, but is clearly not available during online serving. Another source of
privileged features is features that are too expensive to compute online but
feasible to add offline. Privileged features distillation (PFD) refers to
a natural idea: train a "teacher" model using all features (including
privileged ones) and then use it to train a "student" model that does not use
the privileged features.
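The teacher/student idea can be sketched on synthetic data with plain linear least squares. This is only an illustrative sketch under stated assumptions, not the paper's ranking setup: the data-generating model, the mixing weight `alpha`, the squared-error distillation loss, and the use of separate splits for teacher and student are all hypothetical choices made for the example.

```python
import random

def fit_teacher(xs, zs, ys):
    """Least-squares fit of y ~ a*x + b*z (no intercept) via the 2x2 normal equations."""
    sxx = sum(x * x for x in xs)
    szz = sum(z * z for z in zs)
    sxz = sum(x * z for x, z in zip(xs, zs))
    sxy = sum(x * y for x, y in zip(xs, ys))
    szy = sum(z * y for z, y in zip(zs, ys))
    det = sxx * szz - sxz * sxz
    a = (sxy * szz - szy * sxz) / det
    b = (szy * sxx - sxy * sxz) / det
    return a, b

def fit_student(xs, ys, teacher_preds, alpha):
    """Student y ~ w*x minimizing alpha*(y - w*x)^2 + (1 - alpha)*(t - w*x)^2.

    The minimizer is a least-squares fit to the blended target
    alpha*y + (1 - alpha)*t, where t is the teacher prediction.
    """
    targets = [alpha * y + (1 - alpha) * t for y, t in zip(ys, teacher_preds)]
    return sum(x * t for x, t in zip(xs, targets)) / sum(x * x for x in xs)

# Synthetic data (assumption): x is a regular feature, z is privileged.
rng = random.Random(0)
n = 400
xs = [rng.gauss(0, 1) for _ in range(n)]
zs = [rng.gauss(0, 1) for _ in range(n)]
ys = [1.0 * x + 0.5 * z + rng.gauss(0, 0.1) for x, z in zip(xs, zs)]

# Teacher sees both features on the first half of the data.
half = n // 2
a, b = fit_teacher(xs[:half], zs[:half], ys[:half])

# Student sees only x on the second half, plus the teacher's soft targets there.
teacher_preds = [a * x + b * z for x, z in zip(xs[half:], zs[half:])]
w_student = fit_student(xs[half:], ys[half:], teacher_preds, alpha=0.5)

print("teacher weights:", a, b)
print("student weight:", w_student)
```

At serving time only `w_student` and `x` are needed; the privileged feature `z` enters the student solely through the teacher's predictions during training.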
In this paper, we first study PFD empirically on three public ranking
datasets and an industrial-scale ranking problem derived from Amazon's logs. We
show that PFD outperforms several baselines (no-distillation,
pretraining-finetuning, self-distillation, and generalized distillation) on all
these datasets. Next, we analyze why and when PFD performs well via both
empirical ablation studies and theoretical analysis for linear models. Both
investigations uncover an interesting non-monotone behavior: as the predictive
power of a privileged feature increases, the performance of the resulting
student model initially increases but then decreases. We show the reason for
the later decrease in performance is that a very predictive privileged teacher
produces predictions with high variance, which leads to high-variance student
estimates and inferior testing performance. Comment: Accepted by NeurIPS 202
High performance bdd package by exploiting memory hierarchy
Abstract The success of binary decision diagram (BDD
Intrinsic electrochemical activity of single walled carbon nanotube–Nafion assemblies
The intrinsic electrochemical properties and activity of single walled carbon nanotube (SWNT) network electrodes modified by a drop-cast Nafion film have been determined using the one electron oxidation of ferrocene trimethyl ammonium (FcTMA+) as a model redox probe in the Nafion film. Facilitated by the very low transport coefficient of FcTMA+ in Nafion (apparent diffusion coefficient of 1.8 × 10⁻¹⁰ cm² s⁻¹), SWNTs in the 2-D network behave as individual elements, at short (practical) times, each with their own characteristic diffusion, independent of neighbouring sites, and the response is diagnostic of the proportion of SWNTs active in the composite. Data are analysed using candidate models for cases where: (i) electron transfer events only occur at discrete sites along the sidewall (with a defect density typical of chemical vapour deposition SWNTs); (ii) all of the SWNTs in a network are active. The first case predicts currents that are much smaller than seen experimentally, indicating that significant portions of SWNTs are active in the SWNT–Nafion composite. However, the predictions for a fully active SWNT result in higher currents than seen experimentally, indicating that a fraction of SWNTs are not connected and/or that not all SWNTs are wetted completely by the Nafion film to provide full access of the redox mediator to the SWNT surface
Exactness of Belief Propagation for Some Graphical Models with Loops
It is well known that an arbitrary graphical model of statistical inference
defined on a tree, i.e. on a graph without loops, is solved exactly and
efficiently by an iterative Belief Propagation (BP) algorithm that converges to
the unique minimum of the so-called Bethe free energy functional. For a general
graphical model on a loopy graph the functional may show multiple minima, the
iterative BP algorithm may converge to one of the minima or may not converge at
all, and the global minimum of the Bethe free energy functional is not
guaranteed to correspond to the optimal Maximum-Likelihood (ML) solution in the
zero-temperature limit. However, there are exceptions to this general rule,
discussed in \cite{05KW} and \cite{08BSS} in two different contexts, where the
zero-temperature version of the BP algorithm finds the ML solution for special
models on graphs with loops. These two models share a key feature: their ML
solutions can be found by an efficient Linear Programming (LP) algorithm with a
Totally-Uni-Modular (TUM) matrix of constraints. Generalizing the two models, we
consider a class of graphical models reducible in the zero-temperature limit to
LP with TUM constraints. Assuming that a gedanken algorithm, g-BP, finding the
global minimum of the Bethe free energy is available, we show that in the limit
of zero temperature g-BP outputs the ML solution. Our consideration is based on
an equivalence established between the gapless Linear Programming (LP) relaxation of
the graphical model in the limit and the respective LP version of the
Bethe free energy minimization. Comment: 12 pages, 1 figure, submitted to JSTA
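The TUM property mentioned in this abstract can be checked by brute force on small matrices: a matrix is totally unimodular when every square submatrix has determinant -1, 0, or 1, which guarantees that an LP with integral right-hand sides has integral vertices. The sketch below is illustrative only; the two example matrices (the node-edge incidence matrices of a small bipartite graph and of a triangle) are hypothetical and are not taken from the paper.

```python
from itertools import combinations

def det_int(m):
    """Integer determinant by cofactor expansion (adequate for tiny matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, v in enumerate(m[0]):
        if v == 0:
            continue
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * v * det_int(minor)
    return total

def is_totally_unimodular(a):
    """Return True if every square submatrix of a has determinant in {-1, 0, 1}."""
    n_rows, n_cols = len(a), len(a[0])
    for k in range(1, min(n_rows, n_cols) + 1):
        for rows in combinations(range(n_rows), k):
            for cols in combinations(range(n_cols), k):
                sub = [[a[i][j] for j in cols] for i in rows]
                if det_int(sub) not in (-1, 0, 1):
                    return False
    return True

# Node-edge incidence matrix of the bipartite graph K_{2,2}
# (rows: u1, u2, v1, v2; columns: edges u1v1, u1v2, u2v1, u2v2).
bipartite = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]

# Node-edge incidence matrix of a triangle (an odd cycle); its determinant is -2.
triangle = [
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
]

print(is_totally_unimodular(bipartite))  # True
print(is_totally_unimodular(triangle))   # False
```

The exhaustive check is exponential in the matrix size, so it is only meant to make the definition concrete on toy inputs.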
Whole brain radiotherapy for brain metastases from breast cancer: estimation of survival using two stratification systems
BACKGROUND: Brain metastases (BM) are the most common form of intracranial cancer. The incidence of BM seems to have increased over the past decade. Recursive partitioning analysis (RPA) of data from three Radiation Therapy Oncology Group (RTOG) trials (1200 patients) has allowed three prognostic groups to be identified. More recently, a simplified stratification system that uses the evaluation of three main prognostic factors for radiosurgery in BM was developed. METHODS: The aims were to analyze the overall survival rate (OS) and the prognostic factors affecting outcomes, and to estimate the potential improvement in OS for patients with BM from breast cancer, stratified by RPA class and brain metastases score (BS-BM). From January 1996 to December 2004, 174 medical records of patients with a diagnosis of BM from breast cancer who received whole brain radiotherapy (WBRT) were analyzed. Surgery followed by WBRT was used in 15.5% of patients, and the other 84.5% underwent WBRT alone; 108 patients (62.1%) received the fractionation schedule of 30 Gy in 10 fractions. Solitary BM was present in 37.9% of patients. The prognostic factors evaluated for OS were: age, Karnofsky Performance Status (KPS), number of lesions, localization of lesions, neurosurgery, chemotherapy, absence of extracranial disease, RPA class, BS-BM, and radiation doses and fractionation. RESULTS: The OS at 1, 2 and 3 years was 33.4%, 16.7%, and 8.8%, respectively. The RPA class analysis showed a strong relation with OS (p < 0.0001). The median survival time by RPA class in months was: class I 11.7, class II 6.2 and class III 3.0. The significant prognostic factors associated with better OS were: higher KPS (p < 0.0001), neurosurgery (p < 0.0001), single metastasis (p = 0.003), BS-BM (p < 0.0001), controlled primary tumor (p = 0.002) and absence of extracranial metastases (p = 0.001).
In multivariate analysis, the factors associated positively with OS were: neurosurgery (p < 0.0001), absence of extracranial metastases (p < 0.0001) and RPA class I (p < 0.0001). CONCLUSION: Our data suggest that patients with BM from breast cancer classified as RPA class I may be effectively treated with local resection followed by WBRT, mainly those patients with a single BM, higher KPS and controlled extracranial disease. RPA class was shown to be the most reliable indicator of survival
Detection of Repeating FRB 180916.J0158+65 Down to Frequencies of 300 MHz
We report on the detection of seven bursts from the periodically active,
repeating fast radio burst (FRB) source FRB 180916.J0158+65 in the 300-400-MHz
frequency range with the Green Bank Telescope (GBT). Emission in multiple
bursts is visible down to the bottom of the GBT band, suggesting that the
cutoff frequency (if it exists) for FRB emission is lower than 300 MHz.
Observations were conducted during predicted periods of activity of the source,
and had simultaneous coverage with the Low Frequency Array (LOFAR) and the FRB
backend on the Canadian Hydrogen Intensity Mapping Experiment (CHIME)
telescope. We find that one of the GBT-detected bursts has potentially
associated emission in the CHIME band (400-800 MHz) but we detect no bursts in
the LOFAR band (110-190 MHz), placing a limit of on the
spectral index of broadband emission from the source. We also find that
emission from the source is severely band-limited with burst bandwidths as low
as 40 MHz. In addition, we place the strictest constraint on observable
scattering of the source, 1.7 ms, at 350 MHz, suggesting that the
circumburst environment does not have strong scattering properties.
Additionally, knowing that the circumburst environment is optically thin to
free-free absorption at 300 MHz, we find evidence against the association of a
hyper-compact HII region or a young supernova remnant (age 50 yr) with the
source. Comment: Accepted for publication in ApJ
Localizing FRBs through VLBI with the Algonquin Radio Observatory 10 m Telescope
The Canadian Hydrogen Intensity Mapping Experiment (CHIME)/FRB experiment has detected thousands of fast radio bursts (FRBs) due to its sensitivity and wide field of view; however, its low angular resolution prevents it from localizing events to their host galaxies. Very long baseline interferometry (VLBI), triggered by FRB detections from CHIME/FRB, will solve the challenge of localization for non-repeating events. Using a refurbished 10 m radio dish at the Algonquin Radio Observatory located in Ontario, Canada, we developed a testbed for a VLBI experiment with a theoretical λ/D ≲ 30 mas. We provide an overview of the 10 m system and describe its refurbishment, the data acquisition, and a procedure for fringe fitting that simultaneously estimates the geometric delay used for localization and the dispersive delay from the ionosphere. Using single pulses from the Crab pulsar, we validate the system and localization procedure, and analyze the clock stability between sites, which is critical for coherently delay referencing an FRB event. We find a localization of ∼200 mas is possible with the performance of the current system (single-baseline). Furthermore, for sources with insufficient signal or restricted wideband to simultaneously measure both geometric and ionospheric delays, we show that the differential ionospheric contribution between the two sites must be measured to a precision of 1 × 10⁻⁸ pc cm⁻³ to provide a reasonable localization from a detection in the 400-800 MHz band. Finally, we show detection of an FRB observed simultaneously with CHIME and the Algonquin 10 m telescope, the first non-repeating FRB on this long baseline. This project serves as a testbed for the forthcoming CHIME/FRB Outriggers project
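To give a sense of scale for the ionospheric precision quoted above, the following sketch converts a dispersion-measure difference into a dispersive time delay. It assumes the standard cold-plasma dispersion relation, delay = K · DM / ν², with the commonly used constant K ≈ 4.148808 × 10³ s MHz² pc⁻¹ cm³; the function name and the choice of evaluation frequencies are hypothetical, and this is not the fringe-fitting procedure described in the abstract.

```python
# Dispersion constant in s MHz^2 pc^-1 cm^3 (standard cold-plasma value).
K_DM = 4.148808e3

def dispersive_delay_s(dm_pc_cm3, freq_mhz):
    """Dispersive delay in seconds for a given DM (pc cm^-3) at freq_mhz (MHz)."""
    return K_DM * dm_pc_cm3 / freq_mhz ** 2

# Differential DM precision quoted in the abstract.
dm_precision = 1e-8

# Evaluate at the edges of the 400-800 MHz band.
delay_400 = dispersive_delay_s(dm_precision, 400.0)
delay_800 = dispersive_delay_s(dm_precision, 800.0)

print(f"delay at 400 MHz: {delay_400:.3e} s")
print(f"delay at 800 MHz: {delay_800:.3e} s")
```

Under these assumptions the quoted precision corresponds to sub-nanosecond delays across the band, and the delay at 400 MHz is four times the delay at 800 MHz because of the ν⁻² scaling.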