Search CORE

59 research outputs found

Achieving Sample and Computational Efficient Reinforcement Learning by Action Space Reduction via Grouping

Author: Ju Peizhong
Li Yining
Shroff Ness
Publication date: 22/06/2023

Reinforcement learning often needs to deal with the exponential growth of states and actions when exploring optimal control in high-dimensional spaces (often known as the curse of dimensionality). In this work, we address this issue by learning the inherent structure of action-wise similar MDP to appropriately balance the performance degradation versus sample/computational complexity. In particular, we partition the action spaces into multiple groups based on the similarity in transition distribution and reward function, and build a linear decomposition model to capture the difference between the intra-group transition kernel and the intra-group rewards. Both our theoretical analysis and experiments reveal a surprising and counter-intuitive result: while a more refined grouping strategy can reduce the approximation error caused by treating actions in the same group as identical, it also leads to increased estimation error when the size of samples or the computation resources is limited. This finding highlights the grouping strategy as a new degree of freedom that can be optimized to minimize the overall performance loss. To address this issue, we formulate a general optimization problem for determining the optimal grouping strategy, which strikes a balance between performance loss and sample/computational complexity. We further propose a computationally efficient method for selecting a nearly-optimal grouping strategy, which maintains its computational complexity independent of the size of the action space.

arXiv.org e-Print Archive

Achieving Fairness in Multi-Agent Markov Decision Processes Using Reinforcement Learning

Author: Ghosh Arnob
Ju Peizhong
Shroff Ness B.
Publication date: 31/05/2023

Fairness plays a crucial role in various multi-agent systems (e.g., communication networks, financial markets, etc.). Many multi-agent dynamical interactions can be cast as Markov Decision Processes (MDPs). While existing research has focused on studying fairness in known environments, the exploration of fairness in such systems for unknown environments remains open. In this paper, we propose a Reinforcement Learning (RL) approach to achieve fairness in multi-agent finite-horizon episodic MDPs. Instead of maximizing the sum of individual agents' value functions, we introduce a fairness function that ensures equitable rewards across agents. Since the classical Bellman's equation does not hold when the sum of individual value functions is not maximized, we cannot use traditional approaches. Instead, in order to explore, we maintain a confidence bound of the unknown environment and then propose an online convex optimization based approach to obtain a policy constrained to this confidence region. We show that such an approach achieves sub-linear regret in terms of the number of episodes. Additionally, we provide a probably approximately correct (PAC) guarantee based on the obtained regret bound. We also propose an offline RL algorithm and bound the optimality gap with respect to the optimal fair solution. To mitigate computational complexity, we introduce a policy-gradient type method for the fair objective. Simulation experiments also demonstrate the efficacy of our approach

arXiv.org e-Print Archive

Theoretical Characterization of the Generalization Performance of Overfitted Meta-Learning

Author: Ju Peizhong
Liang Yingbin
Shroff Ness B.
Publication date: 09/04/2023

Meta-learning has arisen as a successful method for improving training performance by training over many similar tasks, especially with deep neural networks (DNNs). However, the theoretical understanding of when and why overparameterized models such as DNNs can generalize well in meta-learning is still limited. As an initial step towards addressing this challenge, this paper studies the generalization performance of overfitted meta-learning under a linear regression model with Gaussian features. In contrast to a few recent studies along the same line, our framework allows the number of model parameters to be arbitrarily larger than the number of features in the ground truth signal, and hence naturally captures the overparameterized regime in practical deep meta-learning. We show that the overfitted min $\ell_2$-norm solution of model-agnostic meta-learning (MAML) can be beneficial, which is similar to the recent remarkable findings on "benign overfitting" and "double descent" phenomenon in the classical (single-task) linear regression. However, due to the uniqueness of meta-learning such as task-specific gradient descent inner training and the diversity/fluctuation of the ground-truth signals among training tasks, we find new and interesting properties that do not exist in single-task linear regression. We first provide a high-probability upper bound (under reasonable tightness) on the generalization error, where certain terms decrease when the number of features increases. Our analysis suggests that benign overfitting is more significant and easier to observe when the noise and the diversity/fluctuation of the ground truth of each training task are large. Under this circumstance, we show that the overfitted min $\ell_2$-norm solution can achieve an even lower generalization error than the underparameterized solution.

arXiv.org e-Print Archive

Generalization Performance of Transfer Learning: Overparameterized and Underparameterized Regimes

Author: Ju Peizhong
Liang Yingbin
Lin Sen
Shroff Ness B.
Squillante Mark S.
Publication date: 08/06/2023

Transfer learning is a useful technique for achieving improved performance and reducing training costs by leveraging the knowledge gained from source tasks and applying it to target tasks. Assessing the effectiveness of transfer learning relies on understanding the similarity between the ground truth of the source and target tasks. In real-world applications, tasks often exhibit partial similarity, where certain aspects are similar while others are different or irrelevant. To investigate the impact of partial similarity on transfer learning performance, we focus on a linear regression model with two distinct sets of features: a common part shared across tasks and a task-specific part. Our study explores various types of transfer learning, encompassing two options for parameter transfer. By establishing a theoretical characterization on the error of the learned model, we compare these transfer learning options, particularly examining how generalization performance changes with the number of features/parameters in both underparameterized and overparameterized regimes. Furthermore, we provide practical guidelines for determining the number of features in the common and task-specific parts for improved generalization performance. For example, when the total number of features in the source task's learning model is fixed, we show that it is more advantageous to allocate a greater number of redundant features to the task-specific part rather than the common part. Moreover, in specific scenarios, particularly those characterized by high noise levels and small true parameters, sacrificing certain true features in the common part in favor of employing more redundant features in the task-specific part can yield notable benefits

arXiv.org e-Print Archive

The 2019 eruption of recurrent nova V3890 Sgr: Observations by Swift, NICER, and SMARTS

Author: Beardmore AP
Kuin NPM
Markwardt CB
Ness JU
Orio M
Osborne JP
Page KL
Sokolovsky KV
Walter FM
Publication date: 01/12/2020

V3890 Sgr is a recurrent nova that has been seen in outburst three times so far, with the most recent eruption occurring on 2019 August 27 UT. This latest outburst was followed in detail by the Neil Gehrels Swift Observatory, from less than a day after the eruption until the nova entered the Sun observing constraint, with a small number of additional observations after the constraint ended. The X-ray light curve shows initial hard shock emission, followed by an early start of the supersoft source phase around day 8.5, with the soft emission ceasing by day 26. Together with the peak blackbody temperature of the supersoft spectrum being ∼100 eV, these timings suggest the white dwarf mass to be high, ∼1.3 M⊙. The UV photometric light curve decays monotonically, with the decay rate changing a number of times, approximately simultaneously with variations in the X-ray emission. The UV grism spectra show both line and continuum emission, with emission lines of N, C, Mg, and O being notable. These UV spectra are best dereddened using a Small Magellanic Cloud extinction law. Optical spectra from SMARTS show evidence of interaction between the nova ejecta and wind from the donor star, as well as the extended atmosphere of the red giant being flash-ionized by the supersoft X-ray photons. Data from NICER reveal a transient 83 s quasi-periodic oscillation, with a modulation amplitude of 5 per cent, adding to the sample of novae that show such short variabilities during their supersoft phase.

UCL Discovery

A remarkable recurrent nova in M 31: The predicted 2014 outburst in X-rays with Swift

Author: Bode MF
Darnley MJ
Hachisu I
Henze M
Hernanz M
Kato M
Ness JU
Sala G
Shafter AW
Williams SC
Publication venue: 'EDP Sciences'

The M 31 nova M31N 2008-12a was recently found to be a recurrent nova (RN) with a recurrence time of about 1 year. This is by far the fastest recurrence time scale of any known RN. Our optical monitoring programme detected the predicted 2014 outburst of M31N 2008-12a in early October. We immediately initiated an X-ray/UV monitoring campaign with Swift to study the multiwavelength evolution of the outburst. We monitored M31N 2008-12a with daily Swift observations for 20 days after discovery, covering the entire supersoft X-ray source (SSS) phase. We detected SSS emission around day six after outburst. The SSS state lasted for approximately two weeks until about day 19. M31N 2008-12a was a bright X-ray source with a high blackbody temperature. The X-ray properties of this outburst were very similar to the 2013 eruption. Combined X-ray spectra show a fast rise and decline of the effective blackbody temperature. The short-term X-ray light curve showed strong, aperiodic variability which decreased significantly after about day 14. Overall, the X-ray properties of M31N 2008-12a are consistent with the average population properties of M 31 novae. The optical and X-ray light curves can be scaled uniformly to show similar time scales as those of the Galactic RNe U Sco or RS Oph. The SSS evolution time scales and effective temperatures are consistent with a high-mass WD. We predict the next outburst of M31N 2008-12a to occur in autumn 2015.

LJMU Research Online (Liverpool John Moores University)

Two uniquely arranged thyroid hormone response elements in the far upstream 5′ flanking region confer direct thyroid hormone regulation to the murine cholesterol 7α hydroxylase gene

Author: Dong-Ju Shin
Michelina Plateroti
Jacques Samarut
Timothy F. Osborne
Publication venue: Oxford University Press
Publication date: 01/01/2006

Cholesterol 7α hydroxylase (CYP7A1) is a key enzyme in cholesterol catabolism to bile acids and its activity is important for maintaining appropriate cholesterol levels. The murine CYP7A1 gene is highly inducible by thyroid hormone in vivo and there is an inverse relationship between thyroid hormone and serum cholesterol. Even though gene expression has been shown to be upregulated, whether the induction was mediated through a direct effect of thyroid hormone on the CYP7A1 promoter has never been established. Using gene targeted mice, we show that either of the two TR isoforms is sufficient to maintain normal hepatic CYP7A1 expression but a loss of both results in a significant decrease in expression. We also identified two new functional thyroid hormone receptor-binding sites in the CYP7A1 5′ flanking sequence located 3 kb upstream from the transcription start site. One site is a DR-0, which is an unusual type of TR response element, and the other consists of only a single recognizable half site that is required for TR/retinoid X receptor (RXR) binding. These two independent TR-binding sites are closely spaced and both are required for full induction of the CYP7A1 promoter by thyroid hormone, although the DR-0 site was more crucial.

HAL-ENS-LYON

Crossref

PubMed Central

eScholarship - University of California

ProdInra

X-Ray Spectroscopy of Stars

Author: Manuel Güdel
Yaël Nazé
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

(abridged) Non-degenerate stars of essentially all spectral classes are soft X-ray sources. Low-mass stars on the cooler part of the main sequence and their pre-main sequence predecessors define the dominant stellar population in the galaxy by number. Their X-ray spectra are reminiscent, in the broadest sense, of X-ray spectra from the solar corona. X-ray emission from cool stars is indeed ascribed to magnetically trapped hot gas analogous to the solar coronal plasma. Coronal structure, its thermal stratification and geometric extent can be interpreted based on various spectral diagnostics. New features have been identified in pre-main sequence stars; some of these may be related to accretion shocks on the stellar surface, fluorescence on circumstellar disks due to X-ray irradiation, or shock heating in stellar outflows. Massive, hot stars clearly dominate the interaction with the galactic interstellar medium: they are the main sources of ionizing radiation, mechanical energy and chemical enrichment in galaxies. High-energy emission makes it possible to probe some of the most important processes at work in these stars, and to put constraints on their most peculiar feature: the stellar wind. Here, we review recent advances in our understanding of cool and hot stars through the study of X-ray spectra, in particular high-resolution spectra now available from XMM-Newton and Chandra. We address issues related to coronal structure, flares, the composition of coronal plasma, X-ray production in accretion streams and outflows, X-rays from single OB-type stars, massive binaries, magnetic hot objects and evolved WR stars. Comment: accepted for Astron. Astrophys. Rev., 98 journal pages, 30 figures (partly multiple); some corrections made after proof stage.

arXiv.org e-Print Archive

Repository for Publications and Research Data

Crossref

Open Repository and Bibliography - Liège

Nova LMC 2009a as observed with XMM-Newton, compared with other novae

Author: Behar E
Bode MF
Dobrotka A
Henze M
Her S
Hernanz M
Ness JU
Orio M
Ospina N
Pei S
Pinto C
Sala G
Publication venue: 'Oxford University Press (OUP)'

We examine four high-resolution Reflection Grating Spectrometer (RGS) spectra of the February 2009 outburst of the luminous recurrent nova LMC 2009a. They were very complex and rich in intricate absorption and emission features. The continuum was consistent with a dominant component originating in the atmosphere of a shell burning white dwarf (WD) with peak effective temperature between 810 000 K and a million K, and mass in the 1.2-1.4 M⊙ range. A moderate blue shift of the absorption features of a few hundred km s⁻¹ can be explained with a residual nova wind depleting the WD surface at a rate of about 10⁻⁸ M⊙ yr⁻¹. The emission spectrum seems to be due to both photoionization and shock ionization in the ejecta. The supersoft X-ray flux was irregularly variable on time-scales of hours, with decreasing amplitude of the variability. We find that both the period and the amplitude of another, already known 33.3-s modulation varied within time-scales of hours. We compared N LMC 2009a with other Magellanic Clouds novae, including four serendipitously discovered as supersoft X-ray sources (SSS) among 13 observed within 16 yr after the eruption. The newly detected targets were much less luminous than expected: we suggest that they were partially obscured by the accretion disc. Lack of SSS detections in the Magellanic Clouds novae more than 5.5 yr after the eruption constrains the average duration of the nuclear burning phase.

LJMU Research Online (Liverpool John Moores University)