On time, frequency, and polar motion Quarterly reports, 1 Jan. - 30 Jun. 1969

Author: Markowitz W.

Sudden changes in Earth rotational acceleration and polar secular motion

NASA Technical Reports Server

Clipped-Objective Policy Gradients for Pessimistic Policy Optimization

Author: Markowitz Jared
Staley Edward W.
Publication date: 09/11/2023

To facilitate efficient learning, policy gradient approaches to deep reinforcement learning (RL) are typically paired with variance reduction measures and strategies for making large but safe policy changes based on a batch of experiences. Natural policy gradient methods, including Trust Region Policy Optimization (TRPO), seek to produce monotonic improvement through bounded changes in policy outputs. Proximal Policy Optimization (PPO) is a commonly used, first-order algorithm that instead uses loss clipping to take multiple safe optimization steps per batch of data, replacing the bound on the single step of TRPO with regularization on multiple steps. In this work, we find that the performance of PPO, when applied to continuous action spaces, may be consistently improved through a simple change in objective. Instead of the importance sampling objective of PPO, we recommend a basic policy gradient, clipped in an equivalent fashion. While both objectives produce biased gradient estimates with respect to the RL objective, they also both display significantly reduced variance compared to the unbiased off-policy policy gradient. Additionally, we show that (1) the clipped-objective policy gradient (COPG) objective is on average "pessimistic" compared to the PPO objective and (2) this pessimism promotes enhanced exploration. As a result, we empirically observe that COPG produces improved learning compared to PPO in single-task, constrained, and multi-task learning, without adding significant computational cost or complexity. Compared to TRPO, the COPG approach is seen to offer comparable or superior performance, while retaining the simplicity of a first-order method. Comment: 12 pages, 8 figures
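The contrast drawn in this abstract between the PPO importance sampling objective and "a basic policy gradient, clipped in an equivalent fashion" can be illustrated per sample. The sketch below is an assumption about the form of the clipped objective, not something stated in the abstract: it clips the log-probability at the logarithm of the usual PPO ratio bounds. The function names, the `eps` default, and the scalar interface are all hypothetical.

```python
import math

def ppo_objective(logp_new, logp_old, advantage, eps=0.2):
    """Standard PPO clipped surrogate for one sample: min(r*A, clip(r)*A)."""
    ratio = math.exp(logp_new - logp_old)
    clipped_ratio = min(max(ratio, 1.0 - eps), 1.0 + eps)
    return min(ratio * advantage, clipped_ratio * advantage)

def copg_objective(logp_new, logp_old, advantage, eps=0.2):
    """Assumed COPG-style surrogate: log-probability times advantage,
    with the log-probability clipped to the log of the PPO ratio bounds."""
    lower = logp_old + math.log(1.0 - eps)   # log(pi_old * (1 - eps))
    upper = logp_old + math.log(1.0 + eps)   # log(pi_old * (1 + eps))
    clipped_logp = min(max(logp_new, lower), upper)
    return min(logp_new * advantage, clipped_logp * advantage)

# One sample where the new policy doubles the action probability.
sample = dict(logp_new=math.log(2.0), logp_old=0.0, advantage=1.0)
print(ppo_objective(**sample), copg_objective(**sample))
```

In both functions the clipped branch is constant in the new policy's parameters, so a sample whose ratio has left the interval in the direction favored by its advantage contributes no gradient.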

arXiv.org e-Print Archive

SDT: A Database Schema Design and Translation Tool Reference Manual Draft 4.1

Author: Fang W
Markowitz V M
Publication venue: eScholarship, University of California
Publication date: 01/05/1991

eScholarship - University of California

X-ray vs. Optical Variations in the Seyfert 1 Nucleus NGC 3516: A Puzzling Disconnectedness

Author: Alex Markowitz
Dan Maoz
Done C.
Guilbert P. W.
Kirpal Nandra
Rick Edelson
Timmer J.
Publication venue: University of Chicago Press
Publication date: 21/07/2002

We present optical broadband (B and R) observations of the Seyfert 1 nucleus NGC 3516, obtained at Wise Observatory from March 1997 to March 2002, contemporaneously with X-ray 2-10 keV measurements with RXTE. With these data we increase the temporal baseline of this dataset to 5 years, more than triple the coverage we have previously presented for this object. Analysis of the new data does not confirm the 100-day lag of X-ray behind optical variations, tentatively reported in our previous work. Indeed, excluding the first year's data, which drive the previous result, there is no significant correlation at any lag between the X-ray and optical bands. We also find no correlation at any lag between optical flux and various X-ray hardness ratios. We conclude that the close relation observed between the bands during the first year of our program was either a fluke, or perhaps the result of the exceptionally bright state of NGC 3516 in 1997, to which it has yet to return. Reviewing the results of published joint X-ray and UV/optical Seyfert monitoring programs, we speculate that there are at least two components or mechanisms contributing to the X-ray continuum emission up to 10 keV: a soft component that is correlated with UV/optical variations on timescales >1 day, and whose presence can be detected when the source is observed at low enough energies (about 1 keV), is unabsorbed, or is in a sufficiently bright phase; and a hard component whose variations are uncorrelated with the UV/optical. Comment: 9 pages, AJ, in press

arXiv.org e-Print Archive

Crossref

CERN Document Server

Multiscaled Cross-Correlation Dynamics in Financial Time-Series

Author: Borghesi C.
Bouchaud J. P.
Elton E. J.
Epps T. W.
Gençay R.
Guhr T.
Markowitz H.
Sharpe W. F.
Tóth B.
Publication venue: World Scientific Pub Co Pte Ltd
Publication date: 01/01/2008

The cross-correlation matrix between equities comprises multiple interactions between traders with varying strategies and time horizons. In this paper, we use the Maximum Overlap Discrete Wavelet Transform to calculate correlation matrices over different timescales and then explore the eigenvalue spectrum over sliding time windows. The dynamics of the eigenvalue spectrum at different times and scales provide insight into the interactions between the numerous constituents involved. Eigenvalue dynamics are examined for both medium- and high-frequency equity returns, with the associated correlation structure shown to be dependent on both time and scale. Additionally, the Epps effect is established using this multivariate method and analyzed at longer scales than previously studied. A partition of the eigenvalue time-series demonstrates, at very short scales, the emergence of negative returns when the largest eigenvalue is greatest. Finally, a portfolio optimization shows the importance of timescale information in the context of risk management.
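The scale-dependent correlation described in this abstract can be sketched for the smallest possible case of two series. The code below is only an illustration under stated assumptions: it uses the Haar wavelet with circular boundary handling (the abstract does not name a wavelet filter), it omits the sliding time windows, and it relies on the fact that a 2x2 correlation matrix with off-diagonal entry rho has eigenvalues 1 + rho and 1 - rho. All function names are hypothetical.

```python
import math

def haar_modwt_detail(x, level):
    """Level-`level` Haar MODWT detail coefficients (circular boundary).
    Each coefficient is the mean of the most recent 2**(level-1) samples
    minus the mean of the preceding 2**(level-1) samples, divided by 2."""
    n = len(x)
    width = 2 ** level          # filter length; should not exceed len(x)
    half = width // 2
    coeffs = []
    for t in range(n):
        recent = sum(x[(t - lag) % n] for lag in range(half))
        older = sum(x[(t - lag) % n] for lag in range(half, width))
        coeffs.append((recent - older) / width)
    return coeffs

def correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

def largest_eigenvalue_two_assets(x, y, level):
    """Largest eigenvalue of the 2x2 wavelet correlation matrix at one scale."""
    rho = correlation(haar_modwt_detail(x, level), haar_modwt_detail(y, level))
    return 1.0 + abs(rho)
```

For more than two series, the same coefficients would feed a full correlation matrix and a general symmetric eigenvalue solver, applied window by window.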

arXiv.org e-Print Archive

CiteSeerX

Crossref

Irish Universities

DCU Online Research Access Service

Structural Levels of Mental Illness Stigma and Discrimination

Author: Corrigan Patrick W.
Markowitz Fred E.
Watson Amy
Publication venue: Oxford University Press and Maryland Psychiatric Research Center
Publication date: 01/01/2004

Most of the models that currently describe processes related to mental illness stigma are based on individual-level psychological paradigms. In this article, using a sociological paradigm, we apply the concepts of structural discrimination to broaden our understanding of stigmatizing processes directed at people with mental illness. Structural, or institutional, discrimination includes the policies of private and governmental institutions that intentionally restrict the opportunities of people with mental illness. It also includes major institutions' policies that are not intended to discriminate but whose consequences nevertheless hinder the options of people with mental illness. After more fully defining intentional and unintentional forms of structural discrimination, we provide current examples of each. Then we discuss the implications of structural models for advancing our understanding of mental illness stigma, including the methodological challenges posed by this paradigm.

Huskie Commons

CiteSeerX

Study of stability and control moment gyro wobble damping of flexible, spinning space stations

Author: Berman H.
Holmer W.
Markowitz J.

An executive summary and an analysis of the results are discussed. A user's guide for the digital computer program that simulates the flexible, spinning space station is presented. Control analysis activities and derivation of dynamic equations of motion and the modal analysis are also cited.

NASA Technical Reports Server

A Suzaku, NuSTAR, and XMM-Newton view on variable absorption and relativistic reflection in NGC 4151

Author: Beuchert T.
Brenneman L. W.
Dauser T.
García J. A.
Kadler M.
Keck M. L.
Markowitz A. G.
Wilms J.
Zdziarski A. A.
Publication venue: EDP Sciences
Publication date: 31/03/2017

We disentangle X-ray disk reflection from complex line-of-sight absorption in the nearby Seyfert NGC 4151, using a suite of Suzaku, NuSTAR, and XMM-Newton observations. Extending upon earlier published work, we pursue a physically motivated model using the latest angle-resolved version of the lamp-post geometry reflection model relxillCp_lp together with a Comptonization continuum. We use the long-look simultaneous Suzaku/NuSTAR observation to develop a baseline model wherein we model reflected emission as a combination of lamp-post components at the heights of 1.2 and 15.0 gravitational radii. We argue for a vertically extended corona as opposed to two compact and distinct primary sources. We find two neutral absorbers (one full-covering and one partial-covering), an ionized absorber ($\log \xi = 2.8$), and a highly-ionized ultra-fast outflow, which have all been reported previously. All analyzed spectra are well described by this baseline model. The bulk of the spectral variability between 1 keV and 6 keV can be accounted for by changes in the column density of both neutral absorbers, which appear to be degenerate and inversely correlated with the variable hard continuum component flux. We track variability in absorption on both short (2 d) and long ($\sim$1 yr) timescales; the observed evolution is either consistent with changes in the absorber structure (clumpy absorber at distances ranging from the broad line region (BLR) to the inner torus or a dusty radiatively driven wind) or a geometrically stable neutral absorber that becomes increasingly ionized at a rising flux level. The soft X-rays below 1 keV are dominated by photoionized emission from extended gas that may act as a warm mirror for the nuclear radiation. Comment: 21 pages, 19 figures, 8 tables, accepted for publication by A&A

arXiv.org e-Print Archive

EDP Sciences OAI-PMH repository (1.2.0)

Caltech Authors

Managing Risk of Bidding in Display Advertising

Author: Amin K.
Crouhy M.
Elton E. J.
Graepel T.
Hull J.
Markowitz H.
Mun J.
Muthukrishnan S.
Sharpe W. F.
Wang X.
Wasserman L.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 20/01/2017

In this paper, we deal with the uncertainty of bidding for display advertising. Similar to financial market trading, real-time bidding (RTB) based display advertising employs an auction mechanism to automate impression-level media buying; and running a campaign is no different from an investment in acquiring new customers in return for additional converted sales. Thus, how to optimally bid on an ad impression to drive profit and return-on-investment becomes essential. However, the high randomness of user behavior and the cost uncertainty caused by auction competition may result in significant risk in campaign performance estimation. In this paper, we explicitly model the uncertainty of user click-through rate estimation and auction competition to capture the risk. We borrow an idea from finance and derive the value at risk for each ad display opportunity. Our formulation results in two risk-aware bidding strategies that penalize risky ad impressions and focus more on the ones with higher expected return and lower risk. The empirical study on real-world data demonstrates the effectiveness of our proposed risk-aware bidding strategies: yielding profit gains of 15.4% in offline experiments and up to 17.5% in an online A/B test on a commercial RTB platform over the widely applied bidding strategies.
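The idea of penalizing risky ad impressions can be illustrated with a simple value-at-risk-style rule. This is a hypothetical sketch, not either of the two strategies from the paper: it assumes the click-through rate (CTR) estimate comes with a mean and a standard deviation, and bids on a pessimistic CTR that sits `alpha` standard deviations below the mean. The function name, parameters, and numbers are invented for illustration.

```python
def risk_aware_bid(value_per_click, ctr_mean, ctr_std, alpha=1.0):
    """Hypothetical risk-penalized bid for one impression.

    value_per_click: what a click is worth to the advertiser
    ctr_mean, ctr_std: mean and standard deviation of the CTR estimate
    alpha: risk aversion; alpha = 0 gives the risk-neutral bid
    """
    # Pessimistic CTR, floored at zero so the bid never goes negative.
    pessimistic_ctr = max(ctr_mean - alpha * ctr_std, 0.0)
    return value_per_click * pessimistic_ctr

# Two impressions with the same expected CTR but different uncertainty.
confident = risk_aware_bid(100.0, ctr_mean=0.02, ctr_std=0.002)
uncertain = risk_aware_bid(100.0, ctr_mean=0.02, ctr_std=0.010)
print(confident, uncertain)  # the more uncertain impression gets the lower bid
```

Raising `alpha` shifts spend toward impressions whose CTR estimates are better supported by data, at the cost of bidding less overall.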

arXiv.org e-Print Archive

Crossref