2,714 research outputs found

    Learning policies for Markov decision processes from data

    We consider the problem of learning a policy for a Markov decision process consistent with data captured on the state-action pairs followed by the policy. We assume that the policy belongs to a class of parameterized policies defined using features associated with the state-action pairs. The features are known a priori; however, only an unknown subset of them may be relevant. The policy parameters that correspond to an observed target policy are recovered using ℓ1-regularized logistic regression that best fits the observed state-action samples. We establish bounds on the difference between the average reward of the estimated and the original policy (regret) in terms of the generalization error and the ergodic coefficient of the underlying Markov chain. To that end, we combine sample complexity theory and sensitivity analysis of the stationary distribution of Markov chains. Our analysis suggests that to achieve regret within order O(√ε), it suffices to use training sample size on the order of Ω(log n · poly(1/ε)), where n is the number of features. We demonstrate the effectiveness of our method on a synthetic robot navigation example.
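
As a rough illustration of the ℓ1-regularized logistic regression step described in this abstract, the following minimal sketch fits a sparse parameter vector to synthetic state-action samples. Everything in it is an assumption made for illustration and not taken from the paper: the two-action setting, the uniformly sampled features, the hypothetical names `fit_l1_logistic` and `true_theta`, and the use of proximal gradient descent (ISTA) as the solver.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    # Proximal operator of t * |v|; it sets small coefficients exactly to zero.
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_l1_logistic(X, y, lam=0.05, step=0.5, iters=300):
    # Proximal gradient descent on: mean logistic loss + lam * ||theta||_1.
    m, n = len(X), len(X[0])
    theta = [0.0] * n
    for _ in range(iters):
        grad = [0.0] * n
        for x, label in zip(X, y):
            p = sigmoid(sum(t * xi for t, xi in zip(theta, x)))
            for j in range(n):
                grad[j] += (p - label) * x[j] / m
        theta = [soft_threshold(t - step * g, step * lam)
                 for t, g in zip(theta, grad)]
    return theta


# Synthetic data (assumed): 6 features per state, only the first two are relevant.
random.seed(0)
true_theta = [3.0, -3.0, 0.0, 0.0, 0.0, 0.0]
X, y = [], []
for _ in range(500):
    x = [random.uniform(-1.0, 1.0) for _ in true_theta]
    p_action = sigmoid(sum(t * xi for t, xi in zip(true_theta, x)))
    X.append(x)
    y.append(1 if random.random() < p_action else 0)  # observed action

theta_hat = fit_l1_logistic(X, y)
print([round(t, 2) for t in theta_hat])
```

The ℓ1 penalty shrinks all coefficients, so the recovered values for the relevant features are smaller in magnitude than `true_theta`, while coefficients of irrelevant features are driven toward zero.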

    Chinese social media reaction to the MERS-CoV and avian influenza A(H7N9) outbreaks

    BACKGROUND: As internet and social media use have skyrocketed, epidemiologists have begun to use online data such as Google query data and Twitter trends to track the activity levels of influenza and other infectious diseases. In China, Weibo is an extremely popular microblogging site that is equivalent to Twitter. Capitalizing on the wealth of public opinion data contained in posts on Weibo, this study used Weibo as a measure of the Chinese people's reactions to two different outbreaks: the 2012 Middle East Respiratory Syndrome Coronavirus (MERS-CoV) outbreak, and the 2013 outbreak of human infection with avian influenza A(H7N9) in China. METHODS: Keyword searches were performed in Weibo data collected by The University of Hong Kong's Weiboscope project. Baseline values were determined for each keyword, and reaction values per million posts were calculated for the days after outbreak information was released to the public. RESULTS: The results show that the Chinese people reacted significantly to both outbreaks online, and that their social media reaction was two orders of magnitude stronger to the H7N9 influenza outbreak that happened in China than to the MERS-CoV outbreak that was far away from China. CONCLUSIONS: These results demonstrate that social media could be a useful measure of public awareness and reaction to disease outbreak information released by health authorities.

    Controlling light-with-light without nonlinearity

    According to Huygens' superposition principle, light beams traveling in a linear medium will pass through one another without mutual disturbance. Indeed, it is widely held that controlling light signals with light requires intense laser fields to facilitate beam interactions in nonlinear media, where the superposition principle can be broken. We demonstrate here that two coherent beams of light of arbitrarily low intensity can interact on a metamaterial layer of nanoscale thickness in such a way that one beam modulates the intensity of the other. We show that the interference of beams can eliminate the plasmonic Joule losses of light energy in the metamaterial or, in contrast, can lead to almost total absorption of light. Applications of this phenomenon may lie in ultrafast all-optical pulse-recovery devices, coherence filters and THz-bandwidth light-by-light modulators.
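
The interference effect described in this abstract can be illustrated with a toy linear model. The sketch below is an idealization and not the authors' metamaterial model: it assumes an infinitesimally thin absorbing film with amplitude reflection coefficient r = -1/2 and transmission coefficient t = 1/2 (so that a single beam loses half its power), illuminated from both sides by equal-amplitude coherent beams. The function name `absorbed_fraction` and the phase convention are hypothetical choices.

```python
import cmath
import math


def absorbed_fraction(phase, r=-0.5, t=0.5):
    # Two counter-propagating beams of unit amplitude hit the film from
    # opposite sides; `phase` is their relative phase at the film (assumed
    # convention). Each output is a transmitted plus a reflected contribution.
    beam_a = 1.0
    beam_b = cmath.exp(1j * phase)
    out_side_1 = t * beam_a + r * beam_b
    out_side_2 = r * beam_a + t * beam_b
    power_in = abs(beam_a) ** 2 + abs(beam_b) ** 2
    power_out = abs(out_side_1) ** 2 + abs(out_side_2) ** 2
    return 1.0 - power_out / power_in


# Sweep the relative phase: absorption swings between total and zero even
# though the film response is entirely linear.
for phase in (0.0, math.pi / 2, math.pi):
    print(round(phase, 3), round(absorbed_fraction(phase), 3))
```

In this idealized model the absorbed fraction is (1 + cos φ)/2, so shifting the phase of one beam switches the film between full absorption and full transparency for the other, without any nonlinearity.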

    The Search for Other Planets and Life

    This Les Houches School offers students a wide-ranging view of the field of exoplanets and the search for life beyond the solar system. Observational and theoretical opportunities abound in a new field of astronomy that will be growing for decades to come. I give a brief introduction to, and overview of, the many detailed talks presented in this volume.

    Dopamine Responsiveness in the Nucl. Accumbens Shell and Parameters of the Heroin-Influenced Conditioned Place Preference in Rats

    Previous evidence demonstrated that drug-induced extracellular dopamine (DA) concentrations in the nucl. accumbens shell (AcbSh) might underlie different vulnerabilities to heroin addiction in inbred mouse strains. We investigated a potential role of the responsiveness of the DA system in the AcbSh with respect to the vulnerability to heroin-influenced conditioned place preference (CPP) in Sprague-Dawley rats. Animals were randomly assigned to the heroin and saline (control) groups. Heroin-group rats were then re-classified into two groups according to the degree of heroin-induced CPP: high-preference (HP) and low-preference (LP). The levels of extracellular DA and dihydroxyphenylacetic acid (DOPAC) were estimated dynamically by in vivo microdialysis. Compared with the saline group, extracellular DA and DOPAC concentrations in the heroin-treated groups were significantly higher 30 min after the last injection, but the DA level decreased sharply in these groups on days 1 and 3 and became lower than that of the saline group. Compared with LP-group rats, HP rats displayed a higher heroin-induced increase in the DA concentration 30 min after the last heroin injection and higher DOPAC levels and DOPAC/DA ratios 14 days after such injection. These results suggest that differences in the DA system responsiveness in the AcbSh may determine individual differences in vulnerability to heroin addiction.

    Nanocomposites of polymer and inorganic nanoparticles for optical and magnetic applications

    This article provides an up-to-date review of nanocomposites composed of inorganic nanoparticles and a polymer matrix for optical and magnetic applications. Optical or magnetic characteristics can change when particle sizes decrease to very small dimensions, and such changes are, in general, of major interest in the area of nanocomposite materials. Incorporating inorganic nanoparticles into the polymer matrix can provide high-performance novel materials that find applications in many industrial fields. In this respect, frequently considered features are optical properties such as light absorption (UV and color) and the extent of light scattering or, in the case of metal particles, photoluminescence, dichroism, and so on, and magnetic properties such as superparamagnetism, electromagnetic wave absorption, and electromagnetic interference shielding. A general introduction, definition, and historical development of polymer–inorganic nanocomposites as well as a comprehensive review of synthetic techniques for polymer–inorganic nanocomposites are given. Future possibilities for the development of nanocomposites for optical and magnetic applications are also introduced. It is expected that the use of new functional inorganic nano-fillers will lead to new polymer–inorganic nanocomposites with unique combinations of material properties. By careful selection of synthetic techniques and by understanding and exploiting the unique physics of the polymeric nanocomposites in such materials, novel functional polymer–inorganic nanocomposites can be designed and fabricated for new applications such as optoelectronic and magneto-optic devices.