59 research outputs found

    Thompson sampling based Monte-Carlo planning in POMDPs

    No full text
    Monte-Carlo tree search (MCTS) has been drawing great interest in recent years for planning under uncertainty. One of the key challenges is the trade-off between exploration and exploitation. To address this, we introduce a novel online planning algorithm for large POMDPs using Thompson sampling based MCTS that balances between cumulative and simple regrets. The proposed algorithm, Dirichlet-Dirichlet-NormalGamma based Partially Observable Monte-Carlo Planning (D2NG-POMCP), treats the accumulated reward of performing an action from a belief state in the MCTS search tree as a random variable following an unknown distribution with hidden parameters. A Bayesian method is used to model and infer the posterior distribution of these parameters by choosing the conjugate prior in the form of a combination of two Dirichlet and one NormalGamma distributions. Thompson sampling is exploited to guide the action selection in the search tree. Experimental results confirmed that our algorithm outperforms the state-of-the-art approaches on several common benchmark problems.

    Regression Compatible Listwise Objectives for Calibrated Ranking

    Full text link
    As Learning-to-Rank (LTR) approaches primarily seek to improve ranking quality, their output scores are not scale-calibrated by design -- for example, adding a constant to the score of each item on the list will not affect the list ordering. This fundamentally limits LTR usage in score-sensitive applications. Though a simple multi-objective approach that combines a regression and a ranking objective can effectively learn scale-calibrated scores, we argue that the two objectives can be inherently conflicting, which makes the trade-off far from ideal for both of them. In this paper, we propose a novel regression compatible ranking (RCR) approach to achieve a better trade-off. The advantage of the proposed approach is that the regression and ranking components are well aligned which brings new opportunities for harmonious regression and ranking. Theoretically, we show that the two components share the same minimizer at global minima while the regression component ensures scale calibration. Empirically, we show that the proposed approach performs well on both regression and ranking metrics on several public LTR datasets, and significantly improves the Pareto frontiers in the context of multi-objective optimization. Furthermore, we evaluated the proposed approach on YouTube Search and found that it not only improved the ranking quality of the production pCTR model, but also brought gains to the click prediction accuracy
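The calibration point in this abstract (adding a constant to every score leaves the ordering unchanged) can be illustrated with a small sketch. It assumes a softmax cross-entropy loss as a stand-in listwise ranking objective and mean squared error as a stand-in regression objective; both are illustrative choices and not necessarily the objectives the paper proposes. All function names and the toy scores are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over one list of item scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def listwise_softmax_loss(scores, labels):
    """Softmax cross-entropy, used here as an assumed listwise ranking loss."""
    probs = softmax(scores)
    return -sum(y * math.log(p) for y, p in zip(labels, probs))

def mse(scores, labels):
    """Mean squared error, used here as an assumed regression loss."""
    return sum((s - y) ** 2 for s, y in zip(scores, labels)) / len(scores)

def ranking(scores):
    """Item indices sorted from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])

# Toy list: three items, the second one was clicked.
scores = [0.2, 0.7, 0.1]
labels = [0.0, 1.0, 0.0]
shifted = [s + 5.0 for s in scores]  # same order, very different scale

print(ranking(scores) == ranking(shifted))  # the ordering is unchanged
print(listwise_softmax_loss(scores, labels), listwise_softmax_loss(shifted, labels))
print(mse(scores, labels), mse(shifted, labels))
```

Under these assumptions the ranking loss cannot distinguish the two score lists, while the regression loss penalizes the shifted one heavily, which is why a ranking objective alone does not pin down the score scale.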

    RD-Suite: A Benchmark for Ranking Distillation

    Full text link
    The distillation of ranking models has become an important topic in both academia and industry. In recent years, several advanced methods have been proposed to tackle this problem, often leveraging ranking information from teacher rankers that is absent in traditional classification settings. To date, there is no well-established consensus on how to evaluate this class of models. Moreover, inconsistent benchmarking on a wide range of tasks and datasets makes it difficult to assess or invigorate advances in this field. This paper first examines representative prior art on ranking distillation, and raises three questions to be answered around methodology and reproducibility. To that end, we propose a systematic and unified benchmark, Ranking Distillation Suite (RD-Suite), which is a suite of tasks with 4 large real-world datasets, encompassing two major modalities (textual and numeric) and two applications (standard distillation and distillation transfer). RD-Suite consists of benchmark results that challenge some of the common wisdom in the field, and the release of datasets with teacher scores and evaluation scripts for future research. RD-Suite paves the way towards better understanding of ranking distillation, facilitates more research in this direction, and presents new challenges. Comment: 15 pages, 2 figures. arXiv admin note: text overlap with arXiv:2011.04006 by other authors

    Overview: Recent advances in the understanding of the northern Eurasian environments and of the urban air quality in China – a Pan-Eurasian Experiment (PEEX) programme perspective

    Get PDF
    The Pan-Eurasian Experiment (PEEX) Science Plan, released in 2015, addressed a need for a holistic system understanding and outlined the most urgent research needs for the rapidly changing Arctic-boreal region. Air quality in China, together with the long-range transport of atmospheric pollutants, was also indicated as one of the most crucial topics of the research agenda. These two geographical regions, the northern Eurasian Arctic-boreal region and China, especially the megacities in China, were identified as a "PEEX region". It is also important to recognize that the PEEX geographical region is an area where science-based policy actions would have significant impacts on the global climate. This paper summarizes results obtained during the last 5 years in the northern Eurasian region, together with recent observations of the air quality in the urban environments in China, in the context of the PEEX programme. The main regions of interest are the Russian Arctic, northern Eurasian boreal forests (Siberia) and peatlands, and the megacities in China. We frame our analysis against research themes introduced in the PEEX Science Plan in 2015. We summarize recent progress towards an enhanced holistic understanding of the land-atmosphere-ocean systems feedbacks. We conclude that although the scientific knowledge in these regions has increased, the new results are in many cases insufficient, and there are still gaps in our understanding of large-scale climate-Earth surface interactions and feedbacks. This arises from limitations in research infrastructures, especially the lack of coordinated, continuous and comprehensive in situ observations of the study region as well as integrative data analyses, hindering a comprehensive system analysis. The fast-changing environment and ecosystem changes driven by climate change, socio-economic activities like the China Silk Road Initiative, and the global trends like urbanization further complicate such analyses. 
    We recognize new topics with an increasing importance in the near future, especially "the enhancing biological sequestration capacity of greenhouse gases into forests and soils to mitigate climate change" and the "socio-economic development to tackle air quality issues". Peer reviewed

    Bayesian mixture modelling and inference based Thompson sampling in Monte-Carlo tree search

    No full text
    Monte-Carlo tree search is drawing great interest in the domain of planning under uncertainty, particularly when little or no domain knowledge is available. One of the central problems is the trade-off between exploration and exploitation. In this paper we present a novel Bayesian mixture modelling and inference based Thompson sampling approach to addressing this dilemma. The proposed Dirichlet-NormalGamma MCTS (DNG-MCTS) algorithm represents the uncertainty of the accumulated reward for actions in the MCTS search tree as a mixture of Normal distributions and performs inference on it in a Bayesian setting by choosing conjugate priors in the form of combinations of Dirichlet and NormalGamma distributions. Thompson sampling is used to select the best action at each decision node. Experimental results show that our proposed algorithm achieves state-of-the-art performance compared with the popular UCT algorithm in the context of online planning for general Markov decision processes.
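As a rough illustration of Thompson sampling with a NormalGamma conjugate prior, the sketch below applies it to a flat two-action bandit. This is a deliberate simplification and not the DNG-MCTS algorithm itself: there is no search tree, no Dirichlet component and no mixture over next states. The prior hyperparameters, reward noise and function names are all assumptions made for the example.

```python
import random

def update(params, x):
    """Conjugate NormalGamma update after observing one return x."""
    mu, lam, alpha, beta = params
    return (
        (lam * mu + x) / (lam + 1.0),
        lam + 1.0,
        alpha + 0.5,
        beta + lam * (x - mu) ** 2 / (2.0 * (lam + 1.0)),
    )

def sample_mean(params, rng):
    """Draw one plausible mean return from the NormalGamma posterior."""
    mu, lam, alpha, beta = params
    # random.gammavariate takes a scale parameter, so pass 1/beta for rate beta.
    tau = rng.gammavariate(alpha, 1.0 / beta)
    return rng.gauss(mu, (1.0 / (lam * tau)) ** 0.5)

def thompson_select(posteriors, rng):
    """Pick the action whose sampled mean return is largest."""
    draws = [sample_mean(p, rng) for p in posteriors]
    return max(range(len(draws)), key=draws.__getitem__)

def run(true_means, rounds=1000, noise=0.5, seed=0):
    """Simulate the bandit and return how often each action was chosen."""
    rng = random.Random(seed)
    # Assumed weak prior (mu=0, lambda=1, alpha=1, beta=1) for every action.
    posteriors = [(0.0, 1.0, 1.0, 1.0) for _ in true_means]
    pulls = [0] * len(true_means)
    for _ in range(rounds):
        a = thompson_select(posteriors, rng)
        reward = rng.gauss(true_means[a], noise)
        posteriors[a] = update(posteriors[a], reward)
        pulls[a] += 1
    return pulls

pulls = run([0.0, 2.0])
print(pulls)
```

Sampling from the posterior, instead of acting on its mean, is what drives exploration here: an action with few observations has a wide posterior and so occasionally produces the largest draw.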

    Comprehensive transcriptomics and proteomics analysis of Carassius auratus gills in response to Aeromonas hydrophila

    No full text
    As one of the mucosal barriers, fish gills represent the first line of defense against pathogen infection. However, the exact mechanism of the gill mucosal immune response to bacterial infection in fish still needs further investigation. Here, to investigate pathological changes and molecular mechanisms of the mucosal immune response in the gills of crucian carp (Carassius auratus) challenged by Aeromonas hydrophila, transcriptomic and proteomic analyses were performed using RNA-seq coupled with iTRAQ techniques. The results demonstrated that the gill immune responses were mostly related to the activation of complement and coagulation cascades, antigen processing and presentation, phagosome, NOD-like receptor (NLR) and nuclear factor κB (NFκB) signaling pathways. Twenty-one selected immune-related DEGs (i.e., Clam, nfyal, snrpf, acin1b, psme, sf3b5, rbm8a, rbm25, prpf18, g3bp2, snrpd3l, tecrem-2, cfl-A, C7, lysC, ddx5, hsp90, α-2M, C9, C3 and slc4a1a) were verified for their immune roles in the A. hydrophila infection using a qRT-PCR assay. Meanwhile, some complement (C3, C7, C9, CFD, DF and FH) and antigen-presenting (HSP90, MHC Ⅱ, CALR, CANX and PSME) proteins participated significantly in the defense against infection in gill tissues, and the protein-protein interaction (PPI) network displayed the immune signaling pathways and interactions among these DEPs. The correlation analysis indicated that the iTRAQ and qRT-PCR results were significantly correlated (Pearson's correlation coefficient = 0.70, p < 0.01). To our knowledge, this is the first multi-omics analysis of the gill transcriptome and proteome, and it contributes to understanding the molecular mechanisms of local mucosal immunity in cyprinid species.

    Thompson Sampling Based Monte-Carlo Planning in POMDPs

    No full text
    Monte-Carlo tree search (MCTS) has been drawing great interest in recent years for planning under uncertainty. One of the key challenges is the trade-off between exploration and exploitation. To address this, we introduce a novel online planning algorithm for large POMDPs using Thompson sampling based MCTS that balances between cumulative and simple regrets. The proposed algorithm, Dirichlet-Dirichlet-NormalGamma based Partially Observable Monte-Carlo Planning (D2NG-POMCP), treats the accumulated reward of performing an action from a belief state in the MCTS search tree as a random variable following an unknown distribution with hidden parameters. A Bayesian method is used to model and infer the posterior distribution of these parameters by choosing the conjugate prior in the form of a combination of two Dirichlet and one NormalGamma distributions. Thompson sampling is exploited to guide the action selection in the search tree. Experimental results confirmed that our algorithm outperforms the state-of-the-art approaches on several common benchmark problems.

    Single molybdenum atom anchored on N-doped carbon as a promising electrocatalyst for nitrogen reduction into ammonia at ambient conditions

    No full text
    Ammonia (NH3) is one of the most important industrial chemicals owing to its wide applications in various fields. However, the synthesis of NH3 at ambient conditions remains a coveted goal for chemists. In this work, we study the potential of the newly synthesized single-atom catalysts, i.e., single metal atoms (Cu, Pd, Pt, and Mo) supported on N-doped carbon, for the N2 reduction reaction (NRR) by employing first-principles calculations. It is found that Mo1-N1C2 can catalyze NRR through the enzymatic mechanism with an ultralow overpotential of 0.24 V. Most importantly, the removal of the produced NH3 is rapid, with a free-energy uphill of only 0.47 eV for the Mo1-N1C2 catalyst, which is much lower than that of previously reported catalysts with low overpotentials and endows Mo1-N1C2 with excellent durability. The coordination effect on activity is further evaluated, showing that the experimentally realized active site, a single Mo atom coordinated by one N atom and two C atoms (Mo-N1C2), possesses the highest catalytic performance. Our study offers new opportunities for advancing electrochemical conversion of N2 into NH3 at ambient conditions.

    PLEASE: Palm Leaf Search for POMDPs with Large Observation Spaces

    No full text
    This paper provides a novel POMDP planning method, called Palm LEAf SEarch (PLEASE), which allows the selection of more than one outcome when their potential impacts are close to the highest one during its forward exploration. Compared with existing trial-based algorithms, PLEASE can save considerable time to propagate the bound improvements of beliefs in deep levels of the search tree to the root belief because of fewer backup operations. Experiments showed that PLEASE scales up SARSOP, one of the fastest algorithms, by orders of magnitude on some POMDP tasks with large observation spaces