A Regularized Opponent Model with Maximum Entropy Objective
In a single-agent setting, reinforcement learning (RL) tasks can be cast into
an inference problem by introducing a binary random variable o, which stands
for the "optimality". In this paper, we redefine the binary random variable o
in the multi-agent setting and formalize multi-agent reinforcement learning (MARL)
as probabilistic inference. We derive a variational lower bound of the
likelihood of achieving optimality and name it the Regularized Opponent
Model with Maximum Entropy Objective (ROMMEO). From ROMMEO, we present a novel
perspective on opponent modeling and show how it can improve the performance of
training agents theoretically and empirically in cooperative games. To optimize
ROMMEO, we first introduce a tabular Q-iteration method ROMMEO-Q with proof of
convergence. We extend the exact algorithm to complex environments by proposing
an approximate version, ROMMEO-AC. We evaluate these two algorithms on the
challenging iterated matrix game and differential game respectively and show
that they can outperform strong MARL baselines.
Comment: Accepted to the International Joint Conference on Artificial Intelligence (IJCAI 2019)
Strong Convergence of Modified Algorithms Based on the Regularization for the Constrained Convex Minimization Problem
Regularization plays an important role in solving constrained convex minimization problems. Based on the idea of regularization, this paper proposes implicit and explicit iterative algorithms whose generated sequences converge strongly to a solution of the constrained convex minimization problem; this solution also solves a certain variational inequality. As an application, we apply the algorithm to the split feasibility problem.
3-[4-(Dimethylamino)benzylideneamino]benzonitrile
The molecule of the title Schiff base, C16H15N3, is non-planar and displays a trans configuration with respect to the C=N double bond. The two benzene rings make a dihedral angle of 49.24 (3)°.
Layer-dependent transport properties in the Moiré of strained homobilayer transition metal dichalcogenides
Bilayer moiré structures have attracted significant attention recently due
to their spatially modulated layer degrees of freedom. However, the
layer-dependent transport mechanism in moiré structures is still an
open problem. Here we investigate the layer-dependent transport
properties regulated by the strain, the interlayer bias and the number of
moiré periods in a strained moiré homobilayer TMD nanoribbon based on
low-energy efficient models. The charge carriers can pass perfectly through the
scattering region with the moiré potential. However, the
overall transmission coefficient comes mainly from either intralayer
or interlayer transmissions. The transition of the transport mechanism between
intralayer and interlayer transmissions can be achieved by adjusting the
strain. A vertical external electric field suppresses the intralayer
transmissions and selects one of the interlayer transmissions, which can
cause a controllable layer polarization. Moreover, staggered intralayer and
interlayer minigaps form as the number of moiré periods increases in
the scattering region, due to the overlap of the wave functions in two adjacent
moiré periods. Our findings point to an opportunity to realize layer
functionalities through strain and electric field.
Comment: 6 pages, 4 figures
Abnormal Voxel-Wise Degree Centrality in Patients With Late-Life Depression: A Resting-State Functional Magnetic Resonance Imaging Study.
Objectives: Late-life depression (LLD) has negative impacts on the somatic, emotional and cognitive domains of patients' lives. Elucidating the abnormality in the brain networks of LLD patients could help to strengthen the understanding of LLD pathophysiology; however, studies exploring spontaneous brain activity in LLD during the resting state remain limited. This study aimed to identify voxel-level whole-brain functional connectivity changes in LLD patients. Methods: Fifty patients with LLD and 33 healthy controls were recruited. All participants underwent a resting-state functional magnetic resonance imaging scan to assess the voxel-wise degree centrality (DC) changes in the patients. Furthermore, DC was compared between two patient subgroups, late-onset depression (LOD) and early-onset depression (EOD). Results: Compared with the healthy controls, LLD patients showed increased DC in the inferior parietal lobule, parahippocampal gyrus, brainstem and cerebellum (p < 0.05, AlphaSim-corrected). LLD patients also showed decreased DC in the somatosensory and motor cortices and cerebellum (p < 0.05, AlphaSim-corrected). Compared with EOD patients, LOD patients showed increased centrality in the superior and middle temporal gyrus and decreased centrality in the occipital region (p < 0.05, AlphaSim-corrected). No significant correlation was found between the DC value and symptom severity or disease duration in the patients after correction for multiple comparisons. Conclusions: These findings indicate that an intrinsic abnormality of network centrality exists in a wide range of brain areas in LLD patients. LOD patients differ from EOD patients in cortical network centrality. Our study might help to strengthen the understanding of the pathophysiology of LLD and of the potential neural substrates underlying the related emotional and cognitive impairments observed in the patients.
Maximizing lifetime of range-adjustable wireless sensor networks: a neighborhood-based estimation of distribution algorithm
Sensor activity scheduling is critical for prolonging the lifetime of wireless sensor networks (WSNs). However, most existing methods assume sensors to have one fixed sensing range. The prevalence of sensors with adjustable sensing ranges poses two new challenges to the topic: 1) an expanded search space, due to the rise in the number of possible activation modes, and 2) more complex energy allocation, as the sensors differ in energy consumption rate when using different sensing ranges. These two challenges make it hard to directly solve the lifetime maximization problem of WSNs with range-adjustable sensors (LM-RASs). This article proposes a neighborhood-based estimation of distribution algorithm (NEDA) to address it in a recursive manner. In NEDA, each individual represents a coverage scheme in which the sensors are selectively activated to monitor all the targets. A linear programming (LP) model is built to assign activation time to the schemes in the population so that their sum, the network lifetime, can be maximized conditioned on the current population. Using the activation time derived from LP as individual fitness, NEDA is driven to seek coverage schemes promising for prolonging the network lifetime. The network lifetime is thus optimized by repeating the steps of coverage scheme evolution and LP model solving. To encourage the search for diverse coverage schemes, a neighborhood sampling strategy is introduced. In addition, a heuristic repair strategy is designed to fine-tune the existing schemes and further improve the search efficiency. Experimental results on WSNs of different scales show that NEDA outperforms state-of-the-art approaches. It is also expected that NEDA can serve as a potential framework for solving other flexible LP problems that share the same structure with LM-RAS.
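The LP step described in this abstract can plausibly be written as follows. This is a sketch under assumed notation, not the formulation from the article: $t_j$ is the activation time given to coverage scheme $j$ among the $m$ schemes in the current population, $e_{ij}$ is the energy consumption rate of sensor $i$ under scheme $j$ (zero when the sensor is inactive, otherwise set by its chosen sensing range), and $E_i$ is the initial energy of sensor $i$ among $n$ sensors.

```latex
\begin{aligned}
\max_{t_1,\dots,t_m} \quad & \sum_{j=1}^{m} t_j
  && \text{(network lifetime)} \\
\text{s.t.} \quad & \sum_{j=1}^{m} e_{ij}\, t_j \le E_i,
  && i = 1,\dots,n \quad \text{(energy budget of sensor } i\text{)} \\
& t_j \ge 0,
  && j = 1,\dots,m .
\end{aligned}
```

Under this reading, the optimal $t_j$ would serve as the fitness of scheme $j$, so a scheme that receives no activation time contributes nothing to the lifetime.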
Differential evolution with two-level parameter adaptation
The performance of differential evolution (DE) largely depends on its mutation strategy and control parameters. In this paper, we propose an adaptive DE (ADE) algorithm with a new mutation strategy DE/lbest/1 and a two-level adaptive parameter control scheme. The DE/lbest/1 strategy is a variant of the greedy DE/best/1 strategy. However, the population is mutated under the guide of multiple locally best individuals in DE/lbest/1 instead of one globally best individual in DE/best/1. This strategy is beneficial to the balance between fast convergence and population diversity. The two-level adaptive parameter control scheme is implemented mainly in two steps. In the first step, the population-level parameters F_p and CR_p for the whole population are adaptively controlled according to the optimization states, namely, the exploration state and the exploitation state in each generation. These optimization states are estimated by measuring the population distribution. Then, the individual-level parameters F_i and CR_i for each individual are generated by adjusting the population-level parameters. The adjustment is based on considering the individual's fitness value and its distance from the globally best individual. This way, the parameters can be adapted to not only the overall state of the population but also the characteristics of different individuals. The performance of the proposed ADE is evaluated on a suite of benchmark functions. Experimental results show that ADE generally outperforms four state-of-the-art DE variants on different kinds of optimization problems. The effects of ADE components, parameter properties of ADE, search behavior of ADE, and parameter sensitivity of ADE are also studied. Finally, we investigate the capability of ADE for solving three real-world optimization problems.
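The contrast between the two mutation strategies can be illustrated with a minimal Python sketch. DE/best/1 is the standard rule v = x_best + F (x_r1 - x_r2). The abstract does not say how the locally best individuals are chosen, so the grouping below (contiguous index blocks of size `group_size`) is purely an assumption for illustration, as are the function names. Minimization is assumed, and the parameter adaptation scheme is not shown.

```python
import random


def _difference_pair(n, i, rng):
    """Pick two distinct indices, both different from the target index i."""
    return rng.sample([k for k in range(n) if k != i], 2)


def de_best_1(pop, fit, i, F, rng):
    """DE/best/1: the base vector is the single globally best individual."""
    n = len(pop)
    best = min(range(n), key=fit.__getitem__)  # lowest fitness wins
    r1, r2 = _difference_pair(n, i, rng)
    return [pop[best][d] + F * (pop[r1][d] - pop[r2][d])
            for d in range(len(pop[i]))]


def de_lbest_1(pop, fit, i, F, rng, group_size):
    """Assumed DE/lbest/1: the base vector is the best individual within
    the group that contains i (grouping rule is hypothetical)."""
    n = len(pop)
    start = (i // group_size) * group_size
    group = range(start, min(start + group_size, n))
    lbest = min(group, key=fit.__getitem__)
    r1, r2 = _difference_pair(n, i, rng)
    return [pop[lbest][d] + F * (pop[r1][d] - pop[r2][d])
            for d in range(len(pop[i]))]


if __name__ == "__main__":
    rng = random.Random(0)
    pop = [[rng.uniform(-5, 5) for _ in range(3)] for _ in range(8)]
    fit = [sum(x * x for x in ind) for ind in pop]  # sphere function
    print(de_best_1(pop, fit, 0, 0.5, rng))
    print(de_lbest_1(pop, fit, 0, 0.5, rng, group_size=4))
```

With several groups, different targets are pulled toward different base vectors, which is one way to read the abstract's claim about balancing convergence speed and diversity.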
Consistency of P53 immunohistochemical expression between preoperative biopsy and final surgical specimens of endometrial cancer
Objective: The aim of this study is to explore the consistency of P53 immunohistochemical expression between preoperative biopsy and final pathology in endometrial cancer (EC), and to predict the prognosis of patients based on the 4-tier P53 expression and classic clinicopathological parameters. Methods: The medical data of patients with stage I-III EC who received preoperative biopsy and initial surgical treatment in two medical centers were retrospectively collected. The consistency of P53 immunohistochemical expression between preoperative biopsy and final pathology was assessed using Cohen's kappa coefficient and a Sankey diagram; then 4-tier P53 expression was defined (P53wt/P53wt, P53abn/P53wt, P53wt/P53abn, and P53abn/P53abn). Univariate and multivariate Cox regression analyses were used to determine the correlation between 4-tier P53 expression and the prognosis of patients. On this basis, nomogram models were established to predict the prognosis of patients by combining 4-tier P53 expression and classic clinicopathological parameters, and risk stratification was then performed on the patients. Results: A total of 1186 patients were ultimately included in this study after applying the inclusion and exclusion criteria. Overall, the consistency of P53 expression between preoperative biopsy and final pathology was 83.8%, with a kappa coefficient of 0.624. The ROC curve suggested that the AUC of 4-tier P53 expression for predicting the prognosis of patients was better than the AUC of P53 expression in preoperative biopsy or final pathology alone. Univariate and multivariate Cox regression analyses suggested that 4-tier P53 expression was an independent influencing factor for recurrence and death. On this basis, the nomogram models based on 4-tier P53 expression and classic clinicopathological factors were successfully established. The ROC curve suggested that the AUC of the models (0.856 for recurrence and 0.838 for death) was superior to that of 4-tier P53 expression alone or the classic clinicopathological parameters alone, so the models could provide a better risk stratification for patients. Conclusion: The expression of P53 immunohistochemistry had relatively good consistency between preoperative biopsy and final pathology of EC. Because P53 immunohistochemistry can differ between preoperative biopsy and final pathology, the prognosis of patients can be better evaluated based on the 4-tier P53 expression and classic clinicopathological parameters.
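For readers unfamiliar with the agreement statistic reported above, here is a minimal Python sketch of Cohen's kappa for two paired label sequences: kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The function name and the toy labels are illustrative assumptions, not data from the study.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of pairs with identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the marginal frequencies, summed per label.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[k] * count_b[k] for k in count_a.keys() | count_b.keys())
    p_e /= n * n
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    # Hypothetical biopsy vs. final-pathology calls, for illustration only.
    biopsy = ["wt", "wt", "abn", "abn"]
    final = ["wt", "abn", "abn", "abn"]
    print(cohens_kappa(biopsy, final))
```

Because kappa discounts chance agreement, it can sit well below the raw percentage of matching calls, as with the 83.8% consistency and 0.624 kappa reported in the abstract.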