18 research outputs found

    Online Weighted Q-Ensembles for Reduced Hyperparameter Tuning in Reinforcement Learning

    Reinforcement learning is a promising paradigm for learning robot control, allowing complex control policies to be learned without requiring a dynamics model. However, even state-of-the-art algorithms can be difficult to tune for optimum performance. We propose employing an ensemble of multiple reinforcement learning agents, each with a different set of hyperparameters, along with a mechanism for choosing the best-performing set(s) online. In the literature, the ensemble technique is used to improve performance in general, but the current work specifically addresses decreasing the hyperparameter tuning effort. Furthermore, our approach targets online learning on a single robotic system and does not require running multiple simulators in parallel. Although the idea is generic, Deep Deterministic Policy Gradient was the model chosen, being a representative deep learning actor-critic method with good performance in continuous action settings but known high variance. We compare our online weighted Q-ensemble approach to Q-average ensemble strategies from the literature, using alternate policy training as well as online training, demonstrating the advantage of the new approach in eliminating hyperparameter tuning. The applicability to real-world systems was validated in common robotic benchmark environments: the bipedal robot half cheetah and the swimmer. The Online Weighted Q-Ensemble showed overall lower variance and superior results when compared with Q-average ensembles using randomized parameterizations.
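The difference between a Q-average ensemble and a weighted Q-ensemble can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the softmax-over-TD-error weight rule is an assumption chosen only to show how weights could shift toward better-calibrated critics. The abstract does not specify the actual update rule.

```python
import math

def q_average(q_values):
    """Pick the candidate action with the highest mean Q across critics.

    q_values[i][j] is critic i's estimate for candidate action j.
    """
    n_actions = len(q_values[0])
    means = [sum(q[j] for q in q_values) / len(q_values) for j in range(n_actions)]
    return max(range(n_actions), key=means.__getitem__)

def q_weighted(q_values, weights):
    """Pick the candidate action with the highest weighted sum of Q-values."""
    n_actions = len(q_values[0])
    scores = [sum(w * q[j] for w, q in zip(weights, q_values))
              for j in range(n_actions)]
    return max(range(n_actions), key=scores.__getitem__)

def update_weights(td_errors, temperature=1.0):
    """Hypothetical weight rule (an assumption, not from the paper):
    softmax over negative absolute TD errors, so critics with smaller
    recent errors receive larger weights."""
    logits = [-abs(e) / temperature for e in td_errors]
    top = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

With equal weights the two selection rules coincide; they differ once one critic, for example one trained with a poor hyperparameter setting, produces outlying estimates and is down-weighted.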

    Timelines are Publisher-Driven Caches: Analyzing and Shaping Timeline Networks

    Cache networks are one of the building blocks of information-centric networks (ICNs). Most of the recent work on cache networks has focused on networks of request-driven caches, which are populated based on users' requests for content generated by publishers. However, user-generated content still poses the most pressing challenges. For such content, timelines are the de facto sharing solution. In this paper, we establish a connection between timelines and publisher-driven caches. We propose simple models and metrics to analyze publisher-driven caches, allowing for variable-sized objects. Then, we design two efficient algorithms for timeline workload shaping, leveraging admission and price control in order, for instance, to help service providers attain prescribed service level agreements.
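The idea of a timeline as a publisher-driven cache can be illustrated with a small sketch. It assumes a FIFO timeline with a fixed capacity in size units and a simple boolean admission decision; the class name, the eviction order, and the admission interface are all assumptions for illustration, not the models or algorithms from the paper.

```python
from collections import deque

class Timeline:
    """A timeline modelled as a publisher-driven cache (illustrative only).

    Content enters when a publisher posts it, not when a user requests it.
    The oldest posts are evicted once the total size exceeds the capacity.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()  # newest post at the left
        self.used = 0

    def publish(self, post_id, size, admit=True):
        """Insert a post of the given size; return True if it was admitted."""
        if not admit or size > self.capacity:
            return False  # rejected by admission control or too large to fit
        self.items.appendleft((post_id, size))
        self.used += size
        while self.used > self.capacity:
            _, old_size = self.items.pop()  # evict the oldest post
            self.used -= old_size
        return True

    def visible(self):
        """Post ids currently on the timeline, newest first."""
        return [post_id for post_id, _ in self.items]
```

Because objects have variable sizes, one large post may displace several small ones, which is why admission control changes how long other publishers' content stays visible.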

    Characterization of Coupled Ground State and Excited State Equilibria by Fluorescence Spectral Deconvolution

    Fluorescence probes with multiparametric response based on the relative variation in the intensities of several emission bands are of great general utility. An accurate interpretation of the system requires the determination of the number, positions and intensities of the spectral components. We have developed a new algorithm for spectral deconvolution that is applicable to fluorescence probes exhibiting a two-state ground-state equilibrium and a two-state excited-state reaction. Three distinct fluorescence emission bands are resolved, with a distribution of intensities that is excitation-wavelength-dependent. The deconvolution of the spectrum into individual components is based on their representation as asymmetric Siano-Metzler log-normal functions. The application of the algorithm to the solvation response of a 3-hydroxychromone (3HC) derivative that exhibits an H-bonding-dependent excited-state intramolecular proton transfer (ESIPT) reaction allowed the separation of the spectral signatures characteristic of polarity and hydrogen bonding. This example demonstrates the ability of the method to characterize two potentially uncorrelated parameters describing the dye environment and interactions.
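For readers unfamiliar with the band shape, the sketch below evaluates an asymmetric log-normal function in a commonly cited four-parameter form (peak position, peak intensity, full width at half maximum, asymmetry) and sums several such bands into a model spectrum. The abstract does not give the parameterization, so the form used here, the parameter names, and the restriction to asymmetry values above 1 are assumptions; the fitting step of the deconvolution is not shown.

```python
import math

def lognormal_band(nu, nu_max, i_max, fwhm, rho):
    """Asymmetric log-normal band on a wavenumber scale (assumed form).

    nu_max: peak position, i_max: peak intensity,
    fwhm:   full width at half maximum,
    rho:    asymmetry, the ratio of the low-wavenumber half-width to the
            high-wavenumber half-width (this sketch requires rho > 1).
    """
    if rho <= 1.0:
        raise ValueError("this sketch only handles rho > 1")
    # Limiting wavenumber above which the band intensity is zero.
    a = nu_max + fwhm * rho / (rho ** 2 - 1.0)
    if nu >= a:
        return 0.0
    ratio = (a - nu) / (a - nu_max)
    return i_max * math.exp(-(math.log(2) / math.log(rho) ** 2)
                            * math.log(ratio) ** 2)

def model_spectrum(nu, bands):
    """Sum of log-normal bands; each band is (nu_max, i_max, fwhm, rho)."""
    return sum(lognormal_band(nu, *band) for band in bands)
```

A deconvolution routine would adjust the parameters of each band until `model_spectrum` matches the measured spectrum; three bands would correspond to the three emission components mentioned in the abstract.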

    Specific Visualization of Glioma Cells in Living Low-Grade Tumor Tissue

    BACKGROUND: The current therapy of malignant gliomas is based on surgical resection, radio-chemotherapy and chemotherapy. Recent retrospective case-series have highlighted the significance of the extent of resection as a prognostic factor predicting the course of the disease. Complete resection in low-grade gliomas that show no MRI-enhanced images is especially difficult. The aim of this study was to develop a robust, specific, new fluorescent probe for glioma cells that is easy to apply to live tumor biopsies and could distinguish tumor cells from normal brain cells at all levels of magnification. METHODOLOGY/PRINCIPAL FINDINGS: In this investigation we employed brightly fluorescent, photostable quantum dots (QDs) to specifically target epidermal growth factor receptor (EGFR), which is upregulated in many gliomas. Living glioma and normal cells or tissue biopsies were incubated with QDs coupled to EGF and/or monoclonal antibodies against EGFR for 30 minutes, washed and imaged. The data include results from cell culture, an animal model and ex vivo human tumor biopsies of both low-grade and high-grade gliomas and show high probe specificity. Tumor cells could be visualized from the macroscopic to the single-cell level with contrast ratios as high as 1000:1 compared to normal brain tissue. CONCLUSIONS/SIGNIFICANCE: The ability of the targeted probes to clearly distinguish tumor cells in low-grade tumor biopsies, where no enhanced MRI image was obtained, demonstrates the great potential of the method. We propose that future application of specifically targeted fluorescent particles during surgery could allow intraoperative guidance for the removal of residual tumor cells from the resection cavity and thus increase patient survival.

    Application Driven Design Of Embedded Real-Time Image Processors

    Real-time image processing systems are increasingly embedded in systems for industrial inspection, autonomous robots, photo-copying, traffic control, automotive control, surveillance, security, and the like. Starting in the 1980s, many systems, mainly for low-level image processing, have been developed. The architectures range from framegrabbers with attached Digital Signal Processors (DSPs) to systolic pipelines, square and linear single-instruction multiple-data (SIMD) systems, pyramids, PC clusters, and smart cameras. Many of those systems lack suitable software support, are based on a special programming language, or are stand-alone and cannot be tightly coupled to the rest of the processors of the embedded system. As a consequence, most often the embedded system cannot be programmed in one uniform way.

    Skeletons and Asynchronous RPC for Embedded Data- and Task Parallel Image Processing

    Developing embedded parallel image processing applications is usually a very hardware-dependent process, requiring deep knowledge of the processors used. Furthermore, if the chosen hardware does not meet the requirements, the application must be rewritten for a new platform. We wish to avoid these problems by encapsulating the parallelism. We have proposed the use of algorithmic skeletons [3] to express the data parallelism inherent in low-level image processing. However, since different operations run best on different kinds of processors, we need to exploit task parallelism as well.
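The combination of a data-parallel skeleton with asynchronous calls can be sketched as below. This is only an analogy on a single machine using Python thread pools: the function names are hypothetical, and the paper's actual skeleton library and RPC mechanism for embedded processors are not described in the abstract.

```python
from concurrent.futures import ThreadPoolExecutor

def pixel_to_pixel(op, image, data_pool):
    """Data-parallel skeleton: apply op to every pixel, one row per task.

    The caller supplies only the per-pixel operation; how rows are
    distributed over workers stays hidden inside the skeleton.
    """
    return list(data_pool.map(lambda row: [op(p) for p in row], image))

def async_call(task_pool, skeleton, *args):
    """Start a skeleton call without waiting for it and return a future,
    so independent operations can overlap (task parallelism)."""
    return task_pool.submit(skeleton, *args)

def run_two_operations(image):
    """Run two independent per-pixel operations concurrently."""
    # Separate pools avoid a skeleton waiting on workers it occupies itself.
    with ThreadPoolExecutor(max_workers=4) as data_pool, \
         ThreadPoolExecutor(max_workers=2) as task_pool:
        inverted = async_call(task_pool, pixel_to_pixel,
                              lambda p: 255 - p, image, data_pool)
        doubled = async_call(task_pool, pixel_to_pixel,
                             lambda p: 2 * p, image, data_pool)
        return inverted.result(), doubled.result()
```

The application code names the operation and the skeleton but not the processor, which is the kind of encapsulation the abstract argues for.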