
    Deep Reinforcement Learning-based Image Captioning with Embedding Reward

    Image captioning is a challenging problem owing to the complexity of understanding image content and the diverse ways of describing it in natural language. Recent advances in deep neural networks have substantially improved performance on this task. Most state-of-the-art approaches follow an encoder-decoder framework, which generates captions using a sequential recurrent prediction model. In this paper, by contrast, we introduce a novel decision-making framework for image captioning. We utilize a "policy network" and a "value network" to collaboratively generate captions. The policy network provides local guidance by giving the confidence of predicting the next word according to the current state. The value network provides global, lookahead guidance by evaluating all possible extensions of the current state. In essence, it shifts the goal from predicting the correct words towards generating captions similar to the ground-truth captions. We train both networks using an actor-critic reinforcement learning model, with a novel reward defined by visual-semantic embedding. Extensive experiments and analyses on the Microsoft COCO dataset show that the proposed framework outperforms state-of-the-art approaches across different evaluation metrics.
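
    As a rough illustration of how local and lookahead guidance might be combined when choosing the next word, the sketch below blends a policy log-probability with a value estimate. The abstract does not say how the two signals are merged, so the linear blend, the weight `lam`, and the names `policy_probs` and `value_scores` are all assumptions made for illustration. They stand in for network outputs that are not shown here.

```python
import math


def select_next_word(policy_probs, value_scores, lam=0.4):
    """Pick the next word by blending local and lookahead guidance.

    policy_probs: dict word -> probability of that word given the current
                  state (hypothetical policy-network output).
    value_scores: dict word -> estimated value of the state reached by
                  appending that word (hypothetical value-network output).
    lam:          assumed blending weight; 1.0 uses the policy only,
                  0.0 uses the value estimate only.
    """
    best_word, best_score = None, -math.inf
    for word, prob in policy_probs.items():
        # Assumed scoring rule: weighted sum of log-confidence and value.
        score = lam * math.log(prob) + (1.0 - lam) * value_scores[word]
        if score > best_score:
            best_word, best_score = word, score
    return best_word, best_score


# Toy numbers, not taken from the paper.
policy_probs = {"dog": 0.6, "cat": 0.4}
value_scores = {"dog": 0.1, "cat": 0.9}
word_blend, _ = select_next_word(policy_probs, value_scores, lam=0.4)
word_policy_only, _ = select_next_word(policy_probs, value_scores, lam=1.0)
```

    With these toy numbers the policy alone prefers "dog", while the blended score prefers "cat" because the value estimate for that extension is much higher. This is the kind of correction the lookahead guidance is meant to provide.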

    Selection of the initial design for the two-stage continual reassessment method

    The continual reassessment method (CRM) was proposed in a Bayesian framework whereby the first patient is assigned to the prior guess of the maximum tolerated dose, which is usually not the lowest dose level. This assignment may lead to safety concerns in practice because physicians usually prefer not to skip lower dose levels before escalating to higher dose levels. The two-stage CRM was proposed to address this concern: model-based dose escalation is preceded by a pre-specified escalating sequence starting from the lowest dose level. While a theoretical framework for building the two-stage CRM has been proposed, the selection of the initial dose-escalating sequence, generally referred to as the initial design, remains arbitrary, made either by specifying cohorts of three patients or by trial and error through extensive simulations. Motivated by an ongoing oncology dose-finding study in which physicians stated their desire to start from the lowest dose even though the maximum tolerated dose was thought to be one of the higher dose levels, we propose a systematic approach for selecting the initial design for the two-stage CRM. The initial design obtained using the proposed algorithm yields better operating characteristics than a cohort-of-three initial design with a calibrated CRM. The proposed algorithm simplifies and systematizes the selection of the initial design for the two-stage CRM. Moreover, initial designs to be used as a reference for planning a two-stage CRM are provided.
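
    The two-stage idea described above can be sketched as follows: follow a pre-specified initial design until the first toxicity is seen, then switch to model-based assignment. This is a simplified illustration and not the paper's algorithm for choosing the initial design. The power model, the normal prior with standard deviation 1.16, the grid integration, the skeleton values, and all function names are assumptions, and practical safeguards such as a no-skipping rule are left out.

```python
import math


def crm_next_dose(skeleton, doses_given, tox, target, prior_sd=1.16):
    """One-parameter Bayesian CRM (assumed power model).

    Toxicity probability at dose i is skeleton[i] ** exp(a), with
    a ~ Normal(0, prior_sd**2). The posterior mean of a is approximated
    on a fixed grid, and the dose whose estimated toxicity is closest to
    the target is returned (0-based index).
    """
    grid = [-4.0 + 8.0 * k / 400 for k in range(401)]
    num = den = 0.0
    for a in grid:
        log_post = -0.5 * (a / prior_sd) ** 2  # log prior, up to a constant
        for d, y in zip(doses_given, tox):
            p = skeleton[d] ** math.exp(a)
            log_post += math.log(p) if y else math.log(1.0 - p)
        w = math.exp(log_post)
        num += a * w
        den += w
    a_hat = num / den
    est = [s ** math.exp(a_hat) for s in skeleton]
    return min(range(len(skeleton)), key=lambda i: abs(est[i] - target))


def two_stage_next_dose(initial_design, skeleton, doses_given, tox, target):
    """Stage 1: follow initial_design while no toxicity has been observed.
    Stage 2: after the first toxicity (or once the sequence is used up),
    assign doses with the model-based CRM."""
    n = len(doses_given)
    if not any(tox) and n < len(initial_design):
        return initial_design[n]
    return crm_next_dose(skeleton, doses_given, tox, target)


# Hypothetical setup: five dose levels, cohorts of three in the initial design.
skeleton = [0.05, 0.12, 0.25, 0.40, 0.55]
initial_design = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]
target = 0.25

first = two_stage_next_dose(initial_design, skeleton, [], [], target)
fourth = two_stage_next_dose(initial_design, skeleton, [0, 0, 0], [0, 0, 0], target)
after_tox = two_stage_next_dose(initial_design, skeleton, [0, 0, 0, 1], [0, 0, 0, 1], target)
```

    In this sketch the first patient always receives the lowest dose, which addresses the safety concern raised in the abstract. The model only takes over once there is toxicity data to inform it.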