301 research outputs found

    Simultaneous Perturbation Algorithms for Batch Off-Policy Search

    We propose novel policy search algorithms in the context of off-policy, batch mode reinforcement learning (RL) with continuous state and action spaces. Given a batch collection of trajectories, we perform off-line policy evaluation using an algorithm similar to that of [Fonteneau et al., 2010]. Using this Monte Carlo-like policy evaluator, we perform policy search in a class of parameterized policies. We propose both first-order policy gradient and second-order policy Newton algorithms. All our algorithms incorporate simultaneous perturbation estimates for the gradient as well as the Hessian of the cost-to-go vector, since the latter is unknown and only biased estimates are available. We demonstrate their practicality on a simple 1-dimensional continuous state space problem.
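As a rough illustration of the simultaneous perturbation idea named in the abstract above, the sketch below estimates a gradient from only two cost evaluations per step. The function names (`spsa_gradient`, `spsa_descent`), the random ±1 perturbations, the fixed step size and the toy quadratic cost are all assumptions made for illustration; the paper's own estimator works on a cost-to-go evaluated from batch trajectories and is not reproduced here.

```python
import numpy as np


def spsa_gradient(cost, theta, delta=0.1, rng=None):
    """Two-sided simultaneous perturbation estimate of the gradient of `cost`.

    All coordinates of theta are perturbed at once by a random +/-1 vector,
    so only two cost evaluations are needed whatever the dimension.
    """
    rng = np.random.default_rng() if rng is None else rng
    theta = np.asarray(theta, dtype=float)
    perturbation = rng.choice([-1.0, 1.0], size=theta.shape)
    diff = cost(theta + delta * perturbation) - cost(theta - delta * perturbation)
    # Element-wise division by the perturbation gives one estimate per coordinate.
    return diff / (2.0 * delta * perturbation)


def spsa_descent(cost, theta0, steps=200, step_size=0.1, delta=0.1, seed=0):
    """Plain gradient descent driven by the estimate above (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    for _ in range(steps):
        theta = theta - step_size * spsa_gradient(cost, theta, delta, rng)
    return theta


if __name__ == "__main__":
    # Toy stand-in for a policy cost: a quadratic with a known minimiser.
    target = np.array([1.0, -2.0])
    print(spsa_descent(lambda th: float(np.sum((th - target) ** 2)), [0.0, 0.0]))
```

In practice the step size and perturbation size would usually decay over iterations; they are kept constant here only to keep the sketch short.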

    La pêche crevettière en Côte d'Ivoire. Bilan 1969-1970 et perspectives [Shrimp fishing in Côte d'Ivoire: 1969-1970 review and outlook]

    The evolution of the fishery for the pink shrimp Penaeus duorarum Burkenroad is analysed from its beginning in 1969 until the end of 1970. A rapid and general decline in yield was evident during this period. The current shrimp fleet appears to be too large to allow economically viable exploitation of the stock.

    La pêche au chalut en Côte d'Ivoire. Maximum de rendement économique [Trawl fishing in Côte d'Ivoire: maximum economic yield]

    This paper analyses the relationship between effort and catch per unit effort for trawlers that operated in Côte d'Ivoire from January 1966 to December 1970. A level of fishing effort that would allow the fishery to be exploited under the most profitable conditions is proposed.

    GEOGRAPHICAL VARIABILITY IN THE AMOUNT OF BIGEYE CAUGHT UNDER FADS BY PURSE SEINERS IN THE EASTERN ATLANTIC: FROM THE MULTISPECIES SAMPLES AND THE ICCAT STATISTICS

    This paper analyses the geographical distribution of bigeye FAD catches by purse seiners (PS) using results of multispecies sampling of EU FAD catches (1991-2016). The analysis shows marked geographical gradients in bigeye catches, which are rare in coastal areas and increasingly abundant at increasing distances from the shore. Opposite changes are observed for yellowfin abundance, while skipjack abundance tends to be similar in most areas. Yearly trends in the relative abundances of bigeye and skipjack are also observed. These observed species compositions largely contradict the species composition of Task II data. This statistical problem in the bigeye geographical distribution is a source of errors in the choice and analysis of FAD moratoria. It is also a source of potential error in the Task I bigeye catches. Based on fine-scale sampled catches, bigeye catches by the Ghanaian fleet could be widely overestimated today because of improper data processing. Our study recommends that improved Task II statistics be prepared for the EU&al PS and for the Ghanaian fleet before the bigeye stock assessment.

    Analyse des rendements des chalutiers ivoiriens. Définition d'un effort de pêche [Analysis of the yields of Ivorian trawlers: definition of a fishing effort]

    A comparison between the yields obtained during 1968 and 1969 from the trawlers based at Abidjan harbour was carried out in various fishing areas. Seasonal fluctuations of abundance were first eliminated and then the regression between yield and motor power was calculated. The unit of fishing effort, one hour of fishing for a standard trawler of 400 BHP, was chosen for the fishing statistics of the Ivorian fleet

    A framework for the standardisation of tropical tuna purse seine CPUE: application to the yellowfin tuna in the Indian Ocean

    We revised the existing framework for tuna CPUE standardisation in light of the growing literature that advocates the use of mixed effects models to account for the characteristics of logbook data. We apply the framework to yellowfin tuna (YFT) from the Indian Ocean, caught by the EU purse seine fleet (Spain and France) from 1984 to 2015. We used a comprehensive list of candidate covariates, including non-conventional covariates, and ran exploratory models to assess the contribution of each covariate. Due to the large number of covariates, the lasso (least absolute shrinkage and selection operator) method was applied for data mining and model selection purposes. The results are two standardised YFT CPUE time series for the period 1984-2015, one for large fish caught in free-school related sets, and one for mainly juveniles caught in floating object related sets. The usefulness of highly aggregated data (low resolution: annual and fleet-wide) is discussed, along with the need for more detailed information on the use of dFADs, preferably at the level of a fishing trip.
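To illustrate how the lasso mentioned in the abstract above can drop uninformative covariates, here is a minimal coordinate descent sketch. The function names, the synthetic data and the penalty value are assumptions for illustration only; the study itself uses mixed effects models on logbook data, which this toy example does not attempt to reproduce.

```python
import numpy as np


def soft_threshold(z, t):
    """Shrink z towards zero by t, returning exactly zero when |z| <= t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1 / (2 n)) * ||y - X b||^2 + alpha * ||b||_1 by cycling over coordinates."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n  # per-column mean of squares
    for _ in range(n_iter):
        for j in range(p):
            # Residual with covariate j's current contribution added back.
            partial_residual = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ partial_residual / n
            beta[j] = soft_threshold(rho, alpha) / col_sq[j]
    return beta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))  # three synthetic candidate covariates
    y = 3.0 * X[:, 0]              # only the first one drives the response
    print(lasso_coordinate_descent(X, y, alpha=0.1))
```

The covariates whose coefficients are shrunk to exactly zero are the ones the penalty removes from the model, which is what makes the method usable for selection when the candidate list is long.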

    Using historical fisheries data to predict tuna distribution within the British Indian Ocean Territory marine protected area, and implications for its management

    1. Recently, several large marine protected areas (MPAs) have been established globally and it is hoped that they will aid the recovery of populations of highly-mobile, large pelagic species. Understanding the distribution of these species within MPAs is key to delivering effective management but monitoring can be challenging over such vast areas of open ocean. 2. Historical fisheries data, collected prior to reserve establishment, can provide an insight into the past distributions of target species. We investigated the spatial and temporal distribution of yellowfin (Thunnus albacares) and skipjack (Katsuwonus pelamis) tuna catch using logbook data from the purse seine fishery in British Indian Ocean Territory (BIOT) from 1996 to 2010, before it was established as an MPA in April 2010. 3. Generalized additive models (GAMs) were used to predict tuna presence and relative abundance from fishing records in relation to temporal and environmental variables. Significant variables included sea salinity, temperature and water velocity. 4. Predictions from the models identified a distinct hotspot for large yellowfin tuna within the MPA, and areas of high predicted relative abundance of skipjack tuna. We recommend that these areas are used as focal points from which populations can be monitored and investigations into tuna residency time can occur, so that the effectiveness of the MPA in conserving highly-mobile pelagic fish can be determined.

    Generating informative trajectories by using bounds on the return of control policies

    Abstract: We propose new methods for guiding the generation of informative trajectories when solving discrete-time optimal control problems. These methods exploit recently published results that provide ways of computing bounds on the return of control policies from a set of trajectories. Keywords: reinforcement learning, optimal control, sampling strategies. Introduction: Discrete-time optimal control problems arise in many fields such as finance, medicine, engineering as well as artificial intelligence. Whatever the techniques used for solving such problems, their performance is related to the amount of information available on the system dynamics and the reward function of the optimal control problem. In this paper, we consider settings in which information on the system dynamics must be inferred from trajectories and, furthermore, due to cost and time constraints, only a limited number of trajectories can be generated. We assume that a regularity structure, given in the form of Lipschitz continuity assumptions, exists on the system dynamics and the reward function. Under such assumptions, we exploit recently published methods for computing bounds on the return of control policies from a set of trajectories