55 research outputs found

    The NetHack learning environment

    Progress in Reinforcement Learning (RL) algorithms goes hand-in-hand with the development of challenging environments that test the limits of current methods. While existing RL environments are either sufficiently complex or based on fast simulation, they are rarely both. Here, we present the NetHack Learning Environment (NLE), a scalable, procedurally generated, stochastic, rich, and challenging environment for RL research based on the popular single-player terminal-based roguelike game, NetHack. We argue that NetHack is sufficiently complex to drive long-term research on problems such as exploration, planning, skill acquisition, and language-conditioned RL, while dramatically reducing the computational resources required to gather a large amount of experience. We compare NLE and its task suite to existing alternatives, and discuss why it is an ideal medium for testing the robustness and systematic generalization of RL agents. We demonstrate empirical success for early stages of the game using a distributed Deep RL baseline and Random Network Distillation exploration, alongside qualitative analysis of various agents trained in the environment. NLE is open source and available at https://github.com/facebookresearch/nle

    Performance Measurement for Digital Library Services

    The aim of this literature review is to draw a picture of past and current research on performance measurement as it applies to digital library services and, more broadly, to the digital environment, which consists of electronic information services and resources. The review starts with a statement of the topic and a tentative definition of the digital library, to be used as a reference model for any attempt to measure its performance. A virtuous cycle of good library management puts users and the provision of quality services at the core of its values. Performance measurement supports this process. Assessment of digital library services is outlined in terms of use, services provided, costs, management tools, and added value against the mission and goals of the institution. In the last decade, novelties brought about by the introduction of digital technologies in libraries have caused efforts to converge on devising both new objective models of statistical data gathering and sets of sound, reliable measures and indicators apt to gauge performance. Breaking fresh ground has proved not to be an easy task: lack of consistency, of comparable data, and of standards, due to the evolutionary state of the matter, has given birth to a number of initiatives and projects, mainly in the United States and the United Kingdom, which are still looking for common grounds of development. Testing is in progress and is crucial to obtain evidence of the appropriateness, reliability and comparability of performance indicators. At the same time, a number of researchers are looking beyond mere measurement of use and access, considered too limiting, and are moving forward to devise new evaluation techniques and a comprehensive view of the digital library. The issue of impact and outcome assessment, in terms of the benefits or changes in knowledge, behaviours and attitudes that users can derive from services and resources with potential long-term effects, is the new frontier

    Modeling the Global Water Resource System in an Integrated Assessment Modeling Framework: IGSM-WRS

    Abstract and PDF report are also available on the MIT Joint Program on the Science and Policy of Global Change website (http://globalchange.mit.edu/). The availability of water resources affects energy, agricultural and environmental systems, which are linked together as well as to climate via the water cycle. As such, watersheds and river basins are directly impacted by local and regional climate variations and change. In turn, these managed systems provide direct inputs to the global economy that serve and promote public health, agricultural and energy production, ecosystem surfaces and infrastructure. We have enhanced the Integrated Global System Model (IGSM) framework capabilities to model effects on the managed water-resource systems of the influence of potential climate change and associated shifts in hydrologic variation and extremes (i.e. non-stationarity in the hydro-climate system), and how we may be able to adapt to these impacts. A key component of this enhancement is the linkage of the Water Resources System (WRS) into the IGSM framework. WRS is a global river basin scale model of water resources management, agricultural (rain-fed and irrigated crops and livestock) and aquatic environmental systems. In particular, WRS will provide the capability within the IGSM framework to explore allocation of water among irrigation, hydropower, urban/industrial, and in-stream uses and investigate how society might adapt water resources due to shifts in hydro-climate variations and extremes. This paper presents the overall design of WRS, its linkages to the land system and economic models of the IGSM, and results of test bed runs of WRS components to address issues of temporal and spatial scales in these linkages. This study received support from the MIT Joint Program on the Science and Policy of Global Change, which is funded by a consortium of government, industry and foundation sponsors

    Identification of Dataset Characteristics for Federated SPARQL Query (Identifikasi Karakteristik Dataset untuk Federated SPARQL Query)

    Federated SPARQL query engines that can query multiple distributed SPARQL endpoints have been developed, making it possible to obtain data from multiple sources. When executing a query, each query engine performs differently. One of the factors that affects query engine performance is the characteristics of the accessed RDF dataset, such as the number of triples, classes, properties, subjects, entities, and objects, and the spreading factor of the dataset. This final project identifies the characteristics of RDF datasets and determines which of them influence query engine performance. The study identified the characteristics of 10 datasets taken from other research publications. The tests to determine the relationship between dataset characteristics and query engine performance were run using the federated SPARQL query engine FedX. The analysis shows that the number of triples and the number of classes associated with the query tend to affect query engine performance. Meanwhile, the number of properties associated with the query, the spreading factor of the dataset, and the spreading factor of the dataset associated with the query tend not to affect query engine performance
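
The counting characteristics named in this abstract can be illustrated with a minimal sketch. It is not the study's method: it assumes the dataset is already available in memory as (subject, predicate, object) string tuples, treats a "class" as any object of an `rdf:type` triple, and omits the entity count and the spreading factor because the abstract does not define them. The function name `dataset_characteristics` and the `http://example.org/` data are hypothetical.

```python
# Hypothetical sketch: basic counts over an in-memory list of RDF triples.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def dataset_characteristics(triples):
    """Return simple counts for a collection of (s, p, o) string tuples."""
    unique = set(triples)  # ignore exact duplicate triples
    return {
        "triples": len(unique),
        "subjects": len({s for s, _, _ in unique}),
        "properties": len({p for _, p, _ in unique}),
        "objects": len({o for _, _, o in unique}),
        # Assumption: a class is anything used as the object of rdf:type.
        "classes": len({o for _, p, o in unique if p == RDF_TYPE}),
    }


EX = "http://example.org/"  # made-up namespace for illustration only
example_triples = [
    (EX + "alice", RDF_TYPE, EX + "Person"),
    (EX + "bob", RDF_TYPE, EX + "Person"),
    (EX + "alice", EX + "knows", EX + "bob"),
    (EX + "alice", EX + "name", "Alice"),
]

stats = dataset_characteristics(example_triples)
print(stats)
```

For real datasets these counts would more plausibly be gathered with SPARQL aggregate queries against each endpoint; the in-memory version only shows what is being counted.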

    H-WORK project: Multilevel interventions to promote mental health in SMEs and public workplaces

    The paper describes the study design, research questions and methods of a large, international intervention project aimed at improving employee mental health and well-being in SMEs and public organisations. The study is innovative in multiple ways. First, it goes beyond the current debate on whether individual- or organisational-level interventions are most effective in improving employee health and well-being and tests the cumulative effects of multilevel interventions, that is, interventions addressing individual, group, leader and organisational levels. Second, it tailors its interventions to address the aftermath of the Covid-19 pandemic and develop suitable multilevel interventions for dealing with new ways of working. Third, it uses realist evaluation to explore and identify the working ingredients of and the conditions required for each level of intervention, and their outcomes. Finally, an economic evaluation will assess both the cost-effectiveness and the affordability of the interventions from the employer perspective. The study integrates the training transfer and the organisational process evaluation literature to develop toolkits helping end-users to promote mental health and well-being in the workplace

    The role of small missions in planetary and lunar exploration

    The Space Studies Board of the National Research Council charged its Committee on Planetary and Lunar Exploration (COMPLEX) to (1) examine the degree to which small missions, such as those fitting within the constraints of the Discovery program, can achieve priority objectives in the lunar and planetary sciences; (2) determine those characteristics, such as level of risk, flight rate, target mix, university involvement, technology development, management structure and procedures, and so on, that could allow a successful program; (3) assess issues, such as instrument selection, mission operations, data analysis, and data archiving, to ensure the greatest scientific return from a particular mission, given a rapid deployment schedule and a tightly constrained budget; and (4) review past programmatic attempts to establish small planetary science mission lines, including the Planetary Observers and Planetary Explorers, and consider the impact management practices have had on such programs. A series of small missions presents the planetary science community with the opportunity to expand the scope of its activities and to develop the potential and inventiveness of its members in ways not possible within the confines of large, traditional programs. COMPLEX also realized that a program of small planetary missions was, in and of itself, incapable of meeting all of the prime objectives contained in its report 'An Integrated Strategy for the Planetary Sciences: 1995-2010.' Recommendations are provided for the small planetary missions to fulfill their promise

    The institutional repository in the digital library

    We begin by looking at the concept of institutional repositories within the broader context of digital libraries. ‘Digital libraries’ can mean many things, but we consider them to be libraries first and foremost, and built upon the enduring principles of information management which have lain at the heart of the practice of librarianship for hundreds of years. We look also at the significance of the qualification which defines the scope of this book – the institutional repository. Libraries are themselves repositories, and have always dealt in the management of repositories for their users. With libraries now routinely managing repositories of various types in digital format, what does it mean to qualify ‘repository’ with ‘institutional’

    Towards Meaningful Statements in IR Evaluation. Mapping Evaluation Measures to Interval Scales

    Recently, it was shown that most popular IR measures are not interval-scaled, implying that decades of experimental IR research used potentially improper methods, which may have produced questionable results. However, it was unclear if and to what extent these findings apply to actual evaluations and this opened a debate in the community with researchers standing on opposite positions about whether this should be considered an issue (or not) and to what extent. In this paper, we first give an introduction to the representational measurement theory explaining why certain operations and significance tests are permissible only with scales of a certain level. For that, we introduce the notion of meaningfulness specifying the conditions under which the truth (or falsity) of a statement is invariant under permissible transformations of a scale. Furthermore, we show how the recall base and the length of the run may make comparison and aggregation across topics problematic. Then we propose a straightforward and powerful approach for turning an evaluation measure into an interval scale, and describe an experimental evaluation of the differences between using the original measures and the interval-scaled ones. For all the regarded measures - namely Precision, Recall, Average Precision, (Normalized) Discounted Cumulative Gain, Rank-Biased Precision and Reciprocal Rank - we observe substantial effects, both on the order of average values and on the outcome of significance tests. For the latter, previously significant differences turn out to be insignificant, while insignificant ones become significant. The effect varies remarkably between the tests considered but overall, on average, we observed a 25% change in the decision about which systems are significantly different and which are not
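
For orientation, four of the measures named in this abstract can be sketched in their common textbook form for binary relevance. This is only an illustration of the original measures, not the interval-scale mapping the paper proposes; the function names and the document identifiers are hypothetical, and the ranking is assumed to contain no duplicate documents.

```python
# Hypothetical sketch: textbook binary-relevance measures for one topic.

def precision(ranking, relevant):
    """Fraction of retrieved documents that are relevant."""
    if not ranking:
        return 0.0
    return sum(1 for doc in ranking if doc in relevant) / len(ranking)


def recall(ranking, relevant):
    """Fraction of relevant documents (the recall base) that were retrieved."""
    if not relevant:
        return 0.0
    return sum(1 for doc in ranking if doc in relevant) / len(relevant)


def average_precision(ranking, relevant):
    """Mean of precision at each relevant rank, divided by the recall base."""
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def reciprocal_rank(ranking, relevant):
    """Inverse of the rank of the first relevant document, 0.0 if none."""
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


run = ["d3", "d1", "d7", "d2"]   # made-up ranked run
qrels = {"d1", "d2", "d5"}       # made-up relevant set

print(precision(run, qrels), recall(run, qrels))
print(average_precision(run, qrels), reciprocal_rank(run, qrels))
```

Recall and Average Precision divide by the size of the relevant set and Precision by the run length, which is where the recall base and the length of the run enter these scores.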