
    Status report on the NCRIS eResearch capability summary

    Preface. The period 2006 to 2014 has seen an unprecedented approach by the Australian Government to the national support of eResearch infrastructure. Not only has investment been at a significantly greater scale than previously, but the intent and approach have been highly innovative, shaped by a strategic approach to research support in which the critical element, the catchword, has been collaboration. The directions set by this strategy, under the banner of the Australian Government's National Collaborative Research Infrastructure Strategy (NCRIS), have led to significant and creative initiatives and activity, seminal to new research and fields of discovery.

    Origin. This document is a technical report on the status of the NCRIS eResearch capability. It was commissioned by the Australian Government Department of Education and Training in the second half of 2014 to examine a range of questions and issues concerning the development of this infrastructure over the period 2006-2014. The infrastructure has been built and implemented over this period following Australian Government investments of more than $430 million, made under a number of funding initiatives.

    ESG-CET Final Progress Report

    The Australian research data infrastructure strategy

    Executive summary. Data is central to all research. Data, in its raw or processed form, from its original source (such as an ocean sensor) or via an analytical processor (such as the cores of a supercomputer), invariably depends on research infrastructure for its collection, generation, manipulation, characterisation, use and dissemination. Research data infrastructure refers to the range of facilities, equipment and tools that serve research through data generation, manipulation, curation and access; it includes the data itself.

    The Australian Government has made significant investments in research data infrastructure, guided by principles set out in existing strategies. In light of newly developed sets of principles, in particular the Strategic Framework for Research Infrastructure Investment principles in the 2011 Strategic Roadmap for Australian Research Infrastructure, the Government established the Research Data Infrastructure Committee (RDIC). The committee reviewed the national research data landscape to provide advice on how to optimise existing and future investments in research data infrastructure. Developed as the Australian Research Data Infrastructure Strategy, this advice provides a basis for policy makers, investors, developers, operators and users to build and sustain an effective and holistic Australian research data infrastructure system: one that collects data systematically and intentionally, organises data to make it more valuable, and uses data insightfully many times over.

    The strategy proposes three key requirements for a successful national research data infrastructure framework:

    - sustained infrastructure to support priority research data collections, data generation and management
    - appropriate data governance and access arrangements
    - delivery of enhanced research outcomes from effective data infrastructure arrangements

    Integrating data and analysis technologies within leading environmental research infrastructures: Challenges and approaches

    When researchers analyze data, significant effort is typically required in data preparation to make the data analysis-ready. This often involves cleaning, pre-processing, harmonizing, or integrating data from one or multiple sources and placing them into a computational environment in a form suitable for analysis. Research infrastructures (RIs) and their data repositories host data and make them available to researchers, but rarely offer a computational environment for data analysis. Published data are often persistently identified, but such identifiers resolve to landing pages that must be (manually) navigated to determine how the data are accessed. This navigation is typically challenging or impossible for machines.

    This paper surveys existing approaches for improving environmental data access to facilitate more rapid data analyses in computational environments, and thus contribute to a more seamless integration of data and analysis. By analysing current state-of-the-art approaches and the solutions implemented by world-leading environmental RIs, we highlight existing practices for interfacing data repositories with computational environments and the challenges moving forward. We found that while the level of standardization has improved in recent years, it is still challenging for machines to discover and access data based on persistent identifiers. This is problematic with regard to the emerging requirements for FAIR (Findable, Accessible, Interoperable, and Reusable) data in general, and for seamless integration of data and analysis in particular.

    There are a number of promising approaches that would improve the state of the art. A key approach presented here involves software libraries that streamline reading data and metadata into computational environments; we describe this approach in detail for two research infrastructures. We argue that developing and maintaining specialized libraries for each RI, across the range of programming languages used in data analysis, does not scale well. Based on this observation, we propose a set of established standards and web practices that, if implemented by environmental RIs, would enable the development of RI- and programming-language-independent software libraries with much reduced implementation and maintenance effort and considerably lower learning requirements for users. To catalyse such advancement, we propose a roadmap and key action points for technology harmonization among RIs that we argue will build the foundation for efficient and effective integration of data and analysis.

    This work was supported by the European Union's Horizon 2020 research and innovation programme under grant agreements No. 824068 (ENVRI-FAIR project) and No. 831558 (FAIRsFAIR project). NEON is a project sponsored by the National Science Foundation (NSF) and managed under cooperative support agreement (EF-1029808) to Battelle.
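    The landing-page problem described above has a concrete, standards-based alternative. As a minimal sketch, assuming a dataset DOI registered with DataCite (the DOI below is a placeholder), a client can use HTTP content negotiation against the doi.org resolver to retrieve machine-readable schema.org metadata directly, rather than scraping the HTML landing page:

```python
import requests

# Placeholder DOI for illustration; substitute a real DataCite dataset DOI.
DOI = "10.1234/example-dataset"

# Content negotiation: ask the DOI resolver for schema.org JSON-LD
# instead of the human-oriented HTML landing page.
response = requests.get(
    f"https://doi.org/{DOI}",
    headers={"Accept": "application/vnd.schemaorg.ld+json"},
    timeout=30,
)
response.raise_for_status()
metadata = response.json()

# With machine-actionable metadata, a generic client can locate a direct
# download link without any manual navigation of the landing page.
print(metadata.get("name"))
print(metadata.get("contentUrl") or metadata.get("distribution"))
```

    Because this relies only on DOIs and standard web content negotiation rather than repository-specific APIs, it illustrates the kind of practice that keeps a client library independent of any single RI.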

    Indiana University’s advanced cyberinfrastructure in service of IU strategic goals: Activities of the Research Technologies Division of UITS and National Center for Genome Analysis Support – two Pervasive Technology Institute cyberinfrastructure and service centers - during FY2014

    This report presents information on the activities of the Research Technologies Division of UITS and the National Center for Genome Analysis Support, two cyberinfrastructure and service centers of the Pervasive Technology Institute (PTI). Research Technologies (RT) is a subunit of University Information Technology Services (UITS) and operates and supports the largest computational, data, and visualization systems at IU. The National Center for Genome Analysis Support (NCGAS) is primarily federally funded and serves the national community of genome scientists; its leadership is drawn from the Office of the Vice President for Information Technology, UITS, the College, and the School of Informatics and Computing. This report focuses on the contributions of RT and NCGAS to the accomplishment of IU's bicentennial goals, and is organized according to those goals. Together, the activities of NCGAS and RT represent a large share of PTI's activities in support of the IU community. PTI's research centers (the Data to Insight Center, the Digital Science Center, and the Center for Applied Cybersecurity Research) also support the IU community in various forms, but their primary focus is informatics, information technology, and computer science research.

    Application of machine learning techniques to weather forecasting

    Weather forecasting is, still today, a human-based activity. Although computer simulations play a major role in modelling the state and evolution of the atmosphere, methodologies to automate the interpretation of the information these models generate are lacking. This doctoral thesis explores the use of machine learning methodologies to solve specific problems in meteorology, with a particular focus on methodologies that improve the accuracy of numerical weather prediction models. The work presented in this manuscript comprises two different approaches. In the first part, classical methodologies, such as multivariate non-parametric regression and binary trees, are used to perform regression on meteorological data. This part focuses particularly on wind forecasting, where the circular nature of the variable poses interesting challenges for classical machine learning algorithms and techniques. The second part of this thesis treats the analysis of weather data as a generic structured prediction problem using deep neural networks. Neural networks, such as convolutional and recurrent networks, provide a method for capturing the spatial and temporal structure inherent in weather prediction models. This part explores the potential of deep convolutional neural networks for solving difficult problems in meteorology, such as modelling precipitation from basic numerical model fields. The research performed during the completion of this thesis demonstrates that collaboration between the machine learning and meteorology research communities is mutually beneficial and leads to advances in both disciplines. Weather forecasting models and observational data represent unique examples of the large (petabytes), structured, high-quality data sets that the machine learning community demands for developing the next generation of scalable algorithms.
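    As a rough illustration of the second approach, the sketch below (a minimal example under assumed dimensions, not the architecture from the thesis) shows a small convolutional network in PyTorch that maps a stack of gridded numerical-model fields to a precipitation field on the same grid; the field count, grid size, and layer widths are placeholders:

```python
import torch
import torch.nn as nn

class PrecipCNN(nn.Module):
    """Maps stacked numerical-model fields (channels) on a lat/lon grid
    to a single precipitation field on the same grid."""

    def __init__(self, n_fields: int = 8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(n_fields, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 1, kernel_size=1),  # per-cell precipitation estimate
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

# Toy batch: 4 samples, 8 model fields (e.g. temperature, humidity, winds)
# on a 64x64 grid. Real inputs would come from NWP or reanalysis output.
model = PrecipCNN(n_fields=8)
fields = torch.randn(4, 8, 64, 64)
precip = model(fields)  # shape: (4, 1, 64, 64)
print(precip.shape)
```

    The convolution layers share weights across the grid, which is what lets such a network exploit the spatial structure of the model fields that the abstract highlights.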

    Institutional plan FY 2003-FY 2007.
