
    An Assessment of Data Transfer Performance for Large-Scale Climate Data Analysis and Recommendations for the Data Infrastructure for CMIP6

    We document the data transfer workflow, data transfer performance, and other aspects of staging approximately 56 terabytes of climate model output data from the distributed Coupled Model Intercomparison Project (CMIP5) archive to the National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory, as required for tracking and characterizing extratropical storms, a phenomenon of importance in the mid-latitudes. We present this analysis to illustrate the current challenges in assembling multi-model data sets at major computing facilities for large-scale studies of CMIP5 data. Because of the larger archive size of the upcoming CMIP6 phase of model intercomparison, we expect such data transfers to become increasingly important, and perhaps a routine necessity. We find that data transfer rates through the Earth System Grid Federation (ESGF) are often slower than those typically available to US residences, and that there is significant room for improvement in the data transfer capabilities of the ESGF portal and data centers, both in workflow mechanics and in raw transfer performance. We believe performance improvements of at least an order of magnitude are within technical reach using current best practices, as illustrated by the performance we achieved in transferring the complete raw data set between two high performance computing facilities. To achieve these improvements, we recommend: that current best practices (such as the Science DMZ model) be applied to the data servers and networks at ESGF data centers; that sufficient financial and human resources be devoted at the ESGF data centers to the systems and network engineering tasks that support high performance data movement; and that performance metrics for data transfer between ESGF data centers and major computing facilities used for climate data analysis be established, regularly tested, and published.
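The scale of the problem is easy to sketch with back-of-envelope arithmetic (the rates below are illustrative round numbers, not figures from the study): at a rate typical of a US residential connection, moving the 56 TB data set takes weeks, while a well-tuned path between HPC facilities reduces that to less than a day.

```python
def transfer_days(terabytes: float, rate_mbps: float) -> float:
    """Days needed to move `terabytes` of data at a sustained rate of `rate_mbps` megabits/s."""
    bits = terabytes * 1e12 * 8          # decimal terabytes -> bits
    seconds = bits / (rate_mbps * 1e6)   # megabits/s -> bits/s
    return seconds / 86400               # seconds -> days

# 56 TB at an illustrative 100 Mb/s residential rate vs. a 10 Gb/s Science-DMZ-style path
print(transfer_days(56, 100))     # roughly 52 days
print(transfer_days(56, 10_000))  # roughly half a day
```

The two-orders-of-magnitude gap between the sustained rates is what makes the order-of-magnitude improvement claimed above plausible.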

    DEVELOPMENT OF A LEARNING MODULE "PROCESSING DATA WITH MICROSOFT ACCESS 2003" FOR COMPUTER SKILLS AND INFORMATION MANAGEMENT LESSONS AT SMK NEGERI 2 SUKOHARJO

    This study aims to: 1) develop a learning module for processing data with Microsoft Access 2003; 2) examine the feasibility of the module as a teaching medium at SMK Negeri 2 Sukoharjo; and 3) determine the effectiveness of the module in achieving competency in data processing applications. This is a research and development study following the Borg & Gall model, carried out in five steps: 1) product analysis; 2) initial product development; 3) expert validation and revision; 4) small-scale field trials and revision; and 5) large-scale field tests and the final product. The small-scale field trial involved 12 students and the large-scale field test 78 students, selected by purposive sampling. Feasibility was determined through expert validation and the small-scale field trial using questionnaires, while effectiveness was determined in the large-scale field test using practicum assessment results. Feasibility data were analyzed with descriptive statistics; effectiveness was tested with an independent two-sample t-test. According to the material and media experts in the validation stage, the learning module is fit for use, and the small-scale field trial rated it fit for use at 83.33%. The t-test gave t = 24.028 with df = 74 and p = 0.000, indicating a significant difference in practicum results between students who used the learning module and those who did not. The mean practicum score was 90.618 for the class using the module and 69.405 for the class without it. The entire class using the module reached competency, while five students in the class without it required remediation.
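The effectiveness test described above is an independent two-sample t-test; a minimal sketch of the pooled-variance version follows (the function name and toy scores are illustrative, not the study's data):

```python
from statistics import mean, variance

def t_independent(a, b):
    """Pooled two-sample t statistic and degrees of freedom (equal variances assumed)."""
    na, nb = len(a), len(b)
    # pooled estimate of the common variance
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / (sp2 * (1 / na + 1 / nb)) ** 0.5
    return t, na + nb - 2

# toy practicum scores: module group vs. non-module group
t, df = t_independent([90, 91, 92], [69, 70, 71])
```

A large |t| at the given df corresponds to a small p-value, i.e. strong evidence that the group means differ.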

    Visualizing Gene Clusters using Neighborhood Graphs in R

    The visualization of cluster solutions in gene expression data analysis gives practitioners an understanding of the cluster structure of their data and makes the cluster results easier to interpret. Neighborhood graphs allow visual assessment of the relationships between adjacent clusters. For biological reasons, the number of clusters in gene expression data tends to be rather large, and a linear projection of the data into two dimensions does not scale well with the number of clusters; new visualization techniques using non-linear arrangements of the clusters are therefore needed. The new visualization tool is implemented in the open source statistical computing environment R and is demonstrated on microarray data from yeast.
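One simple way to build such a neighborhood graph is to link each cluster to the cluster with the nearest centroid. The sketch below (in Python rather than the paper's R, with illustrative names and toy data) shows the idea:

```python
def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n, dim = len(points), len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dim))

def neighborhood_edges(clusters):
    """Undirected edges linking each cluster to its nearest neighbouring centroid.

    `clusters` maps a cluster label to its list of points.
    """
    cents = {k: centroid(v) for k, v in clusters.items()}
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    edges = set()
    for k, c in cents.items():
        nearest = min((j for j in cents if j != k), key=lambda j: dist(c, cents[j]))
        edges.add(tuple(sorted((k, nearest))))
    return edges
```

The resulting edge set can then be laid out with any graph-drawing routine, which avoids the scaling problem of a single linear projection.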

    Feasibility of using neural networks to obtain simplified capacity curves for seismic assessment

    The selection of a given method for the seismic vulnerability assessment of buildings depends mostly on the scale of the analysis: results obtained in large-scale studies are usually less accurate than those obtained in small-scale studies. This paper presents a study of the feasibility of using Artificial Neural Networks (ANNs) to carry out fast and accurate large-scale seismic vulnerability studies. In the proposed approach, an ANN is used to obtain a simplified capacity curve of a building typology, which is then used with the N2 method, as presented in Annex B of Eurocode 8, to assess the structural seismic behaviour. To study the accuracy of the approach, two ANNs with identical architectures were trained with different numbers of vectors, in order to evaluate the ANN's capacity to achieve good results in domains of the problem that are not well represented by the training vectors. The case study shows that the precision of the ANN depends strongly on the amount of training data, and demonstrates that ANNs can be used to obtain simplified capacity curves for seismic assessment purposes with high precision.
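For context, the N2 method of Eurocode 8 Annex B works with an idealized elastic-perfectly-plastic (bilinear) capacity curve, so a "simplified capacity curve" can be characterized by very few parameters. A minimal sketch of such a curve (the parameter values and function name are illustrative, not from the paper):

```python
def capacity_curve(dy, fy, du, d):
    """Idealized bilinear capacity curve: base shear at top displacement `d`.

    dy: yield displacement, fy: yield force (plateau), du: ultimate displacement.
    Elastic up to dy, perfectly plastic up to du, zero resistance beyond.
    """
    if d <= dy:
        return fy * d / dy   # elastic branch
    if d <= du:
        return fy            # plastic plateau
    return 0.0               # beyond ultimate displacement

# illustrative values: dy = 0.02 m, fy = 500 kN, du = 0.10 m
print(capacity_curve(0.02, 500.0, 0.10, 0.01))  # 250.0 kN on the elastic branch
```

An ANN predicting (dy, fy, du) for a building typology is then enough to reconstruct the whole curve.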

    Perceived Wealth as a Poverty Measure for Constructing a Poverty Profile. A Case Study Of Four Villages In Rural Tanzania

    Poverty assessment and targeting usually rely on expensive, large-scale survey data. We argue that, in some cases, exploiting the information villagers have about their immediate neighbors in close-knit agricultural societies can provide an alternative. We use the results of a participatory wealth ranking gathered in four villages in Tanzania and explore correlations between perceived wealth and indicators related to household characteristics, human capital, housing and durables, and productive assets. Comparing our results to a similar analysis using household expenditure survey data, we find that participatory methods confirm the validity of most commonly used poverty indicators, but we also find some remarkable differences.
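Correlating a perceived wealth ranking with candidate indicators is naturally done with a rank correlation. A minimal Spearman sketch (no tie correction; the toy data are illustrative, not the study's):

```python
def spearman(x, y):
    """Spearman rank correlation between two equal-length lists without tied values."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order, 1):
            r[i] = rank
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

# toy example: wealth rank vs. an asset-count indicator
print(spearman([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0 (perfectly concordant)
```

Values near +1 mean the indicator tracks perceived wealth closely; values near 0 mean it carries little targeting information.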

    A parallel grid-based implementation for real time processing of event log data in collaborative applications

    Collaborative applications usually register user interaction in the form of semi-structured plain text event log data. Extracting and structuring these data is a prerequisite for later key processes such as the analysis of interactions, the assessment of group activity, or the provision of awareness and feedback. Yet, in real situations of online collaborative activity, log data are usually processed offline, since structuring event log data is in general computationally costly and the amount of log data tends to be very large. Techniques to speed and scale up the structuring and processing of log data with minimal impact on the performance of the collaborative application are thus needed to process log data in real time. In this paper, we present a parallel grid-based implementation for real-time processing of the event log data generated by collaborative applications. Our results show the feasibility of using grid middleware to speed and scale up the structuring and processing of semi-structured event log data. The Grid prototype follows the Master-Worker (MW) paradigm; it is implemented using the Globus Toolkit (GT) and is tested on the PlanetLab platform.
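The Master-Worker pattern described above can be sketched in a few lines of Python; here a thread pool stands in for the grid workers, and the line format `timestamp|user|action` is an illustrative assumption, not the application's actual log format:

```python
from concurrent.futures import ThreadPoolExecutor

def parse_event(line):
    """Worker task: structure one semi-structured log line 'timestamp|user|action'."""
    ts, user, action = (field.strip() for field in line.split("|", 2))
    return {"timestamp": ts, "user": user, "action": action}

def process_log(lines, workers=4):
    """Master: farm log lines out to a worker pool and collect structured events.

    A thread pool stands in for grid workers; a real MW grid would dispatch
    batches of lines to remote nodes instead.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(parse_event, lines))
```

Because each line is parsed independently, the workload partitions trivially, which is what makes the MW decomposition a good fit.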

    Free global DSM assessment on large-scale areas exploiting the potentialities of the innovative Google Earth Engine platform

    The high-performance cloud-computing platform Google Earth Engine (GEE) has been developed for global-scale analysis of Earth observation data. In this work, the geometric accuracy of the two most widely used nearly-global free DSMs (SRTM and ASTER) has been evaluated over the territories of four American states (Colorado, Michigan, Nevada, Utah) and one Italian region (Trentino Alto-Adige, Northern Italy), exploiting the potential of this platform. These are large areas characterized by different terrain morphologies, land covers and slopes. The assessment has been performed against two different reference DSMs: the USGS National Elevation Dataset (NED) and a LiDAR acquisition. The accuracy of the DSMs has been evaluated through standard statistical parameters, both at the global scale (considering the whole state/region) and as a function of terrain morphology using several slope classes. The geometric accuracy, in terms of standard deviation and NMAD, ranges for SRTM from 2-3 meters in the first slope class to about 45 meters in the last one, whereas for ASTER the values range from 5-6 to 30 meters. In general, the analysis shows better accuracy for SRTM in flat areas, whereas the ASTER GDEM is more reliable in steep areas, where the slopes increase. These preliminary results highlight the potential of GEE to perform DSM assessment on a global scale.
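NMAD, the robust accuracy measure used above alongside the standard deviation, scales the median absolute deviation of the elevation errors so that it matches the standard deviation for normally distributed errors while remaining insensitive to outliers. A minimal sketch:

```python
from statistics import median

def nmad(errors):
    """Normalized Median Absolute Deviation: 1.4826 * median(|e - median(e)|).

    A robust spread estimate for elevation errors (reference DSM minus tested DSM);
    for Gaussian errors it converges to the standard deviation.
    """
    m = median(errors)
    return 1.4826 * median(abs(e - m) for e in errors)
```

Unlike the standard deviation, a handful of gross blunders (e.g. cloud artifacts in ASTER) barely moves the NMAD, which is why both statistics are reported per slope class.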

    Program on Earth Observation Data Management Systems (EODMS)

    An assessment was made of the needs of a group of potential users of satellite remotely sensed data (state, regional, and local agencies) involved in natural resources management in five states, and alternative data management systems to satisfy these needs are outlined. The tasks described include: (1) a comprehensive data needs analysis of state and local users; (2) the design of remote sensing-derivable information products that serve priority state and local data needs; (3) a cost and performance analysis of alternative processing centers for producing these products; (4) an assessment of the impacts of policy, regulation, and government structure on implementing large-scale use of remote sensing technology in this community of users; and (5) the elaboration of alternative institutional arrangements for operational Earth Observation Data Management Systems (EODMS). It is concluded that an operational EODMS will be of most use to state, regional, and local agencies if it provides a full range of information services, from raw data acquisition to interpretation and dissemination of final information products.

    REGIONAL INNOVATION SYSTEM FAILURES AND HIGHLIGHTS

    The systemic analysis of innovation involves complex analytical frameworks with intense socio-technological aspects of knowledge generation and encompasses a detailed analysis of system failures. These frameworks are not suitable for benchmarking a wide range of regions, owing to the low availability of such elaborate data sources. On the other hand, metric regional innovation micro-data offer the opportunity for a large-scale cross-regional benchmarking exercise, illustrating mainly the market failures of innovation systems, although this type of analysis does not provide any detailed systemic view. Is a combination of these two analytical approaches possible? This study presents the Interaction Intension Indicator (3I) analytical framework, analysing the system failures and highlights of various regional innovation deployment patterns, along with an analysis of the Romanian innovation system.
    Keywords: regional economic activity (growth, development, and changes); regional innovation policies; regional innovation metrics; regional innovation systems; innovation policy assessment