
    A Big Data Analyzer for Large Trace Logs

    The current generation of Internet-based services is typically hosted on large data centers that take the form of warehouse-size structures housing tens of thousands of servers. Continued availability of a modern data center is the result of a complex orchestration among many internal and external actors including computing hardware, multiple layers of intricate software, networking and storage devices, electrical power and cooling plants. During the course of their operation, many of these components produce large amounts of data in the form of event and error logs that are essential not only for identifying and resolving problems but also for improving data center efficiency and management. Most of these activities would benefit significantly from data analytics techniques to exploit hidden statistical patterns and correlations that may be present in the data. The sheer volume of data to be analyzed makes uncovering these correlations and patterns a challenging task. This paper presents BiDAl, a prototype Java tool for log-data analysis that incorporates several Big Data technologies in order to simplify the task of extracting information from data traces produced by large clusters and server farms. BiDAl provides the user with several analysis languages (SQL, R and Hadoop MapReduce) and storage backends (HDFS and SQLite) that can be freely mixed and matched so that a custom tool for a specific task can be easily constructed. BiDAl has a modular architecture so that it can be extended with other backends and analysis languages in the future. In this paper we present the design of BiDAl and describe our experience using it to analyze publicly-available traces from Google data clusters, with the goal of building a realistic model of a complex data center. Comment: 26 pages, 10 figures

    Big Data Strategies for Data Center Infrastructure Management Using a 3D Gaming Platform

    High Performance Computing (HPC) is intrinsically linked to effective Data Center Infrastructure Management (DCIM). Cloud services and HPC have become key components in Department of Defense and corporate Information Technology competitive strategies in the global and commercial spaces. As a result, the reliance on consistent, reliable Data Center space is more critical than ever. The costs and complexity of providing quality DCIM are constantly being tested and evaluated by the United States Government and companies such as Google, Microsoft and Facebook. This paper will demonstrate a system where Big Data strategies and 3D gaming technology are leveraged to successfully monitor and analyze multiple HPC systems and a lights-out modular HP EcoPOD 240a Data Center on a singular platform. Big Data technology and a 3D gaming platform enable the relative real-time monitoring of 5000 environmental sensors and more than 3500 IT data points, and the display of visual analytics of the overall operating condition of the Data Center from a command center over 100 miles away. In addition, the Big Data model allows for in-depth analysis of historical trends and conditions to optimize operations, achieving even greater efficiencies and reliability. Comment: 6 pages; accepted to IEEE High Performance Extreme Computing (HPEC) conference 201

    Gaining from Training?: Designing an Online Training Module for University of Hawai‘i-West O‘ahu Peer-Tutors

    The No‘eau Center, a learning center at the University of Hawai‘i-West O‘ahu (UHWO), provides supplemental support services to UHWO students through peer tutoring. In order to offer this service, the No‘eau Center hires UHWO undergraduate students and prepares them for tutoring through a rigorous training program. Following the guidelines of the International Tutor Training Program Certification (ITTPC) provided by the College Reading and Learning Association (CRLA), the center is qualified to provide Level 1 tutor training, which focuses on foundational tutoring elements for peer tutors. Having completed the requirements of Level 1 training, returning peer tutors have expressed a desire to broaden their tutoring abilities. In order to obtain Level 2 ITTPC certification from the CRLA, the No‘eau Center is required to provide training on ways to enhance the learning environment of a tutoring session. The purpose of this project was to create and evaluate an online tutor-training module to educate peer tutors on ways to structure and modify the learning environment of a tutoring session. The module was created using Google Sites, a free web development platform, as well as a combination of tools including Google Docs, Google Forms, and YouTube. A constructivist design approach blended with anchored instruction was integrated into the design. This study involved a total of 11 participants ranging in age from 18 to 26. All data collected from the project was analyzed and reported through the use of statistical and descriptive analysis. The results of the data suggest that after completing the online tutor training module, participants’ knowledge of tutoring strategies increased

    Community-Based Services that Facilitate Interoperability and Intercomparison of Precipitation Datasets from Multiple Sources

    Over the past 12 years, large volumes of precipitation data have been generated from space-based observatories (e.g., TRMM), merging of data products (e.g., gridded 3B42), models (e.g., GMAO), climatologies (e.g., Chang SSM/I derived rain indices), field campaigns, and ground-based measuring stations. The science research, applications, and education communities have greatly benefited from the unrestricted availability of these data from the Goddard Earth Sciences Data and Information Services Center (GES DISC) and, in particular, the services tailored toward precipitation data access and usability. In addition, tools and services that are responsive to the expressed evolving needs of the precipitation data user communities have been developed at the Precipitation Data and Information Services Center (PDISC) (http://disc.gsfc.nasa.gov/precipitation or google NASA PDISC), located at the GES DISC, to provide users with quick data exploration and access capabilities. In recent years, data management and access services have become increasingly sophisticated, such that they now afford researchers, particularly those interested in multi-data set science analysis and/or data validation, the ability to homogenize data sets, in order to apply multi-variant, comparison, and evaluation functions. Included in these services is the ability to capture data quality and data provenance. These interoperability services can be directly applied to future data sets, such as those from the Global Precipitation Measurement (GPM) mission. This presentation describes the data sets and services at the PDISC that are currently used by precipitation science and applications researchers, and which will be enhanced in preparation for GPM and associated multi-sensor data research. Specifically, the GES-DISC Interactive Online Visualization ANd aNalysis Infrastructure (Giovanni) will be illustrated. Giovanni enables scientific exploration of Earth science data without researchers having to perform the complicated data access and match-up processes. In addition, PDISC tool and service capabilities being adapted for GPM data will be described, including the Google-like Mirador data search and access engine; semantic technology to help manage large amounts of multi-sensor data and their relationships; data access through various Web services (e.g., OPeNDAP, GDS, WMS, WCS); conversion to various formats (e.g., netCDF, HDF, KML (for Google Earth)); visualization and analysis of Level 2 data profiles and maps; parameter and spatial subsetting; time and temporal aggregation; regridding; data version control and provenance; continuous archive verification; and expertise in data-related standards and interoperability. The goal of providing these services is to further the progress towards a common framework by which data analysis/validation can be more easily accomplished

    Using NASA's Giovanni Web Portal to Access and Visualize Satellite-based Earth Science Data in the Classroom

    One of the biggest obstacles for the average Earth science student today is locating and obtaining satellite-based remote sensing data sets in a format that is accessible and optimal for their data analysis needs. At the Goddard Earth Sciences Data and Information Services Center (GES-DISC) alone, on the order of hundreds of terabytes of data are available for distribution to scientists, students and the general public. The single biggest and most time-consuming hurdle for most students when they begin their study of the various datasets is how to slog through this mountain of data to arrive at a properly subsetted and manageable data set to answer their science question(s). The GES DISC provides a number of tools for data access and visualization, including the Google-like Mirador search engine and the powerful GES-DISC Interactive Online Visualization ANd aNalysis Infrastructure (Giovanni) web interface

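    Pengaruh Kepuasan Kerja, Motivasi Kerja dan Komitmen Afektif terhadap Kinerja Karyawan (The Effect of Job Satisfaction, Work Motivation, and Affective Commitment on Employee Performance)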

    This study aims to test whether job satisfaction, work motivation, and affective commitment affect employee performance. The study sampled employees working at the Seira Wermaktian Health Center, using total sampling. Data were collected by sending questionnaires, in the form of a Google Form, to the health center (puskesmas) employees via personal chat or email. A total of 45 questionnaires were processed. The data were analyzed using multiple linear regression. The results of this study indicate that job satisfaction, work motivation, and affective commitment each have a partial positive effect on employee performance

    A Big Data analyzer for large trace logs

    The current generation of Internet-based services is typically hosted on large data centers that take the form of warehouse-size structures housing tens of thousands of servers. Continued availability of a modern data center is the result of a complex orchestration among many internal and external actors including computing hardware, multiple layers of intricate software, networking and storage devices, electrical power and cooling plants. During the course of their operation, many of these components produce large amounts of data in the form of event and error logs that are essential not only for identifying and resolving problems but also for improving data center efficiency and management. Most of these activities would benefit significantly from data analytics techniques to exploit hidden statistical patterns and correlations that may be present in the data. The sheer volume of data to be analyzed makes uncovering these correlations and patterns a challenging task. This paper presents the Big Data analyzer (BiDAl), a prototype Java tool for log-data analysis that incorporates several Big Data technologies in order to simplify the task of extracting information from data traces produced by large clusters and server farms. BiDAl provides the user with several analysis languages (SQL, R and Hadoop MapReduce) and storage backends (HDFS and SQLite) that can be freely mixed and matched so that a custom tool for a specific task can be easily constructed. BiDAl has a modular architecture so that it can be extended with other backends and analysis languages in the future. In this paper we present the design of BiDAl and describe our experience using it to analyze publicly-available traces from Google data clusters, with the goal of building a realistic model of a complex data center

    The Impact of The Merdeka Curriculum on Indonesia Education

    This study aims to determine the effect size of the independent (Merdeka) curriculum on Indonesian education. The study combines a systematic literature review (SLR) with a meta-analysis. The research data come from an analysis of 10 national and international journals published in 2018-2023. The search keyword was the influence of the independent curriculum in Indonesia. Data sources were searched through Google Scholar, the Education Resources Information Center (ERIC), and ScienceDirect. Data were analyzed with the help of the Comprehensive Meta-Analysis (CMA) application. The study concluded that the independent curriculum has a positive effect on the education system in Indonesia, with a summary effect size value (rE = 0.68; Z = 8.146; p < 0.001). This finding shows that the application of the Merdeka curriculum has a significant effect on Indonesian education, in the medium category. The implementation of an independent curriculum can train students in critical thinking skills, scientific literacy, and numeracy in learning

    BiDAl: Big Data Analyzer for Cluster Traces

    Modern data centers that provide Internet-scale services are stadium-size structures housing tens of thousands of heterogeneous devices (server clusters, networking equipment, power and cooling infrastructures) that must operate continuously and reliably. As part of their operation, these devices produce large amounts of data in the form of event and error logs that are essential not only for identifying problems but also for improving data center efficiency and management. These activities employ data analytics and often exploit hidden statistical patterns and correlations among different factors present in the data. Uncovering these patterns and correlations is challenging due to the sheer volume of data to be analyzed. This paper presents BiDAl, a prototype “log-data analysis framework” that incorporates various Big Data technologies to simplify the analysis of data traces from large clusters. BiDAl is written in Java with a modular and extensible architecture so that different storage backends (currently, HDFS and SQLite are supported), as well as different analysis languages (current implementation supports SQL, R and Hadoop MapReduce) can be easily selected as appropriate. We present the design of BiDAl and describe our experience using it to analyze several public traces of Google data clusters for building a simulation model capable of reproducing observed behavior

    Creating User-Friendly Tools for Data Analysis and Visualization in K-12 Classrooms: A Fortran Dinosaur Meets Generation Y

    During the summer of 2007, as part of the second year of a NASA-funded project in partnership with Christopher Newport University called SPHERE (Students as Professionals Helping Educators Research the Earth), a group of undergraduate students spent 8 weeks in a research internship at or near NASA Langley Research Center. Three students from this group formed the Clouds group along with a NASA mentor (Chambers), and the brief addition of a local high school student fulfilling a mentorship requirement. The Clouds group was given the task of exploring and analyzing ground-based cloud observations obtained by K-12 students as part of the Students' Cloud Observations On-Line (S'COOL) Project, and the corresponding satellite data. This project began in 1997. The primary analysis tools developed for it were in FORTRAN, a computer language none of the students were familiar with. While they persevered through computer challenges and picky syntax, it eventually became obvious that this was not the most fruitful approach for a project aimed at motivating K-12 students to do their own data analysis. Thus, about halfway through the summer the group shifted its focus to more modern data analysis and visualization tools, namely spreadsheets and Google(tm) Earth. The result of their efforts, so far, is two different Excel spreadsheets and a Google(tm) Earth file. The spreadsheets are set up to allow participating classrooms to paste in a particular dataset of interest, using the standard S'COOL format, and easily perform a variety of analyses and comparisons of the ground cloud observation reports and their correspondence with the satellite data. This includes summarizing cloud occurrence and cloud cover statistics, and comparing cloud cover measurements from the two points of view. A visual classification tool is also provided to compare the cloud levels reported from the two viewpoints. This provides a statistical counterpart to the existing S'COOL data visualization tool, which is used for individual ground-to-satellite correspondences. The Google(tm) Earth file contains a set of placemarks and ground overlays to show participating students the area around their school that the satellite is measuring. This approach will be automated and made interactive by the S'COOL database expert and will also be used to help refine the latitude/longitude location of the participating schools. Once complete, these new data analysis tools will be posted on the S'COOL website for use by the project participants in schools around the US and the world