A Big Data Analyzer for Large Trace Logs
The current generation of Internet-based services is typically hosted on large
data centers that take the form of warehouse-size structures housing tens of
thousands of servers. Continued availability of a modern data center is the
result of a complex orchestration among many internal and external actors
including computing hardware, multiple layers of intricate software, networking
and storage devices, electrical power and cooling plants. During the course of
their operation, many of these components produce large amounts of data in the
form of event and error logs that are essential not only for identifying and
resolving problems but also for improving data center efficiency and
management. Most of these activities would benefit significantly from data
analytics techniques to exploit hidden statistical patterns and correlations
that may be present in the data. The sheer volume of data to be analyzed makes
uncovering these correlations and patterns a challenging task. This paper
presents BiDAl, a prototype Java tool for log-data analysis that incorporates
several Big Data technologies in order to simplify the task of extracting
information from data traces produced by large clusters and server farms. BiDAl
provides the user with several analysis languages (SQL, R and Hadoop MapReduce)
and storage backends (HDFS and SQLite) that can be freely mixed and matched so
that a custom tool for a specific task can be easily constructed. BiDAl has a
modular architecture so that it can be extended with other backends and
analysis languages in the future. In this paper we present the design of BiDAl
and describe our experience using it to analyze publicly-available traces from
Google data clusters, with the goal of building a realistic model of a complex
data center.
Comment: 26 pages, 10 figures
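The mix-and-match idea the abstract describes, picking a storage backend and an analysis language by name, can be sketched roughly as follows. This is a hypothetical Python illustration of the architecture only (BiDAl itself is written in Java); the registry, class names, and toy trace are invented for the example.

```python
import sqlite3

# Hypothetical sketch of BiDAl's mix-and-match design: a registry maps
# backend names to handlers, so a custom pipeline is assembled by name.
# BiDAl is actually written in Java; everything here is illustrative.

class SQLiteBackend:
    """Storage backend holding a trace table in an in-memory SQLite DB."""
    def __init__(self, rows):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE trace (job_id INTEGER, cpu REAL)")
        self.conn.executemany("INSERT INTO trace VALUES (?, ?)", rows)

    def run_sql(self, query):
        return self.conn.execute(query).fetchall()

BACKENDS = {"sqlite": SQLiteBackend}   # an HDFS backend could be registered too

def analyze(backend_name, rows, query):
    backend = BACKENDS[backend_name](rows)   # pick the backend by name
    return backend.run_sql(query)            # run the chosen analysis on it

# Average CPU usage per job from a toy trace:
trace = [(1, 0.2), (1, 0.4), (2, 0.9)]
result = analyze("sqlite", trace,
                 "SELECT job_id, AVG(cpu) FROM trace GROUP BY job_id ORDER BY job_id")
```

A class exposing the same `run_sql`-style interface over another store could be added to the registry under a different name, which is the sense in which backends and analysis languages "mix and match".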
Big Data Strategies for Data Center Infrastructure Management Using a 3D Gaming Platform
High Performance Computing (HPC) is intrinsically linked to effective Data
Center Infrastructure Management (DCIM). Cloud services and HPC have become key
components in Department of Defense and corporate Information Technology
competitive strategies in the global and commercial spaces. As a result, the
reliance on consistent, reliable Data Center space is more critical than ever.
The costs and complexity of providing quality DCIM are constantly being tested
and evaluated by the United States Government and companies such as Google,
Microsoft and Facebook. This paper will demonstrate a system where Big Data
strategies and 3D gaming technology are leveraged to successfully monitor and
analyze multiple HPC systems and a lights-out modular HP EcoPOD 240a Data
Center on a single platform. Big Data technology and a 3D gaming platform
enable near-real-time monitoring of 5000 environmental sensors and more
than 3500 IT data points, and display visual analytics of the overall operating
condition of the Data Center from a command center over 100 miles away. In
addition, the Big Data model allows for in depth analysis of historical trends
and conditions to optimize operations achieving even greater efficiencies and
reliability.
Comment: 6 pages; accepted to IEEE High Performance Extreme Computing (HPEC) conference 201
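The roll-up from thousands of raw sensor readings to an at-a-glance operating condition can be sketched roughly as below. The zone names and warning threshold are invented for illustration, not taken from the EcoPOD deployment.

```python
from statistics import mean

# Hypothetical sketch of summarizing many environmental sensor readings
# into a per-zone status for a remote dashboard. The threshold and zone
# names are assumptions for the example.

TEMP_WARN_C = 30.0   # assumed warning threshold in Celsius

def zone_status(readings):
    """readings: {zone: [temperature samples]} -> {zone: (avg, status)}"""
    summary = {}
    for zone, temps in readings.items():
        avg = mean(temps)
        summary[zone] = (round(avg, 1), "WARN" if avg > TEMP_WARN_C else "OK")
    return summary

status = zone_status({"cold-aisle-1": [21.0, 22.5, 21.5],
                      "hot-aisle-1": [33.0, 35.0]})
```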
Gaining from Training?: Designing an Online Training Module for University of Hawai‘i-West O‘ahu Peer-Tutors
The No‘eau Center, a learning center at the University of Hawai‘i-West O‘ahu (UHWO), provides supplemental support services to UHWO students through peer tutoring. In order to offer this service, the No‘eau Center hires UHWO undergraduate students and prepares them for tutoring through a rigorous training program. Following the guidelines of the International Tutor Training Program Certification (ITTPC) provided by the College Reading and Learning Association (CRLA), the center is qualified to provide Level 1 tutor training, which focuses on foundational tutoring elements for peer tutors. Having completed the requirements of Level 1 training, returning peer tutors have expressed a desire to broaden their tutoring abilities. In order to obtain Level-2 ITTPC certification from the CRLA, the No‘eau Center is required to provide training on ways to enhance the learning environment of a tutoring session. The purpose of this project was to create and evaluate an online tutor-training module to educate peer tutors on ways to structure and modify the learning environment of a tutoring session. The module was created using Google Sites, a free web development platform, as well as a combination of tools including Google Docs, Google Forms, and YouTube. A constructivist design approach blended with anchored instruction was integrated into the design. This study involved a total of 11 participants ranging in age from 18 to 26. All data collected from the project were analyzed and reported using statistical and descriptive analysis. The results suggest that after completing the online tutor training module, participants' knowledge of tutoring strategies increased.
Community-Based Services that Facilitate Interoperability and Intercomparison of Precipitation Datasets from Multiple Sources
Over the past 12 years, large volumes of precipitation data have been generated from space-based observatories (e.g., TRMM), merging of data products (e.g., gridded 3B42), models (e.g., GMAO), climatologies (e.g., Chang SSM/I derived rain indices), field campaigns, and ground-based measuring stations. The science research, applications, and education communities have greatly benefited from the unrestricted availability of these data from the Goddard Earth Sciences Data and Information Services Center (GES DISC) and, in particular, the services tailored toward precipitation data access and usability. In addition, tools and services that are responsive to the expressed evolving needs of the precipitation data user communities have been developed at the Precipitation Data and Information Services Center (PDISC) (http://disc.gsfc.nasa.gov/precipitation or google NASA PDISC), located at the GES DISC, to provide users with quick data exploration and access capabilities. In recent years, data management and access services have become increasingly sophisticated, such that they now afford researchers, particularly those interested in multi-data set science analysis and/or data validation, the ability to homogenize data sets, in order to apply multi-variant, comparison, and evaluation functions. Included in these services is the ability to capture data quality and data provenance. These interoperability services can be directly applied to future data sets, such as those from the Global Precipitation Measurement (GPM) mission. This presentation describes the data sets and services at the PDISC that are currently used by precipitation science and applications researchers, and which will be enhanced in preparation for GPM and associated multi-sensor data research. Specifically, the GES-DISC Interactive Online Visualization ANd aNalysis Infrastructure (Giovanni) will be illustrated. 
Giovanni enables scientific exploration of Earth science data without researchers having to perform the complicated data access and match-up processes. In addition, PDISC tool and service capabilities being adapted for GPM data will be described, including the Google-like Mirador data search and access engine; semantic technology to help manage large amounts of multi-sensor data and their relationships; data access through various Web services (e.g., OPeNDAP, GDS, WMS, WCS); conversion to various formats (e.g., netCDF, HDF, KML (for Google Earth)); visualization and analysis of Level 2 data profiles and maps; parameter and spatial subsetting; time and temporal aggregation; regridding; data version control and provenance; continuous archive verification; and expertise in data-related standards and interoperability. The goal of providing these services is to further the progress towards a common framework by which data analysis/validation can be more easily accomplished
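One of the listed services, spatial subsetting of a gridded field, can be sketched minimally as below, assuming a simple regular lat/lon mesh. The grid and values are toy data for illustration, not a Giovanni or PDISC API.

```python
# Hypothetical sketch of the spatial subsetting a Giovanni-style service
# performs: clip a gridded field to a lat/lon bounding box. The grid
# spacing and precipitation values are invented toy data.

def subset(grid, lats, lons, lat_min, lat_max, lon_min, lon_max):
    """grid[i][j] is the value at (lats[i], lons[j]); return the clipped grid."""
    rows = [i for i, lat in enumerate(lats) if lat_min <= lat <= lat_max]
    cols = [j for j, lon in enumerate(lons) if lon_min <= lon <= lon_max]
    return [[grid[i][j] for j in cols] for i in rows]

# 3x4 toy precipitation grid (mm/day) on a 1-degree mesh:
lats = [10.0, 11.0, 12.0]
lons = [100.0, 101.0, 102.0, 103.0]
grid = [[0.0, 1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0, 7.0],
        [8.0, 9.0, 10.0, 11.0]]
clip = subset(grid, lats, lons, 10.5, 12.0, 101.0, 102.5)
```

Real services add temporal aggregation and regridding on top of the same clipping idea.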
Using NASA's Giovanni Web Portal to Access and Visualize Satellite-based Earth Science Data in the Classroom
One of the biggest obstacles for the average Earth science student today is locating and obtaining satellite-based remote sensing data sets in a format that is accessible and optimal for their data analysis needs. At the Goddard Earth Sciences Data and Information Services Center (GES-DISC) alone, on the order of hundreds of Terabytes of data are available for distribution to scientists, students and the general public. The single biggest and time-consuming hurdle for most students when they begin their study of the various datasets is how to slog through this mountain of data to arrive at a properly sub-setted and manageable data set to answer their science question(s). The GES DISC provides a number of tools for data access and visualization, including the Google-like Mirador search engine and the powerful GES-DISC Interactive Online Visualization ANd aNalysis Infrastructure (Giovanni) web interface
The Effect of Job Satisfaction, Work Motivation, and Affective Commitment on Employee Performance
This study aims to examine whether job satisfaction, work motivation, and affective commitment affect employee performance. The sample comprised employees of the Seira Wermaktian Health Center, selected by total sampling. Data were collected by sending questionnaires, distributed as a Google Form, to health center employees via personal chat or email; 45 questionnaires were processed. The data were analyzed using multiple linear regression. The results indicate that job satisfaction, work motivation, and affective commitment each have a positive effect on employee performance.
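The analysis step can be sketched as an ordinary-least-squares fit of performance on the three predictors. The scores below are invented toy data constructed so the fit is checkable, not the study's 45 questionnaires.

```python
# Hypothetical sketch of the multiple linear regression step: fit
# performance = b0 + b1*satisfaction + b2*motivation + b3*commitment
# by ordinary least squares. Data below are invented toy values.

def ols(X, y):
    """Solve the normal equations (X^T X) b = X^T y by Gaussian elimination."""
    n, p = len(X), len(X[0])
    A = [[sum(X[k][i] * X[k][j] for k in range(n)) for j in range(p)]
         for i in range(p)]
    b = [sum(X[k][i] * y[k] for k in range(n)) for i in range(p)]
    for i in range(p):                      # forward elimination with pivoting
        piv = max(range(i, p), key=lambda r: abs(A[r][i]))
        A[i], A[piv] = A[piv], A[i]
        b[i], b[piv] = b[piv], b[i]
        for r in range(i + 1, p):
            f = A[r][i] / A[i][i]
            for c in range(i, p):
                A[r][c] -= f * A[i][c]
            b[r] -= f * b[i]
    coef = [0.0] * p
    for i in reversed(range(p)):            # back substitution
        coef[i] = (b[i] - sum(A[i][j] * coef[j]
                              for j in range(i + 1, p))) / A[i][i]
    return coef

# Toy design matrix [intercept, x1, x2] with an exact relation
# y = 1 + 2*x1 + 0.5*x2, so the recovered coefficients are checkable:
X = [[1, 1, 2], [1, 2, 1], [1, 3, 4], [1, 4, 3]]
y = [1 + 2 * x1 + 0.5 * x2 for _, x1, x2 in X]
coef = ols(X, y)   # approximately [1.0, 2.0, 0.5]
```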
A Big Data analyzer for large trace logs
The current generation of Internet-based services is typically hosted on large data centers that take the form of warehouse-size structures housing tens of thousands of servers. Continued availability of a modern data center is the result of a complex orchestration among many internal and external actors including computing hardware, multiple layers of intricate software, networking and storage devices, electrical power and cooling plants. During the course of their operation, many of these components produce large amounts of data in the form of event and error logs that are essential not only for identifying and resolving problems but also for improving data center efficiency and management. Most of these activities would benefit significantly from data analytics techniques to exploit hidden statistical patterns and correlations that may be present in the data. The sheer volume of data to be analyzed makes uncovering these correlations and patterns a challenging task. This paper presents Big Data analyzer (BiDAl), a prototype Java tool for log-data analysis that incorporates several Big Data technologies in order to simplify the task of extracting information from data traces produced by large clusters and server farms. BiDAl provides the user with several analysis languages (SQL, R and Hadoop MapReduce) and storage backends (HDFS and SQLite) that can be freely mixed and matched so that a custom tool for a specific task can be easily constructed. BiDAl has a modular architecture so that it can be extended with other backends and analysis languages in the future. In this paper we present the design of BiDAl and describe our experience using it to analyze publicly-available traces from Google data clusters, with the goal of building a realistic model of a complex data center.
The Impact of the Merdeka Curriculum on Indonesian Education
This study aims to determine the effect size of the independent (Merdeka) curriculum on Indonesian education. The study is a systematic literature review (SLR) and meta-analysis. The data come from an analysis of 10 national and international journal articles published in 2018-2023, located with the search keyword "the influence of the independent curriculum in Indonesia" in the Google Scholar, Education Resources Information Center (ERIC), and ScienceDirect databases. Data were analyzed with the help of the Comprehensive Meta-Analysis (CMA) application. The study concludes that the independent curriculum has a positive effect on the education system in Indonesia, with a summary effect size of rE = 0.68 (Z = 8.146; p < 0.001). This finding shows that the application of the Merdeka curriculum has a significant effect on Indonesian education, in the medium category. Implementing the independent curriculum can train students' critical thinking, scientific literacy, and numeracy skills in learning.
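The summary effect size such a meta-analysis reports can be sketched with inverse-variance (fixed-effect) weighting, which is one of the models CMA-style software computes. The per-study effects and variances below are invented, not the 10 reviewed studies.

```python
import math

# Hypothetical sketch of a fixed-effect meta-analytic summary:
# weight each study by the inverse of its sampling variance.
# Effects and variances below are invented toy numbers.

def fixed_effect_summary(effects, variances):
    """Return (summary effect, Z statistic) under a fixed-effect model."""
    weights = [1.0 / v for v in variances]
    summary = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))     # standard error of the summary
    return summary, summary / se

effects = [0.7, 0.6, 0.8]        # toy per-study standardized effect sizes
variances = [0.04, 0.05, 0.04]   # toy per-study sampling variances
summary, z = fixed_effect_summary(effects, variances)
```

A random-effects model would additionally estimate between-study variance before weighting.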
BiDAl: Big Data Analyzer for Cluster Traces
Modern data centers that provide Internet-scale services are stadium-size structures housing tens of thousands of heterogeneous devices (server clusters, networking equipment, power and cooling infrastructures) that must operate continuously and reliably. As part of their operation, these devices produce large amounts of data in the form of event and error logs that are essential not only for identifying problems but also for improving data center efficiency and management. These activities employ data analytics and often exploit hidden statistical patterns and correlations among different factors present in the data. Uncovering these patterns and correlations is challenging due to the sheer volume of data to be analyzed. This paper presents BiDAl, a prototype “log-data analysis framework” that incorporates various Big Data technologies to simplify the analysis of data traces from large clusters. BiDAl is written in Java with a modular and extensible architecture so that different storage backends (currently, HDFS and SQLite are supported), as well as different analysis languages (current implementation supports SQL, R and Hadoop MapReduce) can be easily selected as appropriate. We present the design of BiDAl and describe our experience using it to analyze several public traces of Google data clusters for building a simulation model capable of reproducing observed behavior
Creating User-Friendly Tools for Data Analysis and Visualization in K-12 Classrooms: A Fortran Dinosaur Meets Generation Y
During the summer of 2007, as part of the second year of a NASA-funded project in partnership with Christopher Newport University called SPHERE (Students as Professionals Helping Educators Research the Earth), a group of undergraduate students spent 8 weeks in a research internship at or near NASA Langley Research Center. Three students from this group formed the Clouds group along with a NASA mentor (Chambers), and the brief addition of a local high school student fulfilling a mentorship requirement. The Clouds group was given the task of exploring and analyzing ground-based cloud observations obtained by K-12 students as part of the Students' Cloud Observations On-Line (S'COOL) Project, and the corresponding satellite data. This project began in 1997. The primary analysis tools developed for it were in FORTRAN, a computer language none of the students were familiar with. While they persevered through computer challenges and picky syntax, it eventually became obvious that this was not the most fruitful approach for a project aimed at motivating K-12 students to do their own data analysis. Thus, about halfway through the summer the group shifted its focus to more modern data analysis and visualization tools, namely spreadsheets and Google(tm) Earth. The result of their efforts, so far, is two different Excel spreadsheets and a Google(tm) Earth file. The spreadsheets are set up to allow participating classrooms to paste in a particular dataset of interest, using the standard S'COOL format, and easily perform a variety of analyses and comparisons of the ground cloud observation reports and their correspondence with the satellite data. This includes summarizing cloud occurrence and cloud cover statistics, and comparing cloud cover measurements from the two points of view. A visual classification tool is also provided to compare the cloud levels reported from the two viewpoints. 
This provides a statistical counterpart to the existing S'COOL data visualization tool, which is used for individual ground-to-satellite correspondences. The Google(tm) Earth file contains a set of placemarks and ground overlays to show participating students the area around their school that the satellite is measuring. This approach will be automated and made interactive by the S'COOL database expert and will also be used to help refine the latitude/longitude location of the participating schools. Once complete, these new data analysis tools will be posted on the S'COOL website for use by the project participants in schools around the US and the world
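Generating a placemark file like the one described can be sketched as below. The school name and coordinates are invented; note that KML stores coordinates in longitude,latitude order.

```python
# Hypothetical sketch of producing a Google Earth placemark file for
# participating schools. The school name and coordinates are invented.

KML_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
{placemarks}
  </Document>
</kml>"""

def make_kml(schools):
    """schools: list of (name, lon, lat) -> KML document string."""
    marks = "\n".join(
        "    <Placemark><name>{}</name>"
        "<Point><coordinates>{},{},0</coordinates></Point></Placemark>"
        .format(name, lon, lat)
        for name, lon, lat in schools
    )
    return KML_TEMPLATE.format(placemarks=marks)

# One invented school near NASA Langley; KML wants lon,lat (not lat,lon):
kml = make_kml([("Sample Elementary", -76.49, 37.09)])
```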