Lustre, Hadoop, Accumulo
Data processing systems impose multiple views on data as it is processed by
the system. These views include spreadsheets, databases, matrices, and graphs.
There is a wide variety of technologies that can be used to store and process
data through these different steps. The Lustre parallel file system, the Hadoop
distributed file system, and the Accumulo database are all designed to address
the largest and the most challenging data storage problems. There have been
many ad-hoc comparisons of these technologies. This paper describes the
foundational principles of each technology, provides simple models for
assessing their capabilities, and compares the various technologies on a
hypothetical common cluster. These comparisons indicate that Lustre provides 2x
more storage capacity, is less likely to lose data during 3 simultaneous drive
failures, and provides higher bandwidth on general purpose workloads. Hadoop
can provide 4x greater read bandwidth on special purpose workloads. Accumulo
provides 10,000x lower latency on random lookups than either Lustre or Hadoop
but Accumulo's bulk bandwidth is 10x less. Significant recent work has been
done to enable mix-and-match solutions that allow Lustre, Hadoop, and Accumulo
to be combined in different ways.
Comment: 6 pages; accepted to IEEE High Performance Extreme Computing
conference, Waltham, MA, 201
Performance Measurements of Supercomputing and Cloud Storage Solutions
Increasing amounts of data from varied sources, particularly in the fields of
machine learning and graph analytics, are causing storage requirements to grow
rapidly. A variety of technologies exist for storing and sharing these data,
ranging from parallel file systems used by supercomputers to distributed block
storage systems found in clouds. Relatively few comparative measurements exist
to inform decisions about which storage systems are best suited for particular
tasks. This work provides these measurements for two of the most popular
storage technologies: Lustre and Amazon S3. Lustre is an open-source, high
performance, parallel file system used by many of the largest supercomputers in
the world. Amazon's Simple Storage Service, or S3, is part of the Amazon Web
Services offering, and offers a scalable, distributed option to store and
retrieve data from anywhere on the Internet. Parallel processing is essential
for achieving high performance on modern storage systems. The performance tests
used span the gamut of parallel I/O scenarios, ranging from single-client,
single-node Amazon S3 and Lustre performance to a large-scale, multi-client
test designed to demonstrate the capabilities of a modern storage appliance
under heavy load. These results show that, when parallel I/O is used correctly
(i.e., many simultaneous read or write processes), full network bandwidth
performance is achievable, ranging from 10 gigabits/s over a 10 GigE S3
connection to 0.35 terabits/s using Lustre on a 1200 port 10 GigE switch. These
results demonstrate that S3 is well-suited to sharing vast quantities of data
over the Internet, while Lustre is well-suited to processing large quantities
of data locally.
Comment: 5 pages, 4 figures, to appear in IEEE HPEC 201
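The abstract's description of parallel I/O as "many simultaneous read or write processes" can be illustrated with a minimal sketch. It is not the benchmark used in the paper: the helper names (`read_file`, `parallel_read`, `make_test_files`), the thread pool, and the small local test files are all assumptions chosen only to show the pattern of issuing several reads at once and summing the bytes returned.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def read_file(path):
    """Read one file fully and return the number of bytes read."""
    with open(path, "rb") as f:
        return len(f.read())


def parallel_read(paths, workers=8):
    """Read all paths concurrently and return the total bytes read."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(read_file, paths))


def make_test_files(directory, count=4, size=1024):
    """Create `count` files of `size` bytes each (hypothetical test data)."""
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"chunk_{i}.bin")
        with open(path, "wb") as f:
            f.write(b"\0" * size)
        paths.append(path)
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        files = make_test_files(tmp)
        print(parallel_read(files))
```

On a real parallel file system or object store, the same structure would apply with the local paths replaced by files or objects on that system; the achievable bandwidth would then depend on the number of workers and the network, as the abstract notes.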
Development of a Fabric Lustre Scale
Fabric lustre is one of the attributes that affect the visual appearance of a fabric. It is the amount of specular light the fabric reflects. So far, there is no simple and satisfactory method for either the subjective or objective assessment of fabric lustre, since its measurement is complex. A series of experiments was conducted to develop a scale for the subjective measurement of fabric lustre. A number of woven fabric samples with varying lustre were used for the subjective assessment of lustre by trained assessors. A glossmeter was then used to measure the fabric samples objectively. Simple regression analysis was applied to relate the subjective to the objective lustre data, and the results indicated a high degree of agreement between them. The instrumental data were further used to construct a lustre scale, which was assessed statistically for its reliability using a larger fabric sample population. Furthermore, the lustre of the fabric samples was measured spectrophotometrically, and the results showed a good correlation between the delta Y values and the grade values of the physical lustre scale.
Keywords: Fabric lustre, lustre scale, glossmeter, spectrophotomete
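The simple regression step mentioned in the abstract, relating subjective grades to objective glossmeter readings, can be sketched as an ordinary least-squares fit with a correlation coefficient. This is only an illustration: the function name `simple_regression` is an assumption, and the numbers in the usage block are made-up placeholder values, not data from the study.

```python
from math import sqrt


def simple_regression(x, y):
    """Fit y = slope * x + intercept by least squares.

    Returns (slope, intercept, r), where r is the Pearson correlation.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


if __name__ == "__main__":
    # Placeholder values for illustration only (not from the paper).
    gloss_readings = [2.0, 4.0, 6.0, 8.0, 10.0]
    assessor_grades = [1.0, 2.1, 2.9, 4.2, 5.0]
    slope, intercept, r = simple_regression(gloss_readings, assessor_grades)
    print(f"slope={slope:.3f} intercept={intercept:.3f} r={r:.3f}")
```

A value of `r` close to 1 would correspond to the "high degree of agreement" the abstract reports between the subjective and objective measurements.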
The Need for Health and Community Resources in Monterey County
The Planning, Evaluation and Policy Unit (PEP) of the Monterey County Health Department focuses on three areas: facilitating the implementation of the Health Department Strategic Plan, and aligning and monitoring the department's performance standards. An intern at PEP noticed that Monterey County residents face barriers when accessing health and community resources. Monterey County's population is 444,732, and more than half of the population are people of color. With such a diverse population, there are many barriers to consider when accessing health and community resources, such as language barriers and navigating health insurance. The consequences are a shorter life span, poor-quality care, and little or no information, which can lead to mistrust in the community. This capstone project will demonstrate some successes and challenges faced in the community by interviewing agencies and organizations about their experiences. Based on the intern's findings, language barriers and community engagement play an important role in why many Monterey County residents face barriers when accessing health and community resources. The intern recommends that organizations and agencies do more community outreach to help engage community residents and build a connection
LusRegTes: A Regression Testing Tool for Lustre Programs
Lustre is a synchronous data-flow declarative language widely used for safety-critical applications (avionics, energy, transport...). In such applications, the testing activity for detecting errors of the system plays a crucial role. During the development and maintenance processes, Lustre programs are often evolving, so regression testing should be performed to detect bugs. In this paper, we present a tool for automatic regression testing of Lustre programs. We have defined an approach to generate test cases in regression testing of Lustre programs. In this approach, a Lustre program is represented by an operator network, then the set of paths is identified and the path activation conditions are symbolically computed for each version. Regression test cases are generated by comparing paths between versions. The approach was implemented in a tool, called LusRegTes, in order to automate the test process for Lustre programs
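The path-comparison idea in the abstract can be sketched in a heavily simplified form. The sketch below is not LusRegTes: it assumes an acyclic operator network stored as a plain adjacency dictionary, uses hypothetical node names, and omits the symbolic computation of path activation conditions entirely. It only shows how enumerating input-to-output paths in two versions and comparing the sets points to the paths that regression tests would need to target.

```python
def enumerate_paths(network, inputs, outputs):
    """Return all input-to-output paths in an acyclic operator network.

    `network` maps each node name to a list of its successor nodes.
    """
    paths = set()

    def walk(node, prefix):
        prefix = prefix + (node,)
        if node in outputs:
            paths.add(prefix)
        for successor in network.get(node, []):
            walk(successor, prefix)

    for start in inputs:
        walk(start, ())
    return paths


def compare_versions(old_paths, new_paths):
    """Classify paths as unchanged, removed, or added between versions."""
    return {
        "unchanged": old_paths & new_paths,
        "removed": old_paths - new_paths,
        "added": new_paths - old_paths,
    }


if __name__ == "__main__":
    # Hypothetical networks: version 2 routes input b through a new operator.
    v1 = {"a": ["and1"], "b": ["and1"], "and1": ["out"]}
    v2 = {"a": ["and1"], "b": ["or1"], "and1": ["out"], "or1": ["out"]}
    old = enumerate_paths(v1, inputs=["a", "b"], outputs={"out"})
    new = enumerate_paths(v2, inputs=["a", "b"], outputs={"out"})
    for kind, paths in compare_versions(old, new).items():
        print(kind, sorted(paths))
```

In this simplification, the "added" and "removed" paths mark where the two versions differ; the approach described in the abstract additionally computes the conditions under which each path is activated in order to generate concrete test inputs.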