VXA: A Virtual Architecture for Durable Compressed Archives
Data compression algorithms change frequently, and obsolete decoders do not
always run on new hardware and operating systems, threatening the long-term
usability of content archived using those algorithms. Re-encoding content into
new formats is cumbersome, and highly undesirable when lossy compression is
involved. Processor architectures, in contrast, have remained comparatively
stable over recent decades. VXA, an archival storage system designed around
this observation, archives executable decoders along with the encoded content
it stores. VXA decoders run in a specialized virtual machine that implements an
OS-independent execution environment based on the standard x86 architecture.
The VXA virtual machine strictly limits access to host system services, making
decoders safe to run even if an archive contains malicious code. VXA's adoption
of a "native" processor architecture instead of type-safe language technology
allows reuse of existing "hand-optimized" decoders in C and assembly language,
and permits decoders access to performance-enhancing architecture features such
as vector processing instructions. The performance cost of VXA's virtualization
is typically less than 15% compared with the same decoders running natively.
The storage cost of archived decoders, typically 30-130KB each, can be
amortized across many archived files sharing the same compression method.
Comment: 14 pages, 7 figures, 2 tables
Storage Solutions for Big Data Systems: A Qualitative Study and Comparison
Big data systems development is full of challenges in view of the variety of
application areas and domains that this technology promises to serve.
Typically, fundamental design decisions involved in big data systems design
include choosing appropriate storage and computing infrastructures. In this age
of heterogeneous systems that integrate different technologies into an optimized
solution to a specific real-world problem, big data systems are no exception.
On the storage side of any big data system, the primary facet is the storage
infrastructure, and NoSQL appears to be the technology that fulfills its
requirements. However,
every big data application has variable data characteristics and thus, the
corresponding data fits into a different data model. This paper presents a
feature and use case analysis and comparison of the four main data models,
namely document-oriented, key-value, graph, and wide-column. Moreover, a feature
analysis of 80 NoSQL solutions has been provided, elaborating on the criteria
and points that a developer must consider while making a possible choice.
Typically, big data storage needs to communicate with the execution engine and
other processing and visualization technologies to create a comprehensive
solution. This brings the second facet of big data storage, big data file
formats, into the picture. The second half of the paper compares the
advantages, shortcomings and possible use cases of available big data file
formats for Hadoop, which is the foundation for most big data computing
technologies. Decentralized storage and blockchain are seen as the next
generation of big data storage, and their challenges and future prospects are
also discussed.
A cache framework for nomadic clients of web services
This research explores the problems associated with caching of SOAP Web Service request/response pairs, and presents a domain independent framework enabling transparent caching of Web Service requests for mobile clients. The framework intercepts method calls intended for the web service and proceeds by buffering and caching of the outgoing method call and the inbound responses. This enables a mobile application to seamlessly use Web Services by masking fluctuations in network conditions.
This framework addresses two main issues: first, how to enrich the WS standards to enable caching, and second, how to maintain consistency for state-dependent Web Service request/response pairs.
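The interception idea described above can be illustrated with a minimal sketch. It is not the framework from the thesis: the class name `CachingProxy`, the `call_service` callable, and the policy of falling back to the cache only on `ConnectionError` are all assumptions made for illustration, and consistency handling for state-dependent calls is deliberately left out.

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical key type: method name plus positional arguments (must be hashable).
CacheKey = Tuple[str, Tuple[Any, ...]]


class CachingProxy:
    """Illustrative stand-in for a layer that intercepts web service calls."""

    def __init__(self, call_service: Callable[[str, Tuple[Any, ...]], Any]) -> None:
        # call_service is an assumed callable that performs the real remote call
        # and raises ConnectionError when the network is unavailable.
        self._call_service = call_service
        self._cache: Dict[CacheKey, Any] = {}

    def invoke(self, method: str, *args: Any) -> Any:
        key: CacheKey = (method, args)
        try:
            response = self._call_service(method, args)
        except ConnectionError:
            # Network is down: serve the last known response if there is one.
            if key in self._cache:
                return self._cache[key]
            raise
        # Network is up: remember the fresh response for later offline use.
        self._cache[key] = response
        return response
```

A client would call `proxy.invoke("getQuote", "ACME")` instead of the service stub; while the connection is up the call passes through, and while it is down a previously seen request is answered from the cache.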
The Family of MapReduce and Large Scale Data Processing Systems
In the last two decades, the continuous increase of computational power has
produced an overwhelming flow of data which has called for a paradigm shift in
the computing architecture and large scale data processing mechanisms.
MapReduce is a simple and powerful programming model that enables easy
development of scalable parallel applications to process vast amounts of data
on large clusters of commodity machines. It isolates the application from the
details of running a distributed program such as issues on data distribution,
scheduling and fault tolerance. However, the original implementation of the
MapReduce framework had some limitations that have been tackled by many
research efforts in several follow-up works after its introduction. This article
provides a comprehensive survey of a family of approaches and mechanisms for
large-scale data processing that have been implemented based on the
original idea of the MapReduce framework and are currently gaining a lot of
momentum in both research and industrial communities. We also cover a set of
introduced systems that have been implemented to provide declarative
programming interfaces on top of the MapReduce framework. In addition, we
review several large scale data processing systems that resemble some of the
ideas of the MapReduce framework for different purposes and application
scenarios. Finally, we discuss some of the future research directions for
implementing the next generation of MapReduce-like solutions.
Comment: arXiv admin note: text overlap with arXiv:1105.4252 by other authors
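To make the programming model concrete, here is a single-process sketch of the map, group-by-key, and reduce steps, using word counting as the example. It only illustrates the model; it does not reproduce any surveyed system, and the distribution, scheduling, and fault tolerance that a real framework hides from the application are absent. The function names are chosen for this example.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_reduce(
    inputs: Iterable[str],
    mapper: Callable[[str], Iterable[Tuple[str, int]]],
    reducer: Callable[[str, List[int]], int],
) -> Dict[str, int]:
    """Run mapper over every record, group values by key, then reduce each group."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for record in inputs:
        for key, value in mapper(record):
            groups[key].append(value)  # the "shuffle": collect values per key
    return {key: reducer(key, values) for key, values in groups.items()}


def word_count_mapper(line: str) -> Iterator[Tuple[str, int]]:
    # Emit (word, 1) for every whitespace-separated word in the line.
    for word in line.lower().split():
        yield word, 1


def word_count_reducer(word: str, counts: List[int]) -> int:
    # Sum the partial counts collected for one word.
    return sum(counts)
```

The application supplies only the mapper and the reducer; everything in `map_reduce` corresponds to work that a cluster framework would perform across machines.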
Optimising clients with API gateways
This thesis investigates the benefits and complications of working with API (Application Programming Interface) gateways. By API gateway, we mean a component that proxies and potentially enhances the communication between servers and clients, such as browsers, by transforming the data. We do this by examining the underlying protocol, HTTP/1.1, and the general theory regarding API gateways. An API gateway framework was developed in order to further understand some of the common problems and to provide a way to rapidly develop prototype solutions to them. The framework was applied in three case studies in order to discover potentially problematic areas and solve them in real-world production systems. The results showed that the benefits gained from using an API gateway varied from case to case and, with the results in hand, allowed us to predict the scenarios in which API gateways are most beneficial. APIs over HTTP are rarely adapted to the needs of different clients, which leads to cumbersome communication and reduced performance. An API gateway can be placed between clients and APIs to remedy this.
Migrating Integration from SOAP to REST : Can the Advantages of Migration Justify the Project?
This thesis investigates the functional and conceptual differences between SOAP-based and RESTful web services and their implications in the context of a real-world migration project. The primary research questions addressed are:
• What are the key functional and conceptual differences between SOAP-based and RESTful web services?
• How can SOAP-based and RESTful service clients be implemented into a general client?
• Can developing a client to work with REST and SOAP be justified based on differences in performance and maintainability?
The thesis begins with a literature review of the core principles and features of SOAP and REST, highlighting their strengths, weaknesses, and suitability for different use cases. A detailed comparison table is provided to summarize the key differences between the two web services.
The thesis presents a case study of a migration project from Lemonsoft's web team, which involved adapting an existing integration to support SOAP-based and RESTful services. The project utilized design patterns and a general client implementation to achieve a unified solution compatible with both protocols.
In terms of performance, the evaluation showed that the general client led to faster execution times and reduced memory usage, enhancing the overall system efficiency. Additionally, improvements in maintainability were achieved by simplifying the codebase, using design patterns and object factories, adopting an interface-driven design, and promoting collaborative code reviews. These enhancements have not only resulted in a better user experience but also minimized future resource demands and maintenance costs.
In conclusion, this thesis provides valuable insights into the functional and conceptual differences between SOAP-based and RESTful web services, the challenges and best practices for implementing a general client, and the justification for resource usage in such a solution based on performance and maintainability improvements.
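One way to picture a general client built with an interface-driven design and an object factory is the sketch below. It is not the implementation from the thesis: the class names, the `make_client` factory, and the request dictionaries are hypothetical, and the code only builds request descriptions rather than sending them over the network.

```python
import json
from abc import ABC, abstractmethod
from typing import Any, Dict
from xml.sax.saxutils import escape


class ServiceClient(ABC):
    """Common interface that calling code depends on, whatever the protocol."""

    @abstractmethod
    def build_request(self, operation: str, params: Dict[str, Any]) -> Dict[str, Any]:
        ...


class RestClient(ServiceClient):
    def __init__(self, base_url: str) -> None:
        self.base_url = base_url.rstrip("/")

    def build_request(self, operation: str, params: Dict[str, Any]) -> Dict[str, Any]:
        # REST variant: the operation becomes a resource path, the body is JSON.
        return {
            "method": "POST",
            "url": f"{self.base_url}/{operation}",
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps(params, sort_keys=True),
        }


class SoapClient(ServiceClient):
    def __init__(self, endpoint: str) -> None:
        self.endpoint = endpoint

    def build_request(self, operation: str, params: Dict[str, Any]) -> Dict[str, Any]:
        # SOAP variant: one endpoint, the operation is an element in the envelope body.
        fields = "".join(
            f"<{name}>{escape(str(value))}</{name}>" for name, value in sorted(params.items())
        )
        body = (
            '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
            f"<soap:Body><{operation}>{fields}</{operation}></soap:Body>"
            "</soap:Envelope>"
        )
        return {
            "method": "POST",
            "url": self.endpoint,
            "headers": {"Content-Type": "text/xml; charset=utf-8"},
            "body": body,
        }


def make_client(protocol: str, url: str) -> ServiceClient:
    """Hypothetical object factory: pick the implementation by protocol name."""
    factories = {"rest": RestClient, "soap": SoapClient}
    return factories[protocol](url)
```

Code that needs the integration asks `make_client` for a `ServiceClient` and never branches on the protocol itself, which is the kind of simplification the maintainability argument relies on.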
A Query Matching Approach for Object Relational Databases Over Semantic Cache
The acceptance of object relational databases has grown in recent years; however, their response time is a major concern. Response time becomes especially problematic when large amounts of data are retrieved frequently from such databases across diverse servers. Different techniques have been investigated to reduce the response time, and caching is among them. Caches come in three variants, namely tuple cache, page cache, and semantic cache. A semantic cache is more efficient than the others because it can store already processed data together with its semantics. A semantic cache stores data computed on demand rather than retrieved from the server. Several approaches have been proposed for relational databases over semantic caching, but the response time on relational databases remains unsatisfactory. Hence, we propose object relational databases over a semantic cache. This is novel because semantic caching is mature for the evaluation of relational databases but not for object relational databases. In this research, the implementation of query matching on an object relational database with semantic caching, along with object queries, is investigated to reduce the response time. A case study is then conducted on an object relational database model, and an object (relational database) query with a semantic segment is applied. The results show a significant improvement in query response time.
Web Content Delivery Optimization
Milliseconds matter when they are counted. If we compress the life of the universe into a single year, then on 31 December at 11:59:59.5 PM “speed” was transportation’s concern; now, 500 milliseconds later, it is the web’s, and no one knows whose concern it will be in the coming milliseconds. At this very moment, this thesis proposes an optimization method, mainly for content delivery on slow connections. The method uses a proxy as a middlebox to fetch the content requested by a client from one or more web servers, and bundles all fetched images that fit the bundling policy inside a JavaScript file in Base64 format. The method reduces the number of HTTP requests between the client and the web servers through bundling, and at the same time improves HTTP compression efficiency through aggregated compression of textual content. Page loading results for the test web pages, which were specially designed and developed to capture the optimum benefits of the proposed method, showed up to 81% faster page loading for all connection types. However, tests in non-optimal situations, such as web pages that use “lazy loading” techniques, showed only 35% to 50% improvement, which is achievable only on 2G and 3G connections (0.2 Mbps – 15 Mbps downlink) and not on faster connections.
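The bundling step can be sketched as follows. This is an illustration under stated assumptions, not the thesis's proxy: the function name `bundle_images`, the JavaScript variable `IMAGE_BUNDLE`, and the use of a simple size threshold as the bundling policy are all hypothetical.

```python
import base64
import json
from typing import Dict, Tuple


def bundle_images(images: Dict[str, Tuple[str, bytes]], max_bytes: int = 10_000) -> str:
    """Return JavaScript source that embeds small images as Base64 data URIs.

    images maps a URL to (MIME type, raw bytes). The size limit stands in for
    the bundling policy, which is an assumption of this sketch.
    """
    bundle: Dict[str, str] = {}
    for url, (mime, data) in images.items():
        if len(data) > max_bytes:
            continue  # too large for the assumed policy: fetched separately
        encoded = base64.b64encode(data).decode("ascii")
        bundle[url] = f"data:{mime};base64,{encoded}"
    # One script file replaces one HTTP request per bundled image.
    return "var IMAGE_BUNDLE = " + json.dumps(bundle, sort_keys=True) + ";"
```

Because the bundle is plain text, it can be compressed together with the page's other textual content, at the cost of the roughly one-third size increase that Base64 encoding adds to the image bytes.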