
    VXA: A Virtual Architecture for Durable Compressed Archives

    Data compression algorithms change frequently, and obsolete decoders do not always run on new hardware and operating systems, threatening the long-term usability of content archived using those algorithms. Re-encoding content into new formats is cumbersome, and highly undesirable when lossy compression is involved. Processor architectures, in contrast, have remained comparatively stable over recent decades. VXA, an archival storage system designed around this observation, archives executable decoders along with the encoded content it stores. VXA decoders run in a specialized virtual machine that implements an OS-independent execution environment based on the standard x86 architecture. The VXA virtual machine strictly limits access to host system services, making decoders safe to run even if an archive contains malicious code. VXA's adoption of a "native" processor architecture instead of type-safe language technology allows reuse of existing "hand-optimized" decoders in C and assembly language, and permits decoders access to performance-enhancing architectural features such as vector processing instructions. The performance cost of VXA's virtualization is typically less than 15% compared with the same decoders running natively. The storage cost of archived decoders, typically 30-130 KB each, can be amortized across many archived files sharing the same compression method. (Comment: 14 pages, 7 figures, 2 tables)
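
    To make the idea concrete, here is a minimal Python sketch of a self-describing archive entry that stores the decoder next to the payload; the layout and names are hypothetical illustrations, not VXA's actual on-disk format.

        import hashlib
        import json
        import struct

        def write_archive_entry(out, decoder_blob: bytes, payload: bytes, method: str):
            """Append one self-describing entry: decoder plus encoded content.

            Illustrative layout only; VXA's real format differs. Decoders can be
            stored once and referenced by hash, so their 30-130 KB cost is
            amortized across every file that shares the compression method.
            """
            header = json.dumps({
                "method": method,
                "decoder_sha256": hashlib.sha256(decoder_blob).hexdigest(),
                "payload_len": len(payload),
            }).encode()
            out.write(struct.pack("<I", len(header)))   # header length prefix
            out.write(header)
            out.write(payload)

    At extraction time the referenced x86 decoder would run inside the sandboxed virtual machine rather than directly on the host operating system.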

    Storage Solutions for Big Data Systems: A Qualitative Study and Comparison

    Big data systems development is full of challenges in view of the variety of application areas and domains that this technology promises to serve. Typically, fundamental design decisions in big data systems design include choosing appropriate storage and computing infrastructures. In this age of heterogeneous systems that integrate different technologies into an optimized solution for a specific real-world problem, big data systems are no exception. As far as the storage aspect of any big data system is concerned, the primary facet is the storage infrastructure, and NoSQL appears to be the technology that best fulfills its requirements. However, every big data application has different data characteristics, and thus its data fits a different data model. This paper presents a feature and use case analysis and comparison of the four main data models, namely document oriented, key value, graph, and wide column. Moreover, a feature analysis of 80 NoSQL solutions is provided, elaborating on the criteria and points that a developer must consider while making a choice. Typically, big data storage needs to communicate with the execution engine and other processing and visualization technologies to create a comprehensive solution. This brings the second facet of big data storage, big data file formats, into the picture. The second half of the paper compares the advantages, shortcomings, and possible use cases of available big data file formats for Hadoop, which is the foundation for most big data computing technologies. Decentralized storage and blockchain are seen as the next generation of big data storage, and their challenges and future prospects are also discussed.
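
    As a small illustration of why one data model does not fit all, consider the same logical record under two of the four models discussed; the field names are invented for the example.

        # The same logical record under two NoSQL data models (illustrative only).

        # Document oriented: one self-contained JSON document, queryable by field.
        user_document = {
            "_id": "u42",
            "name": "Alice",
            "orders": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}],
        }

        # Key value: an opaque value fetched only by its key; very fast lookups,
        # but the store cannot query or index the fields inside the value.
        user_kv = ("user:u42", b'{"name": "Alice", "orders": "..."}')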

    A cache framework for nomadic clients of web services

    This research explores the problems associated with caching of SOAP Web Service request/response pairs, and presents a domain-independent framework enabling transparent caching of Web Service requests for mobile clients. The framework intercepts method calls intended for the web service and proceeds by buffering and caching the outgoing method calls and the inbound responses. This enables a mobile application to use Web Services seamlessly by masking fluctuations in network conditions. The framework addresses two main issues: first, how to enrich the WS standards to enable caching, and second, how to maintain consistency for state-dependent Web Service request/response pairs.
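
    A minimal sketch of the caching idea, assuming a call_service callable that stands in for the framework's interception and network layer; the class and its TTL-based consistency policy are illustrative, not the thesis's actual design.

        import hashlib
        import time

        class SoapCache:
            """Transparent request/response cache for a nomadic client (sketch)."""

            def __init__(self, call_service, ttl_seconds=300):
                self.call_service = call_service  # stands in for the real WS call
                self.ttl = ttl_seconds            # crude consistency policy
                self._store = {}                  # request hash -> (time, response)

            def invoke(self, request_xml: str) -> str:
                key = hashlib.sha256(request_xml.encode()).hexdigest()
                hit = self._store.get(key)
                if hit and time.time() - hit[0] < self.ttl:
                    return hit[1]  # cache hit masks a degraded or lost connection
                response = self.call_service(request_xml)
                self._store[key] = (time.time(), response)
                return response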

    The Family of MapReduce and Large Scale Data Processing Systems

    In the last two decades, the continuous increase of computational power has produced an overwhelming flow of data, which has called for a paradigm shift in computing architectures and large-scale data processing mechanisms. MapReduce is a simple and powerful programming model that enables easy development of scalable parallel applications to process vast amounts of data on large clusters of commodity machines. It isolates the application from the details of running a distributed program, such as data distribution, scheduling, and fault tolerance. However, the original implementation of the MapReduce framework had some limitations that have been tackled by many research efforts in follow-up work since its introduction. This article provides a comprehensive survey of a family of approaches and mechanisms for large-scale data processing that have been implemented based on the original idea of the MapReduce framework and are currently gaining a lot of momentum in both the research and industrial communities. We also cover a set of systems that provide declarative programming interfaces on top of the MapReduce framework. In addition, we review several large-scale data processing systems that resemble some of the ideas of the MapReduce framework for different purposes and application scenarios. Finally, we discuss some of the future research directions for implementing the next generation of MapReduce-like solutions. (Comment: arXiv admin note: text overlap with arXiv:1105.4252 by other authors)
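
    The programming model is easiest to see in the canonical word-count example; the single-process runner below merely stands in for the distributed runtime, which in reality handles partitioning, scheduling, and fault tolerance.

        from collections import defaultdict

        def map_fn(document: str):
            # Emit one (word, 1) pair per occurrence.
            for word in document.split():
                yield word, 1

        def reduce_fn(word: str, counts):
            # Combine all counts emitted for the same word.
            yield word, sum(counts)

        def run(documents):
            # Toy stand-in for the framework: group map output by key, then reduce.
            groups = defaultdict(list)
            for doc in documents:
                for key, value in map_fn(doc):
                    groups[key].append(value)
            return dict(kv for key, values in groups.items()
                        for kv in reduce_fn(key, values))

        print(run(["the quick fox", "the lazy dog"]))
        # {'the': 2, 'quick': 1, 'fox': 1, 'lazy': 1, 'dog': 1}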

    Optimising clients with API gateways

    This thesis investigates the benefits and complications of working with API (Application Programming Interface) gateways. By API gateway we mean a component that proxies, and potentially enhances, the communication between servers and clients, such as browsers, by transforming the data. We do this by examining the underlying HTTP/1.1 protocol and the general theory regarding API gateways. An API gateway framework was developed in order to further understand some of the common problems and to provide a way to rapidly develop prototype solutions to them. The framework was applied in three case studies in order to discover potentially problematic areas and solve these in real-world production systems. The results show that the benefits gained from using an API gateway varied from case to case and, with the results in hand, let us predict in which scenarios API gateways are the most beneficial. (APIs over HTTP are rarely adapted to the needs of different clients, which leads to cumbersome communication and reduced performance; an API gateway can be placed between clients and APIs to remedy this.)
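
    A minimal sketch of the transformation role a gateway plays, with hypothetical endpoints and field names: fetch the full upstream representation once and hand the client only what it renders.

        import json
        from urllib.request import urlopen

        def handle_client_request(user_id: str) -> str:
            # Fetch the full upstream representation (assumed backend URL).
            with urlopen(f"https://backend.example.com/users/{user_id}") as resp:
                full = json.load(resp)
            # Transform: keep only the fields the client needs, shrinking the
            # payload and sparing the client extra round trips.
            slim = {"name": full["name"], "avatar": full["avatarUrl"]}
            return json.dumps(slim)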

    Migrating Integration from SOAP to REST : Can the Advantages of Migration Justify the Project?

    This thesis investigates the functional and conceptual differences between SOAP-based and RESTful web services and their implications in the context of a real-world migration project. The primary research questions addressed are:
    • What are the key functional and conceptual differences between SOAP-based and RESTful web services?
    • How can SOAP-based and RESTful service clients be implemented in a general client?
    • Can developing a client that works with both REST and SOAP be justified based on differences in performance and maintainability?
    The thesis begins with a literature review of the core principles and features of SOAP and REST, highlighting their strengths, weaknesses, and suitability for different use cases. A detailed comparison table is provided to summarize the key differences between the two web service styles. The thesis presents a case study of a migration project from Lemonsoft's web team, which involved adapting an existing integration to support both SOAP-based and RESTful services. The project utilized design patterns and a general client implementation to achieve a unified solution compatible with both protocols; a sketch of this shape follows below. In terms of performance, the evaluation showed that the general client led to faster execution times and reduced memory usage, enhancing overall system efficiency. Additionally, improvements in maintainability were achieved by simplifying the codebase, using design patterns and object factories, adopting an interface-driven design, and promoting collaborative code reviews. These enhancements have not only resulted in a better user experience but also minimized future resource demands and maintenance costs. In conclusion, this thesis provides valuable insights into the functional and conceptual differences between SOAP-based and RESTful web services, the challenges and best practices for implementing a general client, and the justification for resource usage in such a solution based on performance and maintainability improvements.
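
    A sketch of the interface-driven, factory-based shape such a general client can take; the names and the single get_invoice operation are invented for illustration, not Lemonsoft's actual code.

        from abc import ABC, abstractmethod

        class ServiceClient(ABC):
            """Callers depend only on this interface, never on a protocol."""

            @abstractmethod
            def get_invoice(self, invoice_id: str) -> dict: ...

        class SoapClient(ServiceClient):
            def get_invoice(self, invoice_id: str) -> dict:
                # Build the SOAP envelope, post it, parse the XML response...
                raise NotImplementedError("wire code omitted in this sketch")

        class RestClient(ServiceClient):
            def get_invoice(self, invoice_id: str) -> dict:
                # GET /invoices/{id} and parse the JSON body...
                raise NotImplementedError("wire code omitted in this sketch")

        def make_client(protocol: str) -> ServiceClient:
            # Object factory: configuration, not call sites, picks the protocol.
            return {"soap": SoapClient, "rest": RestClient}[protocol]()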

    A Query Matching Approach for Object Relational Databases Over Semantic Cache

    The acceptance of object relational databases has grown in recent years; however, their response time is a big concern, especially when large volumes of data are retrieved frequently from diverse servers. Different techniques have been investigated to reduce the response time, and caching is among them. Caches have three variants, namely tuple cache, page cache, and semantic cache. A semantic cache is more efficient than the others because it stores already-processed data together with its semantics: data computed on demand rather than merely retrieved from the server. Several approaches have been proposed for semantic caching over relational databases, but the resulting response times remain unsatisfactory. Hence, we propose semantic caching over object relational databases. This is novel because semantic caching is mature for relational databases but not for object relational databases. In this research, the implementation of query matching on an object relational database with semantic caching, along with object queries, is investigated to reduce the response time. A case study is then conducted on an object relational database model, and an object relational database query with a semantic segment is applied. Results show a significant improvement in query response time.
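
    The heart of semantic-cache query matching can be sketched for a single numeric range predicate: the cached segment's predicate lets a new query be split into a probe (answered locally) and a remainder (sent to the server). This simplification is ours for illustration, not the paper's algorithm.

        def match(query_lo, query_hi, seg_lo, seg_hi):
            """Split an integer range query against one cached segment."""
            lo, hi = max(query_lo, seg_lo), min(query_hi, seg_hi)
            probe = (lo, hi) if lo <= hi else None  # answerable from cache
            remainder = []                          # must go to the server
            if query_lo < seg_lo:
                remainder.append((query_lo, min(query_hi, seg_lo - 1)))
            if query_hi > seg_hi:
                remainder.append((max(query_lo, seg_hi + 1), query_hi))
            return probe, remainder

        # Cached: salary in [30000, 60000]; query: salary in [50000, 80000]
        print(match(50_000, 80_000, 30_000, 60_000))
        # ((50000, 60000), [(60001, 80000)])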

    Web Content Delivery Optimization

    Milliseconds matter, when they are counted. If we compress the life of the universe into a single year, then on 31 December at 11:59:59.5 PM "speed" was transportation's concern; 500 milliseconds later it is the web's, and no one knows whose concern it will be in the coming milliseconds. At this very moment, this thesis proposes an optimization method, mainly for content delivery over slow connections. The method utilizes a proxy as a middle box to fetch the content requested by a client from one or more web servers, and bundles all of the fetched image content that fits the bundling policy inside a JavaScript file in Base64 format. This optimization method reduces the number of HTTP requests between the client and multiple web servers through the proposed bundling solution, and at the same time improves HTTP compression efficiency through the proposed aggregative compression of textual content. Page loading time measurements on test web pages, which were specially designed and developed to capture the optimum benefits of the proposed method, showed up to 81% faster page loading for all connection types. However, other tests in non-optimal situations, such as web pages that use "lazy loading" techniques, showed only 35% to 50% improvement, achievable only on 2G and 3G connections (0.2 Mbps - 15 Mbps downlink) and not on faster connections.
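
    The bundling step might look like the following sketch (the file names and the size policy are assumptions): small images are Base64-encoded into one JavaScript file so that the page costs one HTTP request instead of one per image.

        import base64
        import json
        from pathlib import Path

        MAX_BUNDLED_SIZE = 50 * 1024  # assumed policy: inline only small images

        def build_bundle(image_paths, out_path="bundle.js"):
            bundle = {}
            for p in map(Path, image_paths):
                data = p.read_bytes()
                if len(data) <= MAX_BUNDLED_SIZE:
                    uri = "data:image/png;base64," + base64.b64encode(data).decode()
                    bundle[p.name] = uri
            # A client-side script swaps <img> sources for these data URIs.
            Path(out_path).write_text("var imageBundle = " + json.dumps(bundle) + ";")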