Resurrection: Rethinking Magnetic Tapes For Cost Efficient Data Preservation
With the advent of Big Data technologies-the capacity to store and efficiently process large sets of data, doors of opportunities for developing business intelligence that was previously unknown, has opened. Each phase in the processing of this data requires specialized infrastructures. One such phase, the preservation and archiving of data, has proven its usefulness time and again. Data archives are processed using novel data mining methods to elicit vital data gathered over long periods of time and efficiently audit the growth of a business or an organization. Data preservation is also an important aspect of business processes which helps in avoiding loss of important information due to system failures, human errors and natural calamities.
This thesis investigates the need, discusses possibilities and presents a novel, highly cost-effective, unified, long- term storage solution for data. Some of the common processes followed in large-scale data warehousing systems are analyzed for overlooked, inordinate shortcomings and a profitably feasible solution is conceived for them. The gap between the general needs of 'efficient' long-term storage and common, current functionalities is analyzed. An attempt to bridge this gap is made through the use of a hybrid, hierarchical media based, performance enhancing middleware and a monolithic namespace filesystem in a new storage architecture, Tape Cloud.
Our study interprets the effects of using heterogeneous storage media in terms of operational behavior, average latency of data transactions, and power consumption. The results demonstrate the advantages of the new storage system by quantifying the differences in operating costs, personnel costs, and total cost of ownership from varied perspectives in a business model.
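The total-cost-of-ownership comparison described above can be sketched as a simple cost model. All cost figures below are hypothetical placeholders, not measurements from the thesis; only the structure of the comparison (acquisition plus power plus personnel over the planning horizon) follows the abstract.

```python
# Illustrative TCO comparison across storage media tiers. The per-TB
# prices, power draws, and staffing costs are invented for illustration.

def total_cost_of_ownership(capacity_tb, years, cost_per_tb, power_w_per_tb,
                            personnel_per_year, electricity_per_kwh=0.12):
    """Very rough TCO: acquisition + electricity + personnel over the period."""
    acquisition = capacity_tb * cost_per_tb
    kwh = power_w_per_tb * capacity_tb / 1000 * 24 * 365 * years
    power = kwh * electricity_per_kwh
    return acquisition + power + personnel_per_year * years

media = {
    # name: (cost_per_tb, idle_power_w_per_tb, personnel_per_year) -- assumed
    "disk-only":  (25.0, 6.0, 20000),
    "tape-cloud": (8.0, 0.5, 15000),   # tape idles at near-zero power
}

for name, (cpt, pw, staff) in media.items():
    tco = total_cost_of_ownership(capacity_tb=1000, years=10,
                                  cost_per_tb=cpt, power_w_per_tb=pw,
                                  personnel_per_year=staff)
    print(f"{name}: ${tco:,.0f} over 10 years")
```

Even with placeholder numbers, the model shows how tape's lower acquisition cost and near-zero idle power dominate the long-term comparison, which is the intuition behind a tape-backed archival tier.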
Eliminating duplication and reducing communication overhead in the cloud
We extend an attribute-based storage system with secure deduplication in a hybrid cloud setting, where a private cloud is responsible for duplicate detection and a public cloud manages the storage. Compared with prior data deduplication systems, our system offers two advantages. First, it can be used to confidentially share data with users by specifying access policies rather than distributing decryption keys. Second, it achieves the standard notion of semantic security for data privacy, while existing systems only achieve a weaker security notion. In addition, we present a mechanism to transform a ciphertext under one access policy into ciphertexts of the same plaintext under other access policies, without revealing the underlying plaintext.
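The division of labor described above (private cloud detects duplicates, public cloud stores the bytes) can be sketched with a content-fingerprint index. Class and method names here are illustrative, not from the paper; a real system would index ciphertexts under a convergent-encryption scheme rather than plaintext hashes.

```python
import hashlib

class PrivateCloudIndex:
    """Detects duplicates by content fingerprint (illustrative)."""
    def __init__(self):
        self._seen = {}            # fingerprint -> public-cloud object id

    def check(self, data: bytes):
        fp = hashlib.sha256(data).hexdigest()
        return fp, self._seen.get(fp)

    def record(self, fp, object_id):
        self._seen[fp] = object_id

class PublicCloudStore:
    """Stores unique blobs only (illustrative)."""
    def __init__(self):
        self._blobs = {}

    def put(self, object_id, data):
        self._blobs[object_id] = data

index, store = PrivateCloudIndex(), PublicCloudStore()

def upload(data: bytes):
    fp, existing = index.check(data)
    if existing is not None:
        return existing            # duplicate: no new storage or transfer
    object_id = fp[:16]
    store.put(object_id, data)
    index.record(fp, object_id)
    return object_id

a = upload(b"report.pdf contents")
b = upload(b"report.pdf contents")   # same bytes -> same stored object
assert a == b and len(store._blobs) == 1
```

The communication saving comes from the early exit: a duplicate upload never ships its payload to the public cloud.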
Memory performance of AND-parallel Prolog on shared-memory architectures
The goal of the RAP-WAM AND-parallel Prolog abstract architecture is to provide inference speeds significantly beyond those of sequential systems, while supporting Prolog semantics and preserving sequential performance and storage efficiency. This paper presents simulation results supporting these claims, with special emphasis on memory performance on a two-level shared-memory multiprocessor organization. Several solutions to the cache coherency problem are analyzed. It is shown that RAP-WAM offers good locality and storage efficiency and that it can effectively take advantage of broadcast caches. It is argued that speeds in excess of 2 MLIPS on real applications exhibiting medium parallelism can be attained with current technology.
Practical cryptographic strategies in the post-quantum era
We review new frontiers in information security technologies in
communications and distributed storage technologies with the use of classical,
quantum, hybrid classical-quantum, and post-quantum cryptography. We analyze
the current state-of-the-art, critical characteristics, development trends, and
limitations of these techniques for application in enterprise information
protection systems. An approach concerning the selection of practical
encryption technologies for enterprises with branched communication networks is
introduced.Comment: 5 pages, 2 figures; review pape
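The hybrid classical-quantum pattern mentioned above is often realized by deriving one session key from two independently established shared secrets, so the session stays confidential if either primitive is later broken. The sketch below uses stand-in byte strings for the two secrets; a real deployment would obtain them from an actual classical key exchange (e.g. ECDH) and a post-quantum KEM.

```python
import hashlib
import hmac

def hybrid_session_key(classical_secret: bytes, pq_secret: bytes,
                       context: bytes = b"hybrid-kdf-v1") -> bytes:
    # HKDF-extract style: bind both secrets together under a context label,
    # so compromising one secret alone does not reveal the session key.
    return hmac.new(context, classical_secret + pq_secret,
                    hashlib.sha256).digest()

# Stand-in secrets for illustration only.
key = hybrid_session_key(b"\x01" * 32, b"\x02" * 32)
assert len(key) == 32
```

The design choice is conservative: the combiner is a keyed hash over the concatenated secrets, so an attacker must recover both inputs to compute the output.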
On-Demand Big Data Integration: A Hybrid ETL Approach for Reproducible Scientific Research
Scientific research requires access, analysis, and sharing of data that is distributed across various heterogeneous data sources at the scale of the Internet. An eager ETL process constructs an integrated data repository as its first step, integrating and loading data in its entirety from the data sources. Bootstrapping this process is inefficient for scientific research that requires access to data from very large and typically numerous distributed data sources. A lazy ETL process loads only the metadata, but still eagerly. Lazy ETL is faster in bootstrapping. However, queries on the integrated data repository of eager ETL perform faster, due to the availability of the entire data beforehand.
In this paper, we propose a novel ETL approach for scientific data integration, as a hybrid of the eager and lazy ETL approaches, applied both to data and metadata. In this way, hybrid ETL supports incremental integration and loading of metadata and data from the data sources. We incorporate a human-in-the-loop approach to enhance the hybrid ETL, with selective data integration driven by user queries and sharing of integrated data between users. We implement our hybrid ETL approach in a prototype platform, Obidos, and evaluate it in the context of data sharing for medical research. Obidos outperforms both the eager and lazy ETL approaches for scientific research data integration and sharing, through its selective loading of data and metadata, while storing the integrated data in a scalable integrated data repository.
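The eager/lazy split described above can be sketched in a few lines: metadata for every source is integrated at bootstrap, while a source's data is fetched and cached only when a query first touches it. The class and source names are illustrative; Obidos's actual interfaces are not reproduced here.

```python
# Minimal sketch of hybrid ETL: eager metadata, lazy query-driven data.

class HybridETL:
    def __init__(self, sources):
        self.sources = sources          # source name -> loader callable
        # Eager step: integrate metadata for all sources at bootstrap.
        self.catalog = {name: f"schema-of-{name}" for name in sources}
        self.cache = {}                 # lazily loaded data, per source

    def query(self, source_name):
        if source_name not in self.catalog:
            raise KeyError(source_name)
        if source_name not in self.cache:
            # Lazy step: load this source's data only on first query.
            self.cache[source_name] = self.sources[source_name]()
        return self.cache[source_name]

etl = HybridETL({
    "trial_a": lambda: [{"patient": 1, "value": 3.2}],
    "trial_b": lambda: [{"patient": 2, "value": 4.1}],
})
assert len(etl.catalog) == 2        # all metadata available at bootstrap
assert etl.cache == {}              # but no data loaded yet
rows = etl.query("trial_a")
assert "trial_b" not in etl.cache   # only the queried source was loaded
```

Bootstrap cost stays proportional to the metadata, while repeated queries against an already-loaded source get eager-ETL performance from the cache.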
Supply optimization model in the hierarchical geographically distributed organization
The strategic importance of the procurement function in large organizations requires that logistics management use effective tools to justify decisions in the supply process. The architecture of hierarchical geographically distributed organizations allows a hybrid supply scheme that rationally combines the advantages of centralized and decentralized purchasing and supply management (PSM). The article proposes a supply optimization model for the hierarchical geographically distributed organization (HGDO), reflecting the features of a complex, multifactorial, multi-stage procurement process. The model finds the optimal options for purchasing and supplying products under the criterion of minimizing the total logistics costs of the process over the entire HGDO logistics support planning period, taking into account the parameters of the participants and the logistics functions of the procurement process in each time period. The model is an effective tool for supporting and coordinating decisions made by logistics managers at different levels of HGDO management, based on numerous options for purchasing and supplying products and their budgeting under dynamic and diverse internal and external influences.
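The hybrid centralized/decentralized trade-off above can be illustrated with a toy cost minimization: each branch is served either centrally or from a regional warehouse, and the cheaper channel is chosen per branch over the planning period. The cost figures are invented, and this greedy per-branch choice is a drastic simplification of the article's multi-stage model.

```python
# Toy per-branch supply-channel choice minimizing total logistics cost.

def plan_supply(branches, periods):
    """branches: name -> {"central": unit cost, "regional": unit cost}.
    Returns the chosen channel per branch and the total cost over the
    planning horizon, assuming one unit of demand per branch per period."""
    plan, total = {}, 0.0
    for name, costs in branches.items():
        choice = min(costs, key=costs.get)   # cheaper channel for this branch
        plan[name] = choice
        total += costs[choice] * periods
    return plan, total

plan, total = plan_supply(
    {"branch-north": {"central": 10.0, "regional": 7.5},
     "branch-south": {"central": 6.0, "regional": 9.0}},
    periods=12,
)
# The optimum mixes channels: one branch central, one regional -- the
# "hybrid supply scheme" the abstract argues for.
print(plan, total)
```

A faithful version would add capacities, multi-stage flows, and per-period demand, turning this into a linear program rather than an independent per-branch choice.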