Middleware-based Database Replication: The Gaps between Theory and Practice
The need for high availability and performance in data management systems has
been fueling a long-running interest in database replication from both academia
and industry. However, academic groups often attack replication problems in
isolation, overlooking the need for completeness in their solutions, while
commercial teams take a holistic approach that often misses opportunities for
fundamental innovation. This has created over time a gap between academic
research and industrial practice.
This paper aims to characterize the gap along three axes: performance,
availability, and administration. We build on our own experience developing and
deploying replication systems in commercial and academic settings, as well as
on a large body of prior related work. We sift through representative examples
from the last decade of open-source, academic, and commercial database
replication systems and combine this material with case studies from real
systems deployed at Fortune 500 customers. We propose two agendas, one for
academic research and one for industrial R&D, which we believe can bridge the
gap within 5-10 years. This way, we hope to both motivate and help researchers
in making the theory and practice of middleware-based database replication more
relevant to each other.
Comment: 14 pages. Appears in Proc. ACM SIGMOD International Conference on
Management of Data, Vancouver, Canada, June 200
The Impact of Web-Scale Discovery on the Use of Electronic Resources
In 2015, the University of California, Berkeley, launched EBSCO Discovery Service (EDS), a web-scale discovery tool, with a goal of improving visibility and usage of collections. This study applies linear regression analysis to usage data for ebooks, ejournals, and abstracts and indexing (A&I) databases before and after implementation of EDS in order to identify correlations between the discovery layer and usage of library electronic resources across platforms. Our findings diverge from conclusions drawn in the previous literature that indicate that resource use generally increases after a discovery tool is implemented. We examine data from a longer period of time than the previous literature had, looking for statistically significant changes in resource use. The discovery layer at UC Berkeley did not lead to equal increases across platforms, but rather to a complex array of increases and decreases in use according to a variety of factors.
Human Resource Information Systems for Competitive Advantage: Interviews with Ten Leaders
[Excerpt] Increasingly, today's organizations use computer technology to manage human resources (HR). Surveys confirm this trend (Richards-Carpenter, 1989; Grossman and Magnus, 1988; Human Resource Systems Professionals, 1988; KPMG Peat Marwick, 1988). HR professionals and managers routinely have personal computers (PCs) or computer terminals on their desks or in their departments. HR computer applications, once confined to payroll and benefit domains, now encompass incentive compensation, staffing, succession planning, and training. Five years ago, we had but a handful of PC-based software applications for HR management. Today, we find a burgeoning market of products spanning a broad spectrum of price, sophistication, and quality (Personnel Journal, 1990). Top universities now consider computer literacy a basic requirement for students of HR, and many consulting firms and universities offer classes designed to help seasoned HR professionals use computers in their work (Boudreau, 1990). Changes in computer technology offer expanding potential for HR management (Business Week, 1990; Laudon and Laudon, 1988).
Heterogeneous biomedical database integration using a hybrid strategy: a p53 cancer research database.
Complex problems in life science research give rise to multidisciplinary collaboration, and hence, to the need for heterogeneous database integration. The tumor suppressor p53 is mutated in close to 50% of human cancers, and a small drug-like molecule with the ability to restore native function to cancerous p53 mutants is a long-held medical goal of cancer treatment. The Cancer Research DataBase (CRDB) was designed in support of a project to find such small molecules. As a cancer informatics project, the CRDB involved small molecule data, computational docking results, functional assays, and protein structure data. As an example of the hybrid strategy for data integration, it combined the mediation and data warehousing approaches. This paper uses the CRDB to illustrate the hybrid strategy as a viable approach to heterogeneous data integration in biomedicine, and provides a design method for those considering similar systems. More efficient data sharing implies increased productivity, and, hopefully, improved chances of success in cancer research. (Code and database schemas are freely downloadable, http://www.igb.uci.edu/research/research.html.)
Report on the ECO-PB Workshop on the proposed EC Organic Seed Regime 2004
From the Introduction to the Proceedings
"The European Consortium for Organic Plant Breeding (ECO-PB) is an active network supporting the production and use of organic seeds. It sees the European Union’s Organic Seed Regime as potentially a great step forward in the development of organic seed but is concerned about the latest discussions regarding implementation per January 1st 2004.
Current European Commission proposals seem to allow a great deal of room for derogation. Seed companies are indicating that this year (2003) their organic seed sales have already dropped. Growers are aware that the probable new rules will allow derogation even for those crops for which there are sufficient, appropriate organic seeds. A number of key seed companies have announced they will definitely stop their organic programmes should the criteria for derogation remain unclear and if derogation remains possible for all crops irrespective of availability, as it becomes financially unviable. That is a real threat and would be a great setback for ongoing efforts to build up a healthy organic seed sector and hence further close the organic production chain.
Furthermore, we have received much feedback about national authorities which are tentative about how to tackle the national implementation of the new seed regulation.
ECO-PB has therefore decided to organise at short notice a WORKSHOP on the ORGANIC SEED Regime 2004 to help clarify the above issues. We aim to facilitate an international discussion with key national players to exchange valuable information and concerns, and to establish common points of view on the organic seed regime in the organic sector at the international level. The workshop will tie into discussions held and decisions made in the Article 14 Committee of the European Commission on April 4-6.
Storage Solutions for Big Data Systems: A Qualitative Study and Comparison
Big data systems development is full of challenges in view of the variety of
application areas and domains that this technology promises to serve.
Typically, fundamental design decisions involved in big data systems design
include choosing appropriate storage and computing infrastructures. In this age
of heterogeneous systems that integrate different technologies into an optimized
solution to a specific real-world problem, big data systems are no exception.
For the storage aspect of a big data system, the primary facet is the storage
infrastructure, and NoSQL seems to be the right technology to fulfill its
requirements. However,
every big data application has variable data characteristics and thus, the
corresponding data fits into a different data model. This paper presents a
feature and use-case analysis and comparison of the four main data models,
namely document-oriented, key-value, graph, and wide-column. Moreover, a feature
analysis of 80 NoSQL solutions is provided, elaborating on the criteria
and points that a developer must consider when choosing among them.
Typically, big data storage needs to communicate with the execution engine and
other processing and visualization technologies to create a comprehensive
solution. This brings the second facet of big data storage, big data file
formats, into the picture. The second half of the paper compares the
advantages, shortcomings, and possible use cases of available big data file
formats for Hadoop, which is the foundation for most big data computing
technologies. Decentralized storage and blockchain are seen as the next
generation of big data storage, and their challenges and future prospects are
also discussed.
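As a rough illustration of the four data models named in this abstract, the sketch below stores one record in each shape using plain Python dictionaries. It is not taken from the paper: the record (a user "u1" named Ada who follows "u2"), the field names, and the helper `followees` are all hypothetical, and real NoSQL stores expose these models through their own APIs.

```python
import json

# Hypothetical record, invented for illustration: user "u1" (Ada) follows "u2".

# Key-value: the store sees only a key and an opaque value (here a JSON string).
key_value = {"user:u1": '{"name": "Ada", "follows": ["u2"]}'}

# Document-oriented: a nested, self-describing document with queryable fields.
document = {"_id": "u1", "name": "Ada", "follows": ["u2"]}

# Wide-column: row key -> column family -> column name -> value.
wide_column = {"u1": {"profile": {"name": "Ada"}, "follows": {"u2": "1"}}}

# Graph: nodes plus explicit, labeled edges between them.
graph = {
    "nodes": {"u1": {"name": "Ada"}, "u2": {}},
    "edges": [("u1", "FOLLOWS", "u2")],
}


def followees(model: str, user_id: str) -> list:
    """Return the ids that user_id follows, reading from the chosen model."""
    if model == "key_value":
        # The application must decode the opaque value itself.
        return json.loads(key_value["user:" + user_id])["follows"]
    if model == "document":
        return list(document["follows"]) if document["_id"] == user_id else []
    if model == "wide_column":
        # Column names in the "follows" family act as the data.
        return sorted(wide_column[user_id]["follows"])
    if model == "graph":
        return [dst for src, label, dst in graph["edges"]
                if src == user_id and label == "FOLLOWS"]
    raise ValueError("unknown model: " + model)


for name in ("key_value", "document", "wide_column", "graph"):
    print(name, followees(name, "u1"))
```

The same question ("whom does u1 follow?") is answered by decoding a value, reading a field, listing column names, or filtering edges, which is the kind of difference a developer weighs when matching data characteristics to a model.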