Towards a Holistic Integration of Spreadsheets with Databases: A Scalable Storage Engine for Presentational Data Management
Spreadsheet software is the tool of choice for interactive ad-hoc data
management, with adoption by billions of users. However, spreadsheets are not
scalable, unlike database systems. On the other hand, database systems, while
highly scalable, do not support interactivity as a first-class primitive. We
are developing DataSpread, to holistically integrate spreadsheets as a
front-end interface with databases as a back-end datastore, providing
scalability to spreadsheets, and interactivity to databases, an integration we
term presentational data management (PDM). In this paper, we make a first step
towards this vision: developing a storage engine for PDM, studying how to
flexibly represent spreadsheet data within a database and how to support and
maintain access by position. We first conduct an extensive survey of
spreadsheet use to motivate our functional requirements for a storage engine
for PDM. We develop a natural set of mechanisms for flexibly representing
spreadsheet data and demonstrate that identifying the optimal representation is
NP-Hard; however, we develop an efficient approach to identify the optimal
representation from an important and intuitive subclass of representations. We
extend our mechanisms with positional access mechanisms that don't suffer from
cascading update issues, leading to constant time access and modification
performance. We evaluate these representations on a workload of typical
spreadsheets and spreadsheet operations, showing up to a 20% reduction in
storage and up to a 50% reduction in formula evaluation time.
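The cascading update problem mentioned in this abstract can be illustrated with a small sketch. It is not DataSpread's storage engine: the class and method names below are hypothetical, and a plain Python list stands in for whatever positional index a real system would use. The point is only that when rows are stored under stable identifiers and position is kept in a separate mapping, inserting a row does not rewrite the stored rows that come after it.

```python
class PositionalSheet:
    """Toy sheet: rows live under stable ids, a separate list orders them.

    Hypothetical illustration only. The list used for ordering still shifts
    its entries on insert, so this sketch does not achieve the access bounds
    claimed in the abstract; it only shows that stored rows are not renumbered.
    """

    def __init__(self):
        self._rows = {}    # stable row id -> row contents (never renumbered)
        self._order = []   # 0-based position -> stable row id
        self._next_id = 0

    def insert_row(self, position, values):
        """Insert a new row at `position` and return its stable id."""
        row_id = self._next_id
        self._next_id += 1
        self._rows[row_id] = list(values)
        self._order.insert(position, row_id)
        return row_id

    def delete_row(self, position):
        """Remove the row currently at `position`."""
        row_id = self._order.pop(position)
        del self._rows[row_id]

    def get_row(self, position):
        """Look up a row by its current position."""
        return self._rows[self._order[position]]
```

If row numbers were instead stored alongside each row, inserting at the top of a sheet would require updating the stored number of every row below it, which is the cascade the abstract refers to.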
Just below the surface: developing knowledge management systems using the paradigm of the noetic prism
In this paper we examine how the principles embodied in the paradigm of the noetic prism can illuminate the construction of knowledge management systems. We draw on the formalism of the prism to examine three successful tools: frames, spreadsheets and databases, and show how their power and also their shortcomings arise from their domain representation, and how any organisational system based on integration of these tools and conversion between them is inevitably lossy. We suggest how a late-binding, hybrid knowledge-based management system (KBMS) could be designed that draws on the lessons learnt from these tools, by maintaining noetica at an atomic level and storing the combinatory processes necessary to create higher-level structure as the need arises. We outline the “just-below-the-surface” systems design, and describe its implementation in an enterprise-wide knowledge-based system that has all of the conventional office automation features.
CampProf: A Visual Performance Analysis Tool for Memory Bound GPU Kernels
Current GPU tools and performance models provide some common architectural insights that guide programmers to write optimal code. We challenge these performance models by modeling and analyzing a lesser-known but very severe performance pitfall, called 'Partition Camping', in NVIDIA GPUs. Partition Camping is caused by memory accesses that are skewed towards a subset of the available memory partitions, which may degrade the performance of memory-bound CUDA kernels by up to a factor of seven. No existing tool can detect the partition camping effect in CUDA kernels.
We complement the existing tools by developing 'CampProf', a spreadsheet based, visual analysis tool, that detects the degree to which any memory-bound kernel suffers from partition camping. In addition, CampProf also predicts the kernel's performance at all execution configurations, if its performance parameters are known at any one of them. To demonstrate the utility of CampProf, we analyze three different applications using our tool, and demonstrate how it can be used to discover partition camping. We also demonstrate how CampProf can be used to monitor the performance improvements in the kernels, as the partition camping effect is being removed.
The performance model that drives CampProf was developed by applying multiple linear regression techniques over a set of specific micro-benchmarks that simulated the partition camping behavior. Our results show that the geometric mean of errors in our prediction model is within 12% of the actual execution times. In summary, CampProf is a new, accurate, and easy-to-use tool that can be used in conjunction with the existing tools to analyze and improve the overall performance of memory-bound CUDA kernels.
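The skew that this abstract calls partition camping can be made concrete with a toy address-to-partition model. The sketch below is not CampProf and does not reproduce its regression model. The partition count, the partition width, and the simple modulo mapping are all assumptions chosen for illustration (the abstract states none of them), and the two access streams are hypothetical.

```python
from collections import Counter

# Assumed values for illustration only; the abstract does not state them.
NUM_PARTITIONS = 8       # assumed number of memory partitions
PARTITION_WIDTH = 256    # assumed partition width in bytes


def partition_of(address):
    """Map a byte address to a partition under the assumed round-robin layout."""
    return (address // PARTITION_WIDTH) % NUM_PARTITIONS


def partition_histogram(addresses):
    """Count how many accesses fall into each partition."""
    counts = Counter(partition_of(a) for a in addresses)
    return [counts.get(p, 0) for p in range(NUM_PARTITIONS)]


# Hypothetical access streams, each with 64 accesses.
# `spread` walks consecutive partition-sized chunks, so it cycles through
# every partition; `camped` strides by a full cycle of partitions, so every
# access lands in the same one.
spread = [i * PARTITION_WIDTH for i in range(64)]
camped = [i * PARTITION_WIDTH * NUM_PARTITIONS for i in range(64)]

print(partition_histogram(spread))
print(partition_histogram(camped))
```

Under these assumptions the first stream is spread evenly over all partitions, while the second queues every access on a single partition, which is the kind of imbalance the abstract describes.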
A CNL for Contract-Oriented Diagrams
We present a first step towards a framework for defining and manipulating
normative documents or contracts described as Contract-Oriented (C-O) Diagrams.
These diagrams provide a visual representation for such texts, giving the
possibility to express a signatory's obligations, permissions and prohibitions,
with or without timing constraints, as well as the penalties resulting from the
non-fulfilment of a contract. This work presents a CNL for verbalising C-O
Diagrams, a web-based tool allowing editing in this CNL, and another for
visualising and manipulating the diagrams interactively. We then show how these
proof-of-concept tools can be used by applying them to a small example.
Selection of Statistical Software for Solving Big Data Problems for Teaching
The need for analysts with expertise in big data software is becoming more apparent in today’s society. Unfortunately, the demand for these analysts far exceeds the number available. A potential way to combat this shortage is to identify the software sought by employers and to align this with the software taught by universities. This paper will examine multiple data analysis software packages – Excel add-ins, SPSS, SAS, Minitab, and R – and it will outline the cost, training, statistical methods/tests/uses, and specific uses within industry for each of these packages. It will further explain implications for universities and students.