
    Online Schema Evolution is (Almost) Free for Snapshot Databases

    Modern database applications often change their schemas to keep up with changing requirements. However, support for online and transactional schema evolution remains challenging in existing database systems. Specifically, prior work often takes ad hoc approaches to schema evolution, with 'patches' applied to existing systems, leading to many corner cases and often incomplete functionality. Applications therefore often have to carefully schedule downtime for schema changes, sacrificing availability. This paper presents Tesseract, a new approach to online and transactional schema evolution without the aforementioned drawbacks. We design Tesseract based on a key observation: in widely used multi-versioned database systems, schema evolution can be modeled as data modification operations that change the entire table, i.e., data-definition-as-modification (DDaM). This allows us to support schema evolution almost 'for free' by leveraging the concurrency control protocol. With simple tweaks to existing snapshot isolation protocols, on a 40-core server we show that under a variety of workloads, Tesseract is able to provide online, transactional schema evolution without service downtime and retain high application performance while schema evolution is in progress.
    Comment: To appear in the Proceedings of the 2023 International Conference on Very Large Data Bases (VLDB 2023).
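    As a rough illustration of the DDaM idea above, the sketch below models a schema change (adding a column) as an ordinary transaction over a toy multi-versioned store, so the usual first-committer-wins snapshot-isolation check decides whether it commits. The class and function names (MVStore, add_column, etc.) are made up for illustration and are not Tesseract's actual API.

        # Toy multi-versioned store with snapshot reads and first-committer-wins commits.
        class MVStore:
            def __init__(self):
                self.versions = {}   # key -> list of (commit_ts, row) pairs
                self.clock = 0

            def read(self, key, snapshot_ts):
                # Return the newest version visible at the reader's snapshot.
                for ts, row in reversed(self.versions.get(key, [])):
                    if ts <= snapshot_ts:
                        return row
                return None

            def commit(self, writes, begin_ts):
                # Abort if any written key changed since the transaction began.
                for key in writes:
                    hist = self.versions.get(key, [])
                    if hist and hist[-1][0] > begin_ts:
                        return None              # write-write conflict -> abort
                self.clock += 1
                for key, row in writes.items():
                    self.versions.setdefault(key, []).append((self.clock, row))
                return self.clock

        def add_column(store, keys, column, default, begin_ts):
            # Schema evolution as data modification: rewrite every row with the new column.
            writes = {}
            for key in keys:
                row = dict(store.read(key, begin_ts) or {})
                row[column] = default
                writes[key] = row
            return store.commit(writes, begin_ts)    # commits atomically or aborts

        store = MVStore()
        store.commit({"r1": {"name": "a"}, "r2": {"name": "b"}}, begin_ts=0)
        old_snap = store.clock                       # readers pinned here keep the old schema
        add_column(store, ["r1", "r2"], "age", 0, begin_ts=old_snap)
        print(store.read("r1", old_snap))            # {'name': 'a'}
        print(store.read("r1", store.clock))         # {'name': 'a', 'age': 0}

    In this model, readers pinned to older snapshots keep seeing the pre-change table versions, which is why the schema change requires no downtime.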

    User-centered Program Analysis Tools

    The research and industrial communities have made great strides in developing advanced software defect detection tools based on program analysis. Most of the work in this area has focused on developing novel program analysis algorithms to find bugs more efficiently or accurately, or to find more sophisticated kinds of bugs. However, the focus on algorithms often leads to tools that are complex and difficult to actually use to debug programs. We believe that we can design better, more useful program analysis tools by taking a user-centered approach. In this dissertation, we present three possible elements of such an approach. First, we improve the user interface by designing Path Projection, a toolkit for visualizing program paths, such as call stacks, that are commonly used to explain errors. We evaluated Path Projection in a user study and found that programmers were able to verify error reports more quickly with similar accuracy, and strongly preferred Path Projection to a standard code viewer. Second, we make it easier for programmers to combine different algorithms to customize the precision or efficiency of a tool for their target programs. We designed Mix, a framework that allows programmers to apply either type checking, which is fast but imprecise, or symbolic execution, which is precise but slow, to different parts of their programs. Mix keeps its design simple by making no modifications to the constituent analyses. Instead, programmers use Mix annotations to mark blocks of code that should be type checked or symbolically executed, and Mix automatically combines the results. We evaluated the effectiveness of Mix by implementing a prototype called Mixy for C and using it to check for null pointer errors in vsftpd. Finally, we integrate program analysis more directly into the debugging process. We designed Expositor, an interactive dynamic program analysis and debugging environment built on top of scripting and time-travel debugging. In Expositor, programmers write program analyses as scripts that analyze entire program executions, using list-like operations such as map and filter to manipulate execution traces. For efficiency, Expositor uses lazy data structures throughout its implementation to compute results on demand, enabling a more interactive user experience. We developed a prototype of Expositor using GDB and UndoDB, and used it to debug a stack overflow and to unravel a subtle data race in Firefox.
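    To give a flavor of the Expositor-style workflow described above, the Python sketch below treats an execution trace as a lazy sequence of events and builds an analysis from map/filter-style combinators, so only the events the analysis actually inspects are ever forced. The event format and helper names are invented for illustration and are not Expositor's actual scripting API, which is built on GDB and UndoDB.

        # Lazily yield (step, event) pairs; a real tool would pull these from a
        # time-travel debugger rather than a toy generator.
        def execution_trace(program):
            for step, event in enumerate(program()):
                yield step, event

        def trace_filter(pred, trace):
            return ((step, ev) for step, ev in trace if pred(ev))

        def trace_map(fn, trace):
            return ((step, fn(ev)) for step, ev in trace)

        # Stand-in for a recorded execution.
        def toy_program():
            yield {"kind": "call", "fn": "init"}
            yield {"kind": "write", "var": "depth", "value": 1}
            yield {"kind": "write", "var": "depth", "value": 2}

        # Find the first write to a watched variable without replaying the whole run.
        writes = trace_filter(lambda ev: ev["kind"] == "write" and ev["var"] == "depth",
                              execution_trace(toy_program))
        print(next(writes))   # (1, {'kind': 'write', 'var': 'depth', 'value': 1})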

    Framework for Querying and Analysis of Evolving Graphs

    The average person spends several hours a day behind the wheel of a vehicle, and most vehicles are equipped with on-board computers capable of collecting real-time data on driving behavior. However, this data source has rarely been tapped for healthcare and behavioral research purposes. This MS thesis is done in the context of the Diagnostic Driving project, an NSF-funded collaboration between Drexel, the Children's Hospital of Philadelphia (CHOP), and the University of Central Florida that studies the possibility of using driving behavior data to diagnose medical conditions. Specifically, this thesis focuses on the classification of driving behavior data collected in a driving simulator using deep neural networks. The target classification task is to differentiate novice from expert drivers. The thesis presents a comparative study of different variants of LSTMs (Long Short-Term Memory networks) and autoencoder networks to deal with the fact that we have few labeled examples (16 simulator drives, each labeled 'expert' or 'inexpert'), while each drive is high-dimensional and densely sampled (100 variables sampled at 60 Hz). Our results show that using an intermediate number of neurons in the LSTM networks and filtering the data (keeping only one out of every 10 samples) obtains better results, and that autoencoders work worse than manual feature selection.
    Ph.D., Information Studies -- Drexel University, 201
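    A pipeline along the lines described above (keep one of every ten samples, then classify each drive with a modestly sized LSTM) might look roughly like the following Keras sketch. The layer sizes, training settings, and synthetic data are illustrative assumptions, not the thesis code.

        import numpy as np
        import tensorflow as tf

        def downsample(drive, factor=10):
            # Keep every `factor`-th sample of a (timesteps, variables) array.
            return drive[::factor]

        # Synthetic stand-in for the 16 labeled drives (1 minute at 60 Hz, 100 variables each).
        rng = np.random.default_rng(0)
        drives = [rng.normal(size=(60 * 60, 100)) for _ in range(16)]
        labels = np.array([i % 2 for i in range(16)], dtype=np.float32)  # expert / inexpert
        x = np.stack([downsample(d) for d in drives])                    # shape (16, 360, 100)

        model = tf.keras.Sequential([
            tf.keras.layers.Input(shape=(x.shape[1], x.shape[2])),
            tf.keras.layers.LSTM(32),                                    # "intermediate" hidden size
            tf.keras.layers.Dense(1, activation="sigmoid"),
        ])
        model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
        model.fit(x, labels, epochs=5, batch_size=4, verbose=0)
        print(model.evaluate(x, labels, verbose=0))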