Usenetfs: A Stackable File System for Large Article Directories
The Internet has grown greatly in popularity in the past few years. Numerous users read USENET newsgroups daily for entertainment, work, study, and more. USENET news servers have seen a gradual increase in the traffic exchanged between them, to the point where the hardware and software supporting the servers are no longer capable of meeting demand, and the servers begin "dropping" articles they cannot process. This traffic has grown faster than software or hardware improvements could keep up with, resulting in much time and effort spent by administrators upgrading their news systems. One of the primary reasons for the slowness of news servers has been the need to process many articles in very large flat directories representing newsgroups such as control.cancel and misc.jobs.offered. A large portion of the resources is spent on processing articles in these few newsgroups. Most Unix directories are organized as a linear unsorted sequence of entries. Large newsgroups can have hundreds of thousands of articles in one directory, resulting in significant delays when processing any single article. Usenetfs is a file system that rearranges the directory structure from a flat one to one with small directories containing fewer articles. By breaking the structure into smaller directories, it improves the performance of looking up, creating, or deleting files, since these operations occur on smaller directories. Usenetfs takes advantage of article numbers; knowing that file names representing articles are composed of digits helps to bound the size of the smaller directories. Usenetfs improves overall performance by at least 22% for average news servers; common news server operations such as looking up, adding, and deleting articles are sped up by as much as several orders of magnitude. Usenetfs was designed and implemented as a stackable vnode-layer loadable kernel module.
It operates by "encapsulating" a client file system with a layer of directory management. To a process performing directory operations through a mounted Usenetfs, all directories appear flat; but when inspecting the underlying storage that it manages, small directories are visible. Usenetfs is small and is transparent to the user. It requires no change to news software, to other file systems, or to the rest of the operating system. Usenetfs is more portable than other native kernel-based file systems because it interacts with the vnode interface, which is similar on many different platforms.
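The core idea, hiding small bucketed directories behind a flat namespace keyed on article-number digits, can be sketched in user-space Python. The bucketing scheme below (grouping articles by the last two decimal digits of the article number) is an illustrative assumption; the abstract only states that digit-based names bound directory size, not the exact layout Usenetfs uses.

```python
def usenetfs_path(newsgroup_dir: str, article: int, digits: int = 2) -> str:
    """Map a flat article name to a small subdirectory.

    Hypothetical scheme: bucket articles by their last `digits`
    decimal digits, so each subdirectory holds only articles sharing
    that suffix. The real Usenetfs on-disk layout may differ.
    """
    name = str(article)
    bucket = name[-digits:].rjust(digits, "0")  # e.g. 123456 -> "56"
    return f"{newsgroup_dir}/{bucket}/{name}"
```

With two digits, a newsgroup of 300,000 articles is split across 100 buckets of roughly 3,000 entries each, so a linear directory scan touches orders of magnitude fewer entries.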
Discovery and Hot Replacement of Replicated Read-Only File Systems, with Application to Mobile Computing
We describe a mechanism for replacing files, including open files, of a read-only file system while the file system remains mounted; the act of replacement is transparent to the user. Such a "hot replacement" mechanism can improve fault tolerance, performance, or both. Our mechanism monitors, from the client side, the latency of operations directed at each file system. When latency degrades, the client automatically seeks a replacement file system that is equivalent to, but hopefully faster than, the current file system. The files in the replacement file system then take the place of those in the current file system. This work has particular relevance to mobile computers, which in some cases might move over a wide area. Wide-area movement can be expected to lead to highly variable response time, and gives rise to three sorts of problems: increased latency, increased failures, and decreased scalability. If a mobile client moves through regions having partial replicas of common file systems, then the mobile client can depend on our mechanism to provide increased fault tolerance and more uniform performance.
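The client-side monitoring loop described above can be sketched as a small latency tracker. The smoothing policy (an exponentially weighted moving average) and the fixed threshold are placeholder assumptions for illustration; the paper's actual degradation-detection policy is not given in the abstract.

```python
class ReplicaSelector:
    """Illustrative client-side latency monitor for replica switching.

    Assumed policy (not the paper's): keep an EWMA of per-server
    operation latency and trigger replacement when the current
    server's smoothed latency exceeds a fixed threshold.
    """

    def __init__(self, threshold_ms: float = 100.0, alpha: float = 0.2):
        self.threshold_ms = threshold_ms
        self.alpha = alpha          # EWMA smoothing factor
        self.latency = {}           # server -> smoothed latency (ms)

    def record(self, server: str, sample_ms: float) -> None:
        prev = self.latency.get(server, sample_ms)
        self.latency[server] = (1 - self.alpha) * prev + self.alpha * sample_ms

    def should_replace(self, current: str) -> bool:
        return self.latency.get(current, 0.0) > self.threshold_ms

    def best_replacement(self, candidates) -> str:
        # Prefer the candidate with the lowest observed latency.
        return min(candidates, key=lambda s: self.latency.get(s, float("inf")))
```

The transparent part of the mechanism, splicing the replacement file system under already-open files, happens in the kernel and is not modeled here.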
A File System Component Compiler
File system development is a difficult and time-consuming task, the results of which are rarely portable across operating systems. Several proposals to improve the vnode interface, to allow for more flexible file system design and implementation, have been made in recent years, but none is used in practice because they require costly fundamental changes to kernel interfaces that only operating system vendors can make, are still non-portable, tend to degrade performance, and do not appear to provide an immediate return on such an investment. This proposal advocates a language for describing file systems, called FiST. The associated translator can generate portable C code, kernel resident or not, that implements the described file system. No kernel source code is needed and no existing vnode interface must change. The performance of the file systems automatically generated by FiST can be within a few percent of comparable hand-written file systems. The main benefits of automation are that development and maintenance costs are greatly reduced, and that it becomes practical to prototype, implement, test, debug, and compose a vastly larger set of such file systems with different properties. The proposed thesis will describe the language and its translator, use it to implement a few file systems on more than one platform, and evaluate the performance of the automatically generated code.
Cryptfs: A Stackable Vnode Level Encryption File System
Data encryption has become an increasingly important factor in everyday work. Users seek a method of securing their data with maximum comfort and minimum additional requirements on their part; they want a security system that protects any files used by any of their applications, without resorting to application-specific encryption methods. Performance is an important factor to users since encryption can be time consuming. Operating system vendors want to provide this functionality, but without incurring the large costs of developing a new file system. This paper describes the design and implementation of Cryptfs, a file system designed as a stackable vnode-layer loadable kernel module. Cryptfs operates by "encapsulating" a client file system with a layer of encryption transparent to the user. Being kernel resident, Cryptfs performs better than user-level or NFS-based file servers such as CFS and TCFS. It is 2 to 37 times faster on micro-benchmarks such as read and write; this translates to 12-52% application speedup, as exemplified by a large build. Cryptfs offers stronger security by basing its keys on process session IDs as well as user IDs, and by the fact that kernel memory is harder to access. Working at and above the vnode level, Cryptfs is more portable than a file system that works directly with native media such as disks and networks. Cryptfs can operate on top of any other native file system, such as UFS/FFS and NFS. Finally, Cryptfs requires no changes to client file systems or remote servers.
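The stacking pattern Cryptfs uses, transforming data on the way down to a lower file system and inverting the transform on the way up, can be illustrated with a user-space sketch. The lower layer here is a plain dict standing in for the mounted-on file system, and the XOR keystream is a deliberately trivial stand-in cipher for illustration only; the real Cryptfs runs in the kernel at the vnode layer and uses a proper block cipher.

```python
class CryptLayer:
    """Illustrative stackable encryption layer (not Cryptfs itself).

    Writes encrypt before delegating to the lower layer, so only
    ciphertext reaches the underlying storage; reads decrypt on the
    way back up, so applications see cleartext transparently.
    """

    def __init__(self, lower: dict, key: bytes):
        self.lower = lower  # stand-in for the mounted-on file system
        self.key = key

    def _xform(self, data: bytes) -> bytes:
        # Toy symmetric transform (XOR keystream); self-inverse.
        return bytes(b ^ self.key[i % len(self.key)] for i, b in enumerate(data))

    def write(self, name: str, data: bytes) -> None:
        self.lower[name] = self._xform(data)   # ciphertext hits storage

    def read(self, name: str) -> bytes:
        return self._xform(self.lower[name])   # decrypt on the way up
```

Because the layer only intercepts reads and writes, the lower file system (UFS/FFS, NFS, etc.) needs no changes, which is the point of the stackable design.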
Benchmarking File System Benchmarking: It *IS* Rocket Science
The quality of file system benchmarking has not improved in over a decade of intense research spanning hundreds of publications. Researchers repeatedly use a wide range of poorly designed benchmarks, and in most cases, develop their own ad-hoc benchmarks. Our community lacks a definition of what we want to benchmark in a file system. We propose several dimensions of file system benchmarking and review the wide range of tools and techniques in widespread use. We experimentally show that even the simplest of benchmarks can be fragile, producing performance results spanning orders of magnitude. It is our hope that this paper will spur serious debate in our community, leading to action that can improve how we evaluate our file and storage systems.
Performance of Size-Changing Algorithms in Stackable File Systems
Stackable file systems can provide extensible file system functionality with minimal performance overhead and development cost. However, previous approaches are limited in the functionality they provide. In particular, they do not support size-changing algorithms, which are important and useful for many applications, such as compression and security. We propose fast index files, a technique for efficient support of size-changing algorithms in stackable file systems. Fast index files provide a page mapping between file system layers in a way that can be used with any size-changing algorithm. Index files are designed to be recoverable if lost and add less than 0.1% disk space overhead. We have implemented fast indexing using portable stackable templates, and we have used this system to build several example file systems with size-changing algorithms. We demonstrate that fast index files have very low overhead for typical workloads, only 2.3% over other stacked file systems. Our system can deliver much better performance on size-changing algorithms than user-level applications, as much as five times faster.
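The page-mapping idea behind fast index files can be sketched as a prefix-sum table: for each fixed-size upper-layer page, record the size of its encoded (e.g. compressed) form, so the byte range of any page in the lower file is a table lookup rather than a scan-and-decode of everything before it. The in-memory form below is an illustrative assumption; the paper's on-disk index format and recovery scheme are not reproduced here.

```python
class FastIndex:
    """Illustrative index-file sketch for a size-changing layer.

    Given the encoded size of each upper-layer page, precompute
    prefix sums so that the lower-file byte range holding page N
    is found in O(1), with no sequential decoding.
    """

    def __init__(self, encoded_sizes):
        self.offsets = [0]
        for size in encoded_sizes:
            self.offsets.append(self.offsets[-1] + size)

    def chunk_range(self, page: int):
        # (start, end) byte range of this page's encoded data
        # in the lower-layer file.
        return self.offsets[page], self.offsets[page + 1]
```

A random read of page N then decodes only one chunk, which is why the stacked layer can beat user-level tools that must decompress from the start of the file.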
Virtual Machine Workloads: The Case for New NAS Benchmarks
Network Attached Storage (NAS) and Virtual Machines (VMs) are widely used in data centers thanks to their manageability, scalability, and ability to consolidate resources. But the shift from physical to virtual clients drastically changes the I/O workloads seen on NAS servers, due to guest file system encapsulation in virtual disk images and the multiplexing of request streams from different VMs. Unfortunately, current NAS workload generators and benchmarks produce workloads typical of physical machines.
This paper makes two contributions. First, we studied the extent to which virtualization is changing existing NAS workloads. We observed significant changes, including the disappearance of file system meta-data operations at the NAS layer, changed I/O sizes, and increased randomness. Second, we created a set of versatile NAS benchmarks to synthesize virtualized workloads. This allows us to generate accurate virtualized workloads without the effort and limitations associated with setting up a full virtualized environment. Our experiments demonstrate that the relative error of our virtualized benchmarks, evaluated across 11 parameters, averages less than 10%.
ICE: An Interactive Configuration Explorer for High Dimensional Categorical Parameter Spaces
There are many applications where users seek to explore the impact of the settings of several categorical variables with respect to one dependent numerical variable. For example, a computer systems analyst might want to study how the type of file system or storage device affects system performance. A usual choice is the method of Parallel Sets, designed to visualize multivariate categorical variables. However, we found that the magnitude of the parameter impacts on the numerical variable cannot be easily observed there. We also attempted a dimension-reduction approach based on Multiple Correspondence Analysis but found that the SVD-generated 2D layout resulted in a loss of information. We hence propose a novel approach, the Interactive Configuration Explorer (ICE), which directly addresses the need of analysts to learn how the dependent numerical variable is affected by the parameter settings given multiple optimization objectives. No information is lost, as ICE shows the complete distribution and statistics of the dependent variable in context with each categorical variable. Analysts can interactively filter the variables to optimize for certain goals, such as achieving a system with maximum performance, low variance, etc. Our system was developed in tight collaboration with a group of systems performance researchers, and its final effectiveness was evaluated with expert interviews, a comparative user study, and two case studies. (10 pages; published by IEEE at VIS 2019, Vancouver, BC, Canada.)
MEF: Malicious Email Filter: A UNIX Mail Filter that Detects Malicious Windows Executables
We present Malicious Email Filter, MEF, a freely distributed malicious-binary filter incorporated into Procmail that can detect malicious Windows attachments by integrating with a UNIX mail server. The system has three capabilities: detection of known and unknown malicious attachments, tracking of the propagation of malicious attachments, and efficient model-update algorithms. The system filters multiple malicious attachments in an email by using detection models obtained from data mining over known malicious attachments. It leverages preliminary research in data mining applied to malicious executables, which allows the detection of previously unseen malicious attachments. In addition, the system provides a method for monitoring and measuring the spread of malicious attachments. Finally, the system also allows for the efficient propagation of detection models from a central server. These updated models can be downloaded by a system administrator and easily incorporated into the current model. The system will be released under the GPL in June 2001.
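The detection step, scoring an attachment against a model mined from known malicious binaries, can be sketched as below. The byte-bigram feature set and the overlap threshold are placeholder assumptions for illustration; MEF's actual data-mining models and features differ.

```python
# Illustrative sketch of MEF-style attachment scoring, not MEF's
# actual classifier: featurize an executable as its set of byte
# bigrams and flag it when its overlap with a model built from
# known-malicious binaries exceeds a threshold (both assumptions).

def ngrams(data: bytes, n: int = 2):
    """Set of n-byte substrings of the attachment body."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}

def is_malicious(attachment: bytes, malicious_model: set,
                 threshold: float = 0.5) -> bool:
    feats = ngrams(attachment)
    if not feats:
        return False
    overlap = len(feats & malicious_model) / len(feats)
    return overlap >= threshold  # True => quarantine the attachment
```

In deployment, a Procmail recipe would pipe each message through the filter, and the central server would ship updated `malicious_model` data to administrators.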