Search CORE

5 research outputs found

AravTewari/Git-Tutorial: Initial Release

Author: Arav Tewari
Publication date: 20/01/2023

Full Changelog: https://github.com/AravTewari/Git-Tutorial/commits/v1.0.

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Analysis of Failures and Risks in Deep Learning Model Converters: A Case Study in the ONNX Ecosystem

Authors: Davis, James C.; Jajal, Purvish; Jiang, Wenxin; Tewari, Arav; Thiruvathukal, George K.; Woo, Joseph
Publication venue: Loyola eCommons
Publication date: 04/03/2023

Many software engineers develop, fine-tune, and deploy deep learning (DL) models. They use DL models in a variety of development frameworks and deploy to a range of runtime environments. In this diverse ecosystem, engineers use DL model converters to move models from frameworks to runtime environments. Conversion errors compromise model quality and disrupt deployment. However, failure modes and patterns of DL model converters are unknown. This knowledge gap adds engineering risk in DL interoperability technologies. In this paper, we conduct the first failure analysis on DL model converters. Specifically, we characterize failures in model converters associated with ONNX (Open Neural Network eXchange). We analyze failures in the ONNX converters for two major DL frameworks, PyTorch and TensorFlow. The symptoms, causes, and locations of failures are reported for N=200 issues. We also evaluate why models fail by converting 5,149 models, both real-world and synthetically generated instances. Through the course of our testing, we find 11 defects (5 new) across torch.onnx, tf2onnx, and the ONNXRuntime. We evaluate two hypotheses about the relationship between model operators and converter failures, falsifying one and obtaining equivocal results for the other. We describe and note weaknesses in the current testing strategies for model converters. Our results motivate future research on making DL software simpler to maintain, extend, and validate.

Loyola eCommons

PTMTorrent: A Dataset for Mining Open-source Pre-trained Model Packages

Authors: Davis, James C.; Jajal, Purvish; Jiang, Wenxin; Pareek, Bhavesh; Schorlemmer, Taylor R.; Synovic, Nicholas; Tewari, Arav; Thiruvathukal, George K.
Publication venue: Loyola eCommons
Publication date: 04/02/2023

Due to the cost of developing and training deep learning models from scratch, machine learning engineers have begun to reuse pre-trained models (PTMs) and fine-tune them for downstream tasks. PTM registries known as “model hubs” support engineers in distributing and reusing deep learning models. PTM packages include pre-trained weights, documentation, model architectures, datasets, and metadata. Mining the information in PTM packages will enable the discovery of engineering phenomena and tools to support software engineers. However, accessing this information is difficult: there are many PTM registries, and both the registries and the individual packages may have rate limiting for accessing the data. We present an open-source dataset, PTMTorrent, to facilitate the evaluation and understanding of PTM packages. This paper describes the creation, structure, usage, and limitations of the dataset. The dataset includes a snapshot of 5 model hubs and a total of 15,913 PTM packages. These packages are represented in a uniform data schema for cross-hub mining. We describe prior uses of this data and suggest research opportunities for mining using our dataset.

Loyola eCommons

The Francis Crick Institute