Machine Learning Hub for Tapis

Abstract

Machine learning is indispensable for extracting insights from intricate datasets, expediting data analysis, and enabling cross-disciplinary decision-making. However, the complexity of machine learning models can hinder non-technical users, creating a need for user-friendly tools. Within the Cloud and Interactive Computing group (CIC) at the Texas Advanced Computing Center (TACC), we are actively developing a Machine Learning Hub (ML Hub) API for the Tapis Framework. ML Hub comprises accessible microservices, each with an independent REST API implemented in Python with Flask and specified with OpenAPI v3 definitions, and it aims to improve the experience of developers, scientists, and researchers. The integration of Hugging Face's API into ML Hub provides open-source pre-trained models for state-of-the-art AI capabilities. Currently, ML Hub's Models Overview and Models Download functions offer a gateway for non-technical users to explore and download machine learning models, with requests authenticated using a JSON Web Token (JWT) from the Tapis Authenticator API. Future developments include implementing the Inference Client and Training Engine and integrating them with the Tapis UI in React and TypeScript.

Key features of ML Hub:

1. Models Overview: A portal showcasing top Hugging Face models with filtering options.
2. Models Download: Users can obtain specific models, with options to download either a binary file of the model or a zip file containing the model's repository, cached in a version-aware manner.
3. Inference Client: Facilitates starting a server for machine learning model inference on TACC's HPC cluster, enabling rapid prototyping.
4. Training Engine: Enables users to fine-tune models and showcase them on TACC's HPC cluster, removing technical complexities.

This research contributes to the broader discourse on democratizing machine learning's potential by providing user-friendly access to state-of-the-art models and addressing the challenges that non-technical users face. We hope that this project will foster innovative collaboration and user engagement, paving the way for an inclusive and impactful future in machine learning research.

Texas Advanced Computing Center (TACC)
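
The abstract says that Models Download caches files "in a version-aware manner" but does not describe how. The following is a minimal sketch of one way such a cache could work, not ML Hub's actual implementation. It assumes each download is identified by a model id plus a revision string, and that a cached file is reused only when both match. The names `cache_path` and `get_model`, the `model.bin` file name, the directory layout, and the `fetch` callable are all hypothetical and introduced purely for illustration.

```python
import tempfile
from pathlib import Path
from typing import Callable


def cache_path(cache_root: Path, model_id: str, revision: str) -> Path:
    """Return the cache location for one (model id, revision) pair."""
    # Model ids such as "org/name" contain a slash; flatten it so each
    # model maps to a single directory (hypothetical layout).
    safe_id = model_id.replace("/", "--")
    return cache_root / safe_id / revision / "model.bin"


def get_model(
    cache_root: Path,
    model_id: str,
    revision: str,
    fetch: Callable[[str, str], bytes],
) -> Path:
    """Return the cached file, calling `fetch` only on a cache miss.

    `fetch` is a hypothetical stand-in for whatever actually downloads
    the model bytes for the given id and revision.
    """
    target = cache_path(cache_root, model_id, revision)
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(fetch(model_id, revision))
    return target


if __name__ == "__main__":
    calls = []

    def fake_fetch(model_id: str, revision: str) -> bytes:
        # Records each "download" so cache hits are visible.
        calls.append((model_id, revision))
        return f"{model_id}@{revision}".encode()

    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        get_model(root, "org/demo-model", "rev-a", fake_fetch)  # miss
        get_model(root, "org/demo-model", "rev-a", fake_fetch)  # hit
        get_model(root, "org/demo-model", "rev-b", fake_fetch)  # new revision, miss
        print(len(calls))  # 2
```

Keying the path on the revision means a new model version never overwrites an older one, and a repeated request for the same version is served from disk without another download.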
