MLFriend: Interactive Prediction Task Recommendation for Event-Driven Time-Series Data
Most automation in machine learning focuses on model selection and
hyperparameter tuning, and many approaches overlook the challenge of
automatically defining predictive tasks. We still rely heavily on human experts
to define prediction tasks and to generate labels by aggregating raw data. In
this paper, we tackle
the challenge of defining useful prediction problems on event-driven
time-series data. We introduce MLFriend to address this challenge. MLFriend
first generates all possible prediction tasks under a predefined space, then
interacts with a data scientist to learn the context of the data and recommend
good prediction tasks from all the tasks in the space. We evaluate our system
on three different datasets, generating and solving a total of 2885 prediction
tasks. Of these, 722 were deemed useful by expert data scientists. We also show
that an automatic prediction task discovery system is able to identify the top
10 tasks that a user may like within a batch of 100 tasks.
Comment: 12 pages
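
The two stages described in the abstract (enumerate every task in a predefined space, then recommend a subset) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the aggregation names, column names, window sizes, and the `enumerate_tasks` and `recommend` helpers are hypothetical and are not taken from MLFriend itself. The scores passed to `recommend` merely stand in for whatever preference signal the system learns from the data scientist.

```python
from itertools import product

# Hypothetical task space: these values are illustrative assumptions,
# not MLFriend's actual operations, columns, or windows.
AGGREGATIONS = ["count", "sum", "mean"]
COLUMNS = ["amount", "duration"]
WINDOWS_DAYS = [7, 30]


def enumerate_tasks(aggregations, columns, windows):
    """Return every (aggregation, column, window) combination as a task dict."""
    return [
        {"agg": agg, "column": col, "window_days": win}
        for agg, col, win in product(aggregations, columns, windows)
    ]


def recommend(tasks, scores, k):
    """Return the k tasks with the highest scores.

    `scores` is a placeholder for a learned estimate of user preference.
    """
    ranked = sorted(zip(tasks, scores), key=lambda pair: pair[1], reverse=True)
    return [task for task, _ in ranked[:k]]


tasks = enumerate_tasks(AGGREGATIONS, COLUMNS, WINDOWS_DAYS)
top_three = recommend(tasks, scores=list(range(len(tasks))), k=3)
```

Even this toy space grows multiplicatively with each dimension, which suggests why exhaustive enumeration needs a recommendation step to be usable in practice.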
A Level-wise Taxonomic Perspective on Automated Machine Learning to Date and Beyond: Challenges and Opportunities
Automated machine learning (AutoML) is essentially the automation of the
process of applying machine learning to real-world problems. The primary goals
of AutoML tools are to provide methods and processes that make machine learning
available to non-experts in machine learning (domain experts), to improve the
efficiency of machine learning, and to accelerate research on machine learning.
Although
automation and efficiency are some of AutoML's main selling points, the process
still requires a surprising level of human involvement. A number of vital steps
of the machine learning pipeline, including understanding the attributes of
domain-specific data, defining prediction problems, creating a suitable
training data set, etc., still tend to be done manually by a data scientist on an
ad-hoc basis. Often, this process requires a lot of back-and-forth between the
data scientist and domain experts, making the whole process more difficult and
inefficient. Altogether, AutoML systems are still far from a "real automatic
system". In this review article, we present a level-wise taxonomic perspective
on AutoML systems to date and beyond, i.e., we introduce a new classification
system with seven levels to distinguish AutoML systems based on their level of
autonomy. We first discuss what an end-to-end machine learning pipeline
actually looks like and which sub-tasks of the machine learning pipeline have
been automated so far. Next, we highlight the sub-tasks that are still done
manually by a data scientist in most cases and how that limits a domain
expert's access to machine learning. Then, we introduce the
novel level-based taxonomy of AutoML systems and define each level according to
its scope of automation support. Finally, we provide a roadmap for future
research endeavors in the area of AutoML and discuss some important challenges
in achieving this ambitious goal.
Comment: 35 pages, survey article, 3 figures