    End-user feature engineering in the presence of class imbalance

    Intelligent user interfaces, such as recommender systems and email classifiers, use machine learning algorithms to customize their behavior to the preferences of an end user. Although these learning systems are somewhat reliable, they are not perfectly accurate. Traditionally, end users who need to correct these learning systems can only provide more labeled training data. In this paper, we focus on incorporating new features suggested by the end user into machine learning systems. To investigate the effects of user-generated features on accuracy, we developed an auto-coding application that enables end users to assist a machine-learned program in coding a transcript by adding custom features. Our results show that adding user-generated features to the machine learning algorithm can result in modest improvements to its F1 score. Further improvements are possible if the algorithm accounts for class imbalance in the training data and handles low-quality user-generated features, which add noise to the learning algorithm. We show that addressing class imbalance improves performance to an extent, but improving the quality of features brings about the most beneficial change. Finally, we discuss changes to the user interface that can help end users avoid creating low-quality features.
    Keywords: Feature Engineering, Class Imbalance, machine learning, artificial intelligence, end-user programming, HC
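
The abstract reports results as F1 scores and names class imbalance as a factor. The sketch below is an illustration only, not the authors' implementation: it shows why plain accuracy can mislead on imbalanced labels while F1 does not, and one common way to derive class weights from label frequencies. The function names, the toy data, and the inverse-frequency weighting scheme are all assumptions introduced here; the abstract does not say which imbalance technique was used.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def inverse_frequency_weights(labels):
    """Hypothetical helper: weight each class by n / (k * count).

    Rare classes get larger weights, so a weighted learner is penalised
    more for misclassifying them. This is one common heuristic, assumed
    here for illustration.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {label: n / (k * count) for label, count in counts.items()}


# Toy data (invented): 5 positive segments out of 100.
y_true = [1] * 5 + [0] * 95

# A classifier that always predicts the majority class looks accurate
# but never finds a positive example.
always_negative = [0] * 100

# A classifier that finds 3 of the 5 positives with 1 false alarm.
some_positives = [1, 1, 1, 0, 0] + [1] + [0] * 94

weights = inverse_frequency_weights(y_true)
```

On this toy data the always-negative predictor has high accuracy but an F1 of zero, which is why F1 is the more informative measure when the class of interest is rare.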