11 research outputs found

    Roger that! Learning How Laypersons Teach New Functions to Intelligent Systems

    Intelligent systems are rather smart today but still limited to built-in functionality. To break through this barrier, future systems must allow users to easily adapt the system by themselves. For humans, the most natural way to communicate is talking. But what if users want to extend the systems’ functionality with nothing but natural language? Then intelligent systems must understand how laypersons teach new skills. To grasp the semantics of such teaching sequences, we have defined a hierarchical classification task. On the first level, we consider the existence of a teaching intent in an utterance; on the second, we classify the distinct semantic parts of teaching sequences: declaration of a new function, specification of intermediate steps, and superfluous information. We evaluate twelve machine learning techniques with multiple configurations tailored to this task, ranging from classical approaches such as Naïve Bayes to modern techniques such as bidirectional LSTMs and task-oriented adaptations. On the first level, convolutional neural networks achieve the best accuracy (96.6%). For the second task, bidirectional LSTMs are the most accurate (98.8%). With the additional adaptations, we are able to improve both classifications distinctly (up to 1.8%).

    Towards Programming in Natural Language: Learning New Functions from Spoken Utterances

    Systems with conversational interfaces are rather popular nowadays. However, their full potential is not yet exploited. For the time being, users are restricted to calling predefined functions. Soon, users will expect to customize systems to their needs and create their own functions using nothing but spoken instructions. Thus, future systems must understand how laypersons teach new functionality to intelligent systems. The understanding of natural language teaching sequences is a first step toward comprehensive end-user programming in natural language. We propose to analyze the semantics of spoken teaching sequences with a hierarchical classification approach. First, we classify whether an utterance constitutes an effort to teach a new function or not. Afterward, a second classifier locates the distinct semantic parts of teaching efforts: declaration of a new function, specification of intermediate steps, and superfluous information. For both tasks, we implement a broad range of machine learning techniques: classical approaches, such as Naïve Bayes, and neural network configurations of various types and architectures, such as bidirectional LSTMs. Additionally, we introduce two heuristic-based adaptations that are tailored to the task of understanding teaching sequences. As a data basis, we use 3168 descriptions gathered in a user study. For the first task, convolutional neural networks obtain the best results (accuracy: 96.6%); bidirectional LSTMs excel in the second (accuracy: 98.8%). The adaptations improve the first-level classification considerably (plus 2.2 percentage points).
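
The two-level scheme described in this abstract can be pictured as a small pipeline: a first-level classifier decides whether an utterance carries a teaching intent, and only then does a second-level classifier label its parts. The sketch below is illustrative only. The function names, the label strings `DECL` and `SPEC`, the per-word labeling, and the keyword rules standing in for the trained models are all assumptions, not the authors' implementation.

```python
from typing import Callable, List, Optional, Tuple

def classify_hierarchically(
    utterance: str,
    has_teaching_intent: Callable[[str], bool],
    label_parts: Callable[[List[str]], List[str]],
) -> Optional[List[Tuple[str, str]]]:
    """Run level 1 first; run level 2 only on utterances with teaching intent."""
    if not has_teaching_intent(utterance):
        return None  # not a teaching effort, so there is nothing to segment
    tokens = utterance.split()
    return list(zip(tokens, label_parts(tokens)))

# Hypothetical stand-ins for trained classifiers (assumption: keyword rules).
def toy_intent(utterance: str) -> bool:
    return "means" in utterance.lower().split()

def toy_labels(tokens: List[str]) -> List[str]:
    labels, current = [], "DECL"
    for tok in tokens:
        labels.append(current)
        if tok.lower() == "means":
            current = "SPEC"  # everything after the cue word is a step
    return labels

taught = classify_hierarchically("making coffee means fill the cup", toy_intent, toy_labels)
ignored = classify_hierarchically("turn on the light", toy_intent, toy_labels)
```

In a real system, the two callables would be replaced by trained models; the control flow between the levels is the only part this sketch is meant to show.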

    Global Entropy Based Greedy Algorithm for discretization

    Discretization is a crucial step, not only for summarizing continuous attributes but also for improving the performance of classifiers that require discrete values as input. In this thesis, I propose a supervised discretization method, the Global Entropy Based Greedy algorithm, which is based on Information Entropy Minimization. Experimental results on well-known benchmark datasets show that the proposed method outperforms state-of-the-art methods. To further improve the proposed method, a new stopping criterion based on the rate of change of entropy was also explored. The experimental analysis indicates that a threshold on the rate of entropy decrease could be more effective than a fixed number of intervals for classifiers such as C5.0.
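
To make the idea of entropy minimization concrete, the following minimal sketch picks a single cut point on one continuous attribute so that the size-weighted class entropy of the two resulting intervals is as low as possible. It is a generic illustration of that building block, not the thesis's Global Entropy Based Greedy algorithm; the function names and the toy data are assumptions.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_cut(values, labels):
    """Return (cut_point, weighted_entropy) for the best single binary cut.

    The cut point is None if no cut lowers the entropy of the unsplit data.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, entropy(labels))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot cut between two identical values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if weighted < best[1]:
            midpoint = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (midpoint, weighted)
    return best

# Toy data (assumption): two classes that separate cleanly between 3 and 10.
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = ["a", "a", "a", "b", "b", "b"]
cut, remaining_entropy = best_cut(values, labels)
```

A greedy discretizer would apply such a step repeatedly and stop according to some criterion, for example a fixed number of intervals or, as the abstract suggests, a threshold on how quickly the entropy is still decreasing.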

    Analítica de variables asociadas a la generación de reclamos en la distribución directa

    52 pages. One of the main actors in the direct sales model is the commercial promoter. Companies dedicated to this model pay special attention to claims for missing products after their orders have been delivered. Identifying the variables that cause these claims yields a valuable and competitive asset, from which analyses and predictive analytics actions can be generated to avoid claims and improve the level of service. This research presents an overview of the logistics of the product delivery process to the commercial promoter at a direct sales company and proposes a predictive model of supervised classification to find future claimant promoters. The implementation of this model showed that the days available for sale is the leading variable in the behavior of the promoter, and it generates beneficial information to mitigate the claims and the costs they entail. Program: Maestría en Diseño y Gestión de Procesos (Master's in Process Design and Management).

    The Impact of Overfitting and Overgeneralization on the Classification Accuracy in Data Mining

    Current classification approaches usually do not try to achieve a balance between fitting and generalization when they infer models from training data. Such approaches ignore the possibility of different penalty costs for the false-positive, false-negative, and unclassifiable types. Thus, their performances may not be optimal or may even be coincidental. This dissertation analyzes the above issues in depth. It also proposes two new approaches called the Homogeneity-Based Algorithm (HBA) and the Convexity-Based Algorithm (CBA) to address these issues. These new approaches aim at optimally balancing the data fitting and generalization behaviors of models when some traditional classification approaches are used. The approaches first define the total misclassification cost (TC) as a weighted function of the three penalty costs and their corresponding error rates. The approaches then partition the training data into regions. In the HBA, the partitioning is done according to some homogeneous properties derivable from the training data. Meanwhile, the CBA employs some convex properties to derive regions. A traditional classification method is then used in conjunction with the HBA and CBA. Finally, the approaches apply a genetic approach to determine the optimal levels of fitting and generalization. The TC serves as the fitness function in this genetic approach. Real-life datasets from a wide spectrum of domains were used to better understand the effectiveness of the HBA and CBA. The computational results have indicated that both the HBA and CBA might potentially fill a critical gap in the implementation of current or future classification approaches. Furthermore, the results have also shown that when the penalty cost of an error type was changed, the corresponding error rate followed stepwise patterns. The finding of stepwise patterns of classification errors can assist researchers in determining applicable penalties for classification errors. 
Thus, the dissertation also proposes a binary search approach (BSA) to produce those patterns. Real-life datasets were used to demonstrate the BSA.
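
The abstract defines the total misclassification cost (TC) only as a weighted function of three penalty costs and their error rates. One plausible reading, assumed here and not confirmed by the abstract, is a plain weighted sum. The key names `fp`, `fn`, and `uc` (false positive, false negative, unclassifiable) and the numbers are illustrative.

```python
def total_cost(rates, costs):
    """Weighted sum of error rates; the linear form is an assumption."""
    return sum(costs[kind] * rates[kind] for kind in ("fp", "fn", "uc"))

# Illustrative values (assumption): a false negative is five times as costly
# as a false positive, and an unclassifiable case twice as costly.
rates = {"fp": 0.10, "fn": 0.05, "uc": 0.02}
costs = {"fp": 1.0, "fn": 5.0, "uc": 2.0}
tc = total_cost(rates, costs)
```

Under this reading, raising the penalty for one error type makes a cost-minimizing search trade that error for the other two, which is the setting in which the abstract reports stepwise changes in the corresponding error rate.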

    SEMINAR NASIONAL INOVASI TEKNOLOGI DAN ILMU KOMPUTER ( 2021 ) TEMA: “Prospek Menjadi Technopreneur Dimasa Pandemi”

    The Seminar Nasional Inovasi Teknologi dan Ilmu Komputer (National Seminar on Technology Innovation and Computer Science, SNITIK 2021) is an event held regularly by the Faculty of Technology and Computer Science, Universitas Prima Indonesia (FTIK UNPRI). The seminar was originally called Semnas FTIK and ran under that name for four years, after which it was renamed SNITIK with a broader scope. In its seventh year, the seminar takes the theme “Prospek Menjadi Technopreneur Dimasa Pandemi” (Prospects of Becoming a Technopreneur During the Pandemic). The Covid-19 pandemic has strongly affected several global industry and business sectors. During the pandemic, most customers have shopped online more often because it is considered easier and more practical. This shows that business today is closely tied to technology, so technology needs to be used in developing new business models that create business opportunities. These conditions push industry to employ competent university graduates with a technopreneurial spirit.

    Synthese von Methodendefinitionen aus natürlichsprachlichen Äußerungen
