Debugging Machine Learning Pipelines

Author: Freire Juliana
Lourenço Raoni
Shasha Dennis
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2019

Machine learning tasks entail the use of complex computational pipelines to reach quantitative and qualitative conclusions. If some of the activities in a pipeline produce erroneous or uninformative outputs, the pipeline may fail or produce incorrect results. Inferring the root cause of failures and unexpected behavior is challenging, usually requiring much human thought, and is both time-consuming and error-prone. We propose a new approach that makes use of iteration and provenance to automatically infer the root causes and derive succinct explanations of failures. Through a detailed experimental evaluation, we assess the cost, precision, and recall of our approach compared to the state of the art. Our source code and experimental data will be available for reproducibility and enhancement.

Comment: 10 page

arXiv.org e-Print Archive

Crossref

Scipedia

Open Repository and Bibliography - Luxembourg

Developing an algorithm to minimize Boolean functions for the visual-matrix form of the analytical method

Author: Solomko Mykhailo
Publication venue: 'Private Company Technology Center'
Publication date: 26/02/2021

This research has established the possibility of improving the effectiveness of the visual-matrix form of the analytical Boolean function minimization method by identifying reserves in a more complex algorithm for the operations of logical absorption and super-gluing the variables in terms of logical functions. An improvement in the efficiency of the Boolean function minimization procedure was also established, due to selecting, according to the predefined criteria, the optimal stack of logical operations for the first and second binary matrices of Boolean functions. When combining a sequence of logical operations using different techniques for gluing variables such as simple gluing and super-gluing, there are a small number of cases when function minimization is more effective if an operation of simply gluing the variables is first applied to the first matrix. Thus, a short analysis is required for the primary application of operations in the first binary matrix. That ensures the proper minimization efficiency regarding the earlier unaccounted-for variants for simplifying the Boolean functions by the visual-matrix form of the analytical method. For a series of cases, the choice of the optimal stack is also necessary for the second binary matrix. The experimental study has confirmed that the visual-matrix form of the analytical method, whose special feature is the use of 2-(n, b)-design and 2-(n, x/b)-design systems in the first matrix, improves the process efficiency, as well as the reliability of the result of Boolean function minimization. This simplifies the procedure of searching for a minimal function. Compared to analogs, that makes it possible to improve the productivity of the Boolean function minimization process by 100‒200 %. 
There is reason to assert that the efficiency of the Boolean function minimization process by the visual-matrix form of the analytical method can be improved through the use of more complex logical operations of absorbing and super-gluing the variables, and also by optimally combining the sequence of super-gluing and simple gluing operations, with the stack of logical operations in the first binary matrix of the given function selected according to the established criteria.
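To make the term "simple gluing" concrete, here is a minimal sketch of the textbook rule x·y + x·¬y = x applied to terms written as strings over `0`, `1`, and `-` (dash meaning "variable eliminated"). This is an illustration only: it is not the visual-matrix method or the super-gluing operation from the abstract, and the function name `glue` and the string encoding are assumptions made for this example.

```python
def glue(a, b):
    """Merge two terms that differ in exactly one fixed position.

    Terms are strings over '0', '1', '-' of equal length.
    Returns the merged term, or None when the gluing rule does not apply.
    """
    if len(a) != len(b):
        return None
    # Positions where the two terms disagree.
    diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(diff) != 1:
        return None
    i = diff[0]
    # A dash against a fixed bit is not a complementary pair.
    if "-" in (a[i], b[i]):
        return None
    return a[:i] + "-" + a[i + 1:]


print(glue("101", "100"))  # terms differ only in the last variable
print(glue("10-", "11-"))  # already-reduced terms can be glued again
print(glue("101", "110"))  # two differences: rule does not apply
```

Repeated application of this rule to all pairs of terms is the basic step behind tabular minimization procedures; the abstract's claim concerns choosing the order in which such operations are applied.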

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Eastern-European Journal of Enterprise Technologies

BugDoc: Algorithms to Debug Computational Processes

Author: Alvaro Peter
Attariyan Mona
Bergstra J.
Bergstra James
Chen Ang
Dolatnia Nima
Galhotra Sainyam
Godefroid Patrice
Holler Christian
Hutter F.
Johnson Brittany
Lee Kang Wook
Liblit Ben
Lourenço Raoni
Meliou Alexandra
Snoek Jasper
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 11/04/2020

Data analysis for scientific experiments and enterprises, large-scale simulations, and machine learning tasks all entail the use of complex computational pipelines to reach quantitative and qualitative conclusions. If some of the activities in a pipeline produce erroneous outputs, the pipeline may fail to execute or produce incorrect results. Inferring the root cause(s) of such failures is challenging, usually requiring time and much human thought, while still being error-prone. We propose a new approach that makes use of iteration and provenance to automatically infer the root causes and derive succinct explanations of failures. Through a detailed experimental evaluation, we assess the cost, precision, and recall of our approach compared to the state of the art. Our experimental data and processing software are available for use, reproducibility, and enhancement.

Comment: To appear in SIGMOD 2020. arXiv admin note: text overlap with arXiv:2002.0464

arXiv.org e-Print Archive

Crossref

Scipedia

Open Repository and Bibliography - Luxembourg


Automated synthesis of data extraction and transformation programs

Author: Yaghmazadeh Navid
Publication date: 27/08/2018

Due to the abundance of data in today’s data-rich world, end-users increasingly need to perform various data extraction and transformation tasks. While many of these tedious tasks can be performed in a programmatic way, most end-users lack the required programming expertise to automate them and end up spending their valuable time manually performing various data-related tasks. The field of program synthesis aims to overcome this problem by automatically generating programs from informal specifications, such as input-output examples or natural language. This dissertation focuses on the design and implementation of new systems for automating important classes of data transformation and extraction tasks. It introduces solutions for automating data manipulation tasks on fully-structured data formats like relational tables, or on semi-structured formats such as XML and JSON documents. First, we describe a novel algorithm for synthesizing hierarchical data transformations from input-output examples. A key novelty of our approach is that it reduces the synthesis of tree transformations to the simpler problem of synthesizing transformations over the paths of the tree. We also describe a new and effective algorithm for learning path transformations that combines logical SMT-based reasoning with machine learning techniques based on decision trees. Next, we present a new methodology for learning programs that migrate tree-structured documents to relational table representations from input-output examples. Our approach achieves its goal by decomposing the synthesis task into two subproblems: (A) learning the column extraction logic, and (B) learning the row extraction logic. We propose a technique for learning column extraction programs using deterministic finite automata, and a new algorithm for predicate learning which combines integer linear programming and logic minimization. Finally, we address the problem of automating data extraction tasks from natural language. Specifically, we focus on data retrieval from relational databases and describe a novel approach for learning SQL queries from English descriptions. The method we describe is fully automatic and database-agnostic (i.e., does not require customization for each database). Our method combines semantic parsing techniques from the NLP community with novel programming languages ideas involving probabilistic type inhabitation and automated sketch repair.

Computer Science

Texas ScholarWorks

BugDoc: Iterative debugging and explanation of pipeline

Author: DE PAULA LOURENCO Raoni
Freire Juliana
Shasha Dennis
Simon Eric
Weber Gabriel
Publication venue: Springer Science and Business Media Deutschland GmbH
Publication date: 01/01/2023

Peer reviewed

Applications in domains ranging from large-scale simulations in astrophysics and biology to enterprise analytics rely on computational pipelines. A pipeline consists of modules and their associated parameters, data inputs, and outputs, which are orchestrated to produce a set of results. If some modules derive unexpected outputs, the pipeline can crash or lead to incorrect results. Debugging these pipelines is difficult since there are many potential sources of errors, including bugs in the code, input data, software updates, and improper parameter settings. We present BugDoc, a system that automatically infers the root causes and derives succinct explanations of failures for black-box pipelines. BugDoc does so by using provenance from previous runs of a given pipeline to derive hypotheses for the errors, and then iteratively runs new pipeline configurations to test these hypotheses. Besides identifying issues associated with computational modules in a pipeline, we also propose methods for: “opportunistic group testing” to identify portions of data inputs that might be responsible for failed executions (what we call), helping users narrow down the cause of failure; and “selective instrumentation” to determine nodes in pipelines that should be instrumented to improve efficiency and reduce the number of iterations to test. Through a case study of deployed workflows at a software company and an experimental evaluation using synthetic pipelines, we assess the effectiveness of BugDoc and show that it requires fewer iterations to derive root causes and/or achieves higher quality results than previous approaches.
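The loop described in this abstract (derive hypotheses from the provenance of earlier runs, then run new configurations to test them) can be illustrated with a small sketch. This is not BugDoc's algorithm: it is a deliberately simplified stand-in that only considers single parameter-value pairs, assumes a deterministic pipeline, and assumes the history contains at least one passing run. The names `candidate_causes`, `debug`, and `pipeline` are hypothetical.

```python
def candidate_causes(history):
    """Hypotheses: (parameter, value) pairs seen in failing runs but never in passing runs."""
    failing, passing = set(), set()
    for config, ok in history:
        pairs = set(config.items())
        if ok:
            passing |= pairs
        else:
            failing |= pairs
    return failing - passing


def debug(run, history):
    """Test each hypothesis by running a new configuration that isolates it.

    `run` is a black-box callable returning True on success (an assumption
    of this sketch). `history` is a list of (config_dict, succeeded) pairs
    and must contain at least one passing run.
    """
    history = list(history)
    good = next(config for config, ok in history if ok)
    confirmed = set()
    for param, value in sorted(candidate_causes(history)):
        # New configuration: a known-good run with only the suspect pair swapped in.
        trial = dict(good)
        trial[param] = value
        ok = run(trial)
        history.append((trial, ok))
        if not ok:
            confirmed.add((param, value))
    return confirmed


def pipeline(config):
    """Hypothetical black-box pipeline that fails whenever lib == '2.0'."""
    return config["lib"] != "2.0"


provenance = [
    ({"lib": "1.0", "data": "a"}, True),
    ({"lib": "2.0", "data": "b"}, False),
]
print(debug(pipeline, provenance))
```

In this toy run, both `lib=2.0` and `data=b` start as suspects because each appears only in the failing run; the two extra executions clear `data=b` and confirm `lib=2.0`. Failures caused by combinations of parameters would need a richer hypothesis space than this sketch provides.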

Open Repository and Bibliography - Luxembourg

Space Communications: Theory and Applications. Volume 3: Information Processing and Advanced Techniques. A Bibliography, 1958 - 1963

Author: Bickford L. C.
Filipowsky R. F.

Annotated bibliography on information processing and advanced communication techniques - theory and applications of space communication

NASA Technical Reports Server

New Approaches for Memristive Logic Computations

Author: Aljafar Muayad Jaafar
Publication venue: PDXScholar
Publication date: 06/06/2018

Over the past five decades, exponential advances in device integration in microelectronics for memory and computation applications have been observed. These advances are closely related to miniaturization in integrated circuit technologies. However, this miniaturization is reaching the physical limit (i.e., the end of Moore's Law). This miniaturization is also causing a dramatic problem of heat dissipation in integrated circuits. Additionally, approaching the physical limit of semiconductor devices in fabrication process increases the delay of moving data between computing and memory units hence decreasing the performance. The market requirements for faster computers with lower power consumption can be addressed by new emerging technologies such as memristors. Memristors are non-volatile and nanoscale devices and can be used for building memory arrays with very high density (extending Moore's law). Memristors can also be used to perform stateful logic operations where the same devices are used for logic and memory, enabling in-memory logic. In other words, memristor-based stateful logic enables a new computing paradigm of combining calculation and memory units (versus von Neumann architecture of separating calculation and memory units). This reduces the delays between processor and memory by eliminating redundant reloading of reusable values. In addition, memristors consume low power hence can decrease the large amounts of power dissipation in silicon chips hitting their size limit. The primary focus of this research is to develop the circuit implementations for logic computations based on memristors. These implementations significantly improve the performance and decrease the power of digital circuits. This dissertation demonstrates in-memory computing using novel memristive logic gates, which we call volistors (voltage-resistor gates). 
Volistors capitalize on rectifying memristors, i.e., a type of memristor with diode-like behavior, and use voltage at input and resistance at output. In addition, programmable diode gates, i.e., another type of logic gate implemented with rectifying memristors, are proposed. In programmable diode gates, memristors are used only as switches (unlike volistor gates, which utilize both memory and switching characteristics of the memristors). The programmable diode gates can be used with CMOS gates to increase the logic density. As an example, circuit implementations for calculating logic functions in generalized ESOP (Exclusive-OR-Sum-of-Products) form and in a multilevel XOR network are described. As opposed to the stateful logic gates, a combination of both proposed logic styles decreases the power and improves the performance of digital circuits realizing two-level logic functions Sum-of-Products or Product-of-Sums. This dissertation also proposes a general 3-dimensional circuit architecture for in-memory computing. This circuit consists of a number of stacked crossbar arrays which all can simultaneously be used for logic computing. These arrays communicate through CMOS peripheral circuits.

PDXScholar (Portland State University)

Evolutionary environment for automatic design in engineering

Author: Lamas Rodríguez Adolfo
Publication date: 01/01/2004

[Abstract] A modular, scalable, and interactive computational environment for automatic design has been developed. It integrates the stages that make up the design process, removing the human designer from the solution-search and decision-making phases and limiting their role to specifying the problem and, when necessary, subjectively evaluating proposals. This improves creativity by eliminating the artificial constraints and the limited exploration of possible solutions that a human designer introduces, makes problems more manageable, and opens the possibility of obtaining varied, higher-quality solutions. Integrating multiple simulators and user interfaces in the environment makes it possible to create a macro quality function that weights technical evaluations (expert engineers and simulators) and subjective evaluations (end users). The selection and relative weighting of the fitness values from each simulator are based on the existing criteria expressed in terms of the problem domain (economic performance, production capacity, etc.) rather than the implementation domain (electrical consumption, control parameters, etc.), as happens in traditional design. At the same time, problems of moderate complexity require high computing power, which in turn requires a highly scalable system. To that end, an implementation has been developed that supports computational distribution at different levels of granularity depending on the specific problem. Because of its modularity, the environment is easily adapted to different problems by introducing evaluation modules and/or modifying parameters of the search methodology. To demonstrate the versatility of the modular environment, design processes for real systems have been implemented in…

Repositorio da Universidade da Coruña

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas