On the definition of a general learning system with user-defined operators
In this paper, we push forward the idea of machine learning systems whose
operators can be modified and fine-tuned for each problem. This allows us to
propose a learning paradigm where users can write (or adapt) their operators,
according to the problem, data representation and the way the information
should be navigated. To achieve this goal, data instances, background
knowledge, rules, programs and operators are all written in the same functional
language, Erlang. Since changing operators affect how the search space needs to
be explored, heuristics are learnt as a result of a decision process based on
reinforcement learning where each action is defined as a choice of operator and
rule. As a result, the architecture can be seen as a 'system for writing
machine learning systems' or to explore new operators where the policy reuse
(as a kind of transfer learning) is allowed. States and actions are represented
in a Q matrix which is actually a table, from which a supervised model is
learnt. This makes it possible to have a more flexible mapping between old and
new problems, since we work with an abstraction of rules and actions. We
include some examples illustrating policy reuse and the application of the
system gErl to IQ problems. To evaluate gErl, we test it on a range of
structured problems: a selection of IQ test tasks and several structured
prediction problems (list patterns).
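A minimal sketch of the decision process described above, with invented operators and rules (gErl itself is written in Erlang; this Python toy only illustrates tabular Q-learning where each action is an (operator, rule) pair):

```python
import random
from collections import defaultdict

random.seed(0)
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2

def q_learn(actions, step, episodes=200, start=0):
    """Tabular Q-learning; step(state, action) -> (next_state, reward, done)."""
    Q = defaultdict(float)  # the Q 'matrix', kept as a (state, action) table
    for _ in range(episodes):
        s, done = start, False
        while not done:
            if random.random() < EPSILON:
                a = random.choice(actions)          # explore
            else:
                a = max(actions, key=lambda act: Q[(s, act)])  # exploit
            s2, r, done = step(s, a)
            best_next = max(Q[(s2, act)] for act in actions)
            Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
            s = s2
    return Q

# Toy problem: reach state 3 by repeatedly applying the right operator.
ops = [("op_inc", "rule_a"), ("op_dec", "rule_a")]

def step(s, action):
    op, _rule = action
    s2 = min(s + 1, 3) if op == "op_inc" else max(s - 1, 0)
    return s2, (1.0 if s2 == 3 else 0.0), s2 == 3

Q = q_learn(actions=ops, step=step)
best = max(ops, key=lambda act: Q[(0, act)])
print(best)  # the learnt policy prefers ("op_inc", "rule_a") at state 0
```

In gErl, each such (state, action) row would additionally feed a supervised model learnt from the Q table, which is what enables the abstraction-based policy reuse the abstract mentions.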
A New Automatic Method to Adjust Parameters for Object Recognition
To recognize an object in an image, the user must apply a combination of
operators, where each operator has a set of parameters. These parameters must
be well adjusted in order to reach good results. Usually, this adjustment is
made manually by the user. In this paper we propose a new method to automate
the process of parameter adjustment for an object recognition task. Our method
is based on reinforcement learning and uses two types of agents: a User Agent,
which provides the necessary information, and a Parameter Agent, which adjusts
the parameters of each operator. Due to the nature of reinforcement learning,
the results depend not only on the system's characteristics but also on the
user's preferred choices.
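A hedged sketch of this two-agent loop (all identifiers and the reward function are invented for illustration, not taken from the paper): the Parameter Agent treats each candidate parameter value as an action and keeps an average-reward estimate, while the User Agent's role is reduced here to supplying candidate values and reward feedback:

```python
import random

random.seed(1)

class ParameterAgent:
    """Epsilon-greedy agent over candidate values of one operator parameter."""
    def __init__(self, candidate_values, epsilon=0.1):
        self.values = candidate_values
        self.eps = epsilon
        self.estimate = {v: 0.0 for v in candidate_values}
        self.count = {v: 0 for v in candidate_values}

    def choose(self):
        if random.random() < self.eps:
            return random.choice(self.values)
        return max(self.values, key=self.estimate.get)

    def update(self, value, reward):
        self.count[value] += 1
        # incremental average of the rewards observed for this value
        self.estimate[value] += (reward - self.estimate[value]) / self.count[value]

# User Agent stand-in: recognition quality peaks when the threshold is 128.
def user_feedback(threshold):
    return 1.0 - abs(threshold - 128) / 128

agent = ParameterAgent(candidate_values=[32, 64, 128, 192])
for _ in range(500):
    v = agent.choose()
    agent.update(v, user_feedback(v))

print(max(agent.estimate, key=agent.estimate.get))  # the learnt best value, 128
```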
A Multi-Agents Architecture to Learn Vision Operators and their Parameters
In a vision system, every task requires that the operators to apply be «well
chosen» and their parameters be «well adjusted». The diversity of operators and
the multitude of their parameters constitute a big challenge for users. Since
it is very difficult to make the «right» choice in the absence of a specific
rule, poor choices are common, and they affect the computation time and
especially the quality of the results. In this
paper we present a multi-agent architecture to learn the best operators to
apply and their best parameters for a class of images. Our architecture
consists of three types of agents: User Agent, Operator Agent and Parameter
Agent. The User Agent determines the phases of treatment, a library of
operators and the possible values of their parameters. The Operator Agent
constructs all possible combinations of operators and the Parameter Agent, the
core of the architecture, adjusts the parameters of each combination by
processing a large number of images. Through the reinforcement learning
mechanism, our architecture considers not only the system's capabilities but
also the user's preferences.
Comment: IJCSI, May 201
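The Operator Agent's combination step can be sketched as a Cartesian product over per-phase operator libraries (identifiers below are illustrative; the parameter ranges are shown but their adjustment is left to the Parameter Agent):

```python
from itertools import product

# User-Agent-style declaration: phases of treatment, each with a small
# operator library and candidate parameter values (made-up names).
library = {
    "smoothing":    {"gaussian": {"sigma": [1, 2]}, "median": {"ksize": [3, 5]}},
    "segmentation": {"threshold": {"t": [100, 128]}, "canny": {"low": [50, 100]}},
}

def operator_chains(library):
    """Enumerate every choice of one operator per phase, in phase order."""
    phases = list(library)
    for ops in product(*(library[p] for p in phases)):
        yield list(zip(phases, ops))

chains = list(operator_chains(library))
print(len(chains))  # 2 smoothing ops x 2 segmentation ops = 4 chains
```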
Declarative Data Analytics: a Survey
The area of declarative data analytics explores the application of the
declarative paradigm to data science and machine learning. It proposes
declarative languages for expressing data analysis tasks and develops systems
which optimize programs written in those languages. The execution engine can be
either centralized or distributed, as the declarative paradigm advocates
independence from particular physical implementations. The survey explores a
wide range of declarative data analysis frameworks by examining both the
programming model and the optimization techniques used, in order to provide
conclusions on the current state of the art in the area and identify open
challenges.
Comment: 36 pages, 2 figures
Interactive Program Synthesis
Program synthesis from incomplete specifications (e.g. input-output examples)
has gained popularity and found real-world applications, primarily due to its
ease of use. Since this technology is often used in an interactive setting,
efficiency and correctness are often the key user expectations from a system
based on such technologies. Ensuring efficiency is challenging since the highly
combinatorial nature of program synthesis algorithms does not fit within the
1-2 second response time expected of a user-facing system. Meeting correctness
expectations is also difficult, given that the specifications provided are
incomplete, and that the users of such systems are typically non-programmers.
In this paper, we describe how interactivity can be leveraged to develop
efficient synthesis algorithms, as well as to decrease the cognitive burden
that a user endures trying to ensure that the system produces the desired
program. We build a formal model of user interaction along three dimensions:
incremental algorithm, step-based problem formulation, and feedback-based
intent refinement. We then illustrate the effectiveness of each of these forms
of interactivity with respect to synthesis performance and correctness on a set
of real-world case studies
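A toy illustration of feedback-based intent refinement (our own miniature domain, not the paper's system): when several candidate programs agree with the incomplete examples, the synthesizer asks the user to label a distinguishing input, shrinking the candidate set:

```python
# Candidate programs over integers; the user's intended program is x * 2.
CANDIDATES = {
    "x + 1":  lambda x: x + 1,
    "x * 2":  lambda x: x * 2,
    "x ** 2": lambda x: x ** 2,
}

def consistent(progs, examples):
    """Keep only the programs that match every input-output example."""
    return {n: f for n, f in progs.items()
            if all(f(i) == o for i, o in examples)}

def distinguishing_input(progs, search=range(-5, 6)):
    """Find an input on which the surviving candidates disagree, if any."""
    for x in search:
        if len({f(x) for f in progs.values()}) > 1:
            return x
    return None

examples = [(2, 4)]                      # ambiguous: x * 2 and x ** 2 both fit
progs = consistent(CANDIDATES, examples)
x = distinguishing_input(progs)          # x = -5 separates the two candidates
examples.append((x, x * 2))              # user feedback: the intent is doubling
progs = consistent(CANDIDATES, examples)
print(list(progs))  # → ['x * 2']
```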
A Tensor Based Data Model for Polystore: An Application to Social Networks Data
In this article, we show how the mathematical object tensor can be used to
build a multi-paradigm model for the storage of social data in data warehouses.
From an architectural point of view, our approach makes it possible to link
different storage systems (a polystore) and limits the impact of the ETL tools
that perform the model transformations required to feed different analysis
algorithms. As a result,
systems can take advantage of multiple data models both in terms of query
execution performance and the semantic expressiveness of data representation.
The proposed model achieves logical independence between the data and the
programs implementing the analysis algorithms. With a concrete case study on
message virality on Twitter during the French presidential election of 2017, we
highlight some of the contributions of our model
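A minimal sketch of the idea, with an invented schema (not the paper's): storing social activity as a 3-way NumPy tensor indexed by (user, hashtag, hour) lets different analyses read different views of the same object without an ETL transformation in between:

```python
import numpy as np

users, tags = ["u1", "u2"], ["#a", "#b"]
T = np.zeros((len(users), len(tags), 24))  # (user, hashtag, hour-of-day)

# Record (user, hashtag, hour) events, e.g. retweets.
for u, h, t in [(0, 0, 9), (0, 0, 9), (1, 1, 9), (0, 1, 18)]:
    T[u, h, t] += 1

per_user = T.sum(axis=(1, 2))     # document-store-style view: activity per user
per_tag = T.sum(axis=(0, 2))      # relational-style aggregate: volume per hashtag
virality_at_9 = T[:, :, 9].sum()  # time slice: total activity at hour 9
print(per_user.tolist(), per_tag.tolist(), virality_at_9)
```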
Learning Tensors in Reproducing Kernel Hilbert Spaces with Multilinear Spectral Penalties
We present a general framework to learn functions in tensor product
reproducing kernel Hilbert spaces (TP-RKHSs). The methodology is based on a
novel representer theorem suitable for existing as well as new spectral
penalties for tensors. When the functions in the TP-RKHS are defined on the
Cartesian product of finite discrete sets, in particular, our main problem
formulation admits as a special case existing tensor completion problems. Other
special cases include transfer learning with multimodal side information and
multilinear multitask learning. For the latter case, our kernel-based view is
instrumental in deriving nonlinear extensions of existing model classes. We give
a novel algorithm and show in experiments the usefulness of the proposed
extensions
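The tensor-product structure rests on a standard fact about reproducing kernels; in our notation (not necessarily the paper's), the kernel of the tensor product space factorises over the modes:

```latex
% Reproducing kernel of H_1 \otimes \cdots \otimes H_d, where each H_j is an
% RKHS on a domain X_j with kernel k_j:
k\bigl((x_1,\dots,x_d),\,(x'_1,\dots,x'_d)\bigr)
  \;=\; \prod_{j=1}^{d} k_j(x_j, x'_j).
```

When each X_j is a finite discrete set, functions in the TP-RKHS are d-way tensors, which is how tensor completion arises as a special case of the framework.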
Learning to Match for Multi-criteria Document Relevance
In light of the tremendous amount of data produced by social media, a large
body of research has revisited the relevance estimation of user-generated
content. Most of these studies have stressed the multidimensional
nature of relevance and proved the effectiveness of combining the different
criteria that it embodies. Traditional relevance estimates combination methods
are often based on linear combination schemes. However, despite their
simplicity, those aggregation mechanisms fall short in real-life applications,
since they rely heavily on the unrealistic assumption that the relevance
dimensions are independent. In this paper, we propose to tackle this issue
through the design of a novel fuzzy-based document ranking model. We also
propose an automated methodology to capture the importance of relevance
dimensions, as well as information about their interaction. This model, based
on the Choquet Integral, makes it possible to optimize the aggregated document
relevance scores using any target information retrieval relevance metric.
Experiments
within the TREC Microblog task and a social personalized information retrieval
task show that our model significantly outperforms a wide range of
state-of-the-art aggregation operators, as well as representative
learning-to-rank methods.
Comment: 9 pages
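The aggregation step can be sketched with the discrete Choquet integral; the criteria and capacity values below are invented for illustration (the paper learns the capacity automatically from a target retrieval metric):

```python
def choquet(scores, mu):
    """Discrete Choquet integral.
    scores: {criterion: value in [0, 1]}; mu: {frozenset of criteria: weight},
    a capacity with mu(all criteria) = 1 that can encode interactions a linear
    combination cannot."""
    order = sorted(scores, key=scores.get)   # criteria in ascending score order
    total, prev = 0.0, 0.0
    for i, c in enumerate(order):
        coalition = frozenset(order[i:])     # criteria scoring >= scores[c]
        total += (scores[c] - prev) * mu[coalition]
        prev = scores[c]
    return total

mu = {                                       # made-up capacity
    frozenset({"topicality"}): 0.6,
    frozenset({"recency"}): 0.3,
    frozenset({"topicality", "recency"}): 1.0,  # positive interaction
}
doc = {"topicality": 0.8, "recency": 0.5}
print(round(choquet(doc, mu), 2))  # → 0.68
```

Note that 0.68 differs from any weighted average of 0.8 and 0.5 under the singleton weights alone; the coalition term is what captures the interaction between the two criteria.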
Designing a GUI for Proofs - Evaluation of an HCI Experiment
User interfaces of theorem proving systems often focus on assisting
particularly trained and skilled users, i.e., proof experts. As a result, these
systems are difficult for non-expert users to use. This paper describes a
paper-and-pencil HCI experiment in which (non-expert) students were asked to
make
suggestions for a GUI for an interactive system for mathematical proofs. They
had to explain the usage of the GUI by applying it to construct a proof sketch
for a given theorem. The evaluation of the experiment provides insights into
interaction design for non-expert users and into the needs and wants of this
user group.
Helix: Holistic Optimization for Accelerating Iterative Machine Learning
Machine learning workflow development is a process of trial-and-error:
developers iterate on workflows by testing out small modifications until the
desired accuracy is achieved. Unfortunately, existing machine learning systems
focus narrowly on model training---a small fraction of the overall development
time---and neglect to address iterative development. We propose Helix, a
machine learning system that optimizes the execution across
iterations---intelligently caching and reusing, or recomputing intermediates as
appropriate. Helix captures a wide variety of application needs within its
Scala DSL, with succinct syntax defining unified processes for data
preprocessing, model specification, and learning. We demonstrate that the reuse
problem can be cast as a Max-Flow problem, while the caching problem is
NP-Hard. We develop effective lightweight heuristics for the latter. Empirical
evaluation shows that Helix not only handles a wide variety of use cases in one
unified workflow but is also much faster, providing run-time reductions of up
to 19x over state-of-the-art systems such as DeepDive and KeystoneML on four
real-world applications in natural language processing, computer vision, and
the social and natural sciences.
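A simplified sketch of the load-versus-recompute decision (not Helix's actual Max-Flow formulation or its NP-hard caching problem; the DAG and costs below are invented): each intermediate is materialised either by loading a cached copy or by recomputing it from its parents, and changed operators force recomputation:

```python
def plan(dag, load_cost, compute_cost, changed):
    """dag: {node: [parents]} in topological order.
    changed: nodes whose operator changed and so must be recomputed."""
    cost, decision = {}, {}
    for node, parents in dag.items():
        # Recomputing a node requires its parents to be materialised first.
        recompute = compute_cost[node] + sum(cost[p] for p in parents)
        if node not in changed and load_cost.get(node, float("inf")) < recompute:
            cost[node], decision[node] = load_cost[node], "load"
        else:
            cost[node], decision[node] = recompute, "recompute"
    return decision

dag = {"raw": [], "features": ["raw"], "model": ["features"]}
load = {"raw": 5, "features": 2, "model": 10}
compute = {"raw": 1, "features": 8, "model": 4}
print(plan(dag, load, compute, changed={"model"}))
# {'raw': 'recompute', 'features': 'load', 'model': 'recompute'}
```

This greedy pass can double-count shared parents, which hints at why the real problem calls for the flow-based formulation described in the abstract.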