57 research outputs found
Structure Selection from Streaming Relational Data
Statistical relational learning techniques have been successfully applied in
a wide range of relational domains. In most of these applications, the human
designers capitalized on their background knowledge by following a
trial-and-error trajectory, where relational features are manually defined by a
human engineer, parameters are learned for those features on the training data,
the resulting model is validated, and the cycle repeats as the engineer adjusts
the set of features. This paper seeks to streamline application development in
large relational domains by introducing a light-weight approach that
efficiently evaluates relational features on pieces of the relational graph
that are streamed to it one at a time. We evaluate our approach on two social
media tasks and demonstrate that it leads to more accurate models that are
learned faster.
When and where to transfer for Bayesian network parameter learning
This work is supported by the European Research Council (ERC-2013-AdG339182-BAYES-KNOWLEDGE) and the European Union’s Horizon 2020 research and innovation programme under grant agreement No 640891. YZ is supported by China Scholarship Council (CSC)/Queen Mary Joint PhD scholarships and National Natural Science Foundation of China (61273322, 71471174).
MTFuzz: Fuzzing with a Multi-Task Neural Network
Fuzzing is a widely used technique for detecting software bugs and
vulnerabilities. Most popular fuzzers generate new inputs using an evolutionary
search to maximize code coverage. Essentially, these fuzzers start with a set
of seed inputs, mutate them to generate new inputs, and identify the promising
inputs using an evolutionary fitness function for further mutation. Despite
their success, evolutionary fuzzers tend to get stuck in long sequences of
unproductive mutations. In recent years, machine learning (ML) based mutation
strategies have reported promising results. However, the existing ML-based
fuzzers are limited by the lack of quality and diversity of the training data.
As the input space of the target programs is high dimensional and sparse, it is
prohibitively expensive to collect many diverse samples demonstrating
successful and unsuccessful mutations to train the model. In this paper, we
address these issues by using a Multi-Task Neural Network that can learn a
compact embedding of the input space based on diverse training samples for
multiple related tasks (i.e., predicting for different types of coverage). The
compact embedding can guide the mutation process by focusing most of the
mutations on the parts of the embedding where the gradient is high. MTFuzz
uncovers previously unseen bugs and achieves more edge coverage on average
than 5 state-of-the-art fuzzers on 10 real-world programs.
Comment: ACM Joint European Software Engineering Conference and Symposium on
the Foundations of Software Engineering (ESEC/FSE) 202
Lifted graphical models: a survey
Lifted graphical models provide a language for expressing dependencies between different types of entities, their attributes, and their diverse relations, as well as techniques for probabilistic reasoning in such multi-relational domains. In this survey, we review a general form for a lifted graphical model, a par-factor graph, and show how a number of existing statistical relational representations map to this formalism. We discuss inference algorithms, including lifted inference algorithms, that efficiently compute the answers to probabilistic queries over such models. We also review work in learning lifted graphical models from data. There is a growing need for statistical relational models (whether they go by that name or another), as we are inundated with data that is a mix of structured and unstructured, with entities and relations extracted in a noisy manner from text, and with the need to reason effectively with this data. We hope that this synthesis of ideas from many different research groups will provide an accessible starting point for new researchers in this expanding field.
Learning, Probability and Logic: Toward a Unified Approach for Content-Based Music Information Retrieval
Within the last 15 years, the field of Music Information Retrieval (MIR) has made tremendous progress in the development of algorithms for organizing and analyzing the ever-growing and increasingly varied amount of music and music-related data available digitally. However, the development of content-based methods to enable or improve multimedia retrieval remains a central challenge. In this perspective paper, we critically look at the problem of automatic chord estimation from audio recordings as a case study of content-based algorithms, and point out several bottlenecks in current approaches: expressiveness and flexibility are obtained at the expense of robustness and vice versa; available multimodal sources of information are little exploited; modeling multi-faceted and strongly interrelated musical information is limited with current architectures; models are typically restricted to short-term analysis that does not account for the hierarchical temporal structure of musical signals. Dealing with music data requires the ability to tackle both uncertainty and complex relational structure at multiple levels of representation. Traditional approaches have generally treated these two aspects separately, with probability and learning being the usual way to represent uncertainty in knowledge, and logical representation being the usual way to represent knowledge and complex relational information. We argue that the identified hurdles of current approaches could be overcome by recent developments in the area of Statistical Relational Artificial Intelligence (StarAI), which unifies probability, logic and (deep) learning. We show that existing approaches used in MIR find powerful extensions and unifications in StarAI, and we explain why we think it is time to consider the new perspectives offered by this promising research field.