Semi-Supervised Learning using Differentiable Reasoning
We introduce Differentiable Reasoning (DR), a novel semi-supervised learning
technique which uses relational background knowledge to benefit from unlabeled
data. We apply it to the Semantic Image Interpretation (SII) task and show that
background knowledge provides significant improvement. We find that there is a
strong but interesting imbalance between the contributions of updates from
Modus Ponens (MP) and its logical equivalent Modus Tollens (MT) to the learning
process, suggesting that our approach is very sensitive to a phenomenon called
the Raven Paradox. We propose a solution to overcome this situation.
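One way to see where an imbalance between MP and MT updates can come from is to look at the gradients of a fuzzy implication. The sketch below assumes the Reichenbach implication I(a, b) = 1 - a + a*b, which is an illustrative choice and not necessarily the operator used in the paper; the function names are hypothetical. The derivative with respect to the consequent b plays the role of a Modus Ponens update, and the derivative with respect to the antecedent a plays the role of a Modus Tollens update.

```python
def reichenbach(a, b):
    """Fuzzy truth value of a -> b under the Reichenbach implication (assumed operator)."""
    return 1.0 - a + a * b


def implication_grads(a, b):
    """Return (dI/da, dI/db) for I(a, b) = 1 - a + a*b."""
    d_antecedent = b - 1.0  # never positive: pushes belief in a down (Modus Tollens direction)
    d_consequent = a        # never negative: pushes belief in b up (Modus Ponens direction)
    return d_antecedent, d_consequent


# Antecedent believed strongly, consequent believed weakly.
mt_grad, mp_grad = implication_grads(0.9, 0.2)
```

Under this assumed operator the MT gradient depends only on the consequent and the MP gradient only on the antecedent, so their relative sizes shift with how confident the model already is in each atom.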
Teaching the old dog new tricks: Supervised learning with constraints
Methods for taking external knowledge into account in Machine Learning models have the potential to address outstanding issues in data-driven AI methods, such as improving safety and fairness, and can simplify training when data is scarce. We propose a simple but effective method for injecting constraints at training time in supervised learning, based on decomposition and bi-level optimization: a master step is in charge of enforcing the constraints, while a learner step takes care of training the model. The process leads to approximate constraint satisfaction. The method is applicable to any ML approach for which the concept of label (or target) is well defined (most regression and classification scenarios), and allows existing training algorithms to be reused without modification. We make no assumptions about the constraints, although their properties affect the shape and complexity of the master problem. Convergence guarantees are hard to provide, but we found that the approach performs well on ML tasks with fairness constraints and on classical datasets with synthetic constraints.
Teaching the Old Dog New Tricks: Supervised Learning with Constraints
Adding constraint support in Machine Learning has the potential to address
outstanding issues in data-driven AI systems, such as safety and fairness.
Existing approaches typically apply constrained optimization techniques to ML
training, enforce constraint satisfaction by adjusting the model design, or use
constraints to correct the output. Here, we investigate a different,
complementary, strategy based on "teaching" constraint satisfaction to a
supervised ML method via the direct use of a state-of-the-art constraint
solver: this enables taking advantage of decades of research on constrained
optimization with limited effort. In practice, we use a decomposition scheme
alternating master steps (in charge of enforcing the constraints) and learner
steps (where any supervised ML model and training algorithm can be employed).
The process leads to approximate constraint satisfaction in general, and
convergence properties are difficult to establish; despite this fact, we found
empirically that even a naïve setup of our approach performs well on ML tasks
with fairness constraints, and on classical datasets with synthetic
constraints.
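The alternation of master and learner steps can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the authors' implementation: the learner is a closed-form least-squares line, the only constraint is an upper bound on the targets, and the master step is a simple blend-and-clip projection rather than a call to a constraint solver. The names fit_line, master_step, train_with_constraint, and the blend weight beta are all hypothetical.

```python
def fit_line(xs, ts):
    """Learner step: ordinary least squares for t ~ w*x + b (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_t = sum(ts) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, ts)) / var_x
    return w, mean_t - w * mean_x


def master_step(ys, preds, upper, beta=0.5):
    """Master step (assumed form): blend labels with predictions, then clip to z <= upper."""
    return [min(upper, beta * y + (1.0 - beta) * p) for y, p in zip(ys, preds)]


def train_with_constraint(xs, ys, upper, iterations=20):
    targets = list(ys)  # start from the original labels
    for _ in range(iterations):
        w, b = fit_line(xs, targets)             # learner: unmodified supervised training
        preds = [w * x + b for x in xs]
        targets = master_step(ys, preds, upper)  # master: adjust targets to respect the bound
    return fit_line(xs, targets)


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.0, 1.0, 2.0, 3.0, 4.0]
w, b = train_with_constraint(xs, ys, upper=3.0)
```

In this toy run the fitted line moves towards the bound without fully reaching it at x = 4, which mirrors the approximate constraint satisfaction described above. The learner never sees the constraint; it only sees the adjusted targets.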
T-Norms Driven Loss Functions for Machine Learning
Neural-symbolic approaches have recently gained popularity to inject prior
knowledge into a learner without requiring it to induce this knowledge from
data. These approaches can potentially learn competitive solutions with a
significant reduction of the amount of supervised data. A large class of
neural-symbolic approaches is based on First-Order Logic to represent prior
knowledge, relaxed to a differentiable form using fuzzy logic. This paper shows
that the loss function expressing these neural-symbolic learning tasks can be
unambiguously determined given the selection of a t-norm generator. When
restricted to supervised learning, the presented theoretical apparatus provides
a clean justification for the popular cross-entropy loss, which has been shown
to provide faster convergence and to reduce the vanishing gradient problem in
very deep structures. However, the proposed learning formulation extends the
advantages of the cross-entropy loss to the general knowledge that can be
represented by a neural-symbolic method. Therefore, the methodology allows the
development of a novel class of loss functions, which are shown in the
experimental results to lead to faster convergence rates than the approaches
previously proposed in the literature.
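The link between a t-norm generator and a loss function can be illustrated with a small sketch. It assumes that the truth value of each supervised atom is the probability the model assigns to the correct class, and that the loss for a conjunction of formulas is the sum of the additive generator over their truth values, ignoring the clamping that non-strict generators require. The function names are hypothetical. With the product t-norm generator g(x) = -log(x) this sum is the cross-entropy; swapping in the Lukasiewicz generator g(x) = 1 - x gives a different, linear loss.

```python
import math


def product_generator(x):
    """Additive generator g(x) = -log(x) of the product t-norm."""
    return -math.log(x)


def lukasiewicz_generator(x):
    """Additive generator g(x) = 1 - x of the Lukasiewicz t-norm."""
    return 1.0 - x


def generated_loss(truth_values, generator=product_generator):
    """Loss for a conjunction of formulas: sum of the generator over their truth values."""
    return sum(generator(t) for t in truth_values)


def cross_entropy(probs_of_true_class):
    """Standard cross-entropy over the probabilities assigned to the correct classes."""
    return -sum(math.log(p) for p in probs_of_true_class)


probs = [0.9, 0.6, 0.8]  # made-up probabilities of the correct class for three examples
loss_product = generated_loss(probs)
loss_lukasiewicz = generated_loss(probs, lukasiewicz_generator)
```

Choosing the generator is the only design decision in this sketch; the rest of the loss follows from it, which is the sense in which the generator determines the loss.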