283 research outputs found
iCrawl: Improving the Freshness of Web Collections by Integrating Social Web and Focused Web Crawling
Researchers in the Digital Humanities and journalists need to monitor, collect, and analyze fresh online content about current events, such as the Ebola outbreak or the Ukraine crisis, on demand. However, existing focused crawling approaches only consider topical aspects and ignore temporal aspects, and therefore cannot produce Web collections that are both thematically coherent and fresh. Social Media in particular provide a rich source of fresh content that state-of-the-art focused crawlers do not use. In this paper we address the problem of collecting fresh and relevant Web and Social Web content for a topic of interest through the seamless integration of the Web and Social Media in a novel integrated focused crawler. The crawler collects Web and Social Media content in a single system and exploits the stream of fresh Social Media content to guide the crawl.
Comment: Published in the Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries 2015
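A minimal sketch of the kind of integrated crawl loop the abstract describes: one priority frontier shared by Web and Social Media URLs, where freshness signals from a social stream steer the crawl. All names here (social_stream.poll(), fetch, extract_links, relevance, the post attributes) are hypothetical placeholders, not iCrawl's actual API.

```python
import heapq
import time

def integrated_crawl(seed_urls, social_stream, fetch, extract_links,
                     relevance, max_pages=1000):
    """Single frontier shared by Web and Social Media URLs.

    Priority combines topical relevance with freshness, so that
    URLs mentioned in recent social posts are crawled early.
    (Hypothetical interface, for illustration only.)
    """
    frontier = []  # min-heap of (-priority, url)
    seen = set()

    def enqueue(url, topical, posted_at=None):
        if url in seen:
            return
        seen.add(url)
        age_h = (time.time() - posted_at) / 3600 if posted_at else 24.0
        freshness = 1.0 / (1.0 + age_h)         # decays with post age
        priority = 0.5 * topical + 0.5 * freshness
        heapq.heappush(frontier, (-priority, url))

    for url in seed_urls:
        enqueue(url, topical=1.0)

    pages = []
    while frontier and len(pages) < max_pages:
        # Drain the social stream first: it supplies fresh seed URLs.
        for post in social_stream.poll():
            for url in post.urls:
                enqueue(url, relevance(post.text), post.timestamp)
        _, url = heapq.heappop(frontier)
        page = fetch(url)
        pages.append(page)
        for link in extract_links(page):
            enqueue(link, relevance(page.text))
    return pages
```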
Deep Tree Transductions - A Short Survey
The paper surveys recent extensions of Long Short-Term Memory (LSTM) networks that handle tree structures, from the perspective of learning non-trivial forms of isomorphic structured transductions. It discusses modern TreeLSTM models, showing the effect of the bias induced by the direction of tree processing. An empirical analysis on real-world benchmarks highlights that no single model is adequate to effectively approach all transduction problems.
Comment: To appear in the Proceedings of the 2019 INNS Big Data and Deep Learning (INNSBDDL 2019). arXiv admin note: text overlap with arXiv:1809.0909
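For concreteness, below is a minimal sketch of the Child-Sum TreeLSTM cell of Tai et al. (2015), the basic bottom-up building block that the surveyed models extend; the module names and the single-sample (unbatched) interface are illustrative choices.

```python
import torch
import torch.nn as nn

class ChildSumTreeLSTMCell(nn.Module):
    """One bottom-up Child-Sum TreeLSTM step: gates are computed from
    the node input and the sum of the children's hidden states, with
    one forget gate per child."""

    def __init__(self, x_dim, h_dim):
        super().__init__()
        self.iou = nn.Linear(x_dim + h_dim, 3 * h_dim)  # input/output/update gates
        self.f_x = nn.Linear(x_dim, h_dim)
        self.f_h = nn.Linear(h_dim, h_dim)

    def forward(self, x, child_h, child_c):
        # x: (x_dim,); child_h, child_c: (num_children, h_dim)
        h_sum = child_h.sum(dim=0)
        i, o, u = self.iou(torch.cat([x, h_sum])).chunk(3)
        i, o, u = torch.sigmoid(i), torch.sigmoid(o), torch.tanh(u)
        # one forget gate per child, conditioned on that child's state
        f = torch.sigmoid(self.f_x(x) + self.f_h(child_h))
        c = i * u + (f * child_c).sum(dim=0)
        return o * torch.tanh(c), c
```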
Energetic BEM for the numerical analysis of 2D Dirichlet damped wave propagation exterior problems
Time-dependent problems modeled by hyperbolic partial differential equations can be reformulated in terms of boundary integral equations and solved via the boundary element method. In this context, the analysis of the damping phenomena that occur in many physics and engineering problems is a novelty. Starting from a recently developed energetic space-time weak formulation for 1D damped wave propagation problems rewritten in terms of boundary integral equations, we develop here an extension of the so-called energetic boundary element method to the 2D case. Several numerical benchmarks are illustrated and discussed; their results confirm the accuracy and stability of the proposed technique, already proved for the numerical treatment of undamped wave propagation problems in several dimensions and for the 1D damped case.
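For orientation, the exterior Dirichlet model problem behind the abstract can be stated in the following generic form; the damping coefficients P (viscous) and Q (material) and their normalization are assumptions here, as the paper's exact parametrization may differ.

```latex
% Exterior Dirichlet problem for the 2D damped wave equation
% (generic form; damping parameters P, Q are assumed placeholders)
\begin{aligned}
  \frac{1}{c^{2}}\,u_{tt} + \frac{P}{c^{2}}\,u_{t} + Q\,u - \Delta u &= 0
      && \text{in } \Omega^{c} \times (0,T),\\
  u &= g && \text{on } \Gamma \times (0,T),\\
  u(\cdot,0) = u_{t}(\cdot,0) &= 0 && \text{in } \Omega^{c}.
\end{aligned}
```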
Combining learning and constraints for genome-wide protein annotation
Background: The advent of high-throughput experimental techniques paved the way to genome-wide computational analysis and predictive annotation studies. When considering the joint annotation of a large set of related entities, like all proteins of a certain genome, many candidate annotations could be inconsistent, or very unlikely, given the existing knowledge. A sound predictive framework capable of accounting for this type of constraints in making predictions could substantially contribute to the quality of machine-generated annotations at a genomic scale.
Results: We present Ocelot, a predictive pipeline which simultaneously addresses functional and interaction annotation of all proteins of a given genome. The system combines sequence-based predictors for functional and protein-protein interaction (PPI) prediction with a consistency layer enforcing (soft) constraints as fuzzy logic rules. The enforced rules represent the available prior knowledge about the classification task, including taxonomic constraints over each GO hierarchy (e.g. a protein labeled with a GO term should also be labeled with all ancestor terms) as well as rules combining interaction and function prediction. An extensive experimental evaluation on the Yeast genome shows that the integration of prior knowledge via rules substantially improves the quality of the predictions. The system largely outperforms GoFDR, the only high-ranking system at the last CAFA challenge with a readily available implementation, when GoFDR is given access to intra-genome information only (as Ocelot), and has comparable or better results (depending on the hierarchy and performance measure) when GoFDR is allowed to use information from other genomes. Our system also compares favorably to recent methods based on deep learning.
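As an illustration of the consistency layer, the taxonomic GO rule term(t) -> term(parent(t)) can be relaxed into a differentiable penalty; the sketch below uses the Lukasiewicz implication and a toy encoding of the hierarchy, and is not Ocelot's actual implementation.

```python
import torch

def go_consistency_penalty(p, parent_of):
    """Soft penalty for the taxonomic rule  term(t) -> term(parent(t)).

    Under the Lukasiewicz implication, truth(a -> b) = min(1, 1 - a + b),
    so the rule's violation degree is max(0, p_child - p_parent).

    p: predicted GO-term probabilities, shape (num_terms,)
    parent_of: dict mapping a term index to its parent's index
    (toy encoding; Ocelot's actual rule set is richer).
    """
    penalty = p.new_zeros(())
    for child, parent in parent_of.items():
        penalty = penalty + torch.clamp(p[child] - p[parent], min=0.0)
    return penalty

# Usage: add to the prediction loss, so a protein predicted to have a
# GO term is also pushed toward that term's ancestors.
probs = torch.tensor([0.9, 0.2, 0.7])
loss_rules = go_consistency_penalty(probs, parent_of={0: 1, 2: 1})
```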
Application of Energetic BEM to 2D Elastodynamic Soft Scattering Problems
Starting from a recently developed energetic space-time weak formulation of the boundary integral equations related to scalar wave propagation problems, in this paper we focus for the first time on the 2D elastodynamic extension of this wave propagation analysis. In particular, we consider elastodynamic scattering problems by open arcs, with vanishing initial and Dirichlet boundary conditions, and we assess the efficiency and accuracy of the proposed method on the basis of numerical results obtained for benchmark problems with available analytical solutions.
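As a generic reference for this setting (not necessarily the paper's exact formulation), the 2D elastodynamic soft-scattering problem by an open arc Gamma can be written via the Navier equation with homogeneous (soft) Dirichlet data; the material parameters lambda, mu, rho are assumed placeholders.

```latex
% Navier equation of linear elastodynamics in the exterior of an open arc
% (generic form; Lame parameters lambda, mu and density rho assumed)
\begin{aligned}
  \rho\,\mathbf{u}_{tt} &= (\lambda + \mu)\,\nabla(\nabla\!\cdot\!\mathbf{u})
        + \mu\,\Delta\mathbf{u}
      && \text{in } (\mathbb{R}^{2}\setminus\Gamma) \times (0,T),\\
  \mathbf{u} &= \mathbf{0} && \text{on } \Gamma \times (0,T)
      \quad \text{(soft scattering)},\\
  \mathbf{u}(\cdot,0) = \mathbf{u}_{t}(\cdot,0) &= \mathbf{0}.
\end{aligned}
```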
T-norms driven loss functions for machine learning
Injecting prior knowledge into the learning process of a neural architecture is one of the main challenges currently faced by the artificial intelligence community, and it has motivated the emergence of neural-symbolic models. One of the main advantages of these approaches is their capacity to learn competitive solutions with a significant reduction in the amount of supervised data. In this regard, a commonly adopted solution consists of representing the prior knowledge via first-order logic formulas, then relaxing the formulas into a set of differentiable constraints by using a t-norm fuzzy logic. This paper shows that this relaxation, together with the choice of the penalty terms enforcing constraint satisfaction, can be unambiguously determined by the selection of a t-norm generator, providing numerical simplification properties and a tighter integration between the logic knowledge and the learning objective. When restricted to supervised learning, the presented theoretical framework yields a direct derivation of the popular cross-entropy loss, which has been shown to provide faster convergence and to reduce the vanishing gradient problem in very deep structures. The proposed learning formulation, however, extends the advantages of the cross-entropy loss to the general knowledge that can be represented by neural-symbolic methods. In addition, the presented methodology allows the development of novel classes of loss functions, which the experimental results show to lead to faster convergence rates than approaches previously proposed in the literature.
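A minimal sketch of the core observation, under illustrative generator choices: selecting the additive generator g of a t-norm fixes the loss of a formula as g applied to its fuzzy truth degree, and the product generator g(x) = -log x recovers binary cross-entropy for supervision formulas. The function and variable names are assumptions for illustration.

```python
import torch

# Additive generators g of common t-norms (g: [0,1] -> [0, +inf),
# decreasing, g(1) = 0); the loss induced by a formula with fuzzy
# truth degree t is simply g(t).
def gen_product(t):      # product t-norm: g(x) = -log x
    return -torch.log(t.clamp_min(1e-12))

def gen_lukasiewicz(t):  # Lukasiewicz t-norm: g(x) = 1 - x
    return 1.0 - t

def supervision_loss(p, y, gen):
    """Loss induced by the supervision formula stating that the
    prediction p agrees with the label y. With the product generator
    this reduces to binary cross-entropy: -y log p - (1-y) log(1-p)."""
    truth = torch.where(y > 0.5, p, 1.0 - p)  # fuzzy truth of the formula
    return gen(truth).mean()

p = torch.tensor([0.9, 0.2])   # predictions
y = torch.tensor([1.0, 0.0])   # targets
bce_like = supervision_loss(p, y, gen_product)
```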
Concept Embedding Models: Beyond the Accuracy-Explainability Trade-Off
Deploying AI-powered systems requires trustworthy models supporting effective human interactions, going beyond raw prediction accuracy. Concept bottleneck models promote trustworthiness by conditioning classification tasks on an intermediate level of human-like concepts. This enables human interventions which can correct mispredicted concepts to improve the model's performance. However, existing concept bottleneck models are unable to find optimal compromises between high task accuracy, robust concept-based explanations, and effective interventions on concepts, particularly in real-world conditions where complete and accurate concept supervisions are scarce. To address this, we propose Concept Embedding Models, a novel family of concept bottleneck models which goes beyond the current accuracy-vs-interpretability trade-off by learning interpretable high-dimensional concept representations. Our experiments demonstrate that Concept Embedding Models (1) attain better or competitive task accuracy w.r.t. standard neural models without concepts, (2) provide concept representations capturing meaningful semantics, including and beyond their ground-truth labels, (3) support test-time concept interventions whose effect on test accuracy surpasses that in standard concept bottleneck models, and (4) scale to real-world conditions where complete concept supervisions are scarce.
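A hedged sketch of the concept embedding bottleneck the abstract describes: each concept mixes an "active" and an "inactive" embedding via its predicted probability, which is also the hook for test-time interventions. Layer names and dimensions are illustrative, not the authors' reference implementation.

```python
import torch
import torch.nn as nn

class ConceptEmbeddingLayer(nn.Module):
    """Each concept i gets an 'active' and an 'inactive' embedding
    computed from the input features, mixed by the predicted concept
    probability; the mixed embeddings feed the task predictor."""

    def __init__(self, in_dim, n_concepts, emb_dim):
        super().__init__()
        self.pos = nn.ModuleList(nn.Linear(in_dim, emb_dim) for _ in range(n_concepts))
        self.neg = nn.ModuleList(nn.Linear(in_dim, emb_dim) for _ in range(n_concepts))
        self.score = nn.Linear(2 * emb_dim, 1)

    def forward(self, h, interventions=None):
        embs, probs = [], []
        for i in range(len(self.pos)):
            c_pos = torch.relu(self.pos[i](h))
            c_neg = torch.relu(self.neg[i](h))
            p = torch.sigmoid(self.score(torch.cat([c_pos, c_neg], dim=-1)))
            if interventions is not None and i in interventions:
                # test-time intervention: overwrite the predicted probability
                p = torch.full_like(p, interventions[i])
            embs.append(p * c_pos + (1 - p) * c_neg)
            probs.append(p)
        # concatenated mixed embeddings + concept probabilities
        return torch.cat(embs, dim=-1), torch.cat(probs, dim=-1)
```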
Integrating Learning and Reasoning with Deep Logic Models
Deep learning is very effective at jointly learning feature representations and classification models, especially when dealing with high-dimensional input patterns. Probabilistic logic reasoning, on the other hand, is capable of making consistent and robust decisions in complex environments. The integration of deep learning and logic reasoning is still an open research problem, and it is considered key to the development of truly intelligent agents. This paper presents Deep Logic Models, which are deep graphical models integrating deep learning and logic reasoning for both learning and inference. Deep Logic Models create an end-to-end differentiable architecture, where deep learners are embedded into a network implementing a continuous relaxation of the logic knowledge. The learning process jointly learns the weights of the deep learners and the meta-parameters controlling the high-level reasoning. The experimental results show that the proposed methodology overcomes the limitations of other approaches that have been proposed to bridge deep learning and reasoning.
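A hedged sketch of the end-to-end scheme: a deep learner produces per-atom scores, and a differentiable layer scores a continuous relaxation of a logic rule whose weight is trained jointly with the network. The rule A -> B, the product-logic relaxation, and all names are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class DeepLogicModelSketch(nn.Module):
    """Deep learner + differentiable relaxation of the rule A -> B,
    with the rule weight learned as a meta-parameter (illustrative)."""

    def __init__(self, in_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 2), nn.Sigmoid())
        self.rule_weight = nn.Parameter(torch.tensor(1.0))  # meta-parameter

    def forward(self, x, y=None):
        p = self.net(x)                      # p[:, 0] = P(A), p[:, 1] = P(B)
        # product-logic relaxation of A -> B: 1 - P(A) * (1 - P(B))
        rule_sat = 1.0 - p[:, 0] * (1.0 - p[:, 1])
        logic_loss = -self.rule_weight.abs() * torch.log(rule_sat + 1e-12).mean()
        if y is None:
            return p
        sup_loss = nn.functional.binary_cross_entropy(p, y)
        return p, sup_loss + logic_loss      # supervision + relaxed logic
```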