Probing the topological properties of complex networks modeling short written texts
In recent years, graph theory has been widely employed to probe several
language properties. More specifically, the so-called word adjacency model has
been proven useful for tackling several practical problems, especially those
relying on textual stylistic analysis. The most common approach to treat texts
as networks has simply considered either large pieces of texts or entire books.
This approach has certainly worked well -- many informative discoveries have
been made this way -- but it raises an uncomfortable question: could there be
important topological patterns in small pieces of texts? To address this
problem, the topological properties of subtexts sampled from entire books were
probed. Statistical analyses performed on a dataset comprising 50 novels
revealed that most of the traditional topological measurements are stable for
short subtexts. When the performance of the authorship recognition task was
analyzed, it was found that a proper sampling yields a discriminability similar
to the one found with full texts. Surprisingly, the support vector machine
classification based on the characterization of short texts outperformed the
one performed with entire books. These findings suggest that a local
topological analysis of large documents might improve their global
characterization. Most importantly, it was verified, as a proof of principle,
that short texts can be analyzed with the methods and concepts of complex
networks. As a consequence, the techniques described here can be extended in a
straightforward fashion to analyze texts as time-varying complex networks.
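As an illustration of the word adjacency model mentioned above, here is a minimal sketch in Python. It assumes whitespace tokenization, lowercasing, undirected edges, and fixed-size non-overlapping subtexts; the abstract does not specify the preprocessing or sampling scheme actually used, so the function names and choices below are hypothetical.

```python
from collections import defaultdict


def word_adjacency_network(text):
    """Build an undirected graph linking words that occur next to each other.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    words = text.lower().split()
    graph = defaultdict(set)
    for first, second in zip(words, words[1:]):
        if first != second:  # ignore self-loops from repeated words
            graph[first].add(second)
            graph[second].add(first)
    return graph


def clustering(graph, node):
    """Local clustering coefficient: fraction of neighbor pairs that are linked."""
    neighbors = list(graph[node])
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if neighbors[j] in graph[neighbors[i]]
    )
    return 2.0 * links / (k * (k - 1))


def sample_subtexts(text, size):
    """Split a text into consecutive, non-overlapping subtexts of `size` words."""
    words = text.split()
    return [
        " ".join(words[i:i + size])
        for i in range(0, len(words) - size + 1, size)
    ]
```

Each subtext returned by `sample_subtexts` can be passed to `word_adjacency_network`, and measurements such as degree (`len(graph[word])`) or `clustering` can then be compared across subtexts and against the full text.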
Ranking to Learn: Feature Ranking and Selection via Eigenvector Centrality
In an era where accumulating data is easy and storing it inexpensive, feature
selection plays a central role in helping to reduce the high dimensionality of
huge amounts of otherwise meaningless data. In this paper, we propose a
graph-based method for feature selection that ranks features by identifying the
most important ones into an arbitrary set of cues. Mapping the problem onto an
affinity graph, where features are the nodes, the solution is given by assessing
the importance of nodes through some indicators of centrality, in particular,
the Eigenvector Centrality (EC). The gist of EC is to estimate the importance
of a feature as a function of the importance of its neighbors. Ranking central
nodes identifies candidate features, which turn out to be effective from a
classification point of view, as shown by a thorough experimental section.
Our approach has been tested on 7 diverse datasets from recent literature
(e.g., biological data and object recognition, among others), and compared
against filter, embedded, and wrapper methods. The results are remarkable in
terms of accuracy, stability and low execution time.
Comment: Preprint version - Lecture Notes in Computer Science - Springer 201
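The idea of ranking features by eigenvector centrality can be sketched as follows. This is a minimal illustration using plain power iteration, not the authors' implementation. It assumes a symmetric affinity matrix with non-negative entries describing a connected, non-bipartite graph (otherwise plain power iteration may fail to converge). How the affinity matrix is built is not described in the abstract, so it is taken as a given input here, and the function names are hypothetical.

```python
def eigenvector_centrality(affinity, iterations=1000, tol=1e-10):
    """Approximate the dominant eigenvector of `affinity` by power iteration.

    Assumption: `affinity` is a square list of lists with non-negative entries.
    """
    n = len(affinity)
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # A node's new score is the affinity-weighted sum of its neighbors' scores.
        updated = [
            sum(affinity[i][j] * scores[j] for j in range(n)) for i in range(n)
        ]
        norm = sum(value * value for value in updated) ** 0.5
        if norm == 0.0:
            return scores
        updated = [value / norm for value in updated]
        if max(abs(a - b) for a, b in zip(updated, scores)) < tol:
            return updated
        scores = updated
    return scores


def rank_features(affinity):
    """Return feature indices sorted from most to least central."""
    scores = eigenvector_centrality(affinity)
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

Selecting the first few indices returned by `rank_features` then gives a candidate feature subset to pass to a classifier.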
On Sub-Propositional Fragments of Modal Logic
In this paper, we consider the well-known modal logics ,
, , and , and we study some of their
sub-propositional fragments, namely the classical Horn fragment, the Krom
fragment, the so-called core fragment, defined as the intersection of the Horn
and the Krom fragments, plus their sub-fragments obtained by limiting the use
of boxes and diamonds in clauses. We focus, first, on the relative expressive
power of such languages: we introduce a suitable measure of expressive power,
and we obtain a complex hierarchy that encompasses all fragments of the
considered logics. Then, after observing the low expressive power, in
particular, of the Horn fragments without diamonds, we study the computational
complexity of their satisfiability problem, proving that, in general, it
becomes polynomial.
Continuous Average Straightness in Spatial Graphs
The Straightness is a measure designed to characterize a pair of vertices in
a spatial graph. It is defined as the ratio of the Euclidean distance to the
graph distance between these vertices. It is often used as an average, for
instance to describe the accessibility of a single vertex relative to all the
other vertices in the graph, or even to summarize the graph as a whole. In some
cases, one needs to process the Straightness between not only vertices, but
also any other points constituting the graph of interest. Suppose for instance
that our graph represents a road network and we do not want to limit ourselves
to crossroad-to-crossroad itineraries, but allow any street number to be a
starting point or destination. In this situation, the standard approach
consists of: 1) discretizing the graph edges, 2) processing the
vertex-to-vertex Straightness considering the additional vertices resulting
from this discretization, and 3) performing the appropriate average on the
obtained values. However, this discrete approximation can be computationally
expensive on large graphs, and its precision has not been clearly assessed. In
this article, we adopt a continuous approach to average the Straightness over
the edges of spatial graphs. This allows us to derive 5 distinct measures able
to characterize precisely the accessibility of the whole graph, as well as
individual vertices and edges. Our method is generic and could be applied to
other measures designed for spatial graphs. We perform an experimental
evaluation of our continuous average Straightness measures, and show how they
behave differently from the traditional vertex-to-vertex ones. Moreover, we
also study their discrete approximations, and show that our approach is
globally less demanding in terms of both processing time and memory usage. Our
R source code is publicly available under an open source license.
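The traditional vertex-to-vertex Straightness defined above (Euclidean distance divided by graph distance) can be sketched as follows. This is a minimal Python illustration, not the authors' R code, and it does not reproduce the continuous averages derived in the article. It assumes that edge lengths equal the Euclidean distance between their endpoints and that unreachable vertices are simply skipped in the average; both choices, and the function names, are assumptions.

```python
import heapq
import math


def graph_distances(coords, edges, source):
    """Shortest-path distances from `source` (Dijkstra), using Euclidean edge lengths."""
    adjacency = {vertex: [] for vertex in coords}
    for u, v in edges:
        length = math.dist(coords[u], coords[v])
        adjacency[u].append((v, length))
        adjacency[v].append((u, length))
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, length in adjacency[u]:
            candidate = d + length
            if candidate < dist.get(v, math.inf):
                dist[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return dist


def average_straightness(coords, edges, source):
    """Mean of Euclidean distance / graph distance from `source` to reachable vertices."""
    dist = graph_distances(coords, edges, source)
    ratios = [
        math.dist(coords[source], coords[v]) / d
        for v, d in dist.items()
        if v != source and d > 0
    ]
    return sum(ratios) / len(ratios) if ratios else 0.0
```

On a unit square whose four corners are connected in a cycle, adjacent corners have Straightness 1, while opposite corners have Straightness sqrt(2)/2 because the graph path has length 2.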
Evolutionary Multi-Objective Design of SARS-CoV-2 Protease Inhibitor Candidates
Computational drug design based on artificial intelligence is an emerging
research area. At the time of writing this paper, the world suffers from an
outbreak of the coronavirus SARS-CoV-2. A promising way to stop the virus
replication is via protease inhibition. We propose an evolutionary
multi-objective algorithm (EMOA) to design potential protease inhibitors for
SARS-CoV-2's main protease. Based on the SELFIES representation, the EMOA
maximizes the binding of candidate ligands to the protein using the docking
tool QuickVina 2, while at the same time taking into account further objectives
like drug-likeness or the fulfillment of filter constraints. The experimental
part analyzes the evolutionary process and discusses the inhibitor candidates.
Comment: 15 pages, 7 figures, submitted to PPSN 202
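Multi-objective optimization of this kind typically relies on Pareto dominance to compare candidates that trade off several objectives. The sketch below shows that generic concept only; the abstract does not say which selection scheme the EMOA uses, and the objective tuples in the usage note are made-up toy values, not docking results. All objectives are assumed to be minimized.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` on every objective and better on at least one.

    Assumption: every objective is minimized.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Return the points that are not dominated by any other point."""
    return [
        p for p in points
        if not any(dominates(q, p) for q in points if q != p)
    ]
```

For example, with hypothetical (score, penalty) pairs `[(-7.0, 0.4), (-6.0, 0.1), (-5.0, 0.3)]`, the last candidate is dominated by the second one, so only the first two remain on the front.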
Modal Logics with Hard Diamond-free Fragments
We investigate the complexity of modal satisfiability for certain
combinations of modal logics. In particular, we examine four examples of
multimodal logics with dependencies and demonstrate that even if we restrict
our inputs to diamond-free formulas (in negation normal form), these logics
still have a high complexity. This result illustrates that having D as one or
more of the combined logics, as well as the interdependencies among logics, can
be important sources of complexity even in the absence of diamonds and even
when at the same time in our formulas we allow only one propositional variable.
We then further investigate and characterize the complexity of the
diamond-free, 1-variable fragments of multimodal logics in a general setting.
Comment: New version: improvements and corrections according to reviewers'
comments. Accepted at LFCS 201