
    Probing the topological properties of complex networks modeling short written texts

    In recent years, graph theory has been widely employed to probe several language properties. More specifically, the so-called word adjacency model has proven useful for tackling several practical problems, especially those relying on textual stylistic analysis. The most common approach to treating texts as networks has simply considered either large pieces of text or entire books. This approach has certainly worked well -- many informative discoveries have been made this way -- but it raises an uncomfortable question: could there be important topological patterns in small pieces of text? To address this problem, the topological properties of subtexts sampled from entire books were probed. Statistical analyses performed on a dataset comprising 50 novels revealed that most of the traditional topological measurements are stable for short subtexts. When the performance of the authorship recognition task was analyzed, it was found that a proper sampling yields a discriminability similar to the one found with full texts. Surprisingly, the support vector machine classification based on the characterization of short texts outperformed the one performed with entire books. These findings suggest that a local topological analysis of large documents might improve its global characterization. Most importantly, it was verified, as a proof of principle, that short texts can be analyzed with the methods and concepts of complex networks. As a consequence, the techniques described here can be extended in a straightforward fashion to analyze texts as time-varying complex networks.
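
The word adjacency model mentioned above can be sketched in a few lines. This is a minimal illustration only, not the authors' pipeline: it assumes whitespace tokenization, undirected and unweighted links between consecutive words, and non-overlapping fixed-size subtexts. The function names (`subtexts`, `build_network`, `clustering`) are hypothetical.

```python
from collections import defaultdict


def subtexts(words, size):
    """Split a token list into consecutive, non-overlapping subtexts of `size` tokens.

    Assumption: a trailing remainder shorter than `size` is dropped.
    """
    return [words[i:i + size] for i in range(0, len(words) - size + 1, size)]


def build_network(words):
    """Word adjacency network: link every pair of distinct consecutive words."""
    graph = defaultdict(set)
    for a, b in zip(words, words[1:]):
        if a != b:  # ignore immediate repetitions (no self-loops)
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def clustering(graph, node):
    """Local clustering coefficient: fraction of neighbor pairs that are linked."""
    neighbors = list(graph[node])
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if neighbors[j] in graph[neighbors[i]]
    )
    return 2.0 * links / (k * (k - 1))


# Toy usage: one network per subtext, then a topological measurement per word.
tokens = "the cat sat on the mat the cat ran".split()
networks = [build_network(chunk) for chunk in subtexts(tokens, 4)]
```

Measurements such as degree or clustering computed per subtext could then be compared against the same measurements on the full text, which is the kind of stability check the abstract describes.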

    Ranking to Learn: Feature Ranking and Selection via Eigenvector Centrality

    In an era where accumulating data is easy and storing it inexpensive, feature selection plays a central role in helping to reduce the high dimensionality of huge amounts of otherwise meaningless data. In this paper, we propose a graph-based method for feature selection that ranks features by identifying the most important ones within an arbitrary set of cues. Mapping the problem onto an affinity graph, where features are the nodes, the solution is given by assessing the importance of nodes through some indicators of centrality, in particular the Eigenvector Centrality (EC). The gist of EC is to estimate the importance of a feature as a function of the importance of its neighbors. Ranking central nodes identifies candidate features, which turn out to be effective from a classification point of view, as shown by a thorough experimental section. Our approach has been tested on 7 diverse datasets from recent literature (e.g., biological data and object recognition, among others), and compared against filter, embedded and wrapper methods. The results are remarkable in terms of accuracy, stability and low execution time. Comment: Preprint version - Lecture Notes in Computer Science - Springer 201
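
The idea of scoring a feature by the importance of its neighbors can be illustrated with power iteration on an affinity matrix. This is a hedged sketch: the affinity values below are a hypothetical toy example, not the affinity construction used in the paper, and the function names are illustrative.

```python
def eigenvector_centrality(affinity, iterations=1000, tol=1e-9):
    """Approximate the leading eigenvector of a symmetric, non-negative
    affinity matrix by power iteration (unit Euclidean norm)."""
    n = len(affinity)
    v = [1.0 / n] * n
    for _ in range(iterations):
        # Each node's new score is the affinity-weighted sum of its neighbors' scores.
        w = [sum(affinity[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:  # all-zero matrix: nothing to rank
            return v
        w = [x / norm for x in w]
        if max(abs(a - b) for a, b in zip(w, v)) < tol:
            return w
        v = w
    return v


def rank_features(affinity):
    """Feature indices sorted from most to least central."""
    scores = eigenvector_centrality(affinity)
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Hypothetical affinities between three features (nodes of the graph).
toy_affinity = [
    [0.0, 0.9, 0.8],
    [0.9, 0.0, 0.1],
    [0.8, 0.1, 0.0],
]
ranking = rank_features(toy_affinity)
```

In this toy graph, feature 0 is strongly tied to both other features, so it ends up first in the ranking; a selection step would then keep the top-ranked indices.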

    On Sub-Propositional Fragments of Modal Logic

    In this paper, we consider the well-known modal logics $\mathbf{K}$, $\mathbf{T}$, $\mathbf{K4}$, and $\mathbf{S4}$, and we study some of their sub-propositional fragments, namely the classical Horn fragment, the Krom fragment, the so-called core fragment, defined as the intersection of the Horn and the Krom fragments, plus their sub-fragments obtained by limiting the use of boxes and diamonds in clauses. We focus, first, on the relative expressive power of such languages: we introduce a suitable measure of expressive power, and we obtain a complex hierarchy that encompasses all fragments of the considered logics. Then, after observing the low expressive power, in particular, of the Horn fragments without diamonds, we study the computational complexity of their satisfiability problem, proving that, in general, it becomes polynomial.

    Continuous Average Straightness in Spatial Graphs

    The Straightness is a measure designed to characterize a pair of vertices in a spatial graph. It is defined as the ratio of the Euclidean distance to the graph distance between these vertices. It is often used as an average, for instance to describe the accessibility of a single vertex relative to all the other vertices in the graph, or even to summarize the graph as a whole. In some cases, one needs to process the Straightness between not only vertices, but also any other points constituting the graph of interest. Suppose for instance that our graph represents a road network and we do not want to limit ourselves to crossroad-to-crossroad itineraries, but allow any street number to be a starting point or destination. In this situation, the standard approach consists of: 1) discretizing the graph edges, 2) processing the vertex-to-vertex Straightness considering the additional vertices resulting from this discretization, and 3) performing the appropriate average on the obtained values. However, this discrete approximation can be computationally expensive on large graphs, and its precision has not been clearly assessed. In this article, we adopt a continuous approach to average the Straightness over the edges of spatial graphs. This allows us to derive 5 distinct measures able to characterize precisely the accessibility of the whole graph, as well as individual vertices and edges. Our method is generic and could be applied to other measures designed for spatial graphs. We perform an experimental evaluation of our continuous average Straightness measures, and show how they behave differently from the traditional vertex-to-vertex ones. Moreover, we also study their discrete approximations, and show that our approach is globally less demanding in terms of both processing time and memory usage. Our R source code is publicly available under an open source license.
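
The vertex-to-vertex definition above (Euclidean distance divided by graph distance) can be sketched as follows. This covers only the traditional discrete measure, not the continuous averages derived in the article, and it is written in Python rather than the authors' R. It assumes that edge lengths equal the Euclidean distance between their endpoints; all function names are hypothetical.

```python
import heapq
import math


def graph_distances(coords, edges, source):
    """Shortest-path lengths from `source` (Dijkstra), with each edge weighted
    by the Euclidean distance between its endpoints."""
    adj = {v: [] for v in coords}
    for u, v in edges:
        w = math.dist(coords[u], coords[v])
        adj[u].append((v, w))
        adj[v].append((u, w))
    dist = {v: math.inf for v in coords}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def straightness(coords, edges, u, v):
    """Euclidean distance over graph distance for two distinct vertices."""
    return math.dist(coords[u], coords[v]) / graph_distances(coords, edges, u)[v]


def average_straightness(coords, edges, u):
    """Mean Straightness between `u` and every other reachable vertex."""
    dist = graph_distances(coords, edges, u)
    others = [v for v in coords if v != u and dist[v] < math.inf]
    if not others:
        return 0.0
    return sum(math.dist(coords[u], coords[v]) / dist[v] for v in others) / len(others)


# Toy road network: an L-shaped path a - b - c.
coords = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (1.0, 1.0)}
edges = [("a", "b"), ("b", "c")]
```

On this toy network the pair (a, b) is perfectly straight, while reaching c from a requires a detour around the corner, which lowers the ratio below 1.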

    Evolutionary Multi-Objective Design of SARS-CoV-2 Protease Inhibitor Candidates

    Computational drug design based on artificial intelligence is an emerging research area. At the time of writing this paper, the world suffers from an outbreak of the coronavirus SARS-CoV-2. A promising way to stop the virus replication is via protease inhibition. We propose an evolutionary multi-objective algorithm (EMOA) to design potential protease inhibitors for SARS-CoV-2's main protease. Based on the SELFIES representation, the EMOA maximizes the binding of candidate ligands to the protein using the docking tool QuickVina 2, while at the same time taking into account further objectives like drug-likeliness or the fulfillment of filter constraints. The experimental part analyzes the evolutionary process and discusses the inhibitor candidates. Comment: 15 pages, 7 figures, submitted to PPSN 202
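
A core step in many multi-objective evolutionary algorithms is keeping the candidates that no other candidate dominates. The sketch below shows that generic Pareto filter only; it is not the EMOA from the paper, it does not call QuickVina 2 or SELFIES, and the objective tuples are made-up numbers with every objective assumed to be minimized.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` in every objective and strictly
    better in at least one (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(population):
    """Keep the objective vectors that no other member of the population dominates."""
    return [
        p
        for p in population
        if not any(dominates(q, p) for q in population if q is not p)
    ]


# Hypothetical (objective_1, objective_2) pairs for four candidates.
candidates = [(1, 5), (2, 2), (3, 3), (5, 1)]
front = non_dominated(candidates)
```

Here `(3, 3)` is dropped because `(2, 2)` is better in both objectives, while the remaining three candidates represent different trade-offs.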

    Modal Logics with Hard Diamond-free Fragments

    We investigate the complexity of modal satisfiability for certain combinations of modal logics. In particular, we examine four examples of multimodal logics with dependencies and demonstrate that even if we restrict our inputs to diamond-free formulas (in negation normal form), these logics still have a high complexity. This result illustrates that having D as one or more of the combined logics, as well as the interdependencies among logics, can be important sources of complexity even in the absence of diamonds and even when we allow only one propositional variable in our formulas. We then further investigate and characterize the complexity of the diamond-free, 1-variable fragments of multimodal logics in a general setting. Comment: New version: improvements and corrections according to reviewers' comments. Accepted at LFCS 201