11,441 research outputs found

    A Survey on Graph Kernels

    Get PDF
    Graph kernels have become an established and widely-used technique for solving classification tasks on graphs. This survey gives a comprehensive overview of techniques for kernel-based graph classification developed in the past 15 years. We describe and categorize graph kernels based on properties inherent to their design, such as the nature of their extracted graph features, their method of computation and their applicability to problems in practice. In an extensive experimental evaluation, we study the classification accuracy of a large suite of graph kernels on established benchmarks as well as new datasets. We compare the performance of popular kernels with several baseline methods and study the effect of applying a Gaussian RBF kernel to the metric induced by a graph kernel. In doing so, we find that simple baselines become competitive after this transformation on some datasets. Moreover, we study the extent to which existing graph kernels agree in their predictions (and prediction errors) and obtain a data-driven categorization of kernels as result. Finally, based on our experimental results, we derive a practitioner's guide to kernel-based graph classification

    Regularized Regression Problem in hyper-RKHS for Learning Kernels

    Full text link
    This paper generalizes the two-stage kernel learning framework, illustrates its utility for kernel learning and out-of-sample extensions, and proves {asymptotic} convergence results for the introduced kernel learning model. Algorithmically, we extend target alignment by hyper-kernels in the two-stage kernel learning framework. The associated kernel learning task is formulated as a regression problem in a hyper-reproducing kernel Hilbert space (hyper-RKHS), i.e., learning on the space of kernels itself. To solve this problem, we present two regression models with bivariate forms in this space, including kernel ridge regression (KRR) and support vector regression (SVR) in the hyper-RKHS. By doing so, it provides significant model flexibility for kernel learning with outstanding performance in real-world applications. Specifically, our kernel learning framework is general, that is, the learned underlying kernel can be positive definite or indefinite, which adapts to various requirements in kernel learning. Theoretically, we study the convergence behavior of these learning algorithms in the hyper-RKHS and derive the learning rates. Different from the traditional approximation analysis in RKHS, our analyses need to consider the non-trivial independence of pairwise samples and the characterisation of hyper-RKHS. To the best of our knowledge, this is the first work in learning theory to study the approximation performance of regularized regression problem in hyper-RKHS.Comment: 25 pages, 3 figure

    Cultural influences on word meanings revealed through large-scale semantic alignment

    Get PDF
    If the structure of language vocabularies mirrors the structure of natural divisions that are universally perceived, then the meanings of words in different languages should closely align. By contrast, if shared word meanings are a product of shared culture, history and geography, they may differ between languages in substantial but predictable ways. Here, we analysed the semantic neighbourhoods of 1,010 meanings in 41 languages. The most-aligned words were from semantic domains with high internal structure (number, quantity and kinship). Words denoting natural kinds, common actions and artefacts aligned much less well. Languages that are more geographically proximate, more historically related and/or spoken by more-similar cultures had more aligned word meanings. These results provide evidence that the meanings of common words vary in ways that reflect the culture, history and geography of their users

    Improved Coreference Resolution Using Cognitive Insights

    Get PDF
    Coreference resolution is the task of extracting referential expressions, or mentions, in text and clustering these by the entity or concept they refer to. The sustained research interest in the task reflects the richness of reference expression usage in natural language and the difficulty in encoding insights from linguistic and cognitive theories effectively. In this thesis, we design and implement LIMERIC, a state-of-the-art coreference resolution engine. LIMERIC naturally incorporates both non-local decoding and entity-level modelling to achieve the highly competitive benchmark performance of 64.22% and 59.99% on the CoNLL-2012 benchmark with a simple model and a baseline feature set. As well as strong performance, a key contribution of this work is a reconceptualisation of the coreference task. We draw an analogy between shift-reduce parsing and coreference resolution to develop an algorithm which naturally mimics cognitive models of human discourse processing. In our feature development work, we leverage insights from cognitive theories to improve our modelling. Each contribution achieves statistically significant improvements and sum to gains of 1.65% and 1.66% on the CoNLL-2012 benchmark, yielding performance values of 65.76% and 61.27%. For each novel feature we propose, we contribute an accompanying analysis so as to better understand how cognitive theories apply to real language data. LIMERIC is at once a platform for exploring cognitive insights into coreference and a viable alternative to current systems. We are excited by the promise of incorporating our and further cognitive insights into more complex frameworks since this has the potential to both improve the performance of computational models, as well as our understanding of the mechanisms underpinning human reference resolution

    Randomized low-rank approximation for symmetric indefinite matrices

    Get PDF
    The Nystr\"om method is a popular choice for finding a low-rank approximation to a symmetric positive semi-definite matrix. The method can fail when applied to symmetric indefinite matrices, for which the error can be unboundedly large. In this work, we first identify the main challenges in finding a Nystr\"om approximation to symmetric indefinite matrices. We then prove the existence of a variant that overcomes the instability, and establish relative-error nuclear norm bounds of the resulting approximation that hold when the singular values decay rapidly. The analysis naturally leads to a practical algorithm, whose robustness is illustrated with experiments.Comment: 24 pages, 6 figure

    The adoption of e-business technology by SMEs

    Get PDF
    The paper examines the key factors influencing the adoption ofe-business technology by SMEs. To this end, the paper draws on a rangeof literatures on the diffusion of new information and communicationtechnologies (ICTs), many of which have hitherto been treated asseparate. The reasons for this are two-fold. First, e-businesstechnologies are the latest in a line of new ICT technologies. Whenexploited successfully, ICTs have increased firm competitiveness eitherby raising the efficiency of internal communication and organisationand/or supply chain relationships, or by facilitating the development ofnew/improved products and services. Second, it is hypothesised that manyof the factors affecting the successful adoption of new technologies aregeneric in nature. With regards to SMEs specifically, consideration ofearlier research may assist us in identifying a set of enablers andbarriers to e-business adoption. Hence, by explicitly acknowledging thecontext and prior history of research in the area, we are able to mapout the dimensions of future theoretical and empirical research ine-business adoption by SMEs. In addition to drawing together factors identified by existing research,the paper highlights the implications of network externalities for thetiming of technology investments and the returns that accrue to earlyand late adopters. It also draws attention to a number of problemsassociated with the analytical concept of ‘the SME’ when it is appliedto this area. The research proceeds by clearly defining thetechnological and organisational characteristics of the e-business modeland a brief consideration of the trends in adoption in the UK vis-à-visadoption in the other G7 countries. Together these set up a detailedconsideration of the internal and external factors influencing adoption.A qualitative approach, in the form of a detailed case study, is thenused to explore the potential usefulness of the factors that have beenidentified. The results of these findings are then drawn together in theconcluding section of the paper.economics of technology ;

    Tree Edit Distance Learning via Adaptive Symbol Embeddings

    Full text link
    Metric learning has the aim to improve classification accuracy by learning a distance measure which brings data points from the same class closer together and pushes data points from different classes further apart. Recent research has demonstrated that metric learning approaches can also be applied to trees, such as molecular structures, abstract syntax trees of computer programs, or syntax trees of natural language, by learning the cost function of an edit distance, i.e. the costs of replacing, deleting, or inserting nodes in a tree. However, learning such costs directly may yield an edit distance which violates metric axioms, is challenging to interpret, and may not generalize well. In this contribution, we propose a novel metric learning approach for trees which we call embedding edit distance learning (BEDL) and which learns an edit distance indirectly by embedding the tree nodes as vectors, such that the Euclidean distance between those vectors supports class discrimination. We learn such embeddings by reducing the distance to prototypical trees from the same class and increasing the distance to prototypical trees from different classes. In our experiments, we show that BEDL improves upon the state-of-the-art in metric learning for trees on six benchmark data sets, ranging from computer science over biomedical data to a natural-language processing data set containing over 300,000 nodes.Comment: Paper at the International Conference of Machine Learning (2018), 2018-07-10 to 2018-07-15 in Stockholm, Swede

    Innovation for resilience

    Get PDF
    Last decade is characterized by different types of crises and shocs in the socioeconomic systems, creating a turbulent context and calling for a better understanding of what the dynamic perspective of change is. For countries, regions and cities a better understanding of governance urges and calls for action
    corecore