2,650 research outputs found

    Multiplex Communities and the Emergence of International Conflict

    Full text link
    Advances in community detection reveal new insights into multiplex and multilayer networks. Less work, however, investigates the relationship between these communities and outcomes in social systems. We leverage these advances to shed light on the relationship between the cooperative mesostructure of the international system and the onset of interstate conflict. We detect communities based upon weaker signals of affinity expressed in United Nations votes and speeches, as well as stronger signals observed across multiple layers of bilateral cooperation. Communities of diplomatic affinity display an expected negative relationship with conflict onset. Ties in communities based upon observed cooperation, however, display no effect under a standard model specification and a positive relationship with conflict under an alternative specification. These results align with some extant hypotheses but also point to a paucity in our understanding of the relationship between community structure and behavioral outcomes in networks.Comment: arXiv admin note: text overlap with arXiv:1802.0039

    ParaPhraser: Russian paraphrase corpus and shared task

    Get PDF
    The paper describes the results of the First Russian Paraphrase Detection Shared Task held in St.-Petersburg, Russia, in October 2016. Research in the area of paraphrase extraction, detection and generation has been successfully developing for a long time while there has been only a recent surge of interest towards the problem in the Russian community of computational linguistics. We try to overcome this gap by introducing the project ParaPhraser.ru dedicated to the collection of Russian paraphrase corpus and organizing a Paraphrase Detection Shared Task, which uses the corpus as the training data. The participants of the task applied a wide variety of techniques to the problem of paraphrase detection, from rule-based approaches to deep learning, and results of the task reflect the following tendencies: the best scores are obtained by the strategy of using traditional classifiers combined with fine-grained linguistic features, however, complex neural networks, shallow methods and purely technical methods also demonstrate competitive results.Peer reviewe

    Multimedia information technology and the annotation of video

    Get PDF
    The state of the art in multimedia information technology has not progressed to the point where a single solution is available to meet all reasonable needs of documentalists and users of video archives. In general, we do not have an optimistic view of the usability of new technology in this domain, but digitization and digital power can be expected to cause a small revolution in the area of video archiving. The volume of data leads to two views of the future: on the pessimistic side, overload of data will cause lack of annotation capacity, and on the optimistic side, there will be enough data from which to learn selected concepts that can be deployed to support automatic annotation. At the threshold of this interesting era, we make an attempt to describe the state of the art in technology. We sample the progress in text, sound, and image processing, as well as in machine learning

    ISTRAŽIVANJE O POVEZIVANJU ENTITETA ZA SPECIFIČNE DOMENE S HETEROGENIM INFORMACIJSKIM MREŽAMA

    Get PDF
    Entity linking is a task of extracting information that links the mentioned entity in a collection of text with their similar knowledge base as well as it is the task of allocating unique identity to various entities such as locations, individuals and companies. Knowledgebase (KB) is used to optimize the information collection, organization and for retrieval of information. Heterogeneous information networks (HIN) comprises multiple-type interlinked objects with various types of relationship which are becoming increasingly most popular named bibliographic networks, social media networks as well including the typical relational database data. In HIN, there are various data objects are interconnected through various relations. The entity linkage determines the corresponding entities from unstructured web text, in the existing HIN. This work is the most important and it is the most challenge because of ambiguity and existing limited knowledge. Some HIN could be considered as a domain-specific KB. The current Entity Linking (EL) systems aimed towards corpora which contain heterogeneous as web information and it performs sub-optimally on the domain-specific corpora. The EL systems used one or more general or specific domains of linking such as DBpedia, Wikipedia, Freebase, IMDB, YAGO, Wordnet and MKB. This paper presents a survey on domain-specific entity linking with HIN. This survey describes with a deep understanding of HIN, which includes datasets,types and examples with related concepts.Povezivanje entiteta je zadatak izvlačenja podataka koji povezuju spomenuti entitet u zbirci teksta sa njihovom sličnom bazom znanja, kao i zadatak dodjeljivanja jedinstvenog identiteta različitim entitetima, kao Å”to su lokacije, pojedinci i tvrtke. Baza znanja (BZ) koristi se za optimizaciju prikupljanja, organizacije i pronalaženja informacija. Heterogene mreže informacija (HMI) obuhvaćaju viÅ”estruke međusobno povezane objekte različitih vrsta odnosa koji postaju sve popularniji i nazivaju se bibliografskim mrežama, mrežama druÅ”tvenih medija, uključujući tipične podatke relacijske baze podataka. U HMI-u postoje razni podaci koji su međusobno povezani kroz različite odnose. Povezanost entiteta određuje odgovarajuće entitete iz nestrukturiranog teksta na webu u postojećem HMI-u. Ovaj je rad najvažniji i najveći izazov zbog nejasnoće i postojećeg ograničenog znanja. Neki se HMI mogu smatrati BZ-om specifičnim za domenu. Trenutni sustav povezivanja entiteta (PE) usmjeren je prema korpusima koji sadrže heterogene informacije kao web informacije i oni djeluju suptimalno na korpusima specifičnim za domenu. PE sustavi koristili su jednu ili viÅ”e općih ili specifičnih domena povezivanja, kao Å”to su DBpedia, Wikipedia, Freebase, IMDB, YAGO, Wordnet i MKB. U ovom radu predstavljeno je istraživanje o povezivanju entiteta specifičnog za domenu sa HMI-om. Ovo istraživanje opisuje s dubokim razumijevanjem HMI-a, Å”to uključuje skupove podataka, vrste i primjere s povezanim konceptima

    Multi-Perspective Relevance Matching with Hierarchical ConvNets for Social Media Search

    Full text link
    Despite substantial interest in applications of neural networks to information retrieval, neural ranking models have only been applied to standard ad hoc retrieval tasks over web pages and newswire documents. This paper proposes MP-HCNN (Multi-Perspective Hierarchical Convolutional Neural Network) a novel neural ranking model specifically designed for ranking short social media posts. We identify document length, informal language, and heterogeneous relevance signals as features that distinguish documents in our domain, and present a model specifically designed with these characteristics in mind. Our model uses hierarchical convolutional layers to learn latent semantic soft-match relevance signals at the character, word, and phrase levels. A pooling-based similarity measurement layer integrates evidence from multiple types of matches between the query, the social media post, as well as URLs contained in the post. Extensive experiments using Twitter data from the TREC Microblog Tracks 2011--2014 show that our model significantly outperforms prior feature-based as well and existing neural ranking models. To our best knowledge, this paper presents the first substantial work tackling search over social media posts using neural ranking models.Comment: AAAI 2019, 10 page

    Paraphrastic Representations at Scale

    Full text link
    We present a system that allows users to train their own state-of-the-art paraphrastic sentence representations in a variety of languages. We also release trained models for English, Arabic, German, French, Spanish, Russian, Turkish, and Chinese. We train these models on large amounts of data, achieving significantly improved performance from the original papers proposing the methods on a suite of monolingual semantic similarity, cross-lingual semantic similarity, and bitext mining tasks. Moreover, the resulting models surpass all prior work on unsupervised semantic textual similarity, significantly outperforming even BERT-based models like Sentence-BERT (Reimers and Gurevych, 2019). Additionally, our models are orders of magnitude faster than prior work and can be used on CPU with little difference in inference speed (even improved speed over GPU when using more CPU cores), making these models an attractive choice for users without access to GPUs or for use on embedded devices. Finally, we add significantly increased functionality to the code bases for training paraphrastic sentence models, easing their use for both inference and for training them for any desired language with parallel data. We also include code to automatically download and preprocess training data.Comment: Published as a demo paper at EMNLP 202
    • ā€¦
    corecore