Search CORE

3,844 research outputs found

Recommended from our members

Verifying baselines for crisis event information classification on Twitter

Author: Crow Justin Michael
Publication venue: 'Virginia Tech Libraries'
Publication date: 01/01/2020
Field of study

Social media are rich information sources during and in the aftermath of crisis events such as earthquakes and terrorist attacks. Despite myriad challenges, with the right tools, significant insight can be gained which can assist emergency responders and related applications. However, most extant approaches are incomparable, using bespoke definitions, models, datasets and even evaluation metrics. Furthermore, it is rare that code, trained models, or exhaustive parametrisation details are made openly available. Thus, even confirmation of self-reported performance is problematic; authoritatively determining the state of the art (SOTA) is essentially impossible. Consequently, to begin addressing such endemic ambiguity, this paper seeks to make 3 contributions: 1) the replication and results confirmation of a leading (and generalisable) technique; 2) testing straightforward modifications of the technique likely to improve performance; and 3) the extension of the technique to a novel and complimentary type of crisis-relevant information to demonstrate it’s generalisability

Sussex Research Online

NLSC: Unrestricted Natural Language-based Service Composition through Sentence Embeddings

Author: Akoju Sushma A.
Dangi Ankit
Romero Oscar J.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 04/06/2019
Field of study

Current approaches for service composition (assemblies of atomic services) require developers to use: (a) domain-specific semantics to formalize services that restrict the vocabulary for their descriptions, and (b) translation mechanisms for service retrieval to convert unstructured user requests to strongly-typed semantic representations. In our work, we argue that effort to developing service descriptions, request translations, and matching mechanisms could be reduced using unrestricted natural language; allowing both: (1) end-users to intuitively express their needs using natural language, and (2) service developers to develop services without relying on syntactic/semantic description languages. Although there are some natural language-based service composition approaches, they restrict service retrieval to syntactic/semantic matching. With recent developments in Machine learning and Natural Language Processing, we motivate the use of Sentence Embeddings by leveraging richer semantic representations of sentences for service description, matching and retrieval. Experimental results show that service composition development effort may be reduced by more than 44\% while keeping a high precision/recall when matching high-level user requests with low-level service method invocations.Comment: This paper will appear on SCC'19 (IEEE International Conference on Services Computing) on July 1

arXiv.org e-Print Archive

Crossref

Information Extraction in Illicit Domains

Author: Banko M.
Bauer F.
Chakrabarti S.
Kushmerick N.
Mikolov T.
Sahlgren M.
Wick M.
Zouaq A.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 08/03/2017
Field of study

Extracting useful entities and attribute values from illicit domains such as human trafficking is a challenging problem with the potential for widespread social impact. Such domains employ atypical language models, have `long tails' and suffer from the problem of concept drift. In this paper, we propose a lightweight, feature-agnostic Information Extraction (IE) paradigm specifically designed for such domains. Our approach uses raw, unlabeled text from an initial corpus, and a few (12-120) seed annotations per domain-specific attribute, to learn robust IE models for unobserved pages and websites. Empirically, we demonstrate that our approach can outperform feature-centric Conditional Random Field baselines by over 18\% F-Measure on five annotated sets of real-world human trafficking datasets in both low-supervision and high-supervision settings. We also show that our approach is demonstrably robust to concept drift, and can be efficiently bootstrapped even in a serial computing environment.Comment: 10 pages, ACM WWW 201

arXiv.org e-Print Archive

Crossref

Recommended from our members

Detecting Important Life Events on Twitter Using Frequent Semantic and Syntactic Subgraphs

Author: Alani Harith
Briggs Pam
Dickinson Thomas
Fernandez Miriam
Mulholland Paul
Thomas Lisa
Publication venue
Publication date: 01/01/2016
Field of study

Identifying global events from social media has been the focus of much research in recent years. However, the identification of personal life events poses new requirements and challenges that have received relatively little research attention. In this paper we explore a new approach for life event identification, where we expand social media posts into both semantic, and syntactic networks of content. Frequent graph patterns are mined from these networks and used as features to enrich life-event classifiers. Results show that our approach significantly outperforms the best performing baseline in accuracy (by 4.48% points) and F-measure (by 4.54% points) when used to identify five major life events identified from the psychology literature: Getting Married, Having Children, Death of a Parent, Starting School, and Falling in Love. In addition, our results show that, while semantic graphs are effective at discriminating the theme of the post (e.g. the topic of marriage), syntactic graphs help identify whether the post describes a personal event (e.g. someone getting married)

Open Research Online (The Open University)

Dynamics of private social networks

Author: Mifsud Jonathan
Montebello Matthew
Publication venue: Malta Chamber of Scientists
Publication date: 01/01/2014
Field of study

Social networks, have been a significant turning point in ways individuals and companies interact. Various research has also revolved around public social networks, such as Twitter and Facebook. In most cases trying to understand what's happening in the network such predicting trends, and identifying natural phenomenon. Seeing the growth of public social networks several corporations have sought to build their own private networks to enable their staff to share knowledge, and expertise. Little research has been done in regards to the value private networks give to their stake holders. This is primarily due to the fact as their name implies, these networks are private, thus access to internal data is limited to a trusted few. This paper looks at a particular online private social network, and seeks to investigate the research possibilities made available, and how this can bring value to the organisation which runs the network. Notwithstanding the limitations of the network, this paper seeks to explore the connections graph between members of the network, as well as understanding the topics discussed within the network. The findings show that by visualising a social network one can assess the success or failure of their online networks. The Analysis conducted can also identify skill shortages within areas of the network, thus allowing corporations to take action and rectify any potential problems.peer-reviewe

OAR@UM