Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Authors
M Bain
B Heap
A Krzywicki
S Schmeidl
W Wobcke
Publication date
10 November 2018
Publisher
Springer Science and Business Media LLC
Abstract
© Springer Nature Switzerland AG 2018. Text documents often contain information relevant to a particular domain in short “snippets”. The social science field of peace and conflict studies is such a domain, where identifying, classifying and tracking drivers of conflict from text sources is important, and snippets are typically classified by human analysts using an ontology. One issue in automating this process is that snippets tend to contain infrequent “rare” terms which lack class-conditional evidence. In this work, we develop a method to enrich a bag-of-words model by complementing rare terms in the text to be classified with related terms from a Word Vector model. This method is then combined with standard linear text classification algorithms. By reducing sparseness in the bag-of-words representation, these enriched models perform better than the baseline classifiers. A second issue is to improve performance on “small” classes having only a few examples, and here we show that Paragraph Vectors outperform the enriched models.
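The enrichment idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy vectors, the document-frequency threshold used to decide what counts as "rare", the number of neighbours `k`, and every function name below are assumptions made purely for illustration. A real system would load a trained word vector model and pass the enriched counts to a linear classifier.

```python
import math
from collections import Counter

# Hypothetical toy word vectors; values are invented for this example only.
TOY_VECTORS = {
    "drought": [0.9, 0.1, 0.0],
    "famine": [0.8, 0.2, 0.1],
    "water": [0.7, 0.3, 0.0],
    "militia": [0.0, 0.9, 0.2],
    "insurgents": [0.1, 0.8, 0.3],
    "election": [0.1, 0.2, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_terms(term, vectors, k=2):
    """Return the k vocabulary terms most similar to `term` (excluding itself)."""
    if term not in vectors:
        return []
    scored = [
        (cosine(vectors[term], vec), other)
        for other, vec in vectors.items()
        if other != term
    ]
    scored.sort(reverse=True)
    return [other for _, other in scored[:k]]


def enrich_bag_of_words(tokens, doc_freq, vectors, rare_threshold=2, k=2):
    """Add related terms to the bag for every token considered rare.

    Assumption: a term is "rare" when its training-set document frequency
    is below `rare_threshold`; the abstract does not specify the criterion.
    """
    original = Counter(tokens)
    bag = Counter(original)
    for term, count in original.items():
        if doc_freq.get(term, 0) < rare_threshold:
            for neighbour in nearest_terms(term, vectors, k):
                bag[neighbour] += count
    return bag


# Usage: "drought" is rare in the (made-up) training counts, "election" is not.
snippet = ["drought", "election"]
training_doc_freq = {"drought": 1, "election": 50}
enriched = enrich_bag_of_words(snippet, training_doc_freq, TOY_VECTORS)
print(sorted(enriched.items()))
```

In this sketch only the rare term pulls in neighbours, so frequent terms that already carry class-conditional evidence leave the bag unchanged, while the rare term contributes counts for related, better-attested vocabulary.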